The document discusses the Analytical Information Markup Language (AnIML), which is an XML standard for storing and sharing instrument data. AnIML allows data to be stored in a vendor-neutral, sharable format. It has been in development since 2003 as a replacement for JCAMP-DX. The schema structure and data structures of AnIML allow arrays of data from multiple instruments to be stored together. AnIML data can be transformed and accessed using XML technologies like XSLT, XPath, and XQuery to convert or extract the data in different formats.
1. The Analytical Information
Markup Language (AnIML) for
instrument data storage and access
Stuart J. Chalk
Department of Chemistry
University of North Florida
Jacksonville, FL USA
schalk@unf.edu
Liberating Laboratory Data – Day 1
2. Liberating Laboratory Data
What does this mean?
In a vendor-, platform-, and language-independent format
Archivable, Authenticated, Provenanced
Datatyped, Qualified (accuracy/precision for numeric data)
Contextualized – annotated with descriptive metadata
Uniquely Referenceable – URI, DOI
Shareable, Searchable, Readable by computers and humans
3. AnIML History
AnIML is an activity under ASTM subcommittee E13.15 on
Analytical Data (http://animl.sourceforge.net/)
Work on AnIML began in 2003
Designed as a replacement for JCAMP-DX (backwards
compatible).
Charter: "Develop an analytical data standard that can be
used to store data from any analytical instrument"
Task group holds virtual meetings on a monthly basis to
develop the specification
Targeted to go through ASTM balloting in 2014
6. AnIML Data Structures
The “Series” element is used to store arrays of data
Can contain many x/y spectra in one data file
(good for LC-UV/MS data for instance)
Also used for the chromatogram (time slice) data
Autoincrement Value Set
Typically used for evenly distributed data (e.g. x-axis)
Individual Value Set
Typically used for y-axis data
Encoded Value Set
Base64 encoded binary data (per XML specification)
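The three value-set kinds can be sketched in Python as follows. This is a minimal illustration, not the AnIML schema: the fragment is simplified, the attribute names (`length`, `start`, `increment`) are hypothetical, the numbers are made up, and the encoded payload is assumed to hold little-endian 32-bit floats.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment modeled on the three value-set kinds.
# It is NOT copied from the AnIML schema; all values are made up.
FRAGMENT = """
<SeriesSet length="4">
  <Series name="Wavelength">
    <AutoIncrementedValueSet start="200.0" increment="0.5"/>
  </Series>
  <Series name="Absorbance">
    <IndividualValueSet><F>0.25</F><F>0.5</F><F>0.75</F><F>1.0</F></IndividualValueSet>
  </Series>
  <Series name="Intensity">
    <EncodedValueSet>{encoded}</EncodedValueSet>
  </Series>
</SeriesSet>
"""


def read_series(series, length):
    """Return the values of one Series element as a list of floats."""
    auto = series.find("AutoIncrementedValueSet")
    if auto is not None:
        # Evenly spaced data: only the start value and step are stored.
        start = float(auto.get("start"))
        step = float(auto.get("increment"))
        return [start + i * step for i in range(length)]
    individual = series.find("IndividualValueSet")
    if individual is not None:
        # Every value is written out explicitly.
        return [float(v.text) for v in individual.findall("F")]
    # Otherwise assume Base64 text wrapping little-endian float32 values.
    raw = base64.b64decode(series.find("EncodedValueSet").text.strip())
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Build the encoded payload here so the example is self-contained.
payload = base64.b64encode(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)).decode("ascii")
root = ET.fromstring(FRAGMENT.format(encoded=payload).strip())
length = int(root.get("length"))
data = {s.get("name"): read_series(s, length) for s in root.findall("Series")}
```

Here `data` maps each series name to its expanded list of values, so the x-axis stored as a start value plus increment ends up the same length as the explicitly stored y-axis.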
11. Publishing AnIML Stored Data
Because AnIML is XML, it can leverage a variety of existing
tools and technologies
Making data in AnIML files accessible can be achieved
by using
eXtensible Stylesheet Language (XSL) transformations
-> to convert data into different formats
-> to process data into results
XPath -> provide unique identifiers/references to data
points or data sets
XQuery -> search for particular data within a dataset
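The idea of using a path expression as a reference to a single data point can be sketched with Python's standard library, which supports a limited subset of XPath. The fragment, series names, and values below are hypothetical, and the search function is a plain-Python filter standing in for an XQuery search, not XQuery itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; names and values are made up.
ROOT = ET.fromstring(
    "<SeriesSet>"
    "<Series name='Wavelength'><IndividualValueSet>"
    "<F>200.0</F><F>200.5</F><F>201.0</F>"
    "</IndividualValueSet></Series>"
    "<Series name='Absorbance'><IndividualValueSet>"
    "<F>0.1</F><F>0.4</F><F>0.2</F>"
    "</IndividualValueSet></Series>"
    "</SeriesSet>"
)


def point_path(series_name, position):
    """Build a path that identifies one data point (positions start at 1)."""
    return f"./Series[@name='{series_name}']/IndividualValueSet/F[{position}]"


def value_at(root, series_name, position):
    """Resolve the path and return the value, or None if nothing matches."""
    node = root.find(point_path(series_name, position))
    return None if node is None else float(node.text)


def series_values(root, series_name):
    """Return all values of the named series as floats."""
    path = f"./Series[@name='{series_name}']/IndividualValueSet/F"
    return [float(node.text) for node in root.findall(path)]


def search_above(root, threshold):
    """Return (wavelength, absorbance) pairs whose absorbance exceeds threshold."""
    xs = series_values(root, "Wavelength")
    ys = series_values(root, "Absorbance")
    return [(x, y) for x, y in zip(xs, ys) if y > threshold]
```

For example, `value_at(ROOT, "Absorbance", 2)` resolves the second absorbance value, and the string returned by `point_path` could serve as a stable reference to that point.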
12. XSLT
eXtensible Stylesheet Language (XSL) is an XML
standard for converting XML-encoded data to other
formats
E.g. HTML, PDF, JavaScript Object Notation (JSON), or
even graphics
Scalable Vector Graphics (SVG) is (another!) XML
specification for vector graphics
So we can use an XSL Transformation (XSLT)
processor (e.g. Saxon) to convert data stored in
AnIML to a graphic representation of the data
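To make the data-to-graphic step concrete, here is a Python stand-in for the kind of mapping such a stylesheet would perform: scaling x/y values into an SVG polyline. It is not XSLT and not the stylesheet used in the talk; the function name, canvas size, and sample values are assumptions for illustration.

```python
def to_svg_polyline(xs, ys, width=200, height=100):
    """Scale x/y data to a width-by-height canvas and return an SVG string."""
    x_min, y_min = min(xs), min(ys)
    # Fall back to 1.0 so constant data does not divide by zero.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0

    def sx(x):
        return (x - x_min) / x_span * width

    def sy(y):
        # SVG's y axis points down, so flip it to put larger values higher.
        return height - (y - y_min) / y_span * height

    points = " ".join(f"{sx(x):.1f},{sy(y):.1f}" for x, y in zip(xs, ys))
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{points}"/>'
        f"</svg>"
    )


# Made-up sample spectrum: three wavelengths and their absorbances.
svg = to_svg_polyline([200.0, 200.5, 201.0], [0.0, 1.0, 0.5])
```

Writing `svg` to a file with an `.svg` extension gives a line plot that a browser can display directly.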
13. XSLT
An XML file that extracts data from another
XML document and formats it based on
specifications
Returning data in JSON format
{"data":[200.0:.3720,200.5:.3503,201.0:.5042,201.5:.0130, …]}
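A strict JSON parser does not accept `x:y` pairs inside an array, so a consumer of this output would need each pair nested as its own array or keyed by a quoted string. The sketch below assumes the nested-pair form; it is a Python illustration of the target format rather than the stylesheet itself, and the function name and sample values are made up.

```python
import json


def series_to_json(xs, ys):
    """Serialize paired x/y values as {"data": [[x, y], ...]}."""
    return json.dumps({"data": [[x, y] for x, y in zip(xs, ys)]})


# Made-up values; the result round-trips through a standard JSON parser.
text = series_to_json([200.0, 200.5], [0.25, 0.5])
parsed = json.loads(text)
```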
21. Conclusion
Because AnIML is an XML specification, it is easily
readable, archivable, and searchable
The data within an AnIML file can easily be extracted,
manipulated and repurposed
With the development of additional XML
technologies, the options for using and sharing AnIML
data will only increase over time