Experiment Markup Language: A Combined Markup Language and Ontology to Represent Science
Stuart J. Chalk
Department of Chemistry
University of North Florida
schalk@unf.edu
2014 Spring ACS Meeting – CINF Paper 19
Outline
• Digital Representation of Science
• Electronic Notebooks
• The Eureka Research Workbench
• Experiment Markup Language
• ExptML Schema and Files
• Semantic Data and Ontologies
• File Storage
• Eureka Interface
• Web Interface
• Conclusion
Digital Representation of Science
• Most research on digital science is focused on the data
• Standards exist for the digital representation of:
  • Data: individual measurements, time series, spectra
  • Molecules
  • Chemical reactions
• Context is important!
  • Context can be added ad hoc
  • It needs to be added systematically to be searchable
• We need a digital representation of the scientific process
Eureka Research Workbench
• Conceptualized in 2006
• Need a way to store:
  • Research activities
  • Laboratory resources
  • Data
• Need to capture the workflow of scientists, not define it
• Writing in a lab notebook is equivalent to blogging…
• …but the context of the entries is important and varies
• Many data types, so how to capture information?
• Experiment Markup Language (ExptML)
Experiment Markup Language (ExptML)
• A specification (written in XML) that describes different types of information recorded during the scientific process (http://exptml.sourceforge.net)
  • Annotation
  • Api
  • Calculation
  • Chemical
  • Citation
  • Customer
  • Data
  • Dataset
  • Definition
  • Element
  • Equipment
  • Event
  • Experiment
  • Group
  • Message
  • Project
  • Protocol
  • Quote
  • Report
  • Result
  • Sample
  • Solution
  • Space
  • Specimen
  • Substance
  • Task
  • Template
  • Timeline
  • User
  • Vendor
[Figure: ExptML Chemical Schema]
[Figure: ExptML Chemical (Instance)]
Linking ExptML Files
• To allow ExptML to capture a scientific workflow, an ontology is needed to represent the structure
• It needs to be:
  • Flexible: able to be used in a wide variety of areas
  • Logical: the links make sense in the context of science
  • Searchable: so we can find research done in a similar way
  • Comprehensive! This is the BIG problem
• Many existing ontologies
ExptML Ontology
• In computer science, an ontology “formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to model a domain and support reasoning about concepts.”*
• In essence, an ontology allows us to define relationships between concepts and assertions about them
• For samples represented in ExptML we define:
  • isSample (assertion)
  • hasSample (relationship)
  • isSampleOf (relationship)
*https://en.wikipedia.org/wiki/Ontology_(information_science)
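The distinction between an assertion and a relationship can be sketched in a few lines of Python. This is only an illustration: the identifiers `exptml:exp1` and `exptml:sam1` are made up, and treating isSampleOf as the inverse of hasSample is an assumption, since the slides do not state how the two relate.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumption for illustration: isSampleOf is the inverse of hasSample.
INVERSE_OF = {"hasSample": "isSampleOf", "isSampleOf": "hasSample"}


def add_inverses(triples: Set[Triple]) -> Set[Triple]:
    """Return the triples plus the inverse of every relationship that has one."""
    result = set(triples)
    for t in triples:
        inverse = INVERSE_OF.get(t.predicate)
        if inverse is not None:
            result.add(Triple(t.obj, inverse, t.subject))
    return result


# An assertion is unary: it states what kind of thing a node is.
assertions = {"exptml:sam1": "isSample"}

# A relationship is binary: it links two nodes (hypothetical identifiers).
relationships = {Triple("exptml:exp1", "hasSample", "exptml:sam1")}

closed = add_inverses(relationships)
```

After the call, `closed` holds both the original hasSample statement and the derived isSampleOf statement pointing the other way.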
[Figure: ExptML Ontology]
Developments in ExptML
• XML is nice for storing, archiving and transmitting information…
• …but it is not so easy to use in software
  • There are many XML readers, but each has its own syntax
  • It can be cumbersome to deal with in software:
    • File size (XML is verbose)
    • Namespaces
    • Data types (e.g. string, decimal, etc.)
• So the solution is…
JavaScript Object Notation (JSON)
• JSONize it!
• Compact string representation of arrays of data
• Used in AJAX requests in web browsers
{
  "exptmlid": "exptml:ann1",
  "anntype": "comment",
  "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
  "date": "2011-11-25T11:05:17-04:00"
}
<annotation id="exptml_ann1" xmlns="urn:exptml:schema:draft:0.4"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:exptml:schema:draft:0.4
    http://exptml.sourceforge.net/files/schema/exptml_annotation.xsd"
    version="0.4">
  <anntype>comment</anntype>
  <text>Had to wait for the biochemistry lab to finish using
  the spectrophotometer before I could get on it. The standards
  sat around for 1 hr 30 minutes before I could run them.</text>
  <date>2011-11-25T11:05:17-04:00</date>
</annotation>
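The XML-to-JSON step can be sketched with the Python standard library alone. This is a minimal illustration, not the converter used in Eureka: the function name `annotation_to_dict` is hypothetical, the note text is abbreviated, and the `id` attribute is copied as-is into an `exptmlid` key.

```python
import json
import xml.etree.ElementTree as ET

# Abbreviated copy of the annotation shown above.
XML_DOC = """<annotation id="exptml_ann1" xmlns="urn:exptml:schema:draft:0.4" version="0.4">
  <anntype>comment</anntype>
  <text>The standards sat around for 1 hr 30 minutes.</text>
  <date>2011-11-25T11:05:17-04:00</date>
</annotation>"""


def annotation_to_dict(xml_text: str) -> dict:
    """Flatten an ExptML annotation element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    record = {"exptmlid": root.get("id")}
    for child in root:
        # ElementTree reports tags as "{namespace}name"; keep only the name.
        name = child.tag.split("}", 1)[-1]
        record[name] = (child.text or "").strip()
    return record


record = annotation_to_dict(XML_DOC)
json_text = json.dumps(record, indent=2)
```

The namespace handling in the loop is the kind of friction the previous slide describes; once the record is a dictionary, `json.dumps` produces the compact form directly.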
JSON-LD
• JSON-based Serialization for Linked Data
• Current W3C recommendation*
• Allows us to define a specification for the JSON data
• “@context” plays a role analogous to an XML Schema
*http://www.w3.org/TR/json-ld
{
  "@context": {
    "exptmlid": "http://www.w3.org/2001/XMLSchema#string",
    "anntype": "http://www.w3.org/2001/XMLSchema#string",
    "text": "http://www.w3.org/2001/XMLSchema#string",
    "date": "http://www.w3.org/2001/XMLSchema#dateTime"
  }
}
JSON-LD
{
  "@context": {
    "exptmlid": "http://www.w3.org/2001/XMLSchema#string",
    "anntype": "http://www.w3.org/2001/XMLSchema#string",
    "text": "http://www.w3.org/2001/XMLSchema#string",
    "date": "http://www.w3.org/2001/XMLSchema#dateTime"
  },
  "exptmlid": "exptml:ann1",
  "anntype": "comment",
  "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
  "date": "2011-11-25T11:05:17-04:00"
}
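One way to read the context above is as a table of data types for the fields that follow. The sketch below applies that reading in plain Python; it is an illustration only. The `CONVERTERS` table and `apply_context` function are hypothetical, the note text is abbreviated, and a conforming JSON-LD processor would interpret these context values as term IRIs rather than as type coercions.

```python
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical converter table: XSD datatype IRI -> Python callable.
CONVERTERS = {
    XSD + "string": str,
    XSD + "dateTime": datetime.fromisoformat,
}


def apply_context(document: dict) -> dict:
    """Convert each value using the datatype its key maps to in "@context"."""
    context = document.get("@context", {})
    typed = {}
    for key, value in document.items():
        if key == "@context":
            continue
        convert = CONVERTERS.get(context.get(key))
        typed[key] = convert(value) if convert else value
    return typed


annotation = {
    "@context": {
        "exptmlid": XSD + "string",
        "anntype": XSD + "string",
        "text": XSD + "string",
        "date": XSD + "dateTime",
    },
    "exptmlid": "exptml:ann1",
    "anntype": "comment",
    "text": "The standards sat around for 1 hr 30 minutes.",  # abbreviated
    "date": "2011-11-25T11:05:17-04:00",
}

typed = apply_context(annotation)
```

Here `typed["date"]` is a timezone-aware `datetime` while the other fields stay strings, which is the data-type information that plain JSON does not carry on its own.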
JSON-LD
• @id represents an Internationalized Resource Identifier (IRI)
• The IRI identifies a node and allows this data to be linked
{
  "@context": "http://exptld.org/annotation.jsonld",
  "@id": "https://eureka.coas.unf.edu/exptml:ann1",
  "anntype": "comment",
  "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
  "date": "2011-11-25T11:05:17-04:00"
}
Developments in the Ontology
• Currently the ontology defines generic relationships
• It should be expanded to provide additional context
<rdf:Property rdf:about="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution">
  <rdfs:label>has solution</rdfs:label>
  <rdfs:comment>Indicates that an experiment makes use of a particular solution</rdfs:comment>
  <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#rels"/>
</rdf:Property>
<rdf:Property rdf:about="http://exptml.sourceforge.net/exptml_ontology.owl#hasBuffer">
  <rdfs:label>has buffer</rdfs:label>
  <rdfs:comment>Indicates that an experiment makes use of a buffer (solution)</rdfs:comment>
  <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution"/>
</rdf:Property>
<rdf:Property rdf:about="http://exptml.sourceforge.net/exptml_ontology.owl#hasReagent">
  <rdfs:label>has reagent</rdfs:label>
  <rdfs:comment>Indicates that an experiment makes use of a reagent (solution)</rdfs:comment>
  <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution"/>
</rdf:Property>
<rdf:Property rdf:about="http://exptml.sourceforge.net/exptml_ontology.owl#hasCalibrationStandard">
  <rdfs:label>has calibration standard</rdfs:label>
  <rdfs:comment>Indicates that an experiment makes use of a calibration standard</rdfs:comment>
  <rdfs:subPropertyOf rdf:resource="http://exptml.sourceforge.net/exptml_ontology.owl#hasSolution"/>
</rdf:Property>
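The practical benefit of the subPropertyOf links is that a search for the generic property also finds statements made with the more specific ones. A minimal Python sketch of that lookup follows; the property names come from the RDF above, while the statements and the identifiers in them are made up for illustration.

```python
from typing import List, Optional, Tuple

# Child property -> parent property, as declared with rdfs:subPropertyOf above.
SUB_PROPERTY_OF = {
    "hasSolution": "rels",
    "hasBuffer": "hasSolution",
    "hasReagent": "hasSolution",
    "hasCalibrationStandard": "hasSolution",
}

Statement = Tuple[str, str, str]


def is_sub_property(prop: Optional[str], ancestor: str) -> bool:
    """True if prop equals ancestor or descends from it (assumes no cycles)."""
    while prop is not None:
        if prop == ancestor:
            return True
        prop = SUB_PROPERTY_OF.get(prop)
    return False


def find(statements: List[Statement], prop: str) -> List[Statement]:
    """Return the statements whose predicate is prop or one of its sub-properties."""
    return [s for s in statements if is_sub_property(s[1], prop)]


# Hypothetical statements about one experiment.
statements = [
    ("exptml:exp1", "hasBuffer", "exptml:sol1"),
    ("exptml:exp1", "hasCalibrationStandard", "exptml:sol2"),
    ("exptml:exp1", "hasUser", "exptml:usr1"),
]

solutions = find(statements, "hasSolution")
```

Searching for hasSolution returns the buffer and calibration standard statements but not the hasUser one, so the specific context is kept without losing the generic query.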
Expand the Ontology
• BIG problem!
• Context is specific to the science and the scientist
• How many sub-properties of “hasSolution” are needed?
• Additional context is domain specific so…
• …we need to integrate other related ontologies
  • Map “hasSolution” to predicates in other ontologies
  • Use VIVO to choose the ‘best’ domain-specific ontology
  • Aggregate science ontologies? Requires software/time
  • Evaluate Elasticsearch (http://www.elasticsearch.org)
Combine ML and Ontology?
• JSON-LD is a concrete RDF syntax!*
• JSON-LD can be converted to triples
*http://www.w3.org/TR/json-ld/#relationship-to-rdf
{
  "@context": "http://exptld.org/annotation.jsonld",
  "@id": "https://eureka.coas.unf.edu/exptml:ann1",
  "anntype": "comment",
  "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
  "date": "2011-11-25T11:05:17-04:00",
  "hasUser": [
    { "@id": "https://eureka.coas.unf.edu/exptml:usr1" },
    { "@id": "https://eureka.coas.unf.edu/exptml:usr11" }
  ],
  "hasExperiment": { "@id": "https://eureka.coas.unf.edu/exptml:exp1" }
}
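How a document like this becomes triples can be sketched by hand in Python. This is a simplified illustration, not a JSON-LD processor: a real conversion would resolve the remote context to obtain the predicate IRIs, whereas here a placeholder vocabulary IRI (`VOCAB`, an assumption) is prefixed to each key, and the note text is abbreviated.

```python
from typing import List, Tuple

VOCAB = "http://example.org/exptml#"  # hypothetical vocabulary base IRI
BASE = "https://eureka.coas.unf.edu/"

# (subject, predicate, object, object_is_iri)
Triple = Tuple[str, str, str, bool]


def to_triples(node: dict, vocab: str = VOCAB) -> List[Triple]:
    """Turn one flat JSON-LD node into subject/predicate/object triples."""
    subject = node["@id"]
    triples: List[Triple] = []
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not properties
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict) and "@id" in item:
                # A reference to another node: the object is an IRI.
                triples.append((subject, vocab + key, item["@id"], True))
            else:
                # A plain value: the object is a literal.
                triples.append((subject, vocab + key, str(item), False))
    return triples


annotation = {
    "@context": "http://exptld.org/annotation.jsonld",
    "@id": BASE + "exptml:ann1",
    "anntype": "comment",
    "text": "The standards sat around for 1 hr 30 minutes.",  # abbreviated
    "date": "2011-11-25T11:05:17-04:00",
    "hasUser": [{"@id": BASE + "exptml:usr1"}, {"@id": BASE + "exptml:usr11"}],
    "hasExperiment": {"@id": BASE + "exptml:exp1"},
}

triples = to_triples(annotation)
```

The three literal fields and the three node references each yield one triple, all sharing the annotation's `@id` as subject, which is what lets the markup and the ontology live in the same file.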
Conclusion
• Nice start: allows for conceptual evaluation of the approach
• Needs work: “science cannot be described by one alone”
• TODO
  • Integrate and aggregate existing ontologies
  • Work with ELN developers, e.g. LabTrove and elnItemManifest*
  • Encourage ontology development in areas where gaps exist, e.g. Chemical Analysis
  • Contribute to standards development, e.g. Research Data Alliance (RDA), http://rd-alliance.org
* “First steps towards semantic descriptions of electronic laboratory notebook records”, S J Coles, J G Frey, C L Bird, R J Whitby and A E Day, J. Cheminformatics, 2013, 5:52, http://dx.doi.org/10.1186/1758-2946-5-52
References
• Eureka – http://sourceforge.net/projects/eureka
• Fedora-Commons – http://fedora-commons.org
• XML – http://www.w3.org/standards/xml
• ExptML – http://exptml.sourceforge.net/
• JSON-LD – http://www.w3.org/TR/json-ld
• UnitsML – http://unitsml.nist.gov/
• RDF – http://www.w3.org/RDF/
• CIR – http://cactus.nci.nih.gov/chemical/structure
• RDA (Research Data Alliance) – http://rd-alliance.org
• http://www.nytimes.com/2013/08/13/science/how-to-share-scientific-data.html

247th ACS Meeting: Experiment Markup Language (ExptML)
