SlideShare une entreprise Scribd logo
1  sur  26
Télécharger pour lire hors ligne
Tactical Formalization of Linked Open Data
@micheldumontier::OntologySummit2014:March 6,20141
Michel Dumontier, Ph.D.
Associate Professor of Medicine (Biomedical Informatics)
Stanford University
Linked Open Data provides an incredibly dynamic, rapidly
growing set of interlinked resources
@micheldumontier::OntologySummit2014:March 6,20142
@micheldumontier::OntologySummit2014:March 6,2014
Linked Data for the Life Sciences
3
Bio2RDF converts bio-data in RDF format and ensures
URI integrity by conferring with its registry of datasets
the simplicity of the triple makes it
easy to proliferate
@micheldumontier::OntologySummit2014:March 6,20144
but the lack of coordination makes Linked
Open Data is quite chaotic and unwieldy
@micheldumontier::OntologySummit2014:March 6,20145
Massive Proliferation of Ontologies / Vocabularies
@micheldumontier::OntologySummit2014:March 6,20146
Despite all the data, it’s still hard to find answers to questions
Because there are many ways to represent the same data
and each dataset represents it differently
@micheldumontier::OntologySummit2014:March 6,20147
uniprot:P05067
uniprot:Protein
is a
sio:gene
is a is a
Semantic data integration, consistency checking and
query answering over Bio2RDF with the
Semanticscience Integrated Ontology (SIO)
dataset
ontology
Knowledge Base
@micheldumontier::OntologySummit2014:March 6,2014
pharmgkb:PA30917
refseq:Protein
is a
is a
omim:189931
omim:Gene pharmgkb:Gene
Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and
Michel Dumontier. Bio-ontologies 2012.
8
@micheldumontier::OntologySummit2014:March
9
SRIQ(D)
10700+ axioms
1300+ classes
201 object properties (inc. inverses)
1 datatype property
Bio2RDF and SIO powered SPARQL 1.1 federated query:
Find chemicals (from CTD) and proteins (from SGD) that
participate in the same process (from GOA)
SELECT ?chem, ?prot, ?proc
FROM <http://bio2rdf.org/ctd>
WHERE {
?chemical a sio:chemical-entity.
?chemical rdfs:label ?chem.
?chemical sio:is-participant-in ?process.
?process rdfs:label ?proc.
FILTER regex (?process, "http://bio2rdf.org/go:")
SERVICE <http://sgd.bio2rdf.org/sparql> {
?protein a sio:protein .
?protein sio:is-participant-in ?process.
?protein rdfs:label ?prot .
}
}
@micheldumontier::OntologySummit2014:March 6,201410
multiple formalizations of the same kind of
data has emerged, each with their own merit
@micheldumontier::OntologySummit2014:March 6,201411
Multi-Stakeholder Efforts to Standardize
Representations are Reasonable,
Long Term Strategies for Data Integration
@micheldumontier::OntologySummit2014:March 6,201412
tactical formalization
discovery of drug and disease pathway associations
@micheldumontier::OntologySummit2014:March 6,201413
Take what you need
and represent it in a way that directly serves your objective
Biological Pathways Define A
Biological Objective
def: A biological pathway is constituted by a
set of molecular components that undertake
some biological transformation to achieve a
stated objective
glycolysis : a pathway that converts glucose to
pyruvate
@micheldumontier::OntologySummit2014:March 6,201414
aberrant and pharmacological pathways
Q1. Can we identify pathways that are associated with a
particular disease or class of diseases?
Q2. Can we identify pathways are associated with a
particular drug or class of drugs?
drug
pathway
disease
@micheldumontier::OntologySummit2014:March 6,201415
Identification of
drug and disease enriched pathways
• Approach
– Integrate 3 datasets
• DrugBank, PharmGKB and CTD
– Integrate 7 terminologies
• MeSH, ATC, ChEBI, UMLS, SNOMED, ICD, DO
– Formalize
– Identify significant associations using enrichment
analysis over the fully inferred knowledge base
Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics.
Bioinformatics. 2012. @micheldumontier::OntologySummit2014:March 6,201416
Have you heard of OWL?
@micheldumontier::OntologySummit2014:March 6,201417
RDF triples are underspecified bits of
knowledge. OWL can help you nail
down what was really meant
Natural language statements:
The nucleus is a key part of the cell.
RDF triple:
<nucleus> <part-of> <cell>
OWL
• Nucleus and Cell are classes
• part-of is a relation between 2 instances
• Formalization: every instance of Nucleus ‘is part of’ at least
one instance of Cell
OWL axiom:
Nucleus subClassOf part-of some Cell 18@micheldumontier::OntologySummit2014:March 6,201418
assigning meaning to triples:
domain expertise + logics required!
Convert RDF triples into OWL axioms.
Triple in RDF:
<C1 R C2>
• C1 and C2 are classes, R a relation between 2 classes
• intended meaning:
o C1 SubClassOf: C2
o C1 SubClassOf: R some C2
o C1 SubClassOf: R only C2
o C2 SubClassOf: R some C1
o C1 SubClassOf: S some C2
o C1 SubClassOf: R some (S only C2)
o C1 DisjointFrom: C2
o C1 and C2 SubClassOf: owl:Nothing
o R some C1 DisjointFrom: R some C2
o C1 EquivalentClasses C2
o ...
• in general: P(C1, C2), where P is an OWL axiom (template)
Challenge:
Formalizing data
requires one to commit
to a particular meaning
– to make an ontological
commitment
@micheldumontier::OntologySummit2014:March 6,201419
Axiom Patterns for Triples
<nucleus> <part-of> <cell>
?X part-of ?Y
translated to axiom pattern
?X subClassOf: part-of some ?Y
-> Nucleus subClassOf: part-of some Cell
@micheldumontier::OntologySummit2014:March 6,201420
mercaptopurine
[pharmgkb:PA450379]
mercaptopurine
[drugbank:DB01033]
purine-6-thiol
[CHEBI:2208]
Class Equivalence
mercaptopurine
[ATC:L01BB02]
Top Level Classes
(disjointness)
drug diseasegenepathway
property chains
Class subsumption
mercaptopurine
[mesh:D015122]
Reciprocal
Existentials
drug disease
pathway gene
Formalized as an OWL-EL ontology
650,000+ classes, 3.2M subClassOf axioms, 75,000
equivalentClass axioms
@micheldumontier::OntologySummit2014:March 6,201421
Benefits: Enhanced Query Capability
– Use any mapped terminology to query a target resource.
– Use knowledge in target ontologies to formulate more
precise questions
• ask for drugs that are associated with diseases of the joint:
‘Chikungunya’ (do:0050012) is defined as a viral infectious disease
located in the ‘joint’ (fma:7490) and caused by a ‘Chikungunya
virus’ (taxon:37124).
– Learn relationships that are inferred by automated
reasoning.
• alcohol (ChEBI:30879) is associated with alcoholism (PA443309)
since alcoholism is directly associated with ethanol (CHEBI:16236)
• ‘parasitic infectious disease’ (do:0001398) retrieves 129 disease
associated drugs, 15 more than are directly associated.
@micheldumontier::OntologySummit2014:March 6,201422
Knowledge Discovery through Data
Integration and Enrichment Analysis
• OntoFunc: Tool to discover significant associations between sets of objects
and ontology categories. enrichment of attribute among a selected set of
input items as compared to a reference set. hypergeometric or the
binomial distribution, Fisher's exact test, or a chi-square test.
• We found 22,653 disease-pathway associations, where for each pathway
we find genes that are linked to disease.
– Mood disorder (do:3324) associated with Zidovudine Pathway
(pharmgkb:PA165859361). Zidovudine is used to treat HIV/AIDS. Side
effects include fatigue, headache, myalgia, malaise and anorexia
• We found 13,826 pathway-chemical associations
– Clopidogrel (chebi:37941) associated with Endothelin signaling
pathway (pharmgkb:PA164728163). Clopidogrel inhibits platelet
aggregation and prolongs bleeding time. Endothelins are proteins that
constrict blood vessels and raise blood pressure.
@micheldumontier::OntologySummit2014:March 6,201423
Tactical Formalization + Automated Reasoning
Offers Compelling Value for Certain Problems
We need to be smart about the goal, and how best to achieve it.
Tactical formalization is another tool in the toolbox.
We’ve formalized data as OWL ontologies:
• To identify mistakes in human curated knowledge
• To identify conflicting meaning in terms
• To identify mistakes in the representation of RDF data
o incorrect use of relations
o incorrect assertion of identity (owl:sameAs)
• To verify, fix and exploit Linked Data through expressive OWL
reasoning
• To generate/infer new triples to write back into RDF and use for
efficient retrieval
Many other applications can be envisioned.
@micheldumontier::OntologySummit2014:March 6,201424
Acknowledgements
Bio2RDF Release 2:
Allison Callahan, Jose Cruz-Toledo, Peter Ansell
Aberrant Pathways: Robert Hoehndorf, Georgios Gkoutos
@micheldumontier::OntologySummit2014:March 6,201425
dumontierlab.com
michel.dumontier@stanford.edu
Website: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier
@micheldumontier::OntologySummit2014:March 6,201426

Contenu connexe

En vedette

Intro to Social Media
Intro to Social MediaIntro to Social Media
Intro to Social MediaSascha Funk
 
Maktjensen
MaktjensenMaktjensen
MaktjensenGjerde
 
Design thinking in efl context
Design thinking in efl contextDesign thinking in efl context
Design thinking in efl contextDebopriyo Roy
 
Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...
Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...
Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...Craig Jordan
 
1st Network-of-BioThings Hackathon
1st Network-of-BioThings Hackathon1st Network-of-BioThings Hackathon
1st Network-of-BioThings HackathonMichel Dumontier
 
BigScrum - Scaling Teams to Programs
BigScrum - Scaling Teams to ProgramsBigScrum - Scaling Teams to Programs
BigScrum - Scaling Teams to ProgramsThinkLouder
 
Technical Communication Lab Projects
Technical Communication Lab ProjectsTechnical Communication Lab Projects
Technical Communication Lab ProjectsDebopriyo Roy
 
La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..
La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..
La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..guest63c2e7
 
Presentation1
Presentation1Presentation1
Presentation1FatemaAli
 
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...Michel Dumontier
 
Russian & English: Nastas Ltrs of Confirmation-Contributions
Russian & English: Nastas Ltrs of Confirmation-ContributionsRussian & English: Nastas Ltrs of Confirmation-Contributions
Russian & English: Nastas Ltrs of Confirmation-ContributionsThomas Nastas
 
Colonna sonora in_openoffice
Colonna sonora in_openofficeColonna sonora in_openoffice
Colonna sonora in_openofficeToni Pierdomé
 
Adapting health systems to the challenge of diversity in the US and Europe
Adapting health systems to the challenge of diversity in the US and EuropeAdapting health systems to the challenge of diversity in the US and Europe
Adapting health systems to the challenge of diversity in the US and EuropediversityRx
 
Line Upgrade Deferral Scenarios for Distributed Renewable Energy Resources
Line Upgrade Deferral Scenarios for Distributed Renewable Energy ResourcesLine Upgrade Deferral Scenarios for Distributed Renewable Energy Resources
Line Upgrade Deferral Scenarios for Distributed Renewable Energy ResourcesIain Sanders
 

En vedette (20)

Chemistry Intro
Chemistry IntroChemistry Intro
Chemistry Intro
 
Intro to Social Media
Intro to Social MediaIntro to Social Media
Intro to Social Media
 
Maktjensen
MaktjensenMaktjensen
Maktjensen
 
Rims Metals and Mining Session
Rims Metals and Mining Session Rims Metals and Mining Session
Rims Metals and Mining Session
 
Using Big Data Analytics
Using Big Data AnalyticsUsing Big Data Analytics
Using Big Data Analytics
 
Design thinking in efl context
Design thinking in efl contextDesign thinking in efl context
Design thinking in efl context
 
Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...
Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...
Camp It, June 2012, How To Design Your Bi Architecture To Capitalize on New T...
 
1st Network-of-BioThings Hackathon
1st Network-of-BioThings Hackathon1st Network-of-BioThings Hackathon
1st Network-of-BioThings Hackathon
 
Prezi polxtica lingxxstica
Prezi polxtica lingxxsticaPrezi polxtica lingxxstica
Prezi polxtica lingxxstica
 
BigScrum - Scaling Teams to Programs
BigScrum - Scaling Teams to ProgramsBigScrum - Scaling Teams to Programs
BigScrum - Scaling Teams to Programs
 
Technical Communication Lab Projects
Technical Communication Lab ProjectsTechnical Communication Lab Projects
Technical Communication Lab Projects
 
La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..
La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..
La MutilacióN Del Ser Y El Tratamiento Del Cuerpo En La Vida ..
 
Presentation1
Presentation1Presentation1
Presentation1
 
G4
G4G4
G4
 
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
 
Russian & English: Nastas Ltrs of Confirmation-Contributions
Russian & English: Nastas Ltrs of Confirmation-ContributionsRussian & English: Nastas Ltrs of Confirmation-Contributions
Russian & English: Nastas Ltrs of Confirmation-Contributions
 
Monaco201209
Monaco201209Monaco201209
Monaco201209
 
Colonna sonora in_openoffice
Colonna sonora in_openofficeColonna sonora in_openoffice
Colonna sonora in_openoffice
 
Adapting health systems to the challenge of diversity in the US and Europe
Adapting health systems to the challenge of diversity in the US and EuropeAdapting health systems to the challenge of diversity in the US and Europe
Adapting health systems to the challenge of diversity in the US and Europe
 
Line Upgrade Deferral Scenarios for Distributed Renewable Energy Resources
Line Upgrade Deferral Scenarios for Distributed Renewable Energy ResourcesLine Upgrade Deferral Scenarios for Distributed Renewable Energy Resources
Line Upgrade Deferral Scenarios for Distributed Renewable Energy Resources
 

Similaire à Tactical Formalization of Linked Open Data (Ontology Summit 2014)

Generating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web TechnologiesGenerating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web TechnologiesMichel Dumontier
 
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Michel Dumontier
 
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Michel Dumontier
 
Guide to Pharmacology Poster - ELIXIR All Hands 2020
Guide to Pharmacology Poster - ELIXIR All Hands 2020Guide to Pharmacology Poster - ELIXIR All Hands 2020
Guide to Pharmacology Poster - ELIXIR All Hands 2020Guide to PHARMACOLOGY
 
Copy of template.pptx
Copy of template.pptxCopy of template.pptx
Copy of template.pptxdhahdqh
 
Mitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptx
Mitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptxMitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptx
Mitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptxMaliGlemaudThesee
 
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...Acquiring and representing drug-drug interaction knowledge and evidence, Litm...
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...jodischneider
 
Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...
Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...
Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...exrna
 
Forest Environment Analysis for the Pandemic Health
Forest Environment Analysis for the Pandemic HealthForest Environment Analysis for the Pandemic Health
Forest Environment Analysis for the Pandemic HealthJun Steed Huang
 
IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016
IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016
IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016Guide to PHARMACOLOGY
 
WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...
WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...
WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...DATAVERSITY
 
Mobilizing informational resources webinar
Mobilizing informational resources   webinarMobilizing informational resources   webinar
Mobilizing informational resources webinarAnn-Marie Roche
 
The IUPHAR/MMV Guide to Malaria Pharmacology
The  IUPHAR/MMV Guide to Malaria Pharmacology  The  IUPHAR/MMV Guide to Malaria Pharmacology
The IUPHAR/MMV Guide to Malaria Pharmacology Chris Southan
 
Crowdsourcing applied to knowledge management in translational research: the ...
Crowdsourcing applied to knowledge management in translational research: the ...Crowdsourcing applied to knowledge management in translational research: the ...
Crowdsourcing applied to knowledge management in translational research: the ...SC CTSI at USC and CHLA
 
Next Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical PharmacologistsNext Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical PharmacologistsPhilip Bourne
 
SPARC 2013 Data Management Presentation
SPARC 2013 Data Management Presentation SPARC 2013 Data Management Presentation
SPARC 2013 Data Management Presentation Jackie Wirz, PhD
 
Open science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveOpen science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveKees van Bochove
 

Similaire à Tactical Formalization of Linked Open Data (Ontology Summit 2014) (20)

Generating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web TechnologiesGenerating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web Technologies
 
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
 
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
 
IRDiRC: progress and expectations
IRDiRC: progress and expectationsIRDiRC: progress and expectations
IRDiRC: progress and expectations
 
Guide to Pharmacology Poster - ELIXIR All Hands 2020
Guide to Pharmacology Poster - ELIXIR All Hands 2020Guide to Pharmacology Poster - ELIXIR All Hands 2020
Guide to Pharmacology Poster - ELIXIR All Hands 2020
 
Dynamics of developmental fate decisions - Luís A. Nunes Amaral
Dynamics of developmental fate decisions - Luís A. Nunes AmaralDynamics of developmental fate decisions - Luís A. Nunes Amaral
Dynamics of developmental fate decisions - Luís A. Nunes Amaral
 
Copy of template.pptx
Copy of template.pptxCopy of template.pptx
Copy of template.pptx
 
MURI Summer
MURI SummerMURI Summer
MURI Summer
 
Mitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptx
Mitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptxMitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptx
Mitochondrial Dysfunction in Myelodysplastic Syndromes YFC Poster.pptx
 
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...Acquiring and representing drug-drug interaction knowledge and evidence, Litm...
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...
 
Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...
Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...
Refining Gene Ontology to Include Terms and Relationships Relevant to exRNA C...
 
Forest Environment Analysis for the Pandemic Health
Forest Environment Analysis for the Pandemic HealthForest Environment Analysis for the Pandemic Health
Forest Environment Analysis for the Pandemic Health
 
IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016
IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016
IUPHAR Guide to IMMUNOPHARMACOLOGY Poster - Pharmacology 2016
 
WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...
WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...
WEBINAR: The Yosemite Project PART 6 -- Data-Driven Biomedical Research with ...
 
Mobilizing informational resources webinar
Mobilizing informational resources   webinarMobilizing informational resources   webinar
Mobilizing informational resources webinar
 
The IUPHAR/MMV Guide to Malaria Pharmacology
The  IUPHAR/MMV Guide to Malaria Pharmacology  The  IUPHAR/MMV Guide to Malaria Pharmacology
The IUPHAR/MMV Guide to Malaria Pharmacology
 
Crowdsourcing applied to knowledge management in translational research: the ...
Crowdsourcing applied to knowledge management in translational research: the ...Crowdsourcing applied to knowledge management in translational research: the ...
Crowdsourcing applied to knowledge management in translational research: the ...
 
Next Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical PharmacologistsNext Generation Data and Opportunities for Clinical Pharmacologists
Next Generation Data and Opportunities for Clinical Pharmacologists
 
SPARC 2013 Data Management Presentation
SPARC 2013 Data Management Presentation SPARC 2013 Data Management Presentation
SPARC 2013 Data Management Presentation
 
Open science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The HyveOpen science and medical evidence generation - Kees van Bochove - The Hyve
Open science and medical evidence generation - Kees van Bochove - The Hyve
 

Plus de Michel Dumontier

A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsMichel Dumontier
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsMichel Dumontier
 
The Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemThe Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemMichel Dumontier
 
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...Michel Dumontier
 
The role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemThe role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemMichel Dumontier
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Michel Dumontier
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Michel Dumontier
 
Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Michel Dumontier
 
The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...Michel Dumontier
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerMichel Dumontier
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureMichel Dumontier
 
Developing and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesDeveloping and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesMichel Dumontier
 
Advancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRAdvancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRMichel Dumontier
 
A Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsA Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsMichel Dumontier
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationMichel Dumontier
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessMichel Dumontier
 

Plus de Michel Dumontier (20)

A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge Graphs
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge Graphs
 
Evaluating FAIRness
Evaluating FAIRnessEvaluating FAIRness
Evaluating FAIRness
 
The Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemThe Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health System
 
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
 
The role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemThe role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health System
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
 
Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?
 
The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University Dinner
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star Lecture
 
Are we FAIR yet?
Are we FAIR yet?Are we FAIR yet?
Are we FAIR yet?
 
Developing and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesDeveloping and assessing FAIR digital resources
Developing and assessing FAIR digital resources
 
Advancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRAdvancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIR
 
A Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsA Framework to develop the FAIR Metrics
A Framework to develop the FAIR Metrics
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluation
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRness
 
Data Science for the Win
Data Science for the WinData Science for the Win
Data Science for the Win
 
2016 bmdid-mappings
2016 bmdid-mappings2016 bmdid-mappings
2016 bmdid-mappings
 

Dernier

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 

Dernier (20)

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate AgentsRyan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
Ryan Mahoney - Will Artificial Intelligence Replace Real Estate Agents
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 

Tactical Formalization of Linked Open Data (Ontology Summit 2014)

  • 1. Tactical Formalization of Linked Open Data @micheldumontier::OntologySummit2014:March 6,20141 Michel Dumontier, Ph.D. Associate Professor of Medicine (Biomedical Informatics) Stanford University
  • 2. Linked Open Data provides an incredibly dynamic, rapidly growing set of interlinked resources @micheldumontier::OntologySummit2014:March 6,20142
  • 3. @micheldumontier::OntologySummit2014:March 6,2014 Linked Data for the Life Sciences 3 Bio2RDF converts bio-data in RDF format and ensures URI integrity by conferring with its registry of datasets
  • 4. the simplicity of the triple makes it easy to proliferate @micheldumontier::OntologySummit2014:March 6,20144
  • 5. but the lack of coordination makes Linked Open Data is quite chaotic and unwieldy @micheldumontier::OntologySummit2014:March 6,20145
  • 6. Massive Proliferation of Ontologies / Vocabularies @micheldumontier::OntologySummit2014:March 6,20146
  • 7. Despite all the data, it’s still hard to find answers to questions Because there are many ways to represent the same data and each dataset represents it differently @micheldumontier::OntologySummit2014:March 6,20147
  • 8. uniprot:P05067 uniprot:Protein is a sio:gene is a is a Semantic data integration, consistency checking and query answering over Bio2RDF with the Semanticscience Integrated Ontology (SIO) dataset ontology Knowledge Base @micheldumontier::OntologySummit2014:March 6,2014 pharmgkb:PA30917 refseq:Protein is a is a omim:189931 omim:Gene pharmgkb:Gene Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and Michel Dumontier. Bio-ontologies 2012. 8
  • 9. @micheldumontier::OntologySummit2014:March 9 SRIQ(D) 10700+ axioms 1300+ classes 201 object properties (inc. inverses) 1 datatype property
  • 10. Bio2RDF and SIO powered SPARQL 1.1 federated query: Find chemicals (from CTD) and proteins (from SGD) that participate in the same process (from GOA) SELECT ?chem, ?prot, ?proc FROM <http://bio2rdf.org/ctd> WHERE { ?chemical a sio:chemical-entity. ?chemical rdfs:label ?chem. ?chemical sio:is-participant-in ?process. ?process rdfs:label ?proc. FILTER regex (?process, "http://bio2rdf.org/go:") SERVICE <http://sgd.bio2rdf.org/sparql> { ?protein a sio:protein . ?protein sio:is-participant-in ?process. ?protein rdfs:label ?prot . } } @micheldumontier::OntologySummit2014:March 6,201410
  • 11. multiple formalizations of the same kind of data has emerged, each with their own merit @micheldumontier::OntologySummit2014:March 6,201411
  • 12. Multi-Stakeholder Efforts to Standardize Representations are Reasonable, Long Term Strategies for Data Integration @micheldumontier::OntologySummit2014:March 6,201412
  • 13. tactical formalization discovery of drug and disease pathway associations @micheldumontier::OntologySummit2014:March 6,201413 Take what you need and represent it in a way that directly serves your objective
  • 14. Biological Pathways Define A Biological Objective def: A biological pathway is constituted by a set of molecular components that undertake some biological transformation to achieve a stated objective glycolysis : a pathway that converts glucose to pyruvate @micheldumontier::OntologySummit2014:March 6,201414
  • 15. aberrant and pharmacological pathways Q1. Can we identify pathways that are associated with a particular disease or class of diseases? Q2. Can we identify pathways are associated with a particular drug or class of drugs? drug pathway disease @micheldumontier::OntologySummit2014:March 6,201415
  • 16. Identification of drug and disease enriched pathways • Approach – Integrate 3 datasets • DrugBank, PharmGKB and CTD – Integrate 7 terminologies • MeSH, ATC, ChEBI, UMLS, SNOMED, ICD, DO – Formalize – Identify significant associations using enrichment analysis over the fully inferred knowledge base Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics. Bioinformatics. 2012. @micheldumontier::OntologySummit2014:March 6,201416
  • 17. Have you heard of OWL? @micheldumontier::OntologySummit2014:March 6,201417
  • 18. RDF triples are underspecified bits of knowledge. OWL can help you nail down what was really meant Natural language statements: The nucleus is a key part of the cell. RDF triple: <nucleus> <part-of> <cell> OWL • Nucleus and Cell are classes • part-of is a relation between 2 instances • Formalization: every instance of Nucleus ‘is part of’ at least one instance of Cell OWL axiom: Nucleus subClassOf part-of some Cell 18@micheldumontier::OntologySummit2014:March 6,201418
  • 19. assigning meaning to triples: domain expertise + logics required! Convert RDF triples into OWL axioms. Triple in RDF: <C1 R C2> • C1 and C2 are classes, R a relation between 2 classes • intended meaning: o C1 SubClassOf: C2 o C1 SubClassOf: R some C2 o C1 SubClassOf: R only C2 o C2 SubClassOf: R some C1 o C1 SubClassOf: S some C2 o C1 SubClassOf: R some (S only C2) o C1 DisjointFrom: C2 o C1 and C2 SubClassOf: owl:Nothing o R some C1 DisjointFrom: R some C2 o C1 EquivalentClasses C2 o ... • in general: P(C1, C2), where P is an OWL axiom (template) Challenge: Formalizing data requires one to commit to a particular meaning – to make an ontological commitment @micheldumontier::OntologySummit2014:March 6,201419
  • 20. Axiom Patterns for Triples <nucleus> <part-of> <cell> ?X part-of ?Y translated to axiom pattern ?X subClassOf: part-of some ?Y -> Nucleus subClassOf: part-of some Cell @micheldumontier::OntologySummit2014:March 6,201420
  • 21. mercaptopurine [pharmgkb:PA450379] mercaptopurine [drugbank:DB01033] purine-6-thiol [CHEBI:2208] Class Equivalence mercaptopurine [ATC:L01BB02] Top Level Classes (disjointness) drug diseasegenepathway property chains Class subsumption mercaptopurine [mesh:D015122] Reciprocal Existentials drug disease pathway gene Formalized as an OWL-EL ontology 650,000+ classes, 3.2M subClassOf axioms, 75,000 equivalentClass axioms @micheldumontier::OntologySummit2014:March 6,201421
  • 22. Benefits: Enhanced Query Capability – Use any mapped terminology to query a target resource. – Use knowledge in target ontologies to formulate more precise questions • ask for drugs that are associated with diseases of the joint: ‘Chikungunya’ (do:0050012) is defined as a viral infectious disease located in the ‘joint’ (fma:7490) and caused by a ‘Chikungunya virus’ (taxon:37124). – Learn relationships that are inferred by automated reasoning. • alcohol (ChEBI:30879) is associated with alcoholism (PA443309) since alcoholism is directly associated with ethanol (CHEBI:16236) • ‘parasitic infectious disease’ (do:0001398) retrieves 129 disease associated drugs, 15 more than are directly associated. @micheldumontier::OntologySummit2014:March 6,201422
  • 23. Knowledge Discovery through Data Integration and Enrichment Analysis • OntoFunc: Tool to discover significant associations between sets of objects and ontology categories. enrichment of attribute among a selected set of input items as compared to a reference set. hypergeometric or the binomial distribution, Fisher's exact test, or a chi-square test. • We found 22,653 disease-pathway associations, where for each pathway we find genes that are linked to disease. – Mood disorder (do:3324) associated with Zidovudine Pathway (pharmgkb:PA165859361). Zidovudine is used to treat HIV/AIDS. Side effects include fatigue, headache, myalgia, malaise and anorexia • We found 13,826 pathway-chemical associations – Clopidogrel (chebi:37941) associated with Endothelin signaling pathway (pharmgkb:PA164728163). Clopidogrel inhibits platelet aggregation and prolongs bleeding time. Endothelins are proteins that constrict blood vessels and raise blood pressure. @micheldumontier::OntologySummit2014:March 6,201423
  • 24. Tactical Formalization + Automated Reasoning Offers Compelling Value for Certain Problems We need to be smart about the goal, and how best to achieve it. Tactical formalization is another tool in the toolbox. We’ve formalized data as OWL ontologies: • To identify mistakes in human curated knowledge • To identify conflicting meaning in terms • To identify mistakes in the representation of RDF data o incorrect use of relations o incorrect assertion of identity (owl:sameAs) • To verify, fix and exploit Linked Data through expressive OWL reasoning • To generate/infer new triples to write back into RDF and use for efficient retrieval Many other applications can be envisioned. @micheldumontier::OntologySummit2014:March 6,201424
  • 25. Acknowledgements Bio2RDF Release 2: Allison Callahan, Jose Cruz-Toledo, Peter Ansell Aberrant Pathways: Robert Hoehndorf, Georgios Gkoutos @micheldumontier::OntologySummit2014:March 6,201425