Graph Analytics in Pharmacology over the
Web of Life Sciences Linked Open Data
26th World Wide Web Conference (WWW)
Perth, 4th – 8th April 2017
Maulik R. Kamdar and Mark A. Musen
Stanford Center for Biomedical Informatics Research
maulikrk@stanford.edu
Linked Open Data (LOD) Cloud
Cyganiak, Richard et al. 2014
Life Sciences Linked Open Data (LSLOD) Cloud
Semantic Web: Publishing Data as a Graph
Resource Description Framework (RDF)
Gleevec (Mol. Wt.: 589.25 g/mol, Half-Life: 18 hours) inhibits PDGFR, involved in signal transduction.
[Figure: RDF graph of this statement. Nodes: Gleevec (DrugBank: DB00619, http://bio2rdf.org/drugbank:DB00619), Gleevec (KEGG: D01441, http://bio2rdf.org/kegg:D01441), PDGFR, GO:0007165 (Signal Transduction), 589.25, “18 hours”. Edge labels: mol_weight, half-life, x-ref, target, name, type (Inhibits), process. Each node URL is a Uniform Resource Identifier.]
Semantic Web: Querying the Graph
SPARQL Query Language
What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction?
[Figure: the question drawn as a graph pattern in which unknown nodes are variables (?). Constraints and labels: mol_weight < 1000, ?half-life, x-ref, target, ?target name, type (Inhibits), process GO:0007165 (Signal Transduction).]
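A minimal Python sketch of how this question turns into graph pattern matching. It is an illustration only, not the SPARQL engine used in the talk: the triples are a simplified, hypothetical version of the Gleevec example (the blank node is flattened into a direct `inhibits` edge), and the helper names `objects` and `half_lives` are assumptions.

```python
# Toy triple store: (subject, predicate, object). Names are illustrative only.
TRIPLES = [
    ("drugbank:DB00619", "label", "Gleevec"),
    ("drugbank:DB00619", "mol_weight", 589.25),
    ("drugbank:DB00619", "half_life", "18 hours"),
    ("drugbank:DB00619", "x-ref", "kegg:D01441"),
    ("kegg:D01441", "inhibits", "PDGFR"),
    ("PDGFR", "process", "GO:0007165"),
]


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is stored."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def half_lives(max_weight, process):
    """Half-lives of drugs below max_weight that inhibit a protein in `process`."""
    results = []
    drugs = sorted({s for s, p, _ in TRIPLES if p == "mol_weight"})
    for drug in drugs:
        if not any(w < max_weight for w in objects(drug, "mol_weight")):
            continue
        # Follow x-ref edges so facts from the linked source are visible too.
        nodes = [drug] + objects(drug, "x-ref")
        targets = [t for n in nodes for t in objects(n, "inhibits")]
        if any(process in objects(t, "process") for t in targets):
            results.extend((drug, hl) for hl in objects(drug, "half_life"))
    return results


print(half_lives(1000, "GO:0007165"))
```

The x-ref hop is what lets a fact stored in one source (the target) be combined with facts stored in another (weight and half-life).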
What this talk is about …
Life Sciences Linked Open Data Cloud – query federation
• Challenges associated with retrieving information from LSLOD sources
• Pattern-based method to rewrite queries across LSLOD sources
• An application in mechanism-based pharmacovigilance - PhLeGrA
Query Federation: Rewriting and executing queries across different sources
Schwarte, et al. ISWC 2012
What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction?
Query:
Drug
→ molecular-weight < 1000
→ target
→ process = “GO:0007165”
→ half-life
QUERY FEDERATION splits the query into two sub-queries:
Drug
→ molecular-weight < 1000
→ target
→ half-life
Drug
→ molecular-weight < 1000
→ target
→ process = “GO:0007165”
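The split-and-join idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the federation engine cited on the slide: the two in-memory lists stand in for remote sources, `DrugX` and `ProteinY` are made-up records, and the function names are hypothetical.

```python
# Two hypothetical sources, each holding part of the answer.
SOURCE_A = [  # drug attributes
    {"drug": "Gleevec", "mol_weight": 589.25, "half_life": "18 hours"},
    {"drug": "DrugX", "mol_weight": 1500.0, "half_life": "2 hours"},
]
SOURCE_B = [  # drug targets and their biological processes
    {"drug": "Gleevec", "target": "PDGFR", "process": "GO:0007165"},
    {"drug": "DrugX", "target": "ProteinY", "process": "GO:0007165"},
]


def query_a(max_weight):
    """Sub-query 1: drugs below max_weight, with their half-lives."""
    return {r["drug"]: r["half_life"] for r in SOURCE_A if r["mol_weight"] < max_weight}


def query_b(process):
    """Sub-query 2: drugs with a target involved in the given process."""
    return {r["drug"] for r in SOURCE_B if r["process"] == process}


def federated_query(max_weight, process):
    """Run both sub-queries, then join the partial results on the drug."""
    half_lives = query_a(max_weight)
    matching = query_b(process)
    return {drug: hl for drug, hl in half_lives.items() if drug in matching}


print(federated_query(1000, "GO:0007165"))
```

The join assumes both sources identify the drug the same way, which is the assumption that heterogeneity across real sources tends to break.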
Heterogeneity in the LSLOD Cloud
Label Mismatch: Different labels for classes, relations and attributes
[Figure: Gleevec in two sources, labeled (clinical features) and (biological features): molecular-weight 493.61 versus mol_weight 589.25.]
Model Mismatch: Different graph patterns to capture granularity
[Figure: one source links Gleevec to PDGFR with a single drug-target edge; another links Gleevec through a target edge to an intermediate node with type Inhibits, name PDGFR and source PubMed: 21152856.]
Heterogeneity in the LSLOD Cloud
• Inconsistent Meanings
• Inconsistent URI labels for classes, relations and attributes
• Inconsistent Attribute values for entities
• Inconsistent Graph patterns for SPARQL queries
• Incomplete Relations between entities
Query Rewriting fails over the LSLOD Cloud
What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction?
Query:
  ?s a <Drug>
  ?s <molecular-weight> ?mw
  ?s <target> ?protein
  ?s <half-life> ?hl
  ?mw < 1000 g/mol
  ?protein <hasGO> <GO:0007165>
Query Rewriting produces two sub-queries:
Sub-query 1:
  ?s a <Drug>
  {?s <molecular-weight> ?mw}
  {?s <half-life> ?hl}
  ?mw < 1000 g/mol
Sub-query 2:
  ?s a <Drug>
  {?s <target> ?protein}
  ?protein <hasGO> <GO:0007165>
Using Graph Patterns for Query Rewriting
Mapping Rules:
?Drug hasTarget ?Protein maps to:
  ?Drug DrugBank:drug-target ?Protein
  ?Drug KEGG:target ?blank KEGG:link ?Protein
What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction?
Query:
  ?s a <Drug>
  ?s <hasMolWt> ?mw
  ?s <hasTarget> ?protein
  ?s <hasHalfLife> ?hl
  ?mw < 1000 g/mol
  ?protein <hasGO> <GO:0007165>
Query Rewriting produces two sub-queries:
Sub-query 1:
  ?s a <Drug>
  {?s <molecular-weight> ?mw}
  ?s <drug-target> ?protein
  {?s <half-life> ?hl}
  ?mw < 1000 g/mol
Sub-query 2:
  ?s a <Drug>
  ?s <mol_wt> ?mw
  {?s <target> ?protein_blank
  ?protein_blank <link> ?protein}
  ?protein <hasGO> <GO:0007165>
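A rough Python sketch of pattern-based rewriting with mapping rules like the ones above. The predicate names come from the slide, but the rule table layout, the placeholder names `?s` and `?o`, and the functions `rewrite` and `rewrite_query` are assumptions made for illustration; the actual PhLeGrA implementation may differ.

```python
# Mapping rules: one generic predicate -> source-specific lists of triple patterns.
# "?s" and "?o" are placeholders for the subject and object of the generic pattern.
MAPPING_RULES = {
    "hasTarget": {
        "DrugBank": [("?s", "DrugBank:drug-target", "?o")],
        "KEGG": [("?s", "KEGG:target", "?blank"), ("?blank", "KEGG:link", "?o")],
    },
}


def rewrite(pattern, source):
    """Rewrite one (subject, predicate, object) pattern for the given source."""
    subject, predicate, obj = pattern
    templates = MAPPING_RULES.get(predicate, {}).get(source)
    if templates is None:
        return [pattern]  # no rule for this predicate/source: keep it unchanged
    binding = {"?s": subject, "?o": obj}
    return [tuple(binding.get(term, term) for term in t) for t in templates]


def rewrite_query(patterns, source):
    """Rewrite every pattern of a query and concatenate the results."""
    return [p for pattern in patterns for p in rewrite(pattern, source)]


query = [("?Drug", "hasTarget", "?Protein")]
print(rewrite_query(query, "DrugBank"))
print(rewrite_query(query, "KEGG"))
```

One generic edge becomes a single triple pattern for DrugBank and a two-step path through an intermediate node for KEGG, which is how a model mismatch can be absorbed by the rules rather than by the query author.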
PhLeGrA – Linked Graph Analytics in Pharmacology
Phlegra is a spider genus of the Salticidae family, commonly termed jumping spiders.
k-partite network will be generated as output
Entities and Relations from 4 different sources are retrieved to create the k-partite Network
This k-partite network is generated in < 1 day
Query Federation overcomes heterogeneous Distribution of Entities and Relations
[Figure panels: E1: Drug; R1: Drug hasTarget Protein]
• Similar and completely unique entities and relations exist across data sources
• Necessary to get the complete picture, but also to determine sources of noise
Several underlying mechanisms are possible …
http://onto-apps.stanford.edu/phlegra
A graph analytics module to rank the mechanisms
Preliminary results using network-based Apriori Algorithm for ranking mechanisms
The story so far …
Pattern-based federation methods can retrieve data from multiple sources in the Life Sciences Linked Open Data Cloud, and can enable development of advanced methods for mechanism-based pharmacovigilance.
Acknowledgments
Musen Lab, Stanford
Biomedical Informatics Training Program
Michel Dumontier
US NIH Grant HG004028
PhLeGrA – Linked Graph Analytics in Pharmacology
www.stanford.edu/~maulikrk/research.html
www.onto-apps.stanford.edu/phlegra

Graph Analytics in Pharmacology over the Web of Life Sciences Linked Open Data

  • 1. Graph Analytics in Pharmacology over the Web of Life Sciences Linked Open Data 26th World Wide Web Conference (WWW) Perth, 4th – 8th April 2017 MAU LIK R . KA MDA R A N D MA RK A . MU S E N Stanford Center for Biomedical Informatics Research maulikrk@stanford.edu
  • 2. Linked Open Data (LOD) Cloud Cyganiak, Richard et al. 2014 2
  • 3. Life Sciences Linked Open Data (LSLOD) Cloud … 3
  • 4. 4
  • 5. Semantic Web: Publishing Data as a Graph 5 589.25 mol_weight Gleevec (Mol. Wt.: 589.25 g/mol, Half-Life: 18 hours) inhibits PDGFR, involved in signal transduction. “18 hours” half-life x-ref Gleevec DrugB: DB00619 Gleevec Resource Description Framework (RDF) Inhibits target name type GO:0007165 (Signal Transduction) process PDGFR KEGG: D01441http://bio2rdf.org/kegg:D01441 http://bio2rdf.org/drugbank:DB00619 Uniform Resource Identifier
  • 6. Semantic Web: Querying the Graph < 1000 mol_weight ?half-life x-ref ? ? What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction? SPARQL Query Language 6 Inhibits ?target name type GO:0007165 (Signal Transduction) process
  • 7. Life Sciences Linked Open Data Cloud – query federation • Challenges associated with retrieving information from LSLOD sources • Pattern-based method to rewrite queries across LSLOD sources • An application in mechanism-based pharmacovigilance - PhLeGrA What this talk is about … 7
  • 8. 8
  • 9. Query Federation: Rewriting and executing queries across different sources QUERY FEDERATION Drug  molecular-weight < 1000  target  process = “GO:0007165”  half-life 9Schwarte, et al. ISWC 2012 Drug  molecular-weight < 1000  target  half-life Drug  molecular-weight < 1000  target  process = “GO:0007165” What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction?
  • 10. Heterogeneity in the LSLOD Cloud 10 Gleevec molecular-weight 493.61 Gleevec mol_weight 589.25 Label Mismatch: Different labels for classes, relations and attributes (clinical features) (biological features)
  • 11. Heterogeneity in the LSLOD Cloud 11 Gleevec molecular-weight 493.61 Gleevec mol_weight 589.25 Label Mismatch: Different labels for classes, relations and attributes (clinical features) (biological features)
  • 12. Heterogeneity in the LSLOD Cloud 12 Gleevec PDGFR drug-target Gleevec Inhibits PDGFR target name type PubMed: 21152856 source Model Mismatch: Different graph patterns to capture granularity Gleevec molecular-weight 493.61 Gleevec mol_weight 589.25 Label Mismatch: Different labels for classes, relations and attributes (clinical features) (biological features)
  • 13. Heterogeneity in the LSLOD Cloud 13 • Inconsistent Meanings • Inconsistent URI labels for classes, relations and attributes • Inconsistent Attribute values for entities • Inconsistent Graph patterns for SPARQL queries • Incomplete Relations between entities
  • 14. Query Rewriting fails over the LSLOD Cloud What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction? ?s a <Drug> ?s <molecular-weight> ?mw ?s <target> ?protein ?s <half-life> ?hl ?mw < 1000 g/mol ?protein <hasGO> <GO:0007165> ?s a <Drug> {?s <molecular-weight> ?mw} {?s <half-life> ?hl} ?mw < 1000 g/mol ?s a <Drug> {?s <target> ?protein} ?protein <hasGO> <GO:0007165> Query Rewriting 14
  • 15. Using Graph Patterns for Query Rewriting ?Drug DrugBank:drug-target ?Protein ?Drug KEGG:target ?blank KEGG:link ?Protein Mapping Rules: 15 ?Drug hasTarget ?Protein
  • 16. Using Graph Patterns for Query Rewriting ?Drug DrugBank:drug-target ?Protein ?Drug KEGG:target ?blank KEGG:link ?Protein Mapping Rules: What are the half-lives of drugs that have Mol. Wt < 1000 g/mol and inhibit proteins involved in signal transduction? ?s a <Drug> ?s <hasMolWt> ?mw ?s <hasTarget> ?protein ?s <hasHalfLife> ?hl ?mw < 1000 g/mol ?protein <hasGO> <GO:0007165> ?s a <Drug> {?s <molecular-weight> ?mw} ?s <drug-target> ?protein {?s <half-life> ?hl} ?mw < 1000 g/mol ?s a <Drug> ?s <mol_wt> ?mw {?s <target> ?protein_blank ?protein_blank <link> ?protein} ?protein <hasGO> <GO:0007165> Query RewriteQuery Rewriting 16 ?Drug hasTarget ?Protein
  • 17. Life Sciences Linked Open Data Cloud – query federation • Challenges associated with retrieving information from LSLOD sources • Pattern-based method to rewrite queries across LSLOD sources • An application in mechanism-based pharmacovigilance - PhLeGrA What this talk is about … 17
  • 18. PhLeGrA – Linked Graph Analytics in Pharmacology 18 Phlegra is a spider genus of the Salticidae family, commonly termed jumping spiders.
  • 19. k-partite network will be generated as output 19
  • 20. Entities and Relations from 4 different sources are retrieved to create the k-partite Network This k-partite network is generated in < 1 day 20
  • 21. Query Federation overcomes the heterogeneous distribution of entities and relations (figure panels: E1: Drug; R1: Drug hasTarget Protein). Both similar and completely unique entities and relations exist across data sources. Federation is necessary to get the complete picture, but also to determine sources of noise.
  • 22. Several underlying mechanisms are possible … http://onto-apps.stanford.edu/phlegra
  • 23. A graph analytics module to rank the mechanisms
  • 24. Preliminary results using network-based Apriori Algorithm for ranking mechanisms
  • 25. The story so far: Pattern-based federation methods can retrieve data from multiple sources in the Life Sciences Linked Open Data Cloud, and can enable development of advanced methods for mechanism-based pharmacovigilance. …
  • 26. Acknowledgments: Musen Lab, Stanford Biomedical Informatics Training Program; Michel Dumontier; US NIH Grant HG004028
  • 27. PhLeGrA – Linked Graph Analytics in Pharmacology: www.stanford.edu/~maulikrk/research.html, www.onto-apps.stanford.edu/phlegra
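
The pattern-based rewriting on slides 15 and 16 can be sketched in a few lines of Python. This is an illustrative sketch only, not the PhLeGrA implementation: the `MAPPING_RULES` table, the `rewrite` function, and the tuple encoding of triple patterns are assumptions made for this example, and the rule for `hasGO` is inferred from the fact that only the second sub-query on slide 16 contains it. Source labels such as `drug-target`, `target`, `link`, and `mol_wt` are taken from the slides.

```python
# Hypothetical mapping rules: each abstract predicate maps, per source, to the
# graph pattern that source actually uses. "?s" and "?o" stand for the subject
# and object of the abstract pattern; any other variable is an intermediate node.
MAPPING_RULES = {
    "hasTarget": {
        "DrugBank": [("?s", "drug-target", "?o")],
        "KEGG": [("?s", "target", "?blank"), ("?blank", "link", "?o")],
    },
    "hasMolWt": {
        "DrugBank": [("?s", "molecular-weight", "?o")],
        "KEGG": [("?s", "mol_wt", "?o")],
    },
    "hasHalfLife": {"DrugBank": [("?s", "half-life", "?o")]},
    "hasGO": {"KEGG": [("?s", "hasGO", "?o")]},  # assumed: only KEGG has it
}


def _bind(term, subj, obj):
    """Substitute the abstract subject/object into one term of a rule pattern."""
    if term == "?s":
        return subj
    if term == "?o":
        return obj
    if term.startswith("?"):
        # Intermediate variable: derive a name from the object it leads to.
        return "?" + obj.lstrip("?") + "_" + term[1:]
    return term


def rewrite(query, source):
    """Rewrite abstract triple patterns into the patterns used by one source.

    Patterns without any rule are kept unchanged; patterns whose rule does not
    cover this source are dropped, since another source must answer them.
    """
    rewritten = []
    for subj, pred, obj in query:
        rules = MAPPING_RULES.get(pred)
        if rules is None:
            rewritten.append((subj, pred, obj))
            continue
        for pattern in rules.get(source, []):
            rewritten.append(tuple(_bind(t, subj, obj) for t in pattern))
    return rewritten


QUERY = [
    ("?s", "a", "Drug"),
    ("?s", "hasMolWt", "?mw"),
    ("?s", "hasTarget", "?protein"),
    ("?s", "hasHalfLife", "?hl"),
    ("?protein", "hasGO", "GO:0007165"),
]

drugbank_query = rewrite(QUERY, "DrugBank")
kegg_query = rewrite(QUERY, "KEGG")
```

In this sketch the numeric filter on molecular weight is left out; it would be applied to the `?mw` bindings after each sub-query returns. Keeping the rules in a plain table means a new source can be added without touching the rewriting logic.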

Editor's Notes

  1. Using Semantic Web technologies, data publishers and other researchers can represent data as graphs to create a machine-readable platform called the Linked Open Data (LOD) cloud. This addresses integrated data analysis and storage problems, and lets users query these datasets without being concerned about the underlying formats or representations. The Semantic Web is the idea of a decentralized, distributed and heterogeneous data space that extends the traditional Web.
  2. We will be focusing on the Life Sciences region of the linked open data cloud … consists of DrugBank … ----- Meeting Notes (3/17/17 13:11) ----- tell them what each source means ----- Meeting Notes (3/24/17 13:53) ----- this representation gives the impression everything is linked up ....
  3. To solve integrative bioinformatics challenges, data publishers have started using Semantic Web technologies to create the biomedical Semantic Web. The Semantic Web is the vision of representing and linking data and knowledge on the Web for web-scale reasoning and inference. Using the Resource Description Framework (RDF), we can publish data as a graph. Attributes and relations that are typically stored in fields and tables in relational databases are converted to nodes and edges with explicit semantics. For example, Gleevec, a drug, has a molecular weight of 589.25. Complex relations, such as Gleevec inhibiting PDGFR, which is involved in signal transduction, can be represented using “blank nodes”. One can link similar entities in different sources using explicitly labeled cross-reference edges. Here we link Gleevec in KEGG, an interaction database, with Gleevec in DrugBank, a drug database. Hence, using RDF, we can link multiple sources without worrying about underlying formats and entity notations. ----- Meeting Notes (3/24/17 13:53) ----- use logo here
  4. Once these graphs are published, they can be queried using SPARQL, a graph query language and another Semantic Web technology. By using appropriate variables, we can retrieve a set of drugs from KEGG that have a molecular weight of < 1000 and inhibit proteins involved in signal transduction. We can query multiple sources on the Semantic Web: for example, we can use the explicit cross-reference edges to navigate to DrugBank to get the half-lives of these drugs. Hence SPARQL facilitates integrated querying of these sources and reconciles similar entities. ----- Meeting Notes (3/24/17 13:53) ----- half-lives .... symbol for kegg and drugbank
  5. ----- Meeting Notes (3/24/17 13:53) ----- larger font size for bullets
  6. Federated querying is a method that decomposes a source query into multiple rewritten sub-queries that can be executed separately over the different graphs. Federated querying is required because some relations or attributes may be unique to a given data source: e.g. the half-life of drugs can only be obtained from DrugBank, whereas the processes in which a protein target is involved can only be obtained from KEGG. Moreover, some relations can be obtained from multiple sources, e.g. molecular weights and protein targets. ----- Meeting Notes (3/24/17 13:53) ----- use logos here ... half-lives read the query and make it clear no one source has everything
  7. There is a lot of heterogeneity in the biomedical Semantic Web, which stems from the heterogeneity in the schemas used to structure the underlying data and knowledge sources. For example, the labels for classes, relations and attributes may differ for the same relations, e.g. the slightly different labels for molecular weight in DrugBank and KEGG. Moreover, the Semantic Web aims to capture the entire granularity of the underlying source, so some relations may be explained in greater detail in some sources than in others. For example, in KEGG, you get details on the type of interaction between Gleevec and PDGFR, as well as the source of the interaction in the literature. This matters because SPARQL querying (or, for that matter, any database querying) requires these labels and patterns to match those in the RDF graphs exactly for the query to retrieve results.
  9. ----- Meeting Notes (3/24/17 13:53) ----- animation does not work ...
  10. ----- Meeting Notes (3/24/17 13:53) ----- make the font bigger; two terms that are said to be owl:sameAs might not be equal; Cloud should be capitalized
  11. ----- Meeting Notes (3/24/17 13:53) --- there should not be animation half lives make the entire figure bigger ----- Meeting Notes (3/24/17 14:06) ----- cloud should be capitalized
  12. ----- Meeting Notes (3/24/17 14:06) ----- mapping rules bigger spend more time talking on it walk them through it ... use two arrows ... and only one label in the left ...
  13. ----- Meeting Notes (3/24/17 14:06) ----- mapping rules bigger spend more time talking on it walk them through it ... use two arrows ... and only one label in the left ...
  14. ----- Meeting Notes (3/24/17 13:53) ----- larger font size for bullets
  15. ----- Meeting Notes (3/24/17 14:06) ----- say that phlegra helps you jump around all sources
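
Notes 3 and 4 describe publishing data as a graph and querying it with variables. A minimal sketch of that idea in plain Python follows. It is not SPARQL and not any library's API: the `TRIPLES` list, the `match` function, and the exact predicate names (`target`, `type`, `link`, `process`) are assumptions chosen for illustration, loosely following the Gleevec example on the slides.

```python
# A tiny graph as (subject, predicate, object) triples. "_:b1" plays the role
# of a blank node that carries the details of the drug-target interaction.
TRIPLES = [
    ("kegg:D01441", "mol_weight", 589.25),
    ("kegg:D01441", "target", "_:b1"),
    ("_:b1", "type", "Inhibits"),
    ("_:b1", "link", "PDGFR"),
    ("PDGFR", "process", "GO:0007165"),
    ("kegg:D01441", "x-ref", "drugbank:DB00619"),
    ("drugbank:DB00619", "half-life", "18 hours"),
]


def is_var(term):
    """Variables are strings that start with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for p_term, t_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            yield from match(rest, triples, candidate)


# Half-lives of drugs with Mol. Wt < 1000 that inhibit a protein involved in
# signal transduction, following the x-ref edge from KEGG to DrugBank.
QUERY = [
    ("?drug", "mol_weight", "?mw"),
    ("?drug", "target", "?b"),
    ("?b", "type", "Inhibits"),
    ("?b", "link", "?protein"),
    ("?protein", "process", "GO:0007165"),
    ("?drug", "x-ref", "?same"),
    ("?same", "half-life", "?hl"),
]

half_lives = [b["?hl"] for b in match(QUERY, TRIPLES) if b["?mw"] < 1000]
```

Because every predicate in `QUERY` must equal a label in `TRIPLES` exactly, renaming `mol_weight` to `molecular-weight` in the data would make the query return nothing, which is the label-mismatch problem described in note 7.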