Day 2 of Computing on the
shoulders of giants:
how existing knowledge is represented
and applied in bioinformatics
Benjamin Good
bgood@scripps.edu
Assistant Professor of the Department of
Molecular and Experimental Medicine
Recap from Day 1
• Make things (articles, genes,
antibodies, etc.) easier to find
• Answer questions
• Generate hypotheses
Controlled vocabularies (MeSH)
Ontologies (Gene Ontology)
knowledge graphs on the Web:
the SPARQL query language
knowledge plus computation =
inference, the ABC model
Computing with knowledge
• Challenges with knowledge graphs
• Too much data
• ->> query, sort, visualize, interact
• Not enough data
• ->> mine for more..
• Goal for practical day: Go beyond PubMed!
• gain hands-on experience using a knowledge graph
• either with tools built for the purpose or with your own code…
Assignment: knowledge graph to hypothesis
• Option 1 Coding
• Implement and apply an ABC Model style hypothesis generating program (can adapt
from example provided)
• explain its logic, explain how you used it to generate a hypothesis, explain the
hypothesis (provide a visual)
• Option 2 Non-coding
• Use a knowledge discovery application(s) (list provided) to define a new hypothesis
• if you can’t think of where to start, try to explain why Metformin may contribute to
cancer survival
• Assignment deliverables: a document containing
• the inputs you gave to your program or the online tool(s) you used
• what was generated in response and the underlying logic
• an image and text describing the results, especially any hypothesis you could derive
• (for Option 1 also submit any code written or files generated as a tar or zip archive)
Online tools for knowledge discovery
• http://knowledge.bio (* we make this one…)
• http://www.biograph.be (this is a good tool, but often breaks down)
• http://epiphanet.uth.tmc.edu (also on the flaky side, but can be
good)
• https://skr3.nlm.nih.gov/SemMed/ (works okay, requires a (free)
account)
• http://arrowsmith.psych.uic.edu (ugly interface, but good tool)
Demos
• http://knowledge.bio
• http://www.biograph.be
• http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/start.cgi
Example question: repurposing all drugs
http://tinyurl.com/hwm9388
[Diagram: ?drug (interacts with) protein (encoded by) gene (genetic association) ?disease; hypothesized link: ?drug treats ?disease?]
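The path in the diagram above could be written as a SPARQL query along the following lines. This is a minimal sketch, not the query behind the tinyurl link: the Wikidata property IDs and the function name `build_repurposing_query` are assumptions for illustration and should be checked on wikidata.org before use.

```python
# Wikidata property IDs below are assumptions; confirm each on wikidata.org.
DRUG_INTERACTS_WITH = "wdt:P129"     # physically interacts with (drug -> protein)
PROTEIN_ENCODED_BY = "wdt:P702"      # encoded by (protein -> gene)
DISEASE_GENETIC_ASSOC = "wdt:P2293"  # genetic association (disease -> gene)
DRUG_TREATS = "wdt:P2175"            # medical condition treated (drug -> disease)


def build_repurposing_query(limit=100):
    """Return a SPARQL query for drug-protein-gene-disease paths
    where the drug is not already recorded as treating the disease."""
    return f"""
SELECT ?drug ?drugLabel ?gene ?geneLabel ?disease ?diseaseLabel WHERE {{
  ?drug {DRUG_INTERACTS_WITH} ?protein .
  ?protein {PROTEIN_ENCODED_BY} ?gene .
  ?disease {DISEASE_GENETIC_ASSOC} ?gene .
  FILTER NOT EXISTS {{ ?drug {DRUG_TREATS} ?disease . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
"""
```

The returned string can be pasted into the query box at http://query.wikidata.org to inspect the results by hand.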
Example program (feel free to follow or adapt
to your interest)
• Example
• Input = a disease (A)
• Output = a ranked list of drugs (C) that might be used for treatment
• Render the results of your workflow as a Cytoscape network that illustrates the
reasoning behind the predictions
• Implementation
• Python
• Use a SPARQL endpoint such as http://query.wikidata.org
• + identify and use another endpoint (e.g. EBI, UniProt)
• ++ access PubMed articles and MeSH indexing
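One possible shape for the Python side of this implementation is sketched below. It assumes the SPARQLWrapper package is installed and that the endpoint returns the standard SPARQL JSON results format. The helper names `run_query` and `bindings_to_rows` and the user-agent string are hypothetical, chosen only for this example.

```python
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def run_query(query, endpoint=WIKIDATA_ENDPOINT):
    """Send a SPARQL query and return the parsed JSON response (needs network)."""
    # Imported here so the helper below works without SPARQLWrapper installed.
    from SPARQLWrapper import SPARQLWrapper, JSON

    # The agent string is a placeholder; use one that identifies your project.
    sparql = SPARQLWrapper(endpoint, agent="abc-model-example/0.1")
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()


def bindings_to_rows(response):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in response["results"]["bindings"]
    ]
```

A typical call would be `rows = bindings_to_rows(run_query(my_query))`, after which the rows can be loaded into a pandas DataFrame or processed directly.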
Python setup
• pip install rdflib SPARQLWrapper pandas …
• Jupyter is hopefully already installed; if not, install it:
http://jupyter.readthedocs.io/en/latest/install.html
• get the notebook from
https://github.com/SuLab/sparql_to_pandas/blob/master/SPARQL_pandas.ipynb
• go to the directory where you put the notebook
• run it with
• > jupyter notebook
• it should then be ready to run
the notebook
• will run a basic search for disease-gene-drug connections in Wikidata
• will sort the results by the number of intervening genes
• will export the data to a tab-delimited file you can view in Excel or a text
editor, or load into Cytoscape
• Your job:
• Run it and extend it by one or more of:
• adapting the query
• changing the way the results are sorted
• working with the output in Cytoscape to produce an informative visualization
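The sort-and-export steps described above can be sketched with the standard library alone. This is not the notebook's code (which uses pandas); the row keys `disease`, `drug`, and `gene` and both function names are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict


def rank_pairs_by_gene_count(rows):
    """Group (disease, drug) pairs and sort by number of distinct linking genes."""
    genes = defaultdict(set)
    for row in rows:
        genes[(row["disease"], row["drug"])].add(row["gene"])
    ranked = [
        (disease, drug, len(linked), sorted(linked))
        for (disease, drug), linked in genes.items()
    ]
    # Most intervening genes first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda r: (-r[2], r[0], r[1]))
    return ranked


def write_tsv(ranked, handle):
    """Write the ranked pairs as tab-delimited text to an open file handle."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["disease", "drug", "n_genes", "genes"])
    for disease, drug, n_genes, gene_list in ranked:
        writer.writerow([disease, drug, n_genes, ";".join(gene_list)])
```

Changing the `key` passed to `sort` is one simple way to experiment with a different ranking, and the resulting file can be imported into Cytoscape as a table.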
example output rendered in cytoscape
Other queries from Day 1 (slides 48-54)
• Drugs that target a cancer and impact a specific biological process
• http://tinyurl.com/j222k6g
• Drugs that might target a new disease linked, via a biological pathway
with shared genes, to the disease the drug is currently used to treat
• http://tinyurl.com/gpfr9kj
Possible inputs for adaptations
• Browse and examine wikidata.org to see what you might make use of
• e.g.
• Type of physical interaction between gene and drug
• Gene ontology annotation (what evidence codes?)
• Disease ontology hierarchy
• Drug characteristics
Other possible knowledge sources
• SPARQL
• UniProt http://sparql.uniprot.org
• EBI SPARQL https://www.ebi.ac.uk/rdf/documentation/sparql-endpoints
• look for unique identifiers on genes and proteins that you can use to link
wikidata content to their content
• Text
• use the NCBI E-utils API to programmatically access PubMed articles and
MeSH indexing http://www.ncbi.nlm.nih.gov/books/NBK25501/
• Can be used to build co-occurrence networks of, e.g., MeSH terms
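A co-occurrence network of MeSH terms could be built roughly as follows. This is a sketch under assumptions: it expects PubMed XML as returned by the E-utils `efetch` service, reads only the `DescriptorName` elements, and counts each pair of terms once per article. The function names are hypothetical, and fetching the XML over the network is left to the reader.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(pmids):
    """Build an efetch URL that returns PubMed records as XML."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def mesh_terms_per_article(xml_text):
    """Return one set of MeSH descriptor names per PubmedArticle element."""
    root = ET.fromstring(xml_text)
    articles = []
    for article in root.iter("PubmedArticle"):
        terms = {d.text for d in article.iter("DescriptorName") if d.text}
        articles.append(terms)
    return articles


def cooccurrence_counts(articles):
    """Count how many articles mention each unordered pair of terms."""
    counts = Counter()
    for terms in articles:
        for pair in combinations(sorted(terms), 2):
            counts[pair] += 1
    return counts
```

Each key in the resulting counter is an edge (term pair) and each value an edge weight, which maps directly onto a Cytoscape edge table.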
Good luck! Ask questions!
ABC ranking algorithms
• Out of all C, which are most strongly
related to A?
• Rank by N shared B concepts
• c2: 4
• c4: 3
• c1: 1
• c3: 1
• c5: 1
• c6: 1
• Next level: adjust to down-weight highly
connected nodes
[Figure: network linking A through intermediate B concepts to candidate C concepts c1 to c6]
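The two ranking ideas above can be sketched in a few lines. The toy graph is hypothetical, constructed only so that the shared-B counts match the list on this slide; the down-weighting shown (each B contributes 1 divided by the number of C concepts it links to) is one simple choice among many, not a method taken from the slides.

```python
from collections import defaultdict

# Hypothetical toy graph chosen to reproduce the counts listed above.
A_TO_B = ["b1", "b2", "b3", "b4", "b5"]
B_TO_C = {
    "b1": ["c1", "c2"],
    "b2": ["c2", "c4"],
    "b3": ["c2", "c3", "c4"],
    "b4": ["c2", "c4", "c5"],
    "b5": ["c6"],
}


def rank_by_shared_b(a_to_b, b_to_c):
    """Rank each C by how many B concepts connect it to A."""
    counts = defaultdict(int)
    for b in a_to_b:
        for c in b_to_c.get(b, []):
            counts[c] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def rank_down_weighted(a_to_b, b_to_c):
    """Rank each C, giving less credit to B concepts that link to many C."""
    scores = defaultdict(float)
    for b in a_to_b:
        targets = b_to_c.get(b, [])
        for c in targets:
            scores[c] += 1.0 / len(targets)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

With the down-weighted score, a C reached through a single highly specific B can outrank a C reached through a B that connects to many other concepts.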
ABC ranking algorithms – advanced (require
large networks to be useful)
• Average Minimum Weight (AMW) (Wren)
• http://bioinformatics.oxfordjournals.org/content/20/3/389.full.pdf
• Linking Term Count with Average Minimum Weight (LTC-AMW)
(Yetisgen-Yildiz and Pratt)
• https://www.researchgate.net/publication/23759128_A_new_evaluation_methodology_for_literature-based_discovery_systems
• Predicate inter-dependence (Rastegar-Mojarad)
• https://s3.amazonaws.com/uploads.hipchat.com/25885/154162/UaGvvQqbrhPBAWN/A%20new%20method.pdf

STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 

Dernier (20)

Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 

Scripps bioinformatics seminar_day_2

  • 1. Day 2 of Computing on the shoulders of giants: how existing knowledge is represented and applied in bioinformatics Benjamin Good bgood@scripps.edu Assistant Professor of the Department of Molecular and Experimental Medicine
  • 2. Recap from Day 1 • Make things (articles, genes, antibodies, etc.) easier to find • Answer questions • Generate hypotheses Controlled vocabularies (MeSH) Ontologies (Gene Ontology) knowledge graphs on the Web: the SPARQL query language knowledge plus computation = inference, the ABC model
  • 3. Computing with knowledge • Challenges with knowledge graphs • Too much data • ->> query, sort, visualize, interact • Not enough data • ->> mine for more • Goal for practical day: Go beyond PubMed! • gain hands-on experience using a knowledge graph • either with tools built for the purpose or with your own code…
  • 4. Assignment: knowledge graph to hypothesis • Option 1 Coding • Implement and apply an ABC Model style hypothesis generating program (can adapt from example provided) • explain its logic, explain how you used it to generate a hypothesis, explain the hypothesis (provide a visual) • Option 2 Non-coding • Use a knowledge discovery application(s) (list provided) to define a new hypothesis • if you can’t think of where to start, try to explain why Metformin may contribute to cancer survival • Assignment deliverables: a document containing • the inputs you gave to your program or the online tool(s) you used • what was generated in response and the underlying logic • an image and text describing the results, especially any hypothesis you could derive • (for Option 1 also submit any code written or files generated as a tar or zip archive)
  • 5. Online tools for knowledge discovery • http://knowledge.bio (* we make this one…) • http://www.biograph.be (this is a good tool, but often breaks down) • http://epiphanet.uth.tmc.edu (also on the flaky side, but can be good) • https://skr3.nlm.nih.gov/SemMed/ (works okay, requires a (free) account) • http://arrowsmith.psych.uic.edu (ugly interface, but good tool)
  • 6. Demos • http://knowledge.bio • http://www.biograph.be • http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/start.cgi
  • 7.
  • 8. Example question: repurposing all drugs http://tinyurl.com/hwm9388 [Diagram: ?drug interacts with protein; protein is encoded by gene; gene has a genetic association with ?disease; so does ?drug treat ?disease?]
  • 9. Example program (feel free to follow or adapt to your interest) • Example • Input = a disease (A) • Output = a ranked list of drugs (C) that might be used for treatment • Render the results of your workflow as a Cytoscape network that illustrates the reasoning behind the predictions • Implementation • Python • Use a SPARQL endpoint such as http://query.wikidata.org • + identify and use another endpoint (e.g. EBI, UniProt) • ++ access PubMed articles and MeSH indexing
  • 10. Python setup • pip install rdflib SPARQLWrapper pandas … • If Jupyter is not already installed, install it: http://jupyter.readthedocs.io/en/latest/install.html • get the notebook from https://github.com/SuLab/sparql_to_pandas/blob/master/SPARQL_pandas.ipynb • go to the directory where you put the notebook • run it with • >jupyter notebook • it should then be ready to run
  • 11. The notebook • will run a basic search for disease-gene-drug connections in Wikidata • will sort the results by the number of intervening genes • will export the data to a tab-delimited file you can view in Excel or a text editor, or load into Cytoscape • Your job: • Run it and extend it by one or more of: • adapting the query • changing the way the results are sorted • working with the output in Cytoscape to produce an informative visualization
  • 12. example output rendered in cytoscape
  • 13. Other queries from Day 1 (slides 48-54) • Drugs that target a cancer and impact a specific biological process • http://tinyurl.com/j222k6g • Drugs that target a new disease linked via biological pathway with shared genes to disease the drug is now used to treat • http://tinyurl.com/gpfr9kj
  • 14. Possible inputs for adaptations • Browse and examine wikidata.org to see what you might make use of • e.g. • Type of physical interaction between gene and drug • Gene ontology annotation (what evidence codes?) • Disease ontology hierarchy • Drug characteristics
  • 15. Other possible knowledge sources • SPARQL • UniProt http://sparql.uniprot.org • EBI SPARQL https://www.ebi.ac.uk/rdf/documentation/sparql-endpoints • look for unique identifiers on genes and proteins that you can use to link Wikidata content to their content • Text • use the NCBI E-utils API to programmatically access PubMed articles and MeSH indexing http://www.ncbi.nlm.nih.gov/books/NBK25501/ • Can be used to build co-occurrence networks of e.g. MeSH terms
  • 16. Good luck! Ask questions!
  • 17. ABC ranking algorithms • Out of all C, which are most strongly related to A? • Rank by N shared B concepts • c2: 4 • c4: 3 • c1: 1 • c3: 1 • c5: 1 • c6: 1 • Next level: adjust to down-weight highly connected nodes [Figure: network of A linked through B concepts to C concepts c1 to c6]
  • 18. ABC ranking algorithms – advanced (require large networks to be useful) • Average Minimum Weight (AMW) (Wren) • http://bioinformatics.oxfordjournals.org/content/20/3/389.full.pdf • Linking Term Count with Average Minimum Weight (LTC-AMW) (Yetisgen-Yildiz and Pratt) • https://www.researchgate.net/publication/23759128_A_new_evaluation_methodology_for_literature-based_discovery_systems • Predicate inter-dependence (Rastegar-Mojarad) • https://s3.amazonaws.com/uploads.hipchat.com/25885/154162/UaGvvQqbrhPBAWN/A%20new%20method.pdf
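
The basic ranking on slide 17 (count the B concepts shared between A and each C) can be sketched in a few lines of Python. This is a minimal illustration, not the code from the course notebook. The B-to-C edges below are hypothetical, chosen only so that the counts match the ones listed on the slide. The down-weighting function uses one simple scheme assumed here (each B contributes 1 divided by the number of C concepts it touches); it is not AMW or LTC-AMW.

```python
from collections import defaultdict

# Hypothetical toy graph: A links to four B concepts, and each B links
# to some C concepts. The edges are invented so that the shared-B counts
# come out as on slide 17 (c2: 4, c4: 3, all others: 1).
A_TO_B = ["b1", "b2", "b3", "b4"]
B_TO_C = {
    "b1": {"c1", "c2"},
    "b2": {"c2", "c3", "c4"},
    "b3": {"c2", "c4", "c5"},
    "b4": {"c2", "c4", "c6"},
}


def rank_by_shared_b(a_to_b, b_to_c):
    """Rank each C by the number of distinct B concepts linking it to A."""
    shared = defaultdict(set)
    for b in a_to_b:
        for c in b_to_c.get(b, ()):
            shared[c].add(b)
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(
        ((c, len(bs)) for c, bs in shared.items()),
        key=lambda item: (-item[1], item[0]),
    )


def rank_downweighted(a_to_b, b_to_c):
    """Assumed scheme: a B that touches many C concepts counts for less.

    Each B adds 1 / (number of C concepts it links to) to every C it reaches.
    """
    scores = defaultdict(float)
    for b in a_to_b:
        targets = b_to_c.get(b, set())
        for c in targets:
            scores[c] += 1.0 / len(targets)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


COUNT_RANKING = rank_by_shared_b(A_TO_B, B_TO_C)
WEIGHTED_RANKING = rank_downweighted(A_TO_B, B_TO_C)

if __name__ == "__main__":
    for c, n in COUNT_RANKING:
        print(f"{c}: {n}")
```

In the drug repurposing example, A would be the disease, the B concepts the intervening genes, and the C concepts the candidate drugs. With the weighted variant, c1 moves ahead of c3, c5 and c6 because its only linking concept, b1, is less widely connected than the others.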

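Slide 15 suggests building co-occurrence networks of MeSH terms from PubMed indexing. The counting step might look like the sketch below, assuming the MeSH terms for each article have already been retrieved (for example through E-utils). The article identifiers and term lists are made up for illustration, and no network call is made.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: article identifier -> MeSH terms indexed on it.
# In practice these would come from PubMed records, not be typed in.
ARTICLE_TERMS = {
    "pmid_1": ["Metformin", "Neoplasms", "AMP-Activated Protein Kinases"],
    "pmid_2": ["Metformin", "Diabetes Mellitus, Type 2"],
    "pmid_3": ["Metformin", "Neoplasms"],
}


def cooccurrence_counts(article_terms):
    """Count how many articles each unordered pair of terms shares."""
    counts = Counter()
    for terms in article_terms.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


EDGES = cooccurrence_counts(ARTICLE_TERMS)

if __name__ == "__main__":
    # Tab-delimited edge list, a format Cytoscape can import.
    for (term_a, term_b), weight in EDGES.most_common():
        print(f"{term_a}\t{term_b}\t{weight}")
```

Each counted pair becomes a weighted edge, so the resulting edge list can serve as the B-level links in an ABC-style search.
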
Editor's notes

  1. This picture is derived from Greek mythology: the blind giant Orion carried his servant Cedalion on his shoulders to act as the giant's eyes.