Mining biological knowledge networks for
gene-phenotype discovery
Keywan Hassani-Pak
http://knetminer.rothamsted.ac.uk/
Plant and Animal Genomes Conference 2017
@KnetMiner
The Genotype to Phenotype Challenge
Genotype
SNPs and Indels
Omics
Includes any ‘omics
Phenotype
Flowering
Defence
Development
Stress tolerance
Biological Knowledge Network
1. Methods to assemble and visualise an integrated
knowledge network of the cell
2. Methods to use the knowledge network to
translate genotype to phenotype
• Free and open source
• Data warehousing using a graph database
• Platform to integrate public and private
datasets in various formats
• Provides a GUI, CLI and APIs for
reproducible data integration workflows
Ondex – Data Integration Platform
Ondex
www.ondex.org
The approach is generic and works similarly for other species
Let’s get a GWAS dataset…
http://plants.ensembl.org/biomart
#SNP=66,816 | #Gene=27,502 | #Phenotype=107
… transform into a network
(SNP)
(Phenotype)
associated
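As a minimal sketch of this step (not the Ondex implementation), each GWAS record can be turned into a typed edge between a SNP node and a Phenotype node. The record fields, the example identifiers and the `associated` edge label used in code are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical GWAS records: (SNP id, phenotype name, p-value).
gwas_rows = [
    ("snp_1", "FLC", 1e-8),
    ("snp_2", "FLC", 3e-6),
    ("snp_2", "LN22", 2e-7),
]

def build_network(rows):
    """Return adjacency lists of typed nodes joined by 'associated' edges."""
    network = defaultdict(list)
    for snp, phenotype, p_value in rows:
        source = ("SNP", snp)
        target = ("Phenotype", phenotype)
        # Keep the p-value as an edge attribute so it can be filtered later.
        network[source].append(("associated", target, {"p_value": p_value}))
    return dict(network)

network = build_network(gwas_rows)
```

Typing every node as a (type, id) pair keeps SNPs and phenotypes distinguishable once genes, publications and other concept types are added to the same network.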
Biological interaction datasets
http://thebiogrid.org
(SNP)
(Phenotype)
associated
… add biological interactions
• Gene-GO
• Gene-Phenotype
Gene knock-out or overexpression
Text mining publications
• Gene-Publication
• Gene-Pathway
• Homology to yeast
• Homology to crops
Wheat
… finally add other open linked data
>500,000 nodes
>1,500,000 links
Genome-scale knowledge network
Relationships in Crop Knowledge Networks
GO
TO
encodes
text-mining
GWAS
P-value 10^-8
41% identity
EnsemblCompara
Genes Homology Annotations
encodes
Inferred from
Mutant Phenotype
PMID: 15598800
Genetics
QTL
GWAS
Marker
Interactions Phenotype
Mutations in TTG2
cause phenotypic
defects seed color
pigmentation.
PMID: 17766401
• Methods needed to evaluate millions of
relationships in the knowledge network, prioritize
genes and extract relevant subnetworks
• Interactive and exploratory tools needed to
enable knowledge discovery and decision
making
• Interpretation should be the task of domain
experts, i.e. biologists!
How to search and interpret too much information?
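One simple way to picture "extracting a relevant subnetwork" is a bounded breadth-first search around a gene of interest. The sketch below is only an illustration under assumed data: the toy graph, the node names and the hop limit are hypothetical and do not describe how KnetMiner selects evidence paths.

```python
from collections import deque

# Hypothetical toy knowledge network as an undirected adjacency map.
graph = {
    "geneA": ["proteinA", "pub1"],
    "proteinA": ["geneA", "GO:term1"],
    "pub1": ["geneA", "trait1"],
    "GO:term1": ["proteinA"],
    "trait1": ["pub1"],
}

def subnetwork(graph, start, max_hops):
    """Collect every node reachable from `start` within `max_hops` edges."""
    seen = {start: 0}  # node -> number of hops from the start node
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen

hops = subnetwork(graph, "geneA", 1)
```

With a hop limit of 1 only the direct evidence around `geneA` is returned; raising the limit pulls in indirect links such as the trait mentioned in a linked publication.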
KnetMiner – Systematic and evidence-based gene discovery
http://knetminer.rothamsted.ac.uk
Web Browser
KnetMiner
Client
KnetMiner
Server
Servlets and JSP Page
Java Socket
Knowledge Graph DB
Ondex API
DHTML
JavaScript
Apache Tomcat
Multithreaded
Java Server
HTML, JSON, XML and images
over HTTP via Ajax
Views
Java Socket
Java Applet
Flash
KnetMiner Software Architecture
Major improvements
to the user-interface.
Re-implemented Java
Applet and Flash
components in
JavaScript.
Now compatible with
most OS and touch
devices.
Which associations (genes) are worth following up?
Often a highly subjective decision
How is genotype translated to phenotype?
Often involves multi-omics interactions
KnetMiner search interface
KnetMiner Outputs
Use Case 1 – Mining GWAS and QTL data
• 96 or 192 Arabidopsis inbred lines
• Genotyped: 250,000 SNPs
• 107 phenotypes were measured
https://arapheno.1001genomes.org/study/1/
o Flowering
o Defence
o Ionomics
o Developmental
• Wilcoxon and EMMA (controls for population structure) statistical tests
GWAS of 107 Phenotypes in Arabidopsis
Atwell et al., Nature 2010
Examples where GWAS results are simple to interpret
Sodium concentration (Na)
Lesioning (LES)
AvrRpm1
Single, sharp peak of
association centred on
causal polymorphism
LD decays within 10 kb on average
in Arabidopsis
Examples where GWAS results are complex to interpret
FLC gene expression (FLC)
Leaf Number (LN22)
Days to flowering (FT Field)
Peaks are diffuse
covering several hundred
kb without a clear centre
Causal polymorphisms do not always
have the strongest association
Using KnetMiner to interpret GWAS results
Wilcoxon
results
EMMA
results
Atwell et al., Nature 2010
Flowering Locus C (FLC) gene expression
Demo: Exploring genes and networks controlling FLC expression
• Petal size QTL in Arabidopsis (in collaboration with John Doonan)
Using KnetMiner to prioritise genes in QTL
Use Case 2 – Mining differentially expressed
genes
White grained wheat is more prone to pre-harvest sprouting (PHS)
• PHS is the result of premature germination of grain in
the ear and results in loss of bread-making quality
• Red grain colour is associated with increased dormancy
and resistance to PHS
• Grain colour is due to proanthocyanidins (condensed
tannins) in the testa
Sprouting
Grain colour
+ = white
o = red
Groos et al. (2002) TAG 104, 39-47
Red grain 20dpa
Andy Phillips
67 down-regulated genes
37 up-regulated genes
Over a hundred statistically significant
genes.
How are these linked to grain colour
and PHS?
Differential Gene Expression Analysis
Google-like search interface
• Search knowledge graph using trait-based
keywords
• Real-time user feedback and query
suggestions
Trait related
keywords
Query term
suggestions
Genes linked to grain colour and/or PHS
Genes with direct or indirect links to grain colour and PHS
KnetMiner methodology
Ondex Text-Mining Plugin
Input data
• 27,416 Arabidopsis gene names from Phytozome
• 52,561 Abstracts from PubMed that contain Arabidopsis
• 22,201 curated citations from TAIR
• 1,349 Trait Ontology terms from Planteome
Hassani-Pak et al., 2010
[Figure: text-mining workflow. Labels: concepts A and B (Gene, TO); Publication; relations occurrs_in and published_in; weighted association network between x and y with example edge label IP=1.7; M=1.2; N=2]
Text-mining output
These steps connect 5,553 Arabidopsis genes to 409 TO terms
based on 18,341 co-citations
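The co-citation idea can be sketched as counting the abstracts in which a gene name and a trait term appear together. This is a simplified illustration, not the Ondex text-mining plugin: the abstracts, identifiers and naive case-insensitive substring matching are assumptions, and the weighting shown on the previous slide is not modelled.

```python
from collections import Counter
from itertools import product

# Hypothetical abstracts and vocabularies; real input would be PubMed text,
# gene names and Trait Ontology terms.
abstracts = {
    "pmid_1": "TTG2 mutants show altered seed color",
    "pmid_2": "FLC controls flowering time",
    "pmid_3": "TTG2 and seed color in the testa",
}
genes = ["TTG2", "FLC"]
traits = ["seed color", "flowering time"]

def co_citations(abstracts, genes, traits):
    """Count abstracts in which a gene name and a trait term co-occur."""
    counts = Counter()
    for text in abstracts.values():
        lowered = text.lower()
        # Naive substring matching; a real pipeline needs proper entity recognition.
        found_genes = [g for g in genes if g.lower() in lowered]
        found_traits = [t for t in traits if t.lower() in lowered]
        for pair in product(found_genes, found_traits):
            counts[pair] += 1
    return counts

pairs = co_citations(abstracts, genes, traits)
```

Each non-zero count corresponds to one candidate Gene-TO link, and the count itself is one possible raw input for an edge weight.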
• Uses TF*IDF to rank documents by their relevance to a search term
• Additionally, considers the properties of gene-evidence networks such as
  o the specificity of documents to a gene
  o the frequency of evidence concepts
• Smart pre-indexing of the knowledge network makes the computation of
the score very fast
Gene Ranking
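For readers unfamiliar with TF*IDF, the following sketch ranks a few toy documents against a single search term using the plain textbook formula. It is not KnetMiner's scoring function: the documents are hypothetical, and the additional gene-evidence factors listed above (document specificity, frequency of evidence concepts) are deliberately left out.

```python
import math
from collections import Counter

# Hypothetical evidence documents, already tokenised.
documents = {
    "doc1": ["grain", "colour", "dormancy", "grain"],
    "doc2": ["flowering", "time", "vernalisation"],
    "doc3": ["grain", "yield", "protein"],
}

def tf_idf(term, doc_id, documents):
    """Plain TF*IDF: term frequency times log inverse document frequency."""
    tokens = documents[doc_id]
    tf = Counter(tokens)[term] / len(tokens)
    containing = sum(1 for doc in documents.values() if term in doc)
    if containing == 0:
        return 0.0  # term absent from the whole collection
    idf = math.log(len(documents) / containing)
    return tf * idf

# Rank documents by their relevance to the search term "grain".
ranked = sorted(documents, key=lambda d: tf_idf("grain", d, documents), reverse=True)
```

Documents that mention the term more often relative to their length rank higher, while terms that occur in every document contribute nothing because their IDF is zero.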
• Web application for very fast search of
large genome-scale knowledge graphs
• Ranking of candidate genes based on
knowledge mining
• Interactive visualisation of genome
and knowledge maps
• Facilitates hypothesis validation and
generation
KnetMiner – Making Gene Discovery Efficient & Fun
http://knetminer.rothamsted.ac.uk/
Acknowledgements
John Doonan
Sergio Feingold
Martin Castellote
Uwe Scholz
Matthias Lange
Andy Law
Keywan Hassani-Pak
Ajit Singh
Marco Brandizi
Monika Mistry
Lisa Lill
Chris Rawlings
Dave Edwards
Philipp Bayer
Misha Kapushesky
Kevin Dialdestoro
@KnetMiner


Editor's notes

  1. This is a reminder that you are scheduled to present in the PAG workshop Saturday, January 14, 2017. The schedule of presenters is as follows.
     10:30 AM QTLNetMiner, interrogate plant and animal knowledge networks (Keywan Hassani-Pak)
     10:50 AM BrAPI, a standard interface for plant databases (Jan Erik Backlund)
     11:10 AM Visualizations of Phenotypic and QTL Data (David Marshall)
     11:30 AM Cyverse Data Commons (Ramona Walls)
     11:50 AM Transplant Integrated Search Using Apache Solr (Paul J. Kersey)
     12:10 PM Wheatis: A Genetics and Genomics Information System for the Wheat Research Community (Hadi Quesneville)
     Connecting Crop Phenotype Data, Saturday, January 14, 2017, Golden Ballroom.
     You can upload your presentation at the Speaker Ready Room in Terrace Salon 2 on Friday until 8pm or Saturday morning starting at 7am. If you have any questions please feel free to send me an email.
     Clay Birkett, Cornell University, USDA, Ithaca, NY
  2. Creating improved crop varieties requires the identification of important traits and the discovery of causal genes. Linking genotype and phenotype is one of the greatest challenges in biology. Many phenotypes are complex, polygenic and the result of complex interactions at the cellular level. We need methods to 1) build knowledge networks through the integration of heterogeneous datasets and 2) search these networks with QTL, SNP, gene expression and keyword queries in order to link genotype to phenotype.
  3. SNP-Phenotype relations (122,919 relations) of significant SNPs (as defined by Ensembl, p-value<0.05?) linked to 107 phenotypes; on average 1,150 SNPs per phenotype. SNP-Gene relations are based on genes in close proximity to SNPs, <1000 bp (96,047 relations). How to integrate GWAS and biological interaction data?
  4. Using Ondex
  5. http://www.sciencedirect.com/science/article/pii/S2212066116300308
  6. Highlight text-mining
  7. Scale…
  8. Worth = has a positive impact on the biological outcome in the whole organism without producing negative side effects. Significant SNPs are rarely located within the causal gene sequence… Consider LD: the closest gene is not always the correct candidate… Consider confounding: the strongest association is not always the main causal effect…
  9. Non-parametric Wilcoxon rank-sum test (F-test for phenotypes that are categorical and not quantitative)
  10. LD in Arabidopsis decays within 10 kb on average https://www.ncbi.nlm.nih.gov/pubmed/17676040
  11. Up to 192 Arabidopsis inbred lines were genotyped for 250k SNPs and phenotyped for 107 traits including flowering, defence, ionomics and development. Phenotype data are available in the AraPheno database: https://arapheno.1001genomes.org/study/1/
  12. Search term: flowering FLC. Integrating our own experimental data with the wealth of published open data. Put your experiment in the context of hundreds of similar experiments… Compare myQTL to other QTL/GWAS and functional genomics studies.
  13. Boosting queries found in the title. TO by Laurel Cooper.
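
Note 3 defines SNP-Gene relations by proximity (<1000 bp). A minimal sketch of such a mapping is given here, assuming a single chromosome and made-up coordinates; the gene and SNP names, the interval representation and the helper functions are hypothetical, not taken from the Ondex workflow.

```python
# Hypothetical coordinates on one chromosome: gene -> (start, end), SNP -> position.
genes = {"geneA": (1000, 3000), "geneB": (10000, 12000)}
snps = {"snp_1": 3500, "snp_2": 6000, "snp_3": 10500}

def distance_to_gene(position, start, end):
    """Distance in bp from a position to a gene interval (0 if inside it)."""
    if position < start:
        return start - position
    if position > end:
        return position - end
    return 0

def snp_gene_relations(snps, genes, max_distance=1000):
    """Link each SNP to genes lying less than `max_distance` bp away."""
    relations = []
    for snp, position in snps.items():
        for gene, (start, end) in genes.items():
            if distance_to_gene(position, start, end) < max_distance:
                relations.append((snp, gene))
    return relations

relations = snp_gene_relations(snps, genes)
```

The nested loop is fine for a toy example; for tens of thousands of SNPs and genes, a sorted index or interval tree per chromosome would be the more practical choice.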