Next-generation text-mining applied to toxicogenomics data analysis

Kristina Hettne
PhD thesis defense
20 December, 2012
Toxicogenomics: study whether a chemical causes damage to genes

Text mining: teach a computer to “read” articles and extract explicit information

Next-generation text mining: teach a computer to find implicit information in articles
Drug safety is essential!
But… how to minimize animal testing?

Image source: The Independent, July 12, 2012
Toxicogenomics data
Interpretation using knowledge from manually curated databases

Image sources: Verhallen and Piersma, 2011; de Jong et al., 2011; http://www.flickr.com/photos/jseita/3764113525/
Interpretation using knowledge from manually curated databases: not sufficient in coverage

We hypothesize that next-generation text mining can increase the information coverage
Next-generation text mining = concept profile matching

[Figure: the information cloud for a gene concept and the information cloud for a chemical concept overlap in their shared concepts]

Image source: Herman van Haagen
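The matching idea can be sketched in a few lines of Python. This is a minimal illustration, assuming that a concept profile is a mapping from concept name to association weight and that two profiles are compared by cosine similarity; the slide does not state which similarity measure the thesis uses, and the concept names and weights below are invented for the example.

```python
import math

def shared_concepts(profile_a, profile_b):
    """Return the concepts that occur in both profiles."""
    return set(profile_a) & set(profile_b)

def match_score(profile_a, profile_b):
    """Cosine similarity between two concept profiles (an assumed measure)."""
    dot = sum(profile_a[c] * profile_b[c]
              for c in shared_concepts(profile_a, profile_b))
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical profiles: concept -> association weight (made-up values).
gene_profile = {"oxidative stress": 0.8, "liver": 0.5, "apoptosis": 0.3}
chemical_profile = {"oxidative stress": 0.6, "liver": 0.8, "kidney": 0.2}

print(sorted(shared_concepts(gene_profile, chemical_profile)))
print(round(match_score(gene_profile, chemical_profile), 3))
```

A gene and a chemical that are never mentioned in the same article can still receive a high score if their profiles share concepts, which is the sense in which the information found is implicit.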
Concepts come from a thesaurus and are identified in text with concept identification software

A good thesaurus = the basis for good concept identification

Image source: Herman van Haagen
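A rough sketch of thesaurus-based concept identification follows. It assumes a thesaurus that maps a concept identifier to its synonymous terms and finds terms by case-insensitive whole-word matching. The identifiers `CHEM:1` and `GENE:1` and the function names are hypothetical, and real concept identification software also has to handle tokenization, spelling variants, and ambiguous terms, which this sketch ignores.

```python
import re

# Hypothetical thesaurus: concept identifier -> synonymous terms.
THESAURUS = {
    "CHEM:1": ["aspirin", "acetylsalicylic acid"],
    "GENE:1": ["PTGS2", "COX-2"],
}

def build_term_index(thesaurus):
    """Map each lower-cased term to its concept identifier."""
    return {term.lower(): cid
            for cid, terms in thesaurus.items()
            for term in terms}

def identify_concepts(text, thesaurus):
    """Return (start offset, concept id, matched text) for each term found."""
    index = build_term_index(thesaurus)
    # Try longer terms first so multi-word terms win over their substrings.
    terms = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    return [(m.start(), index[m.group(1).lower()], m.group(1))
            for m in pattern.finditer(text)]

sentence = "Aspirin inhibits COX-2; acetylsalicylic acid is a synonym."
for offset, concept_id, matched in identify_concepts(sentence, THESAURUS):
    print(offset, concept_id, matched)
```

Both "Aspirin" and "acetylsalicylic acid" resolve to the same concept identifier, which is why the coverage and quality of the synonym lists matter so much for what the software can find.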
Research objectives:
• Investigate information coverage in public biomedical and chemical thesauri and databases
• Provide methods to improve the quality and coverage
• Give recommendations for use
• Investigate added value of next-generation text mining when interpreting toxicogenomics data
Results




A thesaurus of chemical concepts [1] and methods [1,2,3] to prepare a thesaurus to be used with concept identification software

http://www.biosemantics.org/casper
http://www.biosemantics.org/jochem

1. Hettne et al. Bioinformatics, 2009
2. Hettne et al. Journal of Biomedical Semantics, 2010
3. Hettne et al. Journal of Cheminformatics, 2010
A next-generation text mining-based method for interpreting biological data

[Figure panels: Biological data; Statistical test; Next-generation text mining]

This method gives more, and more specific, results [1] than other available tools
http://www.biosemantics.org/weightedglobaltest

1. Jelier R, Goeman JJ, Hettne KM, Schuemie MJ, den Dunnen JT, 't Hoen PA. Briefings in Bioinformatics, 2011
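One way to picture how text-mining output can enter a statistical test is sketched below. It is not the weighted global test from the cited paper: a simple permutation test on a weighted mean stands in for it, and all gene names, statistics, and weights are invented. The assumption is that text mining supplies, for one concept, a weight per associated gene, and that the expression experiment supplies one statistic per measured gene.

```python
import random

def weighted_score(gene_stats, weights):
    """Weighted mean of per-gene statistics; weights come from text mining."""
    total = sum(weights.values())
    return sum(weights[g] * gene_stats[g] for g in weights) / total

def permutation_p_value(gene_stats, weights, n_perm=2000, seed=0):
    """Share of random gene sets that score at least as high as the observed one."""
    rng = random.Random(seed)
    observed = weighted_score(gene_stats, weights)
    genes = list(gene_stats)
    values = list(weights.values())
    hits = 0
    for _ in range(n_perm):
        # Hand the same weights to randomly chosen measured genes.
        sampled = rng.sample(genes, len(values))
        if weighted_score(gene_stats, dict(zip(sampled, values))) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: three strongly responding genes among twenty measured.
gene_stats = {f"g{i}": (5.0 if i < 3 else 0.5) for i in range(20)}
# Hypothetical text-mining weights linking a concept to those three genes.
concept_weights = {"g0": 0.9, "g1": 0.7, "g2": 0.4}

print(weighted_score(gene_stats, concept_weights))
print(permutation_p_value(gene_stats, concept_weights))
```

Using weights instead of a fixed gene list means that genes with a strong literature association to the concept count more than weakly associated ones, so no hard cutoff on set membership is needed.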
Application to toxicogenomics
                            Hettne et al. (submitted)
http://www.biosemantics.org/index.php?page=chemicalresponse-specific-gene-sets
See developmental defects in stem cells instead of in animal embryos

[Figure labels: Embryonic structure; Posterior neuropore open. Panels: A) control group rat embryo; B) triazole-exposed rat embryo]

Image sources: 1. Verhallen and Piersma, 2011; 2. De Jong et al., 2012
Toxicity class prediction (case study: Triazoles)

The chemical-gene matrix is 25 times larger than the one obtained from manual work (Comparative Toxicogenomics Database)

Image source: 1. Verhallen and Piersma, 2011
Conclusions
Next-generation text mining combined with
statistical tests complements, and is
sometimes superior to, manually curated
databases in:
- Relating chemical information to gene
   expression data
- Identifying toxic effects already at the
   gene expression stage
- Discriminating between different classes
   of chemicals
Future
1. Make the method easier to use
(currently being worked on)

2. Apply the method to new drugs
with unknown toxicity

Early prediction of toxicity ->
less animal testing and safer drugs
Thank you to all who made
      this possible!
