Genomics data integration, analysis, and curation using MycoCosm
Robert Riley, Igor Shabalov, Igor Grigoriev
3 April 2012
US DOE Joint Genome Institute
Mission: Genomics user facility in support of DOE missions in:
• bioenergy
• carbon cycling
• biogeochemistry
Programs: Fungal, Plant, Microbial, & Metagenomics
Genome sequencing using latest technology: Illumina HiSeq2000, PacBio
Example: JGI fungal sequencing
Fungi in bioenergy



Lignocellulose
• White rot (Phanerochaete chrysosporium): degrade lignin & cellulose
• Brown rot (Postia placenta): degrade cellulose
Biofuel precursors
Data integration: MycoCosm

MycoCosm: jgi.doe.gov/fungi
120+ genomes; 5K visitors/month
• Comparative View
• Genome-Centric View
Data Integration: External Links
From GenBank to MycoCosm
Curation: user annotation




Users may create improved gene models
Your name here!
Curation: user annotation
[Figure: Statistics on manual curation of gene models, by increasing number of curations. Axes: Organism; Curated genes (log scale, 1 to 10000); Curators (0 to 30).]
Organisms shown: Hebcy1, Hypsu1, Treme1, Agabi_varbis…, Paxin1, Phaca1, Plicr1, Jaaar1, Phlgi1, Bjead1_1, Botbo1, Phlbr1, Gansp1, Wolco1, Conpu1, Cersu1, PleosPC9_1, Gymlu1, SerlaS7_9_2, Aurde1, Punst1, Glotr1_1, Fomme1, PleosPC15_2, Pospl1, Fompi1, Trave1, Schco2, Stehi1, Hetan2, Phchr1, Lacbi2, Dacsp1, Agabi_varbur…, SerlaS7_3_2, Dicsq1
Curation: user annotation
Example: user finds a more sensible gene model

• Compare to ESTs
• Transcript page
• Protein page
• Cluster viewer
• Promote to gene catalog
Analysis: genome-centric view



• GC content
• Sequence conservation
• Gene catalog
• ESTs
• PFAM domains
• BLAST hits
• Alternate gene models
Analysis: comparative view




Analysis: evolution of lignocellulose degradation
• CAZy genes
• Oxidoreductase genes
• CAZy and lignin-degrading genes
Sources: Eastwood et al. Science 2011; Riley et al. in prep
Summary
MycoCosm
• Integrates functional and comparative
  genomic data and analytical tools for
  energy and environment fungi
• Offers tools for community annotation,
  data repository, and manual curation
• Facilitates comparative genome
  analysis
Acknowledgments

Igor V. Grigoriev   Robert Otillar
Henrik Nordberg     Alex Poliakov
Igor Shabalov       Igor Ratnere
Andrea Aerts        Frank Korzeniewski
Mike Cantor         Xueling Zhao
David Goodstein     Tatyana Smirnova
Alan Kuo            Daniel Rokhsar
Simon Minovitsky    Inna Dubchak
Roman Nikitin
Robin A. Ohm
