The first near-complete assembly of
the hexaploid bread wheat genome,
Triticum aestivum
Daniela Puiu
Aleksey Zimin, Richard Hall, Sarah Kingan, Bernardo Clavijo, Steven Salzberg
ICG-12
Oct 27 2017
IGC-12The Wheat Genome 2
Sequencing and Assembly of the
Ancestral and Common Wheat
Aegilops tauschii ssp. strangulata, accession AL8/78
Chinese Spring variety (CS42, accession Dv418)
2013-2017
History of Wheat
~8,000 years ago: spontaneous hybridization
Emmer wheat + goat grass = bread wheat (the world's 3rd cereal crop)
Triticum turgidum + Aegilops tauschii = Triticum aestivum
AABB + DD = AABBDD
Whole Genome => Assisted Breeding => Improved Yield
The Wheat Genome
One of the most complex genomes!
1) Genome size: over 15 billion bases
2) Allohexaploid: six copies of each chromosome
3) >90% repeats
Multiple past attempts to assemble =>
assemblies shorter than the estimated genome size.
New vs Previous Assemblies
[Figure: N50 of Triticum 3.1 (232K) compared with previous assemblies]
Data Reduction
Original Reads   Number   Sum     Coverage   Accuracy
Illumina         7.06G    1Tb     65x        99.5%
PacBio           55.5M    545Gb   36x        87.5%

Processed Seq    Number   Sum     Coverage   Accuracy
super-reads      95.7M    31Gb    2x         99.95%
mega-reads       57M      278Gb   18x        99.65%

MaSuRCA mega-reads hybrid correction
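The coverage column can be checked with a few lines of arithmetic: fold coverage is the total number of sequenced bases divided by the genome size. The sketch below is only an illustration. The talk does not state which genome size was used, so `GENOME_SIZE_BP` is an assumption (the Triticum 3.1 total size reported on the Assembly Statistics slide), and the names `coverage` and `DATASETS` are hypothetical.

```python
# Assumption: genome size taken as the Triticum 3.1 total assembly size;
# the size the authors used for their coverage figures is not stated.
GENOME_SIZE_BP = 15_344_693_583


def coverage(total_bases: float, genome_size: float = GENOME_SIZE_BP) -> float:
    """Return fold coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Total bases per data set, copied from the "Sum" column of the slide.
DATASETS = {
    "Illumina": 1e12,       # 1Tb
    "PacBio": 545e9,        # 545Gb
    "super-reads": 31e9,    # 31Gb
    "mega-reads": 278e9,    # 278Gb
}

if __name__ == "__main__":
    for name, total in DATASETS.items():
        print(f"{name:12s} {coverage(total):5.1f}x")
```

Under that assumed genome size, the rounded results agree with the 65x, 36x, 2x and 18x figures on the slide.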
MaSuRCA mega-reads Correction
Assembly Pipeline
Hybrid branch: Illumina + PacBio => MaSuRCA Correction => Mega-reads => Celera WGS Assembler => Triticum 1.0 => Remove Duplicates => Triticum 2.0
PacBio branch: PacBio => FALCON Correction => pReads => FALCON Assembler => FALCON Trit 0.5 => Arrow Polishing => FALCON Trit 1.0
Triticum 2.0 + FALCON Trit 1.0 => k-mer Analysis => Merge => Triticum 3.1
k-mer Analysis
[Figure: 31-mer frequencies, highlighting the k-mers missing from the PacBio assembly only; axis ticks from 10M to 50M]
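The idea behind this comparison can be sketched in a few lines: collect the k-mers of each assembly and take the set difference. This is a toy illustration, not the tooling used for the talk. At wheat scale a dedicated k-mer counter would be needed, and the function names `canonical`, `kmer_set` and `missing_from` are hypothetical. Treating a k-mer and its reverse complement as the same k-mer is also an assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
DNA = set("ACGT")


def canonical(kmer: str) -> str:
    """Return the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_set(sequences, k=31):
    """Collect the canonical k-mers of an iterable of DNA strings."""
    kmers = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if set(window) <= DNA:  # skip windows containing N or other symbols
                kmers.add(canonical(window))
    return kmers


def missing_from(reference_kmers, assembly_kmers):
    """Return the k-mers present in the reference set but absent from the assembly."""
    return reference_kmers - assembly_kmers


if __name__ == "__main__":
    # Toy contigs with k=4 so the result is easy to follow by hand.
    hybrid = kmer_set(["ACGTACGG"], k=4)
    pacbio = kmer_set(["ACGTACG"], k=4)
    print(missing_from(hybrid, pacbio))
```

In the toy example the only k-mer missing from the shorter contig is the one that covers its truncated end.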
Assembly Merge
Merging of the Hybrid and PacBio assemblies
[Figure: a Triticum 2.0 contig, FALCON contigA and FALCON contigB merged into Triticum 3.1; spans labeled >5Kb]
Assembly Statistics
Assembly          Number    Total size (bp)    N50 size (bp)
Triticum 2.0      375,328   14,395,027,822     75,599
FALCON Trit 1.0   97,809    12,939,100,857     215,314
Triticum 3.1      279,439   15,344,693,583     232,659
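For readers unfamiliar with the N50 column, a minimal sketch of the standard definition follows: sort the contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembled bases. The function name `n50` and the toy lengths are illustrative only and are not taken from the talk.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    The N50 is the largest length L such that contigs of length >= L
    together contain at least half of the total assembled bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one contig length")


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # total of 32 bases
    print(n50(toy_contigs))
```

A higher N50 with a similar total size means the same sequence is captured in fewer, longer contigs, which is the comparison the table makes.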
Run Time: 100 CPU years
Main Steps    Run Time (CPU hrs)   Wall Time (months)
MaSuRCA       100K                 1.5
Celera WGS    470K                 5
FALCON        150K                 0.75
ARROW         160K                 0.75
total         880K                 9

100K CPU hrs = 11.5 years
880K CPU hrs = 100 years
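The conversion from CPU hours to CPU years is simple division by the number of hours in a year. The sketch below assumes a 365-day year (8,760 hours). The per-step figures are copied from the table, while the names `CPU_HOURS` and `cpu_hours_to_years` are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap days (an assumption)

# CPU hours per main step, as listed on the slide.
CPU_HOURS = {
    "MaSuRCA": 100_000,
    "Celera WGS": 470_000,
    "FALCON": 150_000,
    "ARROW": 160_000,
}


def cpu_hours_to_years(cpu_hours: float) -> float:
    """Convert CPU hours to CPU years using a 365-day year."""
    return cpu_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    total = sum(CPU_HOURS.values())
    print(f"total: {total:,} CPU hrs = {cpu_hours_to_years(total):.1f} CPU years")
    print(f"100K CPU hrs = {cpu_hours_to_years(100_000):.1f} CPU years")
```

With this year length, 100K CPU hours comes to about 11.4 years, close to the rounded 11.5 on the slide, and the 880K total comes to roughly 100 years.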
Genome Repetitiveness
[Figure: k-mer uniqueness ratios for wheat, fly, cow, rice, pine and Ae. tauschii]
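The slide does not spell out how the k-mer uniqueness ratio is defined. The sketch below assumes one simple definition: the fraction of distinct k-mers that occur exactly once in the sequence. Under that assumption a highly repetitive genome scores low at a given k. The function name `kmer_uniqueness_ratio` is hypothetical, and the toy sequences stand in for whole genomes.

```python
from collections import Counter


def kmer_uniqueness_ratio(sequence: str, k: int) -> float:
    """Return the fraction of distinct k-mers that occur exactly once.

    Assumed definition for illustration; other definitions (for example,
    the share of the genome covered by unique k-mers) are also in use.
    """
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    if not counts:
        raise ValueError("sequence is shorter than k")
    unique = sum(1 for count in counts.values() if count == 1)
    return unique / len(counts)


if __name__ == "__main__":
    for k in (2, 4, 8):
        print(k, kmer_uniqueness_ratio("ACGTACGT", k))
```

The ratio generally rises with k, because longer k-mers are more likely to span past a repeat into unique flanking sequence.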
Publication
Conclusions
The most challenging genome (we) assembled!
Learning experience!
Assembly quality vs computational resources?
Share your data!
Acknowledgements
Johns Hopkins University: Steven Salzberg, Aleksey Zimin
UC Davis Plant Sciences: Jan Dvorak, Mingcheng Luo
Earlham Institute: Bernardo Clavijo
