Arang Rhie
Adam Phillippy’s Group
Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI
De Novo Assembly of Haplotype-Resolved Genomes and Building a Human Pan-Genome Reference
@ArangRhie
The genome assembly problem
The diploid genome assembly problem
[Diagram: Diploid genome → sequencing → Sequenced reads → assembling → Smashed assembly → phasing (?) → Phased (haploid) assembly]
De novo: from scratch, without looking at the original picture (reference)
Pseudo-haplotype + alts
Why assemble genomes again, de novo?
Asian-specific insertions and their frequencies, found in AK1
Under-Represented Variations in GRCh38
Seo, Rhie, Kim, and Lee et al., De novo assembly and phasing of a Korean human genome, Nature (2016)
Identify haplotype differences
[Figure: panels A and B, Chr. 22]
• CYP2D6 is involved in metabolizing >50% of available drugs
• Genetic variation and copy number affect drug efficacy
CYP2D6*10: Intermediate ~ poor metabolizer
CYP2D6*2: Extensive metabolizer
Seo, Rhie, Kim, and Lee et al., De novo assembly and phasing of a Korean human genome, Nature (2016)
Can we phase across the whole chromosomes?
Seo, Rhie, Kim, and Lee et al., De novo assembly and phasing of a Korean human genome, Nature (2016)
Complete haplotype-resolved assemblies with trio binning
The diploid genome assembly problem
[Diagram: Diploid genome → sequencing → Sequenced reads → assembling → Smashed assembly → phasing (?) → Phased (haploid) assembly]
De novo: from scratch, without looking at the original picture (reference)
Complete haplotypes
The diploid genome assembly problem
[Diagram: Diploid genome → sequencing → two sets of Phased reads → assembling → Paternal assembly and Maternal assembly]
De novo: from scratch, without looking at the original picture (reference)
Trio binning with parental k-mers
• K-mer profiling of each parent (Illumina, 60x): paternal k-mers, maternal k-mers
• K-mer profiling of the child (PacBio, 120x)
• Child's read binning and assembly (canu)
Read bins:
  Paternal reads: 49.6% (67.3x), 10.9 kb
  Maternal reads: 49.3% (66.9x), 11.7 kb
  Remaining reads: 1.1% (1.4x), avg 1.3 kb
Output: Paternal haplotigs, Maternal haplotigs
Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
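The binning step above can be sketched in a few lines of Python. This is a minimal illustration only, not the canu implementation: the function names, the tiny k, and the simple majority rule are assumptions made for the example, and it ignores reverse complements, sequencing errors, and the normalization a production tool would apply.

```python
from typing import Iterable, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of k-mers in seq (forward strand only, for brevity)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def haplotype_specific(
    paternal_reads: Iterable[str], maternal_reads: Iterable[str], k: int
) -> Tuple[Set[str], Set[str]]:
    """Return (paternal-only, maternal-only) k-mer sets from the parental reads."""
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    return pat - mat, mat - pat


def bin_read(read: str, pat_only: Set[str], mat_only: Set[str], k: int) -> str:
    """Assign a child read to the parent whose specific k-mers it shares more of."""
    read_kmers = kmers(read, k)
    p = len(read_kmers & pat_only)
    m = len(read_kmers & mat_only)
    if p > m:
        return "paternal"
    if m > p:
        return "maternal"
    return "unassigned"  # no informative k-mers, or a tie


# Toy data (hypothetical sequences, chosen only to exercise each branch).
pat_only, mat_only = haplotype_specific(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
for read in ["GTACGTA", "CGTTCGT", "ACGTA"]:
    print(read, bin_read(read, pat_only, mat_only, k=4))
```

Each bin of reads would then be assembled separately; reads with no informative k-mers stay unassigned, which corresponds to the small remaining fraction reported on the slide.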
Robust for a wide range of heterozygosity
[Figure: heterozygosity scale with labeled values 0.8%, 1.2%, 1.6%, 0.9%, 1.5%, 0.12%, 0.20%, 0.29%, 0.17%]
*Heterozygosity level estimated with GenomeScope

Sample               Platform              Haplotype (cov.)     NG50 (Mb)
NA12878 (CEU) F      PacBio (WashU)        Maternal (32+9x)     1.2
                                           Paternal (31+9x)     1.2
HG00733 (PUR) F      PacBio 60kb (20kb)    Maternal (44.6x)     19.1
                                           Paternal (43.6x)     23.9
NA19240 (YRI) F      PacBio (WashU)        Maternal (37x)       9.0
                                           Paternal (31x)       3.0
HG002 (Ashkenazi) M  PacBio 15kb CCS       Maternal (11+8x)     20.1
                                           Paternal (11+8x)     16.8
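For readers unfamiliar with the NG50 column: NG50 is the length of the contig at which the contigs, taken from longest to shortest, first cover at least half of the estimated genome size. A minimal sketch follows; the function name and the toy lengths are illustrative assumptions, not values from the slides.

```python
from typing import Iterable


def ng50(contig_lengths: Iterable[int], genome_size: int) -> int:
    """Length of the contig at which the cumulative sum of contigs, sorted
    longest first, reaches half of genome_size. Returns 0 if the assembly
    covers less than half of the genome."""
    half = genome_size / 2
    total = 0
    for length in sorted(contig_lengths, reverse=True):
        total += length
        if total >= half:
            return length
    return 0


# Toy example: 50 + 40 + 30 = 120 >= 100, so NG50 is 30.
print(ng50([10, 20, 30, 40, 50], genome_size=200))
```

Unlike N50, which uses the total assembly length, NG50 uses a fixed genome size, so values for maternal and paternal haplotigs of the same sample are directly comparable.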
A nearly perfect diploid genome
125x PacBio coverage (~60x per haplotype), TrioCanu haplotig NG50 ~70 Mbp, BUSCOs 94%
Maternal (yak), Paternal (highland), Esperanza
GRCh38
Human Pan-Genome Project
Population: http://www.internationalgenome.org/
Initiative to collect diverse, high-quality haplotypes with trio binning
• Illumina WGS for the parents, PacBio and Nanopore for the child
• Pilot: 10 trios selected to maximize non-reference haplotype allele frequency (AF)
Trios by population: 2 PUR, 1 KHV, 3 ACB, 1 MSL, 1 PJL, 1 GWD, 1 CLM
Trios by region: 5 African, 3 American, 1 East Asian, 1 South Asian
What can you see from a phased assembly?
Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
Phasing the MHC region
Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
Maternal
Paternal
Summary
• Diploid assembly is solved by trios
    Trio binning is the current best practice
    All levels of assembly quality improved
    Complete haplotypes will become the new norm
• A human pan-genome reference
    A collection of diverse, high-quality haplotypes
    Including complex heterozygous SVs
VGP GenomeArk: 1st data release
https://vgp.github.io/genomeark
Jennifer Vashon of Maine Department of Inland Fisheries and Wildlife, left, and UMass lynx team coordinator, Tanya Lama, with an adult male lynx from northern Maine whose DNA was used to create first-ever whole genome for the species. The lynx has since been released to the wild. (MassWildlife photo / Bill Byrne)
Acknowledgements
genomeinformatics.github.io
• Adam Phillippy
• Sergey Koren
• Brian Walenz
• Alexander Dilthey
• Brian Ondov
• Jay Ghurye
Korean (AK1): Jeong-Sun Seo, Changhoon Kim, Junsoo Kim, Sangjin Lee
Cattle/pigs: Tim Smith, John Williams
Pan-Genome: Karen Miga, Benedict Paten
NIH NHGRI NISC
VGP Assembly Working Group: Erich Jarvis, Richard Durbin, Gene Myers, Kerstin Howe, Harris Lewin, Olivier Fedrigo, Shane McCarthy, Martin Pippel, Will Chow, Joana Damas
PacBio CCS: Michael Hunkapiller, Paul Peluso, David Rank
We are hiring!
Trio binning is available in https://github.com/marbl/canu
Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
Pseudo-haplotype + alts
Complete haplotypes
Assembly Graph
Smashed haplotypes
Trio-binning outperforms FALCON-Unzip
Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
Primary = Longest path in the graph (pseudo-hap)
Alternate haplotigs = Alternate path in the bubble
Haplotigs = Contigs in each assembly; agree with parental haplotypes (phased)
[Figure: two panels, TrioCanu and FALCON-Unzip; axes: Brahman-specific k-mer counts and Angus-specific k-mer counts]
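A plot of this kind places each contig by how many parent-specific k-mers it contains. The sketch below shows one way such a pair of counts could be computed for a single contig. It assumes the two parent-specific k-mer sets have already been built; the function name and toy sequences are hypothetical, and it scans the forward strand only.

```python
from typing import Set, Tuple


def hapmer_counts(
    contig: str, angus_only: Set[str], brahman_only: Set[str], k: int
) -> Tuple[int, int]:
    """Count positions in the contig whose k-mer is Angus-specific or
    Brahman-specific. Returns (angus_count, brahman_count)."""
    angus = brahman = 0
    for i in range(len(contig) - k + 1):
        kmer = contig[i:i + k]
        if kmer in angus_only:
            angus += 1
        elif kmer in brahman_only:
            brahman += 1
    return angus, brahman


# Toy example: a contig carrying only Angus-specific 4-mers.
print(hapmer_counts("GTACGTAC", {"GTAC", "TACG"}, {"CGTT"}, k=4))
```

A well-phased haplotig would sit close to one axis (counts from one parent only), while a contig that mixes haplotypes would have substantial counts from both.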
Phasing NA12878
Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
TrioCanu, FALCON-Unzip, Supernova
Phasing the F1 Cattle
Kronenberg and Kingan et al., FALCON-Phase: Integrating PacBio and Hi-C data for phased diploid genomes, bioRxiv (2018)
[Figure: three panels, TrioCanu, FALCON-Unzip, and FALCON-Phase; axes: Brahman and Angus (0 to 3,000,000); point size: Contig Size (20,000,000 to 80,000,000); legends: Contig (Hap1, Hap2) and Assembly (Angus, Brahman)]

Haplotype-Resolved Genome Assemblies

  • 1. Arang Rhie Adam Phillippy’s Group Genome Informatics Section, Computational and Statistical Genomics Branch, NHGRI De Novo Assembly of Haplotype-Resolved Genomes and Building a Human Pan-Genome Reference @ArangRhie
  • 3. The diploid genome assembly problem Diploid genome Smashed Assembly Phased (haploid) assembly phasing ? De novo: From scratch, without looking at the original picture (reference) Sequenced reads sequencing assembling Pseudo-haplotype + alts
  • 4. Why assemble genomes again, de novo?
  • 5. Asian specific insertions and the frequency, found from AK1 Under-Represented Variations in GRCh38 Seo, Rhie, Kim, and Lee et al., De novo assembly and phasing of a Korean human genome, Nature (2016)
  • 6. Identify haplotype differences A B • CYP2D6 is involved in metabolizing >50% of available drugs • Genetic variation and copy number affects drug efficacy CYP2D6*10: Intermediate ~ poor metabolizer CYP2D6*2: Extensive metabolizer Seo, Rhie, Kim, and Lee et al., De novo assembly and phasing of a Korean human genome, Nature (2016) Chr. 22
  • 7. Can we phase across the whole chromosomes? Seo, Rhie, Kim, and Lee et al., De novo assembly and phasing of a Korean human genome, Nature (2016)
  • 9. The diploid genome assembly problem Diploid genome Smashed Assembly Phased (haploid) assembly phasing ? De novo: From scratch, without looking at the original picture (reference) Sequenced reads sequencing assembling Complete haplotypes
  • 10. The diploid genome assembly problem Diploid genome Paternal assembly ? De novo: From scratch, without looking at the original picture (reference) Phased reads sequencing assembling Phased reads Maternal assembly assembling
  • 11. Trio binning with parental k-mers Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018) Paternal haplotigs Maternal haplotigs • K-mer profiling of each parent (Illumina, 60x) Paternal k-mers Maternal k-mers • K-mer profiling of the child (PacBio, 120x) Child Paternal Maternal 49.6% (67.3x) 10.9 kb 49.3% (66.9x) 11.7 kb 1.1% (1.4x), avg 1.3 kb Paternal reads Maternal reads • Childs’ read binning and assembling canu
  • 12. Robust for a wide range of heterozygosity 0.8% 1.2% 1.6%0.9% *Heterozygosity level estimated with GenomeScope 1.5% 0.12 % 0.20 % 0.29 % NA12878 (CEU) F HG00733 (PUR) F NA19240 (YRI) F HG002 (Ashkenazi) M Platform PacBio (WashU) PacBio 60kb (20kb) PacBio (WashU) PacBio 15kb CCS Haplotype (Cov.) Maternal (32+9x) Paternal (31+9x) Maternal (44.6x) Paternal (43.6x) Maternal (37x) Paternal (31x) Maternal (11+8x) Paternal (11+8x) NG50 (Mb) 1.2 1.2 19.1 23.9 9.0 3.0 20.1 16.8 0.17 %
  • 13. A nearly perfect diploid genome: 125x PacBio coverage (~60x per haplotype), TrioCanu haplotig NG50 ~70 Mbp, BUSCOs 94%. Maternal (yak), Paternal (highland). [Figure labels: Esperanza, GRCh38]
  • 14. Human Pan-Genome Project. Population: http://www.internationalgenome.org/ Initiative to collect diverse, high-quality haplotypes with trio binning • Illumina WGS for the parents, PacBio and Nanopore for the child • Pilot: 10 trios selected to maximize non-ref haplotype AF: 2 PUR, 1 KHV, 3 ACB, 1 MSL, 1 PJL, 1 GWD, 1 CLM (5 African, 3 American, 1 East Asian, 1 South Asian)
  • 15. What can you see from a phased assembly? Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018)
  • 16. Phasing the MHC region Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018) Maternal Paternal
  • 17. Summary • Diploid assembly is solved by trios: trio binning is current best practice; all levels of assembly quality improved; complete haplotypes will become the new norm • A human pan-genome reference: a collection of diverse, high-quality haplotypes, including complex heterozygous SVs
  • 18. VGP GenomeArk: 1st data release https://vgp.github.io/genomeark Jennifer Vashon of Maine Department of Inland Fisheries and Wildlife, left, and UMass lynx team coordinator, Tanya Lama, with an adult male lynx from northern Maine whose DNA was used to create first-ever whole genome for the species. The lynx has since been released to the wild. (MassWildlife photo / Bill Byrne)
  • 19. Acknowledgements • genomeinformatics.github.io: Adam Phillippy, Sergey Koren, Brian Walenz, Alexander Dilthey, Brian Ondov, Jay Ghurye • Korean (AK1): Jeong-Sun Seo, Changhoon Kim, Junsoo Kim, Sangjin Lee • Cattle/pigs: Tim Smith, John Williams • Pan-Genome: Karen Miga, Benedict Paten • NIH, NHGRI, NISC • VGP Assembly Working Group: Erich Jarvis, Richard Durbin, Gene Myers, Kerstin Howe, Harris Lewin, Olivier Fedrigo, Shane McCarthy, Martin Pippel, Will Chow, Joana Damas • PacBio CCS: Michael Hunkapiller, Paul Peluso, David Rank • We are hiring! Trio binning is available in https://github.com/marbl/canu
  • 21. Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018). [Diagram labels: Assembly Graph, Smashed haplotypes, Pseudo-haplotype + alts, Complete haplotypes]
  • 22. Trio-binning outperforms FALCON-Unzip. Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018) • Primary = Longest path in the graph (pseudo-hap) • Alternate haplotigs = Alternate path in the bubble • Haplotigs = Contigs in each assembly agree with parental haplotypes (Phased). [Plots for TrioCanu and FALCON-Unzip: Angus specific k-mer counts versus Brahman specific k-mer counts]
  • 23. Phasing NA12878. Koren and Rhie et al, De novo assembly of haplotype-resolved genomes with trio binning, Nat. Biotech (2018). [Panels: TrioCanu, FALCON-Unzip, Supernova]
  • 24. Phasing the F1 Cattle. Kronenberg and Kingan et al., FALCON-Phase: Integrating PacBio and Hi-C data for phased diploid genomes, bioRxiv (2018). [Plots for TrioCanu, FALCON-Unzip, and FALCON-Phase: Brahman versus Angus axes, points scaled by contig size; legends: Contig, Hap1, Hap2 and Assembly, Angus, Brahman]
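
The trio binning steps on slide 11 (profile parental k-mers, then bin the child's long reads before assembly) can be illustrated with a toy sketch. This is not the TrioCanu implementation: the function names `kmers`, `parent_specific_kmers`, and `bin_read` are hypothetical, the reads are made-up strings, and the scoring rule (hits divided by the size of each parent-specific k-mer set) is an assumption made for illustration. Canonical k-mers and filtering of low-count error k-mers are left out.

```python
def kmers(seq, k):
    """Yield every k-mer of seq (forward strand only, for simplicity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def parent_specific_kmers(paternal_reads, maternal_reads, k):
    """Return (paternal-only, maternal-only) k-mer sets from parental reads."""
    pat = {km for read in paternal_reads for km in kmers(read, k)}
    mat = {km for read in maternal_reads for km in kmers(read, k)}
    return pat - mat, mat - pat


def bin_read(read, pat_only, mat_only, k):
    """Assign one child read to a haplotype bin (illustrative scoring rule)."""
    pat_hits = sum(km in pat_only for km in kmers(read, k))
    mat_hits = sum(km in mat_only for km in kmers(read, k))
    # Assumption: normalize hits by set size so the parent with more
    # specific k-mers does not win by default.
    pat_score = pat_hits / len(pat_only) if pat_only else 0.0
    mat_score = mat_hits / len(mat_only) if mat_only else 0.0
    if pat_score > mat_score:
        return "paternal"
    if mat_score > pat_score:
        return "maternal"
    return "unclassified"


# Toy data: the two parents differ at a single base.
pat_only, mat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
child_reads = ["GTACGT", "CGTTCG", "ACGT"]
bins = {read: bin_read(read, pat_only, mat_only, k=4) for read in child_reads}
print(bins)
```

Each bin would then be assembled separately, which is why the resulting contigs are haplotigs rather than a mix of both parental haplotypes. Reads with no parent-specific k-mers stay unclassified, consistent with the small, short-read third bin reported on slide 11.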

Editor's notes

  1. Before phasing, short reads indicated a copy gain in CYP2D6. After phasing, we identified that the duplicated copy of CYP2D6 was fused with the last exon of CYP2D7 on haplotype B.
  2. ref allele = #1 weight by non-ref allele’s global AF
  3. Genes in black are typed genes, with correct calls for both haplotypes, all in phase. There is 1 indel in DQB1. The assembly confirms the expected absence of DRB3 in the mother and its presence in the father, but also shows that other sequence is present there, so it is not a simple deletion.