An Introduction to NGS
(Next Generation Sequencing)
        François Paillier - 22/02/2011
Plan
  [ Reminder about Sanger Sequencing ]



• NGS Definition
• Overview of NGS technologies
• NGS Applications & examples
• Conclusion

 NOT discussed here : Sequence accuracy, assembly and sampling ; NGS
 data Analysis & BioInformatics tools
A word about Sanger Sequencing
  (First generation sequencing machine  Video)
3730xl
Principle (only the tube G + dideoxyG)
From gel to capillary

Still a gold standard, but capillary sequencing has reached its technical
limits (costs and performance will remain unchanged)
Short Reminder about « Classical » Assembly projects

Project workflow : Sample → Libraries → n Sequencing sub-projects → Assembly
→ Finishing: Draft (Q40) → Annotation → Annotated Genome

Each sub-project : Target genome → Cloning → SubTargets (BACs, cosmids, ..)
→ Clone selection & Sequencing → Assembly

Other strategy : WGS
Sequencing, what for ?
                          Assembly projects for example

           In bioinformatics, sequence assembly refers to aligning and merging fragments of
           a much longer DNA sequence in order to reconstruct the original sequence. This
           is needed because DNA sequencing technology cannot read whole genomes in one go,
           but only small pieces of between 20 and 1000 bases, depending on the technology
           used. Typically the short fragments, called reads, result from shotgun sequencing
           of genomic DNA or of gene transcripts (ESTs).
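
The merging of overlapping reads described above can be sketched with a toy greedy assembler. This is only an illustration under strong assumptions: the reads are error-free, come from one strand, and the function names (overlap, greedy_assemble) and the example reads are hypothetical. Real assemblers tolerate sequencing errors and use far more elaborate strategies.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no overlaps left: remaining contigs are separated by gaps
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs


# Hypothetical reads, made up for the example
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

When no overlap of at least min_len bases remains, the loop stops and several contigs are returned, which corresponds to the gaps shown in the assembly figure.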



[Figure : Target genome → Sequencing → reads → Assembly → assembled reads.
Labels : gaps, 4X local coverage, consensus, scaffold]
Vocabulary that should be kept in mind
                  in the sequencing field

•   Assembly : the result of clustering sequences based on their local
    similarity
•   Contig : A set of overlapping DNA segments
•   Coverage (in sequencing) : The mean number of times a nucleotide is
    sequenced in a genome (example: 10X coverage)

•   Scaffolds : A series of contigs that are in the right order but not necessarily
    connected in one contiguous stretch
•   Mate pairs : Sequences known to be in the 3′ and 5′ of a contig from a single
    clone




•   WGS = Whole genome shotgun sequencing strategy
•   ESS = Environmental Shotgun Sequencing
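
The coverage definition above can be turned into a quick calculation: mean coverage = (number of reads × read length) / genome size. The sketch below assumes reads of uniform length; the function names and the 5 Mb genome are hypothetical values chosen for illustration.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Mean number of times each nucleotide is sequenced."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical project: 5 Mb genome, 400 nt reads, 10X target
genome_size = 5_000_000
n = reads_needed(10, 400, genome_size)
print(n, mean_coverage(n, 400, genome_size))
```

This gives the mean only. Local coverage varies along the genome, which is why gaps can remain even at a comfortable mean coverage.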
NGS = Next Generation
         Sequencing



    After PCR,
THE new revolution
   in Biology ?
NGS Synonym is : High-throughput Sequencing
                     (HTS)




                                    Third Generation :
                                    NGS = HTS, Single
                                    Molecule Sequencing

                     Second Generation :
                     NGS = Massively
                     Parallel Sequencing
First Generation :
SANGER Sequencing
Overview of current NGS technologies
                 (Second generation sequencing machines)

Year 2005* : Roche, 454 GS-FLX (Titanium protocol a must)
     2006  : Illumina, GA1 then GA2
     2007  : Applied Bio., SOLiD v3

Each machine with different :
- Throughput
- Sequence accuracy
- Data formats (and programs)

*NGS “proof of principle” was done in 2000 by Lynx Therapeutics, which published and marketed "MPSS", a parallelized,
adapter/ligation-mediated, bead-based sequencing technology, launching "next-generation" sequencing.
Throughput per
Illumina Channel
HOW is it
Possible ? 
NGS Principle

Building sequencing devices at nanoscale

 Polony : Discrete clonal amplifications of a single DNA molecule,
  grown in a gel matrix. The clusters can then be individually
  sequenced, producing short reads. Polony-based sequencing is
  the basis of most second generation sequencers


A typical NGS Workflow is:
1) Library construction
2) Template CLONAL amplification
3) Massively PARALLEL sequencing
High Parallelism is Achieved in
     Polony Sequencing

Sanger                   Polony
Generation of Polony array: DNA
       Beads (454, SOLiD)




DNA Beads are generated using Emulsion PCR
Generation of Polony array: DNA
     Beads (454, SOLiD)




   DNA Beads are placed in wells
Sequencing: Pyrosequencing (454)

                                          DNA Polymerase




« pyrogram » / « Flowgram »
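
A minimal sketch of how a flowgram is read: nucleotides are flowed one species at a time, and the light signal of each flow is roughly proportional to the number of bases incorporated. The flow order "TACG", the signal values and the function name are assumptions made for the example, not taken from a real instrument run.

```python
# Hypothetical flow order, for illustration only
FLOW_ORDER = "TACG"


def decode_flowgram(signals, flow_order=FLOW_ORDER):
    """Convert per-flow light intensities into a base sequence.

    Rounding each signal gives the homopolymer length for that flow;
    a signal near 0 means no base was incorporated.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)  # round to the nearest whole number of bases
        bases.append(nucleotide * count)
    return "".join(bases)


# Made-up intensities: T, none, CC, G, none, AAA (signal 3.40), none, G
signals = [1.02, 0.05, 2.10, 0.98, 0.03, 3.40, 0.00, 1.05]
print(decode_flowgram(signals))
```

A signal such as 3.40 sits between 3 and 4 bases, so long homopolymers are easily called one base too short or too long.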
454 Process : Emulsion PCR &
       Pyrosequencing




              Titanium =
              Read lengths approx. 400 nt
              1 million reads / Run
               400 Mb / day


              VIDEOs
              About Pyrosequencing 1’53’’: <here>

              Summary about GS Flex 4’34’’: <click
              here>
454 GS FLX Titanium

Pros :
- No more cloning step
- From purified DNA to sequencing
- Fits on the laboratory bench top / small
- LONG sequences (400 nt)
- GS Junior system not so expensive
- Capabilities : multiplexing & paired-ends

Cons :
- Seq. accuracy not so high (especially in case of homopolymers).
  Main error type is indel
- Cost : approx. 20K€ / Gb. Cost per base is cheaper than Sanger but
  still high compared with other next-gen machines

Well fitted to :
         - Prokaryotic genome sequencing
         - RNA-seq
Illumina* : Bridge PCR




                GA2x Version =
                Read lengths
                approx. 100 nt
                240 million reads
                 1500 Mb / day
                 30000 Mb / Run
Generation of Polony array: Bridge-
          PCR (Solexa)




DNA fragments are attached to array and
        used as PCR templates

<Watch VIDEO : Related Links  Video : Genome
    Analyzer workflow  Panel technology>
Illumina Chemistry : 4-color DNA sequencing-by-synthesis using reversible
              terminators with removable fluorescent dyes




A flow cell : 8 lanes
Illumina seq. Accuracy
Illumina Throughput
Illumina

Pros :
- No more cloning step
- From purified DNA to sequencing
- Fits on the laboratory bench top / small
- Good sequence accuracy
- Capabilities : multiplexing & paired-ends
- Cost : approx. 2K€ / Gb, cost per base is cheaper than 454
- Most widely used NGS platform
- Requires least DNA

Cons :
- Machine is very expensive
- Main error type is mismatch
- Read lengths are still too short. Not fitted to big genomes (repeats)
- Poor coverage of AT rich regions

Well fitted to :
         - Prokaryotic genome sequencing
         - RNA-seq, ChIP-Seq,
         Methyl-Seq
SOLiD system : 4-color DNA Sequencing by
                 Ligation




                         SOLiD V3 =
                         Read lengths
                         approx. 50 nt
                         400 million reads
                          1500 Mb / day
                          20000 Mb / Run
                          1500€ / Gb

                         <Watch Video> 4’46’’
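
As a sanity check on the per-run figures quoted for the three platforms, yield is simply number of reads × read length. The sketch below uses the read counts and read lengths from the slides; the dictionary and function names are hypothetical.

```python
# (reads per run, read length in nt), as quoted on the platform slides
PLATFORMS = {
    "454 Titanium": (1_000_000, 400),
    "Illumina GA2x": (240_000_000, 100),
    "SOLiD V3": (400_000_000, 50),
}


def yield_mb(n_reads, read_length_nt):
    """Sequence output of one run, in megabases."""
    return n_reads * read_length_nt / 1_000_000


for name, (n_reads, read_length) in PLATFORMS.items():
    print(f"{name}: {yield_mb(n_reads, read_length):,.0f} Mb per run")
```

The SOLiD product matches the 20000 Mb / Run quoted above. For the GA2x the product (24000 Mb) is lower than the 30000 Mb / Run on its slide, so the quoted figures are best read as approximate.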
Sequencing by ligation rxn: Fluorescently Labeled
             Nucleotides (ABI SOLiD)




Complementary strand elongation: DNA Ligase
Sequencing by ligation ABI SOLiD
Sequencing: Fluorescently Labeled Nucleotides
                (ABI SOLiD)




            5 reading frames, each
             position is read twice
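
Each position is read twice because every color reports on two adjacent bases (di-base probes). A minimal sketch of this two-base "color space" encoding follows, assuming the usual scheme in which the color is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3). The function names and the example sequence are hypothetical, and the mapping of colors to actual dyes is left out.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(seq):
    """One color (0-3) per pair of adjacent bases."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the bases from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


seq = "TACGGA"  # made-up sequence
colors = encode_colors(seq)
print(colors)
print(decode_colors(seq[0], colors))
```

Decoding is a chain: one wrong color changes every base after it. A real SNP, by contrast, changes two adjacent colors, which is how the two can be told apart.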
Sequencing: Fluorescently Labeled
    Nucleotides (ABI SOLiD)
SOLiD

Pros :
- No more cloning step
- From purified DNA or RNA to seq.
- Fits on the laboratory bench top / small
- Good sequence accuracy
- Capabilities : multiplexing & paired-ends
- Cost : approx. 1.5K€ / Gb, cost per base is cheaper than Illumina

Cons :
- This technology is NOT intuitive
- Machine is VERY expensive
- HUGE amount of data produced (1500 Gb !!)
- Long run times
- It has been demonstrated that certain reads don’t match the reference !

Well fitted to :
         - Resequencing
         - RNA-seq, ChIP-Seq,
         Methyl-Seq
Focusing NGS effort on predefined targets :
« Target Enrichment » Technology (Capture Array)
Focusing NGS effort on predefined targets :
« Target Enrichment » Technology (Capture Beads)
Summary : NGS Workflows




   +/- Target Enrichment Strategy

                                    Source: BCG
Prokaryotic Genome Sequencing
 Project as a mix of NGS technologies




Conclusion :
- High quality drafts can be produced for small genomes without any Sanger data input.
- We found that 454 GSFLX and Solexa/Illumina show great complementarity in producing
  large contigs and supercontigs with a low error rate.
NGS Applications
DEEPER insight into biological processes
BROADER sampling of populations (cells, viruses,
Ecosystems…)



   • In different fields…
      – Metagenomics
      – Genomics
      – Transcriptomics
      – proteomics
Genome
  * De Novo Sequencing
  * Targeted Resequencing (SNP, Indel, CNV)
  * Whole Genome Resequencing
  * Metagenome analyses
Transcriptome
  * Gene Expression Profiling
  * Small RNA Analysis
  * Whole Transcriptome Analysis
Epigenome
  * Chromatin Immunoprecipitation Sequencing (ChIP-Seq)
  * Methylation Analysis

…for different purposes…
  - Towards Personalized Medicine
  - Biodiversity assessment
  - De Novo Sequencing of prokaryotic or eukaryotic genomes (or re-sequencing)
  - RNA-Seq → Annotation of eukaryotic genomes
  - SNP calling : identification of mutations
  - ChIP-Seq : identification of DNA/protein interactions
What is the current impact of
                NGS on Biology ?



• Both transcriptomics and genomics can now be
  addressed using one technology with higher
  accuracy and robustness (instead of, for example,
  Sanger sequencing + µarrays) (example : RNA-Seq)
• SNP calling can rely on ultra-deep assemblies
• Whole genome overview of transcription factors
  binding sites
• Biodiversity assessment ( Metagenomics projects)
• And so much more…
About whole-exome sequencing :
 « For the First Time, DNA Sequencing Technology
                Saves A Child's Life »




« Proponents of genetic medicine say DNA sequencing is the future of
medicine and that soon every truly sick person will have his or her genome
sequenced. Critics cite privacy concerns and note that genetic mutations and
variations don’t necessarily lead to medical outcomes. Whatever the
position, it’s hard to argue that this isn’t good news: the first child – plagued
by undiagnosable illness – has been saved by DNA sequencing.
That may be a bit of a strong statement – six-year-old Nicholas Volker is
doing well, though complications could soon arise. But it’s highly likely that
the sequencing of young Nicholas’s genome saved his life. »
<Link> <Article>
                     Mayer & Al. Genetics IN Medicine • Volume xx, Number xx, 01 2011
What’s Next ?

Second Generation : NGS = Massively Parallel Sequencing (polony sequencing)
- Roche, 454 GS-FLX Titanium
- Illumina, GA2
- Applied BioSys, SOLiD v3

Next : IonTorrent, PacBio

Third Generation :
- Single Molecule Sequencing (no bias)
- Faster
- Cheaper (or not)
- 1000€ Human genome ?
Conclusion : impact of NGS
               Global shift to sequencing-based technologies

- Great improvements on-going : higher throughput, longer reads
- Is it the end of µarrays ? A sub-part of NGS workflows restricted to target-enrichment ?
- Is it the end of forward genetics ? Reverse genetics only ?
- Biologists' education should integrate NGS knowledge
- Is it the end of « Big sequencing centers » ? A change in their mission ?


Next bottleneck : BioInformatics


- Storing data is a problem (SRA soon down ?) AND IT network speeds are
FAR too low → very difficult to share NGS data → fridges instead of
disks !?
- Analyzing data is a problem → great improvements, but a lot of work
still remains to be done
Thanks
for your attention !
Technology Summary

                     Read length   Sequencing        Throughput   Cost
                                   technology        (per run)    (per 1 Mbp)*
   Sanger            ~800 bp       Sanger            400 kbp      $500
   454               ~400 bp       Polony            500 Mbp      $60
   Solexa/Illumina   75 bp         Polony            20 Gbp       $2
   SOLiD             75 bp         Polony            60 Gbp       $2
   Helicos           30-35 bp      Single molecule   25 Gbp       $1

*Source: Shendure & Ji, Nat Biotech, 2008
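
To see what the cost column means for a project, the raw sequencing cost can be estimated as genome size × coverage × cost per Mbp. The sketch below uses the per-Mbp costs from the table above; the 5 Mbp genome, the 10X coverage and the function name are hypothetical, and library preparation, labour and analysis are not counted.

```python
# Cost per 1 Mbp in USD, from the Technology Summary table
COST_PER_MBP_USD = {
    "Sanger": 500,
    "454": 60,
    "Solexa/Illumina": 2,
    "SOLiD": 2,
    "Helicos": 1,
}


def project_cost_usd(genome_size_mbp, coverage, cost_per_mbp):
    """Raw sequencing cost: total Mbp sequenced times cost per Mbp."""
    return genome_size_mbp * coverage * cost_per_mbp


# Hypothetical prokaryotic project: 5 Mbp genome at 10X coverage
for name, cost in COST_PER_MBP_USD.items():
    print(f"{name}: ${project_cost_usd(5, 10, cost):,}")
```

A real comparison would use a different coverage per platform, since short reads usually need deeper coverage than long ones.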
NGS Technology Comparison
           ABI SOLiD               Illumina GA               454 Roche FLX
Cost       SOLiD 4: $495k          IIe: $470k                Titanium: $500k
           SOLiD PI: $240k         IIx: $250k
                                   HiSeq: $690k
Quantity   SOLiD 4: 100Gb          IIe: 20 - 38 Gb           450 Mb
of Data    SOLiD PI: 50Gb          IIx: 50 – 95 Gb
per run                            HiSeq: 200Gb +

Run Time   7 Days                  4 Days                    9 Hours

Pros       Low error rate due to   Most widely used          Short run time. Long
           dibase probes           NGS platform.             reads better for de
                                   Requires least DNA        novo sequencing
Cons       Long run times. Has     Least multiplexing        Expensive reagent
           been demonstrated       capability of the 3.      cost. Difficulty
           certain reads don’t     Poor coverage of AT       reading
           match reference         rich regions              homopolymer
                                                             regions
                                                     Source: The University of Western Ontario

 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 

Ngs intro_v6_public

  • 1. An Introduction to NGS (Next Generation Sequencing) François Paillier - 22/02/2011
  • 2. Plan [ Reminder about Sanger Sequencing ] • NGS Definition • Overview of NGS technologies • NGS Applications & examples • Conclusion NOT discussed here : Sequence accuracy, assembly and sampling ; NGS data Analysis & BioInformatics tools
  • 3. A word about Sanger Sequencing (first-generation sequencing machine). 3730xl. Principle (only the tube G + dideoxyG). From gel to capillary. Still a gold standard, but capillary sequencing has reached its technical limits (costs and performance will remain unchanged).
  • 4. Short reminder about « classical » assembly projects. Left-hand flow: Sample → Libraries; n sequencing sub-projects; Assembly; Finishing: Draft (Q40); Annotation; Annotated Genome. Right-hand flow: Target genome; Cloning; SubTargets (BACs, cosmids, ..); Clone selection & Sequencing; Assembly. Other strategy: WGS.
  • 5. Sequencing, what for? Assembly projects, for example. In bioinformatics, sequence assembly refers to aligning and merging fragments of a much longer DNA sequence in order to reconstruct the original sequence. This is needed as DNA sequencing technology cannot read whole genomes in one go, but rather small pieces between 20 and 1000 bases, depending on the technology used. Typically the short fragments, called reads, result from shotgun sequencing of genomic DNA or of gene transcripts (ESTs). [Figure: target genome; sequencing reads; assembly; assembled reads with gaps; 4X local coverage; consensus; scaffold]
  • 6. Vocabulary that should be kept in mind in the sequencing field • Assembly: result of the sequence clustering based on their local similarity • Contig: a set of overlapping DNA segments • Coverage (in sequencing): the mean number of times a nucleotide is sequenced in a genome (example: 10X coverage) • Scaffolds: a series of contigs that are in the right order but not necessarily connected in one contiguous stretch • Mate pairs: sequences known to be in the 3′ and 5′ of a contig from a single clone • WGS = Whole Genome Shotgun sequencing strategy • ESS = Environmental Shotgun Sequencing
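The idea of "aligning and merging fragments" can be illustrated with a toy greedy overlap merge. This is only a minimal sketch, not the method used by any production assembler: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all hypothetical, and the sketch assumes error-free reads from a single strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap until none overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # whatever cannot be merged stays a separate contig (a gap)
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping reads taken from the made-up sequence ATGGCGTGCAATCC
print(greedy_assemble(["ATGGCGT", "GCGTGCA", "TGCAATCC"]))  # ['ATGGCGTGCAATCC']
```

Reads that share no overlap of at least `min_len` bases are returned as separate contigs, which mirrors the gaps shown between assembled reads on the slide.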
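The coverage definition above can be turned into a back-of-the-envelope calculation. The sketch below assumes uniform sampling and uses the usual approximation coverage = (number of reads × read length) / genome size; the function names and the 4 Mb genome size are illustrative assumptions, while the read count and read length are the 454 Titanium figures quoted later in the deck.

```python
import math


def mean_coverage(n_reads: int, read_length_nt: int, genome_size_nt: int) -> float:
    """Mean number of times each nucleotide is sequenced, assuming uniform sampling."""
    return n_reads * read_length_nt / genome_size_nt


def reads_needed(target_coverage: float, read_length_nt: int, genome_size_nt: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size_nt / read_length_nt)


# Hypothetical 4 Mb prokaryotic genome, 1 million reads of about 400 nt
print(mean_coverage(1_000_000, 400, 4_000_000))  # 100.0, i.e. 100X
print(reads_needed(10, 400, 4_000_000))          # 100000 reads for 10X
```

This is a mean value only: local coverage varies along the genome, which is why some regions remain as gaps even at high mean coverage.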
  • 7. NGS = Next Generation Sequencing After PCR, THE new revolution in Biology ?
  • 8. NGS synonym: High-throughput Sequencing (HTS). Third generation: NGS = HTS, Single Molecule Sequencing. Second generation: NGS = Massively Parallel Sequencing. First generation: Sanger sequencing.
  • 9. Overview of current NGS technologies (second-generation sequencing machines), by year: 2005*: Roche, 454 GS-FLX Titanium; 2006: Illumina, GA1 then GA2; 2007: Applied Bio., SOLiD v3. Protocol a must. Each machine has a different throughput, sequence accuracy, and data formats (and programs). *NGS “proof of principle” was done in 2000 by Lynx Therapeutics, which published and marketed "MPSS", a parallelized, adapter/ligation-mediated, bead-based sequencing technology, launching "next-generation" sequencing.
  • 12. NGS Principle: building sequencing devices at nanoscale → Polony: discrete clonal amplifications of a single DNA molecule, grown in a gel matrix. The clusters can then be individually sequenced, producing short reads. Polony-based sequencing is the basis of most second-generation sequencers. A typical NGS workflow is: 1) Library construction 2) Template CLONAL amplification 3) Massively PARALLEL sequencing
  • 13. High Parallelism is Achieved in Polony Sequencing Sanger Polony
  • 14. Generation of Polony array: DNA Beads (454, SOLiD) DNA Beads are generated using Emulsion PCR
  • 15. Generation of Polony array: DNA Beads (454, SOLiD) DNA Beads are placed in wells
  • 16. Sequencing: Pyrosequencing (454) DNA Polymerase « pyrogram » / « Flowgram »
  • 17. 454 Process: Emulsion PCR & Pyrosequencing. Titanium = read lengths approx. 400 nt; 1 million reads / run → 400 Mb / day.
  • 19. 454 GS FLX Titanium. Strengths: no more cloning step; from purified DNA to sequencing; fits the laboratory bench top / small; LONG sequences (400 nt); GS Junior system not so expensive; capabilities: multiplexing & paired-ends. Weaknesses: sequence accuracy not so high (especially in case of homopolymers) → main error type is indel; cost: approx. 20K€ / Gb; cost per base is cheaper than Sanger but still high compared with other NextGen machines. Well fitted to: prokaryotic genome sequencing; RNA-seq.
  • 20. Illumina*: Bridge PCR. GA2x version = read lengths approx. 100 nt; 240 million reads → 1500 Mb / day → 30000 Mb / run.
  • 21. Generation of Polony array: Bridge PCR (Solexa). DNA fragments are attached to the array and used as PCR templates.
  • 22. Illumina Chemistry: 4-color DNA sequencing-by-synthesis using reversible terminators with removable fluorescent dyes. [Figure: a flow cell with 8 lanes]
  • 25. Illumina. Strengths: no more cloning step; from purified DNA to sequencing; fits the laboratory bench top / small; good sequence accuracy; capabilities: multiplexing & paired-ends; cost: approx. 2K€ / Gb, cost per base is cheaper than 454; most widely used NGS platform; requires least DNA. Weaknesses: machine is very expensive; main error type is mismatch; read lengths are still too short; not fitted to big genomes (repeats); poor coverage of AT-rich regions. Well fitted to: prokaryotic genome sequencing; RNA-seq, ChIP-Seq, Methyl-Seq.
  • 26. SOLiD system: 4-color DNA Sequencing by Ligation. SOLiD V3 = read lengths approx. 50 nt; 400 million reads → 1500 Mb / day → 20000 Mb / run → 1500€ / Gb.
  • 27. Sequencing by ligation reaction: Fluorescently Labeled Nucleotides (ABI SOLiD). Complementary strand elongation: DNA ligase.
  • 29. Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD) 5 reading frames, each position is read twice
  • 30. Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)
  • 31. SOLiD. Strengths: no more cloning step; from purified DNA or RNA to sequencing; fits the laboratory bench top / small; good sequence accuracy; capabilities: multiplexing & paired-ends; cost: approx. 1.5K€ / Gb, cost per base is cheaper than Illumina. Weaknesses: this technology is NOT intuitive; machine is VERY expensive; HUGE amount of data produced (1500 Gb !!); long run times; it has been demonstrated that certain reads don’t match the reference! Well fitted to: resequencing; RNA-seq, ChIP-Seq, Methyl-Seq.
  • 32. Focusing NGS effort on predefined targets : « Target Enrichment » Technology (Capture Array)
  • 33. Focusing NGS effort on predefined targets : « Target Enrichment » Technology (Capture Beads)
  • 34. Summary : NGS Workflows +/- Target Enrichment Strategy Source: BCG
  • 35. Prokaryotic Genome Sequencing Project as a mix of NGS technologies Conclusion : - High quality drafts can be produced for small genomes without any Sanger data input. - We found that 454 GSFLX and Solexa/Illumina show great complementarity in producing large contigs and supercontigs with a low error rate.
  • 36. NGS Applications DEEPER insight into biological processes BROADER sampling of populations (cells, viruses, Ecosystems…) • In different fields… – Metagenomics – Genomics – Transcriptomics – proteomics
  • 37. Genome: * De novo sequencing * Targeted resequencing (SNP, indel, CNV) * Whole genome resequencing * Metagenome analyses. Transcriptome: * Gene expression profiling * Small RNA analysis * Whole transcriptome analysis. Epigenome: * Chromatin immunoprecipitation sequencing (ChIP-Seq) * Methylation analysis. …for different purposes…: - Towards personalized medicine - Biodiversity assessment - De novo sequencing of prokaryotic or eukaryotic genomes (or re-sequencing) - RNA-Seq → annotation of eukaryotic genomes - SNP calling: identification of mutations - ChIP-Seq: identification of DNA/protein interactions
  • 39. What is the current impact of NGS on Biology? • Both transcriptomics and genomics can now be addressed using one technology with higher accuracy and robustness (instead of Sanger sequencing + µarrays, e.g.) (→ example of RNA-Seq) • SNP calling can rely on ultra-deep assemblies • Whole-genome overview of transcription factor binding sites • Biodiversity assessment (→ metagenomics projects) • And so much more…
  • 40. About whole-exome sequencing : « For the First Time, DNA Sequencing Technology Saves A Child's Life » « Proponents of genetic medicine say DNA sequencing is the future of medicine and that soon every truly sick person will have his or her genome sequenced. Critics cite privacy concerns and note that genetic mutations and variations don’t necessarily lead to medical outcomes. Whatever the position, it’s hard to argue that this isn’t good news: the first child – plagued by undiagnosable illness – has been saved by DNA sequencing. That may be a bit of a strong statement – six-year-old Nicholas Volker is doing well, though complications could soon arise. But it’s highly likely that the sequencing of young Nicholas’s genome saved his life. » Article: Mayer et al., Genetics in Medicine, Volume xx, Number xx, 01 2011
  • 41. What’s next? Third generation (IonTorrent, PacBio): - Single molecule sequencing (no bias) - Faster - Cheaper (or not) - 1000€ human genome? Second generation (Roche, 454 GS-FLX Titanium; Illumina, GA2; Applied BioSys, SOLiD v3): NGS = Massively Parallel Sequencing (polony sequencing)
  • 42. Conclusion: impact of NGS. Global shift to sequencing-based technologies. → Great improvements ongoing: higher throughput, longer reads. → Is it the end of µarrays? A sub-part of NGS workflows restricted to target enrichment? → Is it the end of forward genetics? Reverse genetics only? → Biologists' education should integrate NGS knowledge. → Is it the end of « big sequencing centers »? A change in their mission? Next bottleneck: bioinformatics. - Storing data is a problem (SRA soon down?) AND IT network speed is FAR too low → very difficult to share NGS data → fridges instead of disks!? - Analyzing data is a problem → great improvements, but a lot of work remains to be done.
  • 45. Technology Summary (per platform: read length; sequencing technology; throughput per run; cost per 1 Mbp*)
      - Sanger: ~800 bp; Sanger; 400 kbp; $500
      - 454: ~400 bp; Polony; 500 Mbp; $60
      - Solexa/Illumina: 75 bp; Polony; 20 Gbp; $2
      - SOLiD: 75 bp; Polony; 60 Gbp; $2
      - Helicos: 30-35 bp; Single molecule; 25 Gbp; $1
      *Source: Shendure & Ji, Nat Biotech, 2008
  • 46. NGS Technology Comparison
      - ABI SOLiD. Cost: SOLiD 4: $495k; SOLiD PI: $240k. Quantity of data per run: SOLiD 4: 100 Gb; SOLiD PI: 50 Gb. Run time: 7 days. Pros: low error rate due to dibase probes. Cons: long run times; it has been demonstrated that certain reads don’t match the reference.
      - Illumina GA. Cost: IIe: $470k; IIx: $250k; HiSeq: $690k. Quantity of data per run: IIe: 20 - 38 Gb; IIx: 50 - 95 Gb; HiSeq: 200 Gb+. Run time: 4 days. Pros: most widely used NGS platform; requires least DNA. Cons: least multiplexing capability of the 3; poor coverage of AT-rich regions.
      - 454 Roche FLX. Cost: Titanium: $500k. Quantity of data per run: 450 Mb. Run time: 9 hours. Pros: short run time; long reads better for de novo sequencing. Cons: expensive reagent cost; difficulty reading homopolymer regions.
      Source: The University of Western Ontario
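To give a feel for the cost-per-Mbp figures in the Technology Summary slide (Shendure & Ji, 2008), the following sketch scales them to a whole project. It is a rough illustration only: the function name `sequencing_cost_usd`, the 3200 Mbp approximation of the human genome, and the 30X coverage are assumptions, and the calculation ignores instrument, labor, and analysis costs.

```python
# Cost per 1 Mbp in USD, as listed on the Technology Summary slide
COST_PER_MBP_USD = {
    "Sanger": 500,
    "454": 60,
    "Solexa/Illumina": 2,
    "SOLiD": 2,
    "Helicos": 1,
}


def sequencing_cost_usd(platform: str, genome_size_mbp: float, coverage: float = 1.0) -> float:
    """Raw sequencing cost = cost per Mbp x genome size (Mbp) x mean coverage."""
    return COST_PER_MBP_USD[platform] * genome_size_mbp * coverage


HUMAN_GENOME_MBP = 3200  # assumption: roughly 3.2 Gbp

for platform in COST_PER_MBP_USD:
    cost = sequencing_cost_usd(platform, HUMAN_GENOME_MBP, coverage=30)
    print(f"{platform}: ${cost:,.0f} for 30X")
```

Under these assumptions the per-base cost gap between Sanger and the second-generation platforms spans more than two orders of magnitude, which is the shift the deck describes.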