SlideShare une entreprise Scribd logo
1  sur  57
Next Generation Sequencing
and
Bioinformatics Analysis Pipelines
Adam Ameur
National Genomics Infrastructure
SciLifeLab Uppsala
adam.ameur@igp.uu.se
INF R S RU URE
A T CT
ENOMI S
G C
ATCA GT
N ION L
GA CTAC
AT A
What is an analysis pipeline?
• Basically just a number of steps to analyze data
Raw data
(FASTQ reads)
Intermediate
result
Intermediate
result
Final
result
• Pipelines can be simple or very complex…
Today’s lecture
• Sequencing instruments and ‘standard’ pipelines
– IonTorrent/PacificBiosciences
• In-house bioinformatics pipelines, some examples
• News and future plans
Ion Torrent - PGM/Proton
• The Ion Torrent System
– 6 instruments available in Uppsala, early access users
– Two instruments: PGM and Proton
– For small scale (PGM) and large scale sequencing (Proton)
– Rapid sequencing (run time ~ 2-4 hours)
– Measures changes in pH
– Sequencing on a chip
Personal Genome Machine (PGM) Ion Proton
Ion Torrent output
• Ion Torrent throughput
~ from 10Mb to >10Gb, depending on the chip
• Read lengths: 400bp (PGM), 200bp (Proton)
• Output file format: FASTQ
• What can we use Ion Torrent for?
– Anything, except perhaps very large genomes
2 human exomes (PI chip)
2 human transcriptomes
1 human genome = 6 PI chips
PI
(Proton)
318
(PGM)
316
(PGM)
314
(PGM)
NEW! P2
(Proton)
Ion Torrent analysis workflow
Downstream analysis
Torrent Server
.fastq
.bam
.fasta
Torrent Suite Software
Torrent Suite Software Analysis
• Plug-ins within the Torrent Suite Software
– Alignment
• TMAP: Specifically developed for Ion Torrent data
– Variant Caller
• SNP/Indel detection
– Assembler
• MIRA
– Analysis of gene panels (Human exomes, cancer panels, etc…)
• SNP/Indel detection in amplicon-seq data
– Analysis of human transcriptomes
– …
• Analyses are started automatically when run is complete!
Pacific Biosciences
• Pacific Biosciences
– Installed summer 2013
– Single molecule sequencing
– Very long read lengths (up to 60 kb)
– Rapid sequencing
– Can detect base modifications (i.e. methylation)
– Relatively low throughput
PacBio output
• PacBio throughput
~ 1 Gb/SMRT cell
• PacBio read lengths: 500bp-60kb
• Output file format: FASTQ
• What can we use PacBio for?
– Anything, except really large genomes
~1 bacterial genome
~1 bacterial transcriptome
1 human genome = 100 SMRT cells?
PacBio analysis workflow
In-house PacBio cluster
Downstream analysis
.fastq
.bam
.fasta
SMRT analysis portal
SMRT analysis pipelines
• Mapping
• Variant calling
• Assembly
• Scaffolding
• Base modifications
What about Illumina?
• There are many different pipelines for Illumina…
In-house development of pipelines
• The standard analysis pipelines are nice…
… but sometimes we need to do own developments
… or adapt the pipelines to our specific applications
• Some examples of in-house developments:
I. Building a local variant database (WES/WGS)
II. Assembly of genomes using long reads
III. Clinical sequencing – Leukemia Diagnostics
*
Example I:
Computational infrastructure for exome-seq data
Background: exome-seq
• Main application of exome-seq
– Find disease causing mutations in humans
• Advantages
– Allows investigate all protein coding sequences
– Possible to detect both SNPs and small indels
– Low cost (compared to WGS)
– Possible to multiplex several exomes in one run
– Standardized work flow for data analysis
• Disadvantage
– All genetic variants outside of exons are missed (~98%)
Exome-seq throughput
• We are producing a lot of exome-seq data
– 4-6 exomes/day on Ion Proton
– In each exome we detect
• Over 50,000 SNPs
• About 2000 small indels
=> Over 1 million variants/run!
• In plain text files
How to analyze this?
• Traditional analysis - A lot of filtering!
– Typical filters
• Focus on rare SNPs (not present in dbSNP)
• Remove FPs (by filtering against other exomes)
• Effect on protein: non-synonymous, stop-gain etc
• Heterozygous/homozygous
– This analysis can be automated (more or less)
Result:
A few candidate
causative
SNP(s)!
Start:
All identified SNPs
Why is this not optimal?
• Drawbacks
– Work on one sample at time
• Difficult to compare between samples
– Takes time to re-run analysis
• When using different parameters
– No standardized storage of detected SNPs/indels
• Difficult to handle 100s of samples
• Better solution
– A database oriented system
• Both for data storage and filtering analyses
Analysis: In-house variant database
Ameur et al., Database Journal, 2014
*CANdidate Variant Analysis System and Data Base
*
CanvasDB - Filtering
CanvasDB - Filtering speed
• Rapid variant filtering, also for large databases
A recent exome-seq project
• Hearing loss: 2 affected brothers
– Likely a rare, recessive disease
=> Shared homozygous SNPs/indels
• Sequencing strategy
– TargetSeq exome capture
– One sample per PI chip
homoz
homoz
heteroz
heteroz
Filtering analysis
• CanvasDB filtering for a variant that is…
– rare
• at most in 1% of ~700 exomes
– shared
• found in both brothers
– homozygous
• in brothers, but in no other samples
– deleterious
• non-synonymous, frameshift, stop-gain, splicing, etc..
Filtering results
• Homozygous candidates
– 2 SNPs
• stop-gain in STRC
• non-synonymous in PCNT
– 0 indels
• Compound heterozygous candidates (lower priority)
– in 15 genes
=> Filtering is fast and gives a short candidate list!
STRC - a candidate gene
=> Stop-gain in STRC is likely to cause hearing loss!
Brother #1
Brother #2
Unrelated sample
IGV visualization: Stop gain in STRC
STRC, validation by Sanger
Brother #1
Brother #2
Stop-gain site
• Sanger validation
• Does not seem to be homozygous..
– Explanation: difficult to sequence STRC by Sanger
• Pseudo-gene with very high similarity
• New validation showed mutation is homozygous!!
CanvasDB – some success stories
Solved cases, exome-seq - Niklas Dahl/Joakim Klar
Neuromuscular disorder NMD11
Artrogryfosis SKD36
Lipodystrophy ACR1
Achondroplasia ACD2
Ectodermal dysplasia ED21
Achondroplasia ACD9
Ectodermal dysplasia ED1
Arythroderma AV1
Ichthyosis SD12
Muscular dystrophy DMD7
Neuromuscular disorder NMD8
Welanders myopathy (D) W
Skeletal dysplasia SKD21
Visceral myopathy (D) D:5156
Ataxia telangiectasia MR67
Exostosis SKD13
Alopecia AP43
Epidermolysis bullosa SD14
Hearing loss D:9652
CanvasDB - Availability
• CanvasDB system now freely available on GitHub!
Next Step: Whole Genome Sequencing
Capacity of HiSeq X Ten: 320 whole human genomes/week!!!
• More work on pipelines and databases needed!!!
• New instruments at SciLifeLab for human WGS…
Example of large WGS project:
Generation of a high quality genetic variant
database for the Swedish population
Whole Genome Sequencing of ~1000 individuals
Aim: to build a resource for Swedish research and health care
PI: Ulf Gyllensten
Example II:
Assembly of genomes using Pacific Biosciences
Genome assembly using NGS
• Short-read de novo assembly by NGS
– Requires mate-pair sequences
• Ideally with different insert sizes
– Complicated analysis
• Assembly, scaffolding, finishing
• Maybe even some manual steps
=> Rather expensive and time consuming
• Long reads really makes a difference!!
– We can assemble genomes using PacBio data only!
HGAP de novo assembly
• HGAP uses both long and shorter reads
Long reads (seeds)
Short reads
PacBio – Current throughput & read lengths
• >10kb average read lengths! (run from April 2014)
• ~ 1 Gb of sequence from one PacBio SMRT cell
PacBio assembly analysis
• Simple -- just click a button!!
PacBio assembly, example result
• Example: Complete assembly of a bacterial genome
PacBio assembly – recent developments
• Also larger genomes can be assembled by PacBio..
Next step: assembly of large genomes
• We need to install such pipelines at UPPNEX!!
405,000 CPUh used on Google Cloud!
• A computational challenge!!
Example III:
Clinical sequencing for Leukemia Treatment
Chronic Myeloid Leukemia
• BCR-ABL1 fusion protein – a CML drug target
www.cambridgemedicine.org/article/doi/10.7244/cmj-1355057881
The BCR-ABL1 fusion protein can
acquire resistance mutations
following drug treatment
BCR-ABL1 workflow – PacBio Sequencing
From sample to results: < 1 week
1 sample/SMRT cell
Cavelier et al., BMC Cancer, 2015
BCR-ABL1 mutations at diagnosis
BCR ABL1
PacBio sequencing generates ~10 000X coverage!
Sample from time of diagnosis:
BCR-ABL1 mutations in follow-up sample
Sample 6 months later
Mutations acquired in fusion transcript.
Might require treatment with alternative drug.
BCR ABL1
BCR-ABL1 dilution series results
• Mutations down to 1% detected!
Summary of mutations in 5 CML patients
Mutations mapped to protein structure
BCR-ABL1 - Compound mutations
P1 61m T315I
F359C
91.8%
4.2%
3.9%
P1 68.5m T315I
F359C
H396R
D276G
93.7%
2.0%
1.1%
2.0%
1.1%
BCR-ABL1 - Multiple isoforms in one individual!
BCR-ABL1 – Isoforms and protein structure
Next step: A clinical diagnostics pipeline!
Step1. Create CCS reads
Step2. Run mutation analysis
Step3. Upload to result server
Collaboration with Wesley Schaal & Ola Spjuth, UPPNEX/Uppsala Univ
Reporting system for mutation results
Ion Torrent – Ongoing developments
Ion S5 XL system Ion Proton PII chip (EA)
PacBio – Ongoing developments
• Fast track clinical samples
- Preparing workflows for rapid sequencing
- Organ transplantation, diagnostics, outbreaks, ...
• New chemistry and active loading of SMRT cells
- Improved quality, longer reads
- Increased throughput (2016?)
• New analysis algorithms
INFR S RU URE
A T CT
ENOMI S
G C
ATCA GT
N ION L
GA CTAC
AT A
Thank you!

Contenu connexe

Similaire à AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt

Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...QIAGEN
 
Apac distributor training series 3 swift product for cancer study
Apac distributor training series 3  swift product for cancer studyApac distributor training series 3  swift product for cancer study
Apac distributor training series 3 swift product for cancer studySwift Biosciences
 
CS Lecture 2017 04-11 from Data to Precision Medicine
CS Lecture 2017 04-11 from Data to Precision MedicineCS Lecture 2017 04-11 from Data to Precision Medicine
CS Lecture 2017 04-11 from Data to Precision MedicineGabe Rudy
 
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...VHIR Vall d’Hebron Institut de Recerca
 
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseDevelopment of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Databasenist-spin
 
Making powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyondMaking powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyondAdamCribbs1
 
New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...
New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...
New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...Andor Kiss
 
Next generation sequencing methods
Next generation sequencing methods Next generation sequencing methods
Next generation sequencing methods Mrinal Vashisth
 
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Integrated DNA Technologies
 
Long read sequencing - WEHI bioinformatics seminar - tue 16 june 2015
Long read sequencing -  WEHI  bioinformatics seminar - tue 16 june 2015Long read sequencing -  WEHI  bioinformatics seminar - tue 16 june 2015
Long read sequencing - WEHI bioinformatics seminar - tue 16 june 2015Torsten Seemann
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngsDin Apellidos
 
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel WeitschekGenomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel WeitschekData Driven Innovation
 
The Wide Spectrum of Next-Generation Sequencing Assays with VarSeq
The Wide Spectrum of Next-Generation Sequencing Assays with VarSeqThe Wide Spectrum of Next-Generation Sequencing Assays with VarSeq
The Wide Spectrum of Next-Generation Sequencing Assays with VarSeqGolden Helix
 
Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012
Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012
Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012sequencing_columbia
 
Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)
Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)
Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)Maki Ogawa
 
20161004 mapk dna_ref_standardslides
20161004 mapk dna_ref_standardslides20161004 mapk dna_ref_standardslides
20161004 mapk dna_ref_standardslidesMaki Ogawa
 
Cpgr services brochure 14 may 2013 - v 16
Cpgr services brochure   14 may 2013 - v 16Cpgr services brochure   14 may 2013 - v 16
Cpgr services brochure 14 may 2013 - v 16Reinhard Hiller
 
GLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics WorkshopGLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics WorkshopMorgan Langille
 
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Prof. Wim Van Criekinge
 

Similaire à AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt (20)

Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
 
Apac distributor training series 3 swift product for cancer study
Apac distributor training series 3  swift product for cancer studyApac distributor training series 3  swift product for cancer study
Apac distributor training series 3 swift product for cancer study
 
CS Lecture 2017 04-11 from Data to Precision Medicine
CS Lecture 2017 04-11 from Data to Precision MedicineCS Lecture 2017 04-11 from Data to Precision Medicine
CS Lecture 2017 04-11 from Data to Precision Medicine
 
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
 
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference DatabaseDevelopment of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
Development of FDA MicroDB: A Regulatory-Grade Microbial Reference Database
 
Making powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyondMaking powerful science: an introduction to NGS and beyond
Making powerful science: an introduction to NGS and beyond
 
New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...
New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...
New Technologies at the Center for Bioinformatics & Functional Genomics at Mi...
 
Next generation sequencing methods
Next generation sequencing methods Next generation sequencing methods
Next generation sequencing methods
 
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
Characterizing Alzheimer’s Disease candidate genes and transcripts with targe...
 
Long read sequencing - WEHI bioinformatics seminar - tue 16 june 2015
Long read sequencing -  WEHI  bioinformatics seminar - tue 16 june 2015Long read sequencing -  WEHI  bioinformatics seminar - tue 16 june 2015
Long read sequencing - WEHI bioinformatics seminar - tue 16 june 2015
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs
 
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel WeitschekGenomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek
 
The Wide Spectrum of Next-Generation Sequencing Assays with VarSeq
The Wide Spectrum of Next-Generation Sequencing Assays with VarSeqThe Wide Spectrum of Next-Generation Sequencing Assays with VarSeq
The Wide Spectrum of Next-Generation Sequencing Assays with VarSeq
 
NGS.pptx
NGS.pptxNGS.pptx
NGS.pptx
 
Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012
Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012
Peter Nagy, Columbia Agilent Symposium, Jan, 27 2012
 
Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)
Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)
Reference Standards, gDNA, FFPE (MAPK, BRAF, EGFR, KRAS)
 
20161004 mapk dna_ref_standardslides
20161004 mapk dna_ref_standardslides20161004 mapk dna_ref_standardslides
20161004 mapk dna_ref_standardslides
 
Cpgr services brochure 14 may 2013 - v 16
Cpgr services brochure   14 may 2013 - v 16Cpgr services brochure   14 may 2013 - v 16
Cpgr services brochure 14 may 2013 - v 16
 
GLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics WorkshopGLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics Workshop
 
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
 

Dernier

Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...hotbabesbook
 
Top Rated Bangalore Call Girls Richmond Circle ⟟ 9332606886 ⟟ Call Me For Ge...
Top Rated Bangalore Call Girls Richmond Circle ⟟  9332606886 ⟟ Call Me For Ge...Top Rated Bangalore Call Girls Richmond Circle ⟟  9332606886 ⟟ Call Me For Ge...
Top Rated Bangalore Call Girls Richmond Circle ⟟ 9332606886 ⟟ Call Me For Ge...narwatsonia7
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...parulsinha
 
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...Dipal Arora
 
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟ 9332606886 ⟟ Call Me For G...
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟  9332606886 ⟟ Call Me For G...Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟  9332606886 ⟟ Call Me For G...
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟ 9332606886 ⟟ Call Me For G...narwatsonia7
 
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...Dipal Arora
 
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...Dipal Arora
 
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426jennyeacort
 
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Ishani Gupta
 
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any TimeTop Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any TimeCall Girls Delhi
 
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service AvailableDipal Arora
 
Premium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort Service
Premium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort ServicePremium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort Service
Premium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort Servicevidya singh
 
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Sheetaleventcompany
 
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...Taniya Sharma
 
O898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
O898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In AhmedabadO898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
O898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In AhmedabadGENUINE ESCORT AGENCY
 
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...aartirawatdelhi
 

Dernier (20)

Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
 
Top Rated Bangalore Call Girls Richmond Circle ⟟ 9332606886 ⟟ Call Me For Ge...
Top Rated Bangalore Call Girls Richmond Circle ⟟  9332606886 ⟟ Call Me For Ge...Top Rated Bangalore Call Girls Richmond Circle ⟟  9332606886 ⟟ Call Me For Ge...
Top Rated Bangalore Call Girls Richmond Circle ⟟ 9332606886 ⟟ Call Me For Ge...
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
 
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Tirupati Just Call 8250077686 Top Class Call Girl Service Available
 
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
 
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟ 9332606886 ⟟ Call Me For G...
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟  9332606886 ⟟ Call Me For G...Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟  9332606886 ⟟ Call Me For G...
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟ 9332606886 ⟟ Call Me For G...
 
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 9907093804 Top Class Call Girl Service Available
 
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
 
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
Call Girls Visakhapatnam Just Call 8250077686 Top Class Call Girl Service Ava...
 
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Ooty Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
 
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Kochi Just Call 8250077686 Top Class Call Girl Service Available
 
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
 
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any TimeTop Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
 
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
 
Premium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort Service
Premium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort ServicePremium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort Service
Premium Call Girls Cottonpet Whatsapp 7001035870 Independent Escort Service
 
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
 
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
 
O898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
O898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In AhmedabadO898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
O898O367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
 
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
 

AdamAmeur_SciLife_Bioinfo_course_Nov2015.ppt

  • 1. Next Generation Sequencing and Bioinformatics Analysis Pipelines Adam Ameur National Genomics Infrastructure SciLifeLab Uppsala adam.ameur@igp.uu.se INF R S RU URE A T CT ENOMI S G C ATCA GT N ION L GA CTAC AT A
  • 2. What is an analysis pipeline? • Basically just a number of steps to analyze data Raw data (FASTQ reads) Intermediate result Intermediate result Final result • Pipelines can be simple or very complex…
  • 3. Today’s lecture • Sequencing instruments and ‘standard’ pipelines – IonTorrent/PacificBiosciences • In-house bioinformatics pipelines, some examples • News and future plans
  • 4. Ion Torrent - PGM/Proton • The Ion Torrent System – 6 instruments available in Uppsala, early access users – Two instruments: PGM and Proton – For small scale (PGM) and large scale sequencing (Proton) – Rapid sequencing (run time ~ 2-4 hours) – Measures changes in pH – Sequencing on a chip Personal Genome Machine (PGM) Ion Proton
  • 5. Ion Torrent output • Ion Torrent throughput ~ from 10Mb to >10Gb, depending on the chip • Read lengths: 400bp (PGM), 200bp (Proton) • Output file format: FASTQ • What can we use Ion Torrent for? – Anything, except perhaps very large genomes 2 human exomes (PI chip) 2 human transcriptomes 1 human genome = 6 PI chips PI (Proton) 318 (PGM) 316 (PGM) 314 (PGM) NEW! P2 (Proton)
  • 6. Ion Torrent analysis workflow Downstream analysis Torrent Server .fastq .bam .fasta
  • 8. Torrent Suite Software Analysis • Plug-ins within the Torrent Suite Software – Alignment • TMAP: Specifically developed for Ion Torrent data – Variant Caller • SNP/Indel detection – Assembler • MIRA – Analysis of gene panels (Human exomes, cancer panels, etc…) • SNP/Indel detection in amplicon-seq data – Analysis of human transcriptomes – … • Analyses are started automatically when run is complete!
  • 9. Pacific Biosciences • Pacific Biosciences – Installed summer 2013 – Single molecule sequencing – Very long read lengths (up to 60 kb) – Rapid sequencing – Can detect base modifications (i.e. methylation) – Relatively low throughput
  • 10. PacBio output • PacBio throughput ~ 1 Gb/SMRT cell • PacBio read lengths: 500bp-60kb • Output file format: FASTQ • What can we use PacBio for? – Anything, except really large genomes ~1 bacterial genome ~1 bacterial transcriptome 1 human genome = 100 SMRT cells?
  • 11. PacBio analysis workflow In-house PacBio cluster Downstream analysis .fastq .bam .fasta
  • 13. SMRT analysis pipelines • Mapping • Variant calling • Assembly • Scaffolding • Base modifications
  • 14. What about Illumina? • There are many different pipelines for Illumina…
  • 15. In-house development of pipelines • The standard analysis pipelines are nice… … but sometimes we need to do own developments … or adapt the pipelines to our specific applications • Some examples of in-house developments: I. Building a local variant database (WES/WGS) II. Assembly of genomes using long reads III. Clinical sequencing – Leukemia Diagnostics
  • 17. Background: exome-seq • Main application of exome-seq – Find disease causing mutations in humans • Advantages – Allows investigate all protein coding sequences – Possible to detect both SNPs and small indels – Low cost (compared to WGS) – Possible to multiplex several exomes in one run – Standardized work flow for data analysis • Disadvantage – All genetic variants outside of exons are missed (~98%)
  • 18. Exome-seq throughput • We are producing a lot of exome-seq data – 4-6 exomes/day on Ion Proton – In each exome we detect • Over 50,000 SNPs • About 2000 small indels => Over 1 million variants/run! • In plain text files
  • 19. How to analyze this? • Traditional analysis - A lot of filtering! – Typical filters • Focus on rare SNPs (not present in dbSNP) • Remove FPs (by filtering against other exomes) • Effect on protein: non-synonymous, stop-gain etc • Heterozygous/homozygous – This analysis can be automated (more or less) Result: A few candidate causative SNP(s)! Start: All identified SNPs
  • 20. Why is this not optimal? • Drawbacks – Work on one sample at time • Difficult to compare between samples – Takes time to re-run analysis • When using different parameters – No standardized storage of detected SNPs/indels • Difficult to handle 100s of samples • Better solution – A database oriented system • Both for data storage and filtering analyses
  • 21. Analysis: In-house variant database Ameur et al., Database Journal, 2014 *CANdidate Variant Analysis System and Data Base *
  • 23. CanvasDB - Filtering speed • Rapid variant filtering, also for large databases
  • 24. A recent exome-seq project • Hearing loss: 2 affected brothers – Likely a rare, recessive disease => Shared homozygous SNPs/indels • Sequencing strategy – TargetSeq exome capture – One sample per PI chip homoz homoz heteroz heteroz
  • 25. Filtering analysis • CanvasDB filtering for a variant that is… – rare • at most in 1% of ~700 exomes – shared • found in both brothers – homozygous • in brothers, but in no other samples – deleterious • non-synonymous, frameshift, stop-gain, splicing, etc..
  • 26. Filtering results • Homozygous candidates – 2 SNPs • stop-gain in STRC • non-synonymous in PCNT – 0 indels • Compound heterozygous candidates (lower priority) – in 15 genes => Filtering is fast and gives a short candidate list!
  • 27. STRC - a candidate gene => Stop-gain in STRC is likely to cause hearing loss!
  • 28. Brother #1 Brother #2 Unrelated sample IGV visualization: Stop gain in STRC
  • 29. STRC, validation by Sanger Brother #1 Brother #2 Stop-gain site • Sanger validation • Does not seem to be homozygous.. – Explanation: difficult to sequence STRC by Sanger • Pseudo-gene with very high similarity • New validation showed mutation is homozygous!!
  • 30. CanvasDB – some success stories Solved cases, exome-seq - Niklas Dahl/Joakim Klar Neuromuscular disorder NMD11 Artrogryfosis SKD36 Lipodystrophy ACR1 Achondroplasia ACD2 Ectodermal dysplasia ED21 Achondroplasia ACD9 Ectodermal dysplasia ED1 Arythroderma AV1 Ichthyosis SD12 Muscular dystrophy DMD7 Neuromuscular disorder NMD8 Welanders myopathy (D) W Skeletal dysplasia SKD21 Visceral myopathy (D) D:5156 Ataxia telangiectasia MR67 Exostosis SKD13 Alopecia AP43 Epidermolysis bullosa SD14 Hearing loss D:9652
  • 31. CanvasDB - Availability • CanvasDB system now freely available on GitHub!
  • 32. Next Step: Whole Genome Sequencing Capacity of HiSeq X Ten: 320 whole human genomes/week!!! • More work on pipelines and databases needed!!! • New instruments at SciLifeLab for human WGS…
  • 33. Example of large WGS project: Generation of a high quality genetic variant database for the Swedish population Whole Genome Sequencing of ~1000 individuals Aim: to build a resource for Swedish research and health care PI: Ulf Gyllensten
  • 34. Example II: Assembly of genomes using Pacific Biosciences
  • 35. Genome assembly using NGS • Short-read de novo assembly by NGS – Requires mate-pair sequences • Ideally with different insert sizes – Complicated analysis • Assembly, scaffolding, finishing • Maybe even some manual steps => Rather expensive and time consuming • Long reads really makes a difference!! – We can assemble genomes using PacBio data only!
  • 36. HGAP de novo assembly • HGAP uses both long and shorter reads Long reads (seeds) Short reads
  • 37. PacBio – Current throughput & read lengths • >10kb average read lengths! (run from April 2014) • ~ 1 Gb of sequence from one PacBio SMRT cell
  • 38. PacBio assembly analysis • Simple -- just click a button!!
  • 39. PacBio assembly, example result • Example: Complete assembly of a bacterial genome
  • 40. PacBio assembly – recent developments • Also larger genomes can be assembled by PacBio..
  • 41. Next step: assembly of large genomes • We need to install such pipelines at UPPNEX!! 405,000 CPUh used on Google Cloud! • A computational challenge!!
  • 42. Example III: Clinical sequencing for Leukemia Treatment
  • 43. Chronic Myeloid Leukemia • BCR-ABL1 fusion protein – a CML drug target www.cambridgemedicine.org/article/doi/10.7244/cmj-1355057881 The BCR-ABL1 fusion protein can acquire resistance mutations following drug treatment
  • 44. BCR-ABL1 workflow – PacBio Sequencing From sample to results: < 1 week 1 sample/SMRT cell Cavelier et al., BMC Cancer, 2015
  • 45. BCR-ABL1 mutations at diagnosis BCR ABL1 PacBio sequencing generates ~10 000X coverage! Sample from time of diagnosis:
  • 46. BCR-ABL1 mutations in follow-up sample Sample 6 months later Mutations acquired in fusion transcript. Might require treatment with alternative drug. BCR ABL1
  • 47. BCR-ABL1 dilution series results • Mutations down to 1% detected!
  • 48. Summary of mutations in 5 CML patients
  • 49. Mutations mapped to protein structure
  • 50. BCR-ABL1 - Compound mutations P1 61m T315I F359C 91.8% 4.2% 3.9% P1 68.5m T315I F359C H396R D276G 93.7% 2.0% 1.1% 2.0% 1.1%
  • 51. BCR-ABL1 - Multiple isoforms in one individual!
  • 52. BCR-ABL1 – Isoforms and protein structure
  • 53. Next step: A clinical diagnostics pipeline! Step1. Create CCS reads Step2. Run mutation analysis Step3. Upload to result server
  • 54. Collaboration with Wesley Schaal & Ola Spjuth, UPPNEX/Uppsala Univ Reporting system for mutation results
  • 55. Ion Torrent – Ongoing developments Ion S5 XL system Ion Proton PII chip (EA)
  • 56. PacBio – Ongoing developments • Fast track clinical samples - Preparing workflows for rapid sequencing - Organ transplantation, diagnostics, outbreaks, ... • New chemistry and active loading of SMRT cells - Improved quality, longer reads - Increased throughput (2016?) • New analysis algorithms
  • 57. INFR S RU URE A T CT ENOMI S G C ATCA GT N ION L GA CTAC AT A Thank you!