Systematic evaluation
of spliced alignment programs
for RNA-seq data
Engström et al. (Nature Methods 2013)
Presented by Monica Drăgan
Functional Genomics, SS2014, Tuesday, 25 March 2014
© bioinformatics.ca
Deep sequencing (with NGS)
Mapping the reads to
● a reference genome, or
● a transcriptome database
© bioinformatics.ca
Why RNA sequencing?
● Functional studies
● Gene prediction is difficult
Mapping strategies depend on read length
● Read length < 50 bp → short (unspliced) aligners: BWA, Bowtie
● Read length > 50 bp → spliced alignment programs: GSNAP, MapSplice, STAR, PALMapper, TopHat, ReadsMap, PASS, SMALT
● In mRNA sequences the introns have been removed, so a read can span an exon-exon junction
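As a toy illustration of why reads from spliced mRNA call for a spliced aligner, the sketch below builds an mRNA from two exons of an invented genome and checks whether reads occur contiguously in that genome. All sequences and the helper name `splice` are hypothetical and chosen only for this example.

```python
# Toy genome: exon1 + intron + exon2 (sequences invented for illustration).
EXON1 = "ATGGCCATTG"
INTRON = "GTAAGTCCCTTTTCAG"  # starts with GT and ends with AG
EXON2 = "TAATGGGCCG"
GENOME = EXON1 + INTRON + EXON2


def splice(exons):
    """Concatenate exons, i.e. drop the introns, as in mature mRNA."""
    return "".join(exons)


mrna = splice([EXON1, EXON2])

# A read lying entirely inside one exon occurs contiguously in the genome.
exonic_read = mrna[2:8]
# A read crossing the exon-exon junction does not.
junction_read = mrna[6:14]

print("exonic read found in genome:  ", exonic_read in GENOME)
print("junction read found in genome:", junction_read in GENOME)
```

An unspliced aligner can only place the first read; the second needs an alignment with a gap the size of the intron.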
Outline
● Challenges in RNA sequence alignment
● The aim of this paper
● Existing spliced-alignment software
● Conclusions
Challenges in RNA-seq alignment
● Large number of reads (~100M) → computationally expensive
  - compression with the Burrows-Wheeler Transform
● RNA splicing / alternative splicing
  - a single gene may code for multiple proteins
● Paired read separation issue
● Pseudogenes
  - pseudogenes often have sequences highly similar to functional, intron-containing genes → RNA reads can be incorrectly mapped to them
  - the human genome contains over 14,000 pseudogenes [Pei et al. Genome Biol 2012]
● Duplications
  - may correspond to biased PCR amplification of particular fragments
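The Burrows-Wheeler Transform mentioned among the challenges can be illustrated with a naive construction that sorts all rotations of the text. This is a minimal sketch for intuition only: production aligners build an index from a suffix array rather than materialising rotations, and the function names `bwt` and `inverse_bwt` are hypothetical.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler Transform: last column of the sorted rotations."""
    assert sentinel not in text
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last, sentinel="$"):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The row that ends with the sentinel is the original text plus sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The transform tends to group identical characters together, which is what makes the indexed genome compressible while still searchable.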
The aim of this paper
● Assess the performance of 26 RNA-seq alignment protocols, based on 11 programs, on real and simulated human and mouse transcriptomes
● Alignment protocols were evaluated on Illumina 76-nucleotide paired-end RNA-seq data from:
  - the human leukemia cell line K562 (1.3 × 10^9 reads)
  - mouse brain (1.1 × 10^8 reads)
  - two simulated data sets
Outline
● Challenges in RNA sequence alignment
● The aim of this paper
● Existing spliced-alignment software
  - TopHat
  - MapSplice
  - STAR
  - GSNAP
● Conclusions
TopHat
Trapnell, Pachter, and Salzberg (2009)
● Unspliced alignment
  - reads that map to more than 10 locations
  - reads that have more than a few mismatches
● Assemble islands of sequences
● Such an approach will identify only known or predicted combinations of exons
● Spliced alignment
  - known junction signals: GT-AG, GC-AG and AT-AC
  - if an alignment extends into an intron region, realign the reads to the adjacent exons instead
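The known junction signals listed above are the dinucleotides at the start and end of an intron. A minimal sketch of such a check follows; the function name, the coordinate convention (0-based, end-exclusive, forward strand) and the example sequence are assumptions of this illustration, not TopHat's code.

```python
# Donor/acceptor dinucleotide pairs listed on the slide.
KNOWN_SIGNALS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def has_known_signal(genome, intron_start, intron_end):
    """Return True if genome[intron_start:intron_end] starts and ends with a known signal.

    Coordinates are 0-based and end-exclusive on the forward strand
    (an assumption of this sketch).
    """
    intron = genome[intron_start:intron_end].upper()
    if len(intron) < 4:
        return False
    return (intron[:2], intron[-2:]) in KNOWN_SIGNALS


if __name__ == "__main__":
    toy_genome = "AAAC" + "GTAAGTTTTCAG" + "GGG"  # invented sequence
    print(has_known_signal(toy_genome, 4, 16))
    print(has_known_signal(toy_genome, 5, 16))
```

A candidate junction whose implied intron fails this test could be discarded or given a lower score, depending on how permissive the search should be.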
MapSplice
Wang et al. (2010)
● Similar to TopHat
● Reads = tags
● A tag T has an 'exonic alignment' if it can be aligned in its entirety to a consecutive sequence of nucleotides in the genome G
● T has a 'spliced alignment' if its alignment to G requires one or more gaps
Step 1: exonic alignment
Step 2: spliced alignment
● the spliced alignment of segment t_{j+1} to the genomic interval between anchors t_j and t_{j+2}
● consider all possible positions of the splice site and map according to the Hamming distance
Step 3: merge candidate segment alignments
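Step 2 above can be sketched as a brute-force search: given where the left anchor ends and where the right anchor begins, try every split point of the middle segment and keep the one with the smallest Hamming distance. This is a simplified illustration that assumes exactly one intron between the two anchors; the function names and the toy sequences are hypothetical and this is not MapSplice's actual implementation.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))


def best_split(segment, genome, left_end, right_start):
    """Place `segment` across one intron inside genome[left_end:right_start].

    The first k bases are aligned directly after the left anchor (which ends
    at `left_end`); the remaining bases are aligned directly before the right
    anchor (which starts at `right_start`). Every k is tried and the first
    split with the smallest Hamming distance is returned as (k, distance).
    """
    m = len(segment)
    assert right_start - left_end >= m
    best = None
    for k in range(m + 1):
        donor_side = genome[left_end:left_end + k]
        acceptor_side = genome[right_start - (m - k):right_start]
        d = hamming(segment[:k], donor_side) + hamming(segment[k:], acceptor_side)
        if best is None or d < best[1]:
            best = (k, d)
    return best


if __name__ == "__main__":
    # left anchor, 3 exonic bases, intron, 3 exonic bases, right anchor
    toy_genome = "TTTT" + "ACC" + "GTCCCCAG" + "TCA" + "GGGG"
    print(best_split("ACCTCA", toy_genome, 4, 18))
```

When several split points give the same distance, additional evidence such as the junction signal would be needed to choose between them; the sketch simply returns the first.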
STAR
Dobin et al. (2012)
Maximal Mappable Prefix (MMP) at read location i = the longest substring of the read, starting at position i, that matches exactly one or more substrings of the reference genome
Detect:
(a) splice junctions
(b) mismatches
(c) tails (poor genomic alignment)
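The Maximal Mappable Prefix definition above can be made concrete with a naive search that grows the prefix one base at a time, then restarts after each seed. This is a sketch for intuition only: STAR searches an uncompressed suffix array rather than scanning the genome, and the function names and toy sequences below are hypothetical.

```python
def maximal_mappable_prefix(read, genome, start=0):
    """Return (length, positions) of the longest exact match of read[start:].

    Naive search for illustration: the prefix is extended base by base until
    it no longer occurs anywhere in the genome.
    """
    length = 0
    positions = []
    for n in range(1, len(read) - start + 1):
        prefix = read[start:start + n]
        hits = [i for i in range(len(genome) - n + 1) if genome.startswith(prefix, i)]
        if not hits:
            break
        length, positions = n, hits
    return length, positions


def seed_search(read, genome):
    """Split a read into consecutive MMPs, restarting after each one."""
    seeds = []
    pos = 0
    while pos < len(read):
        length, hits = maximal_mappable_prefix(read, genome, pos)
        if length == 0:  # this base does not occur in the genome: skip it
            pos += 1
            continue
        seeds.append((pos, length, hits))
        pos += length
    return seeds


if __name__ == "__main__":
    toy_genome = "ATGGCCATTG" + "GTAAGTCCCTTTTCAG" + "TAATGGGCCG"  # exon, intron, exon
    toy_read = "CCATTGTAATGG"                                     # spans the junction
    for seed in seed_search(toy_read, toy_genome):
        print(seed)
```

In the toy example the first seed ends where the intron begins and the second seed starts where it ends, which is how a splice junction shows up in this scheme.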
GSNAP
Wu and Nacu (2010)
Efficient detection of indels and splice pairs:
● For large genomes, it is more efficient to preprocess the genome rather than the reads, creating genomic index files that provide the genomic positions of a given prefix/suffix
● Works with candidate regions in the reference genome (keeps track of the read location of the 12-residue segments that support each candidate region)
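The idea of preprocessing the genome into an index of short segments, then collecting candidate regions supported by segments of the read, can be sketched with a plain dictionary. This is an illustrative simplification under stated assumptions: the function names are hypothetical, the index is an in-memory hash rather than GSNAP's index files, and a candidate region is represented only by its diagonal (genome position minus read offset).

```python
from collections import defaultdict


def build_kmer_index(genome, k=12):
    """Map every k-mer in the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def candidate_regions(read, index, k=12):
    """Group k-mer hits by diagonal (genome position minus read offset).

    Returns {diagonal: [read offsets of the k-mers supporting it]}, so each
    candidate region remembers where in the read its support comes from.
    """
    support = defaultdict(list)
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            support[pos - offset].append(offset)
    return dict(support)


if __name__ == "__main__":
    toy_genome = "ACGTACGGTTCAGGCTA"  # invented sequence; k shortened to 4
    toy_index = build_kmer_index(toy_genome, k=4)
    print(candidate_regions(toy_genome[5:13], toy_index, k=4))
```

Building the index once per genome is what makes this cheaper than indexing each new batch of reads.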
For a more powerful use of the algorithms:
● use available gene annotations, which help to avoid erroneously mapping reads to pseudogenes
● use the pairing information of paired-end reads
Conclusions
● Mismatches and basewise accuracy
  - MapSplice, PASS and TopHat display a low tolerance for mismatches. Consequently, a large proportion of reads with low base-call quality scores were not mapped by these methods.
  - GSNAP, GSTRUCT, MapSplice, PASS, SMALT and STAR allow mismatches and can also output an incomplete alignment when they are unable to map an entire sequence.
  - Reads from mouse were mapped (against the mouse reference assembly) at a greater rate and with fewer mismatches than those from K562 (the cancer cell line K562 has accumulated many mutations relative to the human reference assembly).
● Indel frequency and accuracy
  - GEM and PALMapper output included more indels than any other method
  - GSTRUCT produced the most uniform distribution of indels (coefficient of variation (CV) = 0.32)
  - TopHat produced the most variable distribution (CV = 1.5 and 1.1 splice junctions)
  - GEM and PALMapper report many false indels (low precision)
  - GSNAP and GSTRUCT exhibit high sensitivity for deletions, independent of size (recall)
  - The TopHat2 protocol is the most sensitive method for long insertions (recall)
[Figures: size distribution of indels for the human K562 data set; precision and recall, stratified by indel size]
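Precision and recall stratified by indel size can be computed by comparing called indels with a truth set, one size class at a time. The sketch below assumes each indel is a `(position, kind, length)` tuple and that a call counts as correct only on an exact match; both are assumptions of this illustration, and the paper's own matching criteria may differ.

```python
def precision_recall_by_size(called, truth):
    """Precision and recall of indel calls, stratified by indel length.

    `called` and `truth` are sets of (position, kind, length) tuples, where
    kind is "ins" or "del" (tuple layout assumed for this sketch).
    Returns {length: (precision, recall)}; a value is None when undefined.
    """
    sizes = {c[2] for c in called} | {t[2] for t in truth}
    result = {}
    for size in sorted(sizes):
        c = {x for x in called if x[2] == size}
        t = {x for x in truth if x[2] == size}
        true_positives = len(c & t)
        precision = true_positives / len(c) if c else None
        recall = true_positives / len(t) if t else None
        result[size] = (precision, recall)
    return result


if __name__ == "__main__":
    truth = {(100, "del", 1), (200, "ins", 1), (300, "del", 5)}
    called = {(100, "del", 1), (250, "ins", 1), (300, "del", 5), (400, "del", 5)}
    print(precision_recall_by_size(called, truth))
```

A method that reports many false indels has low precision even when its recall is high, which is the distinction drawn in the bullets above.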
● Spliced alignment
  - High junction discovery accuracy for ReadsMap, GSNAP, GSTRUCT, MapSplice and TopHat
  - The number of false junction calls was greatly reduced if junctions were filtered by supporting alignment counts (plot c)
  - Protocols using annotation recovered nearly all of the known junctions in expressed transcripts (plot d)
  - For novel-junction discovery, GSTRUCT outperformed other methods
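Filtering junctions by supporting alignment counts amounts to counting how many spliced alignments report each junction and keeping those above a threshold. A minimal sketch follows, assuming each junction-spanning alignment is reduced to a `(chrom, intron_start, intron_end)` tuple; the function name and the default threshold of 2 are hypothetical choices for illustration.

```python
from collections import Counter


def filter_junctions(spliced_alignments, min_support=2):
    """Keep junctions supported by at least `min_support` alignments.

    `spliced_alignments` is an iterable of (chrom, intron_start, intron_end)
    tuples, one per junction-spanning alignment (layout assumed here).
    Returns {junction: supporting alignment count}.
    """
    counts = Counter(spliced_alignments)
    return {junction: n for junction, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    alignments = [("chr1", 100, 200)] * 3 + [("chr1", 150, 900)]
    print(filter_junctions(alignments))
```

Raising the threshold trades recall of weakly expressed junctions for fewer false calls.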
● GSNAP, GSTRUCT, MapSplice and STAR compared favorably to the other methods
● MapSplice seems to be a conservative aligner with respect to mismatch frequency, indel and exon junction calls
● The most significant issue with GSNAP, GSTRUCT and STAR is the presence of many false exon junctions in the output
● Both GSNAP and GSTRUCT require considerable computing time when parameterized for sensitive spliced alignment
Thank you!
Remaining challenges:
● exploiting gene annotation without introducing bias
● correctly placing multimapped reads
● achieving optimal yet fast alignment around gaps and mismatches
● reducing the number of false exon junctions reported
Ongoing developments in sequencing technology will demand efficient processing of longer reads with higher error rates and will require more extensive spliced alignment as reads span multiple …
● Some RNA-seq aligners, including GSNAP [5], RUM [6] and STAR [7], map reads independently of the alignments of other reads, which may explain their lower sensitivity for these spliced reads
● GSNAP [5] and STAR [7] also make use of annotation, although they use it in a more limited fashion in order to detect splice sites
● … have shown how suffix arrays (Manber and Myers, 1990), compressed using a Burrows-Wheeler Transform (BWT) (Burrows and Wheeler, 1994), can rapidly map reads that are exact matches or have a few mismatches or short insertions or deletions (indels) relative to the reference
● A third approach, provided by the QPALMA program (De Bona et al., 2008), can align individual reads across exon-exon junctions using Smith-Waterman-type alignments and a specifically trained splice site model
Contenu connexe

Similaire à Systematic evaluation of spliced alignment programs for RNA-seq data

CRISPR cas, a potential tool for targeted genome modification in crops.
CRISPR cas, a potential tool for targeted genome modification in crops.CRISPR cas, a potential tool for targeted genome modification in crops.
CRISPR cas, a potential tool for targeted genome modification in crops.UAS,GKVK<BANGALORE
 
2944_IJDR_final_version
2944_IJDR_final_version2944_IJDR_final_version
2944_IJDR_final_versionDago Noel
 
2944_IJDR_final_version
2944_IJDR_final_version2944_IJDR_final_version
2944_IJDR_final_versionDago Noel
 
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...Human Variome Project
 
The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesReece Hart
 
Guide Picker Poster V3
Guide Picker Poster V3Guide Picker Poster V3
Guide Picker Poster V3Soren Hough
 
NGS Presentation .pptx
NGS Presentation  .pptxNGS Presentation  .pptx
NGS Presentation .pptxMalihaTanveer1
 
Forensics: Human Identity Testing in the Applied Genetics Group
Forensics: Human Identity Testing in the Applied Genetics GroupForensics: Human Identity Testing in the Applied Genetics Group
Forensics: Human Identity Testing in the Applied Genetics Groupnist-spin
 
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...Jonathan Eisen
 
Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)Gunnar Rätsch
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Golden Helix Inc
 
Goodwin2016 ngs 10 years
Goodwin2016 ngs 10 yearsGoodwin2016 ngs 10 years
Goodwin2016 ngs 10 yearsPrakash Koringa
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vstQiang Kou
 
How CRISPR–Cas9 Screening will revolutionise your drug development programs
How CRISPR–Cas9 Screening will revolutionise your drug development programsHow CRISPR–Cas9 Screening will revolutionise your drug development programs
How CRISPR–Cas9 Screening will revolutionise your drug development programsHorizonDiscovery
 
Hadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGI
Hadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGIHadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGI
Hadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGIAllen Day, PhD
 
Making the cut with CRISPR
Making the cut with CRISPRMaking the cut with CRISPR
Making the cut with CRISPREdward Perello
 
2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.keyYannick Wurm
 
Experimental Designs in Next Generation Sequencing
Experimental Designs in Next Generation Sequencing Experimental Designs in Next Generation Sequencing
Experimental Designs in Next Generation Sequencing GuttiPavan
 

Similaire à Systematic evaluation of spliced alignment programs for RNA-seq data (20)

CRISPR cas, a potential tool for targeted genome modification in crops.
CRISPR cas, a potential tool for targeted genome modification in crops.CRISPR cas, a potential tool for targeted genome modification in crops.
CRISPR cas, a potential tool for targeted genome modification in crops.
 
2944_IJDR_final_version
2944_IJDR_final_version2944_IJDR_final_version
2944_IJDR_final_version
 
2944_IJDR_final_version
2944_IJDR_final_version2944_IJDR_final_version
2944_IJDR_final_version
 
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
The Clinical Significance of Transcript Alignment Discrepancies … and tools t...
 
The Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment DiscrepanciesThe Clinical Significance of Transcript Alignment Discrepancies
The Clinical Significance of Transcript Alignment Discrepancies
 
Guide Picker Poster V3
Guide Picker Poster V3Guide Picker Poster V3
Guide Picker Poster V3
 
NGS Presentation .pptx
NGS Presentation  .pptxNGS Presentation  .pptx
NGS Presentation .pptx
 
Forensics: Human Identity Testing in the Applied Genetics Group
Forensics: Human Identity Testing in the Applied Genetics GroupForensics: Human Identity Testing in the Applied Genetics Group
Forensics: Human Identity Testing in the Applied Genetics Group
 
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
 
Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)Talk ABRF 2015 (Gunnar Rätsch)
Talk ABRF 2015 (Gunnar Rätsch)
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
 
Goodwin2016 ngs 10 years
Goodwin2016 ngs 10 yearsGoodwin2016 ngs 10 years
Goodwin2016 ngs 10 years
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vst
 
How CRISPR–Cas9 Screening will revolutionise your drug development programs
How CRISPR–Cas9 Screening will revolutionise your drug development programsHow CRISPR–Cas9 Screening will revolutionise your drug development programs
How CRISPR–Cas9 Screening will revolutionise your drug development programs
 
Hadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGI
Hadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGIHadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGI
Hadoop and Genomics - What you need to know - 2015.04.09 - Shenzhen - BGI
 
20140710 1 day1_nist_ercc2.0workshop
20140710 1 day1_nist_ercc2.0workshop20140710 1 day1_nist_ercc2.0workshop
20140710 1 day1_nist_ercc2.0workshop
 
Making the cut with CRISPR
Making the cut with CRISPRMaking the cut with CRISPR
Making the cut with CRISPR
 
2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key2015 09-29-sbc322-methods.key
2015 09-29-sbc322-methods.key
 
Experimental Designs in Next Generation Sequencing
Experimental Designs in Next Generation Sequencing Experimental Designs in Next Generation Sequencing
Experimental Designs in Next Generation Sequencing
 

Dernier

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 

Dernier (20)

Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

Systematic evaluation of spliced alignment programs for RNA-seq data

  • 1. Systematic evaluation of spliced alignment programs for RNA-seq data. Engström et al. (Nature Methods 2013). Presented by Monica Drăgan
  • 2. Functional Genomics, SS2014, Tuesday, 25 March 2014. Systematic evaluation of spliced alignment programs for RNA-seq data
  • 3. Systematic evaluation of spliced alignment programs for RNA-seq data
  • 4. Deep sequencing (with NGS). Mapping the reads to ● a reference genome or ● a transcriptome database. © bioinformatics.ca
  • 5. Why RNA sequencing? ● Functional studies ● Gene prediction is difficult. © bioinformatics.ca
  • 6. Systematic evaluation of spliced alignment programs for RNA-seq data
  • 7. Mapping strategies depend on read length ● Read length < 50 bp ● Read length > 50 bp
  • 8. Mapping strategies depend on read length ● Read length < 50 bp → short (unspliced) aligners: BWA, Bowtie ● Read length > 50 bp
  • 9. Mapping strategies depend on read length ● Read length < 50 bp → short (unspliced) aligners: BWA, Bowtie ● Read length > 50 bp → spliced alignment programs: GSNAP, MapSplice, STAR, PALMapper, TopHat, ReadsMap, PASS, SMALT ● In mRNA sequences the introns have been removed
  • 10. Outline ● Challenges in RNA sequence alignment ● The aim of this paper ● Existing spliced-alignment software ● Conclusions
  • 11. Outline ● Challenges in RNA sequence alignment ● The aim of this paper ● Existing spliced-alignment software ● Conclusions
  • 12. Challenges in RNA-seq alignment ● Large number of reads
  • 13. Challenges in RNA-seq alignment ● Large number of reads → ~100M reads = computationally expensive
  • 14. Challenges in RNA-seq alignment ● Large number of reads → ~100M reads = computationally expensive; compression with the Burrows-Wheeler Transform
  • 15. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing
  • 16. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing
  • 17. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing
  • 18. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing: a single gene may code for multiple proteins
  • 19. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing ● Paired read separation issue
  • 20. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing ● Paired read separation issue
  • 21. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing ● Paired read separation issue ● Pseudogenes
  • 22. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing ● Paired read separation issue ● Pseudogenes: pseudogenes often have sequences highly similar to functional, intron-containing genes → RNA reads can be incorrectly mapped to them; the human genome contains over 14,000 pseudogenes [Pei et al. Genome Biol 2012]
  • 23. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing ● Paired read separation issue ● Pseudogenes ● Duplications
  • 24. Challenges in RNA-seq alignment ● Large number of reads ● RNA splicing / alternative splicing ● Paired read separation issue ● Pseudogenes ● Duplications: may correspond to biased PCR amplification of particular fragments
  • 25. Outline ● Challenges in RNA sequence alignment ● The aim of this paper ● Existing spliced-alignment software ● Conclusions
  • 26. The aim of this paper ● Assess the performance of 26 RNA-seq alignment protocols, based on 11 programs, on real and simulated human and mouse transcriptomes ● Alignment protocols were evaluated on Illumina 76-nucleotide paired-end RNA-seq data from: ● the human leukemia cell line K562 (1.3 × 10^9 reads) ● mouse brain (1.1 × 10^8 reads) and two simulated
  • 27. Outline ● Challenges in RNA sequence alignment ● The aim of this paper ● Existing spliced-alignment software ● TopHat ● MapSplice ● STAR ● GSNAP ● Conclusions
  • 28. TopHat, Trapnell, Pachter, and Salzberg (2009) ● unspliced alignment
  • 29. TopHat, Trapnell, Pachter, and Salzberg (2009) ● unspliced alignment: - reads that map to more than 10 locations - reads that have more than a few mismatches
  • 30. TopHat, Trapnell, Pachter, and Salzberg (2009) ● unspliced alignment: - reads that map to more than 10 locations - reads that have more than a few mismatches ● assemble islands of sequences
  • 31. TopHat, Trapnell, Pachter, and Salzberg (2009) ● unspliced alignment ● assemble ● Such an approach will identify only known or predicted combinations of exons
  • 32. TopHat, Trapnell, Pachter, and Salzberg (2009) ● unspliced alignment ● spliced alignment
  • 33. TopHat, Trapnell, Pachter, and Salzberg (2009)
  • 34. TopHat, Trapnell, Pachter, and Salzberg (2009) ● Known junction signals: GT-AG, GC-AG, and AT-AC
  • 35. TopHat, Trapnell, Pachter, and Salzberg (2009) ● Known junction signals: GT-AG, GC-AG, and AT-AC ● If an alignment extends into an intron region, realign the reads to the adjacent exons instead
  • 36. Outline ● Challenges in sequence alignment ● What the paper is about ● Existing software ● TopHat ● MapSplice ● STAR ● GSNAP ● Conclusions ● Future work
  • 37. MapSplice, Wang et al. (2010) ● Similar to TopHat ● Reads = tags ● A tag T has an 'exonic alignment' if it can be aligned in its entirety to a consecutive sequence of nucleotides in the reference genome G ● T has a 'spliced alignment' if its alignment to G requires one or more gaps
  • 38. MapSplice, Wang et al. (2010) ● Step 1: exonic alignment
  • 39. MapSplice, Wang et al. (2010) ● Step 2: spliced alignment ● the spliced alignment of segment t_{j+1} to the genomic interval between anchors t_j and t_{j+2} ● consider all possible positions of the splice site and map according to the Hamming distance
  • 40. MapSplice, Wang et al. (2010) ● Step 3: merge candidate segment alignments
  • 41. Outline ● Challenges in sequence alignment ● What the paper is about ● Existing software ● TopHat ● MapSplice ● STAR ● GSNAP ● Conclusions ● Future work
  • 42. STAR, Dobin et al. (2012) ● Maximal Mappable Prefix (read location i) = the longest read substring starting at position i that exactly matches one or more substrings of the reference genome ● poor genomic alignment ● Detect: (a) splice junctions (b) mismatches (c) tails
  • 43. Outline ● Challenges in sequence alignment ● What the paper is about ● Existing software ● TopHat ● MapSplice ● STAR ● GSNAP ● Conclusions ● Future work
  • 44. GSNAP, Wu and Nacu (2010) ● Efficient detection of indels and splice pairs: ● For large genomes, it is more efficient to preprocess the genome rather than the reads to create genomic index files, which provide genomic positions for a given prefix/suffix ● Works with candidate regions in the reference genome (keeps track of the read location of the 12 residues that support each candidate region)
  • 45. GSNAP, Wu and Nacu (2010)
  • 46. For a more powerful use of the algorithms: ● use available gene annotations, which allow the aligner to avoid erroneously mapping reads to pseudogenes ● use the information about the pairs of the paired reads
  • 47. Outline ● Challenges in RNA sequence alignment ● The aim of this paper ● Existing spliced-alignment software ● Conclusions
  • 48. Conclusions ● Mismatches and basewise accuracy: MapSplice, PASS and TopHat display a low tolerance for mismatches. Consequently, a large proportion of reads with low base-call quality scores were not mapped by these methods
  • 49. Conclusions ● Mismatches and basewise accuracy ● GSNAP, GSTRUCT, MapSplice, PASS, SMALT and STAR allow mismatches and can also output an incomplete alignment when they are unable to map an entire sequence
  • 50. Conclusions ● Mismatches and basewise accuracy: reads from mouse were mapped (against the mouse reference assembly) at a greater rate and with fewer mismatches than those from K562 (the cancer cell line K562 has accumulated many mutations relative to the human reference assembly)
  • 51. Conclusions ● Indel frequency and accuracy ● GSTRUCT produced the most uniform distribution of indels (coefficient of variation (CV) = 0.32) ● TopHat produced the most variable distribution (CV = 1.5 and 1.1 splice junctions) ● GEM and PALMapper output included more indels than any other method ● Figures: size distribution of indels for the human K562 data set; precision and recall, stratified by indel size
  • 52. Conclusions ● Indel frequency and accuracy ● GEM and PALMapper report many false indels (precision) ● GSNAP and GSTRUCT exhibit high sensitivity for deletions, independent of size (recall) ● The TopHat2 protocol is the most sensitive method for long insertions (recall) ● Figure: precision and recall, stratified by indel size
  • 53. Conclusions ● Spliced alignment ● High accuracy discovery rate for ReadsMap, GSNAP, GSTRUCT, MapSplice and TopHat ● The number of false junction calls was greatly reduced when junctions were filtered by supporting alignment counts (plot c) ● Protocols using annotation recovered nearly all of the known junctions in expressed transcripts (plot d) ● For novel-junction discovery, GSTRUCT outperformed other methods
  • 54. Conclusions ● GSNAP, GSTRUCT, MapSplice and STAR compared favorably to the other methods ● MapSplice seems to be a conservative aligner with respect to mismatch frequency, indel and exon junction calls ● The most significant issue with GSNAP, GSTRUCT and STAR is the presence of many false exon junctions in the output ● Both GSNAP and GSTRUCT require considerable computing time when parameterized for sensitive spliced alignment
  • 55. Thank you!
  • 56. Remaining challenges: Remaining challenges include exploiting gene annotation without introducing bias, correctly placing multimapped reads, achieving optimal yet fast alignment around gaps and mismatches, and reducing the number of false exon junctions reported. Ongoing developments in sequencing technology will demand efficient processing of longer reads with higher error rates and will require more extensive spliced alignment as reads span multiple
  • 57. ● Some RNA-seq aligners, including GSNAP [5], RUM [6], and STAR [7], map reads independently of the alignments of other reads, which may explain their lower sensitivity for these spliced reads ● GSNAP [5] and STAR [7] also make use of annotation, although they use it in a more limited fashion in order to detect splice sites
  • 58. have shown how suffix arrays (Manber and Myers, 1990), compressed using a Burrows-Wheeler Transform (BWT) (Burrows and Wheeler, 1994), can rapidly map reads that are exact matches or have a few mismatches or short insertions or deletions (indels) relative to the reference.
  • 59. A third approach, provided by the QPALMA program (Bona et al., 2008), can align individual reads across exon–exon junctions using Smith–Waterman-type alignments and a specifically trained splice site model.
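
Slides 14 and 58 mention Burrows-Wheeler-based indexing without showing how it supports exact read matching. The following is a minimal Python sketch of the idea, not the code of any aligner named in the slides. The function names `bwt` and `count_exact`, the `$` sentinel and the toy strings are assumptions made for illustration, and the naive rotation sort and `str.count` calls stand in for the precomputed tables a real index would use.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (naive, for small inputs)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel  # the sentinel must sort before every other character
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_exact(bwt_str: str, pattern: str) -> int:
    """Count exact occurrences of pattern in the original text by backward search."""
    # first_index[c] = number of characters in the text that sort before c
    first_index = {}
    for idx, ch in enumerate(sorted(bwt_str)):
        first_index.setdefault(ch, idx)

    def occ(ch: str, i: int) -> int:
        # occurrences of ch in bwt_str[:i]; real indexes precompute this
        return bwt_str.count(ch, 0, i)

    lo, hi = 0, len(bwt_str)
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in first_index:
            return 0
        lo = first_index[ch] + occ(ch, lo)
        hi = first_index[ch] + occ(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


genome_bwt = bwt("ACGTACGT")
hits = count_exact(genome_bwt, "CGT")  # number of exact matches of the read "CGT"
```

The search touches one pattern character per step and never scans the genome itself, which is why this kind of index suits very large read sets.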

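Slide 42 defines STAR's Maximal Mappable Prefix (MMP). As a rough illustration of that definition only, the sketch below finds the MMP by brute-force string search and then repeats the search from the first unmapped base, so that a read spanning an exon-exon junction splits into two seeds. The names `find_all`, `maximal_mappable_prefix` and `seed_read` and the toy genome are hypothetical; STAR itself searches a suffix array rather than scanning the genome.

```python
def find_all(text: str, pattern: str) -> list:
    """Return every start position (overlaps included) of pattern in text."""
    hits, pos = [], text.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits


def maximal_mappable_prefix(read: str, genome: str, start: int = 0):
    """Longest prefix of read[start:] with an exact match in genome.

    Returns (length, genome_positions); length is 0 if even one base fails.
    """
    best_len, best_hits = 0, []
    for length in range(1, len(read) - start + 1):
        hits = find_all(genome, read[start:start + length])
        if not hits:
            break  # a longer prefix cannot match if this one does not
        best_len, best_hits = length, hits
    return best_len, best_hits


def seed_read(read: str, genome: str) -> list:
    """Split a read into consecutive MMP seeds: (read_start, length, positions)."""
    seeds, i = [], 0
    while i < len(read):
        length, hits = maximal_mappable_prefix(read, genome, i)
        if length == 0:
            i += 1  # base absent from the genome; skip it
            continue
        seeds.append((i, length, hits))
        i += length
    return seeds


# Toy genome (assumed for illustration): exon 1, a GT...AG intron, exon 2
exon1, intron, exon2 = "TTGACCTAGGCA", "GTAAGTCCCTTTTCAG", "ATCGGATTCAAC"
genome = exon1 + intron + exon2
read = exon1[-6:] + exon2[:6]  # a read spanning the exon-exon junction
seeds = seed_read(read, genome)
```

In this toy case the first seed ends where the intron begins and the second seed starts right after it, so the gap between the two seeds marks a candidate splice junction.
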
Editor's notes

  2-11. RNA splicing, introns: mRNA transcripts do not include introns, so the alignment program must handle gapped (or spliced) alignment with very large gaps
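
The notes above stress that an aligner must place reads across very large gaps, and slide 39 describes trying every splice-site position and scoring by Hamming distance. The sketch below illustrates that scoring step under simplifying assumptions: the segment's left part is anchored at a known genomic start and its right part at a known genomic end. The names `hamming` and `best_split`, their parameters and the toy genome are hypothetical, and this is not MapSplice's implementation.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_split(segment: str, genome: str, left_start: int, right_end: int):
    """Try every split point k of the segment and return (k, mismatches).

    The first k bases are compared with the genome starting at left_start
    (donor side); the remaining bases are compared with the genome ending
    at right_end (acceptor side). Ties go to the smallest k.
    """
    m = len(segment)
    best = None
    for k in range(m + 1):
        left_ref = genome[left_start:left_start + k]
        right_ref = genome[right_end - (m - k):right_end]
        distance = hamming(segment[:k], left_ref) + hamming(segment[k:], right_ref)
        if best is None or distance < best[1]:
            best = (k, distance)
    return best


# Toy genome (assumed for illustration): exon 1, a GT...AG intron, exon 2
GENOME = "TTGACCTAGGCA" + "GTAAGTCCCTTTTCAG" + "ATCGGATTCAAC"
# Segment made of the last 4 bases of exon 1 and the first 4 bases of exon 2
split = best_split("GGCAATCG", GENOME, left_start=8, right_end=32)
```

Here `split` holds the split point with the fewest mismatches together with that mismatch count; moving the split one base in either direction pulls an intron base into the comparison and raises the distance.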