BG7
A new system for bacterial genome annotation designed for NGS data

www.ohnosequences.com      www.era7bioinformatics.com

Contents: Motivation · Features · How it works? · Comparisons · Upcoming features

Motivation

The need for a system designed specifically for annotating NGS data, with a pipeline unbiased by existing annotation systems, which were designed for Sanger sequences.

The need for a versatile system able to annotate genes even at the preliminary assembly stage of the genome.

Special focus is given to detecting “unexpected proteins” without orthologs in closely related genomes (horizontally acquired genes, phage genes, plasmid genes…).

A fast, automated and scalable process to meet the challenge of analyzing the huge number of genomes being sequenced with NGS technologies.

Features

1.   A new approach
2.   It’s tolerant to NGS errors
3.   It’s based on cloud computing
4.   It uses bio4j

Features: Approach

ORF prediction is based on protein similarity.

Features: Approach

Use as much information as you can (not just start/stop signals).

[Figure: the sequence TGGATGTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGGAAAGGCTGA shown twice, with labels A, B, C, D and E beneath the second copy]

Features: Approach

Standard:   Sequence -> ORF prediction (Glimmer) -> Function prediction (Blast)
BG7:        Sequence -> Protein searching (Blast) -> CDS prediction -> RNA searching (Blast)

Features: NGS errors

Issue                                    Technology
Genomes in several contigs               All
Sequencing errors in start/stop codons   Illumina substitutions, 454 indels
Frameshifts                              454 indels
Horizontal gene transfer                 None

The BG7 system is tolerant to all these issues.

Features: Cloud computing

AWS (Amazon Web Services): completely scalable, on demand, fast, cheap.

Useful in tracking outbreaks:
- 1 genome in ~2 hours
- 100 genomes in ~2 hours once you’ve got the reference proteins

Features: bio4j

It uses bio4j (www.bio4j.com): much richer annotations.

How it works?

1. Expert manual selection of reference sequences

2. Protein search
       • Blast

3. CDS definition
       • HSPs merge
       • Extension of the similarity region, searching for start/stop signals

4. Solving conflicts
       • Solving duplicates
       • Solving overlaps

5. RNA search
       • Blast

6. Incorporation of RNA genes
       • Definition of RNA genes
       • Conflicts with previously annotated protein-coding genes are solved

Step 2: Protein search with tBlastn

Reference proteins (aa) are searched for in the contig sequences. tBlastn translates the nucleotide contigs in all six reading frames, so the comparison against the input contigs is made at the amino-acid level.

[Figure: labels A, B and C; reference proteins (aa); input contigs (aa)]

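A minimal sketch of how the protein search could be launched with the NCBI BLAST+ `tblastn` program. The file names, the e-value threshold and the tabular output format are illustrative assumptions; the slides do not state the parameters BG7 actually uses.

```python
import shlex


def build_tblastn_command(proteins_fasta, contigs_fasta, out_tsv, evalue=1e-5):
    """Build an argv list for a BLAST+ tblastn run.

    Protein queries are compared against the contigs, which tblastn
    translates in all six reading frames. All argument values here are
    hypothetical examples, not BG7's real settings.
    """
    return [
        "tblastn",
        "-query", proteins_fasta,    # reference proteins (amino acids)
        "-subject", contigs_fasta,   # input contigs (nucleotides)
        "-evalue", str(evalue),      # assumed significance threshold
        "-outfmt", "6",              # tabular output, one HSP per line
        "-out", out_tsv,
    ]


if __name__ == "__main__":
    cmd = build_tblastn_command("reference_proteins.faa", "contigs.fna", "hits.tsv")
    # Print the command; pass `cmd` to subprocess.run to execute it
    # on a machine where BLAST+ is installed.
    print(shlex.join(cmd))
```

Each line of the tabular output corresponds to one HSP, which is the unit the CDS definition step works on.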
Step 3: CDS definition
Merging HSPs

[Figure: several HSPs between one protein and the input contigs (aa)]

We merge the HSPs (high-scoring segment pairs) to form a single similarity region.

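The merge can be sketched as follows. This is only an illustration under stated assumptions: the `HSP` record and its fields are hypothetical, HSPs are grouped by reference protein, contig and strand, and each group is collapsed into the span from its smallest start to its largest end. The slides do not say whether BG7 also applies a maximum gap or other criteria.

```python
from collections import defaultdict
from typing import NamedTuple


class HSP(NamedTuple):
    protein_id: str
    contig_id: str
    strand: str  # "+" or "-"
    start: int   # contig coordinate, 1-based inclusive, start <= end
    end: int


def merge_hsps(hsps):
    """Collapse all HSPs of one protein on one contig strand into one region."""
    groups = defaultdict(list)
    for h in hsps:
        groups[(h.protein_id, h.contig_id, h.strand)].append(h)

    regions = []
    for (protein_id, contig_id, strand), members in sorted(groups.items()):
        regions.append(
            HSP(
                protein_id,
                contig_id,
                strand,
                min(m.start for m in members),  # leftmost HSP boundary
                max(m.end for m in members),    # rightmost HSP boundary
            )
        )
    return regions


if __name__ == "__main__":
    hits = [
        HSP("P1", "contig1", "+", 100, 250),
        HSP("P1", "contig1", "+", 300, 480),
        HSP("P1", "contig1", "-", 50, 90),
    ]
    for region in merge_hsps(hits):
        print(region)
```

Taking the full span keeps the region intact even when an indel splits the alignment into several HSPs.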
Step 3: CDS definition
Search for start/stop signals

We then search for start/stop signals upstream and downstream of the region with high similarity to the protein.

Step 3: CDS definition

Even if we don’t find a start/stop codon for a given CDS, we keep it. We just mark it accordingly.

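A minimal sketch of the extension and marking described above, for a forward-strand region only. The codon sets, the coordinate convention (0-based, half-open, already in frame) and the choice of the nearest in-frame start codon are assumptions made for illustration; the slides do not specify how BG7 chooses among candidate signals.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # common bacterial start codons (assumed set)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_region(seq, start, end):
    """Extend a similarity region [start, end) to in-frame start/stop codons.

    Returns (cds_start, cds_end, has_start, has_stop). When a signal is not
    found, the original boundary is kept and the matching flag is False, so
    the CDS is retained but marked as incomplete.
    """
    seq = seq.upper()

    # Walk upstream codon by codon until a start codon is found.
    # An in-frame stop codon or the contig edge ends the search.
    cds_start, has_start = start, False
    pos = start
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in START_CODONS:
            cds_start, has_start = pos, True
            break
        if codon in STOP_CODONS:
            break
        pos -= 3

    # Walk downstream codon by codon until the first in-frame stop codon.
    cds_end, has_stop = end, False
    pos = end
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            cds_end, has_stop = pos + 3, True
            break
        pos += 3

    return cds_start, cds_end, has_start, has_stop


if __name__ == "__main__":
    contig = "CCCATGAAAGGGCCCTAAGG"
    print(extend_region(contig, 6, 12))
```

Returning the flags instead of discarding the region mirrors the idea of keeping a CDS whose start or stop codon is missing, for example at a contig edge.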
Step 4: Solving conflicts

       • Duplicates
       • Overlapping CDS

Step 5: RNA search
Blastn

Reference RNAs (nt) are searched for in the input contigs (nt).

Step 6: Incorporation of RNA genes

Definition of RNA genes
[Figure: RNA genes on the input contigs (nt)]

Conflicts with protein-coding genes are solved
If we find both a protein-coding gene and an RNA gene in a particular region, the RNA gene is selected over the protein-coding one.

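The priority rule above can be sketched like this. The `Gene` record is hypothetical, and two assumptions are made that the slides do not confirm: any overlap of at least one base on the same contig counts as a conflict, and strand is ignored.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    kind: str       # "protein" or "rna"
    contig_id: str
    start: int      # 1-based inclusive
    end: int


def overlaps(a, b):
    """True when two genes share at least one position on the same contig."""
    return a.contig_id == b.contig_id and a.start <= b.end and b.start <= a.end


def resolve_rna_conflicts(genes):
    """Keep every RNA gene; dismiss protein-coding genes that overlap one."""
    rnas = [g for g in genes if g.kind == "rna"]
    selected, dismissed = [], []
    for g in genes:
        if g.kind == "protein" and any(overlaps(g, r) for r in rnas):
            dismissed.append(g)
        else:
            selected.append(g)
    return selected, dismissed


if __name__ == "__main__":
    genes = [
        Gene("protein", "contig1", 100, 400),
        Gene("rna", "contig1", 350, 500),
        Gene("protein", "contig1", 600, 900),
    ]
    selected, dismissed = resolve_rna_conflicts(genes)
    print("selected:", selected)
    print("dismissed:", dismissed)
```

Dismissed genes are returned rather than dropped so that they can still be reported separately from the selected ones.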
Finally

[Figure: the sequence TGGATGTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGGAAAGGCTGA with labels A, B, C, D and E beneath it]

Comparisons

We’ve compared the NCBI annotations for Escherichia coli str. K-12 substr. MG1655 (RefSeq ID NC_000913) with the BG7 annotations.

Comparisons

The results we got were:

Feature                        NCBI     BG7
Protein coding genes           4145     4370 (1)
                                        4951 (2)
RNA                            175      156

(1) Selected genes
(2) All detected genes: selected + dismissed

Comparisons
Conclusions

Even in an unfavorable situation (not an NGS project, and a very well annotated genome), a single round of annotation gave us:

- ~95% of the NCBI protein coding genes
- ~89% of the NCBI RNA genes
- 419 new proteins detected

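As a quick sanity check, the percentages can be related to the counts in the results table. The slides do not say how the percentages were computed, so this is only one reading: the RNA figure is consistent with 156 / 175 if every RNA gene reported by BG7 matches an NCBI one, and ~95% of the NCBI protein coding genes corresponds to roughly 3,938 genes.

```python
# Counts taken from the results table in the slides.
NCBI_PROTEIN_GENES = 4145
NCBI_RNA_GENES = 175
BG7_RNA_GENES = 156

# Fraction of NCBI RNA genes covered, assuming all BG7 RNA genes match NCBI ones.
rna_fraction = BG7_RNA_GENES / NCBI_RNA_GENES

# Approximate number of NCBI protein coding genes implied by "~95%".
recovered_protein_genes = round(0.95 * NCBI_PROTEIN_GENES)

if __name__ == "__main__":
    print(f"RNA genes recovered: {rna_fraction:.1%}")
    print(f"Protein coding genes recovered (approx.): {recovered_protein_genes}")
```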
Upcoming features

Improvements are now focused on:

- The overlap-solving phase
- Detection of very small proteins

And any new need we find while using it.

Thanks:

Oh no sequences! team
Raquel Tobes: Bioinformatician, main advisor
Pablo Pareja: Main developer
Eduardo Pareja: Scientific advisor
Eduardo Pareja-Tobes: Mathematician, advisor
Carmen Torrecillas: Junior Bioinformatician
Marina Manrique: Bioinformatician

Thanks for your attention!
A new system for bacterial genome annotation designed for NGS data

  • 1. BG7 A new system for bacterial genome annotation designed for NGS data www.ohnosequences.com www.era7bioinformatics.com
  • 2. Motivation Motivation Features The need of a system specially designed for NGS data annotation with a pipeline unbiased by existing annotation systems How it works? designed for Sanger sequences The need of a versatile system able to annotate genes even in the Comparisons step of preliminary assembly of the genome Upcoming features Special focus is given to the detection of “unexpected proteins” without orthologous in close genomes (horizontally acquired genes, phage genes, plasmid genes…) A fast, automated and scalable process to face the challenge of analyzing the huge amount of genomes that are being sequenced with NGS technologies www.ohnosequences.com www.era7bioinformatics.com
  • 3. Motivation Features Features 1. A new approach How it works? 2. It’s tolerant to NGS errors Comparisons 3. It’s based on cloud computing Upcoming features 4. It uses bio4j www.ohnosequences.com www.era7bioinformatics.com
  • 4. Motivation Features: Approach Features How it works? ORF prediction Comparisons is based on Upcoming features protein similarity www.ohnosequences.com www.era7bioinformatics.com
  • 5. Motivation Features: Approach Features Use as much information as you can (not just start/stop signals) How it works? TGGATGTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGGAAAGGCTGA Comparisons Upcoming features TGGATGTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGGAAAGGCTGA A B C D E www.ohnosequences.com www.era7bioinformatics.com
  • 6. Motivation Features: Approach Features Standard BG7 How it works? Sequence Sequence Comparisons Protein searching Upcoming features ORF prediction (Blast) (Glimmer) CDS prediction Function prediction RNA searching (Blast) (Blast) www.ohnosequences.com www.era7bioinformatics.com
  • 7. Features: NGS errors. Issue (technology): genomes in several contigs (all); sequencing errors in start/stop codons (Illumina substitutions, 454 indels); frameshifts (454 indels); horizontal gene transfer (none). The BG7 system is tolerant to all these issues.
  • 8. Features: Cloud computing. AWS (Amazon Web Services): completely scalable, on demand, fast, cheap. Useful in tracking outbreaks. 1 genome in ~2 hours; 100 genomes in ~2 hours once you’ve got the reference proteins.
  • 9. Features: bio4j. It uses bio4j (www.bio4j.com): much richer annotations.
  • 10. How it works?
  • 11. Step 1: Expert manual selection of reference sequences. Step 2: Protein search (Blast). Step 3: CDS definition (HSPs merge; extension of the similarity region searching for start/stop signals). Step 4: Solving conflicts (solving duplicates; solving overlaps). Step 5: RNA search (Blast). Step 6: Incorporation of RNA genes (definition of RNA genes; conflicts with previously annotated protein-coding genes are solved).
  • 12. Step 2: Protein search with tBlastn. Reference proteins (aa) are searched in the contig sequences. [Figure: reference proteins A, B, C aligned against the input contigs (aa)]
  • 13. Step 3: CDS definition. Merging HSPs. [Figure: several HSPs between a protein and the input contigs (aa)]
  • 14. Step 3: CDS definition. Merging HSPs. [Figure: several HSPs between a protein and the input contigs (aa)] We merge the HSPs to form a single similarity region.
  • 15. Step 3: CDS definition. Search for start/stop signals. We then search for start/stop signals upstream and downstream of the region with high similarity to the protein.
  • 16. Step 3: CDS definition. Even if we don’t find a start/stop codon for a given CDS, we keep it; we just mark it accordingly.
  • 17. Step 4: Solving conflicts. Duplicates.
  • 18. Step 4: Solving conflicts. Duplicates.
  • 19. Step 4: Solving conflicts. Overlapping CDS.
  • 20. Step 5: RNA search. Blastn. Reference RNAs (nt) are searched in the input contigs (nt).
  • 21. Step 6: Incorporation of RNA genes. Definition of RNA genes. [Figure: RNA genes placed on the input contigs (nt)]
  • 22. Step 6: Incorporation of RNA genes. Conflicts with protein-coding genes are solved: if we find both a protein-coding gene and an RNA gene in a particular region, the RNA gene is selected over the protein-coding one.
  • 23. Finally. [Figure: the sequence TGGATGTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGGAAAGGCTGA annotated with genes A, B, C, D, E]
  • 24. Comparisons. We’ve compared the NCBI annotations for Escherichia coli str. K-12 substr. MG1655 (RefSeq ID NC_000913) with the BG7 annotations.
  • 25. Comparisons. The results we got were (feature: NCBI vs. BG7): protein-coding genes: 4145 vs. 4370 selected genes (4951 detected in total, selected + dismissed); RNA: 175 vs. 156.
  • 26. Comparisons.
  • 27. Comparisons: Conclusions. Even in a disadvantageous situation (not an NGS project, and a very well annotated genome), in a single annotation round we got: ~95% of the NCBI protein-coding genes; ~89% of the NCBI RNA genes; 419 new proteins detected.
  • 28. Upcoming features. Improvements are now focused on: the overlap-solving phase; detection of very small proteins; and any new need we find while using it.
  • 29. Thanks: Oh no sequences! team. Raquel Tobes: Bioinformatician, main advisor; Pablo Pareja: Main developer; Eduardo Pareja: Scientific advisor; Eduardo Pareja-Tobes: Mathematician, advisor; Carmen Torrecillas: Junior Bioinformatician; Marina Manrique: Bioinformatician.
  • 30. Thanks for your attention!
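
The CDS definition and RNA incorporation steps described in slides 13 to 16 and 22 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BG7's actual code: the names `Gene`, `merge_hsps`, `extend_to_signals` and `prefer_rna` are hypothetical, the start-codon set is an assumed common bacterial one, and the sketch handles only the forward strand with HSP coordinates that are already in frame.

```python
from dataclasses import dataclass

# Assumption: a common bacterial start-codon set; the slides do not list one.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Gene:
    start: int              # 0-based, inclusive
    end: int                # 0-based, exclusive
    kind: str               # "protein" or "rna"
    has_start: bool = True  # False when no start codon was found
    has_stop: bool = True   # False when no stop codon was found


def merge_hsps(hsps):
    """Merge the (start, end) HSPs of one protein into one similarity region."""
    return min(s for s, _ in hsps), max(e for _, e in hsps)


def extend_to_signals(contig, start, end):
    """Search upstream for a start codon and downstream for a stop codon.

    Assumes (end - start) is a multiple of 3. If a signal is missing,
    the CDS is kept with its original boundary and flagged accordingly.
    """
    has_start = has_stop = False
    pos = start
    while pos >= 0:                      # walk upstream, codon by codon
        codon = contig[pos:pos + 3]
        if codon in START_CODONS:
            start, has_start = pos, True
            break
        if codon in STOP_CODONS:         # ran into the previous ORF
            break
        pos -= 3
    pos = end
    while pos + 3 <= len(contig):        # walk downstream, codon by codon
        if contig[pos:pos + 3] in STOP_CODONS:
            end, has_stop = pos + 3, True
            break
        pos += 3
    return Gene(start, end, "protein", has_start, has_stop)


def overlaps(a, b):
    """True when the two genes share at least one position."""
    return a.start < b.end and b.start < a.end


def prefer_rna(protein_genes, rna_genes):
    """Drop protein-coding genes that overlap an RNA gene; keep the RNA gene."""
    kept = [p for p in protein_genes
            if not any(overlaps(p, r) for r in rna_genes)]
    return sorted(kept + rna_genes, key=lambda g: g.start)


if __name__ == "__main__":
    region = merge_hsps([(5, 8), (11, 14)])
    cds = extend_to_signals("CCATGAAACCCGGGTTTTAACC", *region)
    print(cds)
    print(prefer_rna([cds, Gene(50, 80, "protein")], [Gene(15, 40, "rna")]))
```

Keeping `has_start` and `has_stop` as flags mirrors the choice on slide 16: a CDS with a missing signal, for example one truncated at a contig edge, is retained and marked rather than discarded.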