Databases
INSDC
International Nucleotide Sequence Database Collaboration
GenBank EMBL DDBJ
Sequence types:
Eukaryotic gene
Bacterial operon
Artificial Cloning vectors
Plasmid
Repeat element
Transfer RNA
GenBank
Nucleic Acids Research, 2008, 36, D25-D30
GenBank
Doubled in size about every 18 months
80 billion nucleotide bases from more than 76 million individual sequences
Sequence-based taxonomy
260,000 named species
1,700 species are being added per month
12% are of human origin and 8% are human ESTs
The top species
Homo sapiens ------- 12.7 billion
Mus musculus ------- 8.3 billion
Rattus norvegicus ------- 5.8 billion
Bos taurus ------- 3.8 billion
Zea mays ------- 3.6 billion
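The 18-month doubling time quoted above can be turned into a rough projection. This is a minimal sketch that assumes idealized exponential growth with a constant doubling time; the function name `projected_bases` is hypothetical, and the 80 billion starting value is the figure from the slide.

```python
def projected_bases(current_bases: float, months_ahead: float,
                    doubling_months: float = 18.0) -> float:
    """Project database size, assuming a constant doubling time (an idealization)."""
    return current_bases * 2 ** (months_ahead / doubling_months)


start = 80e9  # 80 billion bases, as quoted on the slide
for months in (0, 18, 36):
    print(months, projected_bases(start, months))
```

Real growth is not perfectly exponential, so the numbers are only an order-of-magnitude guide.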
GenBank
Records and Divisions
GenBank records
Partitioned into “divisions”
Traditional: BCT, VRL, PRI, ROD
Recent: EST, GSS, HTG, HTC, ENV
WGS
Special: TPA
WGS: accession numbers are issued to these sequences
e.g., AAAA01072744
AAAA: project ID
01: version number
072744: contig number
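The accession layout above can be split mechanically. A minimal sketch, assuming the exact layout shown on the slide (four letters, two digits, six digits); longer accession layouts are not handled here, and the names `WgsAccession` and `parse_wgs_accession` are hypothetical.

```python
import re
from typing import NamedTuple


class WgsAccession(NamedTuple):
    project_id: str
    version: int
    contig: int


# Assumed layout from the slide: 4-letter project ID, 2-digit version, 6-digit contig.
_WGS_RE = re.compile(r"^([A-Z]{4})(\d{2})(\d{6})$")


def parse_wgs_accession(accession: str) -> WgsAccession:
    """Split a WGS accession into project ID, version number, and contig number."""
    match = _WGS_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a 4+2+6 WGS accession: {accession!r}")
    project, version, contig = match.groups()
    return WgsAccession(project, int(version), int(contig))


print(parse_wgs_accession("AAAA01072744"))
# WgsAccession(project_id='AAAA', version=1, contig=72744)
```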
TPA: Third Party Annotation
1. Experimental
2. Inferential
Data submission
BankIt
Use BankIt if:
you have one or a few sequence submissions
you prefer to use a WWW-based submission tool
your sequence annotation is not complicated
you do not require sequence analysis tools to submit your sequence(s)
Sequin
Use Sequin if:
you are submitting long or complex submissions
you are submitting mutation, phylogenetic, population, environmental, or segmented sets
you would like graphical viewing and editing options, including the alignment editor
you would like network access to related analytical tools
EMBL
Maintained by EBI
Other databases of EBI
Swiss-Prot
TrEMBL
UniProt
InterPro
E-MSD
ArrayExpress
EMBL
Taxonomic
Invertebrates, Organelles, Bacteriophages, Plants, Prokaryotes, Rodents
Non-Taxonomic
HTG, HTC, GSS, WGS, EST
EMBL representation
Genomic data: coding strand
cDNA data: RNA sequence
tRNA: mature transcript
WebIn: data submission tool
DDBJ
Entry format and abbreviation keys are the same as GenBank
SAKURA: data submission tool
Secondary Nucleotide Sequence Databases
UniGene
Database of unique gene clusters
STACK (Sequence Tag Alignment and Consensus Knowledgebase)
Ribosomal database
HIV sequence database
EPD (Eukaryotic Promoter Database)
REBASE
Swiss-Prot
Curated protein sequence database
High level of annotation
Description of the function
Domain structure
Post-translational modifications (PTMs)
Variants
TrEMBL
Consists of entries in SWISS-PROT-like format
PIR-PSD
Protein Information Resource - Protein Sequence Database
World’s first database of classified and functionally annotated protein sequences
Grew out of The Atlas of Protein Sequence and Structure
UniProt
Universal Protein Resource
Comprehensive resource for protein sequence and annotation data
Sequence Alignments
Alignments:
Pairwise
Multiple
Multiple sequence alignments provide information on:
Alignment itself
Consensus Sequences
Conserved residues
Conserved residue patterns
Sequence Profiles
Consensus Sequence Databases
Multiple Alignment
↓
A single sequence in which each residue is the most common (consensus) residue at that position in the sequence family
↓
Consensus Sequence Database
Consensus Sequence Databases
Disadvantage:
Much information from the sequences that do not contain the consensus residue is ignored, even though these hold information about allowed substitutions.
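The reduction from a multiple alignment to a consensus sequence can be sketched in a few lines. This is an illustrative sketch, not any particular database's procedure: the toy alignment is made up, ties are broken alphabetically (an arbitrary choice), and gap characters are counted like any other symbol.

```python
from collections import Counter


def consensus(alignment: list[str]) -> str:
    """Return the most common residue in each column of an alignment."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    residues = []
    for column in zip(*alignment):
        counts = Counter(column)
        # Highest count wins; ties are broken alphabetically for determinism.
        residues.append(min(counts, key=lambda r: (-counts[r], r)))
    return "".join(residues)


toy_alignment = ["LTAAHC", "ISASHC", "LTAGHC", "VTAAHC"]  # made-up example
print(consensus(toy_alignment))  # LTAAHC
```

In the first column the consensus keeps only L and discards I and V, which is exactly the loss of substitution information described above.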
PROSITE
Database of sequence patterns
Associated with protein family membership.
Developed using patterns that best fit particular protein families and functions.
PROSITE
Serine protease family:
Pattern 1:
[LIVM]-[ST]-A-[STAG]-H-C
Pattern 2:
[DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-PA-[LIVMFYSTANQH]
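PROSITE-style patterns like these map closely onto regular expressions. The sketch below translates a subset of the syntax: `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, and `<` / `>` anchors at the ends of the pattern. The function name `prosite_to_regex` and the test sequence are hypothetical, and anchors placed inside brackets are not handled.

```python
import re

# One pattern element, optionally followed by a repeat count such as (2) or (2,4).
_ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pieces = []
    for raw in pattern.strip().rstrip(".").split("-"):
        element = raw.strip()
        start = "^" if element.startswith("<") else ""
        end = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"cannot parse element {raw!r}")
        core, low, high = match.groups()
        if core == "x":
            body = "."                        # any residue
        elif core.startswith("[") and core.endswith("]"):
            body = core                       # allowed residues
        elif core.startswith("{") and core.endswith("}"):
            body = "[^" + core[1:-1] + "]"    # excluded residues
        else:
            body = re.escape(core)            # literal residue(s)
        if low is not None:
            body += "{" + low + ("," + high if high else "") + "}"
        pieces.append(start + body + end)
    return "".join(pieces)


regex = prosite_to_regex("[LIVM]-[ST]-A-[STAG]-H-C")
print(regex)  # [LIVM][ST]A[STAG]HC
hit = re.search(regex, "GGWVLTAAHCFK")  # made-up test sequence
print(hit.group(), hit.span())  # LTAAHC (4, 10)
```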
PROSITE
Features:
1. Much shorter than the total sequence length.
2. Provide information on acceptable substitutions.
3. Provide information on shared biological functions.
PROSITE
Disadvantages:
1. Lack of specificity.
2. They have no way of attaching probabilities to the variation.
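To illustrate the second point, compare a pattern with per-column residue frequencies, which do attach a probability to each observed variant. This is a minimal sketch on a made-up alignment; the function name `column_frequencies` is hypothetical, and no pseudocounts or sequence weighting are applied.

```python
from collections import Counter


def column_frequencies(alignment: list[str]) -> list[dict[str, float]]:
    """Return, for each alignment column, the observed frequency of each residue."""
    n = len(alignment)
    return [
        {residue: count / n for residue, count in Counter(column).items()}
        for column in zip(*alignment)
    ]


toy_alignment = ["LTAAHC", "ISASHC", "LTAGHC", "VTAAHC"]  # made-up example
for position, freqs in enumerate(column_frequencies(toy_alignment), start=1):
    print(position, freqs)
```

A pattern element such as [LIVM] treats L, I, and V as equally acceptable, whereas the frequencies for the first column here record that L was seen twice as often as I or V.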
PRINTS and BLOCKS
Contain multiply aligned ungapped segments.
BLOCKS: blocks
PRINTS: motifs
PRINTS and BLOCKS
Advantages:
1. Potentially more sensitive (more distant relationships can be found)
2. More specific (fewer false positives occur)
Specialized Sequence Database
rRNA database
tRNA database
5S rRNA database
Promoter sequence database
InBase, a database on inteins
OMIM
Online Mendelian Inheritance in Man
Comprehensive database of human genes and genetic disorders.
Has numerous links to databases such as Swiss-Prot, PubMed, mutation databases, and Map Viewer.
Structural Databases
RCSB
1. PDB
2. NDB
Structural Databases
PDBe of EBI
MMDB
Structures derived from the PDB, with value-added features such as:
Explicit chemical graphs
Links to literature
Similar sequences
Related 3D structures
Information about chemicals
Structural Databases
CATH
C: Class
A: Architecture
T: Topology
H: Homologous superfamily
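CATH identifiers are written as four dot-separated numbers, one per level of the hierarchy above. A minimal sketch of splitting such a code into its levels; the names `CathCode` and `parse_cath_code` are hypothetical, and the example code is used only to show the layout.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a dotted C.A.T.H code into its four hierarchy levels."""
    parts = code.strip().split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(part) for part in parts))


print(parse_cath_code("3.40.50.720"))
# CathCode(cath_class=3, architecture=40, topology=50, homologous_superfamily=720)
```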
SCOP
Structural Classification of Proteins
Higher Order Functions Databases
KEGG (Kyoto Encyclopedia of Genes and Genomes)
Subsidiary Databases
Contains 16 main databases
Higher Order Functions Databases
DIP (Database of Interacting Proteins)
BIND (Biomolecular Interaction Network Database)
Literature Databases
PubMed
Web of Science
BioMedNet
Data retrieval tools
Entrez
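Entrez can also be queried programmatically through NCBI's E-utilities web interface, which the slides do not cover. The sketch below only builds an EFetch request URL and makes no network call; the helper name `efetch_url` is hypothetical, and the accession U49845 is used purely as an example identifier.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db: str, record_id: str,
               rettype: str = "gb", retmode: str = "text") -> str:
    """Build an EFetch URL for one record (no request is sent here)."""
    query = urlencode({"db": db, "id": record_id,
                       "rettype": rettype, "retmode": retmode})
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


print(efetch_url("nuccore", "U49845"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the record in GenBank flat-file format, subject to NCBI's usage guidelines.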
Biological databases
