DATABASES IN
BIOINFORMATICS
Introduction
• Biological information is increasing rapidly
• Biology has become a data-rich science
• Gene sequences
• Amino acid sequences in proteins
• Motifs and domains in proteins
• Structural data from X-ray diffraction (XRD) and NMR
• Metabolic pathways
• Protein-protein interactions
• Gene expression data from DNA microarrays
Biological databases
• A biological database is a collection of data that is structured, searchable, periodically updated and cross-referenced.
• Some databases are multifunctional.
• The major purposes of databases are:
  • Availability of biological data
  • Systematization of data
  • Analysis of computed biological data
History
• 1956: insulin was sequenced, marking the beginning of sequence data collection
  • 51 amino acids
• 1965: the Atlas of Protein Sequence and Structure, by Margaret Dayhoff et al., was published as a printed book
  • It became the basis for the Protein Information Resource (PIR)
• First nucleotide sequence: yeast tRNA
  • 77 bases
• During this period the 3D structures of proteins were being studied, and the well-known Protein Data Bank (PDB) was established
…
• The first published genome of a free-living organism was that of the bacterium Haemophilus influenzae, in 1995
• What is a genome?
  • All genes, or all DNA?
• Why are complete genomes interesting?
Aspects of genome analysis
• Ab initio gene prediction
• Locus
• Gene identification by EST (expressed sequence tags)
• Gene prediction via EST
• Gene prediction via comparison, coding and regulatory regions
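As a rough illustration of the ab initio idea (predicting genes from the sequence alone, without ESTs or comparison to other genomes), the sketch below scans the three forward reading frames for open reading frames (ORFs). It is only the simplest possible stand-in for ab initio prediction, not the method used by any particular gene finder; the function name `find_orfs`, the `min_codons` threshold and the example sequence are assumptions made for this example.

```python
# Minimal ORF scan: a simplified stand-in for ab initio gene prediction.
# All names and the example sequence are hypothetical.

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    start is inclusive and end is exclusive (the stop codon is included).
    min_codons counts the codons from the start codon up to, but not
    including, the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATTTGGGTAACC"):
        print(frame, start, end)
```

Real gene finders also consider the reverse strand, splice sites and statistical models of coding regions, which this sketch leaves out.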
Features of biological databases
1) Data heterogeneity
2) High volume data
3) Uncertainty
4) Data Curation
5) Large scale data integration
6) Data sharing
7) Dynamic and subject to change
Classification scheme for biological databases
• Data type
• Maintenance status
• Data access
• Data source
• Database design
• Organism
Data type
• Genome database
• Sequence database
• Structure database
• Microarray database
• Chemical database
• Pathway database
• Enzyme database
• Disease database
• Literature database
Based on maintenance status
NCBI, EMBL, SIB
Based on data access
1) Publicly available
2) Available with copyright
3) Browsing only, accessible but not
downloadable
4) Academic but not freely available
5) Proprietary commercial
6) Restricted
Based on data sources
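Primary databases
• Contain original data submitted by researchers
• Mostly public or open access
• GenBank (NCBI)
• EMBL
• SWISS-PROT (sometimes classed as secondary because its entries are manually curated)
• NDB (Nucleic Acid Database)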
Secondary databases
• Derived from the entries of primary databases
• Manually created or automatically generated
• Swiss-Prot, whose entries are manually curated, is an example of a secondary database
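The classification axes listed in these slides (data type, maintenance status, data access, data source) can be pictured as fields of a small record. The sketch below is only an illustration: the class `DatabaseEntry`, the helper `select` and the three catalogue rows are assumptions made for this example, and Swiss-Prot is labelled "secondary" following the Secondary databases slide.

```python
# Hypothetical catalogue illustrating the classification scheme;
# class, field and function names are made up for this sketch.
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseEntry:
    name: str
    data_type: str   # e.g. "sequence", "structure", "pathway"
    maintainer: str  # maintenance status: who maintains the database
    access: str      # e.g. "public", "proprietary", "restricted"
    source: str      # "primary" or "secondary"


CATALOGUE = [
    DatabaseEntry("GenBank", "sequence", "NCBI", "public", "primary"),
    DatabaseEntry("EMBL", "sequence", "EMBL", "public", "primary"),
    DatabaseEntry("Swiss-Prot", "sequence", "SIB", "public", "secondary"),
]


def select(catalogue, **criteria):
    """Return the entries whose fields match every given criterion."""
    return [
        entry for entry in catalogue
        if all(getattr(entry, key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    print([e.name for e in select(CATALOGUE, source="primary")])
```

Keeping each axis as a separate field means one database can be looked up under any of the classifications without duplicating its record.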
Thanks…
Biological sequence databases
Lecture # 5
By: Hira Shahzad
DDBJ
• DNA Data Bank of Japan
• Nucleotide sequence database
• Established in 1986
• Works in collaboration with EMBL and NCBI
• After 20 years, a further collaborative project, the INSDC (International Nucleotide Sequence Database Collaboration), was formed
• INSDC members: EMBL, GenBank, DDBJ
SWISS-PROT
• Protein sequence database
• Maintained by the Swiss Institute of Bioinformatics (SIB) in Switzerland together with the European Bioinformatics Institute (EBI)
• The output format is the Swiss-Prot file
• This format has been explained under molecular file formats
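To give a feel for the Swiss-Prot file mentioned above, here is a minimal sketch of reading one entry. A Swiss-Prot flat file starts each line with a two-letter line code (ID, AC, DE, OS, SQ and others), puts the sequence on indented lines after the SQ line, and ends the entry with "//". The record below is made up and heavily abbreviated (the accession, names and the MW/CRC64 values are placeholders), and `parse_swissprot` is a hypothetical helper, not part of any library.

```python
# Minimal reader for a single Swiss-Prot style flat-file entry.
# RECORD is a made-up, abbreviated example with placeholder values.

RECORD = """\
ID   EXAMPLE_HUMAN           Reviewed;          12 AA.
AC   X00000;
DE   RecName: Full=Hypothetical example protein;
OS   Homo sapiens (Human).
SQ   SEQUENCE   12 AA;  0 MW;  0 CRC64;
     MKTAYIAKQR QI
//
"""


def parse_swissprot(text):
    """Collect a few line types and the sequence from one entry."""
    entry = {"sequence": ""}
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            break  # end-of-entry marker
        code, data = line[:2], line[5:].strip()
        if code == "SQ":
            in_sequence = True  # sequence lines follow the SQ header
        elif in_sequence:
            # Residues are written in space-separated blocks.
            entry["sequence"] += data.replace(" ", "")
        elif code in ("ID", "AC", "DE", "OS"):
            entry.setdefault(code, []).append(data)
    return entry


if __name__ == "__main__":
    parsed = parse_swissprot(RECORD)
    print(parsed["AC"], parsed["sequence"])
```

For real work an established parser is the safer choice; this sketch ignores most line types and handles only a single entry.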
Good luck
