Free Powerpoint Templates
Page 1
Protein Database
By
KAUSHAL KUMAR SAHU
Assistant Professor (Ad Hoc)
Department of Biotechnology
Govt. Digvijay Autonomous P. G. College
Raj-Nandgaon ( C. G. )
Introduction
• Bioinformatics is the application of information technology to store, organize,
and analyze the vast amount of biological data available in the form
of sequences and structures of proteins and nucleic acids. Nucleic acid data
are available as sequences, while protein data are available as both
sequences and structures.
• A biological database is a collection of data organized so that its
contents can easily be accessed, managed, and updated. Preparing a
database involves two activities:
• Collecting the data in a form that can be easily accessed
• Making the data available on a multi-user system (always available to the user)
The network for production, construction
and accession of a database
[Diagram: EXPERIMENTS → ORGANIZATION OF DATA → HOST/SERVER DATABASES, held in EDS (Electronic Data Storage) → NETWORK → USERS → ONLINE ACCESS → COPY → PERSONAL DATABASE]
Protein databases
• Protein databases are more specialized than primary sequence
databases. They contain information derived from the primary
sequence databases. Some contain protein translations of the
nucleic acid sequences. Some contain sets of patterns and motifs
derived from sequence homologs.
History
• The first database was created shortly after the insulin protein
sequence became available in 1956. Insulin was the first protein to be
sequenced; its sequence consists of just 51 residues.
• In 1959, V.M. Ingram made the first attempt to compare sickle cell
haemoglobin with normal haemoglobin and demonstrated their homology.
This led to more protein sequencing and a rapid accumulation of sequence
information, and hence to the realization that a database was needed so that
proteins could be compared quickly using computational software.
• In 1965, Margaret Dayhoff established the first database of protein
sequences, which was published annually as a series of volumes
entitled “Atlas of Protein Sequence and Structure”.
• In 1971, the Protein Data Bank was established as the first protein structure
database.
Classification of biological databases
Primary database:
Protein Data Bank (PDB)
• Three-dimensional structures are stored in the Protein Data Bank (PDB).
It is the single worldwide archive of structural data derived by X-ray
crystallography, nuclear magnetic resonance spectroscopy, and other
techniques, as well as structural models.
• The database is maintained by the Research Collaboratory for Structural
Bioinformatics (RCSB) at Rutgers University.
• Data in the PDB are of very high quality and are extensively curated.
Homepage
Sequence database:
SWISS-PROT protein sequence database
• SWISS-PROT was created at the Department of Medical Biochemistry
(University of Geneva) in 1986.
• In 1987, the European Molecular Biology Laboratory and the Swiss Institute of
Bioinformatics (SIB) began working in collaboration, as equal partners, to develop
and maintain this highly annotated repository of protein sequences.
• It provides high-quality annotation with minimal redundancy.
Translated EMBL (TrEMBL)
• It was created in 1996 to fill the gap between the flow of
genomic data and the supply of annotated protein sequences.
• TrEMBL contains computer-annotated records generated by translating the
coding sequences (CDS) available in the EMBL nucleotide sequence database.
• It has two main sections:
• SP-TrEMBL
• REM-TrEMBL
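The idea of deriving a protein record by translating a coding sequence can be illustrated with a minimal Python sketch. This is not the actual TrEMBL pipeline; it assumes the standard genetic code, an in-frame CDS on the forward strand, and unambiguous bases. The function name `translate_cds` and the toy input sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate an in-frame CDS into a protein string (illustrative helper).

    Translation stops at the first stop codon. Ambiguous bases such as N
    are not handled and raise KeyError.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    if len(seq) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequence, not taken from any database entry.
print(translate_cds("ATGGCCAAGTAA"))  # prints MAK
```

A production pipeline would also have to deal with alternative genetic codes, partial coding sequences, and annotation transfer, none of which is attempted here.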
Protein Information Resource (PIR)
• PIR was established in 1984 by the National Biomedical Research
Foundation (NBRF) as a resource to assist researchers in the identification
and interpretation of protein sequence information.
• The database is split into four sections, PIR1 to PIR4:
– PIR1 contains fully classified and annotated entries.
– PIR2 includes preliminary entries.
– PIR3 contains unverified entries.
– PIR4 entries fall into the following categories:
• Conceptual translations of sequences
• Protein sequences
• Conceptual translations of artifactual sequences
• Sequences that are not genetically encoded and not produced on ribosomes
Homepage
Secondary databases:
Structural Classification of Proteins (SCOP)
• It was created in 1995 by Murzin et al. and is maintained at Cambridge. Its
aim is to gather information about structural similarities between proteins in order to
increase our understanding of protein evolution and development.
• SCOP provides comprehensive information on the structural and evolutionary
relationships of proteins of known structure, including structures available in the
Protein Data Bank.
• The manually constructed SCOP classifies proteins in a hierarchy of
class, fold, superfamily, family, protein, and species.
Class Architecture Topology Homology (CATH)
• The CATH database, established in 1993, is a protein structure classification
based on four levels: Class, Architecture, Topology, and Homology.
• CATH contains a hierarchical domain classification of the protein structures
in the Protein Data Bank and is maintained at University College
London.
• The classification is produced by a combination of automated and manual
methods.
Sequence database:
1. PROSITE:
• It is a method for determining the function of uncharacterized
proteins translated from genomic or cDNA sequences.
• It consists of a database of biologically significant sites, patterns, and
profiles that help to reliably identify which known protein family (if any)
a new sequence belongs to.
• It includes protein pattern motifs indicative of a protein’s function, which are widely
used for function prediction studies, cellular localization annotation, and
sequence classification.
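How a PROSITE-style pattern is matched against a sequence can be sketched in Python by converting the pattern into a regular expression. This is only an illustration, not the ScanProsite implementation: the helper names `prosite_to_regex` and `find_pattern` are hypothetical, only the common subset of the pattern syntax is handled (`x`, `[...]`, `{...}`, repeat counts, and the `<` and `>` anchors outside brackets), and the test sequence is made up. The pattern used, N-{P}-[ST]-{P}, is the N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern to a Python regex (illustrative)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":               # any residue
            core = "."
        elif core.startswith("{"):    # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_pattern(pattern, sequence):
    """Return (start, matched_text) pairs, overlapping matches included."""
    regex = "(?=(" + prosite_to_regex(pattern) + "))"
    return [(m.start(), m.group(1)) for m in re.finditer(regex, sequence)]


print(prosite_to_regex("N-{P}-[ST]-{P}."))             # prints N[^P][ST][^P]
print(find_pattern("N-{P}-[ST]-{P}.", "MKNASAANPTQ"))  # prints [(2, 'NASA')]
```

In the toy sequence, the second asparagine is followed by a proline, so it is correctly rejected by the {P} element. A pattern hit like this suggests a possible site only; short patterns match by chance quite often.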
Homepage
3. BLOCKS:
• Blocks are multiply aligned, ungapped segments corresponding to the most
highly conserved regions of proteins.
• The BLOCKS database itself contains more than 4000 entries.
4. Pfam:
• Pfam uses Hidden Markov Models (HMMs) to create protein family or domain
signatures.
• These signatures are thus particularly useful when analysing multidomain proteins.
• The biggest drawback of Pfam is its lack of biological information
(annotation) about the protein families.
Important database search tools:
SEARCH TOOL | FUNCTION PROVIDED
BLAST (Basic Local Alignment Search Tool) | Analyzes sequence information and detects homologous sequences.
ENTREZ | Provides access to literature, sequence, and structure databases.
DNAPLOT | Sequence alignment tool.
LOCUSLINK | Provides access to information on homologous genes.
STRUCTURE | Supports the Molecular Modeling Database (MMDB) and software tools for structure analysis.
TAXONOMY BROWSER | Taxonomic classification of various species, as well as genetic information.
FASTA | Provides an algorithm that speeds up sequence comparison.
Example: studying the protein sequence of hepatitis B virus
surface antigen (FASTA output from NCBI)
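A record retrieved in FASTA format, as in this example, consists of a header line starting with ">" followed by one or more lines of sequence. The minimal Python sketch below reads such text into (header, sequence) pairs. The function name `parse_fasta` is hypothetical, and the record shown is a made-up placeholder, not the real hepatitis B surface antigen entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):          # a new record begins
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:          # sequence line of the current record
            chunks.append(line)
    if header is not None:                # flush the last record
        records.append((header, "".join(chunks)))
    return records


# Placeholder record for illustration only; identifier and residues are invented.
example = """>toy_record placeholder sequence
MKTAYIAK
QRQISFVK
"""

for header, sequence in parse_fasta(example):
    print(header, len(sequence))  # prints toy_record placeholder sequence 16
```

For real work, an established parser such as the one in Biopython is usually a better choice than hand-written code.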
Applications of protein databases
• Protein sequence
• Determination of macromolecular structure
• Molecular evolution
• Drug development
Conclusion
• Most protein structure databases aim to organize and annotate
protein structures, giving the biological community access to the
experimental data in a useful form, whereas sequence databases focus on
sequence information and contain no structural information for the majority
of their entries.
• Bioinformatics tools that make research more efficient will therefore
have a significant impact on the biological sciences and on the betterment of human
lives.
