Vartika's Presentation
Data
Data are raw, unorganized facts that need to be processed.
Example: each student's test score is one piece of data.
INFORMATION
When data is processed,
organized, structured or presented in
a given context so as to make it
useful, it is called information.
What is a database?
• Databases are convenient systems for properly storing, searching and retrieving any type of data.
• A database makes it easy to handle and share large amounts of data, and supports large-scale analysis through easy access and data updating.
What is a biological database?
• Biological databases are libraries of life sciences information, collected from scientific experiments, published literature, high-throughput experiment technology and computational analysis.
• They contain information from genomics, proteomics and microarray gene expression.
• Information contained in biological databases includes function, gene structure, localization (both cellular and chromosomal), biological sequences and structures.
The major purposes of these databases are:
• Availability of biological data.
• Systematization of data.
• Analysis of computed biological data.
History:
• 1956: sequence data collection began when insulin, a protein of 51 amino acids, was sequenced.
• 1965: the Atlas of Protein Sequence and Structure by Margaret Dayhoff et al. appeared as a printed book.
• The Atlas became the basis for the Protein Information Resource (PIR).
• First nucleotide sequence: yeast tRNA (77 bases).
• During this period the 3D structures of proteins were being studied and the renowned Protein Data Bank (PDB) was established.
• 1995: the first published genome of a free-living organism was that of the bacterium Haemophilus influenzae.
Features of Biological Data Bases:
1) Data heterogeneity
2) High volume data
3) Uncertainty
4) Data Curation
5) Large scale data integration
6) Data sharing
7) Dynamic and subject to change
Classification scheme for biological databases:
Data type
Maintenance status
Data access
Data source
Database design
Organism
Data types:
Based on data sources
Content Based:
Genome database
Sequence database
Structure database
Microarray database
Chemical database
Pathway database
Enzyme database
Disease database
Literature database
Based on maintenance status:
NCBI, EMBL, SIB
Based on data access:
1) Publicly available
2) Available with copyright
3) Browsing only: accessible but not downloadable
4) Academic but not freely available
5) Proprietary commercial
6) Restricted
Biological sequence databases
Database architecture
An information system has three layers:
• Query system: Google, Entrez, SRS (takes your search keywords)
• Storage system: Oracle, MySQL, PC binary files, Unix text files, bookshelves
• Data: GenBank flat file, PDB file, interaction record, title of a book, book
A sequence retrieval and manipulation network
• Databases
  DNA: NCBI-GenBank, DDBJ, EBI-EMBL
  Protein: PIR, SWISS-PROT, ExPASy, PDB
• Software: GCG, SeqWeb, Vector NTI, GenoMAX
• Retrieval systems: Entrez, SRS
• Formats: GenBank, GCG, FASTA, Staden, image
• Sequence converter
• Information: sequence, PDB, image
Types of biological databases
• Primary databases
• Secondary databases
Primary databases
These are the primary sources of data, used to store nucleic acid sequences, protein sequences and structural information of biological macromolecules.
Some primary databases:
• NCBI (The National Center for Biotechnology Information)
• GenBank
• DDBJ (DNA Data Bank of Japan)
• SWISS-PROT
• PIR (Protein Information Resource)
• PDB (Protein Data Bank)
The sequence collections in these databases result from the efforts of basic research in academic, industrial and sequencing laboratories.
GenBank/EMBL/DDBJ: International Nucleotide Sequence Database
DDBJ: DNA Data Bank of Japan
CIB: Center for Information Biology and DNA Data Bank of Japan
NIG: National Institute of Genetics
IAM: International Advisory Meeting
ICM: International Collaborative Meeting
EMBL: European Molecular Biology Laboratory
EBI: European Bioinformatics Institute
NCBI: National Center for Biotechnology Information
NLM: National Library of Medicine
Secondary Database
• A secondary database contains additional information derived from the analysis of data available in primary sources.
• Secondary databases are analysed in a variety of ways and contain different information in different formats.
• Some secondary databases:
• TrEMBL
• Pfam
• PROSITE
• Profiles
• SCOP
• CATH
Flat File Storage Data Formats
• When GenBank, EMBL and DDBJ formed a collaboration (1986), the sequence databases moved to a defined flat file format with a shared feature table format and annotation standards.
• The flat file formats from the sequence databases are still used to access and display sequences and annotation. They are also convenient for storing local copies.
The National Center for Biotechnology Information (NCBI)
Bethesda, MD
Created in 1988 as a part of the National Library of Medicine at NIH to:
– Establish public databases
– Conduct research in computational biology
– Develop software tools for sequence analysis
– Disseminate biomedical information
NCBI Databases and Services
• GenBank: primary sequence database
• Free public access to biomedical literature
• PubMed: free MEDLINE (3 million searches per day)
• PubMed Central: full-text online access
• Entrez: integrated molecular and literature databases
• BLAST: highest-volume sequence search service (100,000 to 200,000 searches per day)
• VAST: structure similarity searches
• Software and databases
GenBank (Genetic Sequence Databank)
• GenBank® is the genetic sequence database at the National Center for Biotechnology Information (NCBI).
• It was established in 1982 and is now maintained by NCBI.
• DNA sequences can be submitted to GenBank using several different methods.
• It contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects.
• It has a flat file structure: an ASCII text file that is readable by both humans and computers and can be downloaded.
• There are two main ways of making batch sequence submissions to GenBank: NCBI's Barcode Submission Tool (BarSTool) and Sequin.
EMBL
• The European Molecular Biology Laboratory (EMBL) is a molecular biology research institution supported by 22 member states, four prospect member states and two associate member states.
• EMBL was created in 1974 and is an intergovernmental organisation funded by public research money from its member states.
• The Laboratory operates from five sites: the main laboratory in Heidelberg, and outstations in Hinxton (the European Bioinformatics Institute (EBI), in England), Grenoble (France), Hamburg (Germany) and Monterotondo (near Rome).
• EMBL groups and laboratories perform basic research in molecular biology and molecular medicine, as well as training for scientists, students and visitors.
• Israel is the only Asian state that has full membership.
• The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), maintained at the European Bioinformatics Institute (EBI), incorporates and distributes nucleotide sequences from public sources.
• The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA).
• Data are exchanged between the collaborating databases on a daily basis.
• The web-based tool Webin is the preferred system for individual submission of nucleotide sequences, including Third Party Annotation (TPA) and alignment data.
• Automatic submission procedures are used for submission of data from large-scale genome sequencing.
• The latest data collection can be accessed via FTP, email and WWW interfaces.
• The EBI's Sequence Retrieval System (SRS) integrates and links the main nucleotide and protein databases as well as many other specialist molecular biology databases.
• For sequence similarity searching, a variety of tools (e.g. FASTA and BLAST) are available that allow external users to compare their own sequences against the data in the EMBL Nucleotide Sequence Database and other databases.
• All available resources can be accessed via the EBI home page.
EMBL format
ID   LISOD      standard; DNA; PRO; 756 BP.
XX
AC   X64011; S78972;
XX
SV   X64011.1
XX
DT   28-APR-1992 (Rel. 31, Created)
DT   30-JUN-1993 (Rel. 36, Last updated, Version 6)
XX
DE   L.ivanovii sod gene for superoxide dismutase
XX
KW   sod gene; superoxide dismutase.
XX
OS   Listeria ivanovii
OC   Bacteria; Firmicutes; Bacillus/Clostridium group;
OC   Bacillus/Staphylococcus group; Listeria.
XX
RN   [1]
RX   MEDLINE; 92140371.
RA   Haas A., Goebel W.;
RT   "Cloning of a superoxide dismutase gene from Listeria ivanovii by
RT   functional complementation in Escherichia coli and characterization of the
RT   gene product.";
RL   Mol. Gen. Genet. 231:313-322(1992).
XX
RN   [2]
RP   1-756
RA   Kreft J.;
RT   ;
RL   Submitted (21-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   J. Kreft, Institut f. Mikrobiologie, Universitaet Wuerzburg, Biozentrum Am
RL   Hubland, 8700 Wuerzburg, FRG
XX
DR   SWISS-PROT; P28763; SODM_LISIV.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..756
FT                   /db_xref="taxon:1638"
FT                   /organism="Listeria ivanovii"
FT                   /strain="ATCC 19119"
FT   RBS             95..100
FT                   /gene="sod"
FT   terminator      723..746
FT                   /gene="sod"
FT   CDS             109..717
FT                   /db_xref="SWISS-PROT:P28763"
FT                   /transl_table=11
FT                   /gene="sod"
FT                   /EC_number="1.15.1.1"
FT                   /product="superoxide dismutase"
FT                   /protein_id="CAA45406.1"
FT                   /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEA
FT                   VSGHAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNL
FT                   KAAIESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPV
FT                   LGLDVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
XX
SQ   Sequence 756 BP; 247 A; 136 C; 151 G; 222 T; 0 other;
     cgttatttaa ggtgttacat agttctatgg aaatagggtc tatacctttc gccttacaat        60
     gtaatttctt ttcacataaa taataaacaa tccgaggagg aatttttaat gacttacgaa       120
     ttaccaaaat taccttatac ttatgatgct ttggagccga attttgataa agaaacaatg       180
     gaaattcact atacaaagca ccacaatatt tatgtaacaa aactaaatga agcagtctca       240
     ggacacgcag aacttgcaag taaacctggg gaagaattag ttgctaatct agatagcgtt       300
     cctgaagaaa ttcgtggcgc agtacgtaac cacggtggtg gacatgctaa ccatacttta       360
     ttctggtcta gtcttagccc aaatggtggt ggtgctccaa ctggtaactt aaaagcagca       420
     atcgaaagcg aattcggcac atttgatgaa ttcaaagaaa aattcaatgc ggcagctgcg       480
     gctcgttttg gttcaggatg ggcatggcta gtagtgaaca atggtaaact agaaattgtt       540
ID - Identification.
AC - Accession number(s).
DT - Date.
DE - Description.
GN - Gene name(s).
OS - Organism species.
OG - Organelle.
OC - Organism classification.
RN - Reference number.
RP - Reference position.
RC - Reference comments.
RX - Reference cross-references.
RA - Reference authors.
RL - Reference location.
CC - Comments or notes.
DR - Database cross-references.
KW - Keywords.
FT - Feature table data.
SQ - Sequence header.
(blanks) - Sequence data.
// - Termination line.
Some entries do not contain all of the line types, and some line types occur many times in a single entry. Each entry must begin with an identification line (ID) and end with a terminator line (//).
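The two-letter line codes above lend themselves to a simple parser. The following is a minimal sketch in Python, assuming each line carries its code in the first two columns and its content from the sixth column onwards. The function name `parse_embl_entry` and the entry `TOY_ENTRY` are hypothetical and invented for illustration; the sketch does not handle every line type a real entry may contain.

```python
from collections import defaultdict


def parse_embl_entry(text):
    """Group the lines of one EMBL-style entry by their two-letter line code."""
    fields = defaultdict(list)
    sequence_parts = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # An entry must begin with an ID line and end with a // line.
    if not lines or not lines[0].startswith("ID"):
        raise ValueError("entry must begin with an ID line")
    if lines[-1].strip() != "//":
        raise ValueError("entry must end with a // terminator line")
    for line in lines[:-1]:
        code, content = line[:2], line[5:].rstrip()
        if code == "XX":
            # Spacer line: carries no data.
            continue
        if code == "  ":
            # Blank code means sequence data; drop the trailing position
            # counter and the spacing between blocks of ten bases.
            blocks = [b for b in content.split() if not b.isdigit()]
            sequence_parts.append("".join(blocks))
        else:
            fields[code].append(content)
    return dict(fields), "".join(sequence_parts)


# Hypothetical toy entry (not a real database record).
TOY_ENTRY = """\
ID   TOY01      standard; DNA; PRO; 20 BP.
XX
AC   X00000;
XX
DE   Hypothetical toy entry
XX
SQ   Sequence 20 BP; 5 A; 5 C; 5 G; 5 T; 0 other;
     acgtacgtac gtacgtacgt                                                    20
//
"""

fields, seq = parse_embl_entry(TOY_ENTRY)
print(fields["AC"])  # ['X00000;']
print(len(seq))      # 20
```

Line types that occur many times in one entry (for example FT or RL) end up as lists with one item per line, which keeps the original order for later processing.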
PubMed
• PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics.
• The PubMed system was offered free to the public in 1997.
• The United States National Library of Medicine (NLM) at the National Institutes of Health maintains the database as part of the Entrez system of information retrieval.
• A PMID is the unique identifier number used in PubMed.
• PMIDs are assigned to each article record when it enters the PubMed system.
• The PMID is always found at the end of a PubMed citation.
• PubMed Central (PMC) is a free digital archive of publicly accessible full-text scholarly articles that have been published in the biomedical and life sciences journal literature.
• A "PubMed Mobile" option provides access to a mobile version of the site.
Entrez
• A WWW-based data retrieval system.
• Developed by NCBI (National Center for Biotechnology Information).
• Integrates information held in different databases.
Databases covered by Entrez:
• Nucleic acid: GenBank, RefSeq, PDB
• Protein sequences: SWISS-PROT, PIR
• 3D structures: MMDB
• Genomes: many sources
• PopSet: from GenBank
• OMIM: OMIM
• Taxonomy: NCBI taxonomy database
• Books: Bookshelf
• ProbeSet: GEO (Gene Expression Omnibus)
• Literature: PubMed
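Entrez can also be queried from a program. The slides do not cover this, so the following is only a sketch under one assumption: that NCBI's E-utilities `esearch.fcgi` endpoint is used, with its `db`, `term` and `retmax` parameters. The helper name `build_esearch_url` is hypothetical. The code only builds the request URL and makes no network call.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (an assumption; not named in the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(database, term, max_results=20):
    """Build an ESearch URL for one Entrez database, e.g. 'pubmed'."""
    params = {"db": database, "term": term, "retmax": max_results}
    # urlencode escapes the search term (spaces become '+').
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


url = build_esearch_url("pubmed", "superoxide dismutase Listeria")
print(url)
# https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=superoxide+dismutase+Listeria&retmax=20
```

Changing the `database` argument selects a different Entrez database, which mirrors the way Entrez integrates several databases behind one query interface.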
SRS
• SRS is a Sequence Retrieval System.
• A data retrieval tool developed by EBI.
• Integrates 80 molecular biology databases.
• Open source software (can be installed locally).
• SRS has an associated scripting language called Icarus.
• A central resource for molecular biology data.
• More than 250 databanks have been indexed, on more than 35 SRS servers across the World Wide Web.
• Information retrieval
• Easy way to retrieve information from sequence and sequence-related
databases
• Possibility to search for multiple words/other criteria
• Linkage between different databases
• E.g. find all primary structures with known three-dimensional structures
• Different types of database in SRS
• Sequence & structure
• DNA, protein, three-dimensional structures
• Sequence-related
• Gene-related
• Genome, mapping, mutations, transcription factors
• SNP
• Bibliographic
• SRS main toolbar tabs:
• Top Page: displays databases in different database groups
• Query: displays either the standard or extended query form
• Results or “the query manager”: maintains a history of all the
results obtained during a session
• Projects or “the project manager”: maintains a history of all
queries and views used during a session
• Views: allows a user to define a user specific view for one or
more databases
• Databanks: contains a list and some facts about the databases
available in the system
• Search terms in SRS
• SRS indexed fields can be searched using any of the following:
• Single word search
• Multiple word phrases
• Numbers and dates
• Regular expressions
• Wildcards
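The kinds of search term listed above can be illustrated with a small in-memory index. This is a sketch only: it does not use SRS or its query syntax, the `DESCRIPTIONS` index and the `search` function are hypothetical, and number and date searches are left out for brevity.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical indexed "description" field: entry ID -> text.
DESCRIPTIONS = {
    "E1": "superoxide dismutase",
    "E2": "superoxide reductase",
    "E3": "alcohol dehydrogenase",
}


def search(index, term, mode="word"):
    """Return the sorted IDs of entries whose text matches the term."""
    hits = []
    for entry_id, text in index.items():
        words = text.lower().split()
        if mode == "word":        # single word search
            matched = term.lower() in words
        elif mode == "phrase":    # multiple word phrase
            matched = term.lower() in text.lower()
        elif mode == "wildcard":  # shell-style wildcards such as 'dismut*'
            matched = any(fnmatchcase(w, term.lower()) for w in words)
        elif mode == "regex":     # regular expression
            matched = re.search(term, text, re.IGNORECASE) is not None
        else:
            raise ValueError(f"unknown mode: {mode}")
        if matched:
            hits.append(entry_id)
    return sorted(hits)


print(search(DESCRIPTIONS, "superoxide"))                      # ['E1', 'E2']
print(search(DESCRIPTIONS, "superoxide reductase", "phrase"))  # ['E2']
print(search(DESCRIPTIONS, "dismut*", "wildcard"))             # ['E1']
print(search(DESCRIPTIONS, r"de.ydro", "regex"))               # ['E3']
```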
LocusLink
• LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink) is a National Center for Biotechnology Information (NCBI) online resource.
• It is principally intended for use by graduate students and professional researchers in the biomedical sciences.
• It is designed to bring together related information on genetic loci and gene products from several sources.
• LocusLink provides a central point of access for basic biomedical information and molecular data for genes, transcripts and proteins from model organisms, including human, rat, mouse, fruit fly and zebrafish.
• It is no longer available at NCBI.
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
PROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and VerticalPROJECTILE MOTION-Horizontal and Vertical
PROJECTILE MOTION-Horizontal and Vertical
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)
 
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
User Guide: Pulsar™ Weather Station (Columbia Weather Systems)
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
 
Ai in communication electronicss[1].pptx
Ai in communication electronicss[1].pptxAi in communication electronicss[1].pptx
Ai in communication electronicss[1].pptx
 
Thermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptxThermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptx
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 

Data base in detail

  • 2. Data: Data are raw, unorganized facts that need to be processed. Example: each student's test score is one piece of data. Information: when data is processed, organized, structured or presented in a given context so as to make it useful, it is called information.
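The distinction between data and information can be illustrated with a small sketch. The student names and scores below are made up for illustration; the point is only that the raw scores are data, while the computed summary is information.

```python
# Hypothetical raw data: each test score is one piece of data.
scores = {"student_a": 72, "student_b": 85, "student_c": 91}


def summarize(raw_scores: dict[str, int]) -> dict[str, object]:
    """Process raw scores into information that is useful in context."""
    top_student = max(raw_scores, key=raw_scores.get)
    return {
        "class_average": round(sum(raw_scores.values()) / len(raw_scores), 1),
        "top_student": top_student,
        "top_score": raw_scores[top_student],
    }


info = summarize(scores)
print(info)
```

The dictionary returned by `summarize` is organized and presented in context (a class average, a top scorer), which is what the slide calls information.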
  • 3. What is a database? • A database is a convenient system for properly storing, searching and retrieving any type of data. • A database makes it easy to handle and share large amounts of data and supports large-scale analysis through easy access and data updating.
  • 4. What is a biological database? • Biological databases are libraries of life sciences information, collected from scientific experiments, published literature, high-throughput experiment technology and computational analysis. • They contain information from genomics, proteomics and microarray gene expression. • Information contained in biological databases includes function, gene structure, localization (both cellular and chromosomal), biological sequences and structures.
  • 5. The major purposes of these databases are: • Availability of biological data. • Systematization of data. • Analysis of computed biological data.
  • 6. History: • 1956: the first sequence data, when insulin was sequenced (51 amino acids). • The Atlas of Protein Sequence and Structure (1965), by Margaret Dayhoff et al., was a printed book. • It became the basis for the Protein Information Resource (PIR). • First nucleotide sequence: yeast tRNA (77 bases). • During this time the 3D structures of proteins were being studied and the renowned Protein Data Bank (PDB) was created. • The first genome published for a free-living organism was that of the bacterium Haemophilus influenzae, in 1995.
  • 7. Features of biological databases: 1) Data heterogeneity 2) High data volume 3) Uncertainty 4) Data curation 5) Large-scale data integration 6) Data sharing 7) Dynamic and subject to change
  • 8. Classification scheme for biological databases: data type, maintenance status, data access, data source, database design, organism
  • 9. Data types:
  • 10. Based on data sources
  • 11. Content based: genome database, sequence database, structure database, microarray database, chemical database, pathway database, enzyme database, disease database, literature database
  • 12. Based on maintenance status: NCBI, EMBL, SIB
  • 13. Based on data access: 1) Publicly available 2) Available with copyright 3) Browsing only (accessible but not downloadable) 4) Academic but not freely available 5) Proprietary commercial 6) Restricted
  • 16. Database architecture: an information system consists of a query system, a storage system and data. Query system: Google, Entrez, SRS (your search keywords). Storage system: Oracle, MySQL, PC binary files, Unix text files, bookshelves. Data: GenBank flat file, PDB file, interaction record, title of a book, book.
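The three layers named on this slide can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not how any of the listed systems is implemented: SQLite stands in for the storage system, the table rows are the data, and the `search` function plays the role of the query system. The table layout and the `DEMO0001` record are hypothetical.

```python
import sqlite3

# Storage system: an in-memory SQLite database (a stand-in for Oracle, MySQL or flat files).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (accession TEXT PRIMARY KEY, description TEXT)")

# Data: the first record reuses accession X64011 from the EMBL example; the second is made up.
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("X64011", "L.ivanovii sod gene for superoxide dismutase"),
        ("DEMO0001", "hypothetical insulin precursor gene"),
    ],
)


def search(keyword: str) -> list[str]:
    """Query system: return accessions whose description contains the keyword."""
    rows = conn.execute(
        "SELECT accession FROM records WHERE description LIKE ? ORDER BY accession",
        (f"%{keyword}%",),
    )
    return [accession for (accession,) in rows]


print(search("superoxide"))
```

The caller only supplies search keywords and never touches the storage layer directly, which is the separation the slide describes.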
  • 17. A sequence retrieving and manipulation network. Databases: DNA (NCBI-GenBank, DDBJ, EBI-EMBL); protein (PIR, SWISS-PROT, ExPASy, PDB). Software: GCG, SeqWeb, Vector NTI, GenoMax. Retrieval systems: Entrez, SRS. Formats: GenBank, GCG, FASTA, Staden, image. Sequence converter. Information: sequence, PDB, image.
  • 18. Types of biological databases: • Primary databases • Secondary databases
  • 19. Primary databases: These are the primary sources of data, used to store nucleic acid sequences, protein sequences and structural information of biological macromolecules. Some primary databases: • NCBI (National Center for Biotechnology Information) • GenBank • DDBJ (DNA Data Bank of Japan) • SWISS-PROT • PIR (Protein Information Resource) • PDB (Protein Data Bank). The sequence collections in these databases result from the efforts of basic research in academic, industrial and sequencing labs.
  • 20. GenBank/EMBL/DDBJ: International Nucleotide Sequence Database. DDBJ: DNA Data Bank of Japan; CIB: Center for Information Biology and DNA Data Bank of Japan; NIG: National Institute of Genetics; IAM: International Advisory Meeting; ICM: International Collaborative Meeting; EMBL: European Molecular Biology Laboratory; EBI: European Bioinformatics Institute; NCBI: National Center for Biotechnology Information; NLM: National Library of Medicine
  • 21. Secondary databases: • A secondary database contains additional information derived from the analysis of data available in primary sources. • Secondary databases are analysed in a variety of ways and contain different information in different formats. • Some secondary databases: TrEMBL, Pfam, PROSITE, Profiles, SCOP, CATH
  • 22. Flat file storage data formats: • When GenBank, EMBL and DDBJ formed a collaboration (1986), the sequence databases had moved to a defined flat file format with a shared feature table format and annotation standards. • The flat file formats from the sequence databases are still used to access and display sequence and annotation. They are also convenient for storing local copies.
  • 27. The National Center for Biotechnology Information (Bethesda, MD): created in 1988 as part of the National Library of Medicine at NIH to: – Establish public databases – Conduct research in computational biology – Develop software tools for sequence analysis – Disseminate biomedical information
  • 28. NCBI databases and services: • GenBank: primary sequence database • Free public access to biomedical literature • PubMed: free Medline (3 million searches per day) • PubMed Central: full-text online access • Entrez: integrated molecular and literature databases • BLAST: highest-volume sequence search service (100-200 K searches per day) • VAST: structure similarity searches • Software and databases
  • 29. GenBank (Genetic Sequence Databank): • GenBank® is the genetic sequence database at the National Center for Biotechnology Information (NCBI). • It was established in 1982 and is now maintained by NCBI. • DNA sequences can be submitted to GenBank using several different methods. • It contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects.
  • 30. • It has a flat file structure: an ASCII text file, readable and downloadable by both humans and computers. • There are two main ways of making batch sequence submissions to GenBank: NCBI's Barcode Submission Tool (BarSTool) and Sequin.
  • 33. EMBL: • The European Molecular Biology Laboratory (EMBL) is a molecular biology research institution supported by 22 member states, four prospect and two associate member states. • EMBL was created in 1974 and is an intergovernmental organisation funded by public research money from its member states. • The Laboratory operates from five sites: the main laboratory in Heidelberg, and outstations in Hinxton (the European Bioinformatics Institute (EBI), in England), Grenoble (France), Hamburg (Germany) and Monterotondo (near Rome). • EMBL groups and laboratories perform basic research in molecular biology and molecular medicine, as well as training for scientists, students and visitors. • Israel is the only Asian state that has full membership. • The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI).
  • 34. • It incorporates and distributes nucleotide sequences from public sources. • The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). • Data are exchanged between the collaborating databases on a daily basis. • The web-based tool, Webin, is the preferred system for individual submission of nucleotide sequences, including Third Party Annotation (TPA) and alignment data.
  • 35. • Automatic submission procedures are used for submission of data from large-scale genome sequencing. • The latest data collection can be accessed via FTP, email and WWW interfaces. • The EBI's Sequence Retrieval System (SRS) integrates and links the main nucleotide and protein databases as well as many other specialist molecular biology databases. • For sequence similarity searching, a variety of tools (e.g. FASTA and BLAST) are available that allow external users to compare their own sequences against the data in the EMBL Nucleotide Sequence Database and other databases. • All available resources can be accessed via the EBI home page at
  • 43. EMBL format:
      ID   LISOD      standard; DNA; PRO; 756 BP.
      XX
      AC   X64011; S78972;
      XX
      SV   X64011.1
      XX
      DT   28-APR-1992 (Rel. 31, Created)
      DT   30-JUN-1993 (Rel. 36, Last updated, Version 6)
      XX
      DE   L.ivanovii sod gene for superoxide dismutase
      XX
      KW   sod gene; superoxide dismutase.
      XX
      OS   Listeria ivanovii
      OC   Bacteria; Firmicutes; Bacillus/Clostridium group;
      OC   Bacillus/Staphylococcus group; Listeria.
      XX
      RN   [1]
      RX   MEDLINE; 92140371.
      RA   Haas A., Goebel W.;
      RT   "Cloning of a superoxide dismutase gene from Listeria ivanovii by
      RT   functional complementation in Escherichia coli and characterization of the
      RT   gene product.";
  • 44. EMBL format (continued):
      RL   Mol. Gen. Genet. 231:313-322(1992).
      XX
      RN   [2]
      RP   1-756
      RA   Kreft J.;
      RT   ;
      RL   Submitted (21-APR-1992) to the EMBL/GenBank/DDBJ databases.
      RL   J. Kreft, Institut f. Mikrobiologie, Universitaet Wuerzburg, Biozentrum Am
      RL   Hubland, 8700 Wuerzburg, FRG
      XX
      DR   SWISS-PROT; P28763; SODM_LISIV.
      XX
      FH   Key             Location/Qualifiers
      FH
      FT   source          1..756
      FT                   /db_xref="taxon:1638"
      FT                   /organism="Listeria ivanovii"
      FT                   /strain="ATCC 19119"
      FT   RBS             95..100
      FT                   /gene="sod"
      FT   terminator      723..746
      FT                   /gene="sod"
      FT   CDS             109..717
      FT                   /db_xref="SWISS-PROT:P28763"
      FT                   /transl_table=11
      FT                   /gene="sod"
      FT                   /EC_number="1.15.1.1"
      FT                   /product="superoxide dismutase"
      FT                   /protein_id="CAA45406.1"
      FT                   /translation="MTYELPKLPYTYDALEPNFDKETMEIHYTKHHNIYVTKLNEAVSG
      FT                   HAELASKPGEELVANLDSVPEEIRGAVRNHGGGHANHTLFWSSLSPNGGGAPTGNLKAA
      FT                   IESEFGTFDEFKEKFNAAAAARFGSGWAWLVVNNGKLEIVSTANQDSPLSEGKTPVLGL
      FT                   DVWEHAYYLKFQNRRPEYIDTFWNVINWDERNKRFDAAK"
      XX
      SQ   Sequence 756 BP; 247 A; 136 C; 151 G; 222 T; 0 other;
           [sequence lines garbled in the slide text; omitted]
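Two numbers in the entry above can be cross-checked with a little arithmetic: the base counts on the SQ line should add up to the declared length, and the CDS span should be a whole number of codons. The sketch below does this with values copied from the entry; the function names are illustrative and not part of any EMBL tool.

```python
import re

# Values copied from the LISOD entry; helper names are illustrative only.
sq_header = "SQ   Sequence 756 BP; 247 A; 136 C; 151 G; 222 T; 0 other;"
cds_location = "109..717"


def parse_sq_header(line: str) -> tuple[int, dict[str, int]]:
    """Return the declared length and the per-base counts from an SQ line."""
    length = int(re.search(r"Sequence (\d+) BP;", line).group(1))
    counts = {base: int(n) for n, base in re.findall(r"(\d+) (A|C|G|T|other);", line)}
    return length, counts


def span_length(location: str) -> int:
    """Length of a simple 'start..end' location (1-based, inclusive)."""
    start, end = (int(x) for x in location.split(".."))
    return end - start + 1


length, counts = parse_sq_header(sq_header)
cds_bases = span_length(cds_location)
print(length, sum(counts.values()), cds_bases, cds_bases // 3)
```

Here 247 + 136 + 151 + 222 + 0 = 756, matching the declared length, and the CDS covers 717 - 109 + 1 = 609 bases, i.e. 203 codons. Assuming the last codon is a stop codon, that corresponds to a 202-residue protein.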
  • 45. Line codes: ID - Identification. AC - Accession number(s). DT - Date. DE - Description. GN - Gene name(s). OS - Organism species. OG - Organelle. OC - Organism classification. RN - Reference number. RP - Reference position. RC - Reference comments. RX - Reference cross-references. RA - Reference authors. RL - Reference location. CC - Comments or notes. DR - Database cross-references. KW - Keywords. FT - Feature table data. SQ - Sequence header. (blanks) - Sequence data. // - Termination line. Some entries do not contain all of the line types, and some line types occur many times in a single entry. Each entry must begin with an identification line (ID) and end with a terminator line (//).
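The line-code scheme above lends itself to a very small parser. The following is a minimal sketch, assuming the two-character code sits in columns 1-2 and the data starts in column 6; `parse_entry` is a hypothetical helper, and the shortened entry reuses lines from the LISOD example. It does not handle sequence data lines or feature table structure.

```python
from collections import defaultdict

# A shortened entry built from lines of the LISOD example.
ENTRY = """\
ID   LISOD      standard; DNA; PRO; 756 BP.
XX
AC   X64011; S78972;
XX
DE   L.ivanovii sod gene for superoxide dismutase
XX
OS   Listeria ivanovii
OC   Bacteria; Firmicutes; Bacillus/Clostridium group;
OC   Bacillus/Staphylococcus group; Listeria.
//
"""


def parse_entry(text: str) -> dict[str, list[str]]:
    """Group the lines of one entry by their two-character line code."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("ID") or lines[-1].strip() != "//":
        raise ValueError("entry must begin with an ID line and end with //")
    fields: dict[str, list[str]] = defaultdict(list)
    for line in lines[:-1]:
        code, value = line[:2], line[5:].strip()
        if code != "XX":  # XX lines are spacers and carry no data
            fields[code].append(value)
    return dict(fields)


record = parse_entry(ENTRY)
print(record["AC"], len(record["OC"]))
```

Because a line type may occur many times in one entry (the two OC lines here), each code maps to a list of values rather than a single string.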
  • 46. PubMed: • PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. • The PubMed system was offered free to the public in 1997. • The United States National Library of Medicine (NLM) at the National Institutes of Health maintains the database as part of the Entrez system of information retrieval. • PMID is the unique identifier number used in PubMed.
  • 47. • A PMID is assigned to each article record when it enters the PubMed system. • The PMID is always found at the end of a PubMed citation. • PubMed Central (PMC) is a free digital system that archives publicly accessible full-text scholarly articles that have been published within the biomedical and life sciences journal literature. • A "PubMed Mobile" option, providing access to a mobile
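Since the PMID sits at the end of a citation, it can be pulled out with a short pattern match. This is a sketch under assumptions: the citation string below is a made-up placeholder (author, journal and number are not real), and the "PMID: digits" layout is assumed rather than taken from the slides.

```python
import re
from typing import Optional

# Hypothetical citation string; the author, journal and PMID are placeholders.
citation = "Doe J. An example article title. J Example Res. 2000;1(1):1-10. PMID: 12345678."


def extract_pmid(text: str) -> Optional[str]:
    """Return the digits following a trailing 'PMID:' label, or None if absent."""
    match = re.search(r"PMID:\s*(\d+)\.?\s*$", text)
    return match.group(1) if match else None


print(extract_pmid(citation))
```

The identifier is returned as a string rather than an integer so that it can be used directly in queries or URLs.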
  • 55. Entrez: • WWW-based data retrieval system. • Developed by NCBI (National Center for Biotechnology Information). • Integrates information held in different databases.
  • 56. Databases covered by Entrez: • Nucleic acid: GenBank, RefSeq, PDB • Protein sequences: SWISS-PROT, PIR • 3D structures: MMDB • Genomes: many sources • PopSet: from GenBank • OMIM: OMIM • Taxonomy: NCBI taxonomy database • Books: Bookshelf • ProbeSet: GEO (Gene Expression Omnibus) • Literature: PubMed
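Entrez can also be queried programmatically through NCBI's E-utilities, which the slides do not cover. The sketch below only builds an ESearch request URL and sends nothing. The mapping from the slide's categories to E-utilities database names is an assumption made for this example, and `esearch_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Assumed mapping from the slide's categories to E-utilities database names.
ENTREZ_DB = {
    "Literature": "pubmed",
    "Nucleic acid": "nucleotide",
    "Protein": "protein",
    "Taxonomy": "taxonomy",
}


def esearch_url(category: str, term: str, retmax: int = 20) -> str:
    """Build (but do not send) an ESearch request URL for one Entrez database."""
    params = {"db": ENTREZ_DB[category], "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = esearch_url("Literature", "superoxide dismutase Listeria")
print(url)
```

Keeping URL construction separate from the network call makes the query easy to inspect before anything is sent to NCBI.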
  • 65. SRS: • SRS is a Sequence Retrieval System. • Data retrieval tool developed by EBI. • Integrates 80 molecular biology databases. • Open-source software (can be installed locally). • SRS has an associated scripting language called Icarus. • Central resource for molecular biology data: more than 250 databanks have been indexed, and there are more than 35 SRS servers on the WWW worldwide.
  • 66. • Information retrieval: an easy way to retrieve information from sequence and sequence-related databases; possibility to search for multiple words or other criteria; linkage between different databases (e.g. find all primary structures with known three-dimensional structures). • Different types of database in SRS: sequence and structure (DNA, protein, three-dimensional structures); sequence-related; gene-related (genome, mapping, mutations, transcription factors); SNP; bibliographic.
  • 67. • SRS main toolbar tabs: • Top Page: displays databases in different database groups • Query: displays either the standard or extended query form • Results or "the query manager": maintains a history of all the results obtained during a session • Projects or "the project manager": maintains a history of all queries and views used during a session • Views: allows a user to define a user-specific view for one or more databases • Databanks: contains a list and some facts about the databases available in the system
  • 68. • Search terms in SRS: SRS indexed fields can be searched using any of the following: • Single word search • Multiple word phrases • Numbers and dates • Regular expressions • Wildcards
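The difference between wildcard and regular-expression searches can be sketched in a few lines. This is a generic illustration using Python's standard library, not SRS query syntax; the index contents and the `DEMO1`/`DEMO2` entry names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical indexed description field values keyed by entry name.
index = {
    "LISOD": "superoxide dismutase",
    "DEMO1": "superoxide reductase",
    "DEMO2": "catalase",
}


def wildcard_search(pattern: str) -> list[str]:
    """Shell-style wildcard match (* and ?) against the whole field value."""
    return sorted(k for k, v in index.items() if fnmatchcase(v, pattern))


def regex_search(pattern: str) -> list[str]:
    """Regular-expression search anywhere in the field value."""
    rx = re.compile(pattern)
    return sorted(k for k, v in index.items() if rx.search(v))


print(wildcard_search("superoxide*"))
print(regex_search(r"(dismut|catal)ase$"))
```

A wildcard such as `superoxide*` is enough for prefix matching, while the regular expression can express alternatives and anchoring that wildcards cannot.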
  • 72. LocusLink: • LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink) is a National Center for Biotechnology Information (NCBI) online resource. • It is principally intended for use by graduate students and professional researchers in the biomedical sciences. • It is designed to bring together related information on genetic loci and gene products from several sources. • LocusLink provides a central point of access for basic biomedical information and molecular data for genes, transcripts and proteins from model organisms, currently including human, rat, mouse, fruit fly and zebrafish. • It is no longer available at NCBI.