DATA RETRIEVAL SYSTEM
Text-based Database Searching
Submitted By:
Dr. Shikha Thakur
Assistant Professor (Guest Faculty)
TCSC
Mumbai
Maharashtra
• The amount of biologically relevant data accessible via the WWW is
increasing at a very rapid rate.
• It is important for scientists to have easy and efficient ways of wading
through the data and finding what is important for their research.
• Knowing how to access and search for information in the database is
essential.
Depending on the type of data at hand, there are
two basic ways of searching:
• Using descriptive words to search text databases.
• Using a nucleotide or protein sequence to search sequence
databases.
Text-based Database Searching
• There are three important data retrieval systems of particular
relevance to molecular biologists:
• Entrez (at NCBI), where entries are identified by GI (GenInfo Identifier) or accession number
• Sequence Retrieval System, SRS (at EBI)
• DBGET/LinkDB (at GenomeNet, Japan)
• The advantage of these retrieval systems is that they not only return
matches to a query, but also provide handy pointers to additional
important information in related databases.
Text-based database Searching
• The three systems differ in the databases they search and the links
they provide to other information.
• In using any of these systems, queries can be as simple as entering
the accession number of a newly published sequence or as complex
as searching multiple database fields for specific terms.
Text-Based Database Searching
• Basic Search Concepts
• Boolean Search – An advanced query that combines two or more terms
with the Boolean operators AND, OR, and NOT; the default is AND.
• Broadening the Search – If a search returns no useful
entries, change or remove terms.
• Narrowing the Search – If a search returns too many entries,
add more specific terms or combine terms with AND.
• Proximity Searching – To search for multiword terms or phrases, place
quotes around them.
• Wild Card – A wildcard character prepended or appended to a search term makes
the search less specific; e.g., to find all authors whose last name begins with Zav,
search using Zav*.
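The search concepts above can be sketched in a few lines of Python. This is only an illustration of Boolean joining, phrase quoting, and trailing-wildcard matching; the function names are hypothetical, and the generated query string is generic rather than the exact syntax accepted by Entrez, SRS, or DBGET.

```python
import fnmatch


def quote_phrase(term):
    """Wrap a multiword term in double quotes so it is searched as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(terms, operator="AND"):
    """Join terms with one Boolean operator; AND is the default."""
    operator = operator.upper()
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return f" {operator} ".join(quote_phrase(t) for t in terms)


def wildcard_match(pattern, candidates):
    """Return the candidates that match a pattern containing '*' (case-sensitive)."""
    return [c for c in candidates if fnmatch.fnmatchcase(c, pattern)]


if __name__ == "__main__":
    print(build_query(["ornithine carbamoyltransferase", "Escherichia coli"]))
    print(wildcard_match("Zav*", ["Zavala", "Zavier", "Azav"]))
```

Dropping a term from the list passed to `build_query` broadens the search, and adding one narrows it.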
Entrez
• Entrez – is a molecular biology database and retrieval system
developed by the National Center for Biotechnology Information
(NCBI).
• It is an entry point for exploring distinct but integrated databases.
• (http://www.ncbi.nlm.nih.gov/Entrez/)
Entrez
• The Entrez system provides access to:
• Nucleotide sequence databases – GenBank/DDBJ/EMBL
• Protein sequence databases – Swiss-Prot, PIR, PRF, PDB, and translated
protein sequences from DNA sequence databases.
• Genome and chromosome mapping data
• Molecular Modeling 3-D structures Databases.
• Literature database, PubMed – Provides excellent and easy access to
MEDLINE and pre-MEDLINE articles.
• Taxonomy database – Allows retrieval of DNA and protein sequences for
any taxonomic group.
• Specialized Databases – OMIM, dbSNP, UniSTS, etc.
Entrez
• The most valuable feature of Entrez is
• its exploitation of the concept of ‘neighbouring’,
• which allows related entries in different databases to be linked to
each other, whether or not they are cross-referenced directly.
• Neighbours and links are listed in the order of similarity to the query.
• The similarity is based on pre-computed analysis of sequences,
structures and the literature.
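As a rough illustration of neighbouring, the sketch below looks up precomputed similarity scores and lists neighbours from most to least similar. The table contents, entry names, and the `neighbours` function are all hypothetical; the slides do not describe how Entrez computes or stores its scores.

```python
# Hypothetical precomputed similarity scores: entry -> {neighbour: score}.
# In a real system these would come from an offline analysis of sequences,
# structures, or literature; the values here are made up for illustration.
PRECOMPUTED = {
    "entry_A": {"entry_B": 0.91, "entry_C": 0.40, "entry_D": 0.75},
}


def neighbours(entry_id, table=PRECOMPUTED, limit=None):
    """Return (neighbour, score) pairs sorted by decreasing similarity."""
    scores = table.get(entry_id, {})
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return ranked if limit is None else ranked[:limit]


if __name__ == "__main__":
    for name, score in neighbours("entry_A"):
        print(name, score)
```

Because the scores are computed ahead of time, answering a query is a lookup and a sort rather than a fresh comparison.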
Entrez
• One particularly useful feature of Entrez is Batch Entrez:
• the ability to retrieve large sets of data based on some criterion and
to download them to a local computer,
• allowing these sequences to be worked on using analytical tools
available on the local computer.
Entrez Features
1. Entrez Global Query – Search a subset of Entrez databases.
2. Batch Entrez – Upload a file of GI or accession numbers to retrieve
sequences.
3. Making Links to Entrez – Linking to PubMed and GenBank.
4. E-Utilities – Entrez programming utilities.
5. LinkOut – External links to related resources.
6. Cubby – Provides a stored-search feature for saving and updating
searches, and lets you customize your LinkOut display.
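A minimal sketch of how Batch Entrez style input and the E-utilities might be combined from a script. It only builds a request URL and makes no network call. The base address and the `efetch.fcgi` parameters (`db`, `id`, `rettype`, `retmode`) are written from memory of NCBI's public E-utilities and should be checked against the current documentation, along with any usage limits; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base address of the NCBI E-utilities; verify before use.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def read_id_list(text):
    """Parse Batch Entrez style input: one GI or accession number per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def efetch_url(db, ids, rettype="fasta", retmode="text"):
    """Build (but do not send) an EFetch request for the given identifiers."""
    params = {
        "db": db,
        "id": ",".join(ids),
        "rettype": rettype,
        "retmode": retmode,
    }
    # safe="," keeps the comma-separated ID list readable in the URL.
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    ids = read_id_list("P04391\n\n")
    print(efetch_url("protein", ids))
```

Keeping URL construction separate from the download step makes the request easy to inspect before anything is sent.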
SRS
• The Sequence Retrieval System (SRS) – A network browser for
databases in molecular biology.
• It is a powerful sequence information indexing, search and retrieval
system (http://srs.ebi.ac.uk/)
SRS
• SRS is a homogeneous interface to over 80 biological databases
developed at the European Bioinformatics Institute (EBI) at Hinxton,
UK.
• The types of databases included are sequence and sequence-related,
metabolic pathways, transcription factors, application results (e.g.,
BLAST), protein 3D structure, genome, mapping, mutations, and
locus-specific mutations.
• One can access and query their contents and navigate among them.
SRS
• The Web page listing all the databases links each database to a
description page that includes the date of the last update.
• One can select one or more databases to search before entering the
query.
• Over 30 versions of SRS are currently running on the WWW. Each
includes a different subset of databases and associated analytical
tools.
SRS
• SRS Features:
• SRS databases are well indexed, thus reducing the search time for the
large number of potential databases.
• SRS allows any flat-file database to be indexed to any other. The
advantage is that the derived indices can be searched rapidly, allowing
users to retrieve, link, and access entries from all the interconnected
resources.
• The system has the particular strength that it can be readily
customized to use any defined set of databanks.
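The indexing idea can be sketched as an inverted index over parsed flat-file records: each field value points to the entries that contain it, so a search becomes a dictionary lookup instead of a scan. The record layout and function names below are hypothetical and are not SRS's actual index format.

```python
from collections import defaultdict


def build_index(records, field):
    """Map each (lower-cased) value of `field` to the IDs of records holding it."""
    index = defaultdict(set)
    for record in records:
        for value in record.get(field, []):
            index[value.lower()].add(record["id"])
    return index


def search(index, *terms):
    """AND-search: return the IDs listed under every one of the terms."""
    id_sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*id_sets) if id_sets else set()


if __name__ == "__main__":
    # Toy records standing in for parsed flat-file entries.
    records = [
        {"id": "E1", "keywords": ["kinase", "human"]},
        {"id": "E2", "keywords": ["kinase", "mouse"]},
    ]
    keyword_index = build_index(records, "keywords")
    print(sorted(search(keyword_index, "kinase", "human")))
```

Building one such index per field, and per pair of linked databases, is what lets entries in one resource lead quickly to entries in another.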
SRS
• Simple SRS queries
• By accession number
• Query on accession number: J00231
• By a simple author or organism: Ausubel and Rhizobium
• Boolean relations between keywords: and, or, but not
SRS
• Contd…
• Searching by dates: 01-Jan-1995:31-Dec-1995.
• Searching by size: 400:600
• Using hypertext links in an entry: Medline, Swiss-Prot and PDB
entries can be linked from within the EMBL database.
• Display of molecules via the RasMol plug-in
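The range notation used in these queries (`400:600`, `01-Jan-1995:31-Dec-1995`) can be parsed as below. This is a sketch of the `low:high` convention shown on the slide, with hypothetical helper names; it is not SRS's own parser, and the month table assumes English three-letter abbreviations.

```python
from datetime import date

# English three-letter month abbreviations, as used in the slide's example.
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_size_range(text):
    """Parse 'low:high' into a pair of integers, e.g. '400:600'."""
    low, high = (int(part) for part in text.split(":"))
    if low > high:
        raise ValueError("lower bound exceeds upper bound")
    return low, high


def parse_date(text):
    """Parse a 'DD-Mon-YYYY' date such as '01-Jan-1995'."""
    day, month, year = text.strip().split("-")
    return date(int(year), MONTHS[month.capitalize()], int(day))


def parse_date_range(text):
    """Parse 'DD-Mon-YYYY:DD-Mon-YYYY' into a pair of dates."""
    start, end = (parse_date(part) for part in text.split(":"))
    return start, end


def in_range(value, bounds):
    """True if value lies within the inclusive bounds."""
    low, high = bounds
    return low <= value <= high


if __name__ == "__main__":
    print(parse_size_range("400:600"))
    print(parse_date_range("01-Jan-1995:31-Dec-1995"))
```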
DBGET
• DBGET/LinkDB – is an integrated bioinformatics database retrieval
system at GenomeNet, developed by the Institute for Chemical
Research, Kyoto University, and the Human Genome Center of the
University of Tokyo.
DBGET
• DBGET – is used to search and extract entries from a wide range of
molecular biology databases.
• LinkDB – is used to compute links between entries in different
databases.
• It is designed to be a network distributed database system with an
open architecture, which is suitable for incorporating local databases
or establishing a server environment.
• http://www.genome.ad.jp/dbget/
DBGET
• DBGET/LinkDB is integrated with other search tools, such as BLAST,
FASTA and MOTIF, to conduct further retrievals instantly.
• DBGET provides access to about 20 databases, which are queried one
at a time. After querying one of these databases, DBGET presents
links to associated information in addition to the list of results.
• A unique feature of DBGET is its connection with the Kyoto
Encyclopedia of Genes and Genomes (KEGG) database – a database of
metabolic and regulatory pathways.
DBGET
• DBGET has three basic commands (or three basic modes in the Web
version), bfind, bget, and blink, to search and extract database
entries.
• bfind – Searches for entries by keyword.
• bget – Retrieves database entries specified by the
combination dbname:identifier.
• blink – Retrieves related entries through the LinkDB search.
• A notable feature of DBGET, different from other text search systems, is
that no keyword indexing is performed when a database is installed or
updated.
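To make the dbname:identifier convention concrete, here is a toy sketch that parses such a reference and mimics the bget and bfind modes against an in-memory dictionary. The store and the Python functions are hypothetical stand-ins named after the commands for readability; they are not the real DBGET programs and they do not contact GenomeNet.

```python
# Hypothetical in-memory store: database name -> {identifier: entry text}.
STORE = {
    "swissprot": {
        "P04391": "ornithine carbamoyltransferase, Escherichia coli",
    },
}


def parse_entry_ref(ref):
    """Split 'dbname:identifier' into its two parts."""
    dbname, separator, identifier = ref.partition(":")
    dbname, identifier = dbname.strip().lower(), identifier.strip()
    if not separator or not dbname or not identifier:
        raise ValueError(f"expected dbname:identifier, got {ref!r}")
    return dbname, identifier


def bget(ref, store=STORE):
    """Toy 'bget': fetch one entry by dbname:identifier."""
    dbname, identifier = parse_entry_ref(ref)
    return store[dbname][identifier]


def bfind(dbname, keyword, store=STORE):
    """Toy 'bfind': scan one database for entries containing the keyword."""
    entries = store.get(dbname, {})
    return [f"{dbname}:{entry_id}" for entry_id, text in entries.items()
            if keyword.lower() in text.lower()]


if __name__ == "__main__":
    print(bfind("swissprot", "ornithine"))
    print(bget("swissprot:P04391"))
```

The toy `bfind` scans entry text directly instead of consulting a keyword index, which loosely mirrors the trade-off noted above: nothing to rebuild on update, but less support for elaborate searches.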
DBGET
• Selected fields are extracted and stored in separate files for bfind
searches.
• Skipping keyword indexing is an advantage for rapid database updates,
but sometimes a disadvantage for elaborate searching.
• To supplement bfind, the full text search STAG is provided.
• blink – The LinkDB search. Once entries of interest are found, it can
be used to retrieve related entries in a given database or all databases
in GenomeNet.
Example
• Let’s consider an example to show how each system can be used to
retrieve the SwissProt entry P04391, an ornithine
carbamoyltransferase protein in Escherichia coli.
• In Entrez, enter the accession number P04391 in the protein database query
form and view the entry and its associated links and neighbours.
Example - SRS
• In SRS, first select the SwissProt database, then enter P04391 in the
query form and, once the entry is displayed, search for links to other
related databases.
Example – LinkDB
• However, the fastest way of gathering the related information for this
entry is to search LinkDB.
• By simply entering swissprot:P04391, a list of all links to all the
related databases is displayed.
Thank You

