Advanced NCBI.
The Entrez API
https://github.com/lindenb/courses
Pierre Lindenbaum
@yokofakun
pierre.lindenbaum@univ-nantes.fr
http://plindenbaum.blogspot.com
Institut du Thorax. Nantes. France
September 27, 2016
NCBI? What about EBI, Ensembl, ...
What will be covered today?
File formats...
EInfo, GQuery, ESearch, ESummary, EFetch...
Processing XML responses with XSLT: HTML, SVG, R...
Generating a Java parser for dbSNP
NCBI EBot
Using standalone BLAST
CURL
curl "http://en.wikipedia.org/wiki/Main_Page"
wget -O - "http://en.wikipedia.org/wiki/Main_Page"
XML
XSLT
XSLT
XSLTPROC
xsltproc stylesheet.xsl file.xml > result.xml
JSON
Formats
Formats
GenBank
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=25&rettype=gb
LOCUS       X53813                   422 bp    DNA     linear   MAM 22-JUN-1992
DEFINITION  Blue Whale heavy satellite DNA.
ACCESSION   X53813 X17460
VERSION     X53813.1  GI:25
KEYWORDS    satellite DNA.
SOURCE      Balaenoptera musculus (Blue whale)
  ORGANISM  Balaenoptera musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Laurasiatheria; Cetartiodactyla; Cetacea;
            Mysticeti; Balaenopteridae; Balaenoptera.
REFERENCE   1  (bases 1 to 422)
  AUTHORS   Arnason,U. and Widegren,B.
  TITLE     Composition and chromosomal localization of cetacean highly
            repetitive DNA with special reference to the blue whale,
            Balaenoptera musculus
  JOURNAL   Chromosoma 98 (5), 323-329 (1989)
   PUBMED   2612291
COMMENT     See also <X52700-2> for 1,760 bp common cetacean component clones
            and <X52703-6>,<X53811-4> for the 422 bp heavy satellite clones.
FEATURES             Location/Qualifiers
     source          1..422
                     /organism="Balaenoptera musculus"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9771"
                     /clone="7"
     misc_feature    1..422
                     /note="heavy satellite DNA"
ORIGIN
        1 tagttattca acctatccca ctctctagat accccttagc acgtaaagga atattatttg
Formats
ASN.1
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=25
Seq-entry ::= seq {
  id {
    embl {
      accession "X53813",
      version 1 },
    gi 25 },
  descr {
    title "Blue Whale heavy satellite DNA",
    source {
      org {
        taxname "Balaenoptera musculus",
        common "Blue whale",
        db {
          {
            db "taxon",
            tag
              id 9771 } },
        orgname {
          name
            binomial {
              genus "Balaenoptera",
              species "musculus" },
          lineage "Eukaryota; Metazoa; Chordata; Craniata; Vertebrata;
 Euteleostomi; Mammalia; Eutheria; Laurasiatheria; Cetartiodactyla; Cetacea;
 Mysticeti; Balaenopteridae; Balaenoptera",
          gcode 1,
          mgcode 2,
          div "MAM" } },
      subtype {
Formats
ASN.1 (schema)
http://www.ncbi.nlm.nih.gov/data_specs/asn/insdseq.asn
INSDSeq ::= SEQUENCE {
    locus VisibleString ,
    length INTEGER ,
    strandedness VisibleString OPTIONAL ,
    moltype VisibleString ,
    topology VisibleString OPTIONAL ,
    division VisibleString ,
    update-date VisibleString ,
    create-date VisibleString OPTIONAL ,
    update-release VisibleString OPTIONAL ,
    create-release VisibleString OPTIONAL ,
    definition VisibleString ,
    primary-accession VisibleString OPTIONAL ,
    entry-version VisibleString OPTIONAL ,
    accession-version VisibleString OPTIONAL ,
    other-seqids SEQUENCE OF INSDSeqid OPTIONAL ,
    secondary-accessions SEQUENCE OF INSDSecondary-accn OPTIONAL ,
    project VisibleString OPTIONAL ,
    keywords SEQUENCE OF INSDKeyword OPTIONAL ,
    segment VisibleString OPTIONAL ,
    source VisibleString OPTIONAL ,
    organism VisibleString OPTIONAL ,
    taxonomy VisibleString OPTIONAL ,
    references SEQUENCE OF INSDReference OPTIONAL ,
    comment VisibleString OPTIONAL ,
    comment-set SEQUENCE OF INSDComment OPTIONAL ,
    struc-comments SEQUENCE OF INSDStrucComment OPTIONAL ,
    primary VisibleString OPTIONAL ,
    source-db VisibleString OPTIONAL ,
Formats
ASN.1 (tools)
DATATOOL
Generate C++ data storage classes based on ASN.1 serialization
streams.
Convert data between ASN.1, XML and JSON formats.
Formats
XML
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=25&retmode=xml
<?xml version="1.0"?>
<!DOCTYPE GBSet PUBLIC "-//NCBI//NCBI GBSeq/EN" "http://www.ncbi.nlm.nih.gov/dtd/NCBI_G
<GBSet>
  <GBSeq>
    <GBSeq_locus>X53813</GBSeq_locus>
    <GBSeq_length>422</GBSeq_length>
    <GBSeq_strandedness>double</GBSeq_strandedness>
    <GBSeq_moltype>DNA</GBSeq_moltype>
    <GBSeq_topology>linear</GBSeq_topology>
    <GBSeq_division>MAM</GBSeq_division>
    <GBSeq_update-date>22-JUN-1992</GBSeq_update-date>
    <GBSeq_create-date>13-JUL-1990</GBSeq_create-date>
    <GBSeq_definition>Blue Whale heavy satellite DNA</GBSeq_definition>
    <GBSeq_primary-accession>X53813</GBSeq_primary-accession>
    <GBSeq_accession-version>X53813.1</GBSeq_accession-version>
    <GBSeq_other-seqids>
      <GBSeqid>emb|X53813.1|</GBSeqid>
      <GBSeqid>gi|25</GBSeqid>
    </GBSeq_other-seqids>
    <GBSeq_secondary-accessions>
      <GBSecondary-accn>X17460</GBSecondary-accn>
    </GBSeq_secondary-accessions>
    <GBSeq_keywords>
      <GBKeyword>satellite DNA</GBKeyword>
    </GBSeq_keywords>
    <GBSeq_source>Balaenoptera musculus (Blue whale)</GBSeq_source>
    <GBSeq_organism>Balaenoptera musculus</GBSeq_organism>
    <GBSeq_taxonomy>Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mam
actyla; Cetacea; Mysticeti; Balaenopteridae; Balaenoptera</GBSeq_taxonomy>
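A GBSeq record like the one above can be read with any XML parser. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree and an abbreviated copy of the record (DOCTYPE and several fields omitted). The names GBSET_XML and summarize_gbseq are illustrative and not part of any NCBI tool.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the GBSeq record shown on the slide (DOCTYPE omitted).
GBSET_XML = """<GBSet>
  <GBSeq>
    <GBSeq_locus>X53813</GBSeq_locus>
    <GBSeq_length>422</GBSeq_length>
    <GBSeq_moltype>DNA</GBSeq_moltype>
    <GBSeq_accession-version>X53813.1</GBSeq_accession-version>
    <GBSeq_other-seqids>
      <GBSeqid>emb|X53813.1|</GBSeqid>
      <GBSeqid>gi|25</GBSeqid>
    </GBSeq_other-seqids>
    <GBSeq_organism>Balaenoptera musculus</GBSeq_organism>
  </GBSeq>
</GBSet>"""


def summarize_gbseq(xml_text):
    """Return one dict per <GBSeq> element with a few selected fields."""
    root = ET.fromstring(xml_text)
    records = []
    for seq in root.iter("GBSeq"):
        records.append({
            "locus": seq.findtext("GBSeq_locus"),
            "length": int(seq.findtext("GBSeq_length")),
            "accession_version": seq.findtext("GBSeq_accession-version"),
            "organism": seq.findtext("GBSeq_organism"),
            "seqids": [e.text for e in seq.iter("GBSeqid")],
        })
    return records


print(summarize_gbseq(GBSET_XML))
```

Fields marked optional in the DTD may be absent in real records, so production code would need to handle a None result from findtext.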
Formats
XML (DTD)
http://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.mod.dtd
<!ELEMENT GBSeq (
    GBSeq_locus,
    GBSeq_length,
    GBSeq_strandedness?,
    GBSeq_moltype,
    GBSeq_topology?,
    GBSeq_division,
    GBSeq_update-date,
    GBSeq_create-date?,
    GBSeq_update-release?,
    GBSeq_create-release?,
    GBSeq_definition,
    GBSeq_primary-accession?,
    GBSeq_entry-version?,
    GBSeq_accession-version?,
    GBSeq_other-seqids?,
    GBSeq_secondary-accessions?,
    GBSeq_project?,
    GBSeq_keywords?,
    GBSeq_segment?,
    GBSeq_source?,
    GBSeq_organism?,
    GBSeq_taxonomy?,
    GBSeq_references?,
    GBSeq_comment?,
    GBSeq_comment-set?,
    GBSeq_struc-comments?,
    (...)
E-Utilities
GI
GI
http://www.ncbi.nlm.nih.gov/news/03-02-2016-phase-out-of-GI-numbers/ :
"NCBI is phasing out sequence GIs - use Accession.Version instead!"
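Since Accession.Version replaces the GI as the stable identifier, scripts often need to split it into its two parts. A minimal Python sketch, assuming the identifier has the form ACCESSION.VERSION as in X53813.1 (the function name is illustrative):

```python
def split_accession_version(identifier):
    """Split 'X53813.1' into ('X53813', 1); version is None when absent."""
    accession, dot, version = identifier.partition(".")
    if not dot:
        return accession, None
    return accession, int(version)


print(split_accession_version("X53813.1"))
```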
E-Utilities
A set of seven server-side programs that provide a stable interface to
the search, retrieval, and linking functions of the Entrez system,
using a fixed URL syntax.
The E-Utilities usually return XML, sometimes JSON, (...)
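Because the URL syntax is fixed, every call in this deck follows the same pattern: base path, program name with the .fcgi suffix, then query parameters. A minimal Python sketch of that pattern, assuming only the standard library; eutils_url is a hypothetical helper, not an NCBI function:

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def eutils_url(program, **params):
    """Build an E-utilities URL such as .../efetch.fcgi?db=nucleotide&id=25."""
    query = urlencode(params)
    url = EUTILS_BASE + program + ".fcgi"
    return url + "?" + query if query else url


print(eutils_url("einfo"))
print(eutils_url("efetch", db="nucleotide", id=25, rettype="gb"))
```

The helper only builds the address; fetching it is left to curl, wget or any HTTP client.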
Entrez Direct
http://www.ncbi.nlm.nih.gov/books/NBK179288/ ”Entrez
Direct (EDirect) is an advanced method for accessing the NCBI’s
set of interconnected databases (publication, sequence, structure,
gene, variation, expression, etc.) from a UNIX terminal window.
Functions take search terms from command-line arguments.
Individual operations are combined to build multi-step queries.
Record retrieval and formatting normally complete the process.”
EInfo
EInfo
Provides a list of the names of all valid Entrez databases.
Provides statistics for a single database, including lists of indexing
fields and available link names.
EInfo
Base URL:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi
EInfo
XML Output
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi
<eInfoResult>
  <DbList>
    <DbName>pubmed</DbName>
    <DbName>protein</DbName>
    <DbName>nuccore</DbName>
    <DbName>nucleotide</DbName>
    <DbName>nucgss</DbName>
    <DbName>nucest</DbName>
    <DbName>structure</DbName>
    <DbName>genome</DbName>
    <DbName>assembly</DbName>
    <DbName>gcassembly</DbName>
    <DbName>genomeprj</DbName>
    <DbName>bioproject</DbName>
    <DbName>biosample</DbName>
    <DbName>biosystems</DbName>
    <DbName>blastdbinfo</DbName>
EInfo
JSON Output
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?retmode=json
{
  "header": {
    "type": "einfo",
    "version": "0.3"
  },
  "einforesult": {
    "dblist": [
      "pubmed",
      "protein",
      "nuccore",
      (...)
      "unigene",
      "gencoll",
      "gtr"
    ]
  }
}
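The JSON form of the EInfo answer is straightforward to consume. A minimal Python sketch, assuming the response layout shown on the slide and a shortened database list; EINFO_JSON and list_databases are illustrative names:

```python
import json

# Shortened copy of the EInfo JSON answer shown on the slide.
EINFO_JSON = """{
  "header": {"type": "einfo", "version": "0.3"},
  "einforesult": {"dblist": ["pubmed", "protein", "nuccore"]}
}"""


def list_databases(json_text):
    """Return the Entrez database names found under einforesult.dblist."""
    payload = json.loads(json_text)
    return payload["einforesult"]["dblist"]


print(list_databases(EINFO_JSON))
```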
EInfo
Return statistics for a given Entrez database:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=DbName
EInfo
Statistics for Pubmed
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=pubmed
<?xml version="1.0"?>
<eInfoResult>
  <DbInfo>
    <DbName>pubmed</DbName>
    <MenuName>PubMed</MenuName>
    <Description>PubMed bibliographic record</Description>
    <DbBuild>Build130805-2117m.4</DbBuild>
    <Count>22974581</Count>
    <LastUpdate>2013/08/06 08:33</LastUpdate>
    <FieldList>
      (...)
      <Field>
        <Name>UID</Name>
        <FullName>UID</FullName>
        <Description>Unique number assigned to publication</Description>
        <TermCount>0</TermCount>
        <IsDate>N</IsDate>
        <IsNumerical>Y</IsNumerical>
        <SingleToken>Y</SingleToken>
        <Hierarchy>N</Hierarchy>
        <IsHidden>Y</IsHidden>
      </Field>
      <Field>
      (...)
EInfo
Statistics for Pubmed
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=pubmed&retmode=json
{
  "header": {
    "type": "einfo",
    "version": "0.3"
  },
  "einforesult": {
    "dbinfo": {
      "dbname": "pubmed",
      "menuname": "PubMed",
      "description": "PubMed bibliographic record",
      "dbbuild": "Build160921-2207m.6",
      "count": "26470199",
      "lastupdate": "2016/09/22 16:32",
      "fieldlist": [
        {
          "name": "ALL",
          "fullname": "All Fields",
          "description": "All terms from all searchable fields",
          "termcount": "179424126",
          "isdate": "N",
          "isnumerical": "N",
          "singletoken": "N",
          "hierarchy": "N",
          "ishidden": "N"
        },
        {
          "name": "UID",
          "fullname": "UID",
          "description": "Unique number assigned to publication",
EInfo
With entrez-direct
$ einfo -dbs
$ einfo -db pubmed
GQuery
GQuery
Provides the number of records retrieved in all Entrez databases by
a single text query.
GQuery
Example
$ curl "https://eutils.ncbi.nlm.nih.gov/gquery?term=tyrannosaurus%20rex&retmode=xml"
<Result>
  <Term>tyrannosaurus rex</Term>
  <eGQueryResult>
    <ResultItem><DbName>pubmed</DbName><MenuName/><Count>41</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>pmc</DbName><MenuName/><Count>160</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>mesh</DbName><MenuName/><Count>15</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>books</DbName><MenuName/><Count>179</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>pubmedhealth</DbName><MenuName/><Count>21</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>omim</DbName><MenuName/><Count>10</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>omia</DbName><MenuName/><Count>0</Count><Status>Term or Database is not found</Status></ResultItem>
    <ResultItem><DbName>ncbisearch</DbName><MenuName/><Count>1</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>nuccore</DbName><MenuName/><Count>0</Count><Status>Term or Database is not found</Status></ResultItem>
    (...)
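The per-database counts in a GQuery answer can be collected into a dictionary. A minimal Python sketch, assuming the XML layout shown above and keeping only items whose Status is "Ok"; GQUERY_XML is a shortened copy of the slide's output and hits_per_database is an illustrative name:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the GQuery answer shown on the slide.
GQUERY_XML = """<Result>
  <Term>tyrannosaurus rex</Term>
  <eGQueryResult>
    <ResultItem><DbName>pubmed</DbName><MenuName/><Count>41</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>pmc</DbName><MenuName/><Count>160</Count><Status>Ok</Status></ResultItem>
    <ResultItem><DbName>omia</DbName><MenuName/><Count>0</Count><Status>Term or Database is not found</Status></ResultItem>
  </eGQueryResult>
</Result>"""


def hits_per_database(xml_text):
    """Map database name to hit count, skipping items whose Status is not Ok."""
    root = ET.fromstring(xml_text)
    hits = {}
    for item in root.iter("ResultItem"):
        if item.findtext("Status") == "Ok":
            hits[item.findtext("DbName")] = int(item.findtext("Count"))
    return hits


print(hits_per_database(GQUERY_XML))
```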
GQuery
Transforming to HTML using XSLT
The XSLT stylesheet: https://raw.githubusercontent.com/lindenb/courses/master/about.ncbi/gquery2html.xsl
<?xml version='1.0' encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:output method="html"/>

<xsl:template match="/"><html><body>
<xsl:apply-templates select="Result"/>
</body></html></xsl:template>

<xsl:template match="Result">
<table><caption><xsl:value-of select="Term"/></caption>
<tr><th>Database</th><th>Count</th><th>Status</th></tr>
<xsl:apply-templates select="eGQueryResult/ResultItem"/>
</table>
</xsl:template>

<xsl:template match="ResultItem">
<tr>
<td><a>
<xsl:attribute name="href">http://www.ncbi.nlm.nih.gov/<xsl:value-of select="DbName"/>?cmd=search&amp;term=<xsl:value-of select="translate(/Result/Term,' ','+')"/></xsl:attribute>
<xsl:value-of select="DbName"/></a></td>
<td><xsl:value-of select="Count"/></td>
<td><xsl:value-of select="Status"/></td>
</tr>
</xsl:template>

</xsl:stylesheet>
GQuery
Transforming to HTML
$ curl "https://eutils.ncbi.nlm.nih.gov/gquery?term=tyrannosaurus%20rex&retmode=xml" |
  xsltproc gquery2html.xsl -
<html>
<body>
<table>
<caption>tyrannosaurus rex</caption>
<tr>
<th>Database</th>
<th>Count</th>
<th>Status</th>
</tr>
<tr>
<td>
<a href="https://www.ncbi.nlm.nih.gov/pubmed?cmd=search&amp;term=tyrannosaurus
</td>
<td>41</td>
<td>Ok</td>
</tr>
<tr>
<td>
<a href="https://www.ncbi.nlm.nih.gov/pmc?cmd=search&amp;term=tyrannosaurus+re
</td>
<td>160</td>
<td>Ok</td>
</tr>
<tr>
<td>
<a href="https://www.ncbi.nlm.nih.gov/mesh?cmd=search&amp;term=tyrannosaurus+r
</td>
<td>15</td>
ESearch
ESearch
Provides a list of UIDs matching a text query
Posts the results of a search on the History server
Downloads all UIDs from a dataset stored on the History server
Combines or limits UID datasets stored on the History server
Sorts sets of UIDs
ESearch
Syntax
Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi
ESearch
Searching for ’Mammuthus primigenius’
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D" |
  xmllint --format -
<eSearchResult>
  <Count>684</Count>
  <RetMax>20</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>507866428</Id>
    <Id>124056416</Id>
    <Id>383843869</Id>
    <Id>383843867</Id>
    <Id>383843865</Id>
    <Id>383843863</Id>
    <Id>383843861</Id>
    <Id>383843859</Id>
    <Id>383843857</Id>
    <Id>383843855</Id>
    <Id>383843853</Id>
    <Id>383843851</Id>
    <Id>383843849</Id>
    <Id>383843847</Id>
    <Id>383843845</Id>
    <Id>157367690</Id>
    <Id>157367676</Id>
    <Id>157367662</Id>
    <Id>157367648</Id>
    <Id>157367634</Id>
  </IdList>
  <TranslationSet>
    <Translation>
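The term in the URL above is the percent-encoded form of "Mammuthus primigenius"[ORGN]. A minimal Python sketch of that encoding step, assuming the standard library's urllib.parse.quote with no characters left unescaped; encode_term is an illustrative name:

```python
from urllib.parse import quote


def encode_term(term):
    """Percent-encode an Entrez query so it can be placed in a URL."""
    return quote(term, safe="")


term = '"Mammuthus primigenius"[ORGN]'
url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
       "?db=nucleotide&term=" + encode_term(term))
print(url)
```

The double quotes become %22, the space %20, and the square brackets %5B and %5D, which matches the curl command on the slide.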
ESearch
Searching for ’Mammuthus primigenius’ (JSON)
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&retmode=json"
{
  "header": {
    "type": "esearch",
    "version": "0.3"
  },
  "esearchresult": {
    "count": "811",
    "retmax": "20",
    "retstart": "0",
    "idlist": [
      "1059791223",
      "198241525",
      "198241523",
      "198241521",
      "198241519",
      "198241517",
      "198241515",
      "198241513",
      "198241511",
      "198241509",
      "198241507",
      "198241505",
      "198241503",
      "198241501",
      "198241499",
      "198241497",
      "198241495",
      "198241493",
      "198241491",
ESearch
the retmax parameter
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&retmax=2" |
  xmllint --format -
<eSearchResult>
  <Count>684</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>507866428</Id>
    <Id>124056416</Id>
  </IdList>
  <TranslationSet>
    <Translation>
      <From>"Mammuthus primigenius"[ORGN]</From>
      <To>"Mammuthus primigenius"[Organism]</To>
    </Translation>
  </TranslationSet>
  <TranslationStack>
    <TermSet>
      <Term>"Mammuthus primigenius"[Organism]</Term>
      <Field>Organism</Field>
      <Count>684</Count>
      <Explode>Y</Explode>
    </TermSet>
    <OP>GROUP</OP>
  </TranslationStack>
  <QueryTranslation>"Mammuthus primigenius"[Organism]</QueryTranslation>
</eSearchResult>
ESearch
the retstart parameter
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&retmax=3&retstart=100" |
  xmllint --format -
<eSearchResult>
  <Count>684</Count>
  <RetMax>3</RetMax>
  <RetStart>100</RetStart>
  <IdList>
    <Id>300810656</Id>
    <Id>300810655</Id>
    <Id>300810654</Id>
  </IdList>
  <TranslationSet>
    <Translation>
      <From>"Mammuthus primigenius"[ORGN]</From>
      <To>"Mammuthus primigenius"[Organism]</To>
    </Translation>
  </TranslationSet>
  <TranslationStack>
    <TermSet>
      <Term>"Mammuthus primigenius"[Organism]</Term>
      <Field>Organism</Field>
      <Count>684</Count>
      <Explode>Y</Explode>
    </TermSet>
    <OP>GROUP</OP>
  </TranslationStack>
  <QueryTranslation>"Mammuthus primigenius"[Organism]</QueryTranslation>
</eSearchResult>
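Together, retstart and retmax allow a large result set to be walked page by page. A minimal Python sketch of the offset arithmetic only, assuming the total Count is already known (684 in the answer above) and an arbitrary page size of 200; page_offsets is an illustrative name and no request is sent:

```python
def page_offsets(count, retmax):
    """Yield (retstart, size) pairs that cover `count` records page by page."""
    if retmax <= 0:
        raise ValueError("retmax must be positive")
    for retstart in range(0, count, retmax):
        # The last page may hold fewer than retmax records.
        yield retstart, min(retmax, count - retstart)


pages = list(page_offsets(684, 200))
print(pages)
```

Each pair corresponds to one ESearch call with &retstart=... and &retmax=... appended to the query URL.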
ESearch
rettype=count
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&rettype=count" |
  xmllint --format -
<eSearchResult>
  <Count>684</Count>
</eSearchResult>
ESearch
sort=Date Released
curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&sort=Date+Released" |
  xmllint --format -
<eSearchResult><Count>811</Count><RetMax>20</RetMax>
<Id>1033204644</Id>
<Id>1033204658</Id>
<Id>1033204672</Id>
<Id>1033204686</Id>
<Id>1033204729</Id>
<Id>1033204771</Id>
<Id>1033204785</Id>
<Id>1033204799</Id>
<Id>1033204813</Id>
<Id>1033204827</Id>
<Id>1033204871</Id>
<Id>1033205124</Id>
<Id>1033205194</Id>
ESummary
ESummary
Syntax
Returns document summaries (DocSums) for a list of input UIDs
Returns DocSums for a set of UIDs stored on the Entrez History server
ESummary
Syntax
Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=(DB)&id=(ID)
ESummary
Retrieve nucleotide gi=507866428
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nucleotide&id=507866428"
<eSummaryResult>
  <DocSum>
    <Id>507866428</Id>
    <Item Name="Caption" Type="String">KC524742</Item>
    <Item Name="Title" Type="String">Mammuthus primigenius isolate CME2005/915 myoglobin (Mb
    <Item Name="Extra" Type="String">gi|507866428|gb|KC524742.1|[507866428]</Item>
    <Item Name="Gi" Type="Integer">507866428</Item>
    <Item Name="CreateDate" Type="String">2013/06/15</Item>
    <Item Name="UpdateDate" Type="String">2013/06/21</Item>
    <Item Name="Flags" Type="Integer">0</Item>
    <Item Name="TaxId" Type="Integer">37349</Item>
    <Item Name="Length" Type="Integer">9042</Item>
    <Item Name="Status" Type="String">live</Item>
    <Item Name="ReplacedBy" Type="String"></Item>
    <Item Name="Comment" Type="String"><![CDATA[ ]]></Item>
  </DocSum>
</eSummaryResult>
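Each DocSum is a flat list of Item elements carrying a Name and a Type attribute, which maps naturally onto a dictionary. A minimal Python sketch, assuming the layout shown above and converting only Type="Integer" values; ESUMMARY_XML is a shortened copy of the slide's output and docsum_to_dict is an illustrative name:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the ESummary answer shown on the slide.
ESUMMARY_XML = """<eSummaryResult>
  <DocSum>
    <Id>507866428</Id>
    <Item Name="Caption" Type="String">KC524742</Item>
    <Item Name="Gi" Type="Integer">507866428</Item>
    <Item Name="TaxId" Type="Integer">37349</Item>
    <Item Name="Length" Type="Integer">9042</Item>
    <Item Name="Status" Type="String">live</Item>
    <Item Name="ReplacedBy" Type="String"></Item>
  </DocSum>
</eSummaryResult>"""


def docsum_to_dict(docsum):
    """Turn one <DocSum> into a dict keyed by each Item's Name attribute."""
    record = {"Id": docsum.findtext("Id")}
    for item in docsum.findall("Item"):
        text = item.text or ""
        if item.get("Type") == "Integer" and text:
            record[item.get("Name")] = int(text)
        else:
            record[item.get("Name")] = text
    return record


root = ET.fromstring(ESUMMARY_XML)
records = [docsum_to_dict(d) for d in root.findall("DocSum")]
print(records)
```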
ESummary
Retrieve nucleotide gi=507866428 in JSON
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nucleotide&id=507866428&retmode=json"
{
  "header": {
    "type": "esummary",
    "version": "0.3"
  },
  "result": {
    "uids": [
      "507866428"
    ],
    "507866428": {
      "uid": "507866428",
      "caption": "KC524742",
      "title": "Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene, par
      "extra": "gi|507866428|gb|KC524742.1|",
      "gi": 507866428,
      "createdate": "2013/06/15",
      "updatedate": "2013/06/21",
      "flags": "",
      "taxid": 37349,
      "slen": 9042,
      "biomol": "genomic",
      "moltype": "dna",
      "topology": "linear",
      "sourcedb": "insd",
      "segsetsize": "",
      "projectid": "0",
      (...)
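In the JSON form, the "uids" array lists the keys under which each summary is stored in "result". A minimal Python sketch of that lookup, assuming the layout shown above and a shortened record; ESUMMARY_JSON and iter_summaries are illustrative names:

```python
import json

# Shortened copy of the ESummary JSON answer shown on the slide.
ESUMMARY_JSON = """{
  "header": {"type": "esummary", "version": "0.3"},
  "result": {
    "uids": ["507866428"],
    "507866428": {
      "uid": "507866428",
      "caption": "KC524742",
      "taxid": 37349,
      "slen": 9042,
      "moltype": "dna"
    }
  }
}"""


def iter_summaries(json_text):
    """Yield one summary dict per UID, in the order given by result.uids."""
    result = json.loads(json_text)["result"]
    for uid in result["uids"]:
        yield result[uid]


for summary in iter_summaries(ESUMMARY_JSON):
    print(summary["caption"], summary["slen"])
```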
ESummary
Retrieve snp rs25
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=snp&id=25"
<eSummaryResult>
  <DocSum>
    <Id>25</Id>
    <Item Name="SNP_ID" Type="Integer">25</Item>
    <Item Name="Organism" Type="String"></Item>
    <Item Name="ALLELE_ORIGIN" Type="String"></Item>
    <Item Name="GLOBAL_MAF" Type="String">0.4913</Item>
    <Item Name="GLOBAL_POPULATION" Type="String"></Item>
    <Item Name="GLOBAL_SAMPLESIZE" Type="Integer">0</Item>
    <Item Name="SUSPECTED" Type="String"></Item>
    <Item Name="CLINICAL_SIGNIFICANCE" Type="String"></Item>
    <Item Name="GENE" Type="String">THSD7A</Item>
    <Item Name="LOCUS_ID" Type="Integer">221981</Item>
    <Item Name="ACC" Type="String">NM_015204.2,NT_007819.17</Item>
    <Item Name="CHR" Type="String">7</Item>
    <Item Name="WEIGHT" Type="Integer">1</Item>
    <Item Name="HANDLE" Type="String">1000GENOMES,BGI,BL,BUSHMAN,COMPLETE_GENOMICS,CSHL-HAPM
    <Item Name="FXN_CLASS" Type="String">intron-variant</Item>
    <Item Name="VALIDATED" Type="String">by-1000G,by-cluster,by-frequency,by-hapmap</Item>
    <Item Name="GTYPE" Type="String">true</Item>
    <Item Name="NONREF" Type="String">false</Item>
    <Item Name="DOCSUM" Type="String">HGVS=NC_000007.13:g.11584142T&gt;C,NG_027670.1:g.29268
    <Item Name="HET" Type="Integer">50</Item>
    <Item Name="SRATE" Type="Integer">0</Item>
    <Item Name="TAX_ID" Type="Integer">9606</Item>
    <Item Name="CHRRPT" Type="String">25|2|0|1|1|1|7|NT_007819.17|11574141|11584142|THSD7A|0
    <Item Name="ORIG_BUILD" Type="Integer">36</Item>
    <Item Name="UPD_BUILD" Type="Integer">138</Item>
    <Item Name="CREATEDATE" Type="String">2000-09-19 17:02</Item>
ESummary
Retrieve pubmed pmid=7939126
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=7939126"
<eSummaryResult>
<DocSum>
  <Id>7939126</Id>
  <Item Name="PubDate" Type="Date">1994 Apr</Item>
  <Item Name="EPubDate" Type="Date"></Item>
  <Item Name="Source" Type="String">Sleep</Item>
  <Item Name="AuthorList" Type="List">
    <Item Name="Author" Type="String">Broughton R</Item>
    <Item Name="Author" Type="String">Billings R</Item>
    <Item Name="Author" Type="String">Cartwright R</Item>
    <Item Name="Author" Type="String">Doucette D</Item>
    <Item Name="Author" Type="String">Edmeads J</Item>
    <Item Name="Author" Type="String">Edwardh M</Item>
    <Item Name="Author" Type="String">Ervin F</Item>
    <Item Name="Author" Type="String">Orchard B</Item>
    <Item Name="Author" Type="String">Hill R</Item>
    <Item Name="Author" Type="String">Turrell G</Item>
  </Item>
  <Item Name="LastAuthor" Type="String">Turrell G</Item>
  <Item Name="Title" Type="String">Homicidal somnambulism: a case report.</Item>
  <Item Name="Volume" Type="String">17</Item>
  <Item Name="Issue" Type="String">3</Item>
  <Item Name="Pages" Type="String">253-64</Item>
  <Item Name="LangList" Type="List">
    <Item Name="Lang" Type="String">English</Item>
  </Item>
  <Item Name="NlmUniqueID" Type="String">7809084</Item>
  <Item Name="ISSN" Type="String">0161-8105</Item>
  <Item Name="ESSN" Type="String">1550-9109</Item>
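The DocSum above is a flat list of typed Item elements, so it maps naturally onto a dictionary. A minimal Python sketch, assuming the answer has already been downloaded: SAMPLE is a trimmed copy of the output above with the closing tags added, and docsum_to_dict and item_value are hypothetical helper names, not part of any NCBI library.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the ESummary answer shown above (closing tags added).
SAMPLE = """<eSummaryResult>
<DocSum>
  <Id>7939126</Id>
  <Item Name="PubDate" Type="Date">1994 Apr</Item>
  <Item Name="Source" Type="String">Sleep</Item>
  <Item Name="AuthorList" Type="List">
    <Item Name="Author" Type="String">Broughton R</Item>
    <Item Name="Author" Type="String">Billings R</Item>
  </Item>
  <Item Name="Title" Type="String">Homicidal somnambulism: a case report.</Item>
</DocSum>
</eSummaryResult>"""


def item_value(item):
    """Return a list for Type="List" items, the stripped text otherwise."""
    if item.get("Type") == "List":
        return [item_value(child) for child in item.findall("Item")]
    return (item.text or "").strip()


def docsum_to_dict(docsum):
    """Convert one <DocSum> element into a plain dictionary."""
    record = {"Id": docsum.findtext("Id")}
    for item in docsum.findall("Item"):  # direct children only
        record[item.get("Name")] = item_value(item)
    return record


records = [docsum_to_dict(d) for d in ET.fromstring(SAMPLE).findall("DocSum")]
print(records[0]["Title"], records[0]["AuthorList"])
```

Nested lists such as AuthorList become Python lists, which is usually enough for quick reporting.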
EFetch
EFetch
Syntax
Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=(db)&id=(ID)
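The syntax above is a base URL followed by query parameters. A minimal Python sketch using only the standard library: efetch_url is a hypothetical helper name, and rettype and retmode are the optional E-utilities parameters that select the output format.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def efetch_url(db, uid, rettype=None, retmode=None):
    """Build an EFetch URL; optional parameters are left out when None."""
    params = {"db": db, "id": uid}
    if rettype is not None:
        params["rettype"] = rettype
    if retmode is not None:
        params["retmode"] = retmode
    return EUTILS + "efetch.fcgi?" + urlencode(params)


print(efetch_url("nucleotide", 507866428))
print(efetch_url("nucleotide", 507866428, rettype="fasta"))
print(efetch_url("nucleotide", "KC524742", rettype="gb"))
```

The resulting string can be passed to curl, wget or urllib.request.urlopen.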
EFetch
Retrieve nucleotide gi=507866428 as ASN.1
Default: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428
Seq-entry ::= set {
  class nuc-prot,
  descr {
    source {
      genome genomic,
      org {
        taxname "Mammuthus primigenius",
        common "woolly mammoth",
        db {
          {
            db "taxon",
            tag
              id 37349 } },
        orgname {
          name
            binomial {
              genus "Mammuthus",
              species "primigenius" },
          mod {
            {
EFetch
Retrieve nucleotide gi=507866428 as Fasta
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&rettype=fasta
>gi|507866428|gb|KC524742.1| Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene, partial cds
GCACTTGCTTTTTTTGTCTTCTTCAGACCACGACATGGGACTCAGCGACGGGGAATGGGAGTTGGTGTTG
AAAACCTGGGGGAAAGTGGAGGCTGACATCCCGGGCCATGGGCTGGAAGTCTTCGTCAGGTAAAGGAAGA
AATCCTGTGGCCCCCATCACCCACCCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
EFetch
Retrieve nucleotide gi=507866428 as TinySeq
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&rettype=fasta&retmode=xml
<?xml version="1.0"?>
<!DOCTYPE TSeqSet PUBLIC "-//NCBI//NCBI TSeq/EN"
<TSeqSet>
<TSeq>
  <TSeq_seqtype value="nucleotide"/>
  <TSeq_gi>507866428</TSeq_gi>
  <TSeq_accver>KC524742.1</TSeq_accver>
  <TSeq_taxid>37349</TSeq_taxid>
  <TSeq_orgname>Mammuthus primigenius</TSeq_orgnam
  <TSeq_defline>Mammuthus primigenius isolate CME2
  <TSeq_length>9042</TSeq_length>
  <TSeq_sequence>GCACTTGCTTTTTTTGTCTTCTTCAGACCACGA
</TSeq>
</TSeqSet>
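TinySeq is the easiest of the EFetch XML flavours to parse because each record is one flat TSeq element. A minimal Python sketch, assuming a shortened record: SAMPLE keeps only some of the fields shown above and the sequence is cut to its first bases, and tinyseq_to_fasta is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened TinySeq record; the sequence is cut to its first bases.
SAMPLE = """<TSeqSet>
<TSeq>
  <TSeq_seqtype value="nucleotide"/>
  <TSeq_gi>507866428</TSeq_gi>
  <TSeq_accver>KC524742.1</TSeq_accver>
  <TSeq_taxid>37349</TSeq_taxid>
  <TSeq_orgname>Mammuthus primigenius</TSeq_orgname>
  <TSeq_length>9042</TSeq_length>
  <TSeq_sequence>GCACTTGCTTTTTTTGTCTTCTTCAGACCACGA</TSeq_sequence>
</TSeq>
</TSeqSet>"""


def tinyseq_to_fasta(xml_text):
    """Yield (header, sequence) pairs from a TSeqSet document."""
    for tseq in ET.fromstring(xml_text).iter("TSeq"):
        header = ">{} {}".format(tseq.findtext("TSeq_accver"),
                                 tseq.findtext("TSeq_orgname"))
        yield header, tseq.findtext("TSeq_sequence")


for header, sequence in tinyseq_to_fasta(SAMPLE):
    print(header)
    print(sequence)
```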
EFetch
Retrieve nucleotide gi=507866428 as Genbank-xml
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&retmode=xml
<GBSeq>
  <GBSeq_locus>KC524742</GBSeq_locus>
  <GBSeq_length>9042</GBSeq_length>
  <GBSeq_strandedness>double</GBSeq_strandedness>
  <GBSeq_moltype>DNA</GBSeq_moltype>
  <GBSeq_topology>linear</GBSeq_topology>
  <GBSeq_division>MAM</GBSeq_division>
  <GBSeq_update-date>21-JUN-2013</GBSeq_update-date>
  <GBSeq_create-date>15-JUN-2013</GBSeq_create-date>
  <GBSeq_definition>Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene, parti
  <GBSeq_primary-accession>KC524742</GBSeq_primary-accession>
  <GBSeq_accession-version>KC524742.1</GBSeq_accession-version>
  <GBSeq_other-seqids>
    <GBSeqid>gb|KC524742.1|</GBSeqid>
    <GBSeqid>gi|507866428</GBSeqid>
  </GBSeq_other-seqids>
  <GBSeq_source>Mammuthus primigenius (woolly mammoth)</GBSeq_source>
  <GBSeq_organism>Mammuthus primigenius</GBSeq_organism>
(...)
EFetch
Retrieve nucleotide gi=507866428 as Genbank
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&rettype=gb
LOCUS       KC524742                9042 bp    DNA     linear   MAM 21-JUN-2013
DEFINITION  Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene,
            partial cds.
ACCESSION   KC524742
VERSION     KC524742.1  GI:507866428
KEYWORDS    .
SOURCE      Mammuthus primigenius (woolly mammoth)
  ORGANISM  Mammuthus primigenius
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Afrotheria; Proboscidea; Elephantidae;
            Mammuthus.
REFERENCE   1  (bases 1 to 9042)
  AUTHORS   Mirceta,S., Signore,A.V., Burns,J.M., Cossins,A.R., Campbell,K.L.
            and Berenbrink,M.
  TITLE     Evolution of mammalian diving capacity traced by myoglobin net
            surface charge
  JOURNAL   Science 340 (6138), 1234192 (2013)
   PUBMED   23766330
REFERENCE   2  (bases 1 to 9042)
  AUTHORS   Signore,A.V., Campbell,K.L. and Poinar,H.N.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-JAN-2013) Biological Sciences, University of
            Manitoba, 50 Sifton Road, Winnipeg, Manitoba R3T2N2, Canada
COMMENT     ##Assembly-Data-START##
            Sequencing Technology :: Sanger dideoxy sequencing
            ##Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..9042
                     /organism="Mammuthus primigenius"
EFetch
EFetch also accepts accession numbers in place of GI numbers
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=KC524742&rettype=gb
LOCUS       KC524742                9042 bp    DNA     linear   MAM 21-JUN-2013
DEFINITION  Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene,
            partial cds.
ACCESSION   KC524742
VERSION     KC524742.1  GI:507866428
KEYWORDS    .
SOURCE      Mammuthus primigenius (woolly mammoth)
  ORGANISM  Mammuthus primigenius
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Afrotheria; Proboscidea; Elephantidae;
            Mammuthus.
REFERENCE   1  (bases 1 to 9042)
  AUTHORS   Mirceta,S., Signore,A.V., Burns,J.M., Cossins,A.R., Campbell,K.L.
            and Berenbrink,M.
  TITLE     Evolution of mammalian diving capacity traced by myoglobin net
            surface charge
  JOURNAL   Science 340 (6138), 1234192 (2013)
   PUBMED   23766330
REFERENCE   2  (bases 1 to 9042)
  AUTHORS   Signore,A.V., Campbell,K.L. and Poinar,H.N.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-JAN-2013) Biological Sciences, University of
            Manitoba, 50 Sifton Road, Winnipeg, Manitoba R3T2N2, Canada
COMMENT     ##Assembly-Data-START##
            Sequencing Technology :: Sanger dideoxy sequencing
            ##Assembly-Data-END##
FEATURES             Location/Qualifiers
     source          1..9042
                     /organism="Mammuthus primigenius"
EFetch
Using the WebEnv parameter.
WebEnv is the web environment string returned by a previous ESearch, EPost
or ELink call. When provided, ESearch posts the results of the search
operation to this pre-existing WebEnv; EFetch, given the same WebEnv and a
query_key, retrieves the UIDs stored there.
EFetch
Using the WebEnv parameter.
Searching extinct species in the NCBI taxonomy ('extinct[PROP]')
curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?usehistory=y&db=taxonomy&term=extinct%5BPROP%5D"
<eSearchResult>
  <Count>145</Count>
  <RetMax>20</RetMax>
  <RetStart>0</RetStart>
  <QueryKey>1</QueryKey>
  <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
  <IdList>
    <Id>1225531</Id>
    <Id>1225530</Id>
    <Id>1211276</Id>
    <Id>1211275</Id>
    <Id>1027716</Id>
    <Id>948961</Id>
    <Id>943952</Id>
    <Id>867394</Id>
    <Id>867393</Id>
    <Id>748142</Id>
    <Id>748141</Id>
    <Id>741158</Id>
    <Id>703576</Id>
    <Id>703571</Id>
    <Id>703559</Id>
    <Id>693865</Id>
    <Id>686441</Id>
    <Id>665113</Id>
    <Id>659069</Id>
    <Id>656807</Id>
EFetch
Using the WebEnv parameter.
Fetch the extinct species in the NCBI taxonomy ('extinct[PROP]')
using the WebEnv parameter.
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&query_key=1&WebEnv=NCID_1_75550312_130.14.18.34_9001_1375948145_325582538&retmode=xml"
<TaxaSet><Taxon>
  <TaxId>1225531</TaxId>
  <ScientificName>Equus ovodovi</ScientificName>
  <OtherNames>
    <Synonym>Equus (Sussemionus) ovodovi</Synonym>
    <Name>
      <ClassCDE>authority</ClassCDE>
      <DispName>Equus ovodovi Eisenmann &amp; Sergej, 2011</DispName>
    </Name>
  </OtherNames>
  <ParentTaxId>1225530</ParentTaxId>
  <Rank>species</Rank>
  <Division>Mammals</Division>
  <GeneticCode>
    <GCId>1</GCId>
    <GCName>Standard</GCName>
  </GeneticCode>
  <MitoGeneticCode>
(...)
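The two calls above can be chained in a script: read QueryKey and WebEnv from the ESearch answer, then hand them to EFetch. A minimal Python sketch that only builds the second URL and performs no network access: ESEARCH_XML is a shortened copy of the ESearch answer shown earlier, and history_from_esearch and efetch_history_url are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"

# Shortened copy of the ESearch answer obtained with usehistory=y.
ESEARCH_XML = """<eSearchResult>
  <Count>145</Count>
  <QueryKey>1</QueryKey>
  <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
</eSearchResult>"""


def history_from_esearch(xml_text):
    """Return (query_key, webenv) from an ESearch answer."""
    root = ET.fromstring(xml_text)
    return root.findtext("QueryKey"), root.findtext("WebEnv")


def efetch_history_url(db, query_key, webenv, retmode="xml"):
    """Build the EFetch URL that reads its UIDs from the history server."""
    params = {"db": db, "query_key": query_key,
              "WebEnv": webenv, "retmode": retmode}
    return EUTILS + "efetch.fcgi?" + urlencode(params)


query_key, webenv = history_from_esearch(ESEARCH_XML)
print(efetch_history_url("taxonomy", query_key, webenv))
```

No UID list travels over the wire on the second call, which matters when a search returns thousands of records.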
EPOST
EPost
Uploads a list of UIDs to the Entrez History server
Appends a list of UIDs to an existing set of UID lists attached
to a Web Environment
EPost
Post UIDs to EPost
Get a list of taxonomy UIDs of extinct animals:
wget -O - 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=extinct[PROP]&retmax=1000' |
xmllint --format - |
grep '<Id>' |
cut -d '<' -f 2 |
cut -d '>' -f 2 |
tr "\n" ","
output:
1860150,1860149,1849957,1825730,1825729,1636722,1607772,1607771,1607767,1607757,1607756
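The grep/cut/tr pipeline works because xmllint puts each Id on its own line; an XML parser does not depend on the layout. A minimal Python sketch of the same extraction, assuming the ESearch answer is already in memory: ESEARCH_XML is a shortened sample holding the first three UIDs of the output above, and id_list is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened ESearch answer holding the first UIDs of the output above.
ESEARCH_XML = """<eSearchResult>
  <IdList>
    <Id>1860150</Id>
    <Id>1860149</Id>
    <Id>1849957</Id>
  </IdList>
</eSearchResult>"""


def id_list(xml_text):
    """Return the UIDs of an ESearch answer, in document order."""
    return [e.text for e in ET.fromstring(xml_text).iterfind("IdList/Id")]


print(",".join(id_list(ESEARCH_XML)))
```

Unlike tr, join does not leave a trailing comma after the last UID.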
EPost
Post UIDs to EPost
wget -O - "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi?db=taxonomy&WebEnv=NCID_1_15435144_130.14.22.215_9001_1474637318_669113391_0MetA0_S_MegaStore_F_1&id=1860150,1860149,1849957,1825730,1825729,1636722,1607772..."
Output:
<?xml version="1.0"?>
<!DOCTYPE ePostResult PUBLIC "-//NLM//DTD ePostResult, 11 May 2002//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/ePost_020511.dtd">
<ePostResult>
  <QueryKey>1</QueryKey>
  <WebEnv>NCID_1_15467192_130.14.22.215_9001_1474637456_570452194_0MetA0_S_MegaStore_F_1</WebEnv>
</ePostResult>
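Long UID lists quickly exceed what fits comfortably in a URL, so EPost is usually called with an HTTP POST. A minimal Python sketch that prepares the request without sending it and parses an ePostResult: epost_request and parse_epost are hypothetical helper names, and only the db, id and WebEnv parameters shown above are used.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

EPOST = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi"


def epost_request(db, uids, webenv=None):
    """Prepare (but do not send) an HTTP POST carrying the UID list."""
    params = {"db": db, "id": ",".join(str(u) for u in uids)}
    if webenv is not None:
        params["WebEnv"] = webenv  # append to an existing environment
    return Request(EPOST, data=urlencode(params).encode("ascii"))


def parse_epost(xml_text):
    """Return (query_key, webenv) from an ePostResult document."""
    root = ET.fromstring(xml_text)
    return root.findtext("QueryKey"), root.findtext("WebEnv")


request = epost_request("taxonomy", [1860150, 1860149, 1849957])
print(request.get_method(), request.data.decode("ascii"))
```

Passing the prepared request to urllib.request.urlopen would send it; the body of the answer can then go through parse_epost.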
EPost
Searching in the WebEnv
Search Homo sapiens in the WebEnv?
curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Homo%20Sapiens&usehistory=y&WebEnv=NCID_1_75550312_130.14.18.34_9001_1375948145_325582538&query_key=1"
<eSearchResult>
  <Count>0</Count>
  <RetMax>0</RetMax>
  <RetStart>0</RetStart>
  <QueryKey>8</QueryKey>
  <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
  <IdList/>
  <TranslationSet/>
  <TranslationStack>
    <OP>GROUP</OP>
    <TermSet>
      <Term>homo sapiens[All Names]</Term>
      <Field>All Names</Field>
      <Count>0</Count>
      <Explode>N</Explode>
    </TermSet>
    <OP>AND</OP>
  </TranslationStack>
  <QueryTranslation>(#2) AND homo sapiens[All Names]</QueryTranslation>
</eSearchResult>
EPost
Searching in the WebEnv
Search Tyrannosaurus in the WebEnv?
$ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Tyrannosaurus&usehistory=y&WebEnv=NCID_1_75550312_130.14.18.34_9001_1375948145_325582538&query_key=1"
<eSearchResult>
  <Count>1</Count>
  <RetMax>1</RetMax>
  <RetStart>0</RetStart>
  <QueryKey>9</QueryKey>
  <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
  <IdList>
    <Id>436494</Id>
  </IdList>
  <TranslationSet/>
  <TranslationStack>
    <OP>GROUP</OP>
    <TermSet>
      <Term>Tyrannosaurus[All Names]</Term>
      <Field>All Names</Field>
      <Count>1</Count>
      <Explode>N</Explode>
    </TermSet>
    <OP>AND</OP>
  </TranslationStack>
  <QueryTranslation>(#2) AND Tyrannosaurus[All Names]</QueryTranslation>
</eSearchResult>
EDirect: combining tools
Piping Edirect
esearch -db taxonomy -query "Tyrannosaurus" |
efetch -format xml
Piping Edirect
esearch -db pubmed -query "Tyrannosaurus" |
efilter -mindate 2005 |
efetch -format docsum |
xtract -pattern DocumentSummary \
  -element MedlineCitation/PMID \
  -element Id SortFirstAuthor
Elink
Elink
Returns UIDs linked to an input set of UIDs in either the
same or a different Entrez database
Returns UIDs linked to other UIDs in the same Entrez
database that match an Entrez query
Checks for the existence of Entrez links for a set of UIDs
within the same database
Lists the available links for a UID
Lists LinkOut URLs and attributes for a set of UIDs
Lists hyperlinks to primary LinkOut providers for a set of UIDs
Creates hyperlinks to the primary LinkOut provider for a single
UID
Elink
Base URL:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi
ELink
Searching the PubMed records associated with sequence gi:507866428
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=nucleotide&db=pubmed&id=507866428&cmd=neighbor_score
<eLinkResult>
<LinkSet>
  <DbFrom>nuccore</DbFrom>
  <IdList>
    <Id>507866428</Id>
  </IdList>
  <LinkSetDb>
    <DbTo>pubmed</DbTo>
    <LinkName>nuccore_pubmed</LinkName>
    <Link>
      <Id>23766330</Id>
      <Score>0</Score>
    </Link>
  </LinkSetDb>
</LinkSet>
</eLinkResult>
$ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=23766330&rettype=medline&retmode=text"
PMID- 23766330
TI  - Evolution of mammalian diving capacity traced by myoglobin net surface
      charge.
PG  - 1234192
LID - 10.1126/science.1234192 [doi]
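The ELink answer nests the target UIDs under LinkSet/LinkSetDb/Link, next to the source UIDs in IdList. A minimal Python sketch that collects them, assuming the answer is already in memory: ELINK_XML is a copy of the output above and linked_ids is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Copy of the ELink answer shown above.
ELINK_XML = """<eLinkResult>
<LinkSet>
  <DbFrom>nuccore</DbFrom>
  <IdList><Id>507866428</Id></IdList>
  <LinkSetDb>
    <DbTo>pubmed</DbTo>
    <LinkName>nuccore_pubmed</LinkName>
    <Link><Id>23766330</Id><Score>0</Score></Link>
  </LinkSetDb>
</LinkSet>
</eLinkResult>"""


def linked_ids(xml_text, db_to):
    """Map each source UID of a LinkSet to the UIDs linked in db_to."""
    links = {}
    for linkset in ET.fromstring(xml_text).iter("LinkSet"):
        targets = [
            link.findtext("Id")
            for setdb in linkset.findall("LinkSetDb")
            if setdb.findtext("DbTo") == db_to
            for link in setdb.findall("Link")
        ]
        for source in linkset.iterfind("IdList/Id"):
            links[source.text] = targets
    return links


print(linked_ids(ELINK_XML, "pubmed"))
```

When several source UIDs share one LinkSet, this sketch attributes all targets of that LinkSet to each of them, which may be coarser than needed.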
Transformations
Efetch
Transforming to SVG
Using the stylesheet
https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/ncbi/gb2svg.xsl
xsltproc <(curl "https://raw.github.com/lindenb/xslt-sandbox/master/stylesheets/bio/ncbi/gb2svg.xsl") \
  "https://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=14971102&retmode=xml&rettype=gbc"
Efetch
Transforming to SVG
<?xml version="1.0" encoding="UTF-8"?>
<svg:svg xmlns:svg="http://www.w3.org/2000/svg" height="121" width="920" style="stroke-width:1px;">
  <svg:title>Human rotavirus segment 7 NSP3 gene, complete cds</svg:title>
  <svg:defs>
    <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="grad">
      <svg:stop offset="5%" stop-color="black"/>
      <svg:stop offset="50%" stop-color="whitesmoke"/>
      <svg:stop offset="95%" stop-color="black"/>
    </svg:linearGradient>
    <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="vertical_body_gradient">
      <svg:stop offset="5%" stop-color="white"/>
      <svg:stop offset="95%" stop-color="lightgray"/>
    </svg:linearGradient>
  </svg:defs>
  <svg:style type="text/css"/>
  <svg:g>
    <svg:g transform="translate(0,0)">
      <svg:rect x="0" y="0" width="920" height="120" fill="url(#vertical_body_gradient)" stroke="black"/>
      <svg:text style="color:red;font-size:35px;" x="10" y="35">Human rotavirus segment 7 NSP3 gene, complete cds</svg:text>
      <svg:g>
        <svg:rect x="10" y="40" width="900" height="18" style="fill:url(#grad);stroke:black;" title="1..1074"/>
        <svg:text y="54" x="460" text-anchor="middle"><svg:tspan style="font-weight:bold;">source</svg:tspan><svg:tspan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" font-weight="bold">organism</svg:tspan>:Human rotavirus A <svg:tspan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" font-weight="bold">mol_type</svg:tspan>:genomic RNA <svg:tspan xmlns:xsi="http://www.
Efetch
Transforming to SVG
Efetch
Transforming to R
$ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=Tyrannosaurus&usehistory=true" | xmllint --format -
$ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&usehistory=true&WebEnv=NCID_1_52434791_130.14.22.215_9001_1375957034_1619786167&query_key=1&retmode=xml"
Efetch
Transforming to R
<?xml version='1.0' encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:output method="text"/>

<xsl:template match="/">
date2count &lt;- list()
<xsl:apply-templates select="/PubmedArticleSet/PubmedArticle[MedlineCitation/DateCreated/Year]"/>
df &lt;- data.frame(
  Year=as.integer(names(date2count)),
  Count=unlist(date2count)
  )
png('jeterpubmed.png')
plot(df)
title('pubmed: count(articles)=f(year)')
dev.off()
</xsl:template>

<xsl:template match="PubmedArticle">
<xsl:variable name="year" select="MedlineCitation/DateCreated/Year"/>
date2count[["<xsl:value-of select="$year"/>"]] &lt;- ifelse(is.null(date2count[["<xsl:value-of select="$year"/>"]]),1,1+date2count[["<xsl:value-of select="$year"/>"]])
</xsl:template>

</xsl:stylesheet>
Efetch
Transforming to R
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&usehistory=true&WebEnv=NCID_1_52434791_130.14.22.215_9001_1375957034_1619786167&query_key=1&retmode=xml" |
  xsltproc pubmed2rstats.xsl -
date2count <- list()
date2count[["2013"]] <- ifelse(is.null(date2count[["2013"]]),1,1+date2count[["2013"]])
date2count[["2012"]] <- ifelse(is.null(date2count[["2012"]]),1,1+date2count[["2012"]])
date2count[["2012"]] <- ifelse(is.null(date2count[["2012"]]),1,1+date2count[["2012"]])
date2count[["2011"]] <- ifelse(is.null(date2count[["2011"]]),1,1+date2count[["2011"]])
date2count[["2011"]] <- ifelse(is.null(date2count[["2011"]]),1,1+date2count[["2011"]])
(..)
df <- data.frame(
  Year=as.integer(names(date2count)),
  Count=unlist(date2count)
  )
png('jeterpubmed.png')
plot(df)
title('pubmed: count(articles)=f(year)')
dev.off()
Efetch
Transforming to R
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&usehistory=true&WebEnv=NCID_1_52434791_130.14.22.215_9001_1375957034_1619786167&query_key=1&retmode=xml" |
  xsltproc pubmed2rstats.xsl - |
  R --no-save
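The XSLT-to-R pipeline generates one R statement per article; the same tally can be made directly while parsing. A minimal Python sketch of the counting step only, with no plotting: PUBMED_XML is a hypothetical, heavily shortened PubmedArticleSet that keeps just the MedlineCitation/DateCreated/Year path used by the stylesheet, and count_by_year is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily shortened PubmedArticleSet with three records.
PUBMED_XML = """<PubmedArticleSet>
  <PubmedArticle><MedlineCitation><DateCreated><Year>2013</Year></DateCreated></MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><DateCreated><Year>2012</Year></DateCreated></MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><DateCreated><Year>2012</Year></DateCreated></MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""


def count_by_year(xml_text):
    """Count articles per creation year, like date2count in the R script."""
    years = ET.fromstring(xml_text).iterfind(
        "PubmedArticle/MedlineCitation/DateCreated/Year")
    return Counter(int(y.text) for y in years)


for year, count in sorted(count_by_year(PUBMED_XML).items()):
    print(year, count)
```

Articles without a DateCreated/Year element are simply skipped, as with the predicate in the stylesheet.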
Generating a JAVA parser
Using the XML schema
XML Schema for dbSNP
ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.ncbi.nlm.nih.
ementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:element name="ExchangeSet">
    <xsd:annotation>
      <xsd:documentation>Set of dbSNP refSNP docsums, version 3.4</xsd:documentation>
    </xsd:annotation>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="SourceDatabase" minOccurs="0">
          <xsd:complexType>
            <xsd:attribute name="taxId" type="xsd:int" use="required">
              <xsd:annotation>
                <xsd:documentation>NCBI taxonomy ID for variation</xsd:documentation>
              </xsd:annotation>
            </xsd:attribute>
            <xsd:attribute name="organism" type="xsd:string" use="required">
              <xsd:annotation>
                <xsd:documentation>common name for species used as part of database name
              </xsd:annotation>
            </xsd:attribute>
            <xsd:attribute name="dbSnpOrgAbbr" type="xsd:string">
              <xsd:annotation>
                <xsd:documentation>organism abbreviation used in dbSNP. </xsd:documentat
              </xsd:annotation>
            </xsd:attribute>
            <xsd:attribute name="gpipeOrgAbbr" type="xsd:string">
              <xsd:annotation>
                <xsd:documentation>organism abbreviation used within NCBI genome pipelin
Using the XML schema
Compiling the XML Schema for dbSNP with XJC
$ xjc -d . "ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd"
parsing a schema...
compiling a schema...
https/www_ncbi_nlm_nih_gov/snp/docsum/Assay.java
https/www_ncbi_nlm_nih_gov/snp/docsum/Assembly.java
https/www_ncbi_nlm_nih_gov/snp/docsum/BaseURL.java
https/www_ncbi_nlm_nih_gov/snp/docsum/Component.java
https/www_ncbi_nlm_nih_gov/snp/docsum/ExchangeSet.java
https/www_ncbi_nlm_nih_gov/snp/docsum/FxnSet.java
https/www_ncbi_nlm_nih_gov/snp/docsum/MapLoc.java
https/www_ncbi_nlm_nih_gov/snp/docsum/ObjectFactory.java
https/www_ncbi_nlm_nih_gov/snp/docsum/PrimarySequence.java
https/www_ncbi_nlm_nih_gov/snp/docsum/Rs.java
https/www_ncbi_nlm_nih_gov/snp/docsum/RsLinkout.java
https/www_ncbi_nlm_nih_gov/snp/docsum/RsStruct.java
https/www_ncbi_nlm_nih_gov/snp/docsum/Ss.java
https/www_ncbi_nlm_nih_gov/snp/docsum/package-info.java
Using the XML schema
Compiling the XML Schema for dbSNP with XJC
Search the non-genomic rs# in dbSNP.
import https.www_ncbi_nlm_nih_gov.snp.docsum.*;
import javax.xml.bind.*;
import javax.xml.stream.*;
import javax.xml.stream.events.*;

class ParseDbSnp
{
    public static void main(String[] args) throws Exception
    {
        // JAXB context for the package generated by xjc
        JAXBContext jaxbCtxt = JAXBContext.newInstance("https.www_ncbi_nlm_nih_gov.snp.docsum");
        Unmarshaller unmarshaller = jaxbCtxt.createUnmarshaller();
        // stream the XML from stdin instead of loading it in memory
        XMLInputFactory ifactory = XMLInputFactory.newInstance();
        XMLEventReader r = ifactory.createXMLEventReader(System.in);
        while (r.hasNext())
        {
            // peek without consuming, so JAXB still sees the <Rs> start tag
            XMLEvent evt = r.peek();
            if (!(evt.isStartElement() && evt.asStartElement().getName().getLocalPart().equals("Rs")))
            {
                evt = r.nextEvent();
                continue;
            }

            // unmarshal one <Rs> record and skip the genomic ones
            Rs rs = unmarshaller.unmarshal(r, Rs.class).getValue();
            if ("genomic".equals(rs.getMolType())) continue;
            System.out.println("rs" + rs.getRsId() + " " + rs.getMolType());
        }
        r.close();
    }
}
Using the XML schema
Compiling the XML Schema for dbSNP with XJC
compile...
$ javac ParseDbSnp.java https/www_ncbi_nlm_nih_gov/snp/docsum/*.java
and run...
$ curl -s "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/XML/ds_ch1.xml.gz" |
  gunzip -c |
  java ParseDbSnp
rs701 cDNA
rs860 cDNA
rs861 cDNA
rs862 cDNA
rs863 cDNA
rs864 cDNA
rs865 cDNA
rs866 cDNA
rs877 cDNA
rs878 cDNA
rs879 cDNA
rs880 cDNA
rs882 cDNA
rs883 cDNA
rs884 cDNA
rs885 cDNA
rs886 cDNA
rs913 cDNA
rs945 cDNA
rs946 cDNA
( . . . )
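The same streaming idea (handle one Rs record at a time, never the whole file) can be sketched without generated classes. A minimal Python version using the standard-library iterparse: SAMPLE is a hypothetical miniature of an ExchangeSet that keeps only the rsId and molType attributes read by the Java program, its namespace URI is an assumption, and non_genomic_rs is a hypothetical helper name.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature of a dbSNP ExchangeSet: only the rsId and molType
# attributes read by the Java program are kept; the namespace is assumed.
SAMPLE = b"""<ExchangeSet xmlns="https://www.ncbi.nlm.nih.gov/SNP/docsum">
  <Rs rsId="701" molType="cDNA"/>
  <Rs rsId="702" molType="genomic"/>
  <Rs rsId="860" molType="cDNA"/>
</ExchangeSet>"""


def non_genomic_rs(stream):
    """Stream <Rs> elements and yield (name, molType) for non-genomic ones."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag.rsplit("}", 1)[-1] != "Rs":  # ignore the namespace
            continue
        if elem.get("molType") != "genomic":
            yield "rs" + elem.get("rsId"), elem.get("molType")
        elem.clear()  # free the memory held by the processed record


for name, mol_type in non_genomic_rs(io.BytesIO(SAMPLE)):
    print(name, mol_type)
```

With a real file, the decompressed stream (for example sys.stdin.buffer after gunzip -c) would take the place of the BytesIO object.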
NCBI EBot
NCBI EBot
URL
https://www.ncbi.nlm.nih.gov/Class/PowerTools/eutils/ebot/ebot.cgi
NCBI EBot
Sample output
#!/usr/bin/perl
(...)
# PUBLIC DOMAIN NOTICE
# National Center for Biotechnology Information
use LWP::Simple;
use LWP::UserAgent;
use Net::FTP;
my $delay = 0;
my $maxdelay = 3;
my $base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/";
$params{email} = "nobody@nowhere.com";
$params{db} = "nuccore";
$params{tool} = "ebot";
$params{term} = "Mammuthus+primigenius[ORGN]";
%params = esearch(%params);
$params{retmode} = "xml";
$params{outfile} = "result.xml";
$params{rettype} = "native";
efetch_batch(%params);
BLAST
Standalone Blast
Downloading
Standalone tools are available at ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
# add BLAST to your path
export PATH=${PATH}:/path/to/ncbi-blast-2.2.28+/bin
Standalone Blast
Download a sample
apis mellifera proteins
curl -o protein.fa.gz \
  "ftp://ftp.ncbi.nih.gov/genomes/Apis_mellifera/protein/protein.fa.gz"
gunzip protein.fa.gz
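Before building a database it can be useful to check what was downloaded. The following is a minimal sketch of a FASTA reader in Python; `read_fasta`, `open_maybe_gzip` and the two-record sample are illustrative names and data, not part of the course material.

```python
import gzip
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a text handle in FASTA format."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def open_maybe_gzip(path):
    """Open a FASTA file as text, whether or not it is gzip-compressed."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


# Tiny made-up sample so the reader can be tried without the download.
SAMPLE_FASTA = ">seq1 hypothetical protein\nMKV\nLLA\n>seq2\nMST\n"
records = list(read_fasta(io.StringIO(SAMPLE_FASTA)))
```

On the real file, `sum(1 for _ in read_fasta(open_maybe_gzip("protein.fa")))` would give the number of sequences to compare with what makeblastdb reports.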
Standalone Blast
Create a Blast database with makeblastdb
Getting help...
$ makeblastdb -help
(...)
 -dbtype <String, `nucl', `prot'>
   Molecule type of target db
 -in <File_In>
   Input file/database name
   Default = `-'
 -input_type <String, `asn1_bin', `asn1_txt', `blast
   Type of the data specified in input_file
   Default = `fasta'
(..)
Standalone Blast
Create a Blast database with makeblastdb
Create the BLAST database:
$ makeblastdb -in protein.fa -dbtype prot
Building a new DB, current time: 09/02/2013 18:29:38
New DB name: protein.fa
New DB title: protein.fa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 10570 sequences
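When this step is part of a larger pipeline, the same command can be launched from a script. The sketch below assembles the makeblastdb arguments shown above and only runs them if the executable is found on the PATH; the helper names are hypothetical and error handling is kept to a minimum.

```python
import shutil
import subprocess


def makeblastdb_cmd(fasta_path, dbtype="prot"):
    """Build the makeblastdb argument list for a FASTA file."""
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    return ["makeblastdb", "-in", fasta_path, "-dbtype", dbtype]


def run_if_available(cmd):
    """Run cmd and return its stdout, or None if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


cmd = makeblastdb_cmd("protein.fa", "prot")
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with unusual file names.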
Standalone Blast
Query a Blast database with blastp
Get help:
$ blastp -help
(...)
 -query <File_In>
   Input file name
   Default = `-'
 -db <String>
   BLAST database name
(...)
Standalone Blast
Blast human EIF4G1 gi:187956781
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&rettype=fasta&id=187956781" |
  blastp -db protein.fa
Query= gi|187956781|gb|AAI40897.1| EIF4G1 protein [Homo sapiens]
(...)
                                                                      Score     E
Sequences producing significant alignments:                          (Bits)  Value
gi|328782175|ref|XP_394628.4| PREDICTED: eukaryotic translation...     189    4e-49
gi|328779480|ref|XP_003249661.1| PREDICTED: hypothetical protei...    38.1    0.017
gi|110762568|ref|XP_001121713.1| PREDICTED: hypothetical protei...    38.1    0.018
(...)
> gi|328782175|ref|XP_394628.4| PREDICTED: eukaryotic translation
initiation factor 4 gamma 2-like [Apis mellifera]
Length=899
 Score = 189 bits (479), Expect = 4e-49, Method: Compositional matrix adjust.
 Identities = 115/319 (36%), Positives = 175/319 (55%), Gaps = 39/319 (12%)
Query  717  KEPRKIIATVLMTEDIKLNKAEKAWKPSS--KRTAADKDRGEEDADGSKTQDLFRRVRSI  774
            ++P +    +++ +DI+    E+ W P S  +R A   +        S+   +FR+VR I
Sbjct  22   RKPSETTVGLVIKDDIRSLSTEQRWIPPSTLRRDALTPE--------SRNNFIFRKVRGI  73
Query  775  LNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCL-----  829
            LNKLTP+ F +L   +  + ++++  LKGVI LIFEKA+ EP +S  YA +C+ L
Sbjct  74   LNKLTPEKFAKLSNDLLNVELNSDVILKGVIFLIFEKALDEPKYSSMYAQLCKRLSDEAA  133
Query  830  -MALKVPTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLK  888
                K    E       F  LLL++C+ EFE      E FE +    DE    EE
Sbjct  134  NFEPKKALIESQKGQSTFTFLLLSKCRDEFENRSKASEAFENQ----DELGPEEE-----  184
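The percentages on the "Identities" line are simply each count divided by the alignment length (319 columns, gaps included). The short Python sketch below parses such a line and recomputes the figures; the regular expression and function names are assumptions for this example, and rounding to the nearest integer is used because it reproduces the numbers shown here.

```python
import re

STATS = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Map each statistic to (count, alignment_length, reported_percent)."""
    return {name: (int(count), int(total), int(pct))
            for name, count, total, pct in STATS.findall(line)}


def percent(count, total):
    """Percentage of alignment columns, rounded to the nearest integer."""
    return round(100 * count / total)


LINE = "Identities = 115/319 (36%), Positives = 175/319 (55%), Gaps = 39/319 (12%)"
stats = parse_stats(LINE)
recomputed = {name: percent(count, total)
              for name, (count, total, _) in stats.items()}
```

For this hit, 115/319 is about 36.05%, 175/319 about 54.86% and 39/319 about 12.23%, which matches the reported 36%, 55% and 12%.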
Standalone Blast
Blast human EIF4G1 gi:187956781, output XML
$ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&rettype=fasta&id=187956781" |
  blastp -db protein.fa -outfmt 5
(...)
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>189.119</Hsp_bit-score>
<Hsp_score>479</Hsp_score>
<Hsp_evalue>3.78314e-49</Hsp_evalue>
<Hsp_query-from>717</Hsp_query-from>
<Hsp_query-to>1017</Hsp_query-to>
<Hsp_hit-from>22</Hsp_hit-from>
<Hsp_hit-to>319</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>115</Hsp_identity>
<Hsp_positive>175</Hsp_positive>
<Hsp_gaps>39</Hsp_gaps>
<Hsp_align-len>319</Hsp_align-len>
<Hsp_qseq>KEPRKIIATVLMTEDIKLNKAEKAWKPSS--KRTAADKDRGEEDADGSKTQDLFRRVRSILNKLTPQMFQQ
IARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLL--------KNHDEESLECLCRLLTTIGKDLDFEKAKPRMDQYFNQMEKIIKEKK
<Hsp_hseq>RKPSETTVGLVIKDDIRSLSTEQRWIPPSTLRRDALTPE--------SRNNFIFRKVRGILNKLTPEKFAKLS
VAKRKMLGNIKFIGELGKLGIVSETILHRCILQLLEKKRRRRSRGDTAEDIECLCQIMRTCGRILDSDKGRGLMDQYFKRMNSLAESRD
<Hsp_midline>++P + +++ +DI+ E+ W P S +R A + S+ +FR+VR ILNKLTP+ F
+ + ++++ LKGVI LIFEKA+ EP +S YA +C+ L K E F LLL++C+ EFE
E FE + DE EE E
R +A+R+ LGNIKFIGEL KL +++E I+H C+++LL + E +ECLC+++ T G+ LD +K + MDQYF +M
+ + + RI+FML+DV++LR WVPR+ +GP I+QI + E</Hsp_midline>
</Hsp>
(...)
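The XML form is easier to process than the text report. As a minimal sketch, the Python code below reads the numeric fields of each `<Hsp>` element with the standard library; the embedded sample is a shortened copy of the fragment above (sequence fields left out), and `iter_hsps` is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the HSP shown above, without the sequence fields.
SAMPLE_XML = """<Hit_hsps>
  <Hsp>
    <Hsp_num>1</Hsp_num>
    <Hsp_bit-score>189.119</Hsp_bit-score>
    <Hsp_score>479</Hsp_score>
    <Hsp_evalue>3.78314e-49</Hsp_evalue>
    <Hsp_identity>115</Hsp_identity>
    <Hsp_positive>175</Hsp_positive>
    <Hsp_gaps>39</Hsp_gaps>
    <Hsp_align-len>319</Hsp_align-len>
  </Hsp>
</Hit_hsps>"""


def iter_hsps(xml_text):
    """Yield one dict of numeric fields per <Hsp> element."""
    root = ET.fromstring(xml_text)
    for hsp in root.iter("Hsp"):
        yield {
            "num": int(hsp.findtext("Hsp_num")),
            "bit_score": float(hsp.findtext("Hsp_bit-score")),
            "evalue": float(hsp.findtext("Hsp_evalue")),
            "identity": int(hsp.findtext("Hsp_identity")),
            "align_len": int(hsp.findtext("Hsp_align-len")),
        }


hsps = list(iter_hsps(SAMPLE_XML))
best = hsps[0]
identity_fraction = best["identity"] / best["align_len"]
```

A complete `-outfmt 5` file can be handled the same way by passing its whole content to `iter_hsps`, since `root.iter("Hsp")` walks every hit.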
NCBI URL-API Blast
NCBI URL-API Blast
https://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html
$ curl "https://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Put&QUERY=PAERLMERKADIE&DATABASE=nr&PROGRAM=blastp&FILTER=L&HITLIST_SIZE=500"
(...)
<!--QBlastInfoBegin
    RID = 1NRYGX9K014
    RTOE = 29
QBlastInfoEnd
-->
(...)
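The answer to a `CMD=Put` request carries a request ID (RID) and an estimated waiting time in seconds (RTOE); the results are then retrieved with a second request using `CMD=Get`. The Python sketch below extracts both values from the comment block and builds the follow-up URL. The function names are assumptions for this example, and the sample page is the fragment shown above.

```python
import re
from urllib.parse import urlencode

BLAST_CGI = "https://www.ncbi.nlm.nih.gov/blast/Blast.cgi"


def parse_qblast_info(page):
    """Extract (RID, RTOE in seconds) from the QBlastInfo comment block."""
    block = re.search(r"QBlastInfoBegin(.*?)QBlastInfoEnd", page, re.S)
    if block is None:
        raise ValueError("no QBlastInfo block found")
    fields = dict(re.findall(r"^\s*(\w+)\s*=\s*(\S+)\s*$",
                             block.group(1), re.M))
    return fields["RID"], int(fields["RTOE"])


def get_url(rid, format_type="XML"):
    """Build the CMD=Get URL used to poll for the results of a request."""
    query = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return BLAST_CGI + "?" + urlencode(query)


SAMPLE_PAGE = """<!--QBlastInfoBegin
    RID = 1NRYGX9K014
    RTOE = 29
QBlastInfoEnd
-->"""

rid, rtoe = parse_qblast_info(SAMPLE_PAGE)
results_url = get_url(rid)
```

A client would typically wait about `rtoe` seconds before requesting `results_url`, and retry at a modest interval while the search is still running.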
The End
 
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment BookingCall Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
 
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowKolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
 
call girls in green park DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in green park  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in green park  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in green park DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
 

NCBI Entrez API Guide for Advanced Bioinformatics

  • 1. Advanced NCBI. The Entrez API
      https://github.com/lindenb/courses
      Pierre Lindenbaum, @yokofakun, pierre.lindenbaum@univ-nantes.fr
      http://plindenbaum.blogspot.com
      Institut du Thorax. Nantes. France
      September 27, 2016
  • 2. NCBI? What about EBI, ENSEMBL, ...
  • 4. What will be covered today?
      File formats...
      EInfo, GQuery, ESearch, ESummary, EFetch...
      Processing XML answers with XSLT: HTML, SVG, R...
      Generating a Java parser for dbSNP.
      NCBI EBot
      Using standalone BLAST
  • 5. CURL
      curl "http://en.wikipedia.org/wiki/Main_Page"
      wget -O - "http://en.wikipedia.org/wiki/Main_Page"
  • 6. XML
  • 7. XSLT
  • 8. XSLT
  • 9. XSLTPROC
      xsltproc stylesheet.xsl file.xml > result.xml
  • 10. JSON
  • 11. Formats
  • 12. Formats: GenBank
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=25&rettype=gb
      LOCUS       X53813                   422 bp    DNA     linear   MAM 22-JUN-1992
      DEFINITION  Blue Whale heavy satellite DNA.
      ACCESSION   X53813 X17460
      VERSION     X53813.1  GI:25
      KEYWORDS    satellite DNA.
      SOURCE      Balaenoptera musculus (Blue whale)
        ORGANISM  Balaenoptera musculus
                  Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
                  Mammalia; Eutheria; Laurasiatheria; Cetartiodactyla; Cetacea;
                  Mysticeti; Balaenopteridae; Balaenoptera.
      REFERENCE   1  (bases 1 to 422)
        AUTHORS   Arnason,U. and Widegren,B.
        TITLE     Composition and chromosomal localization of cetacean highly
                  repetitive DNA with special reference to the blue whale,
                  Balaenoptera musculus
        JOURNAL   Chromosoma 98 (5), 323-329 (1989)
         PUBMED   2612291
      COMMENT     See also <X52700-2> for 1,760 bp common cetacean component
                  clones and <X52703-6>,<X53811-4> for the 422 bp heavy satellite
                  clones.
      FEATURES             Location/Qualifiers
           source          1..422
                           /organism="Balaenoptera musculus"
                           /mol_type="genomic DNA"
                           /db_xref="taxon:9771"
                           /clone="7"
           misc_feature    1..422
                           /note="heavy satellite DNA"
      ORIGIN
              1 tagttattca acctatccca ctctctagat accccttagc acgtaaagga atattatttg
  • 13. Formats: ASN.1
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=25
      Seq-entry ::= seq {
        id {
          embl {
            accession "X53813",
            version 1 },
          gi 25 },
        descr {
          title "Blue Whale heavy satellite DNA",
          source {
            org {
              taxname "Balaenoptera musculus",
              common "Blue whale",
              db {
                {
                  db "taxon",
                  tag id 9771 } },
              orgname {
                name binomial {
                  genus "Balaenoptera",
                  species "musculus" },
                lineage "Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Laurasiatheria; Cetartiodactyla; Cetacea; Mysticeti; Balaenopteridae; Balaenoptera",
                gcode 1,
                mgcode 2,
                div "MAM" } },
            subtype {
  • 14. Formats: ASN.1 (schema)
      http://www.ncbi.nlm.nih.gov/data_specs/asn/insdseq.asn
      INSDSeq ::= SEQUENCE {
        locus VisibleString,
        length INTEGER,
        strandedness VisibleString OPTIONAL,
        moltype VisibleString,
        topology VisibleString OPTIONAL,
        division VisibleString,
        update-date VisibleString,
        create-date VisibleString OPTIONAL,
        update-release VisibleString OPTIONAL,
        create-release VisibleString OPTIONAL,
        definition VisibleString,
        primary-accession VisibleString OPTIONAL,
        entry-version VisibleString OPTIONAL,
        accession-version VisibleString OPTIONAL,
        other-seqids SEQUENCE OF INSDSeqid OPTIONAL,
        secondary-accessions SEQUENCE OF INSDSecondary-accn OPTIONAL,
        project VisibleString OPTIONAL,
        keywords SEQUENCE OF INSDKeyword OPTIONAL,
        segment VisibleString OPTIONAL,
        source VisibleString OPTIONAL,
        organism VisibleString OPTIONAL,
        taxonomy VisibleString OPTIONAL,
        references SEQUENCE OF INSDReference OPTIONAL,
        comment VisibleString OPTIONAL,
        comment-set SEQUENCE OF INSDComment OPTIONAL,
        struc-comments SEQUENCE OF INSDStrucComment OPTIONAL,
        primary VisibleString OPTIONAL,
        source-db VisibleString OPTIONAL,
  • 15. Formats: ASN.1 (tools)
      DATATOOL
      Generate C++ data storage classes based on ASN.1 serialization streams.
      Convert data between ASN.1, XML and JSON formats.
  • 16. Formats: XML
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=25&retmode=xml
      <?xml version="1.0"?>
      <!DOCTYPE GBSet PUBLIC "-//NCBI//NCBI GBSeq/EN" "http://www.ncbi.nlm.nih.gov/dtd/NCBI_G
      <GBSet>
      <GBSeq>
        <GBSeq_locus>X53813</GBSeq_locus>
        <GBSeq_length>422</GBSeq_length>
        <GBSeq_strandedness>double</GBSeq_strandedness>
        <GBSeq_moltype>DNA</GBSeq_moltype>
        <GBSeq_topology>linear</GBSeq_topology>
        <GBSeq_division>MAM</GBSeq_division>
        <GBSeq_update-date>22-JUN-1992</GBSeq_update-date>
        <GBSeq_create-date>13-JUL-1990</GBSeq_create-date>
        <GBSeq_definition>Blue Whale heavy satellite DNA</GBSeq_definition>
        <GBSeq_primary-accession>X53813</GBSeq_primary-accession>
        <GBSeq_accession-version>X53813.1</GBSeq_accession-version>
        <GBSeq_other-seqids>
          <GBSeqid>emb|X53813.1|</GBSeqid>
          <GBSeqid>gi|25</GBSeqid>
        </GBSeq_other-seqids>
        <GBSeq_secondary-accessions>
          <GBSecondary-accn>X17460</GBSecondary-accn>
        </GBSeq_secondary-accessions>
        <GBSeq_keywords>
          <GBKeyword>satellite DNA</GBKeyword>
        </GBSeq_keywords>
        <GBSeq_source>Balaenoptera musculus (Blue whale)</GBSeq_source>
        <GBSeq_organism>Balaenoptera musculus</GBSeq_organism>
        <GBSeq_taxonomy>Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mam
        actyla; Cetacea; Mysticeti; Balaenopteridae; Balaenoptera</GBSeq_taxonomy>
  • 17. Formats: XML (DTD)
      http://www.ncbi.nlm.nih.gov/dtd/NCBI_GBSeq.mod.dtd
      <!ELEMENT GBSeq (
        GBSeq_locus,
        GBSeq_length,
        GBSeq_strandedness?,
        GBSeq_moltype,
        GBSeq_topology?,
        GBSeq_division,
        GBSeq_update-date,
        GBSeq_create-date?,
        GBSeq_update-release?,
        GBSeq_create-release?,
        GBSeq_definition,
        GBSeq_primary-accession?,
        GBSeq_entry-version?,
        GBSeq_accession-version?,
        GBSeq_other-seqids?,
        GBSeq_secondary-accessions?,
        GBSeq_project?,
        GBSeq_keywords?,
        GBSeq_segment?,
        GBSeq_source?,
        GBSeq_organism?,
        GBSeq_taxonomy?,
        GBSeq_references?,
        GBSeq_comment?,
        GBSeq_comment-set?,
        GBSeq_struc-comments?,
        (...)
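The three EFetch URLs on the format slides differ only in their `rettype` and `retmode` parameters. A minimal Python sketch of that pattern follows; the helper name `efetch_url` is hypothetical and not part of any NCBI library, and the comments on the returned formats simply restate what the slides show for `db=nucleotide`.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, uid, rettype=None, retmode=None):
    """Build an EFetch URL (hypothetical helper); None parameters are omitted."""
    params = {"db": db, "id": uid}
    if rettype is not None:
        params["rettype"] = rettype
    if retmode is not None:
        params["retmode"] = retmode
    return EFETCH + "?" + urlencode(params)


# The three requests used on the slides for record 25 of the nucleotide database.
genbank_url = efetch_url("nucleotide", 25, rettype="gb")  # GenBank flat file
asn1_url = efetch_url("nucleotide", 25)                   # ASN.1 on the slides
xml_url = efetch_url("nucleotide", 25, retmode="xml")     # GBSeq XML

for url in (genbank_url, asn1_url, xml_url):
    print(url)
```

The script only prints the URLs; fetching them (with `curl`, as on the CURL slide, or with `urllib.request.urlopen`) is left out so that it runs without network access.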
  • 18. E-Utilities
  • 19. GI
  • 20. GI
      http://www.ncbi.nlm.nih.gov/news/03-02-2016-phase-out-of-GI-numbers/ :
      "NCBI is phasing out sequence GIs - use Accession.Version instead!"
  • 21. E-Utilities
      Set of nine server-side programs that provide a stable interface to the search, retrieval, and linking functions of the Entrez system, using a fixed URL syntax.
      The output provided by the E-Utilities is in XML format, sometimes JSON, (...)
  • 22. Entrez Direct
      http://www.ncbi.nlm.nih.gov/books/NBK179288/
      "Entrez Direct (EDirect) is an advanced method for accessing the NCBI's set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window. Functions take search terms from command-line arguments. Individual operations are combined to build multi-step queries. Record retrieval and formatting normally complete the process."
  • 23. EInfo
  • 24. EInfo
      Provides a list of the names of all valid Entrez databases.
      Provides statistics for a single database, including lists of indexing fields and available link names.
  • 25. EInfo
      Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi
  • 26. EInfo: XML output
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi
      <eInfoResult>
      <DbList>
        <DbName>pubmed</DbName>
        <DbName>protein</DbName>
        <DbName>nuccore</DbName>
        <DbName>nucleotide</DbName>
        <DbName>nucgss</DbName>
        <DbName>nucest</DbName>
        <DbName>structure</DbName>
        <DbName>genome</DbName>
        <DbName>assembly</DbName>
        <DbName>gcassembly</DbName>
        <DbName>genomeprj</DbName>
        <DbName>bioproject</DbName>
        <DbName>biosample</DbName>
        <DbName>biosystems</DbName>
        <DbName>blastdbinfo</DbName>
  • 27. EInfo: JSON output
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?retmode=json
      {
        "header": {
          "type": "einfo",
          "version": "0.3"
        },
        "einforesult": {
          "dblist": [
            "pubmed",
            "protein",
            "nuccore",
            (...)
            "unigene",
            "gencoll",
            "gtr"
          ]
        }
      }
  • 28. EInfo
      Return statistics for a given Entrez database:
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=DbName
  • 29. EInfo: statistics for PubMed
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=pubmed
      <?xml version="1.0"?>
      <eInfoResult>
      <DbInfo>
        <DbName>pubmed</DbName>
        <MenuName>PubMed</MenuName>
        <Description>PubMed bibliographic record</Description>
        <DbBuild>Build130805-2117m.4</DbBuild>
        <Count>22974581</Count>
        <LastUpdate>2013/08/06 08:33</LastUpdate>
        <FieldList>
          (...)
          <Field>
            <Name>UID</Name>
            <FullName>UID</FullName>
            <Description>Unique number assigned to publication</Description>
            <TermCount>0</TermCount>
            <IsDate>N</IsDate>
            <IsNumerical>Y</IsNumerical>
            <SingleToken>Y</SingleToken>
            <Hierarchy>N</Hierarchy>
            <IsHidden>Y</IsHidden>
          </Field>
          <Field>
          (...)
  • 30. EInfo: statistics for PubMed (JSON)
      https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db=pubmed&retmode=json
      {
        "header": {
          "type": "einfo",
          "version": "0.3"
        },
        "einforesult": {
          "dbinfo": {
            "dbname": "pubmed",
            "menuname": "PubMed",
            "description": "PubMed bibliographic record",
            "dbbuild": "Build160921-2207m.6",
            "count": "26470199",
            "lastupdate": "2016/09/22 16:32",
            "fieldlist": [
              {
                "name": "ALL",
                "fullname": "All Fields",
                "description": "All terms from all searchable fields",
                "termcount": "179424126",
                "isdate": "N",
                "isnumerical": "N",
                "singletoken": "N",
                "hierarchy": "N",
                "ishidden": "N"
              },
              {
                "name": "UID",
                "fullname": "UID",
                "description": "Unique number assigned to publication",
  • 31. EInfo: with entrez-direct
      $ einfo -dbs
      $ einfo -db pubmed
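As a complement to the EInfo slides, here is a small Python sketch that builds the EInfo URL in JSON mode and reads the database list from a response. The function names `einfo_url` and `database_names` are hypothetical, and `SAMPLE` is a shortened copy of the JSON shown on slide 27 (only its first three databases), not a live response.

```python
import json
from urllib.parse import urlencode

EINFO = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi"


def einfo_url(db=None):
    """Build an EInfo URL in JSON mode; with db=None it lists all databases."""
    params = {"retmode": "json"}
    if db is not None:
        params["db"] = db
    return EINFO + "?" + urlencode(params)


def database_names(einfo_json):
    """Extract the list of database names from an EInfo JSON document."""
    return json.loads(einfo_json)["einforesult"]["dblist"]


# Shortened copy of the slide 27 response (assumption: same structure as the live one).
SAMPLE = """{"header": {"type": "einfo", "version": "0.3"},
 "einforesult": {"dblist": ["pubmed", "protein", "nuccore"]}}"""

names = database_names(SAMPLE)
print(einfo_url())          # list of all databases
print(einfo_url("pubmed"))  # statistics for PubMed
print(names)
```

Parsing the JSON form avoids an XML parser altogether, which is probably the simplest route when only the database names are needed.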
  • 32. GQuery
  • 33. GQuery
      Provides the number of records retrieved in all Entrez databases by a single text query.
  • 34. GQuery: example
      $ curl "https://eutils.ncbi.nlm.nih.gov/gquery?term=tyrannosaurus%20rex&retmode=xml"
      <Result>
      <Term>tyrannosaurus rex</Term>
      <eGQueryResult>
      <ResultItem><DbName>pubmed</DbName><MenuName/><Count>41</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>pmc</DbName><MenuName/><Count>160</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>mesh</DbName><MenuName/><Count>15</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>books</DbName><MenuName/><Count>179</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>pubmedhealth</DbName><MenuName/><Count>21</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>omim</DbName><MenuName/><Count>10</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>omia</DbName><MenuName/><Count>0</Count><Status>Term or Database is not found</Status></ResultItem>
      <ResultItem><DbName>ncbisearch</DbName><MenuName/><Count>1</Count><Status>Ok</Status></ResultItem>
      <ResultItem><DbName>nuccore</DbName><MenuName/><Count>0</Count><Status>Term or Database is not found</Status></ResultItem>
      (...)
  • 35. GQuery: transforming to HTML using XSLT
      The XSLT stylesheet: https://raw.githubusercontent.com/lindenb/courses/master/about.ncbi/gquery2html.xsl
      <?xml version='1.0' encoding="UTF-8" ?>
      <xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
      <xsl:output method="html"/>
      <xsl:template match="/"><html><body>
      <xsl:apply-templates select="Result"/>
      </body></html></xsl:template>
      <xsl:template match="Result">
      <table><caption><xsl:value-of select="Term"/></caption>
      <tr><th>Database</th><th>Count</th><th>Status</th></tr>
      <xsl:apply-templates select="eGQueryResult/ResultItem"/>
      </table>
      </xsl:template>
      <xsl:template match="ResultItem">
      <tr>
      <td><a>
      <xsl:attribute name="href">http://www.ncbi.nlm.nih.gov/<xsl:value-of select="DbName"/>?cmd=search&amp;term=<xsl:value-of select="translate(/Result/Term,' ','+')"/></xsl:attribute>
      <xsl:value-of select="DbName"/></a></td>
      <td><xsl:value-of select="Count"/></td>
      <td><xsl:value-of select="Status"/></td>
      </tr>
      </xsl:template>
      </xsl:stylesheet>
  • 36. GQuery: transforming to HTML
      $ curl "https://eutils.ncbi.nlm.nih.gov/gquery?term=tyrannosaurus%20rex&retmode=xml" | xsltproc gquery2html.xsl -
      <html>
      <body>
      <table>
      <caption>tyrannosaurus rex</caption>
      <tr>
      <th>Database</th>
      <th>Count</th>
      <th>Status</th>
      </tr>
      <tr>
      <td>
      <a href="https://www.ncbi.nlm.nih.gov/pubmed?cmd=search&amp;term=tyrannosaurus
      </td>
      <td>41</td>
      <td>Ok</td>
      </tr>
      <tr>
      <td>
      <a href="https://www.ncbi.nlm.nih.gov/pmc?cmd=search&amp;term=tyrannosaurus+re
      </td>
      <td>160</td>
      <td>Ok</td>
      </tr>
      <tr>
      <td>
      <a href="https://www.ncbi.nlm.nih.gov/mesh?cmd=search&amp;term=tyrannosaurus+r
      </td>
      <td>15</td>
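The same GQuery XML can also be read without XSLT. The sketch below, offered only as an alternative to the stylesheet approach of the slides, uses Python's standard `xml.etree.ElementTree` to keep the databases whose status is `Ok`. The function name `gquery_counts` is hypothetical, and `SAMPLE` is a three-item excerpt of the response shown on slide 34.

```python
import xml.etree.ElementTree as ET

# Excerpt of the GQuery response from slide 34 (three of the ResultItem elements).
SAMPLE = """<Result>
<Term>tyrannosaurus rex</Term>
<eGQueryResult>
<ResultItem><DbName>pubmed</DbName><MenuName/><Count>41</Count><Status>Ok</Status></ResultItem>
<ResultItem><DbName>pmc</DbName><MenuName/><Count>160</Count><Status>Ok</Status></ResultItem>
<ResultItem><DbName>omia</DbName><MenuName/><Count>0</Count><Status>Term or Database is not found</Status></ResultItem>
</eGQueryResult>
</Result>"""


def gquery_counts(xml_text):
    """Map database name to hit count, keeping only items with Status 'Ok'."""
    root = ET.fromstring(xml_text)
    counts = {}
    for item in root.iterfind("eGQueryResult/ResultItem"):
        if item.findtext("Status") == "Ok":
            counts[item.findtext("DbName")] = int(item.findtext("Count"))
    return counts


counts = gquery_counts(SAMPLE)
print(counts)
```

The path `eGQueryResult/ResultItem` is the same one selected by the stylesheet's `xsl:apply-templates`, so both approaches walk the document in the same way.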
  • 37. ESearch
  • 38. ESearch
      Provides a list of UIDs matching a text query
      Posts the results of a search on the History server
      Downloads all UIDs from a dataset stored on the History server
      Combines or limits UID datasets stored on the History server
      Sorts sets of UIDs
  • 39. ESearch: syntax
      Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi
  • 40. ESearch: searching for 'Mammuthus primigenius'
      curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D" | xmllint --format -
      <eSearchResult>
        <Count>684</Count>
        <RetMax>20</RetMax>
        <RetStart>0</RetStart>
        <IdList>
          <Id>507866428</Id>
          <Id>124056416</Id>
          <Id>383843869</Id>
          <Id>383843867</Id>
          <Id>383843865</Id>
          <Id>383843863</Id>
          <Id>383843861</Id>
          <Id>383843859</Id>
          <Id>383843857</Id>
          <Id>383843855</Id>
          <Id>383843853</Id>
          <Id>383843851</Id>
          <Id>383843849</Id>
          <Id>383843847</Id>
          <Id>383843845</Id>
          <Id>157367690</Id>
          <Id>157367676</Id>
          <Id>157367662</Id>
          <Id>157367648</Id>
          <Id>157367634</Id>
        </IdList>
        <TranslationSet>
          <Translation>
  • 41. ESearch: searching for 'Mammuthus primigenius' (JSON)
      curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&retmode=json"
      {
        "header": {
          "type": "esearch",
          "version": "0.3"
        },
        "esearchresult": {
          "count": "811",
          "retmax": "20",
          "retstart": "0",
          "idlist": [
            "1059791223",
            "198241525",
            "198241523",
            "198241521",
            "198241519",
            "198241517",
            "198241515",
            "198241513",
            "198241511",
            "198241509",
            "198241507",
            "198241505",
            "198241503",
            "198241501",
            "198241499",
            "198241497",
            "198241495",
            "198241493",
            "198241491",
  • 42. ESearch: the retmax parameter
      curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&retmax=2" | xmllint --format -
      <eSearchResult>
        <Count>684</Count>
        <RetMax>2</RetMax>
        <RetStart>0</RetStart>
        <IdList>
          <Id>507866428</Id>
          <Id>124056416</Id>
        </IdList>
        <TranslationSet>
          <Translation>
            <From>"Mammuthus primigenius"[ORGN]</From>
            <To>"Mammuthus primigenius"[Organism]</To>
          </Translation>
        </TranslationSet>
        <TranslationStack>
          <TermSet>
            <Term>"Mammuthus primigenius"[Organism]</Term>
            <Field>Organism</Field>
            <Count>684</Count>
            <Explode>Y</Explode>
          </TermSet>
          <OP>GROUP</OP>
        </TranslationStack>
        <QueryTranslation>"Mammuthus primigenius"[Organism]</QueryTranslation>
      </eSearchResult>
  • 43. ESearch: the retstart parameter
      curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&retmax=3&retstart=100" | xmllint --format -
      <eSearchResult>
        <Count>684</Count>
        <RetMax>3</RetMax>
        <RetStart>100</RetStart>
        <IdList>
          <Id>300810656</Id>
          <Id>300810655</Id>
          <Id>300810654</Id>
        </IdList>
        <TranslationSet>
          <Translation>
            <From>"Mammuthus primigenius"[ORGN]</From>
            <To>"Mammuthus primigenius"[Organism]</To>
          </Translation>
        </TranslationSet>
        <TranslationStack>
          <TermSet>
            <Term>"Mammuthus primigenius"[Organism]</Term>
            <Field>Organism</Field>
            <Count>684</Count>
            <Explode>Y</Explode>
          </TermSet>
          <OP>GROUP</OP>
        </TranslationStack>
        <QueryTranslation>"Mammuthus primigenius"[Organism]</QueryTranslation>
      </eSearchResult>
  • 44. ESearch: rettype=count
      curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&rettype=count" | xmllint --format -
      <eSearchResult>
        <Count>684</Count>
      </eSearchResult>
  • 45. ESearch: sort=Date Released
      curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=%22Mammuthus%20primigenius%22%5BORGN%5D&sort=Date+Released" | xmllint --format -
      <eSearchResult><Count>811</Count><RetMax>20</RetMax>
      <Id>1033204644</Id>
      <Id>1033204658</Id>
      <Id>1033204672</Id>
      <Id>1033204686</Id>
      <Id>1033204729</Id>
      <Id>1033204771</Id>
      <Id>1033204785</Id>
      <Id>1033204799</Id>
      <Id>1033204813</Id>
      <Id>1033204827</Id>
      <Id>1033204871</Id>
      <Id>1033205124</Id>
      <Id>1033205194</Id>
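The `retmax` and `retstart` slides suggest a simple way to walk through a full result set: read `Count` once, then request successive windows of UIDs. The Python sketch below only generates the URLs for such a walk. The generator name `esearch_pages` and the page size of 200 are illustrative assumptions, and the total of 684 is the `Count` value from the slides.

```python
from urllib.parse import quote, urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_pages(db, term, total, page_size=100):
    """Yield one ESearch URL per window of `page_size` UIDs (hypothetical helper)."""
    for start in range(0, total, page_size):
        params = {"db": db, "term": term, "retstart": start, "retmax": page_size}
        # quote_via=quote encodes spaces as %20, as in the URLs on the slides.
        yield ESEARCH + "?" + urlencode(params, quote_via=quote)


# 684 is the <Count> reported on the slides for this query.
urls = list(esearch_pages("nucleotide", '"Mammuthus primigenius"[ORGN]',
                          total=684, page_size=200))
for url in urls:
    print(url)
```

For large result sets the History server mentioned on slide 38 is likely the better tool; this sketch covers only the plain `retstart`/`retmax` windowing.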
  • 46. ESummary Pierre Lindenbaum@yokofakun pierre.lindenbaum@univ-nantes.fr http://plindenbaum.blogspot.comAdvanced NCBI.The Entrez APIhttps://github.com/lindenb/cour
  • 47. ESummary Syntax Returns document summaries (DocSums) for a list of input UIDs Returns DocSums for a set of UIDs stored on the Entrez History server Pierre Lindenbaum@yokofakun pierre.lindenbaum@univ-nantes.fr http://plindenbaum.blogspot.comAdvanced NCBI.The Entrez APIhttps://github.com/lindenb/cour
  • 48. ESummary Syntax Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/ eutils/esummary.fcgi?db=(DB)&id=(TERM) Pierre Lindenbaum@yokofakun pierre.lindenbaum@univ-nantes.fr http://plindenbaum.blogspot.comAdvanced NCBI.The Entrez APIhttps://github.com/lindenb/cour
  • 49. ESummary Retrieve nucleotide gi=507866428 $ c u r l ” h t t p s :// e u t i l s . ncbi . nlm . nih . gov/ e n t r e z / e u t i l s /esummary . f c g i ?db= n u c l e o t i d e&i d =507866428” <eSummaryResult> <DocSum> <Id>507866428</ Id> <Item Name=” Caption ” Type=” S t r i n g ”>KC524742</ Item> <Item Name=” T i t l e ” Type=” S t r i n g ”>Mammuthus p r i m i g e n i u s i s o l a t e CME2005/915 myoglobin (Mb <Item Name=” Extra ” Type=” S t r i n g ”>g i |507866428| gb | KC524742 . 1 | [ 5 0 7 8 6 6 4 2 8 ]</ Item> <Item Name=” Gi ” Type=” I n t e g e r ”>507866428</ Item> <Item Name=” CreateDate ” Type=” S t r i n g ”>2013/06/15</ Item> <Item Name=”UpdateDate” Type=” S t r i n g ”>2013/06/21</ Item> <Item Name=” Flags ” Type=” I n t e g e r ”>0</ Item> <Item Name=” TaxId ” Type=” I n t e g e r ”>37349</ Item> <Item Name=” Length ” Type=” I n t e g e r ”>9042</ Item> <Item Name=” Status ” Type=” S t r i n g ”>l i v e</ Item> <Item Name=” ReplacedBy ” Type=” S t r i n g ”></ Item> <Item Name=”Comment” Type=” S t r i n g ”><! [CDATA[ ] ]></ Item> </DocSum> </ eSummaryResult> Pierre Lindenbaum@yokofakun pierre.lindenbaum@univ-nantes.fr http://plindenbaum.blogspot.comAdvanced NCBI.The Entrez APIhttps://github.com/lindenb/cour
  • 50. ESummary Retrieve nucleotide gi=507866428 in JSON $ c u r l ” h t t p s :// e u t i l s . ncbi . nlm . nih . gov/ e n t r e z / e u t i l s /esummary . f c g i ?db= n u c l e o t i d e&i d =507866428& retmode=j s o n ” { ” header ”: { ” type ”: ”esummary ” , ” v e r s i o n ”: ”0.3” } , ” r e s u l t ”: { ” u i d s ”: [ ”507866428” ] , ”507866428”: { ” uid ”: ”507866428” , ” c a p t i o n ”: ”KC524742 ” , ” t i t l e ”: ”Mammuthus p r i m i g e n i u s i s o l a t e CME2005/915 myoglobin (Mb) gene , p a r ” e x t r a ”: ” g i |507866428| gb | KC524742 . 1 | ” , ” g i ”: 507866428 , ” c r e a t e d a t e ”: ”2013/06/15” , ” updatedate ”: ”2013/06/21” , ” f l a g s ”: ”” , ” t a x i d ”: 37349 , ” s l e n ”: 9042 , ” biomol ”: ” genomic ” , ” moltype ”: ”dna ” , ” topology ”: ” l i n e a r ” , ” sourcedb ”: ” i n s d ” , ” s e g s e t s i z e ”: ”” , ” p r o j e c t i d ”: ”0” , ( . . . ) Pierre Lindenbaum@yokofakun pierre.lindenbaum@univ-nantes.fr http://plindenbaum.blogspot.comAdvanced NCBI.The Entrez APIhttps://github.com/lindenb/cour
  • 51. ESummary: Retrieve snp rs25
    $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=snp&id=25"
    <eSummaryResult>
    <DocSum>
      <Id>25</Id>
      <Item Name="SNP_ID" Type="Integer">25</Item>
      <Item Name="Organism" Type="String"></Item>
      <Item Name="ALLELE_ORIGIN" Type="String"></Item>
      <Item Name="GLOBAL_MAF" Type="String">0.4913</Item>
      <Item Name="GLOBAL_POPULATION" Type="String"></Item>
      <Item Name="GLOBAL_SAMPLESIZE" Type="Integer">0</Item>
      <Item Name="SUSPECTED" Type="String"></Item>
      <Item Name="CLINICAL_SIGNIFICANCE" Type="String"></Item>
      <Item Name="GENE" Type="String">THSD7A</Item>
      <Item Name="LOCUS_ID" Type="Integer">221981</Item>
      <Item Name="ACC" Type="String">NM_015204.2,NT_007819.17</Item>
      <Item Name="CHR" Type="String">7</Item>
      <Item Name="WEIGHT" Type="Integer">1</Item>
      <Item Name="HANDLE" Type="String">1000GENOMES,BGI,BL,BUSHMAN,COMPLETE_GENOMICS,CSHL-HAPM (...)
      <Item Name="FXN_CLASS" Type="String">intron-variant</Item>
      <Item Name="VALIDATED" Type="String">by-1000G,by-cluster,by-frequency,by-hapmap</Item>
      <Item Name="GTYPE" Type="String">true</Item>
      <Item Name="NONREF" Type="String">false</Item>
      <Item Name="DOCSUM" Type="String">HGVS=NC_000007.13:g.11584142T&gt;C,NG_027670.1:g.29268 (...)
      <Item Name="HET" Type="Integer">50</Item>
      <Item Name="SRATE" Type="Integer">0</Item>
      <Item Name="TAX_ID" Type="Integer">9606</Item>
      <Item Name="CHRRPT" Type="String">25|2|0|1|1|1|7|NT_007819.17|11574141|11584142|THSD7A|0 (...)
      <Item Name="ORIG_BUILD" Type="Integer">36</Item>
      <Item Name="UPD_BUILD" Type="Integer">138</Item>
      <Item Name="CREATEDATE" Type="String">2000-09-19 17:02</Item>
  • 52. ESummary: Retrieve pubmed pmid=7939126
    $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=7939126"
    <eSummaryResult>
    <DocSum>
      <Id>7939126</Id>
      <Item Name="PubDate" Type="Date">1994 Apr</Item>
      <Item Name="EPubDate" Type="Date"></Item>
      <Item Name="Source" Type="String">Sleep</Item>
      <Item Name="AuthorList" Type="List">
        <Item Name="Author" Type="String">Broughton R</Item>
        <Item Name="Author" Type="String">Billings R</Item>
        <Item Name="Author" Type="String">Cartwright R</Item>
        <Item Name="Author" Type="String">Doucette D</Item>
        <Item Name="Author" Type="String">Edmeads J</Item>
        <Item Name="Author" Type="String">Edwardh M</Item>
        <Item Name="Author" Type="String">Ervin F</Item>
        <Item Name="Author" Type="String">Orchard B</Item>
        <Item Name="Author" Type="String">Hill R</Item>
        <Item Name="Author" Type="String">Turrell G</Item>
      </Item>
      <Item Name="LastAuthor" Type="String">Turrell G</Item>
      <Item Name="Title" Type="String">Homicidal somnambulism: a case report.</Item>
      <Item Name="Volume" Type="String">17</Item>
      <Item Name="Issue" Type="String">3</Item>
      <Item Name="Pages" Type="String">253-64</Item>
      <Item Name="LangList" Type="List">
        <Item Name="Lang" Type="String">English</Item>
      </Item>
      <Item Name="NlmUniqueID" Type="String">7809084</Item>
      <Item Name="ISSN" Type="String">0161-8105</Item>
      <Item Name="ESSN" Type="String">1550-9109</Item>
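In the XML flavour of ESummary every field is an `<Item>` with a `Name` and a `Type`, and items of type `List` contain further items. A minimal sketch of how such a `<DocSum>` could be flattened into a dictionary, assuming a shortened copy of the PubMed answer above (`docsum_to_dict` and `item_value` are illustrative helper names):

```python
import xml.etree.ElementTree as ET

# Shortened copy of the DocSum shown on the slide (assumption: only two
# authors and a few fields are kept).
SAMPLE = """<eSummaryResult>
<DocSum>
  <Id>7939126</Id>
  <Item Name="PubDate" Type="Date">1994 Apr</Item>
  <Item Name="EPubDate" Type="Date"></Item>
  <Item Name="AuthorList" Type="List">
    <Item Name="Author" Type="String">Broughton R</Item>
    <Item Name="Author" Type="String">Billings R</Item>
  </Item>
  <Item Name="Title" Type="String">Homicidal somnambulism: a case report.</Item>
</DocSum>
</eSummaryResult>"""


def item_value(item):
    """Return a list for Type="List" items, otherwise the text ('' if empty)."""
    if item.get("Type") == "List":
        return [item_value(child) for child in item.findall("Item")]
    return item.text or ""


def docsum_to_dict(docsum):
    """Map each top-level Item name of a DocSum to its value."""
    record = {"Id": docsum.findtext("Id")}
    for item in docsum.findall("Item"):
        record[item.get("Name")] = item_value(item)
    return record


summary = docsum_to_dict(ET.fromstring(SAMPLE).find("DocSum"))
print(summary["Title"], summary["AuthorList"])
```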
  • 53. EFetch
  • 54. EFetch: Syntax
    Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=(db)&id=(ID)
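The EFetch URLs used on the next slides differ only in their query parameters, so they can be assembled rather than typed by hand. A minimal sketch using the standard library only; `efetch_url` is an illustrative helper, while `db`, `id`, `rettype` and `retmode` are the parameter names that appear in the slides' own URLs.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, uid, rettype=None, retmode=None):
    """Build an EFetch URL; optional parameters are added only when given."""
    params = {"db": db, "id": uid}
    if rettype:
        params["rettype"] = rettype
    if retmode:
        params["retmode"] = retmode
    return EFETCH + "?" + urlencode(params)


fasta_url = efetch_url("nucleotide", "507866428", rettype="fasta")
tinyseq_url = efetch_url("nucleotide", "507866428", rettype="fasta", retmode="xml")
print(fasta_url)
print(tinyseq_url)
```

`urlencode` also takes care of escaping, which matters for terms such as `extinct[PROP]`.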
  • 55. EFetch: Retrieve nucleotide gi=507866428 as ASN.1 (default)
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428
    Seq-entry ::= set {
      class nuc-prot,
      descr {
        source {
          genome genomic,
          org {
            taxname "Mammuthus primigenius",
            common "woolly mammoth",
            db {
              {
                db "taxon",
                tag id 37349
              }
            },
            orgname {
              name binomial {
                genus "Mammuthus",
                species "primigenius"
              },
              mod {
                {
  • 56. EFetch: Retrieve nucleotide gi=507866428 as Fasta
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&rettype=fasta
    >gi|507866428|gb|KC524742.1| Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene, partial cds
    GCACTTGCTTTTTTTGTCTTCTTCAGACCACGACATGGGACTCAGCGACGGGGAATGGGAGTTGGTGTTG
    AAAACCTGGGGGAAAGTGGAGGCTGACATCCCGGGCCATGGGCTGGAAGTCTTCGTCAGGTAAAGGAAGA
    AATCCTGTGGCCCCCATCACCCACCCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  • 57. EFetch: Retrieve nucleotide gi=507866428 as TinySeq
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&rettype=fasta&retmode=xml
    <?xml version="1.0"?>
    <!DOCTYPE TSeqSet PUBLIC "-//NCBI//NCBI TSeq/EN" (...)
    <TSeqSet>
    <TSeq>
      <TSeq_seqtype value="nucleotide"/>
      <TSeq_gi>507866428</TSeq_gi>
      <TSeq_accver>KC524742.1</TSeq_accver>
      <TSeq_taxid>37349</TSeq_taxid>
      <TSeq_orgname>Mammuthus primigenius</TSeq_orgnam (...)
      <TSeq_defline>Mammuthus primigenius isolate CME2 (...)
      <TSeq_length>9042</TSeq_length>
      <TSeq_sequence>GCACTTGCTTTTTTTGTCTTCTTCAGACCACGA (...)
    </TSeq>
    </TSeqSet>
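TinySeq is the easiest sequence format to parse because each piece of the FASTA header is a separate element. A minimal sketch that turns a TinySeq document back into FASTA text, assuming a shortened copy of the answer above (the sequence is cut where the slide cuts it; `tinyseq_to_fasta` is an illustrative helper):

```python
import xml.etree.ElementTree as ET

# Shortened TinySeq record (assumption: sequence truncated as on the slide,
# DOCTYPE and the truncated defline left out).
SAMPLE = """<TSeqSet>
<TSeq>
  <TSeq_seqtype value="nucleotide"/>
  <TSeq_gi>507866428</TSeq_gi>
  <TSeq_accver>KC524742.1</TSeq_accver>
  <TSeq_taxid>37349</TSeq_taxid>
  <TSeq_orgname>Mammuthus primigenius</TSeq_orgname>
  <TSeq_sequence>GCACTTGCTTTTTTTGTCTTCTTCAGACCACGA</TSeq_sequence>
</TSeq>
</TSeqSet>"""


def tinyseq_to_fasta(xml_text, width=70):
    """Return FASTA text with one record per TSeq, wrapped at `width` bases."""
    lines = []
    for tseq in ET.fromstring(xml_text).iter("TSeq"):
        accession = tseq.findtext("TSeq_accver")
        organism = tseq.findtext("TSeq_orgname")
        sequence = tseq.findtext("TSeq_sequence")
        lines.append(f">{accession} {organism}")
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines)


fasta = tinyseq_to_fasta(SAMPLE)
print(fasta)
```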
  • 58. EFetch: Retrieve nucleotide gi=507866428 as Genbank-xml
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&retmode=xml
    <GBSeq>
      <GBSeq_locus>KC524742</GBSeq_locus>
      <GBSeq_length>9042</GBSeq_length>
      <GBSeq_strandedness>double</GBSeq_strandedness>
      <GBSeq_moltype>DNA</GBSeq_moltype>
      <GBSeq_topology>linear</GBSeq_topology>
      <GBSeq_division>MAM</GBSeq_division>
      <GBSeq_update-date>21-JUN-2013</GBSeq_update-date>
      <GBSeq_create-date>15-JUN-2013</GBSeq_create-date>
      <GBSeq_definition>Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene, parti (...)
      <GBSeq_primary-accession>KC524742</GBSeq_primary-accession>
      <GBSeq_accession-version>KC524742.1</GBSeq_accession-version>
      <GBSeq_other-seqids>
        <GBSeqid>gb|KC524742.1|</GBSeqid>
        <GBSeqid>gi|507866428</GBSeqid>
      </GBSeq_other-seqids>
      <GBSeq_source>Mammuthus primigenius (woolly mammoth)</GBSeq_source>
      <GBSeq_organism>Mammuthus primigenius</GBSeq_organism>
      (...)
  • 59. EFetch: Retrieve nucleotide gi=507866428 as Genbank
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=507866428&rettype=gb
    LOCUS       KC524742                9042 bp    DNA     linear   MAM 21-JUN-2013
    DEFINITION  Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene,
                partial cds.
    ACCESSION   KC524742
    VERSION     KC524742.1  GI:507866428
    KEYWORDS    .
    SOURCE      Mammuthus primigenius (woolly mammoth)
      ORGANISM  Mammuthus primigenius
                Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
                Mammalia; Eutheria; Afrotheria; Proboscidea; Elephantidae;
                Mammuthus.
    REFERENCE   1  (bases 1 to 9042)
      AUTHORS   Mirceta,S., Signore,A.V., Burns,J.M., Cossins,A.R., Campbell,K.L.
                and Berenbrink,M.
      TITLE     Evolution of mammalian diving capacity traced by myoglobin net
                surface charge
      JOURNAL   Science 340 (6138), 1234192 (2013)
       PUBMED   23766330
    REFERENCE   2  (bases 1 to 9042)
      AUTHORS   Signore,A.V., Campbell,K.L. and Poinar,H.N.
      TITLE     Direct Submission
      JOURNAL   Submitted (09-JAN-2013) Biological Sciences, University of
                Manitoba, 50 Sifton Road, Winnipeg, Manitoba R3T2N2, Canada
    COMMENT     ##Assembly-Data-START##
                Sequencing Technology :: Sanger dideoxy sequencing
                ##Assembly-Data-END##
    FEATURES             Location/Qualifiers
         source          1..9042
                         /organism="Mammuthus primigenius"
  • 60. EFetch: EFetch also works with accession numbers
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=KC524742&rettype=gb
    LOCUS       KC524742                9042 bp    DNA     linear   MAM 21-JUN-2013
    DEFINITION  Mammuthus primigenius isolate CME2005/915 myoglobin (Mb) gene,
                partial cds.
    ACCESSION   KC524742
    VERSION     KC524742.1  GI:507866428
    KEYWORDS    .
    SOURCE      Mammuthus primigenius (woolly mammoth)
      ORGANISM  Mammuthus primigenius
                Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
                Mammalia; Eutheria; Afrotheria; Proboscidea; Elephantidae;
                Mammuthus.
    REFERENCE   1  (bases 1 to 9042)
      AUTHORS   Mirceta,S., Signore,A.V., Burns,J.M., Cossins,A.R., Campbell,K.L.
                and Berenbrink,M.
      TITLE     Evolution of mammalian diving capacity traced by myoglobin net
                surface charge
      JOURNAL   Science 340 (6138), 1234192 (2013)
       PUBMED   23766330
    REFERENCE   2  (bases 1 to 9042)
      AUTHORS   Signore,A.V., Campbell,K.L. and Poinar,H.N.
      TITLE     Direct Submission
      JOURNAL   Submitted (09-JAN-2013) Biological Sciences, University of
                Manitoba, 50 Sifton Road, Winnipeg, Manitoba R3T2N2, Canada
    COMMENT     ##Assembly-Data-START##
                Sequencing Technology :: Sanger dideoxy sequencing
                ##Assembly-Data-END##
    FEATURES             Location/Qualifiers
         source          1..9042
                         /organism="Mammuthus primigenius"
  • 61. EFetch: Using the WebEnv parameter.
    Web environment string returned from a previous ESearch, EPost or ELink call.
    When provided, ESearch will post the results of the search operation to this pre-existing WebEnv.
  • 62. EFetch: Using the WebEnv parameter.
    Searching extinct species in the NCBI taxonomy ('extinct[PROP]')
    $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?usehistory=y&db=taxonomy&term=extinct%5BPROP%5D"
    <eSearchResult>
      <Count>145</Count>
      <RetMax>20</RetMax>
      <RetStart>0</RetStart>
      <QueryKey>1</QueryKey>
      <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
      <IdList>
        <Id>1225531</Id>
        <Id>1225530</Id>
        <Id>1211276</Id>
        <Id>1211275</Id>
        <Id>1027716</Id>
        <Id>948961</Id>
        <Id>943952</Id>
        <Id>867394</Id>
        <Id>867393</Id>
        <Id>748142</Id>
        <Id>748141</Id>
        <Id>741158</Id>
        <Id>703576</Id>
        <Id>703571</Id>
        <Id>703559</Id>
        <Id>693865</Id>
        <Id>686441</Id>
        <Id>665113</Id>
        <Id>659069</Id>
        <Id>656807</Id>
  • 63. EFetch: Using the WebEnv parameter.
    Fetch the extinct species in the NCBI taxonomy ('extinct[PROP]') using the WebEnv parameter.
    $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&query_key=1&WebEnv=NCID_1_75550312_130.14.18.34_9001_1375948145_325582538&retmode=xml"
    <TaxaSet><Taxon>
      <TaxId>1225531</TaxId>
      <ScientificName>Equus ovodovi</ScientificName>
      <OtherNames>
        <Synonym>Equus (Sussemionus) ovodovi</Synonym>
        <Name>
          <ClassCDE>authority</ClassCDE>
          <DispName>Equus ovodovi Eisenmann &amp; Sergej, 2011</DispName>
        </Name>
      </OtherNames>
      <ParentTaxId>1225530</ParentTaxId>
      <Rank>species</Rank>
      <Division>Mammals</Division>
      <GeneticCode>
        <GCId>1</GCId>
        <GCName>Standard</GCName>
      </GeneticCode>
      <MitoGeneticCode>
      (....)
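The two steps above (ESearch with `usehistory=y`, then EFetch with `query_key` and `WebEnv`) chain naturally in a script, and the history handle also makes it possible to download a large result in pages. A minimal sketch, assuming a shortened copy of the ESearch answer from the previous slide; the helper names and the page size of 20 are illustrative choices, and no request is actually sent.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"

# Shortened ESearch answer from the slide (assumption: IdList left out).
ESEARCH_XML = """<eSearchResult>
  <Count>145</Count>
  <RetMax>20</RetMax>
  <RetStart>0</RetStart>
  <QueryKey>1</QueryKey>
  <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
</eSearchResult>"""


def history_handle(xml_text):
    """Return (count, query_key, webenv) from an ESearch answer."""
    root = ET.fromstring(xml_text)
    return (int(root.findtext("Count")),
            root.findtext("QueryKey"),
            root.findtext("WebEnv"))


def efetch_pages(db, count, query_key, webenv, page_size=20):
    """Build one EFetch URL per page of `page_size` records."""
    urls = []
    for start in range(0, count, page_size):
        params = {"db": db, "query_key": query_key, "WebEnv": webenv,
                  "retstart": start, "retmax": page_size, "retmode": "xml"}
        urls.append(BASE + "efetch.fcgi?" + urlencode(params))
    return urls


count, key, webenv = history_handle(ESEARCH_XML)
pages = efetch_pages("taxonomy", count, key, webenv)
print(len(pages), pages[0])
```

Each URL can then be fetched with `curl` or `urllib.request.urlopen`; the identifiers never have to be copied by hand.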
  • 64. EPost
  • 65. EPost
    Uploads a list of UIDs to the Entrez History server
    Appends a list of UIDs to an existing set of UID lists attached to a Web Environment
  • 66. EPost: Post UIDs to EPost
    Get a list of the taxonomy UIDs of extinct animals:
    wget -O - 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=extinct[PROP]&retmax=1000' | xmllint --format - | grep '<Id>' | cut -d '<' -f 2 | cut -d '>' -f 2 | tr "\n" ","
    output:
    1860150,1860149,1849957,1825730,1825729,1636722,1607772,1607771,1607767,1607757,1607756 (...)
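The `grep`/`cut`/`tr` pipeline depends on `xmllint` putting each `<Id>` on its own line. An XML parser does the same extraction without that assumption. A minimal sketch, assuming a three-identifier sample built from the output above (`id_list` is an illustrative helper):

```python
import xml.etree.ElementTree as ET

# Sample ESearch answer (assumption: only the first three UIDs of the
# slide's output are included).
ESEARCH_XML = """<eSearchResult>
  <IdList>
    <Id>1860150</Id>
    <Id>1860149</Id>
    <Id>1849957</Id>
  </IdList>
</eSearchResult>"""


def id_list(xml_text):
    """Return the UIDs of an ESearch answer, in document order."""
    return [elem.text for elem in ET.fromstring(xml_text).findall("IdList/Id")]


uids = id_list(ESEARCH_XML)
joined = ",".join(uids)
print(joined)
```

Unlike `tr "\n" ","`, the join leaves no trailing comma at the end of the list.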
  • 67. EPost: Post UIDs to EPost
    wget -O - 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi?db=taxonomy&WebEnv=NCID_1_15435144_130.14.22.215_9001_1474637318_669113391_0MetA0_S_MegaStore_F_1&id=1860150,1860149,1849957,1825730,1825729,1636722,1607772...'
    Output:
    <?xml version="1.0"?>
    <!DOCTYPE ePostResult PUBLIC "-//NLM//DTD ePostResult, 11 May 2002//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/ePost_020511.dtd">
    <ePostResult>
      <QueryKey>1</QueryKey>
      <WebEnv>NCID_1_15467192_130.14.22.215_9001_1474637456_570452194_0MetA0_S_MegaStore_F_1</WebEnv>
    </ePostResult>
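With hundreds of UIDs the query string becomes very long, so sending the same parameters in the body of an HTTP POST is usually the safer choice. A minimal sketch that only prepares such a request with the standard library; `epost_request` is an illustrative helper, and nothing is sent unless the request is passed to `urllib.request.urlopen`.

```python
from urllib.parse import urlencode
from urllib.request import Request

EPOST = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/epost.fcgi"


def epost_request(db, uids, webenv=None):
    """Prepare (but do not send) an EPost request carrying the UIDs in its body."""
    params = {"db": db, "id": ",".join(str(uid) for uid in uids)}
    if webenv:
        # Append to an existing Web Environment instead of creating a new one.
        params["WebEnv"] = webenv
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return Request(EPOST, data=urlencode(params).encode("ascii"))


request = epost_request("taxonomy", [1860150, 1860149, 1849957])
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```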
  • 68. EPost: Searching in the WebEnv
    Search Homo Sapiens in WebEnv?
    $ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Homo%20Sapiens&usehistory=y&WebEnv=NCID_1_75550312_130.14.18.34_9001_1375948145_325582538&query_key=1"
    <eSearchResult>
      <Count>0</Count>
      <RetMax>0</RetMax>
      <RetStart>0</RetStart>
      <QueryKey>8</QueryKey>
      <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
      <IdList/>
      <TranslationSet/>
      <TranslationStack>
        <OP>GROUP</OP>
        <TermSet>
          <Term>homo sapiens[All Names]</Term>
          <Field>All Names</Field>
          <Count>0</Count>
          <Explode>N</Explode>
        </TermSet>
        <OP>AND</OP>
      </TranslationStack>
      <QueryTranslation>(#2) AND homo sapiens[All Names]</QueryTranslation>
    </eSearchResult>
  • 69. EPost: Searching in the WebEnv
    Search Tyrannosaurus in WebEnv?
    $ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Tyrannosaurus&usehistory=y&WebEnv=NCID_1_75550312_130.14.18.34_9001_1375948145_325582538&query_key=1"
    <eSearchResult>
      <Count>1</Count>
      <RetMax>1</RetMax>
      <RetStart>0</RetStart>
      <QueryKey>9</QueryKey>
      <WebEnv>NCID_1_75550312_130.14.18.34_9001_1375948145_325582538</WebEnv>
      <IdList>
        <Id>436494</Id>
      </IdList>
      <TranslationSet/>
      <TranslationStack>
        <OP>GROUP</OP>
        <TermSet>
          <Term>Tyrannosaurus[All Names]</Term>
          <Field>All Names</Field>
          <Count>1</Count>
          <Explode>N</Explode>
        </TermSet>
        <OP>AND</OP>
      </TranslationStack>
      <QueryTranslation>(#2) AND Tyrannosaurus[All Names]</QueryTranslation>
    </eSearchResult>
  • 70. EDirect: combining tools
  • 71. Piping EDirect
    esearch -db taxonomy -query "Tyrannosaurus" | efetch -format xml
  • 72. Piping EDirect
    esearch -db pubmed -query "Tyrannosaurus" | efilter -mindate 2005 | efetch -format docsum | xtract -pattern DocumentSummary -element MedlineCitation/PMID -element Id SortFirstAuthor
  • 73. ELink
  • 74. ELink
    Returns UIDs linked to an input set of UIDs in either the same or a different Entrez database
    Returns UIDs linked to other UIDs in the same Entrez database that match an Entrez query
    Checks for the existence of Entrez links for a set of UIDs within the same database
    Lists the available links for a UID
    Lists LinkOut URLs and attributes for a set of UIDs
    Lists hyperlinks to primary LinkOut providers for a set of UIDs
    Creates hyperlinks to the primary LinkOut provider for a single UID
  • 75. ELink
    Base URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi
  • 76. ELink: Searching the pubmed records associated with sequence gi:507866428
    https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=nucleotide&db=pubmed&id=507866428&cmd=neighbor_score
    <eLinkResult>
      <LinkSet>
        <DbFrom>nuccore</DbFrom>
        <IdList>
          <Id>507866428</Id>
        </IdList>
        <LinkSetDb>
          <DbTo>pubmed</DbTo>
          <LinkName>nuccore_pubmed</LinkName>
          <Link>
            <Id>23766330</Id>
            <Score>0</Score>
          </Link>
        </LinkSetDb>
      </LinkSet>
    </eLinkResult>
    $ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=23766330&rettype=medline&retmode=text"
    PMID- 23766330
    TI  - Evolution of mammalian diving capacity traced by myoglobin net surface charge.
    PG  - 1234192
    LID - 10.1126/science.1234192 [doi]
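An ELink answer may contain several `<LinkSetDb>` blocks, one per link name, so it helps to group the linked UIDs by `<LinkName>` before passing them to EFetch. A minimal sketch, assuming the answer shown above as sample input (`linked_ids` is an illustrative helper):

```python
import xml.etree.ElementTree as ET

# The ELink answer shown on the slide.
ELINK_XML = """<eLinkResult>
  <LinkSet>
    <DbFrom>nuccore</DbFrom>
    <IdList><Id>507866428</Id></IdList>
    <LinkSetDb>
      <DbTo>pubmed</DbTo>
      <LinkName>nuccore_pubmed</LinkName>
      <Link><Id>23766330</Id><Score>0</Score></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>"""


def linked_ids(xml_text):
    """Map each LinkName to the list of linked UIDs."""
    links = {}
    for link_set_db in ET.fromstring(xml_text).iter("LinkSetDb"):
        name = link_set_db.findtext("LinkName")
        ids = [link.findtext("Id") for link in link_set_db.findall("Link")]
        links.setdefault(name, []).extend(ids)
    return links


links = linked_ids(ELINK_XML)
print(links)
```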
  • 77. Transformations
  • 78. EFetch: Transforming to SVG
    Using the stylesheet https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/ncbi/gb2svg.xsl
    xsltproc <(curl "https://raw.github.com/lindenb/xslt-sandbox/master/stylesheets/bio/ncbi/gb2svg.xsl") "https://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=14971102&retmode=xml&rettype=gbc"
  • 79. EFetch: Transforming to SVG
    <?xml version="1.0" encoding="UTF-8"?>
    <svg:svg xmlns:svg="http://www.w3.org/2000/svg" height="121" width="920" style="stroke-width:1px;">
      <svg:title>Human rotavirus segment 7 NSP3 gene, complete cds</svg:title>
      <svg:defs>
        <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="grad">
          <svg:stop offset="5%" stop-color="black"/>
          <svg:stop offset="50%" stop-color="whitesmoke"/>
          <svg:stop offset="95%" stop-color="black"/>
        </svg:linearGradient>
        <svg:linearGradient x1="0%" y1="0%" x2="0%" y2="100%" id="verticalbodygradient">
          <svg:stop offset="5%" stop-color="white"/>
          <svg:stop offset="95%" stop-color="lightgray"/>
        </svg:linearGradient>
      </svg:defs>
      <svg:style type="text/css"/>
      <svg:g>
        <svg:g transform="translate(0,0)">
          <svg:rect x="0" y="0" width="920" height="120" fill="url(#verticalbodygradient)" stroke="black"/>
          <svg:text style="color:red;font-size:35px;" x="10" y="35">Human rotavirus segment 7 NSP3 gene, complete cds</svg:text>
          <svg:g>
            <svg:rect x="10" y="40" width="900" height="18" style="fill:url(#grad);stroke:black;" title="1..1074"/>
            <svg:text y="54" x="460" text-anchor="middle"><svg:tspan style="font-weight:bold;">source</svg:tspan><svg:tspan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" font-weight="bold">organism</svg:tspan>:Human rotavirus A <svg:tspan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" font-weight="bold">mol_type</svg:tspan>:genomic RNA <svg:tspan xmlns:xsi="http://www. (...)
  • 80. EFetch: Transforming to SVG
  • 81. EFetch: Transforming to R
    $ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=Tyrannosaurus&usehistory=true" | xmllint --format -
    $ curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&usehistory=true&WebEnv=NCID_1_52434791_130.14.22.215_9001_1375957034_1619786167&query_key=1&retmode=xml"
  • 82. EFetch: Transforming to R
    <?xml version='1.0' encoding="UTF-8" ?>
    <xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
    <xsl:output method="text"/>

    <!-- Root template: emits the R script around the per-article lines -->
    <xsl:template match="/">
    date2count &lt;- list()
    <xsl:apply-templates select="/PubmedArticleSet/PubmedArticle[MedlineCitation/DateCreated/Year]"/>
    df &lt;- data.frame(
      Year=as.integer(names(date2count)),
      Count=unlist(date2count)
      )
    png('jeterpubmed.png')
    plot(df)
    title('pubmed: count(articles)=f(year)')
    dev.off()
    </xsl:template>

    <!-- One R statement per article: increment the counter of its year -->
    <xsl:template match="PubmedArticle">
    <xsl:variable name="year" select="MedlineCitation/DateCreated/Year"/>
    date2count[["<xsl:value-of select="$year"/>"]] &lt;- ifelse(is.null(date2count[["<xsl:value-of select="$year"/>"]]),1,1+date2count[["<xsl:value-of select="$year"/>"]])
    </xsl:template>

    </xsl:stylesheet>
  • 83. EFetch: Transforming to R
    $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&usehistory=true&WebEnv=NCID_1_52434791_130.14.22.215_9001_1375957034_1619786167&query_key=1&retmode=xml" | xsltproc pubmed2rstats.xsl -
    date2count <- list()
    date2count[["2013"]] <- ifelse(is.null(date2count[["2013"]]),1,1+date2count[["2013"]])
    date2count[["2012"]] <- ifelse(is.null(date2count[["2012"]]),1,1+date2count[["2012"]])
    date2count[["2012"]] <- ifelse(is.null(date2count[["2012"]]),1,1+date2count[["2012"]])
    date2count[["2011"]] <- ifelse(is.null(date2count[["2011"]]),1,1+date2count[["2011"]])
    date2count[["2011"]] <- ifelse(is.null(date2count[["2011"]]),1,1+date2count[["2011"]])
    (..)
    df <- data.frame(
      Year=as.integer(names(date2count)),
      Count=unlist(date2count)
      )
    png('jeterpubmed.png')
    plot(df)
    title('pubmed: count(articles)=f(year)')
    dev.off()
  • 84. EFetch: Transforming to R
    $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&usehistory=true&WebEnv=NCID_1_52434791_130.14.22.215_9001_1375957034_1619786167&query_key=1&retmode=xml" | xsltproc pubmed2rstats.xsl - | R --no-save
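The stylesheet generates R code that counts articles per `DateCreated/Year`. For comparison, the same tally can be computed directly from the XML. A minimal sketch, assuming a synthetic five-article `PubmedArticleSet` that mirrors the years of the generated R lines above (the sample and the `articles_per_year` helper are illustrative; plotting is left out):

```python
import xml.etree.ElementTree as ET
from collections import Counter


def article(year):
    """Build a minimal synthetic PubmedArticle holding only DateCreated/Year."""
    return ("<PubmedArticle><MedlineCitation><DateCreated>"
            f"<Year>{year}</Year>"
            "</DateCreated></MedlineCitation></PubmedArticle>")


# Synthetic input (assumption): same years as in the R output on the slide.
SAMPLE = ("<PubmedArticleSet>"
          + "".join(article(y) for y in (2013, 2012, 2012, 2011, 2011))
          + "</PubmedArticleSet>")


def articles_per_year(xml_text):
    """Count PubmedArticle elements per DateCreated year, skipping undated ones."""
    root = ET.fromstring(xml_text)
    years = (a.findtext("MedlineCitation/DateCreated/Year")
             for a in root.iter("PubmedArticle"))
    return Counter(int(y) for y in years if y)


counts = articles_per_year(SAMPLE)
for year in sorted(counts):
    print(year, counts[year])
```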
  • 85. Generating a Java parser
  • 86. Using the XML schema
    XML Schema for dbSNP: ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd
    <?xml version="1.0" encoding="UTF-8"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.ncbi.nlm.nih. (...) ementFormDefault="qualified" attributeFormDefault="unqualified">
      <xsd:element name="ExchangeSet">
        <xsd:annotation>
          <xsd:documentation>Set of dbSNP refSNP docsums, version 3.4</xsd:documentation>
        </xsd:annotation>
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="SourceDatabase" minOccurs="0">
              <xsd:complexType>
                <xsd:attribute name="taxId" type="xsd:int" use="required">
                  <xsd:annotation>
                    <xsd:documentation>NCBI taxonomy ID for variation</xsd:documentation>
                  </xsd:annotation>
                </xsd:attribute>
                <xsd:attribute name="organism" type="xsd:string" use="required">
                  <xsd:annotation>
                    <xsd:documentation>common name for species used as part of database name (...)
                  </xsd:annotation>
                </xsd:attribute>
                <xsd:attribute name="dbSnpOrgAbbr" type="xsd:string">
                  <xsd:annotation>
                    <xsd:documentation>organism abbreviation used in dbSNP.</xsd:documentat (...)
                  </xsd:annotation>
                </xsd:attribute>
                <xsd:attribute name="gpipeOrgAbbr" type="xsd:string">
                  <xsd:annotation>
                    <xsd:documentation>organism abbreviation used within NCBI genome pipelin (...)
  • 87. Using the XML schema
    Compiling the XML Schema for dbSNP with XJC
    $ xjc -d . "ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd"
    parsing a schema...
    compiling a schema...
    https/www_ncbi_nlm_nih_gov/snp/docsum/Assay.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/Assembly.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/BaseURL.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/Component.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/ExchangeSet.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/FxnSet.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/MapLoc.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/ObjectFactory.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/PrimarySequence.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/Rs.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/RsLinkout.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/RsStruct.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/Ss.java
    https/www_ncbi_nlm_nih_gov/snp/docsum/package-info.java
  • 88. Using the XML schema
    Compiling the XML Schema for dbSNP with XJC
    Search the non-genomic rs# in dbSNP.
    import https.www_ncbi_nlm_nih_gov.snp.docsum.*;
    import javax.xml.bind.*;
    import javax.xml.stream.*;
    import javax.xml.stream.events.*;

    class ParseDbSnp
    {
        public static void main(String[] args) throws Exception
        {
            // JAXB context for the package generated by xjc
            JAXBContext jaxbCtxt = JAXBContext.newInstance("https.www_ncbi_nlm_nih_gov.snp.docsum");
            Unmarshaller unmarshaller = jaxbCtxt.createUnmarshaller();
            // Stream the XML from stdin rather than loading the whole file
            XMLInputFactory ifactory = XMLInputFactory.newInstance();
            XMLEventReader r = ifactory.createXMLEventReader(System.in);
            while (r.hasNext())
            {
                XMLEvent evt = r.peek();
                // Skip every event that is not the start of an <Rs> element
                if (!(evt.isStartElement() && evt.asStartElement().getName().getLocalPart().equals("Rs")))
                {
                    evt = r.nextEvent();
                    continue;
                }
                // Unmarshal one <Rs> record; this consumes its events from the reader
                Rs rs = unmarshaller.unmarshal(r, Rs.class).getValue();
                if ("genomic".equals(rs.getMolType())) continue;
                System.out.println("rs" + rs.getRsId() + " " + rs.getMolType());
            }
            r.close();
        }
    }
  • 89. Using the XML schema: Compiling the XML Schema for dbSNP with XJC
      compile...
      $ javac ParseDbSnp.java https/www_ncbi_nlm_nih_gov/snp/docsum/*.java
      and run...
      $ curl -s "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/XML/ds_ch1.xml.gz" | gunzip -c | java ParseDbSnp
      rs701 cDNA
      rs860 cDNA
      rs861 cDNA
      rs862 cDNA
      rs863 cDNA
      rs864 cDNA
      rs865 cDNA
      rs866 cDNA
      rs877 cDNA
      rs878 cDNA
      rs879 cDNA
      rs880 cDNA
      rs882 cDNA
      rs883 cDNA
      rs884 cDNA
      rs885 cDNA
      rs886 cDNA
      rs913 cDNA
      rs945 cDNA
      rs946 cDNA
      (...)
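When only a couple of attributes are needed, the same filter can probably be written without generating any classes. The following is a minimal sketch, not the author's code: it reads the stream with the JDK's StAX cursor API and assumes that each `Rs` element carries `rsId` and `molType` attributes, which is what the generated `getRsId()` and `getMolType()` getters suggest. The class name `RsAttributeScan` is hypothetical.

```java
import java.io.InputStream;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams dbSNP XML and reports the non-genomic rs#.
class RsAttributeScan {

    // Sends "rs<ID> <molType>" to the sink for every <Rs> that is not genomic.
    // Assumption: rsId and molType are attributes of <Rs>.
    static void scan(InputStream in, Consumer<String> sink) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (r.hasNext()) {
                if (r.next() != XMLStreamConstants.START_ELEMENT) continue;
                if (!"Rs".equals(r.getLocalName())) continue;
                String molType = r.getAttributeValue(null, "molType");
                if ("genomic".equals(molType)) continue;
                sink.accept("rs" + r.getAttributeValue(null, "rsId") + " " + molType);
            }
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Reads from stdin so it can sit at the end of the curl | gunzip pipe.
        scan(System.in, System.out::println);
    }
}
```

It would be used like `ParseDbSnp`: compile it with `javac` and pipe the decompressed XML into `java RsAttributeScan`. The trade-off is that nothing below the attributes of `Rs` is available as typed objects.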
  • 90. NCBI EBot
  • 91. NCBI EBot
      URL: https://www.ncbi.nlm.nih.gov/Class/PowerTools/eutils/ebot/ebot.cgi
  • 92. NCBI EBot: Sample output
      #!/usr/bin/perl
      (...)
      # PUBLIC DOMAIN NOTICE
      # National Center for Biotechnology Information
      use LWP::Simple;
      use LWP::UserAgent;
      use Net::FTP;
      my $delay = 0;
      my $maxdelay = 3;
      my $base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/";
      $params{email} = "nobody@nowhere.com";
      $params{db} = "nuccore";
      $params{tool} = "ebot";
      $params{term} = "Mammuthus+primigenius[ORGN]";
      %params = esearch(%params);
      $params{retmode} = "xml";
      $params{outfile} = "result.xml";
      $params{rettype} = "native";
      efetch_batch(%params);
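The generated script runs an ESearch and then fetches the hits in batches. As a rough illustration of the two requests involved, here is a sketch that only builds the URLs. It assumes the search is stored on the Entrez history server (`usehistory=y`) and that the `WebEnv` and `query_key` values returned by ESearch are passed on to EFetch. The class `EutilsUrls` and its method names are hypothetical and not part of EBot.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds E-utilities URLs, it does not send any request.
class EutilsUrls {
    static final String BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/";

    // Appends the parameters, URL-encoded, in insertion order.
    static String build(String script, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(BASE).append(script);
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(e.getKey()).append('=').append(encode(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // ESearch with usehistory=y, so that the result set can be fetched later.
    static String esearch(String db, String term, String tool, String email) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("db", db);
        p.put("term", term);
        p.put("usehistory", "y");
        p.put("tool", tool);
        p.put("email", email);
        return build("esearch.fcgi", p);
    }

    // EFetch of one batch [retstart, retstart + retmax) from the stored search.
    static String efetch(String db, String webEnv, String queryKey, int retstart, int retmax) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("db", db);
        p.put("query_key", queryKey);
        p.put("WebEnv", webEnv);
        p.put("retstart", String.valueOf(retstart));
        p.put("retmax", String.valueOf(retmax));
        p.put("retmode", "xml");
        p.put("rettype", "native");
        return build("efetch.fcgi", p);
    }
}
```

For example, `EutilsUrls.esearch("nuccore", "Mammuthus primigenius[ORGN]", "ebot", "nobody@nowhere.com")` gives a URL that can be passed to `curl`. A batch loop would then call `efetch` with increasing `retstart` values.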
  • 93. BLAST
  • 94. Standalone Blast: Downloading
      Standalone tools are available at ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
      # add BLAST to your path
      export PATH=${PATH}:/path/to/ncbi-blast-2.2.28+/bin
  • 95. Standalone Blast: Download a sample of Apis mellifera proteins
      curl -o protein.fa.gz "ftp://ftp.ncbi.nih.gov/genomes/Apis_mellifera/protein/protein.fa.gz"
      gunzip protein.fa.gz
  • 96. Standalone Blast: Create a Blast database with makeblastdb
      Getting help...
      $ makeblastdb -help
      (...)
       -dbtype <String, `nucl', `prot'>
         Molecule type of target db
       -in <File_In>
         Input file/database name
         Default = `-'
       -input_type <String, `asn1_bin', `asn1_txt', `blast
         Type of the data specified in input_file
         Default = `fasta'
      (..)
  • 97. Standalone Blast: Create a Blast database with makeblastdb
      Create the BLAST database:
      $ makeblastdb -in protein.fa -dbtype prot
      Building a new DB, current time: 09/02/2013 18:29:38
      New DB name: protein.fa
      New DB title: protein.fa
      Sequence type: Protein
      Keep Linkouts: T
      Keep MBits: T
      Maximum file size: 1000000000B
      Adding sequences from FASTA; added 10570 sequences
  • 98. Standalone Blast: Query a Blast database with blastp
      Get help:
      $ blastp -help
      (...)
       -query <File_In>
         Input file name
         Default = `-'
       -db <String>
         BLAST database name
      (...)
  • 99. Standalone Blast: Blast human EIF4G1 gi:187956781
      $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&rettype=fasta&id=187956781" | blastp -db protein.fa
      Query= gi|187956781|gb|AAI40897.1| EIF4G1 protein [Homo sapiens]
      (...)
                                                                              Score     E
      Sequences producing significant alignments:                            (Bits)  Value
      gi|328782175|ref|XP_394628.4| PREDICTED: eukaryotic translation...      189    4e-49
      gi|328779480|ref|XP_003249661.1| PREDICTED: hypothetical protei...      38.1   0.017
      gi|110762568|ref|XP_001121713.1| PREDICTED: hypothetical protei...      38.1   0.018
      (...)
      > gi|328782175|ref|XP_394628.4| PREDICTED: eukaryotic translation initiation factor 4 gamma 2-like [Apis mellifera]
      Length=899
       Score = 189 bits (479), Expect = 4e-49, Method: Compositional matrix adjust.
       Identities = 115/319 (36%), Positives = 175/319 (55%), Gaps = 39/319 (12%)
      Query  717  KEPRKIIATVLMTEDIKLNKAEKAWKPSS--KRTAADKDRGEEDADGSKTQDLFRRVRSI  774
                  ++P + +++ +DI+ E+ W P S +R A + S+ +FR+VR I
      Sbjct  22   RKPSETTVGLVIKDDIRSLSTEQRWIPPSTLRRDALTPE--------SRNNFIFRKVRGI  73
      Query  775  LNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCL-----  829
                  LNKLTP+ F +L + + ++++ LKGVI LIFEKA+ EP +S YA +C+ L
      Sbjct  74   LNKLTPEKFAKLSNDLLNVELNSDVILKGVIFLIFEKALDEPKYSSMYAQLCKRLSDEAA  133
      Query  830  -MALKVPTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLK  888
                  K E F LLL++C+ EFE E FE + DE EE
      Sbjct  134  NFEPKKALIESQKGQSTFTFLLLSKCRDEFENRSKASEAFENQ----DELGPEEE-----  184
  • 100. Standalone Blast: Blast human EIF4G1 gi:187956781, output XML
      $ curl "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&rettype=fasta&id=187956781" | blastp -db protein.fa -outfmt 5
      (...)
      <Hit_hsps>
        <Hsp>
          <Hsp_num>1</Hsp_num>
          <Hsp_bit-score>189.119</Hsp_bit-score>
          <Hsp_score>479</Hsp_score>
          <Hsp_evalue>3.78314e-49</Hsp_evalue>
          <Hsp_query-from>717</Hsp_query-from>
          <Hsp_query-to>1017</Hsp_query-to>
          <Hsp_hit-from>22</Hsp_hit-from>
          <Hsp_hit-to>319</Hsp_hit-to>
          <Hsp_query-frame>0</Hsp_query-frame>
          <Hsp_hit-frame>0</Hsp_hit-frame>
          <Hsp_identity>115</Hsp_identity>
          <Hsp_positive>175</Hsp_positive>
          <Hsp_gaps>39</Hsp_gaps>
          <Hsp_align-len>319</Hsp_align-len>
          <Hsp_qseq>KEPRKIIATVLMTEDIKLNKAEKAWKPSS--KRTAADKDRGEEDADGSKTQDLFRRVRSILNKLTPQMFQQ
          IARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLL--------KNHDEESLECLCRLLTTIGKDLDFEKAKPRMDQYFNQMEKIIKEKK
          <Hsp_hseq>RKPSETTVGLVIKDDIRSLSTEQRWIPPSTLRRDALTPE--------SRNNFIFRKVRGILNKLTPEKFAKLS
          VAKRKMLGNIKFIGELGKLGIVSETILHRCILQLLEKKRRRRSRGDTAEDIECLCQIMRTCGRILDSDKGRGLMDQYFKRMNSLAESRD
          <Hsp_midline>++P + +++ +DI+ E+ W P S +R A + S+ +FR+VR ILNKLTP+ F + + ++++ LKGVI LIFEKA+ EP +S YA +C+ L K E F LLL++C+ EFE E FE + DE EE E R +A+R+ LGNIKFIGEL KL +++E I+H C+++LL + E +ECLC+++ T G+ LD +K + MDQYF +M + + + RI+FML+DV++LR WVPR+ +GP I+QI + E</Hsp_midline>
        </Hsp>
      (...)
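The XML form is easier to process than the text report. As an illustration, the sketch below reads the `Hsp` elements from `-outfmt 5` output with the JDK's StAX API and derives the identity percentage from `Hsp_identity` and `Hsp_align-len`. The class `BlastHspReader` is hypothetical. The two factory properties are set on the assumption that they keep the parser from fetching the DTD named in the report's DOCTYPE.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects a few fields of every <Hsp> in a BLAST XML report.
class BlastHspReader {

    static final class Hsp {
        double bitScore;
        double evalue;
        int identity;
        int alignLen;

        // Identities as a percentage of the alignment length.
        double percentIdentity() {
            return 100.0 * identity / alignLen;
        }
    }

    static List<Hsp> read(InputStream in) throws XMLStreamException {
        XMLInputFactory f = XMLInputFactory.newInstance();
        // Assumption: disabling DTD support avoids a network fetch of the BLAST DTD.
        f.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        f.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        XMLStreamReader r = f.createXMLStreamReader(in);
        List<Hsp> hsps = new ArrayList<>();
        Hsp cur = null;
        try {
            while (r.hasNext()) {
                int ev = r.next();
                if (ev == XMLStreamConstants.START_ELEMENT) {
                    String name = r.getLocalName();
                    if (name.equals("Hsp")) {
                        cur = new Hsp();
                    } else if (cur != null) {
                        switch (name) {
                            case "Hsp_bit-score":
                                cur.bitScore = Double.parseDouble(r.getElementText().trim());
                                break;
                            case "Hsp_evalue":
                                cur.evalue = Double.parseDouble(r.getElementText().trim());
                                break;
                            case "Hsp_identity":
                                cur.identity = Integer.parseInt(r.getElementText().trim());
                                break;
                            case "Hsp_align-len":
                                cur.alignLen = Integer.parseInt(r.getElementText().trim());
                                break;
                            default:
                                break; // other fields are ignored in this sketch
                        }
                    }
                } else if (ev == XMLStreamConstants.END_ELEMENT
                        && cur != null && "Hsp".equals(r.getLocalName())) {
                    hsps.add(cur);
                    cur = null;
                }
            }
        } finally {
            r.close();
        }
        return hsps;
    }
}
```

With the values shown on the slide (115 identities over an alignment length of 319), `percentIdentity()` is about 36, which agrees with the text report.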
  • 101. NCBI URL-API Blast
  • 102. NCBI URL-API Blast
      https://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html
      $ curl "https://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Put&QUERY=PAERLMERKADIE&DATABASE=nr&PROGRAM=blastp&FILTER=L&HITLIST_SIZE=500"
      (...)
      <!--QBlastInfoBegin
          RID = 1NRYGX9K014
          RTOE = 29
      QBlastInfoEnd
      -->
      (...)
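The `Put` request only submits the search. The reply carries a request ID (`RID`) and an estimated time to completion in seconds (`RTOE`), and the results are retrieved later with a `CMD=Get` request on that RID. The following sketch, under those assumptions, extracts both values from the reply and builds the retrieval URL. The class `QBlastInfo` is hypothetical, and `FORMAT_TYPE=XML` is just one possible output format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses the QBlastInfo comment of a CMD=Put reply.
class QBlastInfo {
    private static final Pattern RID = Pattern.compile("\\bRID\\s*=\\s*(\\S+)");
    private static final Pattern RTOE = Pattern.compile("\\bRTOE\\s*=\\s*(\\d+)");

    final String rid; // request ID to poll
    final int rtoe;   // estimated seconds before the results are ready

    QBlastInfo(String rid, int rtoe) {
        this.rid = rid;
        this.rtoe = rtoe;
    }

    // Looks only between QBlastInfoBegin and QBlastInfoEnd, so that
    // other occurrences of "RID" in the HTML page are ignored.
    static QBlastInfo parse(String response) {
        int begin = response.indexOf("QBlastInfoBegin");
        int end = begin < 0 ? -1 : response.indexOf("QBlastInfoEnd", begin);
        if (end < 0) {
            throw new IllegalArgumentException("no QBlastInfo block found");
        }
        String block = response.substring(begin, end);
        Matcher rid = RID.matcher(block);
        Matcher rtoe = RTOE.matcher(block);
        if (!rid.find() || !rtoe.find()) {
            throw new IllegalArgumentException("RID or RTOE missing");
        }
        return new QBlastInfo(rid.group(1), Integer.parseInt(rtoe.group(1)));
    }

    // URL to request the results once the search has finished.
    String resultUrl() {
        return "https://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Get&RID=" + rid
                + "&FORMAT_TYPE=XML";
    }
}
```

A client would typically wait about `rtoe` seconds before the first `Get`, then poll at a polite interval until the report is ready.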
  • 103. The End