A brief presentation of examples using the GBIF API for the GBIF nodes training hackathon for checklist cross-mapping and precursor national checklist generation from GBIF data. Organized by Species 2000 and GBIF at Naturalis in Leiden, March 2015.
GBIF API Hackathon, March 2015, Leiden, Sp2000/GBIF
1. GBIF & Species2000 Hackathon
March 2015
Naturalis, Leiden
GBIF portal API
Dag Endresen
GBIF Norway
UiO Natural History Museum in Oslo
University of Oslo
Tuesday, March 3rd, 2015
Slides: CC-BY-4.0
2. Credits, some slides from:
Daniel Amariles, Colombia
(2013) Nodes training at
GBIF GB20 in Berlin [link]
Gallien Labeyrie, France
(2014) Mentoring project
France-Spain-Portugal
[link]
5. GBIF DATA PORTAL API
An interface to access
data published through
the GBIF network using
web services.
6. DATA PORTAL API
GBIF Data Portal API:
http://api.gbif.org/v1/ (+parameters)
Summary and information:
http://www.gbif.org/developer/summary
The RESTful API takes search parameters as
key=value pairs and responds with JSON
content.
RESTful query format
JSON response type
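The key=value query format can be sketched as a small shell helper. This is only an illustration: `gbif_url` is a hypothetical function name, not part of the GBIF API, and it only joins the pieces without percent-encoding the values.

```shell
#!/bin/sh
# Hypothetical helper (not part of the GBIF API): join an API section path
# and any number of key=value pairs into a query URL under the v1 base.
GBIF_API="http://api.gbif.org/v1"

gbif_url() {
  endpoint=$1
  shift
  query=""
  for pair in "$@"; do
    if [ -z "$query" ]; then
      query="?$pair"        # first parameter starts the query string
    else
      query="$query&$pair"  # later parameters are joined with &
    fi
  done
  printf '%s/%s%s\n' "$GBIF_API" "$endpoint" "$query"
}

# Print the URL; hand it to a client such as curl to fetch the JSON response.
gbif_url occurrence/search publishingCountry=NO taxonKey=5383920
```

The printed URL can be opened in a browser or passed to `curl` to get the JSON response.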
7. GBIF API sections
• Registry: information about the datasets, organizations (e.g. data publishers), networks and the means to access them (technical endpoints)
• Species: information about species and higher taxa, and utility services for interpreting names and looking up the identifiers (access to all published checklists in the GBIF checklist bank)
• Occurrence: occurrence information crawled and indexed by GBIF, search services to do real-time paged search, and asynchronous download services to do large batch downloads
• Maps: simple services to show the maps of GBIF mobilized content
• News: services to stream useful information (RSS)
8. API example : dataset
Search for datasets by publishing country:
http://api.gbif.org/v1/dataset/search?publishingCountry=NO
Dataset information (UiO NHM Lichens):
http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252
Contact persons for a dataset:
http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252/contact
Dataset endpoint (get download URL):
http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252/endpoint
http://www.gbif.org/developer/registry
9. API example : dataset
Download activity metrics for dataset (UiO NHM Lichens):
http://api.gbif.org/v1/occurrence/download/dataset/7948250c-6958-4a29-a670-ed1015b26252
=> records from this dataset, included in 2650 download sets
Records lastInterpreted since November 2014:
http://api.gbif.org/v1/occurrence/search?datasetKey=7948250c-6958-4a29-a670-ed1015b26252&lastInterpreted=2014-11-01,* (=> 168 316 occ.)
Metrics for dataset data contents:
http://api.gbif.org/v1/dataset/66dd0960-2d7d-46ee-a491-87b9adcfe7b1/metrics
=> count records by dimensions such as kingdom, rank, vernacular name language, extensions provided, …
NB! only implemented for species checklists, not (yet?) for occurrences!
http://www.gbif.org/developer/registry
10. API example : species
List all name usages (across all checklists):
http://api.gbif.org/v1/species?name=Beta%20vulgaris
Name usage across checklists (Beta vulgaris, 5383920):
http://api.gbif.org/v1/species/5383920/related
Name parsed into epithets and author etc.:
http://api.gbif.org/v1/parser/name?name=Abies%20alba%20Mill.%20sec.%20Markus%20D.
{"scientificName": "Abies alba Mill. sec. Markus D.",
"type": "SCINAME",
"genusOrAbove": "Abies",
"specificEpithet": "alba",
"authorsParsed": true,
"authorship": "Mill.",
"sensu": "sec. Markus D.",
"canonicalName": "Abies alba",
"canonicalNameWithMarker": "Abies alba",
"canonicalNameComplete": "Abies alba Mill."
}
http://www.gbif.org/developer/species
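The name parser URL on this slide carries the name with spaces written as %20. A minimal sketch of building such a URL from a plain name follows; `parser_url` is a hypothetical helper, and it encodes spaces only, so names with other reserved characters (for example & or +) would need a fuller encoder.

```shell
#!/bin/sh
# Hypothetical helper: build a name-parser URL from a plain scientific name.
# Assumption: only spaces need encoding; other reserved characters are
# passed through unchanged.
parser_url() {
  encoded=$(printf '%s' "$1" | sed 's/ /%20/g')
  printf 'http://api.gbif.org/v1/parser/name?name=%s\n' "$encoded"
}

parser_url "Abies alba Mill. sec. Markus D."
```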
11. API example : occurrence
List occurrences of Beta vulgaris:
http://api.gbif.org/v1/species/match?name=Beta+vulgaris => taxonKey
http://api.gbif.org/v1/occurrence/search?taxonKey=5383920
List occurrences from Norway (of Beta vulgaris):
http://api.gbif.org/v1/occurrence/search?publishingCountry=NO
http://api.gbif.org/v1/occurrence/search?publishingCountry=NO&taxonKey=5383920
Information about a single occurrence record:
http://api.gbif.org/v1/occurrence/1040970640
http://api.gbif.org/v1/occurrence/1040970640/fragment
http://api.gbif.org/v1/occurrence/1040970640/verbatim
List occurrence counts for datasets of country (or taxon):
http://api.gbif.org/v1/occurrence/counts/datasets?country=NO
http://www.gbif.org/developer/occurrence
12. API example : identifiers
(not implemented -- yet)
Searching by occurrenceID is unfortunately
not supported yet …
http://dev.gbif.org/issues/browse/POR-2451
http://dev.gbif.org/issues/browse/POR-2337
So, we cannot yet list or count occurrences with pattern “urn*” in occurrenceID
http://api.gbif.org/v1/occurrence/search?occurrenceID=urn*
http://api.gbif.org/v1/occurrence/search?occurrenceID=urn%3Acatalog
http://api.gbif.org/v1/occurrence/search?occurrenceID=urn%3Alsid
http://api.gbif.org/v1/occurrence/search?occurrenceID=http*
…
13. API example : download data
Lookup speciesKey (1) and download occurrences (2):
http://api.gbif.org/v1/species/match?verbose=false&kingdom=Plantae&name=Beta+vulgaris
=> usageKey/speciesKey = 5383920
http://api.gbif.org/v1/occurrence/search?taxonKey=5383920 [&limit=1000&offset=0]
=> notice: count = 25 513
=> then: page through results…
(using offset & limit)
http://api.gbif.org/v1/occurrence/download/request
[POST] => downloadKey (see next slide)
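The paging step on this slide (offset & limit) can be sketched as a loop that prints one search URL per page. `page_urls` is a hypothetical helper; the record count would come from the "count" field of the first response, and the server may cap the page size below the limit requested here.

```shell
#!/bin/sh
# Hypothetical helper: print one occurrence-search URL per result page.
# Arguments: taxonKey, total record count, page size (limit).
page_urls() {
  taxon_key=$1
  count=$2
  limit=$3
  offset=0
  while [ "$offset" -lt "$count" ]; do
    printf 'http://api.gbif.org/v1/occurrence/search?taxonKey=%s&limit=%s&offset=%s\n' \
      "$taxon_key" "$limit" "$offset"
    offset=$((offset + limit))
  done
}

# Show the last page URL for the count on this slide (25 513 records).
page_urls 5383920 25513 1000 | tail -n 1
```

Each printed URL can then be fetched in turn, for example with `curl`, and the "results" arrays combined.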
14. API example : asynchronous (1)
Request asynchronous download:
$ curl -i --user yourGbifUserName:yourGbifPassword \
  -H "Content-Type: application/json" -H "Accept: application/json" \
  -X POST -d @filter.json \
  http://api.gbif.org/v1/occurrence/download/request >> log.txt
Search parameters in a JSON text file, filter.json (curl reads the path given
after @ relative to the current directory; it does not search the PATH directories):
{
  "creator": "yourGbifUserName",
  "notification_address": ["yourEmail@mail.net"],
  "predicate":
  {
    "type": "and",
    "predicates":
    [{"type": "equals", "key": "HAS_COORDINATE", "value": "false"},
     {"type": "equals", "key": "TAXON_KEY", "value": "5383920"}]
  }
}
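When many taxa are requested one after another, the filter file can be generated instead of edited by hand. The sketch below assumes the same JSON layout as the filter.json on this slide; `write_filter` is a hypothetical helper and the user name, e-mail address and taxon key are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a download filter for one taxon key.
# Arguments: GBIF user name, notification e-mail, taxonKey.
# The JSON layout is copied from the filter.json example on this slide.
write_filter() {
  cat <<EOF
{
  "creator": "$1",
  "notification_address": ["$2"],
  "predicate": {
    "type": "and",
    "predicates": [
      {"type": "equals", "key": "HAS_COORDINATE", "value": "false"},
      {"type": "equals", "key": "TAXON_KEY", "value": "$3"}
    ]
  }
}
EOF
}

# Write the filter for Beta vulgaris to the current directory.
write_filter yourGbifUserName yourEmail@mail.net 5383920 > filter.json
```

The resulting file can be posted with the curl command from this slide.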
17. API example : asynchronous (2b)
(…clean log.txt with the downloadKeys using regular expressions…)
function gbifwget {
  # $1 = downloadKey, $2 = taxonKey (used as the file name), $3 = scientific name
  echo -e "\n\n----------------\n$1 $2 $3\n" >> log_wget.txt
  wget "http://api.gbif.org/v1/occurrence/download/request/$1.zip" 2>&1 | tee /dev/tty >> log_wget.txt
  mv "$1.zip" "./dwca/$2.zip" 2>&1 | tee /dev/tty >> log_wget.txt
}
$ gbifwget 0006050-141024112412452 4140730 "Aciachne acicularis"
$ gbifwget 0006053-141024112412452 4140704 "Aciachne flagellifera"
$ gbifwget 0006056-141024112412452 5289784 "Aegilops comosa"
…
(work in progress…)
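The log-cleaning step mentioned on this slide could look roughly like the sketch below. The key pattern (7 digits, a hyphen, 15 digits) is inferred from the download keys shown here and is an assumption, not a documented format; `extract_keys` and the sample log file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct download keys found in a log file.
# Assumption: keys look like 0006050-141024112412452 (7 digits, hyphen,
# 15 digits), as in the examples on this slide.
extract_keys() {
  grep -oE '[0-9]{7}-[0-9]{15}' "$1" | sort -u
}

# Made-up sample standing in for log.txt (response headers plus a key).
printf 'HTTP/1.1 201 Created\n\n0006050-141024112412452\n' > sample_log.txt
extract_keys sample_log.txt
```

The extracted keys can then be passed to gbifwget, together with the taxon key and name that belong to each request.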
18. MAPPING API v1.0
You can easily overlay GBIF content on
your own maps.
http://www.gbif.org/developer/maps
Slide by Daniel Amariles, 2013
19. MAPPING API v1.0
This service is intended for use with commonly used map clients such as the Google Maps API, Leaflet JS library or Modest Maps JS library. These libraries allow the GBIF layers to be visualized with other content, such as those coming from Web Map Service (WMS) providers.
It should be noted that the mapping API is not a WMS service, nor does it support WFS capabilities.
http://leafletjs.com/
http://modestmaps.com/
Slide by Daniel Amariles, 2013
20. CUSTOMIZING LAYER CONTENT
The format of the URL is as follows:
http://api.gbif.org/v1/map/density/tile?x={x}&y={y}&z={z}
With the following required parameters:
type: TAXON, DATASET, COUNTRY or PUBLISHER
key: The appropriate key for the chosen type (a taxon key, dataset/publisher UUID or 2-letter ISO country code)
Other supported parameters: resolution, layer, palette, colors, saturation, hue
http://www.gbif.org/developer/maps
Slide by Daniel Amariles, 2013
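As an illustration of the URL format on this slide, a tile address for one layer can be put together as follows. `tile_url` is a hypothetical helper; it only fills in the required type and key parameters plus the x, y, z tile coordinates that a map client normally supplies.

```shell
#!/bin/sh
# Hypothetical helper: build a density tile URL.
# Arguments: type (TAXON, DATASET, COUNTRY or PUBLISHER), key, x, y, z.
tile_url() {
  printf 'http://api.gbif.org/v1/map/density/tile?x=%s&y=%s&z=%s&type=%s&key=%s\n' \
    "$3" "$4" "$5" "$1" "$2"
}

# Top-level tile for Beta vulgaris, and a URL template for a map client
# that substitutes {x}, {y} and {z} itself.
tile_url TAXON 5383920 0 0 0
tile_url COUNTRY NO '{x}' '{y}' '{z}'
```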
21. Useful Tools (JSON & REST)
• REST client …
• JSON client/parser …
• JSONView (Firefox, Chrome, …)
• http://jsonview.com/
• Display formatted JSON in browser
• R CRAN : jsonlite
• http://cran.r-project.org/web/packages/jsonlite/
• E.g. read json into a dataframe [link]
• OpenRefine
• http://openrefine.org/
23. R CRAN
rOpenSci provides programmatic access to scientific data
with R (rgbif, taxize, EML, geonames, …).
https://github.com/ropensci
http://ropensci.org/packages/
http://ropensci.org/tutorials/rgbif_tutorial.html
http://ropensci.org/tutorials/taxize_tutorial.html
27. Resolve taxonomic names
library(taxize) # rOpenSci Taxize
gnr <- gnr_resolve(names = "Beta vuulgariss") # Misspelled name
gnr$results # display suggested names
  submitted_name  matched_name                        data_source_title         score
1 Beta vuulgariss Beta vulgaris L.                    Catalogue of Life         0.75
2 Beta vuulgariss Beta vulgaris L.                    ITIS                      0.75
3 Beta vuulgariss Beta vulgaris                       NCBI                      0.75
4 Beta vuulgariss Beta vulgaris var.-gr. crassa Alef. GRIN Taxonomy for Plants  0.75
specieslist <- c("Beta vulgaris", "Phleum pratensis", "Nicotiana glauca")
classification(specieslist, db = 'itis') # lookup higher taxonomy
Global Names Resolver: http://resolver.globalnames.org/
rOpenSci Taxize: http://ropensci.org/tutorials/taxize_tutorial.html
db = 'col'
db = 'itis'
28. rOpenSci : EML
library(EML)        # library() loads one package per call
library(rfigshare)
description <- "My dataset published in GBIF"
eml_write(dat = dat, meta, title = "My Dataset",
          description = description,
          creator = "Your Name <name@mail.net>", file = "dataset.xml")
eml_publish("dataset.xml", description = description,
            categories = "Ecology", tags = "biodiversity",
            destination = "figshare", visibility = "public")
meta <- eml_read("eml_example.xml")
29. GBIF API support
Subscribe to the mailing-list for help and
information messages:
api-users@lists.gbif.org