Lecture presented at the Journals Club of the Naturhistorisches Museum Bern, March 17, 2014.
"Towards an (European) Open Biodiversity Knowledge Management System"
1. Towards an (European) Open
Biodiversity Knowledge
Management System
Donat Agosti (Plazi, Bern)
March 17, 2014
Berne, Journal Club @ NMBE
2. El Bulli: Cooking in Progress (2011), Ferran Adrià (actor), Gereon Wetzel (director)
3. The cook (Ferran Adrià) wants to know when he can
expect what seafood for his kitchen.
He assumes that phenological data is open and
accessible to anyone.
He has a question and needs to know: What seafood
at what time?
His goal is to provide a service based on the use of
observation data, i.e. treat you (and make some
money).
4. The fishmonger knows when what seafood is
available.
He considers his knowledge of seafood phenology
as his asset to make money.
His goal is to make money with knowledge based
on observation records and understanding the
characteristics of seafood.
5. What do YOU want to know?
How do YOU expect to get to your information?
6. • What are the main online resources you use?
• Do you maintain your own digital library?
• Do you participate in an online project, e.g.
Scratchpads, a catalogue, a digital archive, and
make your data accessible?
• … ?
7. What does this mean?
Meredith Lane, e-biosphere Conference, London 2009
8. Hardisty, Nature 502, 171 (2013)
BUT: predictive ecology has substantial data needs
Harfoot, BIH2013, Rome, 2013
The big question
What is the future of the biological world?
Imagine if we could:
…Predict community level dynamics of ecosystems at
scales from local to global, based on the ecology and
biology of all individual organisms
9. Decentralized biodiversity infrastructure
Plants
3,400 Herbaria worldwide
10,000 Associate curators and specialists
350,000,000 specimens in collections
180,000,000 specimens digitized
2,000,000,000 specimens including animals
Source: gbif.org; http://sciweb.nybg.org/science2/IndexHerbariorum.asp
10. Biodiversity libraries
200,000,000+ printed pages
1,900,000 species described
20,000,000+ species treatments
17,000 new species per year
BUT: The data are hidden
Incomplete digitization
Publications are unstructured
Collections are incomplete
Data are not linked
Most data are not open
11. Nationaal Herbarium Nederland collection on GBIF
Source: http://www.gbif.org/dataset/7b33b040-f762-11e1-a439-00145eb45e9a
One collection’s view of the world
12. Another collection’s view of the world
http://www.gbif.org/dataset/82b0f51c-f762-11e1-a439-00145eb45e9a
13. What does this mean?
The Linking Open Data cloud diagram
Linked Open Data Cloud
14. Names as information tags in life sciences
Names
Characteristics
Publications
Genes
Collections
Specimens
Distribution
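The idea on this slide, that a scientific name acts as a tag tying together otherwise separate kinds of records, can be sketched as a small index keyed by name. This is a minimal illustration only: the record kinds follow the slide, while the function names, the placeholder name "Aus bus" and the placeholder identifiers are hypothetical and do not come from any real database.

```python
from collections import defaultdict

# Record kinds taken from the slide; everything else here is hypothetical.
KINDS = ("characteristics", "publications", "genes",
         "collections", "specimens", "distribution")


def build_name_index(records):
    """Group (name, kind, identifier) triples by scientific name, then by kind."""
    index = defaultdict(lambda: defaultdict(list))
    for name, kind, identifier in records:
        if kind not in KINDS:
            raise ValueError(f"unknown record kind: {kind}")
        index[name][kind].append(identifier)
    return index


def linked_records(index, name):
    """Return everything tagged with the given name as a plain dict."""
    return {kind: list(ids) for kind, ids in index.get(name, {}).items()}


# Placeholder name and identifiers, invented purely for illustration.
records = [
    ("Aus bus", "publications", "publication-A"),
    ("Aus bus", "specimens", "specimen-1"),
    ("Aus bus", "specimens", "specimen-2"),
    ("Aus bus", "distribution", "locality-X"),
]
index = build_name_index(records)
print(linked_records(index, "Aus bus"))
```

Looking up a name returns every record kind attached to it, which is the sense in which the name works as a shared tag across publications, specimens and distribution data.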
15. The enhanced and linked treatments, extracted, stored on Plazi.org, and served in
a human readable form, are linked to the underlying data: Fisher & Smith, 2008,
PLoS ONE.
17. pro-iBiosphere: Coordination and Policy Development in Preparation for a
European Open Biodiversity Knowledge Management
System
Supported by the European Commission through its FP7 research funding programme
19. Create digital objects
+ Identifiers and resolvers
+ Open Access
+ Adequate infrastructure
+ Sustainable and permanent infrastructure
+ Reliable services for partners in research projects and society
Seamless Global Virtual Research Knowledge Management System
(European Open Biodiversity Knowledge Management System)
Biodiversity Knowledge Management System
20. Impact
The envisaged European Open Biodiversity Management System will:
• Support reliable and permanent open access to digital biodiversity
records
• Create identifiers and link biodiversity literature, collections, digital
objects, genes, etc.
• Ensure global interoperability and sharing of biodiversity data,
information and knowledge
• Provide new services in support of open science
• Provide the ground for modelling the biosphere
• Develop data policies to harness the potential of open access
24. Treatment
A publication or section of a publication documenting the
features or distribution of a related group of organisms
(called a “taxon”, plural “taxa”) in ways adhering to highly
formalized conventions.
http://terms.tdwg.org/wiki/tp:taxon-treatment
Catapano, 2010.
36. Penestomus egazini Miller, Haddad & Griswold, 2010
Progress: Treatments (% complete): 4/4 (100%)
Data summary: Specimen records: 41
[Chart: specimen records by category (legend: adult female, adult male, other; values as extracted, not matched to the legend: 51%, 2%, 46%)]
Specimen collections: Institutions: 3 (Muséum National d'Histoire Naturelle, Paris; California Academy of Sciences, San Francisco; Albany Museum, Grahamstown)
[Chart values as extracted, not matched to labels: 2%, 5%, 76%, 20%]
Countries: Lesotho, South Africa
[Panels: Distribution; Georeferenced materials citations]
Export species materials citations (DwC)
Export treatment materials citations (DwC)
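The "(DwC)" exports listed on this slide refer to Darwin Core, the TDWG standard vocabulary for occurrence records. As a rough sketch of what such an export could look like, the following writes materials citations to CSV using Darwin Core term names as column headers. The term names are real Darwin Core terms, but their selection here is an assumption; the function name and the sample row values are hypothetical placeholders, not data from the dashboard, and the actual Plazi export format may differ.

```python
import csv
import io

# Column headers are Darwin Core term names; which terms to include is an assumption.
DWC_FIELDS = ["scientificName", "institutionCode", "country", "sex",
              "decimalLatitude", "decimalLongitude"]


def export_materials_citations(citations):
    """Serialize a list of dicts to CSV text with Darwin Core headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for citation in citations:
        # Missing terms are written as empty cells rather than dropped.
        writer.writerow({field: citation.get(field, "") for field in DWC_FIELDS})
    return buffer.getvalue()


# Placeholder row for illustration only; the values are invented.
sample = [{
    "scientificName": "Penestomus egazini",
    "institutionCode": "XYZ",
    "country": "South Africa",
    "sex": "female",
    "decimalLatitude": "",
    "decimalLongitude": "",
}]
print(export_materials_citations(sample))
```

Using the standard term names as headers is what lets an aggregator read the file without a custom mapping step.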
37. American Museum of Natural History
Data summary: Materials citations 2004-2013: 111,364
[Chart: Materials Citations Records by Researcher; axis: materials citations records, 0 to 20,000; legend: Other, Donat Agosti, David Grimaldi, Toby Schuh, James Carpenter, Norman Platnick]
[Panels: Distribution; Georeferenced materials citations]
Export species materials citations (DwC)
38. Zootaxa
Data summary: Materials citations 2004-2013: 11,476
[Chart: Materials Citations Records by Institution; axis: materials citations records, 0 to 2,500; legend: Other; Muséum National d'Histoire Naturelle, Paris; Natural History Museum, London; Museum of Comparative Zoology; Smithsonian Institution; American Museum of Natural History]
[Panels: Distribution; Georeferenced materials citations]
Export species materials citations (DwC)
43. [Diagram: workflow from taxonomic literature to unified marked-up output]
Legacy and new taxonomic literature: PROSPECTIVE PUBLISHING | HISTORICAL LITERATURE
Content management systems & repositories (e.g., Plazi, EOL, GBIF, Scratchpads, EDIT)
Markup: TaxPub XML schema, Pensoft markup tool; TaxonX schema, Plazi's GoldenGATE editor
Automated submission; peer-review
Marked up publications: PDF, HTML and XML
Unified marked up final output: taxon treatments, keys, images, localities
Archiving; wikis (Species-ID, Wikispecies, Wikipedia); indexing (IPNI, ZooBank, MycoBank, GNA); aggregators (EOL, GBIF); electronic archives and data centers
End users
45. Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire
body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
48. Before antbase.org, Harvard's Museum of
Comparative Zoology could claim to be the only
location with a complete set of ant systematics
publications from 1758 to the present.
49. Through antbase.org's
digital library, access
to this body of
literature is worldwide,
and it is actively used
(>10,000 visits in a
single month).
52. Bouchout Declaration, 2014 (1)
• The free and open use of content, services and other digital resources
about biodiversity;
• Licenses that grant all users a free, irrevocable, world-wide right to
copy, use, distribute, transmit and display the work publicly as well as
build on the work and make derivative works, subject to proper
attribution consistent with community practices;
• Policy developments that will foster free and open access to biodiversity
data;
• Tracking the use of information to ensure that sources and suppliers of
data are assigned credit for their contributions;
• An agreed infrastructure, standards and protocols to improve access to
and use of open data;
53. Bouchout Declaration, 2014 (2)
• Registers for content and services to allow discovery, access and use of
open data;
• Persistent, dereferenceable identifiers for data objects and physical
objects such as specimens, images and taxonomic treatments;
• Linking data using agreed vocabularies, both within and beyond
biodiversity, that enable participation in the Linked Open Data Cloud;
• Dialogue coordinated by the leading signatories to refine the concept,
priorities and technical requirements of Open Biodiversity Knowledge
Management;
• A sustainable Open Biodiversity Knowledge Management that is
attentive to scientific, sociological, legal, and financial aspects.
60. Plazi: founded in 2008
Swiss-based NGO with members in
Switzerland, Germany, Bulgaria, US and
Iran
research-based think tank with the
mission to promote open access to
scientific content
five pillars: legal advice,
technical innovations and solutions,
maintenance of a treatment repository
and Biowikifarm, consultancy, advocacy
66. Plazi GmbH founded in 2012 as
service SME owned by Plazi
67. Funding from public donors, e.g. EU,
corporate and private
69. Clients are global
70. Consultancies and Services:
Consulting publishers on how to
produce semantically enhanced XML
output (e.g. EJT, Zootaxa, Smithsonian
Institution)
Service to mark up literature
71. http://plazi.org
Thank you very much!
Donat Agosti
agosti@plazi.org
This project is funded under the European Union's Seventh Framework
Programme (FP7/2007-2013) under grant agreement №312848.