Donat Agosti Plazi
http://plazi.org
Systematics Association
Oxford, 28. August 2015
Nothing in taxonomy makes sense
except in the light of Open Access
I want to be able, at any time and anywhere, to access, mine and analyse a
significant body of published and digitized taxonomic knowledge.
I want to build the Catalogue of Life by machine.
I hope taxonomic communication arrives in the 21st century.
Vision and hope
1. The demand
Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the
only location with a complete set of ant systematics publications from 1758 to the present.
Through antbase.org's digital library, access to this body of literature is worldwide,
and it is actively used (>10,000 visits in a single month).
2004
2. The corpus of taxonomic literature
Build and establish a TreatmentBank, such as Plazi, as a basis for
content mining of, and linking to, the taxonomic literature
3. The core corpus of taxonomic knowledge: Treatments
4. Make use of the semantic linked WWW
Avoid all the wasteful current publishing!
• Publish structured data
• Publish open access
• Make taxonomic literature first class literature by minting
DOIs and making digital copies accessible
• Add links to names, treatments, articles, DNA sequences,
digital objects
• Help by building your own public corpus of citable data
Pensoft journals (e.g. Biodiversity Data Journal, ZooKeys,
PhytoKeys) are the gold standard.
Surfing or the seduction of science (for a young kid)
Surfing or the seduction of science (for an adult)
Get a copy of the Cyclothone paper
Surfing or the seduction of science (for an adult)
Surfing or the imperative for science
Linking treatments and data with external resources
NCBI
Surfing or the imperative for science
Establish Plazi as TreatmentBank, or use Plazi to build it, as a source for content mining of the
taxonomic literature
TreatmentBank
What are the species in Amazonia?
TreatmentBank
Countries (Region)
Australia (Queensland)
Export species materials citations (DwC)
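An export of materials citations to Darwin Core (DwC) might look roughly like the sketch below. The column names `scientificName`, `country` and `stateProvince` are standard Darwin Core terms; the helper `to_dwc_csv`, the record layout and the placeholder name "Genus species" are assumptions for illustration, not TreatmentBank's actual export.

```python
import csv
import io

# Standard Darwin Core term names used as CSV column headers.
DWC_FIELDS = ["scientificName", "country", "stateProvince"]


def to_dwc_csv(records):
    """Serialize materials-citation dicts to a Darwin Core style CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Keep only the known DwC columns; missing values become empty cells.
        writer.writerow({field: record.get(field, "") for field in DWC_FIELDS})
    return buffer.getvalue()


# Hypothetical record: only the country and region come from the slide.
sample = [
    {
        "scientificName": "Genus species",
        "country": "Australia",
        "stateProvince": "Queensland",
    }
]
dwc_csv = to_dwc_csv(sample)
print(dwc_csv)
```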
Text mining tools: Visualization of treatment content
Summary of the content of 37 Zootaxa spider publications and 8 from
Biodiversity Data Journal (Miller et al., 2015)
Pseudomyrmex ants and Vachellia ant-acacias
are a classic example of mutualism in biology.
allenii
melanoceras
ruddiae
chiapensis
collinsii
cookii
cornigera
globulifera
hindsii
janzenii
mayana
sphaerocephala
boopis
flavicornis
hesperius
ita
janzeni
kuenckeli
mixtecus
nigrocinctus
nigropilosus
opaciceps
particeps
peperi
reconditus
satanicus
simulans
spinicola
subtilissimus
veneficus
ferrugineus
gentlei
gracilis
Transbiotic link network
Associated species linked through
references in taxonomic treatments
Acacia-ant species: Pseudomyrmex gracilis
Treatment: redescription
Associated ant-acacia: Acacia gentlei
Ants Plants
Photocredits: Alex Wild
Treatment
Treatments linked
through citations
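A link network of this kind can be sketched as a simple adjacency map. The example below is a minimal illustration, assuming associations have already been extracted from treatments as (ant, plant) pairs; the function name `build_link_network` is hypothetical, and only the one pair named on the slide is used.

```python
from collections import defaultdict


def build_link_network(associations):
    """Build an undirected adjacency map from (ant, plant) pairs."""
    network = defaultdict(set)
    for ant, plant in associations:
        # Store the link in both directions so either species can be looked up.
        network[ant].add(plant)
        network[plant].add(ant)
    return network


# The single pair named on the slide; a real network would hold many more.
associations = [("Pseudomyrmex gracilis", "Acacia gentlei")]
network = build_link_network(associations)
print(sorted(network["Acacia gentlei"]))
```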
Text mining tools: Visualization of treatment content
What does this mean?
The Linking Open Data cloud diagram
Linked Open Data Cloud
The demand: scientists and citizen scientists
Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the
only location with a complete set of ant systematics publications from 1758 to the present.
Through antbase.org's digital library, access to this body of literature is worldwide,
and it is actively used (>10,000 visits in a single month).
Online catalogue
Open access
Online library
Online catalogue
The interest of big science
2004
2005
The demand: scientists and citizen scientists
The scientific challenge: Bridging the gap
1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
121 atctaataat ttatatcata atggattttc aactgattta gcaatttttt ctttacatat
181 tgcaggaata tcatcaatta taggagcaat taattttatt tcaacaattt taaatataca
241 tcataaaaat ttatcattag ataaaattcc attgttagtt tgatcaattt taattacagc
301 tattttatta ttattatctt tacctgtatt agcaggtgca attactatat tattaactga
361 tcgaaatcta aatacaactt tttttgatcc ttcgggtgga ggagatccaa ttttatatca
421 acatttattt
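To make a listing like this machine-usable, the position numbers and spacing have to be stripped. The sketch below is one possible way to do that, assuming each line starts with the position of its first base followed by space-separated groups; `parse_origin_block` is a hypothetical helper, and only the first two lines of the listing are used as input.

```python
from collections import Counter


def parse_origin_block(text):
    """Join a numbered, space-grouped sequence listing into one string."""
    bases = []
    for line in text.strip().splitlines():
        parts = line.split()
        # parts[0] is the position of the first base on the line; skip it.
        bases.extend(parts[1:])
    return "".join(bases)


# First two lines of the listing above.
listing = """
1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt
61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt
"""
sequence = parse_origin_block(listing)
counts = Counter(sequence)
# Sequence length and number of ambiguous bases ("n").
print(len(sequence), counts["n"])
```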
Where do we stand?
The bristlemouths are a rapacious
family of deep-sea fishes that include
the wildly successful genus
Cyclothone
In contrast, ichthyologists put the
likely figure for bristlemouths at
hundreds of trillions — and perhaps
quadrillions, or thousands of
trillions.
Taxonomy?
Source?
Issue USD 266.00
Article USD 48.00
Get a copy of the Cyclothone paper
Our contribution for a better understanding of biodiversity
Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire
body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages). Source: Agosti (2005)
Access
• Limited access (copyright)
• Limited discoverability of content
• Research results cannot be cited
• Data mining does not work
Issues of access
Provide an open access, linked corpus of taxonomic literature
A solution
Surfing at breakfast table
article
treatment
Cites
httpURI
cites (DOI)
Scientific name
https://www.wikidata.org/wiki/
Property:P1992
Feed Wikipedia with taxonomic data
Surfing or the imperative for science
LOD
PDF
HNS
Surfing or the imperative for science: Use of name services
The goal
Create a citable open corpus of taxonomic publications
Biodiversity Literature Repository: Record
Biodiversity Literature Repository: Record
Treatment
Illustration
http://plazi.org/wiki/Blue_List
Patterson et al., 2014: http://dx.doi.org/10.1186/1756-0500-7-79
Legal issues
Workflow
Plazi
SRS
find → scan → «OCR» → markup → store + access
Text
<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI
    1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margi
    to a sharp apical tooth, the apex parallel to the an
    (Holotype with material in mandibles, so mandibles a
    $ described below from paratypes.) Median clypeus
    ....
    </tax:p>
  </tax:div>
</tax:treatment>
Semantically enhanced text
(TaxonX)
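Once a treatment is marked up this way, the name parts can be read by machine. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The namespace URIs are placeholders (the slide does not show the declarations), the sample is a shortened copy of the markup above, and `read_name` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are placeholders (assumption); the slide omits the declarations.
NS = {"tax": "http://example.org/taxonx", "dc": "http://example.org/dc"}

SAMPLE = """
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""


def read_name(xml_text):
    """Extract genus, species, status and external identifier from a treatment."""
    root = ET.fromstring(xml_text.strip())
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "xid": (xid.get("source"), xid.get("identifier")),
    }


record = read_name(SAMPLE)
print(record)
```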
… alternatives: From human to machine readable text
RDF
Plazi tools: table extraction
«Treatment»
Scientific species name
Distribution record
Cataglyphis tartessica workers
Variable          mean ± SD
Head length       11.23 ± 0.12
Head width        11.15 ± 0.12
Scape length      11.47 ± 0.12
Mesosoma length   11.94 ± 0.16
Femur length      12.03 ± 0.14
Cephalic index    0 93.60 ± 3.940
Scape index       128.10 ± 7.660
Plazi tools: discovering of scientific names
Plazi tools: discovering and parsing of bibliographic references
Plazi tools: discovering and parsing of observation data
Plazi tools: discovering of treatments
Treatment: a well defined part of an article that
defines the particular usage of a scientific name
by an authority at a given time (a page(s) in a
publication).
Treatment
The special case of taxonomic literature: the cited elements are
treatments, not articles
Formica obsoleta Linnaeus, 1758: 580
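A citation of this shape can be split into its parts mechanically. The sketch below assumes the simple pattern "Genus epithet Authority, year: page" seen in this one example; real citations vary far more, and `TreatmentCitation` and `parse_citation` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TreatmentCitation:
    genus: str
    epithet: str
    authority: str
    year: int
    page: int


# Matches only the simple "Genus epithet Authority, year: page" form.
PATTERN = re.compile(
    r"^(?P<genus>[A-Z][a-z]+) (?P<epithet>[a-z]+) "
    r"(?P<authority>.+?), (?P<year>\d{4}): (?P<page>\d+)$"
)


def parse_citation(text):
    """Parse a treatment citation string into a TreatmentCitation."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized citation format: {text!r}")
    return TreatmentCitation(
        genus=match["genus"],
        epithet=match["epithet"],
        authority=match["authority"],
        year=int(match["year"]),
        page=int(match["page"]),
    )


citation = parse_citation("Formica obsoleta Linnaeus, 1758: 580")
print(citation)
```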
Treatment
Original combinations
Reference to an original combination
Subsequent usages of names cite the referenced treatment
What is a treatment?
Treatment and treatment reference and citation
Treatment citation
Treatment
references
Treatment
Citing of treatments or linking of treatments to treatments
By minting persistent httpURIs for treatments, treatments
can be cited like a bibliographic reference
http://treatment.plazi.org/id/A9FFD1FC-4629-FFB4-968F-AD38386521BA
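Because the identifier is an ordinary HTTP URI, it can be handled with standard tooling. The sketch below checks a URI against the shape of the example above and returns the identifier part. It assumes that all treatment URIs follow this `/id/` plus 8-4-4-4-12 hexadecimal layout, which is inferred from this single example; `treatment_id` is a hypothetical helper and no network request is made.

```python
import re
from urllib.parse import urlparse

# 8-4-4-4-12 hexadecimal groups, as in the identifier shown above.
ID_PATTERN = re.compile(
    r"^[0-9A-F]{8}(-[0-9A-F]{4}){3}-[0-9A-F]{12}$", re.IGNORECASE
)


def treatment_id(uri):
    """Return the identifier part of a treatment URI, or raise ValueError."""
    parsed = urlparse(uri)
    if parsed.netloc != "treatment.plazi.org" or not parsed.path.startswith("/id/"):
        raise ValueError(f"not a treatment URI: {uri!r}")
    identifier = parsed.path[len("/id/"):]
    if not ID_PATTERN.match(identifier):
        raise ValueError(f"unexpected identifier format: {identifier!r}")
    return identifier


URI = "http://treatment.plazi.org/id/A9FFD1FC-4629-FFB4-968F-AD38386521BA"
identifier = treatment_id(URI)
print(identifier)
```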
Status quo
• 50,000+ treatments live, growing daily
• RDF in beta version
• GoldenGate Imagine (PDF and text mining tool) in beta version
• Provider for data for NCBI, Wikidata, GBIF, EOL, antweb
• Biodiversity Literature Repository functional
Next steps
• Collaborate with ContentMine to extract >50
treatments/day
Next steps
Planned collaboration with ContentMine to extract treatments on a
daily basis
http://www.slideshare.net/petermurrayrust/?
BioDiv
Next steps
• Collaborate with ContentMine to extract 50 treatments/day
• 1 million treatments live
• RDF version accessible
• GoldenGate Imagine (text mining tool)
• Provider of data for NCBI, GBIF, EOL, antweb
• Biodiversity Literature Repository with 100,000 bibliographic
references and digital copies (PDF, images, etc.)
Next steps
BUT
Next steps
Avoid all this waste (our next generation will have to clean up)!
Publish structured data
Publish open access
Publish in journals with DOI
Add links to names, treatments, articles, DNA sequences, digital
objects
Help build your own corpus of citable data
Pensoft journals (e.g. Biodiversity Data Journal, ZooKeys,
PhytoKeys) are the gold standard.
Thanks!
Donat Agosti
agosti@plazi.org
Acknowledgment: Pensoft, Zenodo/CERN, NCBI, Wikidata, ContentMine

Nothing in taxonomy makes sense except in the light of Open Access

  • 1. Donat Agosti Plazi http://plazi.org Systematics Association Oxford, 28. August 2015 Nothing in taxonomy makes sense except in the light of Open Access
  • 2.
  • 3. I want to be able at anytime, anywhere to access, mine and analyse a significant body of published and digitized taxonomic knowledge. I want to build by machine the catalogue of life. I hope taxonomiy communications arrives in the 21st century Vision and hope
  • 4. 1. The demand Before antbase.org, Harvard‘s Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 - present. Through antbase.org‘s digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in one month only). 2004
  • 5. 2. The corpus of taxonomic literature
  • 6. Build and establish a TreatmentBank, such as Plazi, as basis for content mining of and linking to the taxonomic literature 3. The core corpus of taxonomic knowledge: Treatments
  • 7. 4. Make use of the semantic linked WWW Avoid all the waistful actual publishing! • Publish structured data • Publish open access • Make taxonomic literature first class literature by minting DOIs and making digital copies accessible • Add links to names, treatments, articles, DNA sequences, digital objects • Help by building your own public corpus of citable data Pensoft journals (e.g. Biodiversity Data Journal, Zookeys, Phytokeys) are the gold standard.
  • 8. Surfing or the seduction of science (for a young kid)
  • 9. Surfing or the seduction of science (for a young kid)
  • 10. Surfing or the seduction of science (for a young kid)
  • 11. Surfing or the seduction of science (for an adult)
  • 12. Get a copy of the Cyclothone paper Surfing or the seduction of science (for an adult)
  • 13. Surfing or the imperative for science
  • 14. Surfing or the imperative for science
  • 15. Linking treatments and data with external resources NCBI Surfing or the imperative for science
  • 16. Establish Plazi as, or use Plazi to build TreatmentBank as source for content mining of the taxonomic literature TreatmentBank
  • 17. What are the species in Amazonia? TreatmentBank
  • 18. Countries (Region) Australia (Queensland) Export species materials citations (DwC)
  • 19. Text mining tools: Visualization of treatment content Summary of the content of 37 Zootaxa spider publications and 8 Biodiversity Data Journal publications (Miller et al., 2015)
  • 20. Text mining tools: Visualization of treatment content. Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology. Transbiotic link network: associated species linked through references in taxonomic treatments; treatments linked through citations. Example: the acacia-ant species Pseudomyrmex gracilis (treatment: redescription) is linked to the associated ant-acacia Acacia gentlei. [Figure: network of ants and plants; node labels: allenii, melanoceras, ruddiae, chiapensis, collinsii, cookii, cornigera, globulifera, hindsii, janzenii, mayana, sphaerocephala, boopis, flavicornis, hesperius, ita, janzeni, kuenckeli, mixtecus, nigrocinctus, nigropilosus, opaciceps, particeps, peperi, reconditus, satanicus, simulans, spinicola, subtilissimus, veneficus, ferrugineus, gentlei, gracilis] Photo credits: Alex Wild
  • 21. What does this mean? The Linking Open Data cloud diagram Linked Open Data Cloud
  • 22. The demand: scientists and citizen scientists Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present. Through antbase.org's digital library, access to this body of literature is worldwide, and it is actively used (>10,000 visits in a single month). Online catalogue Open access Online library
  • 23. Online catalogue The interest of big science 2004 2005
  • 24. The demand: scientists and citizen scientists
  • 25. The scientific challenge: Bridging the gap 1 tnntttccca cgaataaata atataagatt ttgattatta cctccttctt taattttatt 61 attatcaaga agattagttt ataaaggagt aggaacagga tgaactgttt atcctccttt 121 atctaataat ttatatcata atggattttc aactgattta gcaatttttt ctttacatat 181 tgcaggaata tcatcaatta taggagcaat taattttatt tcaacaattt taaatataca 241 tcataaaaat ttatcattag ataaaattcc attgttagtt tgatcaattt taattacagc 301 tattttatta ttattatctt tacctgtatt agcaggtgca attactatat tattaactga 361 tcgaaatcta aatacaactt tttttgatcc ttcgggtgga ggagatccaa ttttatatca 421 acatttattt
  • 26. Where do we stand?
  • 28. The bristlemouths are a rapacious family of deep-sea fishes that include the wildly successful genus Cyclothone In contrast, ichthyologists put the likely figure for bristlemouths at hundreds of trillions — and perhaps quadrillions, or thousands of trillions.
  • 29. The bristlemouths are a rapacious family of deep-sea fishes that include the wildly successful genus Cyclothone
  • 34. Get a copy of the Cyclothone paper Our contribution for a better understanding of biodiversity
  • 35. Access to ant taxonomic publications through antbase.org / Smithsonian Institution, currently including the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages). Source: Agosti 2005. Access
  • 36. • Limited access (copyright) • Limited discoverability of content • Research results cannot be cited • Data mining does not work Issues of access
  • 37. Provide an open access, linked corpus of taxonomic literature A solution
  • 40. Surfing or the imperative for science
  • 41. Surfing or the imperative for science
  • 42. Surfing or the imperative for science
  • 43. Surfing or the imperative for science: Use of name services [Figure labels: LOD, PDF, HNS]
  • 45. Create a citable open corpus of taxonomic publications
  • 48. Biodiversity Literature Repository: Record, Treatment, Illustration
  • 49. Legal issues http://plazi.org/wiki/Blue_List Patterson et al., 2014: http://dx.doi.org/10.1186/1756-0500-7-79
  • 50. Workflow Plazi SRS: find, scan, «OCR», markup, store + access
  • 51. From human to machine readable text. Text <tax:treatment> <tax:nomenclature> <tax:name> <tax:xid source="HNS" identifier="193329"/> <tax:xmldata> <dc:Genus>Mystrium</dc:Genus> <dc:Species>leonie</dc:Species> </tax:xmldata> Mystrium leonie </tax:name> <tax:status>n. sp.</tax:status> Fig 1 D - F </tax:nomenclature> <tax:div type="description"> <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margi to a sharp apical tooth, the apex parallel to the an (Holotype with material in mandibles, so mandibles a $ described below from paratypes.) Median clypeus .... </treatment> Semantically enhanced text (TaxonX) … alternatives: RDF
  • 52. Plazi tools: table extraction. «Treatment»: scientific species name, distribution record. Example table, Cataglyphis tartessica workers (variable: mean ± SD): Head length 11.23 ± 0.12; Head width 11.15 ± 0.12; Scape length 11.47 ± 0.12; Mesosoma length 11.94 ± 0.16; Femur length 12.03 ± 0.14; Cephalic index 0 93.60 ± 3.940; Scape index 128.10 ± 7.660
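A minimal sketch of what "machine readable" buys here, assuming a complete, well-formed version of the TaxonX fragment on the slide. The namespace URIs and the function name `read_nomenclature` are placeholders introduced for illustration (the slide shows only the `tax:` and `dc:` prefixes), and the truncated description text is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: the slide shows only the prefixes.
TAX = "http://example.org/taxonx"
DC = "http://example.org/dc"

# A completed version of the nomenclature part of the slide's fragment.
SAMPLE = f"""<tax:treatment xmlns:tax="{TAX}" xmlns:dc="{DC}">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>"""


def read_nomenclature(xml_text):
    """Return the tagged name parts and external identifier of a treatment."""
    ns = {"tax": TAX, "dc": DC}
    root = ET.fromstring(xml_text)
    name = root.find("tax:nomenclature/tax:name", ns)
    xid = name.find("tax:xid", ns)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=ns),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=ns),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=ns),
        "source": xid.get("source"),
        "identifier": xid.get("identifier"),
    }


record = read_nomenclature(SAMPLE)
print(record["genus"], record["species"], record["status"])
```

Once the name, its status and its name-server identifier are tagged, a program can read them directly instead of guessing them from running text.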
  • 53. Plazi tools: discovering of scientific names
  • 54. Plazi tools: discovering and parsing of bibliographic references
  • 55. Plazi tools: discovering and parsing of observation data
  • 56. Plazi tools: discovering of treatments
  • 57. The special case of taxonomic literature: the cited elements are treatments, not articles. Treatment: a well-defined part of an article that defines the particular usage of a scientific name by an authority at a given time (one or more pages in a publication). Example: Formica obsoleta Linnaeus, 1758: 580
  • 59. What is a treatment? Original combinations Reference to an original combination Subsequent usages of names cite the referenced treatment
  • 60. Treatment, treatment reference and citation Treatment citation Treatment references
  • 61. Treatment Citing of treatments or linking of treatments to treatments By minting persistent HTTP URIs for treatments, treatments can be cited like a bibliographic reference http://treatment.plazi.org/id/A9FFD1FC-4629-FFB4-968F-AD38386521BA
  • 62. Status quo • 50,000+ treatments live, growing daily • RDF in beta version • GoldenGate Imagine (PDF and text mining tool) in beta version • Data provider for NCBI, Wikidata, GBIF, EOL, antweb • Biodiversity Literature Repository functional
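The citation form on this slide, "Formica obsoleta Linnaeus, 1758: 580", can be split into its parts mechanically. The sketch below is illustrative only: the pattern and the names `TreatmentCitation` and `parse_treatment_citation` are assumptions, and it handles just the simple "Genus species Authority, year: page" shape, not subgenera, parentheses or multiple authors.

```python
import re
from typing import NamedTuple, Optional


class TreatmentCitation(NamedTuple):
    genus: str
    species: str
    authority: str
    year: int
    page: str


# Assumed shape: "Genus species Authority, YYYY: page" (optionally a page range).
CITATION_RE = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+(?P<species>[a-z-]+)\s+"
    r"(?P<authority>.+?),\s*(?P<year>\d{4}):\s*(?P<page>\d+(?:-\d+)?)$"
)


def parse_treatment_citation(text: str) -> Optional[TreatmentCitation]:
    """Parse a simple treatment citation; return None if it does not match."""
    m = CITATION_RE.match(text.strip())
    if m is None:
        return None
    return TreatmentCitation(
        m["genus"], m["species"], m["authority"], int(m["year"]), m["page"]
    )


print(parse_treatment_citation("Formica obsoleta Linnaeus, 1758: 580"))
```

The page number is what distinguishes a treatment citation from an article citation: it points at the part of the publication where the name is used.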
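As a small illustration of citing by URI, the sketch below builds and checks treatment URIs of the form shown on the slide. The base URL is taken from the slide; the 8-4-4-4-12 hexadecimal identifier pattern is inferred from that single example and is an assumption, as are the function names.

```python
import re

BASE = "http://treatment.plazi.org/id/"

# Assumed identifier shape, inferred from the one example on the slide.
ID_RE = re.compile(r"^[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12}$")


def _normalize(treatment_id: str) -> str:
    """Upper-case and validate an identifier, raising ValueError if malformed."""
    tid = treatment_id.strip().upper()
    if not ID_RE.match(tid):
        raise ValueError(f"not a treatment identifier: {treatment_id!r}")
    return tid


def treatment_uri(treatment_id: str) -> str:
    """Build the citable HTTP URI for a treatment identifier."""
    return BASE + _normalize(treatment_id)


def treatment_id_from_uri(uri: str) -> str:
    """Extract the identifier from a treatment URI."""
    if not uri.startswith(BASE):
        raise ValueError(f"not a treatment URI: {uri!r}")
    return _normalize(uri[len(BASE):])


print(treatment_uri("A9FFD1FC-4629-FFB4-968F-AD38386521BA"))
```

Because the URI is a plain string with a fixed prefix, it can be stored in a reference list or a link table exactly like a DOI.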
  • 63. Next steps • Collaborate with ContentMine to extract >50 treatments/day
  • 64. Next steps Planned collaboration with ContentMine to extract treatments on a daily basis http://www.slideshare.net/petermurrayrust/? BioDiv
  • 65. Next steps • Collaborate with ContentMine to extract 50 treatments/day • 1 million treatments live • RDF version accessible • GoldenGate Imagine (text mining tool) • Data provider for NCBI, GBIF, EOL, antweb • Biodiversity Literature Repository with 100,000 bibliographic references and digital copies (PDF, images, etc.)
  • 67. Next steps Avoid all this waste (our next generation will have to clean up)! Publish structured data Publish open access Publish in journals with DOI Add links to names, treatments, articles, DNA sequences, digital objects Help build your own corpus of citable data Pensoft journals (e.g. Biodiversity Data Journal, Zookeys, Phytokeys) are the gold standard.
  • 68. Thanks! Donat Agosti agosti@plazi.org Acknowledgment: Pensoft, Zenodo/CERN, NCBI, Wikidata, ContentMine