GBIF towards 2030
Photo: ForBio/GBIF training at Baikal lake September 2018, CC-BY Dag Endresen
UiO Natural History Museum in Oslo, Department of Research and
Collections, November 8th 2018 CC-BY Dag Endresen
GBIF data surpassed 1 billion species occurrence data points in July 2018
So what …? “What can we do with a billion data points that we could NOT do with,
say, a hundred million?” (GBIF Science Chair Rod Page on Twitter 4 July 2018).
The milestone was reached with this observation of a frilled anemone (Metridium dianthus) off Saint-Pierre and Miquelon, a French archipelago in the northwestern Atlantic.
#GBIF1Billion
Status 7 November 2018: 1 033 835 285 occurrences
Status 7th Nov 2018
Occurrence records 1 033 809 115
Datasets 41 536; Publishing institutions 1 305
GBIF is a success … so, do we just
continue to deliver more of the same?
Illustration by Rod Page (former GBIF Science committee chair) 5 July 2018.
Darwin Core
archive
Research
data
portals
GBIF provides a multiple-purpose data service
portal
Bio-Collections
GBIF provides a data discovery system that is dependent on resolvable stable identifiers for efficient functionality
global registry, data portal
Towards a new governance
model for GBIF-Norway
GBIF Participant Member Country
Key roles:
• Delegation
– Head of Delegation
– Delegation (max 3 people)
• GBIF Committee members
– Governing board
– Node (global and regional)
– Science Committee
– Budget Committee
• Participant Node
– Node Manager
– Node staff (min 4 people)
GBIF Participant Node
Key roles:
• Nodes Committee (min 2 committees)
– Global Nodes (GNC)
– Regional (Europe)
• Participant Node (min 4 people)
– Node Manager (coordination)
– IT developer (informatics)
– Data scientist (science support)
– Biodiversity informatics officer
GBIF.no towards a permanent
research infrastructure
Funding periods (15 years, 2005-2019, 50 MNOK)
• 2005-2007 (3 years, RCN 4,5 MNOK, total 5,6 MNOK)
• 2008-2011 (4 years, RCN 6,3 MNOK, total 12,1 MNOK)
• 2012-2016 (5 years, RCN 13,0 MNOK, total 20,0 MNOK)
• 2017-2019 (3 years, RCN 9,2 MNOK, total 12,1 MNOK)
• 2020 --> permanent long-term infrastructure
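The headline figures for the funding periods can be cross-checked with a few lines of arithmetic. This is a minimal sketch that simply re-enters the numbers listed above (with decimal points instead of decimal commas); the variable names are illustrative.

```python
# Funding periods as listed on the slide:
# (period, RCN grant in MNOK, total budget in MNOK)
periods = [
    ("2005-2007", 4.5, 5.6),
    ("2008-2011", 6.3, 12.1),
    ("2012-2016", 13.0, 20.0),
    ("2017-2019", 9.2, 12.1),
]

# Sum the RCN contribution and the total budget over all periods.
rcn_sum = round(sum(rcn for _, rcn, _ in periods), 1)
total_sum = round(sum(total for _, _, total in periods), 1)

# Count calendar years covered, inclusive of start and end year.
years = sum(
    int(end) - int(start) + 1
    for start, end in (period.split("-") for period, _, _ in periods)
)

print(f"{years} years, RCN {rcn_sum} MNOK, total {total_sum} MNOK")
```

The totals come to 15 years and 49.8 MNOK, which matches the rounded "15 years, 50 MNOK" in the slide heading; the RCN share sums to 33.0 MNOK.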
Forskningsrådet (RCN)
UiO Naturhistorisk museum
Artsdatabanken (NBIC)
GBIF-Norway Node consortium
Towards formalized sharing of Node responsibilities
Node team at NHM, University of Oslo
Dag Endresen, Node manager
Christian Svindseth, Data manager
Fridtjof Mehlum, Research director
Vidar Bakken, part-time (30%)
Artsdatabanken, Trondheim
Wouter Koch, node member
Nils Valland, board member
NTNU University Museum
Anders Finstad, GBIF Science committee
Solveig Bakken, board member
Research Council of Norway
Christian Wexels Riser
Per Backe-Hansen (until 2016)
Contact us at: helpdesk@gbif.no
Status 2018
Node team at NHM, University of Oslo
Dag Endresen, Node manager
Vidar Bakken, part-time
Vacancy, Data manager
Artsdatabanken, Trondheim
Wouter Koch, node member
Nils Valland, board member
Stein A. Hoem, IT Manager
NTNU University Museum
Anders Finstad, GBIF Science committee
Solveig Bakken, board member
Data scientist (?)
Norwegian Institute for Nature Research
Erlend B. Nilsen, Science ambassador
Roald Vang, IT Manager
Frank Hanssen, node member
Research Council of Norway
Research Infrastructure Team
GBIF towards 2030
GBIF Governing Board 2018, GB25, October 2018
GBIF Science Committee: “Focus Forward
on increase usage and relevance”.
Thomas M. Orrell (chair)
Smithsonian Institution, Washington, USA
Greg Riccardi (1st vice chair)
Florida State University, Tallahassee, USA
Anders G. Finstad (2nd vice chair)
NTNU University Museum, Trondheim, Norway
Philippe Grandcolas (3rd vice chair)
Muséum national d'Histoire naturelle, Paris, France
GBIF Science Committee
Almost 700 – about 2 papers a day
Peer-reviewed publications using GBIF-mediated data
Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
Who are currently using GBIF?
▇ Using GBIF data
▇ All citations
Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
GBIF Science Committee
We could focus on
increasing GBIF relevance
over here?
GBIF citation per category (status 2018-1-15)
● Consolidate data indexing
● Expand data models
● Build strong linkages with
reference catalogues
GBIF infrastructure directions
Bringing data together
brings science together
Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
Engaging the (wider) science community
● Proper recognition of data-users as GBIF
stakeholders.
● Engage and involve through teaching and
relevant tools (e.g. R).
● Enable nodes to engage more closely on
national / regional level.
Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
Recommendations from the GBIF Nodes chair
● Focus on people (Secretariat, Nodes, Publishers
and Users).
● Training, especially for new Node managers.
● Identify a mechanism locally to:
○ Take part in the GBIF work program.
○ Invest in more sustainable Nodes:
■ Stable funding
■ Capacitated staff
■ Development plan
Slide from the GBIF Nodes Committee Report (Andre
Heughebaert), GB25, Kilkenny, Ireland, October 2018
Slide from the GBIF Executive Secretary (Donald Hobern) Report, GB25, Kilkenny, Ireland, October 2018
https://www.gbif.org/strategic-plan
Priority 1: Empower global
network
Ensure that governments, researchers and users are
equipped and supported to share, improve and use data
through the GBIF network, regardless of geography,
language or institutional affiliation.
• Remove barriers to participation
• Increase benefits associated with publishing
biodiversity data
• Address capacity needs
GBIF Strategic plan 2017-2021
Priority 2: Enhance biodiversity
information infrastructure
Provide leadership, expertise and tools to support the
integration of all biodiversity information as an
interconnected digital knowledgebase.
• Coordinate vision and strengthen partnerships with major
biodiversity informatics initiatives
• Promote standardization and common mechanisms for
exchange of biodiversity data
• Provide stable and persistent data infrastructure to
support research
Priority 3. Fill data gaps
Prioritize and promote mobilization of new data resources
which combine with existing resources to maximize the
coverage, completeness and resolution of GBIF data,
particularly with respect to taxonomy, geography and time.
• Expand checklists to cover all taxonomic groups
• Identify and prioritize gaps in spatial and temporal data
• Engage institutions and researchers with
complementary data
Priority 4. Improve data quality
Ensure that all data within the GBIF network are of
the highest-possible quality and associated with
clear indicators enabling users to assess their origin,
relevance and usefulness for any application.
• Enhance automated data validation
• Implement tools for expert curation
• Provide clear quality indicators for all data
Priority 5. Deliver relevant data
Ensure that GBIF delivers data in the form and
completeness required to meet the highest-priority
needs of science and, through science, society.
• Engage with expert communities to manage data
to the highest quality possible
• Deliver well-organized and validated data to
support key applications
Does GBIF provide access to the
appropriate tools needed to
address the current challenges
for biological diversity?
If you have a hammer, everything looks like a nail …
Uncertainties, bias and errors are a problem when using GBIF data
Fitness for re-use
remains a major
challenge
• Darwin Core occurrence data provide different types of evidence for the occurrence of a species in time and space.
• Museum specimens & collections
• Material samples & sequence data
• Species or ecosystem monitoring data
• Citizen species observations
• … focus on adding new data types?
Focus efforts on Data standards?
Genomic Standards Consortium (GSC)
MIMARKS - Minimum information about a marker gene sequence
Biodiversity Information Standards (TDWG)
Darwin Core: occurrenceID, materialSampleID, eventID
Global Genome Biodiversity Network (GGBN)
79.2% (citizen science) observation data
14.6% specimens
Rapid increase in GBIF of (citizen science) observation data…!
Data for natural history specimens was the beginning and remains at the core of
GBIF’s scope
Focus efforts on collection specimens and vouchered and curated physical
samples?
(biobank-samples)
Troudet et al. (2018) The Increasing Disconnection of Primary Biodiversity Data from Specimens doi:10.1093/sysbio/syy044
Bias in distribution from uneven reporting efforts!
Distribution of species occurrence records made available to GBIF by citizen
science data providers. https://www.gbif.org/citizen-science
Chandler et al. (2017) Contribution of citizen science towards international biodiversity monitoring. Biological Conservation doi:10.1016/j.biocon.2016.09.004
Focus efforts on filling gaps in species distribution coverage?
Bias: Disproportionate representation of taxa.
Focus efforts on under-represented taxa?
Total ≈ 8.7 million species?
(excluding bacteria and micro-organisms)
Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How Many Species Are
There on Earth and in the Ocean? PLoS Biology doi:10.1371/journal.pbio.1001127
Caley J, Fisher R, and Mengersen K (2014) Global species richness estimates have not
converged. Trends in Ecology & Evolution doi:10.1016/j.tree.2014.02.002
Caley et al. 2014
The Catalogue of Life is a quality-assured checklist
of more than 1.8 million species known to science.
Focus efforts on mobilizing nomenclature resources?
New Species Concepts indexed in GBIF
Species concepts based on Operational Taxonomic Units (OTUs) (from
PlutoF UNITE) are indexed into the GBIF taxon backbone.
Species concepts based on BOLD barcode index numbers (BINs)
are indexed into the GBIF taxon backbone.
Focus efforts on mobilizing yet unnamed species concepts?
Capacity building: Data capture & data publishing
• Tajikistan, Belarus, Ukraine, Armenia & Norway
• UiO NHM, ForBio, GBIF Norway & GBIF Secretariat
• 64 students & staff trained
• 8 events over three years:
– 2018 Sep Oslo Kick-off
– 2019 Feb Minsk Belarus
– 2019 Jun Dushanbe Tajikistan
– 2019 Nov Minsk Belarus
– 2020 Apr Yerevan Armenia
– 2020 Oct Kiev Ukraine
– 2021 March Oslo Norway
• DIKU/SIU grant 2018–2021
Focus efforts on capacity building & training?
Focus efforts on tools and services to improve data quality?
Example of data cleaning workflow
verbatimEventDate:
18 Mayo 2016
year: 2016
month: 5
day: 18
eventDate: 2016-05-18
startDayOfYear: 139
endDayOfYear: 139
Source data → cleaning → DwC-Archive
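The date example in this workflow can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool used in the actual workflow: the function name `clean_event_date` and the month lookup table are assumptions made for the example, and only Spanish and English month names in the pattern "day month year" are handled. The output keys are the Darwin Core terms shown on the slide.

```python
import re
from datetime import date

# Hypothetical lookup table (assumption): Spanish and English month names only.
MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4, "mayo": 5, "junio": 6,
    "julio": 7, "agosto": 8, "septiembre": 9, "octubre": 10,
    "noviembre": 11, "diciembre": 12,
    "january": 1, "february": 2, "march": 3, "april": 4, "may": 5, "june": 6,
    "july": 7, "august": 8, "september": 9, "october": 10,
    "november": 11, "december": 12,
}


def clean_event_date(verbatim):
    """Parse a 'day month-name year' string into Darwin Core date terms.

    Returns None when the string cannot be interpreted, so that the
    record can be flagged for manual review instead of guessed at.
    """
    match = re.fullmatch(r"\s*(\d{1,2})\s+([^\W\d_]+)\s+(\d{4})\s*", verbatim)
    if match is None:
        return None
    day = int(match.group(1))
    month = MONTHS.get(match.group(2).lower())
    year = int(match.group(3))
    if month is None:
        return None
    try:
        parsed = date(year, month, day)  # rejects impossible dates
    except ValueError:
        return None
    day_of_year = parsed.timetuple().tm_yday
    return {
        "verbatimEventDate": verbatim,  # the original string is preserved
        "year": year,
        "month": month,
        "day": day,
        "eventDate": parsed.isoformat(),
        "startDayOfYear": day_of_year,
        "endDayOfYear": day_of_year,
    }


record = clean_event_date("18 Mayo 2016")
print(record)
```

Keeping `verbatimEventDate` next to the interpreted fields means the cleaning step stays reversible: a reviewer can always see what the source data actually said.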
Biased representation in country membership
Focus efforts on increasing the country membership coverage?
Low membership coverage in Asia and Africa
Asia (gap in data coverage)
Africa (gap in data)
Most data are from more recent dates
Focus on filling data coverage and gaps in space and time?
The total number of specimens in natural history collections worldwide is estimated at 1.2 to 3 billion.
(Ariño 2010; Duckworth et al. 1993)
GBIF indexes 876 million records –
including 128 million specimens
=> 4% to 10% coverage?
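The coverage range follows directly from the numbers on this slide. A minimal sketch of the arithmetic, using the 128 million specimen records and the 1.2 to 3 billion worldwide estimate quoted above (variable names are illustrative):

```python
# Figures quoted on the slide.
specimens_in_gbif = 128_000_000
estimate_low = 1_200_000_000   # lower estimate of specimens worldwide
estimate_high = 3_000_000_000  # upper estimate of specimens worldwide

# The smaller the worldwide total, the larger the share already indexed.
coverage_max = 100 * specimens_in_gbif / estimate_low
coverage_min = 100 * specimens_in_gbif / estimate_high

print(f"Coverage between {coverage_min:.1f}% and {coverage_max:.1f}%")
```

This gives roughly 4.3% to 10.7%, consistent with the rounded "4% to 10%" on the slide.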
Photo: Botany Collection, Algae, Smithsonian National
Museum of Natural History, by Chip Clark.
Focus efforts on services
for supporting digitizing of
legacy specimens?
Data fitness depends on data being
• accessible
• timely
• easy to read
• relevant
• consistent
• complete
• specific
• comprehensive
The true value of biodiversity
data can be measured by the
extent to which it is used.
Focus efforts on data re-use metrics and other incentives?
Community
peer-review
& annotation
New services for Annotating biodiversity data
Tschöpe et al. (2013) Annotating biodiversity data via the Internet.
"Machine learning algorithms have
successfully identified plant species in massive
herbaria just by looking at the dried
specimens. According to researchers, similar
AI approaches could also be used to identify the
likes of fly larvae and plant fossils"
Researchers trained... algorithms on more than 260,000 scans of
herbarium sheets, encompassing more than 1,000 species. The
computer program eventually identified species with nearly 80%
accuracy: the correct answer was within the algorithms’ top 5 picks
90% of the time. That, says (Penn State paleobotanist Peter) Wilf,
probably out-performs a human taxonomist by quite a bit.
Carranza-Rojas J, Goeau H, Bonnet P, Mata-Montero E, and Joly A (2017) Going deeper
in the automated identification of Herbarium specimens. BMC Evolutionary Biology
17:181. https://doi.org/10.1186/s12862-017-1014-z
Ledford H (2017) Artificial intelligence identifies plant species for science: Deep-learning
methods successfully classify thousands of herbarium samples. Nature News 11 August
2017. doi:10.1038/nature.2017.22442
Carranza-Rojas J, Joly A, Bonnet P, Goëau H, Mata-Montero E (2017) Automated
Herbarium Specimen Identification using Deep Learning. Proceedings of TDWG 1:
e20302. https://doi.org/10.3897/tdwgproceedings.1.20302
Focus efforts on new machine learning services?
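The two figures quoted for the herbarium study (nearly 80% accuracy, and the correct answer within the top 5 picks 90% of the time) are top-1 and top-5 accuracy. The sketch below shows how such a top-k metric is computed in general; it is not the evaluation code from the cited papers, and the function name and the toy species labels are made up for illustration.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Share of samples whose true label is among the k highest-ranked guesses.

    ranked_predictions: one list per sample, best guess first.
    true_labels: the correct label for each sample, in the same order.
    """
    if not true_labels or len(ranked_predictions) != len(true_labels):
        raise ValueError("need one non-empty ranking per true label")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy data (hypothetical labels): four specimens, three candidate species.
ranked = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["C", "B", "A"],
    ["A", "C", "B"],
]
truth = ["A", "A", "A", "B"]

top1 = top_k_accuracy(ranked, truth, k=1)
top2 = top_k_accuracy(ranked, truth, k=2)
print(top1, top2)
```

A top-k score is always at least as high as the top-1 score, which is why a classifier that is wrong on its first guess can still be useful as a shortlist for a human taxonomist.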
Future perspectives
The future is already here — it's
just not very evenly distributed
William Gibson
"Scientific irreproducibility
— the inability to repeat
others' experiments and
reach the same conclusion
— is a growing concern”.
Baker (2016) Nature
doi:10.1038/533452a
Open Access (OA): Research results distributed online and free
of costs or other barriers – often meaning free access to
research articles.
Open Science: researchers to share their methods, computer
code and research data in central data repositories.
Open Data: based on FAIR principles: findable, accessible,
interoperable and reusable (biodiversity) data - is the primary
objective of GBIF.
For full reproducibility we also need access to the physical
biological material – to be deposited in museum collections
and biobank-repositories.
"Scientific irreproducibility — the inability to repeat others'
experiments and reach the same conclusion” (Nature 2016)
"FAIR" data
• Findable
– assign persistent IDs, provide rich metadata, register in
a searchable resource (such as GBIF)
• Accessible
– Retrievable by their ID using a standard protocol,
metadata remain accessible even if data aren’t
• Interoperable
– Use formal, broadly applicable languages, use standard
vocabularies, qualified references (e.g. Darwin Core)
• Reusable
– Rich, accurate metadata, clear licences, provenance,
use of community standards (e.g. Dublin Core, EML)
www.force11.org/group/fairgroup/fairprinciples
• Wilkinson, M. D. et al. (2016) The FAIR Guiding Principles for scientific data
management and stewardship. Sci. Data 3:160018
[doi:10.1038/sdata.2016.18]
Slide source: OpenAIRE & EUDAT, CC-BY-4.0, 2013
Data Citation Principles
1. Data to be legitimate citable products of research.
2. Data citations giving scholarly credit and attribution.
3. In scholarly literature, whenever claims are based on data, data should
always be cited.
4. Persistent method for identification of data, that is machine actionable,
globally unique, universal.
5. Data citations facilitate access to data or at least to metadata.
6. Unique identifiers that persist even beyond the lifespan of the data.
7. Data citations identify and give access to the specific data that support
verification of the claim (provenance, time-slice, version).
8. Flexible, but attention to interoperability of practices across
communities.
Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014
Open research data
Forskningsrådet (2014). ISBN: 978-82-12-03361-0
The Research Council of Norway expects all research data from projects
funded by the Research Council to be made freely available as open data.
In some situations there can be valid and justified reasons for exceptions.
(2014)
Open Science
Kunnskapsdepartementet (2016)
EU (2016) Competitiveness Council, 26-27/05/2016
EU (2007) INSPIRE Directive
Norway is to be a careful pioneer in open access to research results.
Norway is to follow the EU's ambition of full open access to publicly
funded research by 2020.
Results of research supported by public and public-private funds freely available to and reusable by anyone.
ARCHIVING OF RESEARCH DATA AND MATERIAL SAMPLES (BIOBANK)
• Open archiving and sharing of data and physical material samples ensures that your research results are reproducible.
• Professional curation of data and material samples saves you research time, because you, your collaborators and others can find, understand and access your research data and samples.
• Sharing data and material samples gives your research wider dissemination and greater impact.
• Facilitating the reuse of research data and material samples strengthens open, curiosity-driven research and can lead to unexpected research breakthroughs!
GBIF towards 2030 (November 2018)

AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

GBIF towards 2030 (November 2018)

  • 1. GBIF towards 2030 Photo: ForBio/GBIF training at Baikal lake September 2018, CC-BY Dag Endresen UiO Natural History Museum in Oslo, Department of Research and Collections, November 8th 2018 CC-BY Dag Endresen
  • 2. GBIF data surpassed 1 billion species occurrence data points in July 2018 So what …? “What can we do with a billion data points that we could NOT do with, say, a hundred million?” (GBIF Science Chair Rod Page on Twitter 4 July 2018). With this observation of a frilled anemone (Metridium dianthus) off Saint-Pierre and Miquelon, a French archipelago in the northwestern Atlantic. #GBIF1Billion
  • 3. #GBIF1Billion Status 7 November 2018: 1 033 835 285 occurrences
  • 4. Status 7th Nov 2018 Occurrence records 1 033 809 115 Datasets 41 536; Publishing institutions 1 305
  • 5. GBIF is a success … so, do we just continue to deliver more of the same? Illustration by Rod Page (former GBIF Science committee chair) 5 July 2018.
  • 6. Darwin Core archive Research data portals GBIF provides a multiple-purpose data service portal Bio-Collections
  • 7. GBIF provides a data discovery system global registry data portal that is dependent on resolvable stable identifiers for efficient functionality
  • 8. Towards a new governance model for GBIF-Norway
  • 9. GBIF Participant Member Country Key roles: • Delegation – Head of Delegation – Delegation (max 3 people) • GBIF Committee members – Governing board – Node (global and regional) – Science Committee – Budget Committee • Participant Node – Node Manager – Node staff (min 4 people)
  • 10. GBIF Participant Node Key roles: • Nodes Committee (min 2 committees) – Global Nodes (GNC) – Regional (Europe) • Participant Node (min 4 people) – Node Manager (coordination) – IT developer (informatics) – Data scientist (science support) – Biodiversity informatics officer
  • 11. GBIF.no towards a permanent research infrastructure Funding periods (15 years, 2005-2019, 50 MNOK) • 2005-2007 (3 years, RCN 4,5 MNOK, total 5,6 MNOK) • 2008-2011 (4 years, RCN 6,3 MNOK, total 12,1 MNOK) • 2012-2016 (5 years, RCN 13,0 MNOK, total 20,0 MNOK) • 2017-2019 (3 years, RCN 9,2 MNOK, total 12,1 MNOK) • 2020 --> permanent long-term infrastructure Forskningsrådet (RCN) UiO Naturhistorisk museum Artsdatabanken (NBIC)
  • 12. GBIF-Norway Node consortium Towards formalized sharing of Node responsibilities
  • 13. Node team at NHM, University of Oslo Dag Endresen, Node manager Christian Svindseth, Data manager Fridtjof Mehlum, Research director Vidar Bakken, part-time (30%) Artsdatabanken, Trondheim Wouter Koch, node member Nils Valland, board member NTNU University Museum Anders Finstad, GBIF Science committee Solveig Bakken, board member Research Council of Norway Christian Wexels Riser Per Backe-Hansen (until 2016) Contact us at: helpdesk@gbif.no Status 2018
  • 14. Node team at NHM, University of Oslo Dag Endresen, Node manager Vidar Bakken, part-time Vacancy, Data manager Artsdatabanken, Trondheim Wouter Koch, node member Nils Valland, board member Stein A. Hoem, IT Manager NTNU University Museum Anders Finstad, GBIF Science committee Solveig Bakken, board member Data scientist (?) Norwegian Institute for Nature Research Erlend B. Nilsen, Science ambassador Roald Vang, IT Manager Frank Hanssen, node member Research Council of Norway Research Infrastructure Team
  • 16. GBIF Governing Board 2018, GB25, October 2018 GBIF Science Committee: “Focus Forward on increase usage and relevance”. Thomas M. Orrell (chair) Smithsonian Institution, Washington, USA Greg Riccardi (1st vice chair) Florida State University, Tallahassee, USA Anders G. Finstad (2nd vice chair) NTNU University Museum, Trondheim, Norway Philippe Grandcolas (3rd vice chair) Muséum national d'Histoire naturelle, Paris, France GBIF Science Committee
  • 17. Almost 700 – about 2 papers a day. Peer-reviewed publications using GBIF-mediated data. Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
  • 18. Who is currently using GBIF? ▇ Using GBIF data ▇ All citations Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018 We could focus on increasing GBIF relevance over here? GBIF citation per category (status 2018-1-15)
  • 19. ● Consolidate data indexing ● Expand data models ● Build strong linkages with reference catalogues GBIF infrastructure directions Bringing data together brings science together Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
  • 20. Engaging the (wider) science community ● Proper recognition of data-users as GBIF stakeholders. ● Engage and involve through teaching and relevant tools (e.g. R). ● Enable nodes to engage more closely on national / regional level. Slide from the GBIF Science Committee Report, GB25, Kilkenny, Ireland, October 2018
  • 21. Recommendations from the GBIF Nodes chair ● Focus on people (Secretariat, Nodes, Publishers and Users). ● Training, especially for new Node managers. ● Identify a mechanism locally to: ○ Take part in the GBIF work program. ○ Invest in more sustainable Nodes: ■ Stable funding ■ Capacitated staff ■ Development plan Slide from the GBIF Nodes Committee Report (Andre Heughebaert), GB25, Kilkenny, Ireland, October 2018
  • 22. Slide from the GBIF Executive Secretary (Donald Hobern) Report, GB25, Kilkenny, Ireland, October 2018
  • 24. Priority 1: Empower global network Ensure that governments, researchers and users are equipped and supported to share, improve and use data through the GBIF network, regardless of geography, language or institutional affiliation. • Remove barriers to participation • Increase benefits associated with publishing biodiversity data • Address capacity needs GBIF Strategic plan 2017-2021
  • 25. Priority 2: Enhance biodiversity information infrastructure Provide leadership, expertise and tools to support the integration of all biodiversity information as an interconnected digital knowledgebase. • Coordinate vision and strengthen partnerships with major biodiversity informatics initiatives • Promote standardization and common mechanisms for exchange of biodiversity data • Provide stable and persistent data infrastructure to support research GBIF Strategic plan 2017-2021
  • 26. Priority 3: Fill data gaps Prioritize and promote mobilization of new data resources which combine with existing resources to maximize the coverage, completeness and resolution of GBIF data, particularly with respect to taxonomy, geography and time. • Expand checklists to cover all taxonomic groups • Identify and prioritize gaps in spatial and temporal data • Engage institutions and researchers with complementary data GBIF Strategic plan 2017-2021
  • 27. Priority 4: Improve data quality Ensure that all data within the GBIF network are of the highest-possible quality and associated with clear indicators enabling users to assess their origin, relevance and usefulness for any application. • Enhance automated data validation • Implement tools for expert curation • Provide clear quality indicators for all data GBIF Strategic plan 2017-2021
  • 28. Priority 5: Deliver relevant data Ensure that GBIF delivers data in the form and completeness required to meet the highest-priority needs of science and, through science, society. • Engage with expert communities to manage data to the highest quality possible • Deliver well-organized and validated data to support key applications GBIF Strategic plan 2017-2021
  • 29. Does GBIF provide access to the appropriate tools needed to address the current challenges for biological diversity? If you have a hammer, everything looks like a nail …
  • 30. Uncertainties, bias and errors are a problem when using GBIF data. Fitness for re-use remains a major challenge.
  • 31. • Darwin Core occurrence data provide different types of evidence for the occurrence of a species in time and space. • Museum specimens & collections • Material samples & sequence data • Species or ecosystem monitoring data • Citizen species observations • … focus on adding new data types?
  • 32. Focus efforts on Data standards? Genomic Standards Consortium (GSC) MIMARKS - Minimum information about a marker gene sequence Biodiversity Information Standards TDWG Darwin Core occurrenceID materialSampleID eventID Global Genome Biodiversity Network (GGBN)
  • 33. 79.2% (citizen science) Observation data 14.6% specimens Rapid increase in GBIF of (citizen science) observation data…! Data for natural history specimens was the beginning and remains at the core of GBIF’s scope Focus efforts on collection specimens and vouchered and curated physical samples? (biobank-samples) Troudet et al. (2018) The Increasing Disconnection of Primary Biodiversity Data from Specimens doi:10.1093/sysbio/syy044
  • 34. Bias in distribution from uneven reporting efforts! Distribution of species occurrence records made available to GBIF by citizen science data providers. https://www.gbif.org/citizen-science Chandler et al. (2017) Contribution of citizen science towards international biodiversity monitoring. Biological Conservation doi:10.1016/j.biocon.2016.09.004 Focus efforts on filling gaps in species distribution coverage?
  • 35. Bias: Disproportionate representation of taxa. Focus efforts on under-represented taxa?
  • 36. Total ≈ 8.7 million species? (excluding bacteria and micro-organisms) Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How Many Species Are There on Earth and in the Ocean? PLoS Biology doi:10.1371/journal.pbio.1001127 Caley J, Fisher R, and Mengersen K (2014) Global species richness estimates have not converged. Trends in Ecology & Evolution doi:10.1016/j.tree.2014.02.002 Caley et al. 2014 The Catalogue of Life is a quality-assured checklist of more than 1.8 million species known to science. Focus efforts on mobilizing nomenclature resources?
  • 37. New Species Concepts indexed in GBIF Species concepts based on Operational Taxonomic Units (OTUs) (from PlutoF UNITE) are indexed into the GBIF taxon backbone. Species concepts based on BOLD barcode index numbers (BINs) are indexed into the GBIF taxon backbone. Focus efforts on mobilizing yet unnamed species concepts?
  • 38. Capacity building: Data capture & data publishing • Tajikistan, Belarus, Ukraine, Armenia & Norway • UiO NHM, ForBio, GBIF Norway & GBIF Secretariat • 64 students & staff trained • 8 events over three years: – 2018 Sep Oslo Kick-off – 2019 Feb Minsk Belarus – 2019 Jun Dushanbe Tajikistan – 2019 Nov Minsk Belarus – 2020 Apr Yerevan Armenia – 2020 Oct Kiev Ukraine – 2021 March Oslo Norway • DIKU/SIU grant 2018–2021 Focus efforts on capacity building & training?
  • 39. Focus efforts on tools and services to improve data quality?
  • 40. Example of data cleaning workflow. verbatimEventDate: 18 Mayo 2016; year: 2016; month: 5; day: 18; eventDate: 2016-05-18; startDayOfYear: 139; endDayOfYear: 139. Diagram labels: DwC-Archive, Source, Data cleaning
  • 41. Biased representation in country membership Focus efforts on increasing the country membership coverage? Low membership coverage in Asia and Africa
  • 42. Asia (gap in data coverage) Africa (gap in data) Most data are from more recent dates Focus on filling data coverage and gaps in space and time?
  • 43. The total number of specimens in natural history collections worldwide is estimated at 1.2 to 3 billion. (Ariño 2010; Duckworth et al. 1993) GBIF indexes 876 million records – including 128 million specimens => 4% to 10% coverage? Photo: Botany Collection, Algae, Smithsonian National Museum of Natural History, by Chip Clark. Focus efforts on services for supporting digitizing of legacy specimens?
  • 44. Data fitness depends on data being • accessible • timely • easy to read • relevant • consistent • complete • specific • comprehensive The true value of biodiversity data can be measured by the extent to which it is used. Focus efforts on data re-use metrics and other incentives?
  • 45.
  • 47. New services for Annotating biodiversity data Tschöpe et al. (2013) Annotating biodiversity data via the Internet.
  • 48. "Machine learning algorithms have successfully identified plant species in massive herbaria just by looking at the dried specimens. According to researchers, similar AI approaches could also be used to identify the likes of fly larvae and plant fossils" Researchers trained... algorithms on more than 260,000 scans of herbarium sheets, encompassing more than 1,000 species. The computer program eventually identified species with nearly 80% accuracy: the correct answer was within the algorithms’ top 5 picks 90% of the time. That, says (Penn State paleobotanist Peter) Wilf, probably out-performs a human taxonomist by quite a bit. Carranza-Rojas J, Goeau H, Bonnet P, Mata-Montero E, and Joly A (2017) Going deeper in the automated identification of Herbarium specimens. BMC Evolutionary Biology 17:181. https://doi.org/10.1186/s12862-017-1014-z Ledford H (2017) Artificial intelligence identifies plant species for science: Deep-learning methods successfully classify thousands of herbarium samples. Nature News 11 August 2017. doi:10.1038/nature.2017.22442 Carranza-Rojas J, Joly A, Bonnet P, Goëau H, Mata-Montero E (2017) Automated Herbarium Specimen Identification using Deep Learning. Proceedings of TDWG 1: e20302. https://doi.org/10.3897/tdwgproceedings.1.20302 Focus efforts on new machine learning services?
  • 49. Future perspectives The future is already here — it's just not very evenly distributed William Gibson
  • 50.
  • 51. "Scientific irreproducibility – the inability to repeat others' experiments and reach the same conclusion – is a growing concern". Baker (2016) Nature doi:10.1038/533452a
  • 52. Open Access (OA): Research results distributed online and free of costs or other barriers – often meaning free access to research articles. Open Science: researchers to share their methods, computer code and research data in central data repositories. Open Data: based on FAIR principles: findable, accessible, interoperable and reusable (biodiversity) data - is the primary objective of GBIF. For full reproducibility we also need access to the physical biological material – to be deposited in museum collections and biobank-repositories. "Scientific irreproducibility — the inability to repeat others' experiments and reach the same conclusion” (Nature 2016)
  • 53. "FAIR" data • Findable – assign persistent IDs, provide rich metadata, register in a searchable resource (such as GBIF) • Accessible – Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t • Interoperable – Use formal, broadly applicable languages, use standard vocabularies, qualified references (e.g. Darwin Core) • Reusable – Rich, accurate metadata, clear licences, provenance, use of community standards (e.g. Dublin Core, EML) www.force11.org/group/fairgroup/fairprinciples • Wilkinson, M. D. et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 [doi:10.1038/sdata.2016.18] Slide source: OpenAIRE & EUDAT, CC-BY-4.0, 2013
  • 54. Data Citation Principles 1. Data to be legitimate citable products of research. 2. Data citations giving scholarly credit and attribution. 3. In scholarly literature, whenever claims are based on data, data should always be cited. 4. Persistent method for identification of data, that is machine actionable, globally unique, universal. 5. Data citations facilitate access to data or at least to metadata. 6. Unique identifiers that persist even beyond the lifespan of the data. 7. Data citations identify and access the specific data that support verification of the claim (provenance, time-slice, version). 8. Flexible, but attention to interoperability of practices across communities. Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014
  • 55. Open research data Forskningsrådet (2014). ISBN: 978-82-12-03361-0 The Research Council of Norway expects all research data from projects funded by the Research Council to be made freely available as open data. In some situations there can be valid and justified reasons for exceptions. (2014)
  • 56. Open Science Kunnskapsdepartementet (2016) EU (2016) Competitiveness Council, 26-27/05/2016 EU (2007) INSPIRE Directive Norway is to be a careful pioneer in open access to research results. Norway to follow the ambition of EU on full open access to publicly funded research by 2020. Results of research supported by public and public-private funds freely available to and reusable by anyone.
  • 57. ARCHIVING OF RESEARCH DATA AND MATERIAL SAMPLES (BIOBANK) • Open archiving and sharing of data and physical material samples ensures that your research results are reproducible. • Professional curation of data and material samples saves you research time because you, your collaborators and others can find, understand and access your research data and samples. • Sharing data and material samples gives your research wider dissemination and impact. • Facilitating reuse of research data and material samples strengthens open, curiosity-driven research and can lead to unexpected research breakthroughs!
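
The data cleaning workflow on slide 40 turns a verbatim date string into the structured Darwin Core date terms. A minimal Python sketch of that single step follows. The function name `clean_event_date`, the Spanish month lookup table, and the assumption that the verbatim text always has the shape "day month year" are illustrative choices, not part of the slides or of any GBIF tool; a real workflow would need a far more tolerant parser.

```python
from datetime import date

# Hypothetical lookup table for Spanish month names (assumption: the source
# data uses Spanish, as in the slide's example "18 Mayo 2016").
SPANISH_MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4,
    "mayo": 5, "junio": 6, "julio": 7, "agosto": 8,
    "septiembre": 9, "octubre": 10, "noviembre": 11, "diciembre": 12,
}


def clean_event_date(verbatim: str) -> dict:
    """Derive Darwin Core date terms from a 'day month year' string."""
    day_text, month_text, year_text = verbatim.split()
    parsed = date(int(year_text), SPANISH_MONTHS[month_text.lower()], int(day_text))
    # A single-day event starts and ends on the same day of the year.
    day_of_year = parsed.timetuple().tm_yday
    return {
        "verbatimEventDate": verbatim,  # keep the original text untouched
        "year": parsed.year,
        "month": parsed.month,
        "day": parsed.day,
        "eventDate": parsed.isoformat(),  # ISO 8601, e.g. 2016-05-18
        "startDayOfYear": day_of_year,
        "endDayOfYear": day_of_year,
    }


record = clean_event_date("18 Mayo 2016")
for term, value in record.items():
    print(f"{term}: {value}")
```

Keeping `verbatimEventDate` alongside the derived terms means the original source value stays available for later checking, which is why the sketch returns it unchanged.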