This presentation was given as a keynote during the CAML session at the SCAR Open Science Conference in Buenos Aires, August 2010. It is an introduction to polar data sharing, focusing on SCAR's Marine Biodiversity Information Network (www.scarmarbin.be) and the new Antarctic Biodiversity Information Facility (www.biodiversity.aq). Both projects aim to offer free and open access to raw scientific data on Antarctic biodiversity.
4. The Antarctic Treaty
« In order to promote international cooperation in
scientific investigation in Antarctica, as provided for in
Article III (1c) of the Treaty, the Contracting Parties agree
that, to the greatest extent feasible and practicable: […]
Scientific observations and results from Antarctica shall be
exchanged and made freely available. »
Thursday 5 August 2010
6. Mark Parsons [THANKS]
Discoverable
Data should be accessible soon after collection (online wherever possible) in a
discovery portal such as the Global Change Master Directory.
7. Mark Parsons [THANKS]
Open
Open Data is a philosophy and practice requiring that certain data are freely
available to everyone, without restrictions from copyright, patents or other
mechanisms of control.—Wikipedia
8. Mark Parsons [THANKS]
Linked
The term Linked Data is used to describe a method of exposing, sharing, and
connecting data [using] the Web.—Wikipedia
9. Mark Parsons [THANKS]
Useful
Data from different projects, disciplines, and data centers should be
easily understood and used in conjunction with each other in
standard tools and analysis frameworks
Data should be well described so as to be useful to a broad audience.
10. Mark Parsons [THANKS]
Interoperable
Metadata and data should be readily interchangeable between
different polar data systems to enable data discovery across multiple
portals.
11. Mark Parsons [THANKS]
Safe
Safe from hackers, from obsolescence, from undocumented change, from loss,
and from the ravages of time.
12. Some initial recommendations
for scientists and data centers
• Investigators should publish their IPY data immediately.
• The scientific community needs to recognize the value of good data
through citation, consideration of data publication in promotion and
tenure review, and by training young scientists in data management.
• Data centers must develop partnerships with other data centers in
other countries and other disciplines to enhance data accessibility and
interoperability.
• Data centers should partner with their scientific community to explicitly
meet their needs, provide easy submission tools, and make the data
more useful and integrated with other data.
Mark Parsons [THANKS]
13. SCAR-MarBIN
SCAR’s Marine Biodiversity Information Network
www.scarmarbin.be
14. General philosophy
• Build an electronic ecosystem
• Offer free and open access to data and technology
• Expose all the (biodiversity) data and metadata
• Remain community-driven
• Adopt strong standardization
• Work for science, conservation, management
15. SCAR-MarBIN
Marine Biodiversity Information Network
• www.scarmarbin.be
• Main funding: Belgian Science Policy Office
• International funding
• International Polar Year 2007/08
• Census of Antarctic Marine Life
• Ocean Biogeographic Information System
• Global Biodiversity Information Facility
16. The webportal
taxonomy, biogeography
visualization
open access
800,000 visitors
5,800,000 hits
39,000,000 downloaded records
V2 alpha coming up
17. The Register of Antarctic Marine Species
• The first RAMS
• Board of 60+ editors
• Feeds WoRMS, CoL and EoL
• 16,475 taxa
• 9,346 species
[Bar chart of all taxa, all species and valid species; axis from 0 to 15,000]
18. Biogeographic data
1,088,044 records
178 datasets
5,235 taxa
Feeds OBIS, GBIF
Downloadable
WebGIS
OGC Webservices
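To make "OGC Webservices" a little more concrete, here is a minimal sketch of how a client might build a WFS GetFeature request for occurrence records. The endpoint URL and the layer name are hypothetical placeholders, not the portal's real addresses; only the query parameters follow the key-value convention of OGC WFS 1.1.0.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_wfs_getfeature_url(endpoint, type_name, bbox, max_features=100):
    """Build a WFS 1.1.0 GetFeature URL using key-value-pair encoding.

    bbox is (min_x, min_y, max_x, max_y); the axis order actually expected
    depends on the server and coordinate reference system.
    """
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "bbox": ",".join(str(v) for v in bbox),
        "maxFeatures": str(max_features),
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = build_wfs_getfeature_url(
    "https://example.org/geoserver/wfs",
    "demo:occurrences",
    (-180, -90, 180, -60),  # everything south of 60 degrees S
)
print(url)

# Decode the query string again to check what the server would receive.
decoded = parse_qs(urlparse(url).query)
print(decoded["bbox"])
```

The same pattern would apply to other OGC services such as WMS, with a different set of request parameters.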
27. SCAR-MarBIN V2
• 100% Open Source
• Geo-oriented
• improved search engine
• improved interface, direct access to data
• all resources available through Webservices
• alpha version to be deployed soon
33. ANTABIF
Antarctic Biodiversity Information Facility
www.biodiversity.aq
34. ANTABIF
Antarctic Biodiversity Information Facility
• www.biodiversity.aq
• Funding: Belgian Science Policy Office
• International Year of Biodiversity
• Single access to all Antarctic biodiversity data
• Australian Antarctic Division
• Global Biodiversity Information Facility
35. ANTABIF
Technology: 100% Open Source
• Search engine: Full text (Solr/Lucene)
• Database: PostgreSQL
• GIS: GeoServer, PostGIS, OpenLayers
• Web services: RESTish (all resources)
• Protocols: DIF, Darwin Core, Darwin Core Archive, TAPIR, etc.
• GBIF tools: HIT, IPT, etc.
• Hosting: BBPF
• Metadata system: Global Change Master Directory
47. [projects] some examples
Antarctic Field Guides
Georeferenced genetic data
Polar Macroscope Synthesis
Biogeographic Atlas of the SO
Scratchpads
49. What is a Scratchpad?
A Toolkit for our community Vincent Smith [THANKS]
1. Your data
2. Uploaded & tagged
3. “Published” & reviewed on your site
Fast, intuitive, fit for use
50. Scratchpads Vincent Smith [THANKS]
A multi-site implementation of Drupal
http://scratchpads.eu
52. Taxonomy Vincent Smith [THANKS]
Taxonomy import,
management and
navigation
53. Bibliographic data Vincent Smith [THANKS]
Reference manager /
Endnote support for
bibliographies
54. Images Vincent Smith [THANKS]
Image galleries,
image upload &
annotation
55. Phylogeny Vincent Smith [THANKS]
Nexus / Newick import for
visualizing phylogenies
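For readers unfamiliar with the Newick format mentioned here, the following is a minimal sketch of a parser for simple Newick strings. It is not how Scratchpads implements its import; the function names and the toy tree are invented for illustration, and quoted labels and comments are not supported.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts.

    Handles node names and branch lengths only (no quoted labels, no comments).
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume '('
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                    continue
                if text[pos] == ")":
                    pos += 1  # end of this group of children
                    break
                raise ValueError(f"Unexpected {text[pos]!r} at position {pos}")
        # Read the (possibly empty) label up to the next structural character.
        start = pos
        while text[pos] not in ",();:":
            pos += 1
        name = text[start:pos].strip()
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError("Unexpected content before final ';'")
    return root


def leaf_names(node):
    """Collect the names of all terminal nodes, left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


# Toy tree with made-up labels and branch lengths.
tree = parse_newick("((A:0.1,B:0.2)AB:0.3,C:0.5);")
print(leaf_names(tree))
```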
56. Character Matrices Vincent Smith [THANKS]
Molecular & morphological character matrices (discrete,
morphometric and text characters)
57. Distribution Maps Vincent Smith [THANKS]
Presence/absence
country maps
58. Specimens & locations Vincent Smith [THANKS]
Specimen & location
records (DwC)
59. Pages, Forums, Blogs, Newsletters
Vincent Smith [THANKS]
Static web pages
Web fora with e-mail integration
Newsletters with e-mail integration
User blogs
60. Mass Import Vincent Smith [THANKS]
Import from CSV text file to any content type
61. Multilingual Support Vincent Smith [THANKS]
Create & switch between content in any language
62. Taxon Pages Vincent Smith [THANKS]
Integrating data & “publishing” in a Scratchpad
Including 3rd party content
86. METADATA
• WHAT?
• Contact, abstract, citation
• much more can be entered
• HOW?
• online form: Datasets>Submit a dataset
• download the data toolkit, fill it in, then email it to me
• contact me
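As a rough illustration of the minimum metadata listed above, the sketch below checks that a record carries a contact, an abstract and a citation. The dictionary keys and the sample record are assumptions made for this example; they are not the field names of the online form or of the Global Change Master Directory.

```python
# Assumed key names for the three minimum items named on the slide.
REQUIRED_FIELDS = ("contact", "abstract", "citation")


def missing_metadata_fields(record):
    """Return the required fields that are absent or blank in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record, invented for illustration.
record = {
    "contact": "data.manager@example.org",
    "abstract": "Occurrence records from a benthic survey.",
    "citation": "",
}
missing = missing_metadata_fields(record)
print(missing)
```

A record that passes this check is only the starting point; as the slide says, much more can be entered.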
92. Your data is
DISCOVERABLE
93. DATA
• WHAT?
• ScientificName, Latitude, Longitude, Date
• much more can be added
• HOW?
• enter the metadata (!!!!)
• use the data toolkit (DTK), your own tools, or more advanced
techniques
• contact your data center or me
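The four minimum fields above can be checked before submission with a few lines of code. This is a sketch under stated assumptions: the column headers are taken from the slide, coordinates are assumed to be decimal degrees, and dates are assumed to be written as YYYY-MM-DD, which the slide does not specify. The sample rows are invented.

```python
import csv
import io
from datetime import date

REQUIRED_COLUMNS = ("ScientificName", "Latitude", "Longitude", "Date")


def validate_row(row):
    """Return a list of problems found in one occurrence record (empty if none)."""
    problems = []
    for column in REQUIRED_COLUMNS:
        if not (row.get(column) or "").strip():
            problems.append(f"missing {column}")
    if problems:
        return problems  # no point checking values that are not there
    try:
        lat = float(row["Latitude"])
        lon = float(row["Longitude"])
        if not -90 <= lat <= 90:
            problems.append("Latitude out of range")
        if not -180 <= lon <= 180:
            problems.append("Longitude out of range")
    except ValueError:
        problems.append("Latitude/Longitude not numeric")
    try:
        date.fromisoformat(row["Date"].strip())  # assumed YYYY-MM-DD
    except ValueError:
        problems.append("Date not in YYYY-MM-DD format")
    return problems


# Toy input, invented for illustration.
sample = io.StringIO(
    "ScientificName,Latitude,Longitude,Date\n"
    "Euphausia superba,-62.5,-58.9,2008-01-15\n"
    "Pygoscelis adeliae,-97.0,140.0,2008-02-01\n"
    ",-70.1,10.0,2008-03-10\n"
)
report = [validate_row(row) for row in csv.DictReader(sample)]
print(report)
```

Rows with an empty problem list are ready to be mapped to whatever exchange format the data center uses.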
107. Food for thought
• it's a time of rapid change, need for timely change
in our behavior regarding data publication:
NORMS
• IPY/CAML’s virtual legacy
• save the whales? save the birds? SAVE THE DATA
FIRST! [in as many contexts as possible]
• flying across disciplines is exciting!
• information networks are organic
• this is just a beginning!