R2R and BCO-DMO are linked oceanographic data repositories that provide metadata for datasets collected on ocean research vessels and expeditions. They use linked data to improve dataset discovery across repositories and to attribute datasets to contributors. R2R catalogs vessels and their instrumentation and contains over 530k triples, while BCO-DMO catalogs PI-submitted datasets, including deployments that are not vessel cruises, and contains over 2.1 million triples. The repositories overlap in contributors and some cruises, and both link metadata to external sources such as DBpedia.
1. R2R+BCO-DMO – Linked Oceanographic Datasets
Adila Krisnadhi (1,5), Robert Arko (2), Suzanne Carbotte (2), Cynthia Chandler (3),
Michelle Cheatham (1), Pascal Hitzler (1), Yingjie Hu (4), Krzysztof Janowicz (4),
Peng Ji (2), Nazifa Karima (1), Adam Shepherd (3), Peter Wiebe (3)
(1) Data Semantics Lab, Wright State University
(2) Lamont-Doherty Earth Observatory, Columbia University
(3) Woods Hole Oceanographic Institution
(4) Geography Department, University of California, Santa Barbara
(5) Faculty of Computer Science, Universitas Indonesia
Diversity++ 2015
Krisnadhi, et al Diversity++ 2015 1 / 13
2. Why Linked Data for Oceanography
Data proliferation
Increased number of repositories ⇒ increased heterogeneity.
Need to discover, access, and integrate data across repositories.
R2R & BCO-DMO are repositories.
Both hold datasets of field observations.
Linked data is for metadata of those datasets.
Linked data objective: starting point to enable dataset discovery.
Additional benefit: attribution of datasets to contributors in the form
of links.
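The idea of describing datasets with linked metadata and attributing them to contributors through links can be sketched with plain subject-predicate-object triples. This is a minimal illustration, not the vocabulary actually used by R2R or BCO-DMO: the `example.org` URIs and the helper `datasets_by_contributor` are hypothetical, and the choice of `dcterms:creator` and `dcat:Dataset` as predicate and class is an assumption made only for this example.

```python
# Hypothetical metadata triples; every example.org URI is a placeholder,
# not a real R2R or BCO-DMO identifier.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"      # assumed class
DCT_CREATOR = "http://purl.org/dc/terms/creator"        # assumed attribution link

repo_a = [
    ("http://example.org/a/dataset/1", RDF_TYPE, DCAT_DATASET),
    ("http://example.org/a/dataset/1", DCT_CREATOR, "http://example.org/person/jane"),
]
repo_b = [
    ("http://example.org/b/dataset/7", RDF_TYPE, DCAT_DATASET),
    ("http://example.org/b/dataset/7", DCT_CREATOR, "http://example.org/person/jane"),
    ("http://example.org/b/dataset/8", RDF_TYPE, DCAT_DATASET),
    ("http://example.org/b/dataset/8", DCT_CREATOR, "http://example.org/person/omar"),
]


def datasets_by_contributor(triples, person):
    """Return the sorted dataset URIs that link to `person` as creator."""
    datasets = {s for s, p, o in triples if p == RDF_TYPE and o == DCAT_DATASET}
    return sorted(
        s for s, p, o in triples
        if p == DCT_CREATOR and o == person and s in datasets
    )


# Because both repositories use the same person URI, merging their triples
# is enough to discover that person's datasets across repositories.
merged = repo_a + repo_b
found = datasets_by_contributor(merged, "http://example.org/person/jane")
print(found)
```

The point of the sketch is that a shared contributor URI acts as the join key: no repository-specific schema mapping is needed once both sides publish links to the same resource.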
3. Rolling Deck Repository (R2R)
Screen shot (10/10/2015) from: http://www.rvdata.us/catalog/Kilo_Moana
4. R2R
http://www.rvdata.us
Every NSF-funded cruise on a vessel in the
academic fleet creates an R2R record.
Environmental sensor data on-board vessels.
Catalog of vessels, instrumentation systems, expeditions, datasets,
investigators, organizations, funding awards, cruise reports, and
navigation tracks.
>530k triples, 25 in-service vessels, >4.3k cruises, >18 mil. archived
files
60,000 page views per month.
5. R2R: Architecture
Original picture from: http://www.rvdata.us/system/files/overview.png as displayed on (10/10/2015) at
http://www.rvdata.us/overview
7. Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Screen shot (10/10/2015) from: http://mapservice.bco-dmo.org/mapserver/maps-ol/index.php
8. BCO-DMO: Architecture
[Figure: BCO-DMO Data Management Architectural Overview, dated November 21, 2013]
Diagram components:
- Metadata database and web content
- Data servers (three shown)
- Geospatial access: MapServer (cartography); OpenLayers (interface; interrogate and draw features); ExtJS and other JavaScript libraries (environment); MySQL (metadata)
- BCO-DMO website: public access via Drupal; web content and metadata
- JGOFS/GLOBEC: backend data storage and retrieval
- Supporting software: Drupal; PHP, Perl; loading of navigation and date information into the Location table (Perl); report modules; NSF Tracker subsystem
- Data manager access: metadata and web content insert, update, delete, and display
- Perl library: Perl code calling the REST API via Drupal
Highlights:
- Text-based interface
- Geospatial (MapServer) interface
- Metadata database stored in Drupal CMS
- Distributed backend data management system
- Fitness-for-purpose tools in MapServer and JGOFS/GLOBEC
- Browser clients, also distributed
- Ability to support other data management backends
- Semantic elements (contributed vs. standard names)
- Advanced search using triple stores from several sources
- No login required
- Access to metadata
- Access to actual data
- Data manager interface via Drupal
- Direct transfer of data and metadata to the appropriate national archive, such as NODC, when data are final
Original picture from: http://www.bco-dmo.org/sites/default/files/BCO-DMO_System_Architecture.pdf as displayed on
10/10/2015
9. BCO-DMO
http://bco-dmo.org
The PI of an NSF-funded research expedition must
submit data from the expedition to
BCO-DMO.
The PI may bring their own instruments.
Catalog of datasets, instrumentation systems, measurement
parameters, investigators, organizations, funding awards, projects,
programs, and deployments.
Deployments involve more than just vessels (i.e., not just cruises).
>2.1 mil triples, 7,500 datasets including information about >1.7k
researchers, >2.1k deployments, 500 projects.
6.5k page views per month.
11. Overlaps
Only a few dozen oceanographic research vessels are deployed.
R2R is vessel-centric. BCO-DMO is PI-centric and covers more than just
cruises.
Overlapping sets of people and cruise identifiers (linked to each
other).
341 person instances (exact match)
External links
R2R organizations to DBpedia: 288/520
BCO-DMO instruments to DBpedia: 42/409
BCO-DMO organizations to DBpedia: 81/488
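To make the external-link counts easier to compare, the fractions above can be turned into coverage percentages. The sketch below assumes each pair reads as "instances linked to DBpedia / total instances of that type", which the slide does not state explicitly; the dictionary keys and the helper `coverage` are names introduced only for this illustration.

```python
# Counts copied from the slide, read as (linked, total); that reading
# is an assumption, not something the slide spells out.
links = {
    "R2R organizations": (288, 520),
    "BCO-DMO instruments": (42, 409),
    "BCO-DMO organizations": (81, 488),
}


def coverage(linked, total):
    """Percentage of instances that have an external link, to one decimal."""
    return round(100 * linked / total, 1)


coverage_pct = {name: coverage(*counts) for name, counts in links.items()}

for name, (linked, total) in links.items():
    print(f"{name}: {linked}/{total} = {coverage_pct[name]}%")
```

Under that reading, a little over half of the R2R organizations carry a DBpedia link (55.4%), compared with 16.6% of BCO-DMO organizations and 10.3% of BCO-DMO instruments.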