Biological Science Collections Tagging and Tracking presented at SPNHC
1. BiSciCol: Biological Science Collections Tracker
Tracking Biodiversity Objects to Brokering Standards
Brian Stucky, University of Colorado, Boulder
John Deck, University of California, Berkeley
Lukasz Ziemba, University of Florida, Gainesville
Nico Cellinese, University of Florida, Gainesville
Rob Guralnick, University of Colorado, Boulder
BiSciCol Team:
Reed Beaman, Nico Cellinese, Jonathan Coddington, Neil Davies, John Deck, Rob
Guralnick, Bryan P. Heidorn, Chris Meyer, Tom Orrell, Rich Pyle, Kate Rachwal, Brian
Stucky, Rob Whitton, Lukasz Ziemba
Univ. Hawai’i
Univ. Arizona
Smithsonian
2. • National Science Foundation funded 2010 – 2014
• Infrastructure to tag & track specimens & derivatives in cyberspace
• Relies on globally unique identifiers (GUIDs) to track objects
• Implements a Linked Data approach
9. Borrowing from Facebook and social media…
Can we track relationships for Biological Objects as well?
10. A Biological Relationship Graph …
Taxonomic Type Filter
Class Filter: [X] Specimens, [ ] Tissues, [X] Sequences
Functions: [X] Infer Relationships Across Providers
11. Moorea Biocode Example: From field collection through
analysis, across multiple systems
Taxon (Taxon) Taxon*n Taxon
Key (Key) Blast*n Blast
(Biocode Event)
(metagenomic Sequencing)
(CAMERA Gut Sample Event)
(Essig Museum Specimen)
(GenBank Sequence)
(Smithsonian Tissue)
13. Globally unique identifiers:
• Globally unique (mandatory)
• Persistent (not mandatory, but very helpful)
• Resolvable (not mandatory, but very helpful)
Examples:
http://example.org/urn:lsid:example.org:specimen/7217D220-836A-11DF-8395-0800200C9A66
http://mycollection.org/specimen/JDeckSpecimen1
http://mycollection.org/specimen/uuid=7217D220-836A-11DF-8395-0800200C9A66
http://dx.doi.org/10.5072/FK2JW8GKM
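The three identifier properties above can be illustrated with a minimal Python sketch. It is not BiSciCol code: the function names (mint_identifier, looks_resolvable, extract_uuid) are hypothetical, the base URL is borrowed from the slide's examples, and "resolvable" is only approximated by checking that the identifier is an HTTP(S) URL. Persistence cannot be checked in code at all, since it depends on the provider keeping the identifier alive.

```python
import uuid
from urllib.parse import urlparse

# Base URL taken from the slide's examples; treat it as a placeholder.
BASE = "http://mycollection.org/specimen/"


def mint_identifier(base=BASE):
    """Build a new identifier from a random UUID (globally unique in practice)."""
    return base + str(uuid.uuid4()).upper()


def looks_resolvable(identifier):
    """Rough check only: is the identifier an HTTP(S) URL with a host?"""
    parts = urlparse(identifier)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def extract_uuid(identifier):
    """Return the UUID at the end of an identifier, or None if there is none."""
    tail = identifier.rstrip("/").split("/")[-1].split("=")[-1]
    try:
        return uuid.UUID(tail)
    except ValueError:
        return None
```

Under these assumptions, the slide's UUID-based examples yield a UUID from extract_uuid, while http://mycollection.org/specimen/JDeckSpecimen1 is resolvable in form but carries only a locally assigned name.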
15. ONE FINAL PIECE OF THE PUZZLE: GIVING BIRTH TO DATA IN THE RIGHT FORMAT FOR LINKING
16. “Triplifier” - creating the format for linking biological objects
Triplifier: create links from native data formats
Data sources shown: Darwin Core Archive, MySQL, KEMU
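To make the idea of "triplifying" concrete, here is a small Python sketch that turns one database row into subject-predicate-object statements in an N-Triples-like form. It does not reproduce the actual Triplifier tool: the function row_to_triples, the event base URL, and the sample row are assumptions made for illustration. The Darwin Core terms namespace is real, but using column names directly as Darwin Core terms is a simplification.

```python
# Darwin Core terms namespace; column names are assumed to match its terms.
DWC = "http://rs.tdwg.org/dwc/terms/"


def row_to_triples(row, id_column, base_uri, link_bases=None):
    """Convert one row (a dict) into N-Triples-like strings.

    link_bases maps a column name to the base URI of the objects it points to,
    so those values become links to other objects rather than plain literals.
    """
    link_bases = link_bases or {}
    subject = base_uri + str(row[id_column])
    triples = []
    for column, value in row.items():
        if column == id_column or value is None:
            continue
        predicate = DWC + column
        if column in link_bases:
            obj = "<" + link_bases[column] + str(value) + ">"
        else:
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            obj = '"' + escaped + '"'
        triples.append("<" + subject + "> <" + predicate + "> " + obj + " .")
    return triples


# Hypothetical row, reusing identifiers that appear elsewhere in the slides.
sample_row = {
    "occurrenceID": "JDeckSpecimen1",
    "catalogNumber": "MBIO99999",
    "eventID": "Biocode10234",
}
sample_triples = row_to_triples(
    sample_row,
    id_column="occurrenceID",
    base_uri="http://mycollection.org/specimen/",
    link_bases={"eventID": "http://mycollection.org/event/"},
)
```

The point of the link_bases argument is that the specimen's eventID becomes a reference to another identified object, which is what lets separate data sources be joined into one graph.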
19. Working with Locations: Tracking location in space of a moving individual (whales)
IndividualID1 EventID1 GeoreferenceID1
EventID2 GeoreferenceID2
EventID3 GeoreferenceID3
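One way to read this slide is that a single individual is linked to several events, and each event is linked to its own georeference. The Python sketch below follows those links to list the locations recorded for an individual. The link direction (individual to event to georeference) and the function name locations_for are assumptions, not something the slide states.

```python
from collections import defaultdict

# Parent -> child links, using the identifiers shown on the slide.
links = [
    ("IndividualID1", "EventID1"),
    ("IndividualID1", "EventID2"),
    ("IndividualID1", "EventID3"),
    ("EventID1", "GeoreferenceID1"),
    ("EventID2", "GeoreferenceID2"),
    ("EventID3", "GeoreferenceID3"),
]


def locations_for(individual, links):
    """Return (event, georeference) pairs for one individual, in link order."""
    children = defaultdict(list)
    for parent, child in links:
        children[parent].append(child)
    return [
        (event, georeference)
        for event in children[individual]
        for georeference in children[event]
    ]


track = locations_for("IndividualID1", links)
```

Here track holds the three (event, georeference) pairs for IndividualID1, which is the sequence of sightings one would plot for a moving animal.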
20. Data Impact Factor – Graph Metrics
Graphs:
[ ] GBIF Relations Graph
[X] Moorea Biocode
[X] SI MSNGR System
[+] Add New Graph
Collectors:
Gustav Paulay (102,000 direct children)
Christopher Meyer (83,000 direct children)
Craig Moritz (523 direct children)
Occurrences:
MBIO99999 (1024 total descendants)
IMBL8888888 (723 total descendants)
Events:
Biocode10234 (4234 direct children)
Expedition21234 (1023 direct children)
Chart: Cited occurrences over time
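The two metrics named on this slide, direct children and total descendants, can be computed from a relationship graph as in the sketch below. The toy graph and its identifiers are invented for illustration, and the adjacency-dict representation is an assumption; the slide does not say how BiSciCol stores or counts these relations.

```python
def direct_children(graph, node):
    """Count the distinct objects linked directly beneath a node."""
    return len(set(graph.get(node, ())))


def total_descendants(graph, node):
    """Count every distinct object reachable beneath a node.

    Uses an iterative depth-first walk with a visited set, so objects
    reachable through several paths are counted once.
    """
    seen = set()
    stack = list(graph.get(node, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    seen.discard(node)  # guard against cycles leading back to the start
    return len(seen)


# Hypothetical graph: expedition -> events -> specimens -> tissue -> sequence.
toy_graph = {
    "Expedition": ["EventA", "EventB"],
    "EventA": ["Spec1", "Spec2"],
    "EventB": ["Spec2"],
    "Spec1": ["Tissue1"],
    "Tissue1": ["Seq1"],
}
```

In this toy graph, "Expedition" has 2 direct children and 6 total descendants; Spec2 is reachable through both events but is counted only once.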
21. Why BiSciCol and Why SPNHC and Why Collaborations?
• New era of collections digitization
• new & derived data objects created, replicated, annotated
• BiSciCol tackles a preservation challenge for natural history collections:
• How to follow these digital objects
• How to link together objects and derivatives back to specimens
• BiSciCol is about community, collaborative practice
• Commitment to standards, ontologies
• Agreement on permanent, resolvable identifiers
• Triplification of data sources to enhance linked data