This document discusses sharing data between repositories. It proposes using the BagIt format to package data and metadata together in a portable way. The document outlines a potential workflow where data is uploaded from a source repository to Dryad using BagIt over HTTP. Key points are using open standards like BagIt, OAI-ORE, and Pairtree to allow data to be shared while avoiding reliance on local conventions. Future directions discussed include using these standards to allow data files and resource maps to be dereferenced over HTTP.
Sharing data between repositories
1. Sharing Between Data Repositories
Kevin S. Clarke
ksclarke@nescent.org
Thanks to the Dryad Data Repository contributors and funders:
Ryan Scherle, Todd J. Vision, Hilmar Lapp (NESCent)
Ben Bosman, Mark Diggory, Kevin Van de Velde (@mire, Inc.)
NESCent
3. Generic vs. Specific Repos
Generic repositories:
✔ Easy submission
✔ Simple metadata
✔ Data is a “black box”
✔ No “orphaned” data
Specific repositories:
✔ Complex submission
✔ More useful metadata
✔ Well structured data
✔ Specific type of data
13. BagIt Disseminator
(implements DSpace PackageDisseminator)
[Diagram: building a bag from DSpace data. Labels: DSpace Metadata, XSLT Crosswalk, Dryad Application Profile, Dryad Publication, Dryad Data Package, Dryad Data File, Bag, Data from DSpace]
14. A BagIt Bag
bagit.txt
bag-info.txt
manifest-md5.txt
tagmanifest-md5.txt
data/
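The layout above can be sketched in a few lines of Python. This is a minimal illustration only, not the Dryad disseminator itself (which the slides describe as a DSpace PackageDisseminator). The function names `make_bag` and `md5_of`, the `BagIt-Version: 0.97` declaration, and the `Source-Organization` value are assumptions chosen for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> None:
    """Write a minimal bag: payload under data/, plus the four tag files.

    `payload` maps paths relative to data/ onto file contents.
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    for rel_name, content in payload.items():
        target = data_dir / rel_name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)

    # Bag declaration; the version number here is an assumption.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    # Free-form bag metadata; the value is a placeholder.
    (bag_dir / "bag-info.txt").write_text(
        "Source-Organization: Example Repository\n", encoding="utf-8"
    )

    # Payload manifest: one "<checksum>  <path>" line per file under data/.
    payload_files = sorted(p for p in data_dir.rglob("*") if p.is_file())
    manifest = "".join(
        f"{md5_of(p)}  {p.relative_to(bag_dir).as_posix()}\n"
        for p in payload_files
    )
    (bag_dir / "manifest-md5.txt").write_text(manifest, encoding="utf-8")

    # Tag manifest: checksums of the tag files written above.
    tag_names = ["bag-info.txt", "bagit.txt", "manifest-md5.txt"]
    tag_manifest = "".join(
        f"{md5_of(bag_dir / name)}  {name}\n" for name in tag_names
    )
    (bag_dir / "tagmanifest-md5.txt").write_text(tag_manifest, encoding="utf-8")
```

Calling `make_bag(Path("mybag"), {"datafile-1/ApineCYTB.nexus": b"#NEXUS\n"})` would produce a directory with the five entries listed on this slide. A receiving repository can recompute the checksums to confirm that the payload arrived intact.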
15. Dryad Data in the Bag
data/
  dryadpkg.xml
  dryadpub.xml
  datafile-1/
    dryadfile-1.xml
    ApineCYTB.nexus
  datafile-2/
    dryadfile-2.xml
    ApineDNA.nexus
17. Lessons Learned
✔ Just enough to get the job done and no more
✔ Fewer local conventions and more “standards”
✔ There will always be custom solutions
✔ Options are developing quickly in this space
18. Future Directions
Less reliance on local conventions
✔ Plan to use OAI-ORE and Pairtree(s) within BagIt
OAI-ORE: Because it's Linked Data
Pairtree Filesystem
✔ So we can dereference URIs in ORE Resource Maps
Example URI: http://dx.doi.org/10.5061/dryad.8343
URI prefix: http://dx.doi.org/10.5061/dryad.
Pairtree path: 83/43
File location: 83/43/Arctostaphylos.nex
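The mapping from URI to file path shown above can be sketched as follows. This is a simplified illustration: it only strips the URI prefix and splits the remaining identifier into two-character segments, and it omits the character escaping that the full Pairtree specification defines for identifiers. The function names `pairtree_path` and `locate_file` are hypothetical.

```python
def pairtree_path(identifier: str, segment_length: int = 2) -> str:
    """Split an identifier into fixed-length directory segments."""
    segments = [
        identifier[i:i + segment_length]
        for i in range(0, len(identifier), segment_length)
    ]
    return "/".join(segments)


def locate_file(uri: str, prefix: str, filename: str) -> str:
    """Map a URI onto a relative file path inside a Pairtree.

    The prefix is removed first, so only the local part of the
    identifier contributes directory segments.
    """
    if not uri.startswith(prefix):
        raise ValueError(f"URI does not start with prefix: {uri}")
    local_id = uri[len(prefix):]
    return f"{pairtree_path(local_id)}/{filename}"


path = locate_file(
    "http://dx.doi.org/10.5061/dryad.8343",
    "http://dx.doi.org/10.5061/dryad.",
    "Arctostaphylos.nex",
)
print(path)  # 83/43/Arctostaphylos.nex
```

Because the path is derived from the identifier alone, a server holding the bag could resolve a URI from an ORE Resource Map to a file without consulting a separate lookup table.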
19. Other Interesting Developments
DataONE
✔ Distributing data files and metadata
✔ May support packages in the future
“Dropbox of Bags”
or Bag replication network (BagNet?)
METS in Bags (in contrast to ORE)