Valen Metadata and the [Data] Repository

This presentation was provided by Dan Valen of Figshare during the NISO Training Thursday, Metadata and the IR, held on February 23, 2017.

  1. 1. Dan Valen, figshare 2.2017 Metadata and the [Data] Repository
  2. 2. Why research data repositories? • Funder Mandates • Publisher Mandates • FACTS (aka, data or it didn’t happen) Source: http://datasharing.sparcopen.org/data
  3. 3. Drivers: Governments aligning to support public access to research. North America: OSTP Memo; NIH, NSF, NOAA, SI, and more…; Tri-Agency Open Access Policy; FASTR; other agency/funder policies, e.g. Gates, Moore, Sloan, NEH, Howard Hughes. UK / Europe: Concordat on Open Research Data (HEFCE, RCUK, UUK, Wellcome Trust), EPSRC, and more; Horizon 2020; OpenAIRE. Australia: ARC – data management plan expectation from researchers; Australian National Data Service – supporting the sector, Institutional Data Management Frameworks
  4. 4. Drivers: Nudge the system towards promoting research integrity
  5. 5. Source: http://www.dlib.org/dlib/january15/01guest_editorial.html Drivers: Data as a “first-class citizen”, Transparency, and the 3 Rs
  6. 6. Source: https://www.rd-alliance.org/sites/default/files/attachment/The%20Data%20Harvest%20Final.pdf
  7. 7. So what is the community doing to make data discoverable? Source: http://www.niso.org/apps/group_public/download.php/15375/PrimerRDM-2015-0727.pdf
  8. 8. Standards and the community Source: http://rd-alliance.github.io/metadata-directory/standards/ + http://www.dcc.ac.uk/resources/subject-areas/physical-science
  9. 9. Adhering to Force11 Joint Declaration of Data Citation principles (for discovery)… Source: https://www.force11.org/group/joint-declaration-data-citation-principles-final Source: https://thlibrary.wordpress.com/2013/08/13/spotlight-on-data-citation-index/ PREAMBLE Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice and is part of the scholarly ecosystem supporting data reuse. In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data within scholarly literature, another dataset, or any other research object.
  10. 10. Data FAIRport. N.p., n.d. Web. 20 June 2016.
  11. 11. What’s out there?
  12. 12. The Generalist Repository
  13. 13. A look at the generalist data repo Source: https://zenodo.org/dev#harvest-metadata
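Harvesting metadata from a repository such as Zenodo usually goes through OAI-PMH, where each request is a plain HTTP GET with a `verb` and a `metadataPrefix`. The sketch below only builds such a request URL. The base URL and the helper name `build_list_records_url` are assumptions made for illustration, so check the repository's own documentation for the actual endpoint.

```python
from urllib.parse import urlencode

# Assumed endpoint, used here for illustration only.
ZENODO_OAI_BASE = "https://zenodo.org/oai2d"


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional selective harvesting by set
    if from_date:
        params["from"] = from_date    # optional lower bound on the datestamp
    return f"{base_url}?{urlencode(params)}"


url = build_list_records_url(ZENODO_OAI_BASE, from_date="2017-02-01")
print(url)
```

The same helper works against any OAI-PMH base URL, since `verb`, `metadataPrefix`, `set`, and `from` are defined by the protocol rather than by a particular repository.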
  14. 14. Metadata at Zenodo: behind the scenes
  15. 15. Metadata at Zenodo: the “finished” product Source: https://zenodo.org/record/313122
  16. 16. Metadata at figshare: OAI-PMH
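To give a sense of what an OAI-PMH harvester receives, here is a minimal sketch that parses a `ListRecords` response in the `oai_dc` format. The XML sample is hand-written for illustration and is not actual figshare output; the record identifier, DOI, and the helper name `parse_records` are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative response only; identifiers and values are made up.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:12345</identifier>
        <datestamp>2017-02-23T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>https://doi.org/10.1234/example</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per record, with Dublin Core fields as lists."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        oai_id = record.findtext("oai:header/oai:identifier", namespaces=NS)
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        fields = {}
        if dc is not None:  # deleted records carry no metadata element
            for child in dc:
                name = child.tag.split("}", 1)[1]  # strip the namespace
                fields.setdefault(name, []).append((child.text or "").strip())
        records.append({"oai_identifier": oai_id, "dc": fields})
    return records


records = parse_records(SAMPLE_RESPONSE)
print(records[0]["dc"]["title"])
```

Fields are kept as lists because Dublin Core elements are repeatable: a record may have several creators or identifiers.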
  17. 17. Metadata at figshare: behind the scenes
  18. 18. Metadata at figshare: the “finished” product Source: https://doi.org/10.6084/m9.figshare.104616.v4
  19. 19. Data repos, metadata, and discovery Source: http://www.makeuseof.com/tag/technology-explained-what-is-a-meta-search-engine/
  20. 20. Index content for discovery in Google + Google Scholar. Can expose hidden content within published articles
  21. 21. Google and Datasets (tabular data and beyond) Source: https://developers.google.com/search/docs/data-types/datasets
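Dataset markup for search engines is typically expressed as schema.org `Dataset` JSON-LD embedded in the landing page. The following is a minimal sketch of generating such a block. The property names come from schema.org, while the helper names, the DOI, the URLs, and the author are placeholders, and the selection of properties is an assumption rather than a statement of what any search engine requires.

```python
import json


def dataset_jsonld(name, description, url, doi, creators, license_url):
    """Build a schema.org Dataset description as a plain dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": doi,
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "license": license_url,
    }


def as_script_tag(doc):
    """Wrap the JSON-LD in the script element used on landing pages."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values for illustration only.
doc = dataset_jsonld(
    name="Example dataset",
    description="Illustrative record; not a real deposit.",
    url="https://example.org/datasets/12345",
    doi="https://doi.org/10.1234/example",
    creators=["Jane Doe"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
tag = as_script_tag(doc)
print(tag)
```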
  22. 22. Additional ways to discover content: SHARE https://share.osf.io/discover?sources=figshare
  23. 23. Additional ways to discover content: DataCite + the importance of metadata Source: https://search.datacite.org
  24. 24. Mapping to different metadata standards is tough… Source: http://jennriley.com/metadatamap/
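One way to see why mapping is tough is to write even a tiny crosswalk. The sketch below maps a few Dublin Core elements onto DataCite-style property names and reports whatever does not fit. The mapping table is a simplified assumption for illustration and not an official crosswalk, and `crosswalk` is a hypothetical helper.

```python
# Simplified, unofficial mapping from Dublin Core elements to
# DataCite-style property names (assumption for illustration).
DC_TO_DATACITE = {
    "title": "titles",
    "creator": "creators",
    "publisher": "publisher",
    "identifier": "identifier",
    "subject": "subjects",
    "description": "descriptions",
    "rights": "rightsList",
}


def crosswalk(dc_record):
    """Split a Dublin Core record into mapped and unmapped fields."""
    mapped, unmapped = {}, {}
    for field, values in dc_record.items():
        if field == "date" and values:
            # Lossy step: only the year survives, and this assumes an
            # ISO 8601 date such as "2017-02-23".
            mapped["publicationYear"] = values[0][:4]
        elif field in DC_TO_DATACITE:
            mapped.setdefault(DC_TO_DATACITE[field], []).extend(values)
        else:
            unmapped[field] = values  # no obvious target property
    return mapped, unmapped


mapped, unmapped = crosswalk({
    "title": ["Example dataset"],
    "creator": ["Doe, Jane"],
    "date": ["2017-02-23"],
    "coverage": ["Boston"],
})
print(mapped)
print(unmapped)
```

Even in this toy case the date loses precision and `coverage` has nowhere obvious to go, which is the kind of friction the metadata map on this slide depicts.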
  25. 25. From a data model perspective, this problem has at least one solution • RDF (machine readable format to describe anything) • Ontology (a machine readable dictionary for getting meaning from RDF)
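The two bullets above can be sketched without any RDF library: a record becomes a set of subject, predicate, object triples, and a small "ontology" table says which predicates mean the same thing. The example is a toy built on assumptions: the subject DOI is a placeholder, and treating `schema:name` and `dcterms:title` as equivalent is a choice made for illustration. A real system would more likely use an RDF library and a published ontology.

```python
DCT = "http://purl.org/dc/terms/"
SCHEMA = "https://schema.org/"

# Toy ontology: maps a predicate to the one treated as canonical here.
EQUIVALENT = {SCHEMA + "name": DCT + "title"}

SUBJECT = "https://doi.org/10.1234/example"  # placeholder identifier

triples = [
    (SUBJECT, SCHEMA + "name", "Example dataset"),
    (SUBJECT, DCT + "creator", "Doe, Jane"),
]


def canonical(predicate):
    """Resolve a predicate through the toy ontology."""
    return EQUIVALENT.get(predicate, predicate)


def values(graph, subject, predicate):
    """Find objects for a predicate or anything declared equivalent to it."""
    wanted = canonical(predicate)
    return [o for s, p, o in graph if s == subject and canonical(p) == wanted]


def to_ntriples(graph):
    """Serialize to N-Triples, treating non-IRI objects as plain literals."""
    lines = []
    for s, p, o in graph:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# The record was written with schema:name, but a consumer asking for
# dcterms:title still finds it because of the ontology table.
print(values(triples, SUBJECT, DCT + "title"))
print(to_ntriples(triples))
```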
  26. 26. Challenges and next steps • Expose metadata as RDF or LD (linked data) and allow folks to build extensions or mappings between what the community has and other formats • Common metadata standards • (i.e.: from CRIS/RIM -> paper repo -> data repo -> through to) • OAI-PMH: doesn’t support versioning (it wasn’t designed with that in mind) and lacks information on the actual files related to metadata records, which might be of use for some people… so JSON-LD via the API? • Automated metadata capture and generation upon upload Image courtesy of sheffield.figshare.com
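The last bullet, automated metadata capture upon upload, can be illustrated with the technical metadata that a file reveals about itself. This is a minimal sketch using only the Python standard library. The function name and the returned field names are hypothetical and do not reflect how figshare or any other repository implements capture.

```python
import hashlib
import mimetypes
import os
import tempfile


def capture_file_metadata(path):
    """Collect basic technical metadata for an uploaded file."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    media_type, _ = mimetypes.guess_type(path)  # guessed from the extension
    return {
        "name": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "media_type": media_type or "application/octet-stream",
        "sha256": sha256.hexdigest(),
    }


# Simulate an upload with a small temporary CSV file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "observations.csv")
    with open(sample, "wb") as handle:
        handle.write(b"a,b\n1,2\n")
    metadata = capture_file_metadata(sample)

print(metadata)
```

File-level details like these are the information the slide notes is missing from OAI-PMH records, so capturing them at upload time leaves them available to expose through another channel.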
  27. 27. Thank you so much for your time dan@figshare.com @figshare figshare.com