Persistent Identifier Services and their Metadata by John Kunze

Persistent identifiers (Pids) provide machine-actionable links to data and metadata that are vital to APIs (application programming interfaces) for publishing and citation. APIs are essentially request/response patterns that use Pids to reference things, and metadata to describe not only the things themselves but also any actions requested or taken. As a result, metadata design and standardization are wedded to API design and enhancement. With Pids as nouns and metadata as adjectives and qualifiers, Pid services play a key role in API implementation.

  1. Persistent Identifier Services and their Metadata. John Kunze, California Digital Library
  2. Decoding the title
      • persistent identifier services and their metadata
      • = things, actions, and descriptors
      • = nouns, verbs, and adjectives
      • = preserving and serving scholarly communication around data
      • (Context: scholarly research data)
  3. Obama
  4. An identifier is not a string of characters
      • An identifier is an association between a string and a thing.
      • An association is an opinion asserted by an authority.
      • Example 1: http://allrecipes.com/recipe/sauteed-fiddleheads
      • Example 2: 4CF3-57AB-2481-651D-D53D-Q
      • http://dx.doi.org/10.5072/4CF3-57AB-2481-651D-D53D-Q
      • http://dx.doi.org/10.5240/4CF3-57AB-2481-651D-D53D-Q
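The idea that an identifier is an association asserted by an authority, and not the string itself, can be illustrated with a minimal Python sketch. The `Association` and `Registry` names are hypothetical, and the "things" are placeholders, since the slide does not say what the two DOIs point to. The two prefixes and the shared suffix come from Example 2.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Association:
    """One authority's assertion that a string identifies a thing."""
    authority: str  # who asserts the association (here, a DOI prefix)
    string: str     # the identifier string
    thing: str      # what the authority says the string refers to


class Registry:
    """Hypothetical in-memory store of associations, keyed by (authority, string)."""

    def __init__(self) -> None:
        self._assertions: Dict[Tuple[str, str], Association] = {}

    def assert_association(self, authority: str, string: str, thing: str) -> None:
        # A later assertion by the same authority replaces the earlier one.
        self._assertions[(authority, string)] = Association(authority, string, thing)

    def resolve(self, authority: str, string: str) -> Optional[str]:
        # The string alone is not enough: the answer depends on who is asked.
        found = self._assertions.get((authority, string))
        return found.thing if found else None


registry = Registry()
suffix = "4CF3-57AB-2481-651D-D53D-Q"
# Same string, two authorities (the two DOI prefixes on the slide).
registry.assert_association("10.5072", suffix, "thing A (placeholder)")
registry.assert_association("10.5240", suffix, "thing B (placeholder)")

print(registry.resolve("10.5072", suffix))
print(registry.resolve("10.5240", suffix))
print(registry.resolve("10.9999", suffix))  # no such authority in this sketch
```

Keying the store on the pair (authority, string) is the design point: the same characters can name different things, or nothing at all, depending on which authority is consulted.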
  5. Identifier schemes (v1)
      • URL (Uniform Resource Locator): the first time poor id management is blamed on syntax
      • URN (Uniform Resource Name): first attempt to correct poor id management with syntax
      • Handle: second attempt to correct poor id management with syntax
      • DOI (Digital Object Identifier): third attempt to correct poor id management with syntax
      • ARK (Archival Resource Key): attempt to let id management be queryable (not yet realized)
  6. Identifier schemes (v2)
      • URL (Uniform Resource Locator): world’s first actionable id, now underlying all other types
      • URN (Uniform Resource Name): open infrastructure, not fully realized globally
      • Handle: closed infrastructure, fully realized globally
      • DOI (Digital Object Identifier): CrossRef enforces good id management, DataCite learning
      • ARK (Archival Resource Key): open infrastructure, realized locally and globally
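Because URLs underlie the other schemes, a bare identifier becomes actionable once it is prefixed with a resolver address. The sketch below shows one way this might look in Python. The `actionable_url` function is hypothetical, the `doi:`, `ark:`, and `hdl:` label conventions are assumptions of this sketch, and the ARK used in the usage lines is a made-up example. URNs are left out because, as the slide notes, their infrastructure is not fully realized globally.

```python
def actionable_url(identifier: str) -> str:
    """Turn a labeled identifier into a URL that a client could follow.

    Assumes public resolvers at doi.org, n2t.net, and hdl.handle.net;
    nothing is fetched here, the function only builds the string.
    """
    if identifier.startswith(("http://", "https://")):
        return identifier  # already a URL, hence already actionable
    if identifier.startswith("doi:"):
        return "https://doi.org/" + identifier[len("doi:"):]
    if identifier.startswith("ark:"):
        return "https://n2t.net/" + identifier  # the "ark:" label stays in the path
    if identifier.startswith("hdl:"):
        return "https://hdl.handle.net/" + identifier[len("hdl:"):]
    raise ValueError(f"no resolver known for {identifier!r}")


print(actionable_url("doi:10.5072/4CF3-57AB-2481-651D-D53D-Q"))
print(actionable_url("ark:/12345/x9example"))  # hypothetical ARK
```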
  7. If DOIs won, why talk about non-DOIs?
      • Cost
      • Open access
      • Changing nature of the DOI
      • Flexibility
  8. Types of identifier services
      • Repository: parking the bits
      • Data-aware dissemination: more than just returning parked bits
      • Citation management for end user researchers
      • Research tracking: measuring use and impact
      • Identifier creation, management, and resolution
  9. Many service tools, many APIs
      • Repository Tools: ArXiv *, Dataverse *, Fedora/Hydra, Dspace *, Eprints, DataONE, Merritt/Stash, figshare, Zenodo
      • Citation Management: Mendeley, Zotero
      • Metrics and Tracking: Altmetric, Impactstory, Thomson Reuters Data Citation Index, Elsevier Scopus
  10. API concepts
      • Application Programming Interface (API): how software talks to a service; unlike a Graphical User Interface (GUI), more like a Command Line Interface (CLI)
      • APIs and CLIs use language constructs: verbs, nouns, and qualifiers are “words”, and words form commands/requests/responses, which form scripts and programs.
  11. APIs are metadata sentences
      A command line interface powering an API interaction:
      $ sort mydata > sorted_data
      $ grep Smith sorted_data
      Smith, Sally 2014-04-01 406B
      Wong, Frank 2013-11-28 334
      $ wget --user=sam --no-check-certificate "https://n2t.net/a/ezid/b?set cost 25.50"
      status: ok
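The "sentence" in the wget example above can be assembled and taken apart programmatically. The following Python sketch is only an illustration: the base URL is copied from the slide, the `build_request` and `parse_request` helpers are hypothetical, and reading `set cost 25.50` as verb, element name, and value is an assumption about the slide's intent. No network request is made, and percent-encoding the spaces is a choice of this sketch (the slide shows literal spaces).

```python
from urllib.parse import quote, unquote, urlsplit

BASE = "https://n2t.net/a/ezid/b"  # taken from the slide's wget example


def build_request(verb: str, name: str, *qualifiers: str) -> str:
    """Join the words of a command into a query string on the base URL."""
    sentence = " ".join([verb, name, *qualifiers])
    return BASE + "?" + quote(sentence)  # spaces become %20


def parse_request(url: str) -> dict:
    """Split a request URL back into its verb, name, and qualifiers."""
    sentence = unquote(urlsplit(url).query)
    verb, name, *qualifiers = sentence.split()
    return {"verb": verb, "name": name, "qualifiers": qualifiers}


url = build_request("set", "cost", "25.50")
print(url)
print(parse_request(url))
```

Treating the query as a small sentence keeps the vocabulary (the verbs and element names) separate from the transport, which is the point the slide makes about APIs being built from metadata "words".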
  12. Problem: traditional standardization
      • Change by committee is ugly, costly, and slow
      • Example: Dublin Core, 15 cross-domain terms
      • Photo credit: European Parliament Technology - DG ITEC @ flickr
  13-17. The Metadata Universe, Jenn Riley, IU (five slides with identical text)
  18. An alternate metadata universe
      • Vision: one dictionary, one namespace
      • All research domains, any part of “metadata speech”
      • Names, values, units, relationships, ...
      • Search for terms, comment on terms, add terms, edit your terms, API for automated access
      • All terms with globally unique persistent identifiers
      • Available at yamz.net (yet another metadata zoo)
  19. YAMZ.net dictionary sociology
      • Crowd-sourced evolving vernacular terms, stable canonical terms, and deprecated terms
      • Use evolving terms depending on your risk tolerance
      • Reputation-based (gaming-resistant) voting means strong terms rise, weak terms decline
      • Applying lessons learned from Wikipedia, the Internet-Draft/RFC process, and StackOverflow
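To make reputation-based voting concrete, here is a minimal Python sketch of how reputation-weighted votes might move a term between the three classes named on the slide. It is not YAMZ's actual algorithm: the `Term` class, the scoring rule, the thresholds, and the voter names and reputations are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Term:
    name: str
    # voter -> +1 (up) or -1 (down); one entry per voter, so repeat votes overwrite
    votes: Dict[str, int] = field(default_factory=dict)


def score(term: Term, reputation: Dict[str, float]) -> float:
    """Sum each vote weighted by the voter's reputation (unknown voters count 0)."""
    return sum(direction * reputation.get(voter, 0.0)
               for voter, direction in term.votes.items())


def classify(term_score: float, promote_at: float = 5.0, deprecate_at: float = -5.0) -> str:
    """Map a score to a class; the thresholds are arbitrary assumptions."""
    if term_score >= promote_at:
        return "canonical"
    if term_score <= deprecate_at:
        return "deprecated"
    return "vernacular"


reputation = {"alice": 4.0, "bob": 2.0, "mallory": 0.1}  # hypothetical voters
term = Term("cost", votes={"alice": +1, "bob": +1, "mallory": -1})

term_score = score(term, reputation)
print(term_score, classify(term_score))
```

In this sketch the gaming resistance comes from two choices: a voter can hold only one vote per term, and a new or low-reputation account barely shifts the score.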
  20. Summary
      • Identifiers are not strings, but associations that break when things are not managed well
      • People can forget names because we can google, but APIs need persistent names for automation at scale
      • APIs are languages using metadata as “words”
      • Future API building will focus on vocabulary building; for example, yamz.net
      • Thank you! John.Kunze@ucop.edu
