Persistent Identifiers for Scholarly Assets and the Web:
The Need for an Unambiguous Mapping

Herbert Van de Sompel
@hvdsomp
Robert Sanderson
@azaroth42
Harihar Shankar
@hariharshankar
Martin Klein
@mart1nkle1n

Los Alamos National Laboratory
Acknowledgments
• Sean Bechhofer – University of Manchester
• Geoff Bilder – CrossRef
• Maarten Hoogerwerf – DANS
• Pete Johnston – Cambridge University
• Carl Lagoze – University of Michigan
• Michael L. Nelson – Old Dominion University
• Andrew Treloar – ANDS
• Simeon Warner – Cornell University

Van de Sompel, Sanderson, Shankar, Klein
IDCC 2014, San Francisco, CA, February 26 2014
Motivation
• Persistent/Persist-able Identifiers (PIDs) play a crucial role in the
identification of scholarly assets
• Motivated by concerns of long-term persistence, PIDs are minted
outside of the dominant web information access protocol, HTTP
• Value-added services targeted at humans and machines
assume/require resources identified by means of HTTP URIs
• Hence, an unambiguous bridge is required between:
• PID-oriented paradigm of research communication
• HTTP-oriented web, semantic web, linked data environment
• Preferably, such a bridge should work across PID systems
• Interoperability between PID systems

Status Quo of the PID/HTTP Bridge

HTTP HEAD != HTTP GET
• The expectation is that an HTTP HEAD on HTTP-URI-PID will yield
the same response (without body) as an HTTP GET
• Martin Fenner finds this is not always the case
• Not a CrossRef resolver problem, a publisher problem
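
The HEAD/GET expectation above can be checked mechanically. The following is a minimal sketch, not taken from the slides: the function names, the set of ignored headers, and the tuple layout are all assumptions made for illustration. `fetch` issues a single request without following redirects, and `differences` reports where a HEAD result deviates from a GET result.

```python
import http.client
from urllib.parse import urlsplit

# Headers that legitimately vary between two requests (an assumed, non-exhaustive set).
VOLATILE = {"date", "set-cookie", "age", "expires"}


def fetch(method, url):
    """Issue one request without following redirects; return (status, headers)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request(method, path)
        resp = conn.getresponse()
        resp.read()  # drain the body (empty for HEAD)
        # Repeated header names collapse to the last value; fine for a sketch.
        headers = {name.lower(): value for name, value in resp.getheaders()}
        return resp.status, headers
    finally:
        conn.close()


def differences(head, get, ignore=VOLATILE):
    """List (field, head_value, get_value) for every mismatch."""
    (h_status, h_headers), (g_status, g_headers) = head, get
    diffs = []
    if h_status != g_status:
        diffs.append(("status", h_status, g_status))
    for name in sorted((set(h_headers) | set(g_headers)) - set(ignore)):
        if h_headers.get(name) != g_headers.get(name):
            diffs.append((name, h_headers.get(name), g_headers.get(name)))
    return diffs
```

A caller would run `differences(fetch("HEAD", uri), fetch("GET", uri))` for an HTTP-URI-PID and treat a non-empty result as a sign that the two methods are not handled consistently somewhere along the redirect chain.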

Notation

• Asset identifier: PID
• Resolving URI: HTTP-URI-PID
• Redirect URI (landing page): HTTP-URI-LAND
• Location URI (content): HTTP-URI-LOC

Examples of Issues with the PID/HTTP Bridge
• Given an HTTP-URI-PID, how can a machine navigate towards the
actual content (i.e. not the landing page)?

• Given an HTTP-URI-LOC (of, say, an image), what is the PID of
the asset it belongs to?
• What is the URI of the Target of an Open Annotation that pertains to
a PID-identified asset (i.e. not to the landing page, not to the PDF,
the HTML, …)?

Requirements for the PID/HTTP Bridge
• Targeted at machines so richer applications (for humans and
machines) can emerge
• Follow your nose; typed links; RDF
• Support for bundling resources and describing those
resources to reflect that assets increasingly consist of multiple, not
just a single, resource
• Multiple HTTP-URI-LOCs can belong to a single PID
• Support for resource versioning, discovery of versions, access to
versions to reflect that resources used or created during the
research process are increasingly dynamic
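
The "follow your nose; typed links" requirement can be illustrated with a small parser for the HTTP `Link` header, which is where typed links are typically exposed to machines. This is a simplified sketch under stated assumptions: it handles only the common `<uri>; param="value"` form, the function names are hypothetical, and the example relation types are illustrative rather than the ones the slides prescribe.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(
    r'<([^>]*)>\s*((?:;\s*[\w*-]+\s*(?:=\s*(?:"[^"]*"|[^",;]+))?\s*)*)'
)
_PARAM = re.compile(r';\s*([\w*-]+)\s*(?:=\s*(?:"([^"]*)"|([^",;]+)))?')


def parse_link_header(value):
    """Return a list of {'target', 'rel', 'params'} dicts for a Link header value."""
    links = []
    for target, raw_params in _LINK.findall(value):
        params = {}
        for name, quoted, bare in _PARAM.findall(raw_params):
            params[name.lower()] = quoted if quoted else bare.strip()
        # rel may carry several space-separated relation types.
        rels = params.pop("rel", "").split()
        links.append({"target": target, "rel": rels, "params": params})
    return links


def targets_for(value, rel):
    """Return the target URIs of all links that carry the given relation type."""
    return [link["target"] for link in parse_link_header(value) if rel in link["rel"]]
```

With such a helper, a client that starts from any URI in the bridge can select the next hop by relation type instead of scraping a landing page.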

Evidence for these Requirements: Data Citation Principles
(4) Unique Identification: A data citation should include a persistent
method for identification that is machine actionable, globally
unique, and widely used in the community.
(5) Access: Data citations should facilitate access to the data
themselves and to such associated metadata, documentation,
code, and other materials, as are necessary for both humans and
machines to make informed use of the referenced data.
(7) Specificity and Verifiability: … Citations or citation metadata should
include information about provenance and fixity sufficient to
facilitate verifying that the specific timeslice, version and/or granular
portion of data retrieved subsequently is the same as was originally
cited.
A Proposed PID/HTTP Bridge
• A bridge goes in two directions:
• Uniform path from the PID of an asset to the asset’s constituent
resources, each identified by a distinct HTTP-URI-LOC
• Uniform path from the HTTP-URI-LOC of a constituent resource
of a scholarly asset to the PID of that asset
• In order to build the bridge, a rather basic question needs an answer
…

What is the Nature of the Resource Identified by HTTP-URI-PID?
• HTTP-URI-PID identifies the landing page HTTP-URI-LAND
• Interpretation supported by typical “302 Found” redirection
• HTTP-URI-PID identifies the asset identified by PID for the purpose
of web interactions
• Interpretation supported by:
• CrossRef display guideline that recommends using HTTP-URI-PID
in the online environment, replacing the prior practice of
using the PID
• CrossRef provides descriptive RDF metadata using “303
See Other” style content negotiation with HTTP-URI-PID
• The resource is conceptual, a so-called non-information
resource
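
The two interpretations above correspond to two observable resolver behaviours. The sketch below is an illustration only and does not reproduce any actual resolver: the function name, the set of RDF media types, and the URIs passed in are assumptions, and `Accept` parsing is deliberately naive (quality values are ignored).

```python
# Media types treated as a request for RDF metadata (an assumed, non-exhaustive set).
RDF_TYPES = {"application/rdf+xml", "text/turtle"}


def resolve(accept, landing_uri, description_uri):
    """Return (status, location) for a request on an HTTP-URI-PID.

    A client asking for RDF is sent to a description of the asset with
    303 See Other; any other client gets the typical 302 Found redirect
    to the landing page.
    """
    requested = {
        part.split(";")[0].strip().lower()
        for part in accept.split(",")
        if part.strip()
    }
    if requested & RDF_TYPES:
        return 303, description_uri
    return 302, landing_uri
```

For example, `resolve("text/turtle", landing, description)` yields a 303 to the description, while a browser-style `Accept: text/html` yields a 302 to the landing page.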

A Proposed PID/HTTP Bridge
• A bridge goes in two directions:
• Uniform path from the PID of an asset to the asset’s constituent
resources, each identified by a distinct HTTP-URI-LOC
• Uniform path from the HTTP-URI-LOC of a constituent resource
of a scholarly asset to the PID of that asset
• HTTP-URI-PID identifies the asset identified by PID for the purpose
of web interactions
• The proposed bridge builds on: HTTP, Cool URIs for the
Semantic Web, HTTP Links and Link Relation Types, OAI-ORE,
Memento
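
One way to picture the two directions of the bridge is as `Link` headers on the responses for each URI. The sketch below is hedged accordingly: the helper names and example URIs are hypothetical, and the relation types `item` and `collection` (both registered IANA link relations) are chosen purely for illustration; the deck itself lists the choice of relation as an open issue.

```python
def link_header(pairs):
    """Serialize (uri, rel) pairs into a Link header value."""
    return ", ".join(f'<{uri}>; rel="{rel}"' for uri, rel in pairs)


def bridge_headers(pid_uri, loc_uris):
    """Map each URI to the Link header value its responses could carry.

    PID -> LOC direction: the HTTP-URI-PID points at every constituent resource.
    LOC -> PID direction: every HTTP-URI-LOC points back at the HTTP-URI-PID.
    """
    headers = {pid_uri: link_header((loc, "item") for loc in loc_uris)}
    for loc in loc_uris:
        headers[loc] = link_header([(pid_uri, "collection")])
    return headers
```

Because both directions are expressed with the same mechanism, a client needs only a generic typed-link parser to walk from an asset to its parts or from a part back to its asset.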
Requirements for the PID/HTTP Bridge
✓ Targeted at machines
✓ Support for bundling resources
• Support for resource versioning

Common Resource Versioning Pattern

[Figure: a generic URI that always yields the most recent version, alongside version-specific URIs]

Resource Versioning
• This common resource versioning pattern can be used for
Aggregations (HTTP-URI-PID), Resource Maps (HTTP-URI-MACH),
Aggregated Resources (HTTP-URI-LOC, HTTP-URI-LAND)
• The pattern aligns with Memento, which offers modular
functionality for discovering and accessing resource versions using
HTTP headers (see Resource Versioning and Memento):
• Express datetime of a resource version
• Interlink resource versions
• Interlink resource version and the associated generic resource
• Access an overview of all resource versions
• Access a resource version that was current at a given datetime
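
The Memento functions listed above can be sketched in a few lines. This is an illustrative sketch, not an implementation of RFC 7089: the function names and URIs are hypothetical, the version list is assumed to be sorted by datetime, and falling back to the oldest version when the requested datetime predates all versions is an assumption about desired behaviour.

```python
import bisect
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(dt):
    """Format an aware UTC datetime the way HTTP date headers expect."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def select_version(versions, accept_dt):
    """Pick the version that was current at accept_dt.

    versions: list of (datetime, uri) tuples sorted by datetime.
    """
    dates = [dt for dt, _ in versions]
    index = bisect.bisect_right(dates, accept_dt) - 1
    return versions[max(index, 0)][1]


def memento_headers(generic_uri, timemap_uri, version_dt):
    """Headers a version-specific URI could return to express its datetime
    and to interlink it with the generic resource and the version overview."""
    return {
        "Memento-Datetime": http_date(version_dt),
        "Link": (
            f'<{generic_uri}>; rel="original", '
            f'<{timemap_uri}>; rel="timemap"'
        ),
    }
```

A client would send an `Accept-Datetime` request header to the generic URI; the server side could then use something like `select_version` to choose the version-specific URI to redirect to.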

Requirements for the PID/HTTP Bridge
✓ Targeted at machines
✓ Support for bundling resources
✓ Support for resource versioning

Open Issues
• Which ontologies for metadata, types, relationships? Cf. SURF
info-eu-repo, State of the LOD Cloud
• No URI schemes for PIDs
• PID/HTTP-URI-PID for each version; typically none that
always yields the current version

Open Issues
• Should it be owl:sameAs?
• Should it be rel="collection"?

References
• Martin Fenner. Challenges in automated DOI resolution. http://blog.martinfenner.org/2013/10/13/broken-dois/
• FORCE11 Data Citation Principles. http://force11.org/datacitation
• Cool URIs for the Semantic Web. http://www.w3.org/TR/cooluris/
• Web Linking, RFC 5988. http://tools.ietf.org/search/rfc5988
• IANA Link Relation Types. http://www.iana.org/assignments/link-relations/link-relations.xhtml
• OAI-ORE. http://www.openarchives.org/ore/1.0/
• Memento, RFC 7089. http://tools.ietf.org/html/rfc7089
• Resource Versioning and Memento. http://www.mementoweb.org/guide/howto/
• SURF info-eu-repo. http://purl.org/REP/standards/info-eu-repo
• State of the LOD Cloud. http://lod-cloud.net/state/

Editor's notes

  1. The suggestion is that the resource identified by HTTP-URI-PID is a non-information resource that corresponds to the scholarly asset as an intellectual object.