@hvdsomp                                                                   #idcc13




           Wanderer above the Sea of Fog – Caspar David Friedrich (1818)
            http://en.wikipedia.org/wiki/Wanderer_above_the_Sea_of_Fog
Herbert Van de Sompel
IDCC 2013, Amsterdam, The Netherlands, January 16 2013
The Scholarly Record is Changing

•  The scholarly record is extending with a wide range of non-traditional
   assets emerging from eScience and eHumanities
     •  e.g. datasets, software, ontologies, workflows, online debate,
        slides, blogs, videos, etc.

•  Many of these non-traditional assets:
    •  Have a wide range of relationships with and dependencies on
       other assets – grouping assets
    •  Are becoming increasingly dynamic, and do not have the sense
       of fixity that traditional assets such as journal articles or books
       have – versioning assets




grouping assets




   versioning assets




discovering assets




OAI (1999)

•  OAI was a heroic effort to fundamentally transform scholarly
   communication
    •  By promoting communication via preprints, non-peer-reviewed papers

•  The OAI took a technical approach to achieve the goal
    •  Make preprints easier to discover, access – Protocol for Metadata
       Harvesting
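
As a rough illustration of what metadata harvesting looks like in practice, the sketch below builds OAI-PMH request URLs. The base URL `http://repository.example.org/oai`, the resumption token, and the helper name `oai_request` are hypothetical; the verb and argument names (`ListRecords`, `metadataPrefix`, `from`, `resumptionToken`) are those of OAI-PMH 2.0.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; a harvester only needs this one base URL.
BASE_URL = "http://repository.example.org/oai"

def oai_request(verb, **arguments):
    """Build an OAI-PMH request URL: base URL plus verb and key=value arguments."""
    params = {"verb": verb}
    for key, value in arguments.items():
        # "from" is a Python keyword, so callers pass from_ and it is renamed here.
        params[key.rstrip("_")] = value
    return BASE_URL + "?" + urlencode(params)

# Ask for all Dublin Core records changed since a given day.
first_page = oai_request("ListRecords", metadataPrefix="oai_dc", from_="2013-01-01")
# Follow-up pages are requested with only the token returned by the previous response.
next_page = oai_request("ListRecords", resumptionToken="hypothetical-token-123")

print(first_page)
print(next_page)
```

Every request is a plain HTTP GET on the base URL, which is why a harvester needs no special client library.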




Don’t trust HTTP

[Diagram labels: HTTP GET on record identifier; An HTTP link; Just another HTTP baseURL]



grouping assets




   versioning assets




OAI-ORE (2007)

•  OAI-ORE observation: Scholarly assets are rapidly becoming compound,
   consisting of multiple resources with various:
    •  Relationships
    •  Interdependencies

•  How to convey this compound-ness in an interoperable manner so that
   applications can access, consume such assets?
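
One way to picture the answer OAI-ORE gives is a Resource Map that describes an Aggregation and lists the resources it aggregates. The sketch below expresses that as plain triples; all `example.org` URIs and the function name `resource_map` are hypothetical, and only the two predicate URIs (`ore:describes`, `ore:aggregates`) are taken from the ORE vocabulary.

```python
ORE = "http://www.openarchives.org/ore/terms/"

def resource_map(rem_uri, aggregation_uri, aggregated_uris):
    """Return (subject, predicate, object) triples for a minimal Resource Map."""
    # The Resource Map describes exactly one Aggregation...
    triples = [(rem_uri, ORE + "describes", aggregation_uri)]
    # ...and the Aggregation enumerates the resources that belong to the asset.
    for uri in aggregated_uris:
        triples.append((aggregation_uri, ORE + "aggregates", uri))
    return triples

triples = resource_map(
    "http://example.org/rem/paper-1",
    "http://example.org/aggregation/paper-1",
    [
        "http://example.org/paper-1.pdf",
        "http://example.org/paper-1/dataset.csv",
        "http://example.org/paper-1/slides.ppt",
    ],
)

for s, p, o in triples:
    # N-Triples-style output, one statement per line
    print(f"<{s}> <{p}> <{o}> .")
```

An application that understands these two predicates can find every part of the compound asset without knowing anything else about the hosting repository.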




See e.g. http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/8/index.html



grouping assets




   versioning assets




Memento (2009)

•  Memento is about the Web and time:
    •  Resources evolve over time
    •  Only the current representation is available from a resource’s URI
    •  How to seamlessly access prior representations, if they exist?

•  Memento looks at this problem for the Web, in general




   Digital Preservation Award 2010


Memento (2009)

•  Memento has potential consequences for scholarly communication

•  Observation: Scholarly assets are becoming increasingly dynamic, and do
   not have the sense of fixity that traditional assets such as journal
   articles or books have
    •  Even traditional assets are becoming increasingly dynamic and
       dependent on other assets, which may themselves be dynamic




Scientific Workflows, Services, Data, Workflow Engines




Carole Goble, JCDL 2012 Keynote https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt

From The Version of Record to A Version of the Record

  •  The ever-evolving nature of some assets challenges the notion of
     fixity as “forever frozen” and calls for considering the notion of the
     “state of the scholarly record at a specific moment in time”

  •  It will become essential to be able to determine what the state of
     related and interdependent assets was at certain moments in time




Two Perspectives on Memento




   Web Archive




URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/

URI-R - http://www.cnn.com/
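
The URI-M above embeds both the archival datetime and the URI-R. A minimal sketch of pulling them apart, assuming the `/web/<14-digit timestamp>/<original URI>` pattern visible in this example and reading the timestamp as UTC. The function name `split_wayback_urim` is ours, and other systems (for instance a CMS with revision identifiers) use different URI-M patterns that this does not cover.

```python
import re
from datetime import datetime, timezone

WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")

def split_wayback_urim(uri_m):
    """Split a Wayback-style URI-M into (archival datetime, URI-R)."""
    match = WAYBACK_PATTERN.match(uri_m)
    if match is None:
        raise ValueError("not a Wayback-style URI-M: " + uri_m)
    stamp, uri_r = match.groups()
    # The 14 digits encode YYYYMMDDhhmmss, interpreted here as UTC (assumption).
    when = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return when, uri_r

when, uri_r = split_wayback_urim(
    "http://web.archive.org/web/20010911203610/http://www.cnn.com/"
)
print(when.isoformat())  # 2001-09-11T20:36:10+00:00
print(uri_r)             # http://www.cnn.com/
```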

Two Perspectives on Memento




      CMS




URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333

URI-R - http://en.wikipedia.org/wiki/September_11_attacks

Memento (2009)

•  How to get to the time-specific resources from the generic resource?

•  Memento addresses the problem in a resource-centric way:
    •  Resource, URI, state, representation, link, content negotiation
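
A minimal sketch of the datetime content negotiation named in the last bullet, without any network access: the client states the desired datetime in an `Accept-Datetime` request header sent for the original URI, and an archived version announces its own datetime in a `Memento-Datetime` response header. The two header names come from the Memento protocol; the helper names and the sample response are illustrative assumptions.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def accept_datetime_headers(when):
    """Request headers asking for the version that was live at `when`."""
    # HTTP dates are expressed in GMT, so convert before formatting.
    return {"Accept-Datetime": format_datetime(when.astimezone(timezone.utc), usegmt=True)}

def memento_datetime(response_headers):
    """Datetime of the archived version, or None if the response is not a Memento."""
    value = response_headers.get("Memento-Datetime")
    return parsedate_to_datetime(value) if value else None

wanted = datetime(2010, 9, 16, tzinfo=timezone.utc)
request_headers = accept_datetime_headers(wanted)
print(request_headers)

# Illustrative response headers, as an archive might return them.
sample_response = {"Memento-Datetime": "Sun, 12 Sep 2010 08:15:00 GMT"}
found = memento_datetime(sample_response)
print(found, "is", wanted - found, "before the requested datetime")
```

The archive returns the version it holds that is closest to the requested datetime, so a client should always read `Memento-Datetime` rather than assume an exact match.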




Access Versions via the original URI and datetime

[Screenshot labels: Select Date; Today; Sep 12 2010; Sep 16 2010; From BL Archive]
From The Version of Record to A Version of the Record

  •  The ever-evolving nature of some assets challenges the notion of
     fixity as “forever frozen” and calls for considering the notion of the
     “state of the scholarly record at a specific moment in time”

  •  It will become essential to be able to determine what the state of
     related and interdependent assets was at certain moments in time




Recreating a Version of the Record
•  Is it possible to reconstruct the Web-based scholarly record as it was at
   a certain point in time?

•  Consider a special case: Given a paper, can one see the referenced
   materials as they were at the time of publication of the paper?
     •  ti: Time of publication
     •  Relationship: Cited resources
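
The special case above can be sketched as a selection problem: for each cited URI, pick the archived snapshot whose datetime is closest to the publication time ti. The URIs, the snapshot datetimes, the data structure, and the function name below are all hypothetical.

```python
from datetime import datetime

def closest_snapshot(snapshots, t_i):
    """Return the snapshot datetime nearest to t_i, or None when nothing was archived."""
    if not snapshots:
        return None
    return min(snapshots, key=lambda taken: abs(taken - t_i))

t_i = datetime(2004, 9, 15)  # time of publication of the citing paper

# Hypothetical cited URIs mapped to the datetimes of their archived copies.
cited = {
    "http://example.org/a": [datetime(2003, 12, 5), datetime(2006, 1, 20)],
    "http://example.org/b": [datetime(2004, 12, 11)],
    "http://example.org/c": [],
}

for uri, snapshots in cited.items():
    best = closest_snapshot(snapshots, t_i)
    if best is None:
        print(uri, "-> no archived copy")
    else:
        print(uri, "->", best.date(), f"({abs(best - t_i).days} days from ti)")
```

The distance to ti matters as much as the existence of a snapshot: a copy taken long before or after publication may not show what the author actually cited.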




[Screenshot sequence, labels in slide order:]

Published: September 15 2004

Domain gone
Archived copy: December 5 2003

Current version
Archived copy: December 11 2004

Resource gone
Archived copy: December 5 2003

Resource gone
Archived copy: unavailable
Pilot Study at Scale with Memento

     •  Papers from arXiv: 400,000 papers => 144,000 unique URIs
     •  Papers from UNT ETD repository: 3,600 papers => 18,000 URIs
     •  Referenced URIs of established scholarly repositories removed (e.g.
     http://dx.doi.org), i.e. focusing on the periphery of the scholarly record

     •  Study looks into:
          •  Does the referenced resource still exist?
          •  Are there archived versions of the referenced resource?
               •  From around the time of publication of the citing paper?

     •  Study does not look into dynamic aspects:
          •  If the referenced resource still exists, is its content the same as at ti?
          •  Does an archived version have the same content as at ti?

Sanderson, R., Phillips, M., and Van de Sompel, H. (2011) Analyzing the Persistence of Referenced Web
Resources with Memento. Open Repositories 2011; arXiv preprint arXiv:1105.3459; http://arxiv.org/abs/1105.3459

UNT




arXiv




The Good News ™

•  Despite there being no pro-active effort to archive those
   resources, a considerable number were archived
    o  Because they had HTTP URIs and hence were archived as
       part of ongoing web archiving processes
    o  In The Wild archiving comes for free with the web
       infrastructure

•  404 resources exist in web archives and Memento can access
   them via their original HTTP URI
    o  Does that make an HTTP URI a PID?




The Bad News ™

•  Many resources were not archived

•  For many resources there were no archival versions around ti




Automatic Creation of Archival Snapshots

•  There is a need for a more pro-active approach to archive
   dynamic, interdependent assets, e.g.:
    o  Web Archives as infrastructure

    o  Use CMS, wikis, datawikis with solid versioning mechanisms

    o  Archiving linked context at the time of publication

    o  Archive at the moment of use (social interaction,
       downloading, annotating, etc.)
    o  Delineate which resources are considered in/out of a
       scholarly asset (OAI-ORE) to understand what needs
       archiving




discovering assets




ResourceSync (2012)

•  ResourceSync is about allowing 3rd party systems and applications to
   remain synchronized with a server’s evolving resources.

•  Many use cases:
    •  Mirroring repository content
    •  Aggregating content
    •  Replicating datasets
    •  Exposing content to archives
    •  Keeping linked data applications that leverage remote data
       up-to-date

•  Differing needs regarding:
    •  Coverage
    •  Accuracy
    •  Latency
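
To make the synchronization problem concrete, here is a minimal sketch of how a third party might work out what changed between two snapshots of a server's resources. The inventories map hypothetical URIs to last-modified timestamps. The `created` / `updated` / `deleted` labels mirror the change types used in ResourceSync, but the function itself is an illustration and not part of any specification.

```python
def diff_inventories(previous, current):
    """Compare two {uri: last_modified} snapshots and classify each change."""
    changes = []
    for uri, modified in current.items():
        if uri not in previous:
            changes.append((uri, "created"))
        elif previous[uri] != modified:
            changes.append((uri, "updated"))
    for uri in previous:
        if uri not in current:
            changes.append((uri, "deleted"))
    return sorted(changes)

previous = {
    "http://example.org/res1": "2013-01-02T13:00:00Z",
    "http://example.org/res2": "2013-01-02T14:00:00Z",
}
current = {
    "http://example.org/res2": "2013-01-03T09:30:00Z",
    "http://example.org/res3": "2013-01-03T10:00:00Z",
}

for uri, change in diff_inventories(previous, current):
    print(change, uri)
```

Comparing full inventories gives good coverage but poor latency, since changes are only noticed at the next snapshot. That trade-off is one reason a server may want to publish its changes directly.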

ResourceSync Approach

•  Resource centric; it’s all about the URI (again)

•  Introduces a set of modular capabilities that a server can
   implement to allow 3rd parties to remain in sync with its
   resources. Recurrently publish:
    o  Resource Lists

    o  Change Lists

    o  Resource Dumps

    o  Change Dumps



•  All capabilities based on the Sitemap document formats and
   extensions thereof
    o  Existing Sitemaps are off-the-shelf compliant
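
Because the capabilities build on the Sitemap format, a Resource List can be sketched as an ordinary Sitemap `urlset` with one extra element. The example below uses only Python's standard library. The resource URIs and the function name are hypothetical, and the `rs:md` element with its `capability` attribute follows the published ResourceSync specification, which may differ in detail from the beta referenced in this talk.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("rs", RS_NS)

def resource_list(resources):
    """Build a Sitemap <urlset> listing (uri, lastmod) pairs as a Resource List."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # Marks this Sitemap as a Resource List (attribute name assumed from the published spec).
    ET.SubElement(urlset, f"{{{RS_NS}}}md", {"capability": "resourcelist"})
    for uri, lastmod in resources:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return urlset

doc = resource_list([
    ("http://example.org/res1", "2013-01-02T13:00:00Z"),
    ("http://example.org/res2", "2013-01-02T14:00:00Z"),
])
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

A consumer that only understands plain Sitemaps can ignore the `rs:` element and still read every `loc` and `lastmod`.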




ResourceSync Capabilities




ResourceSync (2012)

•  Beta spec end 01/2013
    •  http://www.openarchives.org/rs/

•  Feedback
    •  mailto:resourcesync@googlegroups.com

•  Papers in D-Lib Magazine
    •  http://dx.doi.org/10.1045/september2012-vandesompel
    •  http://dx.doi.org/10.1045/january2013-klein

•  Paper in Ariadne
    •  http://www.ariadne.ac.uk/issue70/lewis-et-al



1998 - 2013




1998 - 2013

[Two panels, left and right:]
a stack of journals or a bunch of PDF files
a network of interconnected assets and actors



Conclusion

•  OAI-ORE, Memento, ResourceSync illustrate the potential of
leveraging the Web infrastructure for scholarly communication

•  This suggests that other special requirements of scholarly
communication (certification, archiving, persistence, trust, annotation,
metrics, …) may be addressable in an interoperable manner by
leveraging the Web infrastructure

•  Wins:
     •  Long Term Sustainability: Reuse of infrastructure (network,
     software, platforms, standards, etc.) that the entire world depends
     on
     •  Integration of scholarly discourse with other Web-based discourse




@hvdsomp                                                                   #idcc13

           Wanderer above the Sea of Fog – Caspar David Friedrich (1818)
            http://en.wikipedia.org/wiki/Wanderer_above_the_Sea_of_Fog

More Related Content

What's hot

A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordHerbert Van de Sompel
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataSebastian Hellmann
 
Charper.lawdi.20130531
Charper.lawdi.20130531Charper.lawdi.20130531
Charper.lawdi.20130531charper
 
20101015 linked openeuropeanafi
20101015 linked openeuropeanafi20101015 linked openeuropeanafi
20101015 linked openeuropeanafiStefan Gradmann
 
Linked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenLinked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenStefan Gradmann
 
Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"Victor de Boer
 
An Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability FrameworkAn Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability FrameworkHerbert Van de Sompel
 
Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Herbert Van de Sompel
 
Open Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeOpen Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeHerbert Van de Sompel
 
Web at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open DataWeb at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open DataAI4BD GmbH
 

What's hot (16)

A Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly RecordA Perspective on Archiving the Scholarly Record
A Perspective on Archiving the Scholarly Record
 
DBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of DataDBpedia: A Public Data Infrastructure for the Web of Data
DBpedia: A Public Data Infrastructure for the Web of Data
 
Charper.lawdi.20130531
Charper.lawdi.20130531Charper.lawdi.20130531
Charper.lawdi.20130531
 
Linked Open Data stuff
Linked Open Data stuffLinked Open Data stuff
Linked Open Data stuff
 
Linked Open Data
Linked Open DataLinked Open Data
Linked Open Data
 
20101015 linked openeuropeanafi
20101015 linked openeuropeanafi20101015 linked openeuropeanafi
20101015 linked openeuropeanafi
 
Unlocking Doors: recent initiatives in open and linked data at the National L...
Unlocking Doors: recent initiatives in open and linked data at the National L...Unlocking Doors: recent initiatives in open and linked data at the National L...
Unlocking Doors: recent initiatives in open and linked data at the National L...
 
Linked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenLinked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the Citizen
 
A comparative census of EU data initiatives, Martin Kaltenböck, 26.1.2011, Br...
A comparative census of EU data initiatives, Martin Kaltenböck, 26.1.2011, Br...A comparative census of EU data initiatives, Martin Kaltenböck, 26.1.2011, Br...
A comparative census of EU data initiatives, Martin Kaltenböck, 26.1.2011, Br...
 
Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"Presentatie for "Studiemiddag Linked Data Archieven"
Presentatie for "Studiemiddag Linked Data Archieven"
 
Thesauri and the Semantic Web
Thesauri and the Semantic WebThesauri and the Semantic Web
Thesauri and the Semantic Web
 
How to Open data
How to Open dataHow to Open data
How to Open data
 
An Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability FrameworkAn Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability Framework
 
Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)Signposting Overview (Version November 2017)
Signposting Overview (Version November 2017)
 
Open Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeOpen Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & Exchange
 
Web at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open DataWeb at 25 - Ontos Linked Open Data
Web at 25 - Ontos Linked Open Data
 

Viewers also liked

Memento: Big Leaps Towards Seamless Navigation of the Web of the Past
Memento: Big Leaps Towards Seamless Navigation of the Web of the PastMemento: Big Leaps Towards Seamless Navigation of the Web of the Past
Memento: Big Leaps Towards Seamless Navigation of the Web of the PastHerbert Van de Sompel
 
Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesHerbert Van de Sompel
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataHerbert Van de Sompel
 
Attempts at innovation in scholarly communication
Attempts at innovation in scholarly communicationAttempts at innovation in scholarly communication
Attempts at innovation in scholarly communicationHerbert Van de Sompel
 
The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersHerbert Van de Sompel
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataHerbert Van de Sompel
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Herbert Van de Sompel
 
Motivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationMotivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationHerbert Van de Sompel
 
OAC Presentation at CNI 09 Fall Forum
OAC Presentation at CNI 09 Fall ForumOAC Presentation at CNI 09 Fall Forum
OAC Presentation at CNI 09 Fall ForumRobert Sanderson
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference LinkingHerbert Van de Sompel
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTHerbert Van de Sompel
 
Towards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemTowards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemHerbert Van de Sompel
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiativeHerbert Van de Sompel
 
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...Herbert Van de Sompel
 

Viewers also liked (20)

Memento: Big Leaps Towards Seamless Navigation of the Web of the Past
Memento: Big Leaps Towards Seamless Navigation of the Web of the PastMemento: Big Leaps Towards Seamless Navigation of the Web of the Past
Memento: Big Leaps Towards Seamless Navigation of the Web of the Past
 
the UPS protoproto project
the UPS protoproto projectthe UPS protoproto project
the UPS protoproto project
 
The Roof is on Fire
The Roof is on FireThe Roof is on Fire
The Roof is on Fire
 
Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositories
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
 
The djatoka Image Server
The djatoka Image ServerThe djatoka Image Server
The djatoka Image Server
 
Attempts at innovation in scholarly communication
Attempts at innovation in scholarly communicationAttempts at innovation in scholarly communication
Attempts at innovation in scholarly communication
 
The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking Servers
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked Data
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
Motivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationMotivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustration
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the Web
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
OAC Presentation at CNI 09 Fall Forum
OAC Presentation at CNI 09 Fall ForumOAC Presentation at CNI 09 Fall Forum
OAC Presentation at CNI 09 Fall Forum
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference Linking
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
Towards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemTowards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication System
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiative
 
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
 

More from Herbert Van de Sompel

FAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueFAIR Signposting: A KISS Approach to a Burning Issue
FAIR Signposting: A KISS Approach to a Burning IssueHerbert Van de Sompel
 
Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Registration / Certification Interoperability Architecture (overlay peer-review)
Registration / Certification Interoperability Architecture (overlay peer-review)Herbert Van de Sompel
 
Collecting the organizational scholarly record
Collecting the organizational scholarly recordCollecting the organizational scholarly record
Collecting the organizational scholarly recordHerbert Van de Sompel
 
Achieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsAchieving Link Integrity for Managed Collections
Achieving Link Integrity for Managed CollectionsHerbert Van de Sompel
 
Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarshipHerbert Van de Sompel
 
Persistent Identifiers and the Web: The Need for an Unambiguous Mapping
Persistent Identifiers and the Web: The Need for an Unambiguous MappingPersistent Identifiers and the Web: The Need for an Unambiguous Mapping
Persistent Identifiers and the Web: The Need for an Unambiguous MappingHerbert Van de Sompel
 
Paint-Yourself-In-The-Corner Infrastructure
Paint-Yourself-In-The-Corner InfrastructurePaint-Yourself-In-The-Corner Infrastructure
Paint-Yourself-In-The-Corner InfrastructureHerbert Van de Sompel
 
ResourceSync: Web-Based Resource Synchronization
ResourceSync: Web-Based Resource SynchronizationResourceSync: Web-Based Resource Synchronization
ResourceSync: Web-Based Resource SynchronizationHerbert Van de Sompel
 
ResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveHerbert Van de Sompel
 

The Web as infrastructure for scholarly research and communication

  • 1. @hvdsomp #idcc13 Wanderer above the Sea of Fog – Caspar David Friedrich (1818) http://en.wikipedia.org/wiki/Wanderer_above_the_Sea_of_Fog
  • 2. Herbert Van de Sompel IDCC 2013, Amsterdam, The Netherlands, January 16 2013
  • 3. The Scholarly Record is Changing
      •  The scholarly record is extending with a wide range of non-traditional assets emerging from eScience and eHumanities
          •  e.g. datasets, software, ontologies, workflows, online debate, slides, blogs, videos, etc.
      •  Many of these non-traditional assets:
          •  Have a wide range of relationships with and dependencies on other assets – grouping assets
          •  Are becoming increasingly dynamic, and do not have the sense of fixity that traditional assets such as journal articles or books have – versioning assets
  • 4. grouping assets, versioning assets
  • 5. discovering assets
  • 6. 1999
      •  OAI was a heroic effort to fundamentally transform scholarly communication
          •  By promoting communication via preprints, non-peer-reviewed papers
      •  The OAI took a technical approach to achieve the goal
          •  Make preprints easier to discover, access – Protocol for Metadata Harvesting
  • 7.
  • 8.
  • 9. Don’t trust HTTP; HTTP GET on record identifier; An HTTP link; Just another HTTP baseURL
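For readers unfamiliar with the Protocol for Metadata Harvesting, the labels on this slide refer to the way an OAI-PMH request is composed: a record is not fetched by an HTTP GET on its identifier, but by sending a verb and arguments to a repository's baseURL. The following is a minimal sketch of that request shape. The baseURL and the record identifier are hypothetical placeholders, not taken from the slides.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def get_record_url(base_url: str, identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Compose an OAI-PMH GetRecord request URL for a repository baseURL."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical repository and record identifier, for illustration only.
url = get_record_url("http://example.org/oai", "oai:example.org:1234")
print(url)
```

The record identifier travels as a query argument rather than being dereferenced directly, which appears to be the design choice the slide looks back on critically.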
  • 10.
  • 11. grouping assets, versioning assets
  • 12. 2007
      •  OAI-ORE observation: Scholarly assets are rapidly becoming compound, consisting of multiple resources with various:
          •  Relationships
          •  Interdependencies
      •  How to convey this compound-ness in an interoperable manner so that applications can access, consume such assets?
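One way to picture how OAI-ORE conveys compound-ness is as a small set of statements: a Resource Map describes an Aggregation, and the Aggregation aggregates the individual resources. The sketch below expresses this as plain subject/predicate/object tuples using the `ore:describes` and `ore:aggregates` terms. All example.org URIs are hypothetical, and a real Resource Map would be serialized in a format such as RDF/XML or Atom rather than as Python tuples.

```python
ORE = "http://www.openarchives.org/ore/terms/"


def resource_map_triples(rem_uri, aggregation_uri, aggregated_uris):
    """Return (subject, predicate, object) tuples for a minimal Resource Map."""
    # The Resource Map describes exactly one Aggregation.
    triples = [(rem_uri, ORE + "describes", aggregation_uri)]
    # The Aggregation lists each resource that makes up the compound asset.
    triples += [(aggregation_uri, ORE + "aggregates", uri) for uri in aggregated_uris]
    return triples


# Hypothetical compound asset: a paper with its dataset and slides.
triples = resource_map_triples(
    "http://example.org/rem/1",
    "http://example.org/aggregation/1",
    [
        "http://example.org/paper.pdf",
        "http://example.org/dataset.csv",
        "http://example.org/slides.ppt",
    ],
)
for triple in triples:
    print(triple)
```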
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. See e.g. http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/8/index.html
  • 21.
  • 22. grouping assets, versioning assets
  • 23. 2009 (Digital Preservation Award 2010)
      •  Memento is about the Web and time:
          •  Resources evolve over time
          •  Only the current representation is available from a resource’s URI
          •  How to seamlessly access prior representations, if they exist?
      •  Memento looks at this problem for the Web, in general
  • 24. 2009
      •  Memento has potential consequences for scholarly communication
      •  Observation: Scholarly assets are becoming increasingly dynamic, and do not have the sense of fixity that traditional assets such as journal articles or books have
      •  Even traditional assets are becoming increasingly dynamic and dependent on other assets, which may themselves be dynamic
  • 25. Scientific Workflows, Services, Data, Workflow Engines. Carole Goble, JCDL 2012 Keynote https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
  • 26. From The Version of Record to A Version of the Record
      •  The ever-evolving nature of some assets challenges the notion of fixity as “forever frozen” and begs considering the notion of the “state of the scholarly record at a specific moment in time”
      •  It will become essential to be able to determine what the state of related and interdependent assets was at certain moments in time
  • 27. Two Perspectives on Memento: Web Archive
      •  URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/
      •  URI-R - http://www.cnn.com/
  • 28. Two Perspectives on Memento: CMS
      •  URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
      •  URI-R - http://en.wikipedia.org/wiki/September_11_attacks
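The web-archive example above shows how a URI-M can carry both the capture datetime and the original URI-R. As a minimal sketch, the function below splits an Internet Archive style URI-M into those two parts. It assumes the 14-digit `YYYYMMDDhhmmss` path segment seen in the slide and assumes the timestamp is in UTC. Other archives and CMS platforms (such as the Wikipedia `oldid` example) use different URI patterns that this sketch does not handle.

```python
import re
from datetime import datetime, timezone

# Pattern assumed from the slide's example URI-M; not a general Memento rule.
_IA_URI_M = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_ia_uri_m(uri_m: str):
    """Return (capture datetime, URI-R) for an Internet Archive style URI-M."""
    match = _IA_URI_M.match(uri_m)
    if match is None:
        raise ValueError(f"not a recognised web.archive.org URI-M: {uri_m}")
    stamp, uri_r = match.groups()
    # Assumption: the 14-digit timestamp is expressed in UTC.
    captured = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return captured, uri_r


captured, uri_r = split_ia_uri_m(
    "http://web.archive.org/web/20010911203610/http://www.cnn.com/"
)
print(captured.isoformat(), uri_r)
```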
  • 29.
  • 30.
  • 31.
  • 32.
  • 33. 2009
      •  How to get to the time-specific resources from the generic resource?
      •  Memento addresses the problem in a resource-centric way:
          •  Resource, URI, state, representation, link, content negotiation
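The content negotiation mentioned here happens in the datetime dimension: a client asks for a prior state of a resource by sending an `Accept-Datetime` request header carrying an HTTP date. The sketch below only builds that header and sends no request. The function name is hypothetical, and where the request is sent (the original URI or a TimeGate it points to) depends on the server setup.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def memento_request_headers(when: datetime) -> dict:
    """Build the request header for Memento datetime negotiation."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # HTTP dates are expressed in GMT, e.g. "Tue, 11 Sep 2001 20:36:10 GMT".
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": http_date}


headers = memento_request_headers(
    datetime(2001, 9, 11, 20, 36, 10, tzinfo=timezone.utc)
)
print(headers)
```

Because the negotiation uses the resource's own URI, a client does not need to know any archive-specific URI pattern in advance.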
  • 34.
  • 35.
  • 36. Access Versions via the original URI and datetime (labels: Select Date; Today; Sep 16 2010; Sep 12 2010; From BL Archive)
  • 37. From The Version of Record to A Version of the Record
      •  The ever-evolving nature of some assets challenges the notion of fixity as “forever frozen” and begs considering the notion of the “state of the scholarly record at a specific moment in time”
      •  It will become essential to be able to determine what the state of related and interdependent assets was at certain moments in time
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46. Recreating a Version of the Record
      •  Is it possible to reconstruct the Web-based scholarly record as it was at a certain point in time?
      •  Consider a special case: Given a paper, can one see the referenced materials as they were at the time of publication of the paper?
          •  ti: Time of publication
          •  Relationship: Cited resources
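The special case above amounts to a selection problem: given the archived snapshots of a cited resource and the publication time ti, pick the snapshot closest to ti. The sketch below illustrates that selection. The snapshot datetimes are made up for illustration, and "closest in absolute time" is just one possible policy (one might instead prefer the latest snapshot before ti).

```python
from datetime import datetime, timezone


def closest_snapshot(snapshots, t_i):
    """Return the snapshot datetime nearest to t_i, or None if there are none."""
    if not snapshots:
        return None
    return min(snapshots, key=lambda taken: abs(taken - t_i))


# Hypothetical publication time and archived snapshot datetimes.
t_i = datetime(2004, 9, 15, tzinfo=timezone.utc)
snapshots = [
    datetime(2003, 12, 5, tzinfo=timezone.utc),
    datetime(2004, 12, 11, tzinfo=timezone.utc),
    datetime(2008, 3, 1, tzinfo=timezone.utc),
]
best = closest_snapshot(snapshots, t_i)
print(best.date(), abs(best - t_i).days, "days from t_i")
```

Even the closest snapshot may be months away from ti, so the selected copy is only an approximation of what the author actually cited.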
  • 47. Published September 15 2004
  • 48.
  • 49.
  • 50. Domain Gone
  • 51. Archived copy December 5 2003
  • 52.
  • 53. Current version
  • 54. Archived copy December 11 2004
  • 55.
  • 56. Resource gone
  • 57. Archived copy December 5 2003
  • 58.
  • 59. Resource gone
  • 60. Archived copy unavailable
  • 61. Pilot Study at Scale with Memento
      •  Papers from arXiv: 400,000 papers => 144,000 unique URIs
      •  Papers from UNT ETD repository: 3,600 papers => 18,000 URIs
      •  Referenced URIs of established scholarly repositories removed (e.g. http://dx.doi.org), i.e. focusing in on the periphery of the scholarly record
      •  Study looks into:
          •  Does the referenced resource still exist?
          •  Are there archived versions of the referenced resource?
          •  From around the time of publication of the citing paper?
      •  Study does not look into dynamic aspects:
          •  If the referenced resource still exists, is its content the same as at ti?
          •  Does an archived version have the same content as at ti?
      Sanderson, R., Phillips, M., and Van de Sompel, H. (2011) Analyzing the Persistence of Referenced Web Resources with Memento. Open Repositories 2011; arXiv preprint arXiv:1105.3459; http://arxiv.org/abs/1105.3459
  • 62. UNT
  • 63. arXiv
  • 64. The Good News ™
      •  Despite there not being a pro-active effort to archive those resources, a considerable number were archived
          o  Because they had HTTP URIs and hence were archived as part of ongoing web archiving processes
          o  In The Wild archiving comes for free with the web infrastructure
      •  404 resources exist in web archives and Memento can access them via their original HTTP URI
          o  Does that make an HTTP URI a PID?
  • 65. The Bad News ™
      •  Many resources were not archived
      •  For many resources there were no archival versions around ti
  • 66.
  • 67.
  • 68.
  • 69. Automatic Creation of Archival Snapshots
      •  There is a need for a more pro-active approach to archive dynamic, interdependent assets, e.g.:
          o  Web Archives as infrastructure
          o  Use CMS, wikis, datawikis with solid versioning mechanisms
          o  Archiving linked context at the time of publication
          o  Archive at the moment of use (social interaction, downloading, annotating, etc.)
          o  Delineate which resources are considered in/out of a scholarly asset (OAI-ORE) to understand what needs archiving
  • 70. discovering assets
  • 71. 2012
      •  ResourceSync is about allowing 3rd party systems and applications to remain synchronized with a server’s evolving resources.
      •  Many use cases:
          •  Mirroring repository content
          •  Aggregating content
          •  Replicating datasets
          •  Exposing content to archives
          •  Keeping linked data applications that leverage remote data up-to-date
      •  Differing needs regarding:
          •  Coverage
          •  Accuracy
          •  Latency
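To make "remaining synchronized" concrete, the sketch below compares two snapshots of a server's resources and reports what a third party would need to fetch or remove. Each snapshot is modelled as a plain mapping from URI to last-modified timestamp. That model, the function name, and the example URIs are assumptions for illustration and are not part of the ResourceSync specification.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two {uri: lastmod} snapshots of a server's resources."""
    created = sorted(uri for uri in new if uri not in old)
    deleted = sorted(uri for uri in old if uri not in new)
    updated = sorted(uri for uri in new if uri in old and new[uri] != old[uri])
    return {"created": created, "updated": updated, "deleted": deleted}


# Hypothetical snapshots taken a day apart.
monday = {
    "http://example.org/a": "2013-01-14T09:00:00Z",
    "http://example.org/b": "2013-01-14T09:00:00Z",
}
tuesday = {
    "http://example.org/a": "2013-01-15T11:30:00Z",  # changed
    "http://example.org/c": "2013-01-15T12:00:00Z",  # new
}
changes = diff_snapshots(monday, tuesday)
print(changes)
```

How often such a comparison runs is where the differing needs for latency and accuracy show up: a mirror may tolerate daily comparisons, while a linked data application may not.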
  • 72. ResourceSync Approach
      •  Resource centric; it’s all about the URI (again)
      •  Introduces a set of modular capabilities that a server can implement to allow 3rd parties to remain in sync with its resources. Recurrently publish:
          o  Resource Lists
          o  Change Lists
          o  Resource Dumps
          o  Change Dumps
      •  All capabilities based on the Sitemap document formats and extensions thereof
          o  Existing Sitemaps are off-the-shelf compliant
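Since the capabilities build on the Sitemap format, a plain Sitemap already gives a feel for what a Resource List looks like. The sketch below serializes a set of URIs and last-modified times as a standard Sitemap `urlset`. It deliberately omits the ResourceSync-specific extension elements, and the function name and example URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_resource_list(resources: dict) -> str:
    """Serialize {uri: lastmod} as a plain Sitemap urlset document."""
    ET.register_namespace("", SITEMAP_NS)  # emit Sitemap as the default namespace
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for uri, lastmod in sorted(resources.items()):
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical resources exposed by a server.
xml_text = build_resource_list(
    {
        "http://example.org/res1": "2013-01-02T13:00:00Z",
        "http://example.org/res2": "2013-01-02T14:00:00Z",
    }
)
print(xml_text)
```

A consumer can read the document back with any XML parser and use the `loc` and `lastmod` values to decide which resources to fetch.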
  • 73. ResourceSync Capabilities
  • 74. 2012
      •  Beta spec end 01/2013
          •  http://www.openarchives.org/rs/
      •  Feedback
          •  mailto:resourcesync@googlegroups.com
      •  Papers in D-Lib Magazine
          •  http://dx.doi.org/10.1045/september2012-vandesompel
          •  http://dx.doi.org/10.1045/january2013-klein
      •  Paper in Ariadne
          •  http://www.ariadne.ac.uk/issue70/lewis-et-al
  • 75. 1998 - 2013
  • 76. 1998 - 2013: a stack of journals or a bunch of PDF files / a network of interconnected assets and actors
  • 77. Conclusion
      •  OAI-ORE, Memento, ResourceSync illustrate the potential of leveraging the Web infrastructure for scholarly communication
      •  This suggests that other special requirements of scholarly communication (certification, archiving, persistence, trust, annotation, metrics, …) may be addressable in an interoperable manner by leveraging the Web infrastructure
      •  Wins:
          •  Long Term Sustainability: Reuse of infrastructure (network, software, platforms, standards, etc.) that the entire world depends on
          •  Integration of scholarly discourse with other Web-based discourse
  • 78. @hvdsomp #idcc13 Wanderer above the Sea of Fog – Caspar David Friedrich (1818) http://en.wikipedia.org/wiki/Wanderer_above_the_Sea_of_Fog