A Clean Slate?
@hvdsomp
http://public.lanl.gov/herbertv/
herbert van de sompel
Includes slides by Sean Bechhofer, Carole Goble, Robert Sanderson
paper-based scholarly communication system
scanned version of paper-based scholarly communication system
natively digital, web-based, scholarly communication system
Context of My Work, My Talk
painful transition
In Silico (Computational) Science
Datasets
Data collections
Algorithms
Configurations
Tools and Apps
Codes
Code Libraries
Services,
Infrastructure,
Compilers
Hardware
Simulations, data exploration, data processing, analytics, database based, text
mining, auto recommendation, visual analytics…Actually Digital Science is just
Science
Carole Goble, JCDL 2012 Keynote
https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
Scientific Workflows, Services, Data, Workflow Engines	
  
Carole Goble, JCDL 2012 Keynote
https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
All components
continuously in
flux. How to
reproduce results
in such an
environment?
A Lot of Rs for Reproducibility
•  Rerun: re-execute original experiment using revised setting.
•  Review: validate and justify the results empirically. Trust. Understand. Train. Convincing and comfort.
•  Replicate / Repeat: exactly replicate the original experiment. Eliminate change.
•  Reproduce: run experiment with differences in elements (materials, methods, platform or setting) and compare to test for same result.
•  Replay: run through what happened using logs without original platform or need to execute.
Carole Goble, JCDL 2012 Keynote
https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
A Lot of Rs for Reuse
•  Refresh: execute an upgraded original experiment.
•  Reconstruct: rebuild using new elements or different platform when they are lost/unavailable/inaccessible.
•  Reuse: use as part of new experiments.
•  Repurpose/Reassemble: reuse elements in a new experiment.
Carole Goble, JCDL 2012 Keynote
https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
The Article is the Knowledge Bottleneck
“An article about computational science in a scientific
publication is not the scholarship itself, it is merely
advertising of the scholarship. The actual scholarship is the
complete software development environment, [the complete
data] and the complete set of instructions which generated
the figures.”
Buckheit, J. and Donoho, D. (1995) WaveLab and Reproducible Research http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.2982
The Article is the Knowledge Bottleneck
“Changes are occurring in the ways in which scientific
research is conducted. Within e-laboratories, methods such
as scientific workflows, research protocols, standard
operating procedures and algorithms for analysis and
simulation are used to manipulate and produce data.
Experimental or observational data and scientific models are
typically born digital with no physical counterpart. This move
to digital content is driving a sea-change in scientific
publication, and challenging traditional scholarly
publication.”
Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital
Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
•  Involved in each such experiment is a complex set of resources
with complex relationships
•  There is a need to share these resources in order to support
forms of reuse, reproducibility
•  This entails the augmentation of the scholarly record with
an explicit account of the research process
•  Digital exchange of each resource individually is trivial,
exchange of the combined knowledge is not
•  Traditional electronic publications cannot handle this job
•  Targeted at humans, not machines
•  Communicates findings not all scientific knowledge behind
the findings
•  Content not decomposable in actionable units
•  Outputs, results, methods not reusable
If not the Article, then What?
Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital
Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
The Clean Slate Challenge
Add features to support these needs to the existing scholarly communication system?
Start with a clean slate?
Research Objects
http://www.researchobject.org/ http://www.wf4ever-project.org/
Research Objects: Aggregated Content
•  Data used or results produced in an experimental study
•  Methods employed to produce and
analyze that data
•  Provenance and setting
information about the experiments
•  People involved in the
investigation
•  Annotations about these
resources, that are essential to the
understanding and interpretation of
the scientific outcomes captured
by a research object.
http://www.researchobject.org/
http://www.w3.org/community/rosc/
Research Objects
http://www.researchobject.org/
Research Objects: Aggregation
“Research Objects are aggregations of content. Thus a
Research Object framework needs to provide a mechanism
for this aggregation. Aggregations are likely to include
references to resources but there may also, however, be
situations, where, for reasons of efficiency or in order to
support persistence, Research Objects should also be able
to aggregate literal data as well as references to data.”
Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital
Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
•  OAI-ORE observation: Scholarly assets are
rapidly becoming compound, consisting of
multiple resources
•  e.g. datasets, software, ontologies,
workflows, online debate, slides, blogs,
videos, etc.
with various:
•  Relationships
•  Interdependencies
•  How to convey this compound-ness in an
interoperable manner so that applications
can access, consume such assets?
2007	
  
Funded by the Mellon Foundation & Microsoft Research
http://www.openarchives.org/ore/
Foundations of the ORE Solution
•  Web Architecture - Resource, URI, Representation
•  Semantic Web:
•  URIs for documents (information resources),
•  URIs for physical entities, concepts, abstractions (non-information
resources)
•  RDF – to express properties, relationships pertaining to resources
•  Linked Data:
•  HTTP URIs for both information and non-information resources
•  HTTP 303 redirect:
•  From: The HTTP URI of non-information resource
•  To: The HTTP URI of an information resource that describes
the non-information resource
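The 303 pattern listed above can be sketched as a small lookup. This is only an illustration of the idea, not part of any ORE tooling: the example.org URIs, the in-memory tables, and the `resolve` and `follow` helpers are all hypothetical.

```python
# Hypothetical URIs: one names a person (a non-information resource),
# the other is a document that describes that person.
DESCRIBED_BY = {
    "http://example.org/id/person/1": "http://example.org/doc/person/1",
}
DOCUMENTS = {
    "http://example.org/doc/person/1": "<RDF description of person 1>",
}


def resolve(uri):
    """Return (status, headers, body) the way a Linked Data server might."""
    if uri in DESCRIBED_BY:
        # A thing cannot be sent over HTTP, so point at its description.
        return 303, {"Location": DESCRIBED_BY[uri]}, ""
    if uri in DOCUMENTS:
        return 200, {"Content-Type": "application/rdf+xml"}, DOCUMENTS[uri]
    return 404, {}, ""


def follow(uri, max_hops=5):
    """Follow 303 redirects until a document (or an error) is reached."""
    for _ in range(max_hops):
        status, headers, body = resolve(uri)
        if status != 303:
            return status, uri, body
        uri = headers["Location"]
    raise RuntimeError("too many redirects")
```

A client that starts from the URI of the thing ends up at the URI of the describing document, and the two identifiers stay distinct throughout.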
Adding Account of Research Life Cycles to Scholarly Record
Pepe, A., Mayernik, M., Borgman, C., Van de Sompel, H. (2009) Technology to Represent Scientific Practice: Data, Life Cycles, and Value Chains. http://dx.doi.org/10.1002/asi.21263
ORE & Research Objects
“…, Research Objects should also be able to aggregate literal data as
well as references to data.”
•  Aggregated Resources in ORE have HTTP URIs; probably needs to
be relaxed.
•  Embedding content in RDF, irrespective of ORE, is … interesting
•  See: Representing Content in RDF 1.0 http://www.w3.org/TR/Content-in-RDF10/
•  Allows embedding base64, text, XML
•  Resource Map as manifest in e.g. ZIP file?
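One way to picture the "Resource Map as manifest in a ZIP file" idea is sketched below. The `ore:describes` and `ore:aggregates` terms come from the ORE vocabulary; everything else (the `manifest.json` name, the JSON layout, naming literal content by its path inside the bundle, the example.org URIs) is an assumption made for illustration and not a published packaging format.

```python
import base64
import io
import json
import zipfile

ORE = "http://www.openarchives.org/ore/terms/"


def build_bundle(rem_uri, aggregation_uri, references, literals):
    """Return ZIP bytes holding a manifest plus the literal files.

    references: list of HTTP URIs aggregated by reference.
    literals:   dict of {path inside the ZIP: bytes} aggregated by value.
    """
    manifest = {
        "@id": rem_uri,
        ORE + "describes": {
            "@id": aggregation_uri,
            # Literal content is named by its path inside the bundle,
            # which relaxes the "HTTP URI only" constraint (assumption).
            ORE + "aggregates": [{"@id": r} for r in references]
            + [{"@id": path} for path in sorted(literals)],
        },
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as bundle:
        bundle.writestr("manifest.json", json.dumps(manifest, indent=2))
        for path, data in literals.items():
            bundle.writestr(path, data)
    return buffer.getvalue()


def embed_inline(data):
    """Alternative: carry small content inside the description as base64."""
    return base64.b64encode(data).decode("ascii")
```

The bundle keeps references and literal data side by side, while `embed_inline` hints at the other option mentioned above, where content is embedded directly in the description.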
Research Objects
http://www.researchobject.org/
Research Objects: Annotation
“Annotations about these resources, that are essential to the
understanding and interpretation of the scientific outcomes
captured by a research object.”
http://www.researchobject.org/
•  Annotation is a pervasive scholarly activity,
conducted by people and machines
•  Many annotation efforts and tools
•  But annotations stuck in silos:
•  Only consumable by client that created
it
•  Annotations not shareable beyond
original environment
•  Open Annotation focuses on interoperability
for annotations in order to allow sharing of
annotations across:
•  Annotation clients
•  Content collections
•  Services that leverage annotations
2009	
  
Funded by the Mellon Foundation
http://www.openannotation.org/spec/core/
•  Established to reconcile Open Annotation Collaboration and
Annotation Ontology models
•  67 participants from around the world: 7th of 119 groups
Many universities, also commercial and not-for-profit
•  Mission:
Interoperability between Annotation systems and platforms, by
…following the Architecture of the Web
…reusing existing web standards
…providing a single, coherent model to implement
…without requiring adoption of specific platforms
…while maintaining low implementation costs
W3C Open Annotation Community Group
http://www.w3.org/community/openannotation/
“An Annotation is considered to be a set of connected resources, typically including a body and target, where the body is related to the target.”
Highlighting, Bookmarking
Commenting, Describing
Tagging, Linking
Classifying, Identifying
Questioning, Replying
Editing, Moderating
…Provide an Aide-Memoire
…Share and Inform
…Improve Discovery
…Organize Resources
…Interact with Others
…Create as well as Consume
What is an Annotation?
http://www.w3.org/community/openannotation/
Annotates	
  
Annotations
Annotates?	
  
Annotations?
Basic Open Annotation Data Model
Use Case: Bookmarking
Use Case: Commenting
Use Case: Commenting
Use Case: Tagging
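The use cases above share one shape: an annotation resource linked to a target and, usually, a body. A minimal sketch of that shape as plain triples follows. The `oa:` terms are taken from the Open Annotation vocabulary as best recalled; the `annotate` helper and the example URIs are hypothetical, and treating a bookmark as a body-less annotation is an assumption.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotate(anno_uri, target_uri, body_uri=None):
    """Return the (subject, predicate, object) triples of one annotation."""
    triples = [
        (anno_uri, RDF_TYPE, OA + "Annotation"),
        (anno_uri, OA + "hasTarget", target_uri),
    ]
    if body_uri is not None:
        # Commenting and tagging attach a body; a bare bookmark need not.
        triples.append((anno_uri, OA + "hasBody", body_uri))
    return triples


# Hypothetical examples: a bookmark and a comment on the same page.
bookmark = annotate("http://example.org/anno/1", "http://example.org/page")
comment = annotate(
    "http://example.org/anno/2",
    "http://example.org/page",
    body_uri="http://example.org/comments/42",
)
```

Because body and target are ordinary Web resources with URIs, any client that understands these few properties can consume annotations created elsewhere.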
Specific Body and Specific Target resources identify the region of
interest, and/or the state of the resource.
Need to be able to describe the state of the resource, the segment
of interest, and potentially styling hints for how to render it.
Open Annotation introduces:
State: describes how to retrieve representation
Selector: describes how to select segment
Style: describes how to render/process segment
Scope: describes context of the resource
Further Specification of Resources
Use Case: Changing Content at the Same URI
Use Case: Segment of Interest
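Both use cases can be sketched with a Specific Resource that wraps the real target and adds a Selector (which segment) and a State (which version). The property and class names below follow the Open Annotation community draft as best recalled and should be checked against the specification; the `specific_target` helper, the blank-node labels, and the example values are hypothetical.

```python
OA = "http://www.w3.org/ns/oa#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def specific_target(node, source_uri, fragment=None, when=None):
    """Describe a segment and/or a point-in-time state of `source_uri`."""
    triples = [
        (node, RDF + "type", OA + "SpecificResource"),
        (node, OA + "hasSource", source_uri),
    ]
    if fragment is not None:
        # Segment of interest: a fragment such as "xywh=10,10,100,50".
        selector = node + "-selector"
        triples += [
            (node, OA + "hasSelector", selector),
            (selector, RDF + "type", OA + "FragmentSelector"),
            (selector, RDF + "value", fragment),
        ]
    if when is not None:
        # Changing content at the same URI: record the intended moment.
        state = node + "-state"
        triples += [
            (node, OA + "hasState", state),
            (state, RDF + "type", OA + "TimeState"),
            (state, OA + "when", when),
        ]
    return triples
```

The annotation then points at the Specific Resource instead of the bare URI, so a consumer knows both where to look and as of when.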
W3C Open Annotation & Research Objects
•  Early renderings of Research Objects emerging from the Wf4Ever
project use Annotation Ontology as the annotation framework
•  But since the Annotation Ontology and Open Annotation Collaboration
models now merge into the W3C Open Annotation model, it is safe to
assume W3C Open Annotation will be used for Research Objects
Research Objects
http://www.researchobject.org/
Research Objects: Versioning and Evolution
“Research Objects are dynamic in that their contents can
change and be changed – additional contents may be
added to aggregations, or additional metadata can be
asserted about the contents or relationships between
content. The resources that are aggregated may change.
Thus there is a need for versioning, allowing the recording
of changes to objects, potentially along with facilities for
retrieving objects or aggregated elements at particular
historical points in their lifecycle.”
Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital
Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
ORE Experiment: Versioning and Evolution of Compound Objects
Van de Sompel, H. et al. (2007) Appendix to Interoperability for the Discovery, Use, and
Re-Use of Units of Scholarly Communication
http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-
and-re-use-of-units-of-scholarly-communication/
•  Memento is about the Web and time:
•  Resources evolve over time
•  Only the current representation is
available from a resource’s URI
•  How to seamlessly access prior representations, if they exist?
•  Memento looks at this problem for the Web,
in general
Digital Preservation Award 2010
2009	
  
Funded by the Library of Congress
http://www.mementoweb.org/
URI for Original, URI for Version
URI-M - http://web.archive.org/web/20010911203610/http://www.cnn.com/
Web Archive
URI-R - http://www.cnn.com/
URI for Original, URI for Version
URI-M - http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
CMS
URI-R - http://en.wikipedia.org/wiki/September_11_attacks
Time Travel for the Web: Demo	
  
http://www.mementoweb.org/demo/Memento_Time_Travel.mov
Memento & Research Objects
•  The combination of:
•  Pro-active archiving of Research Objects and their constituent
resources, using
•  Web archiving techniques, e.g. crawling, transactional
archiving
•  Platforms with strong versioning capabilities, e.g. datawikis,
github
•  Assigning URIs to Research Objects and their constituent
resources according to the well-established time-generic (URI-R)
and time-specific (URI-M) resource pattern
•  The Memento protocol to access time-specific versions of
Research Objects and their constituent resources via their time-
generic URI and timestamp
makes a good candidate for addressing the versioning and evolution
need.
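The access step in the combination above can be sketched as follows: a client states the desired time in an `Accept-Datetime` header against the time-generic URI, and the server side picks a time-specific version. This is a simplified sketch, not the protocol itself: the selection rule (latest capture at or before the requested time) is an assumption, real TimeGates may choose differently, and the archive.example.org URIs are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(moment):
    """Format a timezone-aware datetime as an HTTP-date for Accept-Datetime."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def select_memento(mementos, moment):
    """Pick the URI-M of the latest version captured at or before `moment`.

    mementos: list of (capture datetime, URI-M) pairs for one URI-R.
    Returns None when no version that old exists.
    """
    earlier = [(captured, uri_m) for captured, uri_m in mementos
               if captured <= moment]
    if not earlier:
        return None
    return max(earlier)[1]


# Hypothetical versions of one Research Object resource.
VERSIONS = [
    (datetime(2001, 9, 11, 20, 36, 10, tzinfo=timezone.utc),
     "http://archive.example.org/20010911203610/http://www.cnn.com/"),
    (datetime(2005, 1, 1, tzinfo=timezone.utc),
     "http://archive.example.org/20050101000000/http://www.cnn.com/"),
]
```

Applied to a Research Object, the same timestamp can be used for every constituent resource, which is what makes it possible to reassemble the object as it was at one moment.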
Research Objects
http://www.researchobject.org/
Research Objects: Provenance
“The issue of provenance, and being able to audit
experiments and investigations is key to the scientific
method. Third parties must be able to audit the steps
performed in an experiment in order to be convinced of the
validity of results. Audit is required not just for regulatory
purposes, but allows for the results of experiments to be
interpreted and reused, thus a Research Object should
provide sufficient information to support audit of the
aggregation as a whole, its constituent parts, and any
process that it may encapsulate.”
Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital
Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
Van de Sompel, H. (2003) Roadblocks http://www.sis.pitt.edu/~dlwkshop/paper_sompel.html
Provenance
Moreau, L. et al. (2010) The Open Provenance Model: Abstract Model
http://eprints.ecs.soton.ac.uk/21449/
Open Provenance Model
W3C Provenance
http://www.w3.org/TR/prov-primer/
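To make the audit requirement concrete, here is a small sketch that walks a provenance record backwards from a result to its inputs. It uses two W3C PROV properties, `prov:wasGeneratedBy` and `prov:used`; the `lineage` helper and the short names in the sample record are hypothetical and stand in for full URIs.

```python
PROV = "http://www.w3.org/ns/prov#"


def lineage(triples, entity):
    """Walk prov:wasGeneratedBy / prov:used links back from `entity`.

    Returns a list of (entity, activity, inputs) steps, nearest first.
    """
    generated_by = {s: o for s, p, o in triples
                    if p == PROV + "wasGeneratedBy"}
    used = {}
    for s, p, o in triples:
        if p == PROV + "used":
            used.setdefault(s, []).append(o)

    steps, queue, seen = [], [entity], set()
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        activity = generated_by.get(current)
        if activity is None:
            continue  # a raw input: nothing recorded before it
        inputs = used.get(activity, [])
        steps.append((current, activity, inputs))
        queue.extend(inputs)
    return steps


# Hypothetical record: a figure plotted from a table derived from raw data.
RECORD = [
    ("figure1", PROV + "wasGeneratedBy", "plot"),
    ("plot", PROV + "used", "table"),
    ("table", PROV + "wasGeneratedBy", "analysis"),
    ("analysis", PROV + "used", "raw"),
]
```

A third party auditing `figure1` would see each step that led to it, which is the kind of trace the quoted requirement asks a Research Object to support.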
Research Objects
http://www.researchobject.org/
W3C PROV
The Clean Slate Challenge
•  ResourceSync is about synchronization of
web resources, things with a URI that can
be dereferenced
•  Small websites/repositories (a few
resources) to large repositories/datasets/
linked data collections (many millions of
resources)
•  Low change frequency (weeks/months) to
high change frequency (seconds)
•  Synchronization latency and accuracy
needs may vary
•  Modular framework based on Sitemaps and
extensions
2012	
  
Funded by the Sloan Foundation
http://www.openarchives.org/rs/
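Since ResourceSync builds on Sitemaps, a source can advertise its resources in an ordinary `<urlset>` document with a few extension elements. The sketch below generates such a list. The `rs:md` element with `capability` and `at` attributes follows the ResourceSync specification as best recalled and may differ from the draft current at the time of this talk; the `resource_list` helper and the example.org URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"


def resource_list(resources, at):
    """Build a Sitemap <urlset> carrying ResourceSync metadata.

    resources: list of (URI, last-modified timestamp) pairs.
    at:        timestamp at which the list was compiled.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("rs", RS_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    ET.SubElement(urlset, f"{{{RS_NS}}}md",
                  {"capability": "resourcelist", "at": at})
    for uri, lastmod in resources:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

A destination that wants to stay in sync can compare the `lastmod` values against its own copies and fetch only what changed; higher change frequencies would call for the framework's other modules.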
•  Investigates reference rot at massive scale:
•  Citation rot - Do HTTP references in
scholarly articles still resolve?
•  Content rot - If so, is the content at the
end of the HTTP reference still
representative of the content that was
originally referenced?
•  Investigates pro-active ways to archive
HTTP referenced resources that occur in
scholarly articles
2013	
  
hiberlink
Funded by the Mellon Foundation
Soon at http://www.hiberlink.org
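The two kinds of reference rot can be told apart with a simple check, sketched here. This is not Hiberlink's method: the `classify_reference` helper, the use of a text-similarity ratio, and the 0.8 threshold are assumptions chosen for illustration, and the check presumes a copy of the content as originally cited is available for comparison.

```python
from difflib import SequenceMatcher


def classify_reference(status, cited_text, current_text, threshold=0.8):
    """Classify one HTTP reference found in a scholarly article.

    status:       HTTP status from dereferencing the URI today (None if
                  the request failed outright).
    cited_text:   content as it was when the article referenced it.
    current_text: content returned today.
    """
    if status is None or not 200 <= status < 300:
        return "citation rot"   # the reference no longer resolves
    similarity = SequenceMatcher(None, cited_text, current_text).ratio()
    if similarity < threshold:
        return "content rot"    # resolves, but to different content
    return "intact"
```

The dependency on `cited_text` is the point of pro-active archiving: without a snapshot taken near the time of citation, content rot cannot be detected at all.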
Research Objects
http://www.researchobject.org/ http://www.wf4ever-project.org/
http://www.w3.org/community/rosc/
A Clean Slate?
@hvdsomp
http://public.lanl.gov/herbertv/
herbert van de sompel
Includes slides by Sean Bechhofer, Carole Goble, Robert Sanderson

Contenu connexe

Tendances

The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking Servers
Herbert Van de Sompel
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
Carole Goble
 
DataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRefDataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRef
Crossref
 
Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016
Carole Goble
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
Norman Morrison
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
Carole Goble
 
Dagstuhl "Future" sesssion intro slides
Dagstuhl "Future" sesssion intro slidesDagstuhl "Future" sesssion intro slides
Dagstuhl "Future" sesssion intro slides
Tim Clark
 

Tendances (20)

The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking Servers
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
 
Open Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | FutureOpen Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | Future
 
Modern Tools & Rationales for 21st Century Research
Modern Tools & Rationales  for 21st Century ResearchModern Tools & Rationales  for 21st Century Research
Modern Tools & Rationales for 21st Century Research
 
DataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRefDataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRef
 
Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
Specimen-level mining: bringing knowledge back 'home' to the Natural History ...
Specimen-level mining: bringing knowledge back 'home' to the Natural History ...Specimen-level mining: bringing knowledge back 'home' to the Natural History ...
Specimen-level mining: bringing knowledge back 'home' to the Natural History ...
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
 
Museum impact: linking-up specimens with research published on them
Museum impact: linking-up specimens with research published on themMuseum impact: linking-up specimens with research published on them
Museum impact: linking-up specimens with research published on them
 
Mtsr2015 goble-keynote
Mtsr2015 goble-keynoteMtsr2015 goble-keynote
Mtsr2015 goble-keynote
 
The State of Open Research Data
The State of Open Research DataThe State of Open Research Data
The State of Open Research Data
 
Semantic Web, Linked Data and Education: A Perfect Fit?
Semantic Web, Linked Data and Education: A Perfect Fit?Semantic Web, Linked Data and Education: A Perfect Fit?
Semantic Web, Linked Data and Education: A Perfect Fit?
 
The European Open Science Cloud: just what is it?
The European Open Science Cloud: just what is it?The European Open Science Cloud: just what is it?
The European Open Science Cloud: just what is it?
 
Dagstuhl "Future" sesssion intro slides
Dagstuhl "Future" sesssion intro slidesDagstuhl "Future" sesssion intro slides
Dagstuhl "Future" sesssion intro slides
 
Doing Clever Things with the Semantic Web
Doing Clever Things with the Semantic WebDoing Clever Things with the Semantic Web
Doing Clever Things with the Semantic Web
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisExtracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 
Data management for researchers
Data management for researchersData management for researchers
Data management for researchers
 

En vedette

Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositories
Herbert Van de Sompel
 
Attempts at innovation in scholarly communication
Attempts at innovation in scholarly communicationAttempts at innovation in scholarly communication
Attempts at innovation in scholarly communication
Herbert Van de Sompel
 
Motivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationMotivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustration
Herbert Van de Sompel
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
Herbert Van de Sompel
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference Linking
Herbert Van de Sompel
 

En vedette (18)

Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositories
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked Data
 
The djatoka Image Server
The djatoka Image ServerThe djatoka Image Server
The djatoka Image Server
 
The Roof is on Fire
The Roof is on FireThe Roof is on Fire
The Roof is on Fire
 
the UPS protoproto project
the UPS protoproto projectthe UPS protoproto project
the UPS protoproto project
 
Attempts at innovation in scholarly communication
Attempts at innovation in scholarly communicationAttempts at innovation in scholarly communication
Attempts at innovation in scholarly communication
 
The Web as infrastructure for scholarly research and communication
The Web as infrastructure for scholarly research and communicationThe Web as infrastructure for scholarly research and communication
The Web as infrastructure for scholarly research and communication
 
Motivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationMotivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustration
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the Web
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
Memento: Big Leaps Towards Seamless Navigation of the Web of the Past
Memento: Big Leaps Towards Seamless Navigation of the Web of the PastMemento: Big Leaps Towards Seamless Navigation of the Web of the Past
Memento: Big Leaps Towards Seamless Navigation of the Web of the Past
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference Linking
 
Untitled I: Challenges ahead
Untitled I: Challenges aheadUntitled I: Challenges ahead
Untitled I: Challenges ahead
 
The Era of Open
The Era of OpenThe Era of Open
The Era of Open
 
Ngsp
NgspNgsp
Ngsp
 
An Introduction to Force11 at WWW2013
An Introduction to Force11 at WWW2013An Introduction to Force11 at WWW2013
An Introduction to Force11 at WWW2013
 
Overview of Digital Publishing
Overview of Digital PublishingOverview of Digital Publishing
Overview of Digital Publishing
 

Similaire à A Clean Slate?

The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
Carole Goble
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...
Lucy McKenna
 

Similaire à A Clean Slate? (20)

The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
Metadata for Research Objects
Metadata for Research ObjectsMetadata for Research Objects
Metadata for Research Objects
 
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
 
SEEK for Science: A Data and Model Management Platform to support Open and Re...
SEEK for Science: A Data and Model Management Platform to support Open and Re...SEEK for Science: A Data and Model Management Platform to support Open and Re...
SEEK for Science: A Data and Model Management Platform to support Open and Re...
 
Research Objects for FAIRer Science
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science
 
Open Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and ExchangeOpen Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and Exchange
 
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
Zudilova-Seinstra-Elsevier-data and the article of the future-nfdp13
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
 
Research Objects @ HARMONY 2014
Research Objects @ HARMONY 2014Research Objects @ HARMONY 2014
Research Objects @ HARMONY 2014
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community Update
 
Networked Science, And Integrating with Dataverse
Networked Science, And Integrating with DataverseNetworked Science, And Integrating with Dataverse
Networked Science, And Integrating with Dataverse
 
2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects2012 03-28 Wf4ever, preserving workflows as digital research objects
2012 03-28 Wf4ever, preserving workflows as digital research objects
 
RDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest GroupRDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest Group
 
Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...Engaging Information Professionals in the Process of Authoritative Interlinki...
Engaging Information Professionals in the Process of Authoritative Interlinki...
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Knowledge Infrastructure for Global Systems Science
Knowledge Infrastructure for Global Systems ScienceKnowledge Infrastructure for Global Systems Science
Knowledge Infrastructure for Global Systems Science
 

Plus de Herbert Van de Sompel

ResourceSync Overview
ResourceSync OverviewResourceSync Overview
ResourceSync Overview
Herbert Van de Sompel
 

A Clean Slate?

  • 1. A Clean Slate? @hvdsomp http://public.lanl.gov/herbertv/ herbert van de sompel Includes slides by Sean Bechhofer, Carole Goble, Robert Sanderson
  • 2. Context of My Work, My Talk: paper-based scholarly communication system; scanned version of paper-based scholarly communication system; natively digital, web-based scholarly communication system; painful transition
  • 3. In Silico (Computational) Science Datasets Data collections Algorithms Configurations Tools and Apps Codes Code Libraries Services, Infrastructure, Compilers Hardware Simulations, data exploration, data processing, analytics, database based, text mining, auto recommendation, visual analytics…Actually Digital Science is just Science Carole Goble, JCDL 2012 Keynote https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
  • 4. Scientific Workflows, Services, Data, Workflow Engines. All components continuously in flux. How to reproduce results in such an environment? Carole Goble, JCDL 2012 Keynote https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
  • 5. A Lot of Rs for Reproducibility
      •  Rerun: re-execute the original experiment using a revised setting.
      •  Review: validate and justify the results empirically. Trust. Understand. Train. Convincing and comfort.
      •  Replicate / Repeat: exactly replicate the original experiment. Eliminate change.
      •  Reproduce: run the experiment with differences in elements (materials, methods, platform or setting) and compare to test for the same result.
      •  Replay: run through what happened using logs, without the original platform or the need to execute.
      Carole Goble, JCDL 2012 Keynote https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
  • 6. A Lot of Rs for Reuse
      •  Refresh: execute an upgraded original experiment.
      •  Reconstruct: rebuild using new elements or a different platform when the originals are lost, unavailable or inaccessible.
      •  Reuse: use as part of new experiments.
      •  Repurpose / Reassemble: reuse elements in a new experiment.
      Carole Goble, JCDL 2012 Keynote https://dl.dropbox.com/u/617206/JCDL2012keynoteGoble.ppt
  • 7. The Article is the Knowledge Bottleneck “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.” Buckheit, J. and Donoho, D. (1995) WaveLab and Reproducible Research http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.2982
  • 8. The Article is the Knowledge Bottleneck “Changes are occurring in the ways in which scientific research is conducted. Within e-laboratories, methods such as scientific workflows, research protocols, standard operating procedures and algorithms for analysis and simulation are used to manipulate and produce data. Experimental or observational data and scientific models are typically born digital with no physical counterpart. This move to digital content is driving a sea-change in scientific publication, and challenging traditional scholarly publication.” Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
  • 9. If not the Article, then What?
      •  Involved in each such experiment is a complex set of resources with complex relationships
      •  There is a need to share these resources in order to support forms of reuse and reproducibility
      •  This entails the augmentation of the scholarly record with an explicit account of the research process
      •  Digital exchange of each resource individually is trivial; exchange of the combined knowledge is not
      •  Traditional electronic publications cannot handle this job:
          •  Targeted at humans, not machines
          •  Communicates findings, not all the scientific knowledge behind the findings
          •  Content not decomposable into actionable units
          •  Outputs, results, methods not reusable
      Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
  • 10. The Clean Slate Challenge
  • 11. The Clean Slate Challenge Add features to support these needs to the existing scholarly communication system?
  • 12. The Clean Slate Challenge Start with a clean slate?
  • 14. Research Objects: Aggregated Content
      •  Data used or results produced in an experiment study
      •  Methods employed to produce and analyze that data
      •  Provenance and setting information about the experiments
      •  People involved in the investigation
      •  Annotations about these resources, that are essential to the understanding and interpretation of the scientific outcomes captured by a research object.
      http://www.researchobject.org/
  • 17. Research Objects: Aggregation “Research Objects are aggregations of content. Thus a Research Object framework needs to provide a mechanism for this aggregation. Aggregations are likely to include references to resources but there may also, however, be situations, where, for reasons of efficiency or in order to support persistence, Research Objects should also be able to aggregate literal data as well as references to data.” Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
  • 18. OAI-ORE observation: scholarly assets are rapidly becoming compound, consisting of multiple resources
      •  e.g. datasets, software, ontologies, workflows, online debate, slides, blogs, videos, etc., with various:
          •  Relationships
          •  Interdependencies
      •  How to convey this compound-ness in an interoperable manner so that applications can access and consume such assets?
      2007. Funded by the Mellon Foundation & Microsoft Research. http://www.openarchives.org/ore/
  • 21. Foundations of the ORE Solution
      •  Web Architecture: Resource, URI, Representation
      •  Semantic Web:
          •  URIs for documents (information resources)
          •  URIs for physical entities, concepts, abstractions (non-information resources)
          •  RDF to express properties and relationships pertaining to resources
      •  Linked Data:
          •  HTTP URIs for both information and non-information resources
          •  HTTP 303 redirect:
              •  From: the HTTP URI of a non-information resource
              •  To: the HTTP URI of an information resource that describes the non-information resource
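The 303 redirect pattern on this slide can be sketched in a few lines of Python. This is only an in-memory illustration, not part of the deck: the `example.org` URIs, the `DESCRIBED_BY` table and both function names are assumptions, and no real HTTP traffic is involved.

```python
from typing import Optional, Tuple

# Hypothetical URIs: a non-information resource (an aggregation) and the
# information resource (a document) that describes it.
DESCRIBED_BY = {
    "http://example.org/id/aggregation-1": "http://example.org/doc/resource-map-1",
}


def respond(uri: str) -> Tuple[int, Optional[str]]:
    """Return (status, Location) the way a Linked Data server might."""
    if uri in DESCRIBED_BY:
        # Non-information resource: redirect to its description with 303.
        return 303, DESCRIBED_BY[uri]
    if uri in DESCRIBED_BY.values():
        # Information resource: a representation can be served directly.
        return 200, None
    return 404, None


def dereference(uri: str, max_hops: int = 5) -> str:
    """Follow 303 redirects until an information resource is reached."""
    for _ in range(max_hops):
        status, location = respond(uri)
        if status == 200:
            return uri
        if status == 303 and location is not None:
            uri = location
            continue
        raise LookupError(f"{uri} returned {status}")
    raise LookupError("too many redirects")
```

Calling `dereference("http://example.org/id/aggregation-1")` ends at the describing document, which is where a client would find the RDF about the aggregation.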
  • 30. Adding Account of Research Life Cycles to Scholarly Record Pepe, A., Mayernik, M., Borgman, C., Van de Sompel, H. (2009) Technology to Represent Scientific Practice: Data, Life Cycles, and Value Chains. http://dx.doi.org/10.1002/asi21263
  • 31. ORE & Research Objects “…, Research Objects should also be able to aggregate literal data as well as references to data.”
      •  Aggregated Resources in ORE have HTTP URIs; this probably needs to be relaxed.
      •  Embedding content in RDF, irrespective of ORE, is … interesting
          •  See: Representing Content in RDF 1.0 http://www.w3.org/TR/Content-in-RDF10/
          •  Allows embedding base64, text, XML
      •  Resource Map as manifest in e.g. a ZIP file?
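One way to picture the "Resource Map as manifest in a ZIP file" idea is the sketch below. It is an assumption-laden illustration rather than anything the slides specify: the file name `manifest.ttl`, the function name `build_package` and the use of relative paths for aggregated resources are all hypothetical. Only the `ore:describes` and `ore:aggregates` terms come from the ORE vocabulary.

```python
import io
import zipfile
from typing import Dict

ORE = "http://www.openarchives.org/ore/terms/"


def build_package(aggregation_uri: str, files: Dict[str, bytes]) -> bytes:
    """Bundle files plus a Turtle manifest that lists them as aggregated."""
    lines = [f"@prefix ore: <{ORE}> .", ""]
    # The manifest plays the role of the Resource Map.
    lines.append(f"<manifest.ttl> ore:describes <{aggregation_uri}> .")
    for name in sorted(files):
        # Relative paths instead of HTTP URIs: the relaxation the slide hints at.
        lines.append(f"<{aggregation_uri}> ore:aggregates <{name}> .")
    manifest = "\n".join(lines) + "\n"

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.ttl", manifest)
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

The literal data travels inside the package, while the manifest keeps the aggregation explicit for any consumer that unpacks it.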
  • 33. Research Objects: Annotation “Annotations about these resources, that are essential to the understanding and interpretation of the scientific outcomes captured by a research object.” http://www.researchobject.org/
  • 34. Open Annotation Collaboration
      •  Annotation is a pervasive scholarly activity, conducted by people and machines
      •  Many annotation efforts and tools
      •  But annotations are stuck in silos:
          •  Only consumable by the client that created them
          •  Annotations not shareable beyond the original environment
      •  Open Annotation focuses on interoperability for annotations in order to allow sharing of annotations across:
          •  Annotation clients
          •  Content collections
          •  Services that leverage annotations
      2009. Funded by the Mellon Foundation. http://www.openannotation.org/spec/core/
  • 35. W3C Open Annotation Community Group
      •  Established to reconcile the Open Annotation Collaboration and Annotation Ontology models
      •  67 participants from around the world: 7th of 119 groups. Many universities, also commercial and not-for-profit
      •  Mission: Interoperability between Annotation systems and platforms, by
          •  …following the Architecture of the Web
          •  …reusing existing web standards
          •  …providing a single, coherent model to implement
          •  …without requiring adoption of specific platforms
          •  …while maintaining low implementation costs
      http://www.w3.org/community/openannotation/
  • 36. What is an Annotation? “An Annotation is considered to be a set of connected resources, typically including a body and target, where the body is related to the target.”
      •  Highlighting, Bookmarking …Provide an Aide-Memoire
      •  Commenting, Describing …Share and Inform
      •  Tagging, Linking …Improve Discovery
      •  Classifying, Identifying …Organize Resources
      •  Questioning, Replying …Interact with Others
      •  Editing, Moderating …Create as well as Consume
      http://www.w3.org/community/openannotation/
  • 39. Basic Open Annotation Data Model
  • 44. Further Specification of Resources. Specific Body and Specific Target resources identify the region of interest, and/or the state of the resource. Need to be able to describe the state of the resource, the segment of interest, and potentially styling hints for how to render it. Open Annotation introduces:
      •  State: describes how to retrieve the representation
      •  Selector: describes how to select the segment
      •  Style: describes how to render/process the segment
      •  Scope: describes the context of the resource
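The body/target model and the Specific Resource refinement can be sketched as plain Python dictionaries. This is a hedged illustration: the function name `make_annotation` and the dictionary layout are assumptions, not an official serialization, and the `oa:` term names (`oa:hasBody`, `oa:hasTarget`, `oa:SpecificResource`, `oa:hasSource`, `oa:hasSelector`, `oa:hasState`) should be checked against the Open Annotation specification before use.

```python
from typing import Any, Dict, Optional


def make_annotation(
    anno_uri: str,
    body_uri: str,
    source_uri: str,
    fragment: Optional[str] = None,
    when: Optional[str] = None,
) -> Dict[str, Any]:
    """Build an annotation; refine the target only when a segment or state is given."""
    target: Any = source_uri
    if fragment is not None or when is not None:
        # A Specific Resource wraps the source and narrows it down.
        target = {"@type": "oa:SpecificResource", "oa:hasSource": source_uri}
        if fragment is not None:
            # Selector: which segment of the source is of interest.
            target["oa:hasSelector"] = {
                "@type": "oa:FragmentSelector",
                "rdf:value": fragment,
            }
        if when is not None:
            # State: which version of the source was annotated.
            target["oa:hasState"] = {"@type": "oa:TimeState", "oa:when": when}
    return {
        "@id": anno_uri,
        "@type": "oa:Annotation",
        "oa:hasBody": body_uri,
        "oa:hasTarget": target,
    }
```

Passing only a body and a source yields the basic model; adding `fragment` or `when` covers the "segment of interest" and "changing content at the same URI" use cases.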
  • 45. Use Case: Changing Content at the Same URI
  • 46. Use Case: Segment of Interest
  • 47. W3C Open Annotation & Research Objects
      •  Early renderings of Research Objects emerging from the Wf4Ever project use Annotation Ontology as the annotation framework
      •  But since the Annotation Ontology and Open Annotation Collaboration models are now merging into the W3C Open Annotation model, it is safe to assume W3C Open Annotation will be used for Research Objects
  • 49. Research Objects: Versioning and Evolution “Research Objects are dynamic in that their contents can change and be changed – additional contents may be added to aggregations, or additional metadata can be asserted about the contents or relationships between content. The resources that are aggregated may change. Thus there is a need for versioning, allowing the recording of changes to objects, potentially along with facilities for retrieving objects or aggregated elements at particular historical points in their lifecycle.” Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
  • 50. ORE Experiment: Versioning and Evolution of Compound Objects Van de Sompel, H. et al. (2007) Appendix to Interoperability for the Discovery, Use, and Re-Use of Units of Scholarly Communication http://www.ctwatch.org/quarterly/articles/2007/08/interoperability-for-the-discovery-use-and-re-use-of-units-of-scholarly-communication/
  • 51. Memento is about the Web and time:
      •  Resources evolve over time
      •  Only the current representation is available from a resource’s URI
      •  How to seamlessly access prior representations, if they exist?
      •  Memento looks at this problem for the Web, in general
      2009. Funded by the Library of Congress. Digital Preservation Award 2010. http://www.mementoweb.org/
  • 52. URI for Original, URI for Version (Web Archive)
      •  URI-R: http://www.cnn.com/
      •  URI-M: http://web.archive.org/web/20010911203610/http://www.cnn.com/
  • 53. URI for Original, URI for Version (CMS)
      •  URI-R: http://en.wikipedia.org/wiki/September_11_attacks
      •  URI-M: http://en.wikipedia.org/w/index.php?title=September_11_attacks&oldid=282333
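For the web-archive case, the relationship between the two URIs can be read straight off the URI-M. The sketch below splits a Wayback-style URI-M into its URI-R and archival datetime; the function name `split_uri_m` is an assumption, and the pattern only covers the 14-digit timestamp form shown on the slide, not the CMS form with `oldid`.

```python
import re
from datetime import datetime, timezone
from typing import Tuple

# Matches URI-Ms of the form shown above: /web/<YYYYMMDDhhmmss>/<URI-R>
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_uri_m(uri_m: str) -> Tuple[str, datetime]:
    """Return (URI-R, archival datetime in UTC) for a Wayback-style URI-M."""
    match = WAYBACK.match(uri_m)
    if match is None:
        raise ValueError(f"not a recognised URI-M: {uri_m}")
    stamp = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return match.group(2), stamp.replace(tzinfo=timezone.utc)
```

Applied to the URI-M on slide 52, this returns `http://www.cnn.com/` together with 11 September 2001, 20:36:10 UTC.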
  • 60. Time Travel for the Web: Demo   http://www.mementoweb.org/demo/Memento_Time_Travel.mov
  • 65. Memento & Research Objects
      The combination of:
      •  Pro-active archiving of Research Objects and their constituent resources, using:
          •  Web archiving techniques, e.g. crawling, transactional archiving
          •  Platforms with strong versioning capabilities, e.g. datawikis, GitHub
      •  Assigning URIs to Research Objects and their constituent resources according to the well-established time-generic (URI-R) and time-specific (URI-M) resource pattern
      •  The Memento protocol to access time-specific versions of Research Objects and their constituent resources via their time-generic URI and a timestamp
      makes a good candidate for addressing the versioning and evolution need.
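Access "via the time-generic URI and a timestamp" amounts to datetime negotiation: the client sends the desired datetime in an `Accept-Datetime` request header and is led to a suitable URI-M. The sketch below builds that header and mimics the selection step locally. Picking the version closest in time is an assumption about server behaviour, and the function names and `example.org` URIs are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def accept_datetime_header(when: datetime) -> Dict[str, str]:
    """Build the request header carrying the desired datetime (HTTP-date, GMT)."""
    utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


def closest_memento(mementos: Dict[str, datetime], when: datetime) -> str:
    """Pick the URI-M whose archival datetime is nearest to the requested one.

    `mementos` maps URI-M -> archival datetime (all timezone-aware).
    """
    if not mementos:
        raise ValueError("no versions known for this resource")
    return min(mementos, key=lambda uri_m: abs(mementos[uri_m] - when))
```

For a Research Object, the same request could be repeated for each constituent resource so that all parts are retrieved as they were around one point in time.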
  • 67. Research Objects: Provenance “The issue of provenance, and being able to audit experiments and investigations is key to the scientific method. Third parties must be able to audit the steps performed in an experiment in order to be convinced of the validity of results. Audit is required not just for regulatory purposes, but allows for the results of experiments to be interpreted and reused, thus a Research Object should provide sufficient information to support audit of the aggregation as a whole, its constituent parts, and any process that it may encapsulate.” Bechhofer S. et al (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge http://dx.doi.org/10.1038/npre.2010.4626.1
  • 68. Provenance. Van de Sompel, H. (2003) Roadblocks http://www.sis.pitt.edu/~dlwkshop/paper_sompel.html
  • 69. Open Provenance Model. Moreau, L. et al. (2010) The Open Provenance Model: Abstract Model http://eprints.ecs.soton.ac.uk/21449/
  • 72. The Clean Slate Challenge
  • 73. ResourceSync is about synchronization of web resources, things with a URI that can be dereferenced
      •  Small websites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources)
      •  Low change frequency (weeks/months) to high change frequency (seconds)
      •  Synchronization latency and accuracy needs may vary
      •  Modular framework based on Sitemaps and extensions
      2012. Funded by the Sloan Foundation. http://www.openarchives.org/rs/
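Because the framework builds on Sitemaps, a source's inventory of resources can be sketched as an ordinary Sitemap document with a small extension element. The example below is a minimal sketch, not the normative format: the function name `resource_list` is hypothetical, and the `rs:md` element with its `capability` and `at` attributes follows the ResourceSync specification as commonly published, which should be consulted for the required details.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS = "http://www.openarchives.org/rs/terms/"


def resource_list(resources: Iterable[Tuple[str, str]], at: str) -> str:
    """Serialize (URI, last-modified) pairs as a Sitemap-based resource list."""
    ET.register_namespace("", SM)
    ET.register_namespace("rs", RS)
    urlset = ET.Element(f"{{{SM}}}urlset")
    # Extension element marking what kind of document this is and when
    # the snapshot of the source was taken.
    ET.SubElement(urlset, f"{{{RS}}}md", {"capability": "resourcelist", "at": at})
    for loc, lastmod in resources:
        url = ET.SubElement(urlset, f"{{{SM}}}url")
        ET.SubElement(url, f"{{{SM}}}loc").text = loc
        ET.SubElement(url, f"{{{SM}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

A destination that compares two such lists by `lastmod` can work out which resources to fetch again, which suits the low-change-frequency end of the range; sources that change every few seconds need a more direct change notification.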
  • 74. Hiberlink
      •  Investigates reference rot at massive scale:
          •  Citation rot: do HTTP references in scholarly articles still resolve?
          •  Content rot: if so, is the content at the end of the HTTP reference still representative of the content that was originally referenced?
      •  Investigates pro-active ways to archive HTTP-referenced resources that occur in scholarly articles
      2013. Funded by the Mellon Foundation. Soon at http://www.hiberlink.org
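The two questions on this slide can be turned into a simple classification, sketched below. It is only an illustration of the distinction, not the project's method: the function name, the text-similarity measure and the 0.8 threshold are all assumptions.

```python
from difflib import SequenceMatcher
from typing import Optional


def classify_reference(
    status: int,
    original: str,
    current: Optional[str],
    threshold: float = 0.8,  # assumed cut-off for "still representative"
) -> str:
    """Label a reference as 'citation rot', 'content rot' or 'intact'."""
    if status >= 400 or current is None:
        # The reference no longer resolves.
        return "citation rot"
    # The reference resolves: compare what is there now with what was cited.
    similarity = SequenceMatcher(None, original, current).ratio()
    return "intact" if similarity >= threshold else "content rot"
```

The `original` text is only available if the referenced page was archived around the time of citation, which is why the slide pairs the investigation with pro-active archiving.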
  • 77. A Clean Slate? @hvdsomp http://public.lanl.gov/herbertv/ herbert van de sompel Includes slides by Sean Bechhofer, Carole Goble, Robert Sanderson