Paving the way to open and interoperable
research data service workflows
Progress from 3 perspectives
Angus Whyte, Digital Curation Centre
Rory Macneil, Research Space
Stuart Lewis, University of Edinburgh
Repository Fringe, Edinburgh, 2nd August 2016
Angus Whyte, Digital Curation Centre
Research data service models, new DCC guidance and some draft ‘design
principles’ for institutions integrating research data service workflows
Rory Macneil, Research Space
Integrating the RSpace ELN with University of Edinburgh’s DataShare and
Harvard’s Dataverse repositories
Stuart Lewis, University of Edinburgh
DataVault - Jisc Research Data Spring prototype for packaging data to be archived
Guidance from Digital Curation Centre
Aims to support organisations effectively in:
• Providing effective research data services
• Promoting reusability of their research data
• and reproducibility of their research
Models that reflect and help shape reality
Higgins, S. (2008) The DCC Curation Lifecycle Model. Int. Journal Digital Curation 2008, Vol. 3, No. 1, pp. 134-140 doi:10.2218/ijdc.v3i1.48
Key element of
DCC guidance
e.g. Curation
Lifecycle Model
(2008)
DCC research data service model (2016)
RDM policy & strategy
Business plans & sustainability
Training
Advisory services
Data Management Planning
Active data management
Appraisal & risk assessment
Preservation
Access & publishing
Discovery
Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016)
Some working definitions
Research data service – “a means of delivering value to the producers and
users of digital objects by facilitating outcomes they want to achieve without
the ownership of specific costs or risks” (derived from ITIL definition of a service)
Sub-types:
Active data management: “services used to create or transform digital
objects for the purposes of research”
Preservation: “services offering to ensure digital objects meet a defined
level of FAIRness - findability, accessibility, interoperability, and reusability -
for a designated community and period of time”
Publication: “services offering to enhance digital objects’ FAIRness by
reviewing their quality on specified criteria, or connecting them to
additional metadata”
Guidance: “services offering practical guidance on choosing or using the
above services”
Draft design principles for integration of research data
service workflows
1. Active data management services should use open
standards to express and expose the objects and
metadata they offer to downstream services,
including their access and reuse terms
2. Preservation and publication services should publish
policies stating what digital object types they accept,
for what communities, and on what terms and
conditions
3. Preservation and publication services should make
openly available sufficient metadata to enable reuse
of their outputs, including all terms and conditions
for third-party access and reuse
4. Active data management, preservation and
publication services should make sufficient detail of
their workflows available to support the
reproducibility of research that produced the digital
objects they act upon
5. Guidance services should support users of other
services to make an informed choice of downstream
service capabilities, informed by best practices for
reuse and reproducibility
6. Guidelines 1-5 should be implemented using
machine-actionable content
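As one illustration of principles 2 and 6, a preservation or publication service could publish its acceptance policy as structured data that an upstream tool checks before deposit. The sketch below is hypothetical: the field names (`accepted_types`, `communities`, `terms_url`), the example values and the `accepts` helper are assumptions made for illustration, not part of the DCC guidance or of any existing repository API.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServicePolicy:
    """Hypothetical machine-actionable policy for a repository service."""
    service: str
    accepted_types: frozenset = field(default_factory=frozenset)
    communities: frozenset = field(default_factory=frozenset)
    terms_url: str = ""

    @classmethod
    def from_json(cls, text: str) -> "ServicePolicy":
        # Parse a published policy document into a typed record.
        raw = json.loads(text)
        return cls(
            service=raw["service"],
            accepted_types=frozenset(raw.get("accepted_types", [])),
            communities=frozenset(raw.get("communities", [])),
            terms_url=raw.get("terms_url", ""),
        )

    def accepts(self, media_type: str, community: str) -> bool:
        # An object is accepted only if both its type and its community match.
        return media_type in self.accepted_types and community in self.communities


# Example policy document (all values are invented for this sketch).
POLICY_JSON = """
{
  "service": "example-data-repository",
  "accepted_types": ["text/csv", "application/xml"],
  "communities": ["institutional-researchers"],
  "terms_url": "https://repository.example.org/terms"
}
"""

policy = ServicePolicy.from_json(POLICY_JSON)
print(policy.accepts("text/csv", "institutional-researchers"))
```

An active data management tool could run a check like `policy.accepts(...)` before offering a repository as a deposit target, which is one way guidance services might support an informed choice of downstream service.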
Real services are made up of many more parts
Tools and services at researchers' disposal are increasing
[Diagram: lifecycle stages of the DCC service model: Data Management Planning, Active data management, Appraisal & risk assessment, Preservation, Access & publishing, Discovery]
‘Lifecycles’ are often non-linear
Integration use cases do not follow neat sequences
One size does not fit all
What about Functional Models?
OAIS: foundation for Trustworthy Digital Repository standards
[Diagram: OAIS functional model. Functions: Ingest, Data management, Archival storage, Access, Administration, Preservation Planning. Packages: SIP, AIP, DIP. Actors: Producer, Users. Annotations: Context, Description, RDM]
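The flow of information packages in the OAIS model can be sketched in code: a Submission Information Package (SIP) from the producer is ingested to form an Archival Information Package (AIP), from which a Dissemination Information Package (DIP) is derived for users. OAIS defines concepts rather than an API, so every class, field and function name below is an assumption made for illustration, and the SHA-256 fixity check is just one simple way to show archival storage verifying content.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SIP:
    """Submission Information Package, as received from a producer."""
    producer: str
    content: bytes
    description: str


@dataclass(frozen=True)
class AIP:
    """Archival Information Package, with provenance and fixity added."""
    identifier: str
    content: bytes
    description: str
    provenance: tuple
    fixity_sha256: str


@dataclass(frozen=True)
class DIP:
    """Dissemination Information Package, as delivered to a user."""
    identifier: str
    content: bytes
    description: str


def ingest(sip: SIP) -> AIP:
    # Ingest: record a checksum and provenance alongside the content.
    digest = hashlib.sha256(sip.content).hexdigest()
    return AIP(
        identifier=f"aip-{digest[:12]}",
        content=sip.content,
        description=sip.description,
        provenance=(f"submitted by {sip.producer}",),
        fixity_sha256=digest,
    )


def access(aip: AIP) -> DIP:
    # Access: verify fixity before handing a copy to the user.
    if hashlib.sha256(aip.content).hexdigest() != aip.fixity_sha256:
        raise ValueError("fixity check failed")
    return DIP(aip.identifier, aip.content, aip.description)


sip = SIP(producer="lab-a", content=b"x,y\n1,2\n", description="demo dataset")
dip = access(ingest(sip))
```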
RDM platforms can be grafted on, to relate OAIS functions to context info sources
[Diagram: the OAIS functional model (Ingest, Data management, Archival storage, Access, Administration, Preservation Planning; SIP, AIP, DIP; Producer, Users; Context, Description, RDM) with RDM platforms and services added: CRIS, Open access Repository, Data Catalogue, Collection Appraisal, Data Preservation, Active Data Management, Data Mgmt Planning, Data Publication]
What best practice models apply ‘upstream’?
Q. How can institutions ensure researchers have an informed choice of
trustworthy services before they need a repository?
• “Whole lifecycle” service models are one solution
• But how should they be governed?
– Commercial services?
– Commons?
– Hybrid?
Commercial services model
e.g. Elsevier
“On Wednesday 1 June, Elsevier acquired
Hivebench to help further streamline the workflow
of researchers – putting research data
management at their fingertips. The added value
of the integration lies in linking Hivebench with
Elsevier’s existing Research Data Management
portfolio for products and services.
The research data that researchers have stored in
the Hivebench notebook are linked to the
Mendeley Data repository, which will be linked to
Pure. This ….adds instant value to the datasets
because they become far more suitable for reuse.”
Commons approach
e.g. Principles for Open Scholarly Infrastructures
“…What should a shared infrastructure look like?
Infrastructure at its best is invisible. We tend to only
notice it when it fails. If successful, it is stable and
sustainable. Above all, it is trusted and relied on by the
broad community it serves. Trust must run strongly
across each of the following areas: running the
infrastructure (governance), funding it (sustainability),
and preserving community ownership of it
(insurance).” (emphasis added)
Cameron Neylon, Science in the Open, 25 Feb 2015
Commons approach
e.g. Open Science Framework
Workflows in effect governed by a ‘mixed economy’ of platforms
So what principles should apply?
Mathew Spitzer, An Open Science Framework for Solving Institutional Challenges: Supporting the Institutional Research Method. Research Data Access and Preservation Summit, Atlanta, GA, May 4-7, 2016
Background - RDA Working Group on Data Publishing
Workflows
Reviewed 25 examples of repository data publishing workflows,
including some integrating with ‘downstream’ services e.g. data
journals for peer review
Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo.
http://dx.doi.org/10.5281/zenodo.34542
Drafted a reference model and made best practice
recommendations:
1. Start small, building modular, open source and shareable
components
2. Follow standards that facilitate interoperability and permit
extensions
3. Facilitate data citation, e.g. through use of digital object PIDs,
data/article linkages, researcher PIDs
4. Document roles, workflows and services
• Follow-up call (Dec 15) for examples of repositories
connecting with upstream research workflows e.g. to gather
metadata earlier
• Aiming to identify whether recommendations apply, and how
the intention to publish data is changing research practices.
• Collected 12 cases - mix of concrete examples, prototypes
(e.g. Dendro), conceptual models (e.g. Science 2.0
repositories)
• Report in preparation
Review of upstream workflow examples
Are the service components underpinning research workflows loosely coupled?*
– Modular design of workflow components? Yes, plenty of evidence
– Standard vocabularies and protocols to describe components? Some evidence
– Significant investment in building trust-based relationships among participants? Some evidence
– Standardized ways of specifying capabilities and performance requirements? Limited (e.g. WDS-DSA Catalogue of Requirements)
* ‘The Joy of Flex’, Hagel and Seely-Brown, 2005; see e.g. http://www.cio.com.au/article/29148/joy_flex
So do we need more …
E.g. DCC How-to describe research data service workflows (forthcoming)
• Definitions – to describe services?
• Design principles – to articulate best practice?
• Capability models – to articulate mutual expectations of service owners?
• Case studies of repository workflow integration?
• Understanding of how tools are actually being integrated, and the effect on data publication practices?
Thanks for comments to Suenje Dalmeier Tiessen and Amy Nurnberger, co-chairs of the RDA Working
Group on Data Publishing Workflows
Maintaining trust across the research cycle
Q. How do these principles apply in real cases?
Case Study 1 – Rory Macneil, Research Space
Integrating the RSpace ELN with University of Edinburgh’s DataShare
and Harvard’s Dataverse repositories
Current Repository Paradigm
Export data and metadata to an archive or repository
The reality is (a lot) more complex
Research units: Data; Files; Links
Vehicles: Repositories; Archives; ELNs; File systems; Databases
Actions: Capture and structure data; Generate metadata; Export data and metadata; Establish links to file(s) and databases; Track file locations
Issues
Data versus file links
Forms of data export
Data from intermediary vehicles
File location and integrity of links
Post deposit access - permissions
Post deposit access - capabilities
Pre-deposit impact on post-deposit capabilities
Repositories need the ability to easily ingest diverse data types and formats, and links to files, in a structured manner, directly and from other tools
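To make "in a structured manner" concrete, an export from an ELN or similar tool could carry a manifest that separates embedded files (with checksums, so integrity can be verified after deposit) from links to files held elsewhere. This is a minimal sketch under stated assumptions: the manifest layout, the field names and the `build_export_manifest` helper are hypothetical and do not describe the RSpace, DataShare or Dataverse export formats.

```python
import hashlib
import json


def build_export_manifest(title, embedded_files, external_links):
    """Build a hypothetical deposit manifest.

    embedded_files: dict mapping file name -> file content as bytes
    external_links: list of URLs or paths to files that stay where they are
    """
    return {
        "title": title,
        # Embedded files travel with the export and carry a checksum.
        "files": [
            {
                "name": name,
                "size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
            for name, data in sorted(embedded_files.items())
        ],
        # Links are recorded separately so the repository can track them.
        "links": [{"target": target, "embedded": False} for target in external_links],
    }


manifest = build_export_manifest(
    title="Example notebook entry",
    embedded_files={"results.csv": b"sample,value\nA,1\n"},
    external_links=["https://files.example.org/raw/run-001.dat"],
)
print(json.dumps(manifest, indent=2))
```

Keeping links distinct from embedded files reflects the "data versus file links" issue listed above: a repository can verify what it actually holds and flag what it only points to.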
The Dataverse – Starfish – RSpace project @Harvard Medical School
Towards a new paradigm
[Diagram labels]
Data, files and research
Capture and structure data √
Generate metadata √
Track file locations √
Access hyperlinks √
Make data and files available for public access and query
Export data and metadata in various formats, using open standards: PDF √, Word √, HTML √, XML √
Storage: on-premises or cloud file system; database
Case study 2
Stuart Lewis, University of Edinburgh
DataVault - Jisc Research Data Spring prototype for packaging data to be archived
Stuart Lewis
Deputy Director, Library &
University Collections
The University of Edinburgh
What is a Data Vault?
@JiscDataVault
Research Data Management Services
Data Management Support
Data Management Planning
Active Data Infrastructure
Data Stewardship
Data Stewardship
• DataVault
– Long term archival storage
– First envisaged a few years ago…
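The idea of packaging data for long-term archival storage can be sketched as follows: collect the files of a dataset into a single archive and record a checksum for each file and for the archive itself, so that integrity can be checked on retrieval. This is a generic illustration using the Python standard library, not the DataVault implementation; the function names and the manifest layout are assumptions.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    # Stream the file in chunks so large datasets do not fill memory.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_for_archive(source_dir: Path, archive_path: Path) -> dict:
    """Pack source_dir into a tar file and return a checksum manifest.

    archive_path is assumed to lie outside source_dir.
    """
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    # Per-file checksums, keyed by path relative to the dataset root.
    manifest = {p.relative_to(source_dir).as_posix(): sha256_of(p) for p in files}
    with tarfile.open(archive_path, "w") as tar:
        for p in files:
            tar.add(p, arcname=p.relative_to(source_dir).as_posix())
    return {"archive_sha256": sha256_of(archive_path), "files": manifest}
```

A caller would pass the dataset directory and a target path, for example `package_for_archive(Path("dataset"), Path("dataset.tar"))`, and store the returned manifest alongside the descriptive metadata for the deposit.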
What is the DataVault - Analogies
https://www.flickr.com/photos/brookward/8457736952
https://www.flickr.com/photos/timshelyn/412548076
Where does it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
Where does it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
PURE / CRIS / Data Catalogue
Where could it sit?
Active Storage
Lab Equipment
Other Media
Archival Storage
PURE / CRIS / Data Catalogue
Open Data Repository
Phase 1 (3 months)
Phase 2 (4 months)
Phase 3 (6 months)
Make available online
Metadata already captured from CRIS, plus files from the Vault
Stuart Lewis
stuart.lewis@ed.ac.uk
@stuartlewis

Contenu connexe

Tendances

Big Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University ChennaiBig Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University Chennaisethuraman R
 
Business cases and costs RDN
Business cases and costs RDNBusiness cases and costs RDN
Business cases and costs RDNJisc RDM
 
Text mining and machine learning
Text mining and machine learningText mining and machine learning
Text mining and machine learningJisc RDM
 
Rachel Bruce on DMP
Rachel Bruce on DMPRachel Bruce on DMP
Rachel Bruce on DMPJisc RDM
 
Data Curation: A New Frontier in Faculty-Librarian Collaboration
Data Curation: A New Frontier in Faculty-Librarian CollaborationData Curation: A New Frontier in Faculty-Librarian Collaboration
Data Curation: A New Frontier in Faculty-Librarian Collaborationjpotter49505
 
Measuring the costs and benefits of RDM to supporta a business case
Measuring the costs and benefits of RDM to supporta a business caseMeasuring the costs and benefits of RDM to supporta a business case
Measuring the costs and benefits of RDM to supporta a business caseJisc RDM
 
NDS Relevant Update from the NIH Data Science (ADDS) Office
NDS Relevant Update from the NIH Data Science (ADDS) OfficeNDS Relevant Update from the NIH Data Science (ADDS) Office
NDS Relevant Update from the NIH Data Science (ADDS) OfficePhilip Bourne
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015Vivien Bonazzi
 
BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands Vivien Bonazzi
 
NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsVivien Bonazzi
 
ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...
ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...
ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...ASIS&T
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...CILIP MDG
 
Jisc research data shared service overview IDCC 2016
Jisc research data shared service overview IDCC 2016Jisc research data shared service overview IDCC 2016
Jisc research data shared service overview IDCC 2016Jisc RDM
 
Jisc Research Data Management Shared Service Workshop: An institutional persp...
Jisc Research Data Management Shared Service Workshop: An institutional persp...Jisc Research Data Management Shared Service Workshop: An institutional persp...
Jisc Research Data Management Shared Service Workshop: An institutional persp...Jisc RDM
 
challenges of big data to big data mining with their processing framework
challenges of big data to big data mining with their processing frameworkchallenges of big data to big data mining with their processing framework
challenges of big data to big data mining with their processing frameworkKamleshKumar394
 
Komatsoulis internet2 global forum 2015
Komatsoulis internet2 global forum 2015Komatsoulis internet2 global forum 2015
Komatsoulis internet2 global forum 2015George Komatsoulis
 
Big data: Challenges, Practices and Technologies
Big data: Challenges, Practices and TechnologiesBig data: Challenges, Practices and Technologies
Big data: Challenges, Practices and TechnologiesNavneet Randhawa
 

Tendances (20)

Big Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University ChennaiBig Data As a service - Sethuonline.com | Sathyabama University Chennai
Big Data As a service - Sethuonline.com | Sathyabama University Chennai
 
Business cases and costs RDN
Business cases and costs RDNBusiness cases and costs RDN
Business cases and costs RDN
 
Jonathan Breeze, Symplectic
Jonathan Breeze, SymplecticJonathan Breeze, Symplectic
Jonathan Breeze, Symplectic
 
Text mining and machine learning
Text mining and machine learningText mining and machine learning
Text mining and machine learning
 
Rachel Bruce on DMP
Rachel Bruce on DMPRachel Bruce on DMP
Rachel Bruce on DMP
 
Data Curation: A New Frontier in Faculty-Librarian Collaboration
Data Curation: A New Frontier in Faculty-Librarian CollaborationData Curation: A New Frontier in Faculty-Librarian Collaboration
Data Curation: A New Frontier in Faculty-Librarian Collaboration
 
Measuring the costs and benefits of RDM to supporta a business case
Measuring the costs and benefits of RDM to supporta a business caseMeasuring the costs and benefits of RDM to supporta a business case
Measuring the costs and benefits of RDM to supporta a business case
 
NDS Relevant Update from the NIH Data Science (ADDS) Office
NDS Relevant Update from the NIH Data Science (ADDS) OfficeNDS Relevant Update from the NIH Data Science (ADDS) Office
NDS Relevant Update from the NIH Data Science (ADDS) Office
 
The NIH Data Commons - BD2K All Hands Meeting 2015
The NIH Data Commons -  BD2K All Hands Meeting 2015The NIH Data Commons -  BD2K All Hands Meeting 2015
The NIH Data Commons - BD2K All Hands Meeting 2015
 
BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands
 
NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data Commons
 
ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...
ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...
ESIP Federation: Community-Driven, Collaborative Governance - Carol Beaton Me...
 
Yale Day of Data
Yale Day of Data Yale Day of Data
Yale Day of Data
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
 
Jisc research data shared service overview IDCC 2016
Jisc research data shared service overview IDCC 2016Jisc research data shared service overview IDCC 2016
Jisc research data shared service overview IDCC 2016
 
Jisc Research Data Management Shared Service Workshop: An institutional persp...
Jisc Research Data Management Shared Service Workshop: An institutional persp...Jisc Research Data Management Shared Service Workshop: An institutional persp...
Jisc Research Data Management Shared Service Workshop: An institutional persp...
 
challenges of big data to big data mining with their processing framework
challenges of big data to big data mining with their processing frameworkchallenges of big data to big data mining with their processing framework
challenges of big data to big data mining with their processing framework
 
Komatsoulis internet2 global forum 2015
Komatsoulis internet2 global forum 2015Komatsoulis internet2 global forum 2015
Komatsoulis internet2 global forum 2015
 
Big data: Challenges, Practices and Technologies
Big data: Challenges, Practices and TechnologiesBig data: Challenges, Practices and Technologies
Big data: Challenges, Practices and Technologies
 
RDA UK
RDA UKRDA UK
RDA UK
 

Similaire à Paving the way to open and interoperable research data service workflows

Recognising data sharing
Recognising data sharingRecognising data sharing
Recognising data sharingJisc RDM
 
Jisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc RDM
 
Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Anna Fensel
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataSusanna-Assunta Sansone
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Geoffrey Fox
 
Jisc Research data shared service overview and update - May 2016
Jisc Research data shared service overview and update - May 2016Jisc Research data shared service overview and update - May 2016
Jisc Research data shared service overview and update - May 2016Jisc RDM
 
UKRDDS Project Overview - Feb 2016
UKRDDS Project Overview - Feb 2016UKRDDS Project Overview - Feb 2016
UKRDDS Project Overview - Feb 2016Christopher Brown
 
What infrastructure is necessary for successful research data management (RDM...
What infrastructure is necessary for successful research data management (RDM...What infrastructure is necessary for successful research data management (RDM...
What infrastructure is necessary for successful research data management (RDM...heila1
 
RD shared services and research data spring
RD shared services and research data springRD shared services and research data spring
RD shared services and research data springJisc RDM
 
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...IJEACS
 
Supporting Research Data Management in UK Universities: the Jisc Managing Res...
Supporting Research Data Management in UK Universities: the Jisc Managing Res...Supporting Research Data Management in UK Universities: the Jisc Managing Res...
Supporting Research Data Management in UK Universities: the Jisc Managing Res...L Molloy
 
Research Data Service at the University of Edinburgh
Research Data Service at the University of EdinburghResearch Data Service at the University of Edinburgh
Research Data Service at the University of EdinburghRobin Rice
 
Data citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 AltmanData citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 AltmanMicah Altman
 
12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”DuraSpace
 
Meeting the NSF DMP Requirement: March 7, 2012
Meeting the NSF DMP Requirement: March 7, 2012Meeting the NSF DMP Requirement: March 7, 2012
Meeting the NSF DMP Requirement: March 7, 2012IUPUI
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?Elena Simperl
 
RDM shared services at IDCC
RDM shared services at IDCCRDM shared services at IDCC
RDM shared services at IDCCJisc RDM
 

Similaire à Paving the way to open and interoperable research data service workflows (20)

Recognising data sharing
Recognising data sharingRecognising data sharing
Recognising data sharing
 
McGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and ScalingMcGeary Data Curation Network: Developing and Scaling
McGeary Data Curation Network: Developing and Scaling
 
Jisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring UpdateJisc Research Data Shared Service - Spring Update
Jisc Research Data Shared Service - Spring Update
 
Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)Towards Semantic APIs for Research Data Services (Invited Talk)
Towards Semantic APIs for Research Data Services (Invited Talk)
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
 
Jisc Research data shared service overview and update - May 2016
Jisc Research data shared service overview and update - May 2016Jisc Research data shared service overview and update - May 2016
Jisc Research data shared service overview and update - May 2016
 
UKRDDS Project Overview - Feb 2016
UKRDDS Project Overview - Feb 2016UKRDDS Project Overview - Feb 2016
UKRDDS Project Overview - Feb 2016
 
What infrastructure is necessary for successful research data management (RDM...
What infrastructure is necessary for successful research data management (RDM...What infrastructure is necessary for successful research data management (RDM...
What infrastructure is necessary for successful research data management (RDM...
 
RD shared services and research data spring
RD shared services and research data springRD shared services and research data spring
RD shared services and research data spring
 
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...
Stacked Generalization of Random Forest and Decision Tree Techniques for Libr...
 
Supporting Research Data Management in UK Universities: the Jisc Managing Res...
Supporting Research Data Management in UK Universities: the Jisc Managing Res...Supporting Research Data Management in UK Universities: the Jisc Managing Res...
Supporting Research Data Management in UK Universities: the Jisc Managing Res...
 
Johnston - How to Curate Research Data
Johnston - How to Curate Research DataJohnston - How to Curate Research Data
Johnston - How to Curate Research Data
 
Research Data Service at the University of Edinburgh
Research Data Service at the University of EdinburghResearch Data Service at the University of Edinburgh
Research Data Service at the University of Edinburgh
 
Data citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 AltmanData citationworkshop idcc_2014 Altman
Data citationworkshop idcc_2014 Altman
 
12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”12.10.14 Slides, “Roadmap to the Future of SHARE”
12.10.14 Slides, “Roadmap to the Future of SHARE”
 
Meeting the NSF DMP Requirement: March 7, 2012
Meeting the NSF DMP Requirement: March 7, 2012Meeting the NSF DMP Requirement: March 7, 2012
Meeting the NSF DMP Requirement: March 7, 2012
 
The web of data: how are we doing so far?
The web of data: how are we doing so far?The web of data: how are we doing so far?
The web of data: how are we doing so far?
 
RDM shared services at IDCC
RDM shared services at IDCCRDM shared services at IDCC
RDM shared services at IDCC
 
Digital Curation 101 - Taster
Digital Curation 101 - TasterDigital Curation 101 - Taster
Digital Curation 101 - Taster
 

Paving the way to open and interoperable research data service workflows

  • 1. Paving the way to open and interoperable research data service workflows Progress from 3 perspectives Angus Whyte, Digital Curation Centre Rory Macneil, Research Space Stuart Lewis, University of Edinburgh Repository Fringe, Edinburgh, 2nd August 2016
  • 2. Paving the way to open and interoperable research data service workflows Progress from 3 perspectives Angus Whyte, Digital Curation Centre Research data service models, new DCC guidance and some draft ‘design principles’ for institutions integrating research data service workflows Rory Macneil, Research Space Integrating the RSpace ELN with University of Edinburgh’s DataShare and Harvard’s Dataverse repositories Stuart Lewis, University of Edinburgh DataVault –Jisc Research Data Spring prototype for packaging data to be archived
  • 3. Guidance from Digital Curation Centre Aims to effectively support organisations with- • Providing effective research data services • Promoting reusability of their research data • and reproducibility of their research
  • 4. Models that reflect and help shape reality Higgins, S. (2008) The DCC Curation Lifecycle Model. Int. Journal Digital Curation 2008, Vol. 3, No. 1, pp. 134-140 doi:10.2218/ijdc.v3i1.48 Key element of DCC guidance e.g. Curation Lifecycle Model (2008)
  • 5. DCC research data service model (2016) RDM policy & strategy Business plans & sustainability Training Data Management Planning Active data management Appraisal & risk assessment Preservation Access & publishing Discovery Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016) Advisory services
  • 6. Some working definitions Research data service – “a means of delivering value to the producers and users of digital objects by facilitating outcomes they want to achieve without the ownership of specific costs or risks” (derived from ITIL definition of a service) Sub-types - Active data management “services used to create or transform digital objects for the purposes of research” Preservation: “services offering to ensure digital objects meet a defined level of FAIRness - findability, accessibility, interoperability, and reusability - for a designated community and period of time” Publication: “services offering to enhance digital objects FAIRness by reviewing their quality on specified criteria, or connecting them to additional metadata” Guidance: “services offering practical guidance on choosing or using the above services”
  • 7. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms
  • 8. Draft design principles for integration of research data service workflows 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions
  • 9. Draft design principles for integration of research data service workflows 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse
  • 10. Draft design principles for integration of research data service workflows 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon
  • 11. Draft design principles for integration of research data service workflows 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility
  • 12. Draft design principles for integration of research data service workflows 6. Guidelines 1-5 should be implemented using machine-actionable content
  • 13. DCC research data service model (2016) RDM policy & strategy Business plans & sustainability Training Advisory services Data Management Planning Active data management Appraisal & risk assessment Preservation Access & publishing Discovery Ref. DCC How-to Guide on Evaluating Data Repository and Catalogue Platforms (forthcoming, 2016)
  • 14. Data Management Planning Active data management Appraisal & risk assessment Preservation Access & publishing Discovery Real services are made up of many more parts; the tools and services at researchers’ disposal are increasing
  • 15. ‘lifecycles’ are often non-linear Integration use cases do not follow neat sequences One size does not fit all
  • 16. What about Functional Models? OAIS foundation for Trustworthy Digital Repository standards [Diagram labels: Producer, SIP, Ingest, Data management, Archival storage, AIP, Access, DIP, Users, Preservation Planning, Administration, RDM, Context, Description]
  • 17. RDM platforms can be grafted on, to relate OAIS functions to context info sources [Diagram labels: Producer, SIP, Ingest, Data management, Archival storage, AIP, Access, DIP, Users, Preservation Planning, Administration, RDM, Context, Description, CRIS, Open access Repository, Data Catalogue, Collection Appraisal, Data Preservation, Active Data Management, Data Mgmt Planning, Data Publication]
  • 18. What best practice models apply ‘upstream’? Q. How can institutions ensure researchers have an informed choice of trustworthy services before they need a repository? • “Whole lifecycle” service models are one solution • But how should they be governed? – Commercial services? – Commons? – Hybrid?
  • 19. Commercial services model e.g. Elsevier “On Wednesday 1 June, Elsevier acquired Hivebench to help further streamline the workflow of researchers – putting research data management at their fingertips. The added value of the integration lies in linking Hivebench with Elsevier’s existing Research Data Management portfolio for products and services. The research data that researchers have stored in the Hivebench notebook are linked to the Mendeley Data repository, which will be linked to Pure. This ….adds instant value to the datasets because they become far more suitable for reuse.”
  • 20. Commons approach e.g. Principles for Open Scholarly Infrastructures “…What should a shared infrastructure look like? Infrastructure at its best is invisible. We tend to only notice it when it fails. If successful, it is stable and sustainable. Above all, it is trusted and relied on by the broad community it serves. Trust must run strongly across each of the following areas: running the infrastructure (governance), funding it (sustainability), and preserving community ownership of it (insurance).” (emphasis added) Cameron Neylon, Science in the Open, 25 Feb 2015
  • 21. Commons approach e.g. Open Science Framework Workflows in effect governed by a ‘mixed economy’ of platforms. So what principles should apply? Mathew Spitzer, An Open Science Framework for Solving Institutional Challenges: Supporting the Institutional Research Method, Research Data Access and Preservation Summit 2016, Atlanta, GA, May 4-7, 2016
  • 22. Background - RDA Working Group on Data Publishing Workflows Reviewed 25 examples of repository data publishing workflows, including some integrating with ‘downstream’ services e.g. data journals for peer review. Austin, Claire C. et al. (2015). Key components of data publishing: Using current best practices to develop a reference model for data publishing. Zenodo. http://dx.doi.org/10.5281/zenodo.34542 Drafted a reference model and made best practice recommendations: 1. Start small, building modular, open source and shareable components 2. Follow standards that facilitate interoperability and permit extensions 3. Facilitate data citation, e.g. through use of digital object PIDs, data/article linkages, researcher PIDs 4. Document roles, workflows and services
  • 23. Background - RDA Working Group on Data Publishing Workflows • Follow up call (Dec 15) for examples of repositories connecting with upstream research workflows e.g. to gather metadata earlier • Aiming to identify whether recommendations apply, and how the intention to publish data is changing research practices. • Collected 12 cases - mix of concrete examples, prototypes (e.g. Dendro), conceptual models (e.g. Science 2.0 repositories) • Report in preparation
  • 24. Review of upstream workflow examples Are the service components underpinning research workflows loosely coupled? – Modular design of workflow components? Yes, plenty of evidence – Standard vocabularies and protocols to describe components? Some evidence – Significant investment in building trust-based relationships among participants? Some evidence – Standardized ways of specifying capabilities and performance requirements? Limited (e.g. WDS-DSA Catalogue of Requirements) * ‘The Joy of Flex’, Hagel and Seely-Brown, 2005, see e.g. http://www.cio.com.au/article/29148/joy_flex
  • 25. So do we need more … E.g. DCC How-to describe research data service workflows (forthcoming) • Definitions – to describe services ? • Design principles – to articulate best practice? • Capability models - to articulate mutual expectations of service owners? • Case studies of repository workflow integration? • Understanding of how tools are actually being integrated, and effect on data publication practices
  • 26. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility 6. Guidelines 1-5 should be implemented using machine-actionable content Thanks for comments to Suenje Dalmeier Tiessen and Amy Nurnberger, co-chairs of the RDA Working Group on Data Publishing Workflows
  • 27. Maintaining trust across the research cycle Q. How do these principles apply in real cases? Case Study 1 – Rory Macneil, Research Space Integrating the RSpace ELN with University of Edinburgh’s DataShare and Harvard’s Dataverse repositories
  • 28. Export data and metadata Archive or Repository Current Repository Paradigm
  • 29. The reality is (a lot) more complex. Research units: Data; Files; Links. Vehicles: Repositories; Archives; ELNs; File systems; Databases. Actions: Capture and structure data; Generate metadata; Export data and metadata; Establish links to file(s) and databases; Track file locations. Issues: Data versus file links; Forms of data export; Data from intermediary vehicles; File location and integrity of links; Post deposit access - permissions; Post deposit access - capabilities; Pre-deposit impact on post-deposit capabilities. Repositories need the ability to easily ingest diverse data types and formats, and links to files, in a structured manner, directly and from other tools
  • 30. The Dataverse – Starfish – RSpace project @Harvard Medical School Towards a new paradigm Data, files and research Capture and structure data Generate metadata Track file locations Make data and files available for public access and query
  • 31. Access hyperlinks √; Track file locations √; Export data and metadata in various formats, using open standards: PDF √, Word √, HTML √, XML √; On premises or Cloud file system; Database; Capture and structure data √; Generate metadata √; Track file locations √
  • 32. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility 6. Guidelines 1-5 should be implemented using machine-actionable content
  • 33. Case study 2 Stuart Lewis, University of Edinburgh DataVault –Jisc Research Data Spring prototype for packaging data to be archived
  • 34. Stuart Lewis Deputy Director, Library & University Collections The University of Edinburgh What is a Data Vault? @JiscDataVault
  • 35. Research Data Management Services Data Management Support Data Management Planning Active Data Infrastructure Data Stewardship
  • 36. Data Stewardship • DataVault – Long term archival storage – First envisaged a few years ago…
  • 37. What is the DataVault - Analogies https://www.flickr.com/photos/brookward/8457736952
  • 38. What is the DataVault - Analogies https://www.flickr.com/photos/timshelyn/4125480767
  • 39. Where does it sit? Active Storage Lab Equipment Other Media Archival Storage
  • 40. Where does it sit? Active Storage Lab Equipment Other Media Archival Storage PURE / CRIS / Data Catalogue
  • 41. Where could it sit? Active Storage Lab Equipment Other Media Archival Storage PURE / CRIS / Data Catalogue Open Data Repository
  • 42. Phase 1 (3 months) Phase 2 (4 months) Phase 3 (6 months)
  • 54. Make available online Metadata already captured from CRIS, plus files from the Vault
  • 57. Draft design principles for integration of research data service workflows 1. Active data management services should use open standards to express and expose the objects and metadata they offer to downstream services, including their access and reuse terms 2. Preservation and publication services should publish policies stating what digital object types they accept, for what communities, and on what terms and conditions 3. Preservation and publication services should make openly available sufficient metadata to enable reuse of their outputs, including all terms and conditions for third-party access and reuse 4. Active data management, preservation and publication services should make sufficient detail of their workflows available to support the reproducibility of research that produced the digital objects they act upon 5. Guidance services should support users of other services to make an informed choice of downstream service capabilities, informed by best practices for reuse and reproducibility 6. Guidelines 1-5 should be implemented using machine-actionable content
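
To make principles 2 and 6 a little more concrete, here is a minimal Python sketch of what a "machine-actionable" deposit policy might look like. It is an illustration only, not something taken from the talk or from any DCC, RSpace, DataShare, Dataverse or DataVault interface: the class names, field names, service name and example values are all hypothetical assumptions. The idea is that a preservation or publication service publishes its policy as structured data (here JSON), so that an upstream active data management service can check automatically whether an object would be accepted.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DepositPolicy:
    """Hypothetical policy published by a preservation/publication service (principle 2)."""
    service: str
    accepted_types: list   # media types the service accepts
    communities: list      # designated communities it serves
    licence: str           # terms and conditions for reuse


@dataclass
class DigitalObject:
    """Hypothetical description exposed by an active data management service (principle 1)."""
    identifier: str
    media_type: str
    community: str
    reuse_terms: str


def can_deposit(obj: DigitalObject, policy: DepositPolicy) -> bool:
    """Return True if the object's type and community are both covered by the policy."""
    return obj.media_type in policy.accepted_types and obj.community in policy.communities


# The downstream service publishes its policy as JSON (principle 6: machine-actionable).
policy = DepositPolicy(
    service="example-repository",            # assumed name, for illustration only
    accepted_types=["text/csv", "application/xml"],
    communities=["life-sciences"],
    licence="CC-BY-4.0",
)
policy_json = json.dumps(asdict(policy), sort_keys=True)

# An upstream tool reads the published policy and checks an export against it.
loaded = DepositPolicy(**json.loads(policy_json))
notebook_export = DigitalObject("doc-001", "application/xml", "life-sciences", "CC-BY-4.0")
print(can_deposit(notebook_export, loaded))
```

A real implementation would more plausibly build on an existing open standard for the policy vocabulary than on ad hoc field names; the sketch only shows the kind of automated check that becomes possible once policies are published in a structured form.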