4. Jisc Digital Futures
3 July, 2018 Jisc CNI UK Data Management and Support
Store
services
Playlists Diagnostic
tool builder
Curation
and remix
Learner
Analytics Services
Digital capability
Learning analytics
Digital launchpad
Apprentice workforce development
Digital leadership
Summer of student innovation
Analytics academy
Analytics labs
Qualification verification
App and content store
Research data discovery
Research data usage metrics
Equipment data
Repository and preservation platform
Research data shared service
?
5. UK Policy
» Awareness of regulatory environment
» Data access statement
» Policies and processes
» Data storage
» Structured metadata descriptions
» DOIs for data
» Data securely preserved for a minimum of 10 years from last use
» University roadmaps in place 2012, mandate in place from 1 May 2015
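The "10 years from last use" rule is a date calculation in which the clock restarts every time the data is used. The following is a minimal Python sketch of that reading. The function names are hypothetical, and the handling of a 29 February last-use date (rolled forward to 1 March) is an assumption, not something the policy states.

```python
from datetime import date

RETENTION_YEARS = 10  # minimum retention period quoted on the slide


def earliest_disposal_date(last_use: date, years: int = RETENTION_YEARS) -> date:
    """Return the first date on which the minimum retention period has elapsed."""
    try:
        return last_use.replace(year=last_use.year + years)
    except ValueError:
        # last_use was 29 February and the target year is not a leap year;
        # assumption: err on the side of keeping the data one day longer.
        return date(last_use.year + years, 3, 1)


def must_still_preserve(last_use: date, today: date) -> bool:
    """True while the dataset is still inside its minimum retention period."""
    return today < earliest_disposal_date(last_use)
```

For example, data last used on 1 May 2015 would need to be kept until at least 1 May 2025, and any later access would move that date forward.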
6. Concordat on Open Research Data
Principles of the Concordat on Open Research Data (July 2016) between research funders and universities:
1. Open access to research data is an enabler of high quality research, a facilitator of innovation and safeguards good research practice.
2. There are sound reasons why the openness of research data may need to be restricted but any restrictions must be justified and justifiable.
3. Open access to research data carries a significant cost, which should be respected by all parties.
4. The right of the creators of research data to reasonable first use is recognised.
5. Use of others’ data should always conform to legal, ethical and regulatory frameworks including appropriate acknowledgement.
6. Good data management is fundamental to all stages of the research process and should be established at the outset.
7. Data curation is vital to make data useful for others and for long-term preservation of data.
8. Data supporting publications should be accessible by the publication date and should be in a citeable form.
9. Support for the development of appropriate data skills is recognised as a responsibility for all stakeholders.
10. Regular reviews of progress towards open research data should be undertaken.
Developed by universities, funders and other UK stakeholders together – signed by UUK, RCUK, HEFCE and Wellcome.
Now need to develop the practical plans for this…
7. Issues Raised by Research Managers
» Encourage open data and open research, but find it difficult, or barely within their remit, to tell researchers what to do.
» Challenges
› Large data volumes (what to keep?)
› Duplication of datasets
› Cost of long-term storage
› Need for more evidence that re-use of data is happening
› Making data management plans more dynamic
› Reviewing and understanding local requirements and ownership over this area
› Need for a stronger community in this area
› Reporting from systems currently in use
› Integrations between systems in use
› Clearer funder policies
Source: RDSS Market Research 2017
8. Repository systems within institutions
[Chart: Repository systems in place within HE organisations. Segment values: 61%, 18%, 10%, 4%, and eight segments of 1% each. Based on 143 entries where data was available*; 64 organisations missing data.]
*Source: Jisc analysis (various sources; see spreadsheet for further information)
Based on information gathered from across the organisation, EPrints seems to dominate the repository space currently, with DSpace and Pure also being key providers.
9. CRIS Systems within institutions
Source: Corporate Information Systems Group (CISG) annual survey of UCISA members 2017. n=122 responses
[Chart: CRIS systems used by UCISA members. Segment values: 25%, 13%, 10%, 7%, 7%, 2%, 2%, 2%, 7%, 25%.]
[Chart: CRIS system delivery. In-house 62%, Software as a service 36%, Hybrid 2%.]
Pure is the most widely used CRIS system, followed by EPrints and Symplectic Elements. The majority (62%) have an in-house system.
10. CRIS system trends
Source: Corporate Information Systems Group (CISG) annual survey of UCISA members 2017 n=122, 2016 n=113, 2015 n=87
Trends in top 3 CRIS systems 2015-2017 (chart series: Pure, EPrints, Elements (Symplectic), None)

System                  2015   2016   2017
Pure                     25%    27%    25%
EPrints                   3%    12%    13%
Elements (Symplectic)    14%    12%    10%
Converis                  0%     8%     7%
Bespoke/in-house          6%     6%     7%
IRIS                      1%     1%     2%
Worktribe                 1%     4%     2%
Vidatum                   1%     0%     2%
Radar                     1%     1%     1%
Haplo                     0%     1%     1%
Ideate                    0%     1%     1%
InfoEd                    0%     1%     1%
Other                     6%     2%     2%
None                     36%    25%    25%
Not known                 1%     1%     0%
Looking at trend data on CRIS systems over the last 3 years, the proportion of UCISA members reporting that they have a CRIS system has risen. Whilst Pure's market share appears to have remained static, the data indicates that EPrints has increased its market share and Symplectic has seen a slight decline.
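These trend statements can be checked against the survey percentages. The following is a minimal Python sketch, assuming that "has a CRIS system" means every response other than "None" or "Not known"; that definition is an assumption made here, not one given by the survey. The variable and function names are illustrative.

```python
YEARS = (2015, 2016, 2017)

# Percentages copied from the CISG survey table for the rows used here.
SHARES = {
    "Pure": (25, 27, 25),
    "EPrints": (3, 12, 13),
    "Elements (Symplectic)": (14, 12, 10),
    "None": (36, 25, 25),
    "Not known": (1, 1, 0),
}


def adoption_by_year() -> dict:
    """Share of respondents with some CRIS: 100 minus 'None' and 'Not known'."""
    return {
        year: 100 - SHARES["None"][i] - SHARES["Not known"][i]
        for i, year in enumerate(YEARS)
    }


def change_2015_to_2017(system: str) -> int:
    """Percentage-point change in a system's share between 2015 and 2017."""
    series = SHARES[system]
    return series[-1] - series[0]


adoption = adoption_by_year()
top3_changes = {
    name: change_2015_to_2017(name)
    for name in ("Pure", "EPrints", "Elements (Symplectic)")
}
```

Under that assumption adoption rises from 63% to 75%, while the top three systems change by 0, +10 and -4 percentage points respectively.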
12. A sector-wide approach to research data management
Today’s RDM cottage industry
» No sector-wide visibility of RDM effectiveness
» Only 20% of research data from the 1990s is still usable
» Majority of research institutions want a solution which cuts the pain
» Research Funders require data to be managed and
» EPSRC have put onus on institution to preserve and make research data available as a condition of funding
» Stand-alone solutions need every institution to manage and maintain on limited budgets
» c.200 UK public sector institutions perform around £6b of funded research per year from the public purse
Jisc shared service
» Share and optimise cost and benefit across the sector
» Grow value of research data and increase research productivity
» A single, cost-effective, pre-integrated solution available to all
13. Why a Shared Service?
There is no single, easily available “solution” that meets universities’ requirements for better management of research data.
Research Data Network: http://researchdata.network
Web: https://www.jisc.ac.uk/rd/projects/research-data-shared-service
GitHub: https://github.com/JiscRDSS
14. Key researcher issues
Source: Jisc DAF Survey results 2016
Following input from our Expert Advisory Group, the Research Data Network, funders, and dialogue with global users and vendors, Jisc RDSS will address the following core researcher functional needs:
» Capture & reuse
» Preserve
» Report
» Advise & best practice
Filling a gap: 75% of respondents look first to their institution to preserve their data
Uptake of RDM: Only 40% of respondents have a Research Data Management plan
Advocacy: Only 16% of respondents are currently accessing university RDM support services
Metadata: Only 18% of respondents say they follow established metadata guidelines
Public datasets: >70% recognise that research is a public good and should be publicly released
Sensitive data: 41% of respondents have some form of sensitive data
15. University services to support RDM
“Support is woeful in the university currently, in particular
long-term data archiving is critically required. Most of my
non-current data is rotting on CD's and hard-drives.”
16. University services to support RDM
“Please, individualise the support. Workshop are useless,
emails with information are useless, brochures are useless,
posters are useless.”
18. Preservation of research data
“I currently spend about £1,200 pa on data storage from my
own salary. I have the highest data needs in my School, and
there is no plan in place for storing my data.”
21. Service workflow summary
Components: Repository; Messaging Layer; Preservation service; Reporting and analytics; Archival data storage; National research data aggregation; Institutional or external services
1.a. Researcher deposits data
Or
1.b. Record of data external deposit
2. Data added to aggregation
3. Data is automatically preserved
4. Use of data and service is monitored
5. Other services are updated
6. Researchers find and reuse data
7. Data stored long term
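The workflow can be read as one deposit event fanning out through the messaging layer to the other services. The following is a rough Python sketch of that idea, not the RDSS implementation. The `MessagingLayer` class, the `"deposit"` topic and the handler wording are all hypothetical, and an in-memory list stands in for the real services.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessagingLayer:
    """Tiny in-memory stand-in for the messaging layer (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Handlers run in the order they subscribed.
        for handler in self._subscribers[topic]:
            handler(message)


log: List[str] = []
bus = MessagingLayer()

# Steps 2 to 5 of the workflow, each reacting to the same deposit message.
bus.subscribe("deposit", lambda m: log.append(f"2. aggregated {m['dataset']}"))
bus.subscribe("deposit", lambda m: log.append(f"3. preserved {m['dataset']}"))
bus.subscribe("deposit", lambda m: log.append(f"4. usage recorded for {m['dataset']}"))
bus.subscribe("deposit", lambda m: log.append(f"5. other services updated for {m['dataset']}"))


def deposit(dataset: str, external: bool = False) -> None:
    """Step 1.a (local deposit) or 1.b (record of an external deposit)."""
    step = "1.b" if external else "1.a"
    log.append(f"{step}. deposit recorded for {dataset}")
    bus.publish("deposit", {"dataset": dataset, "external": external})


deposit("survey-2018")
```

The point of the design is that the repository only announces the deposit; aggregation, preservation, reporting and other services subscribe independently, so new services can be added without changing the deposit step.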
22. Pilot Alpha MVP
24. RDSS Overview
Diagram components:
» Preservation Systems
» Multi-tenant administration
» Discovery User Interfaces and Portals
» Tenant User Interfaces
» Jisc Reporting
» Tenant Storage
» Jisc Repository Core Infrastructure
» Metadata Store
» Publish/Subscribe Messaging Service
» Cloud Data Storage (Access and Archival)
» Tenant Repository, CRIS and research systems
» Scholarly Communications, Service APIs
» APIs (repeated throughout the diagram)
25. Reactions to specific features - preservation & interoperability
Preservation option was met positively and seen as a distinguishing feature
The preservation aspect of the service was met positively and could be seen as a potential distinguishing feature of a Jisc system against the competition. A Jisc system would also need to be fully interoperable with a number of key internal and external systems.
» Factor that could set the system apart from existing solutions and competition
» CRIS systems don’t offer this currently, so felt to offer a point of difference
Interoperability was seen as an essential feature. The following data links were mentioned as existing currently/required either through repository or linked CRIS:
» CRIS systems: Symplectic, Pure
» ORCID
» Pub Router
» Jisc Monitor
» Internal systems (HR, finance/grant management, data warehouse)
» Altmetrics
» Elsevier/Clarivate
» DataCite (for DOIs)
» OpenURL resolver
» Primo (library catalogue)
26. Demo
Pre-recorded short demo showing data being uploaded into a test Samvera repository instance, with automatic ingest into Preservica.
https://www.youtube.com/watch?v=d-l1ARNUwWA&t=13s
29. 3 standard service options
» End-to-end service
» Repository+ service
» Preservation+ service
(+ is for integrations and reporting)
Service to be launched in Q4 2018
All 3 options include:
› Financial benefits
› Standards
› Advisory
› Network membership
30. RDSS Priorities
» First priority is research data
› Research output (Article/Thesis etc.)
› Research data
› Research software/code
› Provenance metadata (method)
» But also…
› Preservation systems tailored for multiple digital objects and data types
› Use cases and pilots for objects beyond research data
https://creativecommons.org/licenses/by/2.0/
https://www.flickr.com/photos/cogdog/
31. Preservation Challenges
» Automated preservation workflow
› Lack of resources
› Preservation sausage machine
» Interactive preservation workflow
› Work with the researcher and their data
» Appropriate workflow
› What is an appropriate workflow?
32. Jisc national shared research platform
Information sources
» Publications Router
» Publishers
» Crossref
» ORCID
» DataCite
» PubMed
» Sherpa policy tools
University systems
» (Single Sign-On, Finance, HR..)
Information destinations
» Google etc.
» Discovery services
» Jisc CORE (global OA aggregation)
» Jisc Monitor (compliance checking)
» Jisc Collections
» Funders' systems
» OpenAIRE + for EU
Preservation services
Reports and dashboards
University X repository
Open Access publications
Research datasets
University Y repository
Open Access publications
Research datasets
University Z repository
Open Access publications
Research datasets
33. jisc.ac.uk
Except where otherwise noted, this work
is licensed under CC-BY-NC-ND
John Kaye
Jisc Digital Futures
John.kaye@jisc.ac.uk
Research organisations have primary responsibility for ensuring that researchers manage their data effectively. They need established infrastructure and processes to ensure:
Retained EPSRC-funded research data is preserved for a minimum of ten years
Effective data curation is provided throughout the full data lifecycle
Knowledge of publicly-funded research data holdings
Discoverability; recording of third party access requests
Notice and justification of access restrictions, for example ‘commercially confidential’
Awareness and use of relevant law, for example FOI
Awareness and compliance with research data policies
Adequate RDM resource allocation, for example from quality-related research (QR) funding or research grants
Developed through stakeholder consultation with Jisc, universities, publishers and BL
Valuable process – now need to implement
Taskforce established as part of Tickell advice to government to recommend what’s required to make this happen
Responsibilities are split across the institution: librarians, Research Office, IT, archivists (role going forward)
More effective Research Data Management must happen to comply with Funder Mandates, ensure data is not lost, and to realise a whole range of positive benefits
A shared service (provided by Jisc) seems to offer a number of benefits:
Cost savings and efficiencies
Common approaches and practice – do this together
Research system standardisation and interoperability (do it once rather than many times, and also address it across essential systems so we can key once and share)
Address market gaps
We worked with a lot of people!
How and why we’ve got to where we are
Pilots
Worked with 16 pilot institutions of various sizes; need to get various groups together (Library, RO, IT, Archives) where they exist.
Created a procurement framework
Pilots
17 institutions
Cross spectrum use cases
Large and small
Collaborative development
Multiple suppliers and solution vendors
Drivers
More than £5 million investment over 2 years
Open access
Sector defined requirements
“R@R” co-design
Over half the HEI sector involved
Overview of system
Where are we now
Multi tenant user interface
Workflows
First priority is research data
However, research data is not good enough on its own for research publication; the ideal is to store, or link to, the complete research package:
Research output (Article/Thesis etc.)
Research data
Research software/code
Provenance metadata (method)
Without this research is often not re-usable or repeatable.
But we are aware that we are providing preservation systems that are tailored to deal with all kinds of digital objects and data.
We are exploring use cases and pilots for objects beyond research data.
Automated preservation workflow:
Some institutions lack the resources (staff, skills, budget, time) to do a comprehensive job of preserving all their research datasets and will instead want a low-cost, fully automated, 'black box' approach to digital preservation of at least some of their data.
They want a 'preservation sausage machine' whereby research data is fed in at one end and out of the other come 'preservation packages' containing the research data in a form that is better described and structured for long-term usability.
Interactive preservation workflow:
The institution will want to work closely with both the research data and the researcher as part of an iterative process of quality control and digital
This is a more interactive and resource intensive process than 'automated' preservation, but can yield better results and may be more appropriate for specific types of research or institution.
The most appropriate workflow to use will depend on many factors, e.g. the experience an institution has with digital preservation, the resources at its disposal, the research discipline or type of data involved, the requirements of the research funder, the institution's policy and so on.