This document provides an overview of a webinar on digital curation and research data management for universities. The webinar covers an introduction to digital curation, the benefits and drivers for research data management, current initiatives in UK universities, and the role of libraries in supporting research data management. Libraries are increasingly involved in developing institutional policies, providing training, and advising researchers on writing data management plans and sharing data. The webinar highlights training opportunities for librarians to develop skills in research data management and digital curation.
RDM LIASA webinar
1. Digital curation: why managing and
sharing data matters to universities
Sarah Jones and Joy Davidson
Digital Curation Centre
sarah.jones@glasgow.ac.uk
joy.davidson@glasgow.ac.uk
a LIASA HELIG webinar, 30th April 2013, www.liasa.org.za/node/977
2. Digital Curation Centre
Jisc-funded consortium comprising units from the
– Universities of Bath (UKOLN)
– Edinburgh (DCC Centre)
– Glasgow (HATII)
Launched 1st March 2004 as a national centre for solving challenges
in digital curation that could not be tackled by any single institution or
discipline
3. Overview of session: four brief modules
1. Introduction to digital curation – how does research
data management fit into the curation lifecycle?
2. Benefits and drivers for research data management
3. Review of current research data management
activity in UK Universities
4. What role does the library have to play in research
data management?
4. Please feel free to ask questions at any
time!
• During the session you can ask questions.
Simply type these into the chat box.
• Questions will be gathered and speakers will
respond to selected questions at the end of
each module.
• There will be a chance for additional questions
at the end of the session.
6. An introduction to digital curation
• What is digital curation?
• What is the difference between
curation, preservation and data
management?
• What sort of activities are involved in
digital curation?
• Who should be involved in digital
curation?
7. “the active management and appraisal of data
over the lifecycle of scholarly and scientific interest”
Data have importance as the evidential base
of scholarly conclusions
Curation is part of good research practice
What is data curation?
8. Are data curation, preservation and
management different?
• Lots of different terms are being used - are
they the same or different?
• Essentially, they are all part of the curation
lifecycle
10. Key questions to consider:
• what data will be created?
• how much storage is needed?
• where will data be stored in the short and longer term?
• are there ethical issues that require consent?
Many funders expect data management & sharing plans at the
grant application stage!
Data Management Planning
11. Key questions to consider:
What information do users need to understand the data?
- descriptions of all variables / fields and their values
- code labels, classification schema, abbreviations list
- information about the project and data creators
- tips on usage e.g. exceptions, quirks, questionable results
How will this information be captured, and who will capture/record it?
Are there standards that need to be followed?
Metadata & documentation
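One way to capture the documentation listed on this slide is a small machine-readable data dictionary stored alongside the data. The sketch below is illustrative only: the class names, field names, example variables and the `missing_documentation` helper are all hypothetical and do not follow any particular metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Documentation for one variable (column/field) in a dataset."""
    name: str
    description: str
    unit: str = ""
    code_labels: dict = field(default_factory=dict)  # e.g. {1: "yes", 2: "no"}
    notes: str = ""  # usage tips: exceptions, quirks, questionable results


@dataclass
class DataDictionary:
    """Project-level context plus per-variable documentation."""
    project: str
    creators: list
    variables: list = field(default_factory=list)

    def missing_documentation(self):
        """Return the names of variables that have no description yet."""
        return [v.name for v in self.variables if not v.description.strip()]


# Hypothetical example values, for illustration only.
dictionary = DataDictionary(
    project="Example survey project",
    creators=["A. Researcher"],
    variables=[
        Variable("age", "Age of respondent at interview", unit="years"),
        Variable("q1", "Consent given to share data", code_labels={1: "yes", 2: "no"}),
        Variable("q2", ""),  # not yet documented
    ],
)

print(dictionary.missing_documentation())
```

A check like this could be run before deposit to flag undocumented variables; which fields are actually required depends on the standards the project has to follow.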
12. Key questions to consider:
• What data must be kept? (for validation, etc)
• What must not be kept? (e.g. personal data)
• Is it worth keeping the data? – cost/benefits
• Where will the data be kept?
Selecting what to keep
13. Storing data
Key questions to consider:
What amount of storage is available for the
active phase?
What facilities are needed in the active phase?
- remote access to work from home
- file sharing with others
- high-levels of security for sensitive data
How will the data be backed up?
Where will data be stored for the longer-term?
14. Institutional data repositories
Not intended to
replace
national, subject or
other established
data collections
Acknowledge hybrid
environment
http://datashare.is.ed.ac.uk
www.dspace.cam.ac.uk/
https://databank.ora.ox.ac.uk
Essex-RDR and
DataPool at Southampton
15. External data centres
Research funders’
data centres…
List of data centres:
http://databib.org
Structured databases
Disciplinary &
community
initiatives
16. Finding and reusing data
Key questions to consider:
How can researchers make their
data visible and citable?
17. Data catalogues
Develop a research data
extension to the CERIF standard
JISC & DCC planning
National coordination
http://cerif4datasets.wordpress.com
18. Who should be involved in curation?
Research
Organisations
Funders
Data centres
Advisory bodies
Support services
Researchers
Publishers
23. Expectations of public access
“Publicly funded research data are a public
good, produced in the public interest, which should
be made openly available with as few restrictions as
possible in a timely and responsible manner that
does not harm intellectual property.”
RCUK Common Principles on Data Policy
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
26. Benefits of data sharing (1)
www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0
“It was unbelievable. It’s not science
the way most of us have practiced in
our careers. But we all realised that
we would never get biomarkers
unless all of us parked our egos and
intellectual property noses outside
the door and agreed that all of our
data would be public
immediately.”
Dr John Trojanowski, University of Pennsylvania
... scientific breakthroughs
27. Benefits of data sharing (2)
www.guardian.co.uk/politics/2013/apr/18/uncovered-error-george-osborne-austerity
... validation of results
“It was a mistake in a spreadsheet that could have
been easily overlooked: a few rows left out of an
equation to average the values in a column.
The spreadsheet was used to draw the conclusion
of an influential 2010 economics paper: that public
debt of more than 90% of GDP slows down growth.
This conclusion was later cited by the International
Monetary Fund and the UK Treasury to justify
programmes of austerity that have arguably led to
riots, poverty and lost jobs.”
28. Benefits of data sharing (3)
“There is evidence that studies that make their
data available do indeed receive more citations
than similar studies that do not.”
Piwowar H. and Vision T.J. 2013 “Data reuse and the open data
citation advantage” https://peerj.com/preprints/1.pdf
9% - 30% increase
... more citations
29. Why YOU need a Data
Management Plan
http://blogs.ch.cam.ac.uk/pmr/2011/08/01/why-you-need-a-data-management-plan
Direct benefits to individuals
30. “Research organisations will ensure that effective
data curation is provided throughout the full data
lifecycle, with ‘data curation’ and ‘data lifecycle’ being
as defined by the Digital Curation Centre. The full
range of responsibilities associated with data curation
over the data lifecycle will be clearly allocated...”
www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
...institutional responsibility
31. Research funder data policies
www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
32. Ultimately funders expect:
• timely release of data
- once patents are filed or on (acceptance for) publication
• open data sharing
- minimal or no restrictions if possible
• preservation of data
- typically 5-10+ years if of long-term value
See the RCUK Common Principles on Data Policy:
www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
33. Jisc MRD programmes
Managing Research Data programmes funded by the Jisc:
• MRD 01: October 2009 – July 2011
– £4.3 million investment
– www.jisc.ac.uk/whatwedo/programmes/mrd.aspx
• MRD 02: October 2011 – July 2013
– £4.6 million investment
– www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx
Programme Manager: Simon Hodson s.hodson@jisc.ac.uk
Twitter: #jiscmrd
34. The DCC Mission
“Helping to build
capacity, capability and
skills in data management
and curation across the
UK’s higher education
research community”
Phase 3 Business Plan
www.dcc.ac.uk
35. DCC Institutional Engagements
With funding from HEFCE we’re:
• Working intensively with 21 HEIs to increase RDM capability
– 60 days of effort per HEI drawn from a mix of DCC staff
– Deploy DCC & external tools, new approaches & best practice
• Support varies based on what each institution wants/needs
• Lessons & examples will be shared with the community
www.dcc.ac.uk/community/institutional-engagements
37. Common DCC IE activities
• Establishing steering groups
• Making the case for RDM
• Assessing needs
• Developing policy and strategy
• Piloting tools
• Offering DMP consultations
• Delivering training
• Setting up guidance websites
• ...
42. Early research data policies
“Statement of commitment”
Infrastructure policy
“10 commandments”
mutual promises
aspirational
Baseline of RCUK Code
+ procedures & support
legal tone / language
a section in uni DM policy
useful guide as appendix
Based on Edin.
with a few
additions
43. RDM strategies and roadmaps
A series of blog posts
www.dcc.ac.uk/news
Links to example roadmaps
http://tiny.cc/EPSRCroadmaps
44. University of Bath RDM roadmap
• Based on Monash University RDM strategy
• Identifies the current position and proposes activity
• Defines roles and responsibilities and timeframes
http://www.bath.ac.uk/rdso/University-of-Bath-Roadmap-for-EPSRC.pdf
48. Data Management Planning support
• Guidelines / templates on what to include in plans
• Example answers, guidance and links to local support
• A library of successful DMPs to reuse
• Tailored consultancy services
• Online tools (e.g. customised DMPonline)
• Links / flags embedded in grant systems
• ...
49. Research data storage
Blue Peta at Bristol
1st 5TB free per Data Steward then
£400 per TB p.a. for disk storage;
tape backup £40 per TB
http://data.bris.ac.uk
• £2m funding to date
• Petascale facility – expandable
• 3 machine rooms – resilience
(tape archive 2012)
• Available to all researchers for
research data
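As a worked example of the charging model quoted on this slide, the sketch below estimates the annual disk charge for one Data Steward. It assumes that the 5TB free allowance is simply subtracted before charging, and it leaves out the tape backup charge because the slide does not say whether the £40 per TB is annual or one-off. The function name and parameter are hypothetical.

```python
FREE_ALLOWANCE_TB = 5       # first 5TB free per Data Steward (from the slide)
DISK_RATE_GBP_PER_TB = 400  # £400 per TB per annum for disk storage (from the slide)


def annual_disk_charge_gbp(stored_tb):
    """Estimate the yearly disk charge in GBP for one Data Steward.

    Assumption: only storage above the free allowance is charged.
    """
    if stored_tb < 0:
        raise ValueError("stored_tb must not be negative")
    chargeable_tb = max(0, stored_tb - FREE_ALLOWANCE_TB)
    return chargeable_tb * DISK_RATE_GBP_PER_TB


for tb in (3, 5, 12):
    print(f"{tb} TB stored -> £{annual_disk_charge_gbp(tb)} per year")
```

Under these assumptions, a steward holding 12TB would pay for 7TB of disk, i.e. £2,800 per year, while anything up to 5TB costs nothing.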
50. Institutional data repositories
Not intended to
replace
national, subject or
other established
data repositories
Acknowledge hybrid
environment
http://datashare.is.ed.ac.uk
www.dspace.cam.ac.uk
https://databank.ora.ox.ac.uk
Research Data at Essex and
DataPool at Southampton
51. Data catalogues
Develop a research data
extension to the CERIF standard
JISC & DCC planning
National coordination
http://cerif4datasets.wordpress.com
52. Bringing it all together into a service
Diagram courtesy of Sally Rumsey, University of Oxford
53. THE ROLE OF THE LIBRARY
– RE-SKILLING FOR DATA CURATION
54. How are libraries engaging in RDM?
Library
IT
Research
Office
The library is leading on most DCC institutional engagements
www.dcc.ac.uk/community/institutional-engagements
They are involved in:
defining the institutional strategy
developing RDM policy
delivering training courses
helping researchers to write DMPs
advising on data sharing and citation
setting up data repositories
...
55. Why should libraries support RDM?
• existing data and open access leadership roles
• often run publication repositories
• have good relationships with researchers
• proven liaison and negotiation skills
• knowledge of information management, metadata...
• highly relevant skill set
56. Possible Library RDM roles
• Leading on local (institutional) data policy
• Bringing data into undergraduate research-based learning
• Teaching data literacy to postgraduate students
• Developing researcher data awareness
• Providing advice, e.g. on writing DMPs or advice on RDM within a project
• Explaining the impact of sharing data, and how to cite data
• Signposting who in the Uni to consult in relation to a particular question
• Auditing to identify data sets for archiving or RDM needs
• Developing and managing access to data collections
• Documenting what datasets an institution has
• Developing local data curation capacity
• Promoting data reuse by making known what is available
RDMRose Lite
57. Training for librarians
• RDM for librarians, DCC
http://www.dcc.ac.uk/training/rdm-librarians
• RDMRose, University of Sheffield
http://rdmrose.group.shef.ac.uk
• Data Intelligence for librarians, 3TU, Netherlands
http://dataintelligence.3tu.nl/en/about-the-course
• DIY Training Kit for Librarians, University of Edinburgh
http://datalib.edina.ac.uk/mantra/libtraining.html
• SupportDM modules, University of East London
http://www.uel.ac.uk/trad/outputs/resources
58. RDM for Librarians
• 3 hour course by the DCC covering:
– Research data and RDM
– Data management planning
– Data sharing
– Skills
– RDM at [INSERT YOUR UNI]
• Slides and accompanying handbook
• Used UKDA guide as pre-reading
• http://www.dcc.ac.uk/training/rdm-librarians
59. RDMRose
• Taught and CPD learning materials in RDM tailored
for information professionals, by the Uni of Sheffield
• 8 sessions, each of which is a half day of study
• Strong emphasis on practical hands-on activities
• Also offer a short (2hr) course called RDMRose Lite
• http://rdmrose.group.shef.ac.uk
60. Data Intelligence for Librarians
• A course produced by 3TU, a consortium of technical
universities in the Netherlands
• Combination of online and face-to-face education
• Four meetings to learn and share knowledge
• Theory (on website) and assignments are conducted
between sessions
• http://dataintelligence.3tu.nl/en/home
61. DIY Training Kit for Librarians
• By EDINA and Data Library at University of Edinburgh
• Self-directed course, intended to be used by a group of
librarians to build confidence in supporting researchers
• MANTRA modules as pre-reading, short
presentation, reflective questions and exercises to guide
discussion
• Five face-to-face sessions
– Data Management Planning
– Organising and documenting data
– Data security and storage
– Ethics and copyright
– Data sharing
62. SupportDM
• By the TraD project at the University of East London
• SupportDM comprises five sessions
– About research data management
– Providing guidance and support for researchers
– Data Management Planning
– Selecting which data to keep
– Cataloguing and sharing data
• Each topic is introduced in a face-to-face session and
explored via exercises and discussion
• Learning is reinforced via an online tutorial and practical
exercises to do before the next session
• http://www.uel.ac.uk/trad/outputs/resource
63. Thanks – any questions?
DCC guidance, tools and case studies:
www.dcc.ac.uk/resources
Follow us on twitter:
@digitalcuration and #ukdcc
Editor's notes
Data storage occurs during both the active phase of research and longer-term preservation. If there isn’t a data repository in your institution, check whether there are any external subject-based repositories that might be a suitable home. BUT REMEMBER! If you are planning to deposit data in a repository, check the repository's policies on accepted formats before you begin. Make sure that any normalisation procedures will not affect the usability of the data.