Establishing a UQ Research Data Management Service
1. Establishing a UQ Research Data
Management Service
Dr Rebecca Deuble
Scholarly and Communication Services, UQ Library, The
University of Queensland, Brisbane, Australia
2. Overview
• Background
• Research integrity issues at UQ
• New infrastructure – UQ Research Data Manager (RDM) system
• Roll-out of RDM – training, education, policy and procedures
• Benefits associated with the RDM service
3. University of Queensland, Brisbane
• Research-intensive
• 6 faculties and 8 research institutes
• 50+ sites – hospitals, mines, farms
etc.
• 50,000+ students (4,600 HDR
students)
• Academic Staff (FTE) = 6,700
4. Images taken from: Todd, H., & Morgan, H. (2017, August). Why horror stories don’t lead to nightmares.
Paper presented at the IFLA World Library and Information Congress, Wroclaw, Poland.
7. Dropbox horror stories
College Professor: “I Lost Tons Of
Critical Files Because Of Dropbox”
(2013)
“Dropbox data breach: 68 million user account
details leaked” (2016)
9. Finding the data?
Vines, T.H. et al. (2014). The availability of research data declines rapidly with article age. Curr.
Biol, 24(1), http://dx.doi.org/10.1016/j.cub.2013.11.014
A report in Current Biology looked for the
data behind 516 ecology papers
published between 1991 and 2011.
“In theory, the data still exist, but the time and
effort required by the researcher to get them to you
is prohibitive.” (Timothy Vines, University of British
Columbia, Vancouver, 2014)
10. Responsibilities of institutions/researchers
R7: Provide facilities for the safe and secure storage and
management of research data, records, and primary
materials and, where possible and appropriate, allow access
and reference to these by interested parties
R8: Identify and comply with relevant legislation,
regulations and policies related to the conduct of research
R20: Retain clear, accurate and complete records of all
research including research data and primary materials and,
where possible and appropriate, allow access and reference
to these by interested parties. Where required, maintain the
confidentiality of research data, records and primary
materials
11. UQ identified the need to develop a central infrastructure that
would:
1. Satisfy the university’s requirements under the “Code”
2. Encompass data creation and curation, as well as short and long term usage (whole
of research lifecycle system)
3. Support existing broad policy statements, such as the “UQ Research Data
Management Policy”
12. UQ Research Data Manager
PUBLISH
RECORD
Metadata
registry
F.A.I.R. data [1]
Archival
workflows
Auto storage
allocation
IMPACT
ACCESS
[1] https://www.force11.org/group/fairgroup/fairprinciples
13. RECORD
Metadata
registry
Auto storage
allocation
ACCESS
1. Researcher completes a Data
Management Record (DMR) for their
project which includes the minimal
project metadata (e.g. project name,
collaborators).
2. Working data storage is auto-
allocated and made available to edit by
any authorized collaborator.
Accessible via a computer desktop
and/or via the cloud.
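A minimal sketch of the Record and Access steps above, in Python. Every name here (`DataManagementRecord`, `allocate_storage`, the `/rdm/projects/` path scheme) is a hypothetical illustration and not taken from the UQ RDM system; it only shows the idea that completing a record with minimal metadata triggers storage allocation, and that authorized collaborators can then edit.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DataManagementRecord:
    """Hypothetical minimal project metadata held in a DMR."""
    project_name: str
    lead: str
    collaborators: Set[str] = field(default_factory=set)
    storage_path: Optional[str] = None  # set once storage is allocated

    def can_edit(self, user: str) -> bool:
        # The project lead and any authorized collaborator may edit working data.
        return user == self.lead or user in self.collaborators


def allocate_storage(record: DataManagementRecord) -> str:
    """Assign a working-data location as soon as the record exists.

    The path scheme is an assumption made for illustration only.
    """
    slug = "-".join(record.project_name.lower().split())
    record.storage_path = f"/rdm/projects/{slug}"
    return record.storage_path
```

In this sketch the researcher never requests storage separately: creating the record is the request.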
14.
15. PUBLISH
F.A.I.R. data
Archival
workflows
IMPACT
3. Unmanaged data can be migrated
to managed datasets, which can then
be linked to publications via the
institutional repository (UQ eSpace).
4. The system will be supported by
appropriate policies and procedures
(in accordance with the FAIR data
principles – Findable, Accessible,
Interoperable, Reusable).
16. Purposeful roll-out of RDM system
• To fully realise the potential of RDM
system, will establish a UQ Research
Data Management Service over next
12 months
• Aim is to roll-out the “Record” and
“Access” components of system
(working data storage) to high risk
schools/institutes first
• Major emphasis on training and
education
17. Research Data Champions
• To ease adoption and promote best-practice RDM,
a number of “Champions” have been
identified
• Includes researchers who have been
actively involved in system pilot
• Will assist with outreach to own
groups/schools
18. RDM features to launch
• User documentation – explaining main workflows (e.g. how to request storage)
• User training – including liaison librarians (prior to launch)
• Project metadata harvesting
• Automated storage allocation – brokered by ITS in production system
• Access to project storage via “UQ mapped drive” for UQ users at all UQ sites
• Access to Cloud based interface and sync client (where appropriate)
• Persistent IDs for research projects (RAiD) and persistent IDs of users (ORCiD) will
be harvested where possible
19. Best-practice RDM standards
• Promote awareness through policy and supporting procedures
• Research data stored securely during research project and archived for minimum
required retention periods
• Data curation throughout whole lifecycle
• Published outputs will specify how to access any relevant supporting research data
• Metadata describing archived data will be published in institutional repository (UQ
eSpace)
• Data custodian “chain of command”
• Restricted data – published metadata will provide justification etc.
20. Benefits of a comprehensive RDM service
For research data:
Remains accurate, authentic, reliable, and complete
Retains integrity and research results may be replicated
Security is enhanced, minimising risk of data loss
Reuse is enabled by collecting critical metadata early
Available in accordance with FAIR data principles
Other benefits:
Funding, journal, regulatory body requirements are met
Research administration effort is kept to a minimum
21. Rebecca Deuble – Project Officer (RDM)
Email: r.deuble@library.uq.edu.au
Thank you for listening
Editor's notes
Today I’ll be talking about establishing a research data management service at UQ – including the key integrity issues that prompted this overhaul, the development of central infrastructure at UQ, and the supporting policy and procedures
A bit about myself: I am the project officer on the research data management project – responsible for developing and implementing this new infrastructure. I’ve been in this role since September last year. Prior to that I was a PhD researcher within Human Movement Science at UQ – I believe this gives me a good insight into the research integrity problems that currently exist, and what we can do to try and improve education about data management among researchers
- UQ
Talk about conversations had with researchers over past 12 months
Large percentage using Dropbox to share and store data – mostly those schools/institutes lacking infrastructure, using Dropbox – easy, convenient
Cloudstor also commonly used at UQ – recently terms and conditions changed, and no longer suitable for human data. Makes sharing human identifiable data difficult
Physical data storage loss examples – losing all PhD data in backpack (with computer, hard drive etc.)
Storing on cloud-based services means that the university has no oversight over the data, which in the worst-case scenario can mean that the data is never collected. Need a system that allows the university to easily trace back the working data associated with a publication – at the moment this process is very difficult for the research integrity office. Often don’t know where the data is located, or in the case of a PhD student – stories of them leaving, and then deleting the data off their laptop, so it can’t ever be found again.
- illustrated in report in Current Biology – looked at ecology papers published over a 20 year period… more difficulty identifying the data the further back they went
Recently added responsibilities for institutions and researchers in the code of conduct
Good RDM is integral to ensuring research integrity – in order to ensure RDM is “good” institutions need to provide appropriate infrastructure, education and RDM support.
Currently UQ researchers can share data with interested parties but are doing so via methods that are not secure – e.g. Dropbox. Human identifiable data – no safe way to share via cloud at the moment – Cloudstor not appropriate for human identifiable data.
Researchers are sharing data via Dropbox because it is easy – uncomplicated, want to do things quickly
Lack of infrastructure across university prompted development of this system (and lack of oversight).
Being backed by DVC-R (this project)
Go through each stage – and explain
Talk about different storage backend for human identifiable data/sensitive data versus normal research data.
Storage auto allocated will be based on type of data – identifiable human data will all be on ITS
- High risk schools/institutes – those that are lacking infrastructure, access control mechanisms. These schools will be approached first – many of these have been part of pilot.
- Data champions include those already practising good data management, and those keen to start! Will be key in helping spread word about importance of data management, and promoting the RDM system. Without the “champions” the roll out will not be possible
NextCloud web interface/sync client – functionality turned off for human identifiable data/sensitive data – requiring data to be stored just at UQ (locally)
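The storage notes above (different backends by data type, and cloud sync turned off for sensitive data) can be sketched roughly as follows. This is an illustration only: the `Sensitivity` categories, the `StoragePlan` fields and the backend labels are hypothetical names, not part of the actual RDM system.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    GENERAL = "general"
    HUMAN_IDENTIFIABLE = "human_identifiable"


@dataclass(frozen=True)
class StoragePlan:
    backend: str      # hypothetical backend label
    cloud_sync: bool  # whether the web interface/sync client is enabled


def plan_storage(sensitivity: Sensitivity) -> StoragePlan:
    """Choose a backend from the data type declared for the project."""
    if sensitivity is Sensitivity.HUMAN_IDENTIFIABLE:
        # Sensitive data stays on locally held storage, with sync disabled.
        return StoragePlan(backend="its-local", cloud_sync=False)
    return StoragePlan(backend="general", cloud_sync=True)
```

The point of the sketch is that the researcher only declares the data type; the allocation rule, not the researcher, decides where the data lives.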
- Will have an idea of the retention periods required from the project metadata that researchers fill in
Data custodian will not necessarily grant access directly to data but will define decision makers in absence of others, or for purpose of compliance
- Restricted data e.g. human identifiable data – conditions of consent for any data reuse given at time of data collection should be codified and stored with data
Admin effort is minimal once system and procedures/policy are put in place
Researcher can access system and control access by themselves. No need for ITS intervention