This document discusses building a business case and policy to support a 10-year research data management roadmap at an institution. It provides examples of roadmaps from other universities that were informed by funder policies. It also summarizes the development of a roadmap, business model, and policy at the University of Southampton, including engaging stakeholders, surveying researchers, and estimating storage needs and costs over 10 years. Roadmaps are useful for engaging senior leadership and researchers to understand current practices and align them with policy.
Building a business case and institutional policy on a 10Y research data management roadmap: can it work?
1. Building a business case and institutional policy on a 10Y research data management roadmap: can it work?
Steve Hitchcock and Wendy White, DataPool Project
JISC MRD Benefits and Evidence Workshop
Bristol, 29-30 November 2012
2. Roadmap – Business case – Policy?
Roadmap – Business case – Policy
Roadmap – Policy – Business case
Policy – Roadmap – Business case
This talk: Roadmap – Business case – (Policy)
3. Roadmap: short term (1-3 years)
Core components for this phase are:
• robust institutional policy framework
http://www.calendar.soton.ac.uk/sectionIV/research-data-management.html
• agreed scalable and sustainable business model for
storage
• working institutional data repository, piloted, and grown
over the short-to-medium term
• one-stop shop for data management advice and guidance
http://www.southampton.ac.uk/library/research/researchdata/
This and subsequent roadmap slides from
Brown, et al., Institutional data management blueprint, September 2011, p6-7
http://eprints.soton.ac.uk/196241/
4. DataPool
Building Capacity, Developing Skills, Supporting Researchers
[Poster: overview of the DataPool Project, October 2011 to March 2013, in three strands: Policy and guidance; Training; Data repository. Labels recoverable from the poster: SharePoint; Doctoral Training Centres; graduate and staff training services; progress; case studies (imaging, 3D, geodata, and others); University Strategic Research Groups; EPrints 3.3; EPrints data apps; 3-layer metadata; IDMB; surveys of data practices among academics; support for data management plans; capture/share with external sources, e.g. SWORD-ARM; large-scale data storage; assigning DataCite DOIs.]
JISCMRD Progress Workshop, Nottingham, 24-25 October 2012
Byatt, D. (D.R.Byatt@soton.ac.uk)
Hitchcock, S. (sh94r@ecs.soton.ac.uk)
White, W. (whw@soton.ac.uk)
http://datapool.soton.ac.uk/
5. Roadmap: medium term (3-6 years)
Core components for this phase are:
• extensible research information management framework
to respond to the variations in discipline needs
• comprehensive and affordable backup service
• effective data management repository model
• embedding data management training and support
6. Roadmap: longer term (6-10 years)
“aspirations will focus on providing significant benefits
realisation … mixed-mode of data management within
consortia or national framework”
Core components for this phase are:
• Coherent and flexible data management support
• Agile business plans for continual improvement
• Active participation in consortia and national framework
agreements
7. Cf Edinburgh Uni. RDM Roadmap
• Phase 0: August 2012 – January
2013: largely a planning phase, with
some pilot activity and early
deliverables.
• Phase 1: February – July 2013:
Initial rollout of primary services.
• Phase 2: August – January 2014:
Continued rollout; maturation of
services.
V1.0, November 2012
http://www.ed.ac.uk/polopoly_fs/1.101223!/fileManager/UoE-RDM-Roadmap201121102.pdf
8. IDMB Roadmap and DataPool
DataPool takes forward the first phase of the Roadmap.
Southampton has agreed a policy which is more specific and more legally
expressed than the more general policies represented by other institutions
There is no doubt that the requirement of the EPSRC, and by extension RCUK
policy, exercised significant influence on Senate’s decision to adopt the policy. It
has been emphasised throughout the consultation that the policy is intended to be
part of an iterative process, which assumes that policy and practice will evolve in
response to changes in the external environment and experience internally of
managing data more effectively.
From Brown, DataPool Update Report for 1st Steering Group Meeting, May 2012
9. Policies, Strategies, Roadmaps
Research360 (Bath University) roadmap: response to EPSRC’s
expectations is important
St Andrews: when the EPSRC roadmap work was completed, as with Bath, it
helped to demonstrate the relevance of RDM to diverse areas of institutional
activity.
Edinburgh: Next steps include attaching costs, both in terms of person-time
and financial, to the actions specified under their EPSRC roadmap
‘Institutional Policies, Strategies, Roadmaps’ session, JISC MRD and DCC IE workshop,
Nottingham
http://mrdevidence.jiscinvolve.org/wp/2012/11/12/institutional-policies-strategies-roadmaps-session-at-jisc-mrd-and-dcc-ie-workshop-nottingham/
10. data.bris business case development
After the success of our Stage 0 business case … Some of the
detail we’ll need to include in our Stage 1 business case is
known, some is almost known and some is still unknown
but all of it is currently in a jumble of mixed-up strata
http://data.blogs.ilrt.org/2012/10/29/data-bris-business-case-development/
11. IDMB business model
The business model is applied only to the IT services delivery of an
infrastructure (technology & people) to deliver and sustain an institutional
repository for the University’s digital assets.
The basic assumption will be that “the University wishes to provide a
secure and sustainable repository capable of hosting the
University’s entire digital assets”.
The University today offers ~200TB of secure storage for research data. This
service currently hosts ~120TB of research data.
The IDMB researcher survey suggests that the University centrally hosts some
10-15% of the research data of the surveyed researchers. If representative of
the University as a whole, we can estimate that the University digital research
data assets are of the order 0.8-1.2PB.
This and all subsequent IDMB slides from
Brown, White, Parchment, Institutional data management blueprint,
September 2011, Appendix B, p11-21 http://eprints.soton.ac.uk/196241/
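The 0.8-1.2PB estimate follows from dividing the hosted volume by the hosted share. A minimal sketch of that arithmetic, using only the figures quoted on the slide; the function name and the convention 1 PB = 1000 TB are assumptions made for illustration:

```python
# Figures quoted on the slide: ~120 TB hosted centrally, which the survey
# suggests is roughly 10-15% of the researchers' data.
HOSTED_TB = 120
SHARE_LOW, SHARE_HIGH = 0.10, 0.15


def estimate_total_tb(hosted_tb: float, hosted_share: float) -> float:
    """Scale the centrally hosted volume up to an institution-wide total."""
    if not 0 < hosted_share <= 1:
        raise ValueError("hosted_share must be in (0, 1]")
    return hosted_tb / hosted_share


# A larger hosted share implies a smaller total, so the bounds swap over.
lower_tb = estimate_total_tb(HOSTED_TB, SHARE_HIGH)
upper_tb = estimate_total_tb(HOSTED_TB, SHARE_LOW)

# Assumes 1 PB = 1000 TB for the conversion.
print(f"Estimated total: {lower_tb / 1000:.1f}-{upper_tb / 1000:.1f} PB")
```

The estimate is only as good as the assumption, stated on the slide, that the surveyed researchers are representative of the University as a whole.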
12. IDMB est. growth in research data
Estimated Growth in Research Data Assets    %CAGR*
“Content Depot” (High)                      121
“Content Depot” (Mid)                       76
“Unstructured Data” (Low)                   62
*compound annual growth rate
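To see what these rates imply, a compound annual growth rate r turns a starting volume V into V × (1 + r)^n after n years. The sketch below applies that standard formula to the three rates on the slide; it reports growth multipliers rather than volumes, since the starting volume used in the IDMB model is not given here, and the function name is illustrative only:

```python
# CAGR percentages as listed on the slide.
CAGR_PERCENT = {
    '"Content Depot" (High)': 121,
    '"Content Depot" (Mid)': 76,
    '"Unstructured Data" (Low)': 62,
}


def project(start: float, cagr_percent: float, years: int) -> float:
    """Compound `start` for `years` years at the given annual growth rate."""
    return start * (1 + cagr_percent / 100) ** years


# Growth multipliers relative to the starting volume (start = 1.0).
for label, rate in CAGR_PERCENT.items():
    five = project(1.0, rate, 5)
    ten = project(1.0, rate, 10)
    print(f"{label}: x{five:,.1f} after 5 years, x{ten:,.1f} after 10 years")
```

Even the low scenario implies roughly an eleven-fold increase over five years, which is why the choice of bound matters so much for the cost model.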
14. IDMB cost modelling
for Tape Based Repository (Scenario 1)
          2011/12     2012/13     2013/14       2014/15       2015/16
Totals    £677,563    £396,815    £899,307      £678,931      £757,448
for Disk Based Repository (Scenario 2)
          2011/12     2012/13     2013/14       2014/15       2015/16
Totals    £981,848    £580,152    £2,392,436    £1,051,539    £1,315,037
Overall the cost difference in the two scenarios over 5 years is estimated to
be £2.9M or £600K per annum in favour of a tape based model.
The per TB price reduces significantly over the 5 year period with the
exception of year 3 where the upcoming (year 4) end of life of the disk
requires a major procurement. Smoothing out this cost hike will be
necessary to provide a sustainable model.
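The headline difference can be checked directly from the annual totals above. A short sketch using only the figures on the slide; variable names are illustrative:

```python
YEARS = ["2011/12", "2012/13", "2013/14", "2014/15", "2015/16"]
TAPE = [677_563, 396_815, 899_307, 678_931, 757_448]        # Scenario 1
DISK = [981_848, 580_152, 2_392_436, 1_051_539, 1_315_037]  # Scenario 2

tape_total = sum(TAPE)
disk_total = sum(DISK)
difference = disk_total - tape_total
per_year = difference / len(YEARS)

# Year in which disk exceeds tape by the largest amount.
gaps = [disk - tape for disk, tape in zip(DISK, TAPE)]
peak_year = YEARS[gaps.index(max(gaps))]

print(f"Tape, 5-year total: £{tape_total:,}")
print(f"Disk, 5-year total: £{disk_total:,}")
print(f"Difference: £{difference:,} (about £{per_year:,.0f} per year)")
print(f"Largest annual gap: {peak_year}")
```

The totals give a five-year difference of about £2.9M, or roughly £580K per year, which the slide rounds to £600K. The largest gap falls in 2013/14, the year of the disk procurement.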
15. IDMB: how to pay for RDM
• A default allocation available to all researchers, with any requirements
in excess of the default provided as a chargeable service.
• A free-at-point-of-use service for the entire institution regardless of
need, which of course does not require any further analysis.
Assigning Costs
The basic requirement of the research data repository is to store the data
forever, which would suggest an operational model of cost recovery, i.e.
the storage is priced on a monthly or annual basis. However, research
projects overwhelmingly are funded capitally. This disconnect has been
and continues to be problematic, but is exacerbated by the need to store
research data for decades, when most research grants and contracts are
finished in 5 years or less.
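The first option, a default allocation with chargeable excess, can be expressed as a simple rule. This is a sketch only: the allocation size and the price per TB below are hypothetical placeholders, not figures from the IDMB report, and the function name is invented for illustration.

```python
def chargeable_cost(used_tb: float, default_tb: float,
                    price_per_tb_year: float) -> float:
    """Annual charge for storage used above a free default allocation."""
    excess_tb = max(0.0, used_tb - default_tb)
    return excess_tb * price_per_tb_year


# Hypothetical values, chosen only to show the shape of the rule.
DEFAULT_TB = 1.0
PRICE_PER_TB_YEAR = 100.0

for used in (0.5, 1.0, 4.0):
    cost = chargeable_cost(used, DEFAULT_TB, PRICE_PER_TB_YEAR)
    print(f"{used} TB used -> £{cost:,.2f} per year")
```

An annual charge of this kind is an operational cost, so it runs into the same disconnect described above: it has to be met long after a capitally funded project has finished.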
16. Summary
Roadmap particularly useful for:
• engaging VC, PVCs, Deans, Senior PIs
• knock-on impact of senior staff engaging with other networks, e.g.
UUK, RUGIT and members/leaders of funder boards/themes
• engaging Faculty champions and key innovators and enthusiasts - likely to be
"brokers" in knowledge management terms with overlapping communities.
This links to our work through our multi-disciplinary University Strategic
Research Groups (USRGs)
Roadmaps are tools for increasing the understanding of current research
activity - our IDMB Roadmap and Policy was informed by the cross-
disciplinary survey of staff, and you can map current activity back to
policy/roadmap
- Wendy White, correspondence
Editor's notes
In this talk we want to connect the development of roadmaps with the business case and policy for making progress with research data management at an institutional level. Taking the IDMB example with others, this seems like a logical sequence, but in practice this is not always the case. At Southampton we have a roadmap and an official institutional research data policy, but the business case is still to be approved. Other institutions appear to have begun with a policy. Here we will focus on the roadmap and business case rather than policy.
If the IDMB project elaborated the roadmap, DataPool represents progress along the first part (18 months) of the first phase (3 years) of the plan, and is beginning to fill in components of the map, as can be seen by the links in this slide.
For reference, this is a recent poster designed to show graphically the full scope of the DataPool Project. It shows the characteristic tripartite approach of this and comparable JISC institutional RDM projects: policy, training, and technical infrastructure (data repository and storage services).
This middle part of the Southampton RDM roadmap looks like it may have been the trickiest part of the map to elaborate. It’s not imminent and depends on outcomes from the first stage; on the other hand, it’s not so far away that we don’t need to be aware of it and making plans for it. As seen in this extract, it is essentially describing refinements of many of the expected developments from stage 1.
If looking ahead is trickier than framing immediate work, this final phase looking up to 10 years ahead might have been hardest to describe. It is, however, more aspirational in tone and less inclined to deal with specifics, which seems appropriate for a phase this far ahead.
A recent and interesting comparison with the Southampton RDM roadmap is that from Edinburgh University. Edinburgh has a target completion date of early 2014, a startlingly short roadmap compared with a 10Y example. The two are not directly comparable, of course. The Edinburgh case looks to be a well specified, well structured and comprehensive first phase and can be commended for that. Whether it is achievable within the time and resources specified we cannot judge yet. The illustration reproduced here is a helpful representation of the plan – at least, it is once you’ve read the plan.
This extract connects the first progress report of the DataPool Project, by then-PI Mark Brown, with the roadmap and policy. It makes the clear point that research funder requirements (EPSRC, RCUK) had an important influence on adoption of the policy at an executive level, even if some discussion at this JISC MRD Benefits Meeting was around whether supporting compliance with such requirements can usefully be presented to researchers as a ‘benefit’.
Other JISC MRD projects that have roadmaps have similarly emphasised the importance of EPSRC requirements on the production of the roadmap.
Now we move on to the second part of the talk, the business case. The data.bris project from Bristol University was presenting in the same session at this event, so we will spare the detail here, but this extract from a recent blog post by the project illustrates some of the imponderables, Donald Rumsfeld-style, of forming a business case for RDM.
We are heading towards the critical part of this presentation, the financial numbers. First some context. This case covers just the technical infrastructure – IT services – not the wider factors outlined by data.bris. This business model has been updated and presented at the University of Southampton and, as we have already indicated, is currently undergoing further revision with a view to official acceptance. The assumption stated here is not based on the university’s current research data policy, which requires a record of all data produced in the course of research at the institution rather than full data deposit. The university therefore can’t be said, so far, to have stopped short of accepting the business case for supporting the costs of the policy. The data on usage of storage services and projected usage are the basis for the financials that follow.
In the style of the financial services industry, given there are a number of uncertain factors to accommodate in projections of the growth of storage requirements, this chart attempts to draw upper and lower bounded curves to underpin the calculations.
This illustration also comes directly from the IDMB report. Allowing that the metadata should ideally attach to both active and archive layers, the cost factors introduced here are access bandwidth, latency and storage technology. The basic choices considered are between more expensive and faster access disk storage, and slower tape stores.
Now we get to the actual financial numbers resulting from this analysis. The number that stands out is Y3 in the disk-based scenario, which not only rises above £1M for the first time but approaches £2.4M. Subsequent annual costs shown here remain above £1M for this scenario. Costs for the slower tape-based scenario are always lower.
Having identified the numbers, the critical decision is how to pay for it. This was an important issue for the second DataPool Steering Group meeting recently. A full free-at-point-of-use service may be the simplest, if most expensive, option for the institution, but it has been strongly argued that RDM must be viewed as a direct cost of research, and funded accordingly. The dilemma for institutions is how much to invest in infrastructure directly, compared with leaving projects to raise additional costs for data management and risking research bids becoming less competitive than those from institutions with more generous direct support.
In summary, roadmaps are useful for focussing discussion on research data management at an institutional level, and for engaging other stakeholders across all disciplines. Given that a roadmap should be based on prior consultations with those stakeholders, it follows that further interaction with the roadmap should lead to further change and consultation. The roadmap must therefore be used as a living document. Southampton has not yet finalised its business case for supporting RDM, but it has established a process through engaging with the roadmap in the first instance.