Transcript of Webinar: Data management plans (DMPs) - audio

Webinar: Data Management Plans (DMP)
15 Feb 2017
Speakers:
Kathryn Unsworth (ANDS), Natasha Simons (ANDS), Nick Smale (University of
Melbourne)
START OF TRANSCRIPT
Kathryn Unsworth: Welcome everybody. We're going to, obviously, talk about data
management plans today. Well, I best introduce myself, first. Kathryn
Unsworth, data librarian at ANDS, out of Melbourne. The next slide is
the three circles that ANDS has, in relation to research data assets -
making research data assets more valuable for researchers and
research institutions in the nation. We do that through our trusted
partnerships with various communities related to research. Also, in
terms of our reliable services, such as Research Data Australia, the
Research Vocab service, as well. The DOI minting services and also
in terms of enhancing capabilities, so building capability within the
research space around data management.
This webinar is part of that building data management capability for
our Australian institutions and, obviously, research data management
plans are a key element in this. We've got three presenters today. Oh,
well - yes, we do. Myself, in Melbourne, Natasha in Brisbane and
we've also got Nick Smale from the University of Melbourne, who is
going to re-do his talk that he did at the eResearch DMP BoF. Now,
I'm not able to show the slide of the DMP ANDS webpage. On that
page, there are a number of tiles with various topics that fall under the
topic - within data management plans and you can click out to those
and get a lot more information on that there.
Moving on, obviously, today's topic is about research data
management and I've spoken initially about how we've organised it
previously. But due to the numbers - which is quite exciting that
there's so many people really interested in this topic - we've decided
to break it up into three parts. We'll have talks - one from Natasha,
giving us an overview of DMPs and an intro to second generation
DMPs, DMP Birds of a Feather recap that we did at eResearch
Australasia. Nick will slot into that particular talk, as well, and he'll talk
about DMPs at the University of Melbourne and also highlight some
case studies that we've put together - or use cases, actually, not case
studies.
Then it will be open mike time and that's when we'll be expecting all of
you guys to come in and provide some comment and talk about the
issues that you are having in your own institutions. What you're doing
in your institutions, in terms of DMPs and the challenges, and any
exciting news around that space as well, would be really welcome.
Then, from there, we'll do - we want to talk about the possibility and
the interest of actually initiating a DMP community of practice. We'll
get to that at the end of our talk. But, again, just a reminder, tweet
#andsdata, questions to put in the question pod. I'll just throw over to
Natasha for her talk.
Natasha Simons: Okay. Yeah, so we are really delighted to have a lot of people
attending this webinar and I just wanted to say a special welcome to
the people attending from ALIA Information Online in Sydney and a
special thanks to Liz Stokes at UTS library and also to the ALIA
executive for helping to make that happen. They've got a special room
there, where they're tuning in, so that's really exciting. But I think when
Kathryn and I looked at the registration list we realised that there's a
really large variety of backgrounds represented in the people
attending this webinar, so I'm just going to do a very short overview of
DMPs and DMP tools and then look at some of the characteristics of
DMP, version 2.0.
What's a research data management plan? Well it's a formal
document that describes how data will be collected, organised,
described, shared and preserved through the course of a research
project and beyond. Data management plans are structured to provide
needed information about the kinds of data collected, the formats,
descriptions, how long the data will be retained, in what manner the
data will be disseminated and how data will be preserved over the
long term.
If you want to learn more about data management plans, I've put in
the website - sorry, the link to the ANDS website, which includes a
guide on data management plans. Also, if you haven't already, you
can actually undertake more of a look at data management planning
tools and so forth through Thing 15 of the ANDS 23 (research data)
Things program, and I've put the link in there. Thing 15 actually
sparked quite a lot of interesting reflections and discussions, both in
person and on the online meetup boards and I'm hoping that some of
the people who contributed to that discussion will share their thoughts
at this webinar today.
Why do we have data management plans or why do we need them?
There's a carrot. The carrot is that well organised and structured data -
and that's what you have to do when you write a data management
plan - is easier to access, analyse, store securely, describe fully and
share publicly at the end of a project or even during a project. The
stick is that data management planning is actually required by the
Australian Code for the Responsible Conduct of Research. Some
funders, particularly international funders such as the National
Science Foundation, mandate the completion of data management
plans.
There are also institutional reasons for data management plans. One
of them is so that institutions can keep a registry of who at their
institution has got funds for collecting data and, therefore, that will
help them if people fill out a data management plan on how they can
actually plan their resources at their institution to match the needs, as
reflected in the DMPs. Also, to reduce risk associated with
unorganised data collection. So, basically, if someone, a researcher at
an institution, is asked to verify the results of their findings, they need
to be able to produce the data. Having a data management plan does
help researchers to think about that process and to plan for that
eventuality.
There are also - institutions are thinking of ways to have added
incentives to researchers for filling out data management plans. The
University of Colorado Boulder had a DMP competition in 2015 and
they put up the winners on the website there, and they've actually got
a variety of disciplines represented in the winners of that DMP
competition, so it's worth having a look at that. Also, at Curtin
University, data management plans are mandated for researchers, if
you're a HDR student, if you require human or animal research
ethics approval and if you want access to data storage at Curtin.
There's a range of DMP tools available, but probably DMPonline by
the Digital Curation Centre in the UK is probably the most popular and
the most used by institutions worldwide. But there are others and I am
not attempting to make any sort of list here, but I'm just mentioning
QCIF - which is the Queensland Cyber Infrastructure Foundation - has
a platform called ReDBox, which is used by a number of Australian
universities and includes a DMP module. At International Data Week
in the USA in September last year there were some interesting
discussions about moving to the next generation of DMP tools, just
nicknamed DMP, version 2.0.
This picture shows an afternoon tea that was put on at one of the IDW
events. Basically, you take your apples and you dip them in the
peanut butter and it's surprisingly delicious. By the way, this is the only
time you will see apples and peanut butter on an ANDS slide.
[Laughs] But, for me, there's an analogy here to make and that is that
apples represent the first version of data management planning tools
and when you dip them in the peanut butter, you get version 2.0.
Explaining in a little bit more detail, the apples or first version of DMP
tools are, basically, just a PDF or Word document. Something that's -
they are not connected to any other system at the university, they're
sort of just stand alone, fill out this form, type things.
You complete them at the start of a research project, and then that's it,
you walk away and you've done your DMP now. The outcome is not
measured, so we don't know if a researcher did what they said they
were going to do in their data management plan. The DMPs are not
machine readable, mainly because they're just in that PDF or Word
document. They're also private, so it's only researchers and the
institutions who can actually see the data and the DMPs, they're not
shared. There are some questions around the effectiveness of that.
Do they just prove that researchers can fill out a form or do they prove
that researchers are actually thinking about what to do with their data?
Is it a way of prompting them to consider things that they wouldn't
have considered if they didn't fill out the data management plan?
There's no follow up, again, on whether you did what you said you
would do. Okay, so you get the apples and you add the peanut butter
and you get DMP, version 2.0. The idea of this - there's some work
being done for a project called EAGER - E-A-G-E-R - which is led by
Victoria Stodden and funded by the National Science Foundation. I've
put a link to her talk at the bottom of the slide, there.
(http://web.stanford.edu/~vcs/talks/RDA-DMP-2016-STODDEN.pdf)
In that, she is looking at the next generation of data management
planning tools. Some of the characteristics of the 2.0 versions are that
they are public documents. There's actually some debate around
whether DMPs should be public or not and it's sort of, well, they should
be public - the arguments for being public are so that people are
accountable with what they say they're going to do. The arguments
against are more along the lines of, well then people can simply copy
one of the public ones and make that their own.
DMPs, version 2.0 are also something that's measurable. Did you do
what you said you were going to do in your data management plan?
They're ones which are connected to at least one system, plans which
are also machine readable, and the richness in that is that you can
mine information from them. Institutions will be able to get some
information by using the machine readable access to find out what
their researchers are going to do with their data and, therefore, put
resources into supporting that end, basically.
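As a minimal sketch of what "machine readable" could mean in practice, assume each plan is stored as a small JSON record. The field names used here (storage_location, expected_gb, will_share) are hypothetical and are not taken from the EAGER project or from any DMP tool mentioned in this webinar. The sketch only shows how an institution might mine such records for planning information.

```python
import json
from collections import Counter

# Hypothetical machine-readable DMPs; every field name here is an assumption
# made for illustration, not a published DMP schema.
plans_json = """
[
  {"project": "A", "storage_location": "institutional", "expected_gb": 500, "will_share": true},
  {"project": "B", "storage_location": "cloud", "expected_gb": 50, "will_share": false},
  {"project": "C", "storage_location": "institutional", "expected_gb": 1200, "will_share": true}
]
"""


def summarise(plans):
    """Total the planned storage per location and count plans that intend to share data."""
    storage = Counter()
    for plan in plans:
        storage[plan["storage_location"]] += plan["expected_gb"]
    sharing = sum(1 for plan in plans if plan["will_share"])
    return dict(storage), sharing


plans = json.loads(plans_json)
storage_by_location, sharing_count = summarise(plans)
print(storage_by_location)
print(sharing_count)
```

A PDF or Word plan would need a person to read it before any of these totals could be produced, which is the contrast being drawn with the first version of DMP tools.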
Also, that the data that is described in the DMPs is consistent with the
FAIR principles - findable, accessible, interoperable and reusable.
Also, the concept that DMPs are a flexible, living document. You don't
just create them once, at the start. You're going to, through the course
of your research project, actually rethink some of the things that you
thought at the start and therefore you go back to the data
management planning tool and say, oh, I've decided to store my data
here and not there.
This idea of machine readable DMPs - and the EAGER project was
actually something raised by Chris Erdmann from North Carolina State
University at the eResearch Australasia Birds of a Feather session.
I'm going to hand over, now, to Kathryn, who's going to talk to you a
bit more about that.
Kathryn Unsworth: This is what they called the BoF. DMPs aligning use to motivations
and intended outcomes. Part of the abstract was to look at what the
mechanisms for researchers to state their intentions on how they
would manage their data across the lifecycle were. We looked at - we
were hoping to have a look at the agents and motivations and how
they are different. There were a number of use cases that we came up
with, to examine and interrogate, which Nick will talk about a little later
on. But we were looking at the multiple agents of funding bodies, to
encourage data sharing.
The main thing here is to look at the questions here, in terms of; why
implement a DMP tool? Does DMP use align with an agent's
motivations? Also, more importantly, with intended outcomes, what
are the expected outcomes? And enterprise level DMP tools, one
size fits all, what is their place in the landscape? And is best practice
for researchers an aim or a hoped for by-product? The first speaker
that we had up was - as Natasha mentioned - Chris Erdmann who is
the chief strategist for research collaboration at North Carolina State
University. He first of all talked about the services at North Carolina.
There's an article about this DMP service, written by Chris and David,
around what they're doing at North Carolina State University.
At the moment - it's probably a really useful article to read, but they're
offering a DMP review service. They actually help researchers
review their DMPs. He also went on to talk about the future around
machine readable DMPs, which Natasha has already talked about,
and the EAGER project, which is basically - this is really allowing
funders to identify trends in data and software submission, repository
use patterns and carry out other analysis that assists in understanding
community use patterns and needs. That's also, if you take it from an
institutional perspective, something that's quite interesting for
institutions to have that kind of information, too.
He also spoke - as Natasha has - about actually publishing
DMPs, so that they are more transparent and accountable. He gives
an example here of the DMP for a more investigatory data driven
discovery grant. Also, part of his talk was about access plans - public
access plans. Not opposed to DMP plans, but just a different
approach to how we would accumulate the sort of information that we
need from what researchers are doing within their projects and the
creation of data.
Our second speaker was Sue Cook from CSIRO. I've just put up her
goals slide: helping the research group to reach, document and
communicate data management decisions. Obviously, whenever
we're talking to researchers, we talk about it being a live document.
We're also very interested in the interoperability between systems, so
being able to push metadata from existing - well, pull metadata from
existing systems and then push that metadata to other systems, as
well.
Sue talked about guided questions, which is basically scaffolding the
process of filling out a research data management plan for
researchers. It's providing them with some guidance, as they go. Also,
minimum mandatory questions and also conditional questions. Where,
if you answer this question then you need to answer the next five
questions or you don't have to answer the next five questions. She
also spoke about researcher driven. Their engagement with
researchers was quite strong in the work that they're doing with
implementing it and - well developing and implementing their DMPs.
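The guided, mandatory and conditional questions described above can be sketched roughly as follows. This is an illustration only: the question ids, the "mandatory" flag and the "show_if" rule are hypothetical names and do not describe how the CSIRO tool is actually built.

```python
# Hypothetical question set for a DMP form. "show_if" names a question id and
# the answer that makes the dependent question appear.
QUESTIONS = [
    {"id": "title", "mandatory": True},
    {"id": "human_data", "mandatory": True},
    {"id": "consent_process", "mandatory": True, "show_if": ("human_data", "yes")},
    {"id": "notes", "mandatory": False},
]


def visible_questions(answers):
    """Return the ids of questions that apply, given the answers so far."""
    visible = []
    for question in QUESTIONS:
        condition = question.get("show_if")
        if condition is None or answers.get(condition[0]) == condition[1]:
            visible.append(question["id"])
    return visible


def missing_mandatory(answers):
    """Return the ids of visible mandatory questions that are still unanswered."""
    shown = set(visible_questions(answers))
    return [
        q["id"]
        for q in QUESTIONS
        if q["id"] in shown and q["mandatory"] and not answers.get(q["id"])
    ]


print(visible_questions({"human_data": "no"}))
print(missing_mandatory({"title": "Reef survey", "human_data": "yes"}))
```

Answering "no" to the human data question hides the consent question entirely, so a researcher only sees the follow-up questions that apply to their project.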
She was talking about future aspirations. They're not there yet with the
full integration into organisation project proposal and planning
systems. Also, about metadata cascade, which is a term that came up
at UQ, evidently, through all data management ecosystems, so
metadata being reused for the data repository, metadata reused for
storage provisioning requests and so on. Also, she spoke about
machine-actionable, which is obviously a pretty hot topic around
DMPs, and persistent URLs.
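One way to picture the metadata cascade is a single DMP record whose fields are reused to pre-fill other systems. The sketch below assumes hypothetical field names and two hypothetical downstream records (a repository entry and a storage request). It is not the interface of any system named in this webinar.

```python
# A DMP record entered once by the researcher. All field names are assumed
# for illustration.
dmp = {
    "title": "Coastal erosion survey",
    "lead_researcher": "A. Researcher",
    "description": "Drone imagery of dune systems",
    "expected_gb": 250,
}


def repository_record(plan):
    """Reuse DMP metadata to pre-fill a data repository entry."""
    return {
        "title": plan["title"],
        "creator": plan["lead_researcher"],
        "description": plan["description"],
    }


def storage_request(plan):
    """Reuse the same DMP metadata to pre-fill a storage provisioning request."""
    return {
        "project": plan["title"],
        "owner": plan["lead_researcher"],
        "quota_gb": plan["expected_gb"],
    }


print(repository_record(dmp))
print(storage_request(dmp))
```

The point of the cascade is that the title and researcher are typed once and then flow through, rather than being re-entered in each system.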
Then we had Libby Blanchard from Central Queensland University
who - her - the essential tenet, I think, of her talk was around the
working party and the fact that the working party had representatives
from the library, from IT, from the research office, eResearch and also
risk management and ethics, as well. It was quite a broad working
party. They're, basically, looking through all of the issues around - in
terms of implementation. Which tool do they choose, to start with, and
how they actually, then, link that to policy and procedure.
They have a policy in place at the moment that actually mandates the
completion of DMPs and then - in terms of the policy and procedure,
socialising that across the university - the complexity that that involves
and the work that that involves, as well. Then, looking at the actual
way they would present the DMP in terms of user experience and all
of that sort of stuff. Then, of course, the big ticket item is the systems
integration, which is still a way off for them, obviously. They're very
much at the beginning of this process.
Now, I'll pass over to Nick to talk about what's happening at the
University of Melbourne.
Nick Smale: Fantastic. Great to be here. I was just going to talk a little bit about the
University of Melbourne DMP and the process we went through in
making the new DMP. I should just start off by saying that [Peter
Niche] is really leading this effort at the university, but I'm just here
putting my own views forward. The University of Melbourne developed
a DMP in 2011 and, briefly, it contained two forms - two separate
forms that researchers had to look at, about 90 separate questions
that researchers had to fill in.
The DMP template alone had 3,500 words in it that you had to read
and there was a 12,000 word, 40 page guidance document called
Procedures and Guidelines for the Management of Research Data
and Records that you were supposed to read to complete this
document. It was also, according to policy, mandatory, although
there's very little evidence of any researchers actually doing it of their
own free will. It also had no definite, stated purpose. Just vague
words, data management. Nothing, really, very - all that specific.
I think of this - and it's a word that's been used a little bit - as being a
monster DMP. It's just huge and it made no inroads into the research
community at all. It wasn't actually used. Why is it there? Why do we
have - why are we spending bandwidth on it? In 2016 there was the
idea, let's make a new DMP. I'm not going to tell you too much about
that new DMP, it's still sort of in development, but I'm going to tell you
two things. Firstly, all of those numbers in the left hand columns -
they're much smaller in the right hand columns, now. We're certainly
not asking researchers to complete 90 separate questions or read two
separate forms.
The other thing I'm going to say is that, when we first started working
on this, we really thought about, what are the reasons why you'd want
a DMP? What is the purpose of this DMP? What are the difference -
why - and all of these different reasons why you might want
researchers to do DMPs should theoretically produce DMP templates
that actually look quite different. We thought, well we want to make a
good DMP template. What is a good DMP template versus a bad DMP
template? But there's just - very little research has gone into this. No
one has really said, this is what a good DMP template should look
like, this is what a bad DMP template looks like. Don't do that.
In fact, the problem is a little bit worse than that, and I'll put it this way,
and I made the same offer at eResearch - some of you might
remember. I'll give $50 to anyone who can show me any non-
anecdotal and systematic evidence that DMPs have any benefits for
anybody. That's a pretty - I mean, I think someone - there must be
some evidence out there, somewhere, but I haven't been able to find
it, I know Kathryn has not been able to find it. If anyone has the
evidence, please come forward and I'll happily give you $50. It's a one
day only offer though, so don't go out and get [on R] and start doing all
sorts of stats right now, because that doesn't [count].
There are many different reasons why you might want to have a DMP.
I guess we really drilled down and we thought, what's the reason why
the University of Melbourne wants to have a DMP for researchers? I
guess the reason we came up with is that we want to help them with
their own project management, to do a good job. There are also some
secondary benefits around using it as a - collecting that data and
using it to help plan out how much space we need to procure for our
systems and all of that sort of thing. There are other benefits, but the
real, main driver is that we want to benefit the individual researchers
who are doing it.
Kathryn and I have thought through, what are the different use cases
of why we would want to have DMPs mandated? Then, secondly, how
do you - and you should - how do you measure the outcomes of
whether those use cases are actually working for you? We've sort of
got four together, here, and you might want to add your own or help
us or refine these. But the first one - I'll just, briefly, go through these -
is that - we think that one of the reasons is that funding bodies, in
particular, really want researchers to complete DMPs because they
think that that will encourage researchers to share their managed -
share their data and that increases the return on that public
investment in that research data.
If that's the case then we should be measuring that. We should be
saying, researchers who do DMPs are sharing more data. Someone
should have done that analysis and, as far as I can tell, no one has
really done that. Maybe one person has and they really found that,
actually, researchers aren't more likely to share their data, and that
was a US study that was quite small. Another one is, institutions might
require researchers to complete DMPs to create changes in research
behaviour and culture and use it as an educative tool. The measure
there would be, researchers who do DMPs are more efficient and
productive and produce more papers in a set period of time. That
should be a pretty simple analysis to do, it still hasn't been done all
that much.
Another one is, institutions require researchers to complete DMPs to,
basically, use it as a business intelligence tool. Use it to plan out the -
create acquisition of data and other resources and to look into what
data sharing platforms should be invested in. The measurable
outcome there would be actual - the use of that information in decision
making by the institution. I know there are a few institutions that have
started to do that and it would be really great to see how that's going.
The final major use case is that - and this is, perhaps, the original use
case, that DMPs were invented for in the 1970s, and that's
researchers using DMPs as part of their routine project management
design and planning.
It's researchers going out, creating a DMP and using that to share with
fellow researchers and share with others, to help them understand
what everyone's roles and responsibilities in collection and
management of data are. That would really be, projects that use those
DMPs would be more efficient and better capitalised. I think that,
whenever talking about DMPs and the DMP with the apple and the
peanut butter together, I think what's really important to [unclear] [to
add there], in my opinion, is to really think about why? Why do we
want to make DMPs, perhaps, mandatory? Or why do we want
researchers to use them? Really think about, what should that DMP
look like, depending on what that use case is. That's all I have to say.
Back to you, Kathryn.
Kathryn Unsworth: Thanks, Nick. During the BoF we did a live poll, as well, and asked a
number of questions of our audience. It was probably only a small
sample, really, in the end, if we really thought about it. But the first
question was, in the Australian context, what do you see as the main
motivations for institutions implementing DMPs? So, we asked people
to rank those and the first one, not surprising - and I think, if you bear
in mind the sorts of people that were in the audience, they'd probably
mostly be librarians, data managers, eResearch folk and not too many
researchers - that the funders and institutions demonstrating to
government return on investment through requiring best practice in
data management, was going to be the top of the list.
Then, in second, came the institutions can capture information about
the generation of research data, so the business intelligence tool use
case. Then, of course, coming in third, not too far behind, the
institutions capturing information - it was the educative tool. Funders
and institutions wanting researchers' behaviours to align better with
best practice and using DMPs in that way. Then, of course, the fourth
one was, basically, recognising the benefit - researchers, themselves,
recognising the benefit and utilising DMPs as just a routine part of
project management.
Natasha Simons: Kathryn, I just have a question related to this: how many participants
in that survey?
Kathryn Unsworth: I think it was around, about 28, but I'm not - some of the actual
questions - not everyone answered each of the polls. But it was
around, about that size sample, so not a lot. The next question was,
are we seeing change [unclear] behaviours in researchers as a result
of DMPs? This was kind of a good one, I think, because 11 per cent of
that sample said yes, 17 per cent, no. But as Nick was saying, we just
don't have any evidence to support whether DMPs are, in fact,
translating into changed behaviours by researchers. So, 72 per cent
said, basically, not sure. I think, really, if we're going to get serious
about DMPs and the benefits that they have for researchers in terms
of efficiency and that translation into best practice then we really need
to do some research in this area and find out just what's happening.
Then the next question was, should Australian funders follow the lead
of international agencies and mandate a requirement for DMPs? I was
so disappointed with the result of this poll, I have to say. [Laughs]
Because 82 per cent said yes to compliance and mandating DMPs by
funders and institutions. I actually, from a personal perspective,
believe that compliance actually changes a person's mindset. With
researchers, they will then just do the barest minimum that they have
to, because that's what they have to do, rather than looking at it in
terms of a benefit to themselves and their own workflows and
practices.
I was - again, but you need to bear in mind that the audience here are
basically from that administration point of view, so it would make it
easier for them, as administrators, if funders did follow the lead of
international agencies. Then the final question we didn't get to actually
ask, but, would you be interested in joining a local DMP interest group
that could feed into and connect with international initiatives? We did
ask it, verbally, but we didn't get a chance to actually poll people,
because we ran out of time. A few people came up and said that they
would be interested in joining such a group. That's one of the
questions that we're going to have for you guys a little later on, so just
bear that in mind.
I will wind up. Thank you, Natasha, for moving me along. Thanks
everyone.
END OF TRANSCRIPT