2. Outline
• Definitions
• What We Mean By “Data”
• Wiley Researcher Data Insights Survey
• Methodology
• Data Sharing Behavior
• Data Sharing by Field
• Data Sharing by Geography
• The Role of Publishers
• The Future
4. What We Mean By Data
Research data is data that is collected, observed, or created, for purposes of
analysis to produce original research results.
• Text or Word documents, spreadsheets
• Laboratory notebooks, field notebooks, diaries
• Questionnaires, transcripts, codebooks
• Audiotapes, videotapes
• Photographs, films
• Test responses
• Slides, artifacts, specimens, samples
• Collection of digital objects acquired and generated during the process of
research
• Data files
• Database contents including video, audio, text, images
• Models, algorithms, scripts
• Contents of an application such as input, output, log files for analysis software,
simulation software, schemas
• Methodologies and workflows
• Standard operating procedures and protocols
Source: Boston University Libraries
7. Wiley Researcher Data Insights Survey
Our objective was to establish a baseline view of data
sharing practices, attitudes, and motivations globally,
with participation from researchers in every scholarly field.
In March 2014, more than 90,000 researchers around the
world were invited to participate in Wiley’s Researcher
Data Insights Survey. Participants were researchers who
had published at least one journal article in the past year
with any publisher.
We received an overwhelming 2,886 responses from
around the world.
8. Wiley Researcher Data Insights Survey
Key Findings
• Most researchers are sharing their data.
• Those not sharing have a variety of reasons.
• Data that’s being shared typically is <10 GB.
• The most common type of data that is being shared is flat,
tabular data (.csv, .txt, .xl)
• Data is usually saved on hard drives.
9. Wiley Researcher Data Insights Survey
Why Researchers Do Not Share
• Intellectual property or confidentiality issues (59%)
• Concerned research might be “scooped” (39%)
• Concerns about misinterpretation or misuse (32%)
• Concerns about attribution/citation credit (31%)
• Ethical concerns (24%)
• Insufficient time/resources (19%)
• Funder/institution does not require sharing (13%)
• Lack of funding (13%)
• Not sure where to share (5%)
• Not sure how to share (3%)
10. Wiley Researcher Data Insights Survey
Why do you share your data?
• Sharing is standard practice within their research communities (59%)
• Sharing increases the impact and visibility of their research (56%)
• Sharing benefits the public (51%)
Have made data publicly available: 52%
Have not made data publicly available: 48%
While only 52% have made their data publicly available, 66% of researchers share their data.
11. File types
File type                                                       # of Respondents   % Total
Tabular (flat) data (CSV, spreadsheet, txt)                     1,767              83%
Images 2-D                                                      807                38%
Executable code/Models                                          460                22%
Interview transcripts (or data generated from interview scripts) 298               14%
Relational Databases (SQL, Oracle, Access, etc.)                254                12%
Images 3-D                                                      254                12%
Video/Audio                                                     228                11%
Other                                                           227                11%
Data Produced by Research File Sizes
<1 GB            606
1-10 GB          648
11-20 GB         206
21-50 GB         141
51-100 GB        127
101-500 GB       89
501 GB - 1 TB    96
2 - 50 TB        72
>100 TBs         10
I don't know     65
12. Where do you store your tabular data once a project is complete?
• Computer hard drive (24%)
• External hard drive (22%)
• Shared/networked drive (11.5%)
• USB/flash drive (10.5%)
• Web service, e.g. Dropbox (9%)
• Non-digital lab notebooks (8.5%)
• Institutional repository (6%)
• Email (6%)
• General purpose repository (1.5%)
• Other (1%)
13. Regional Differences in Data Sharing
(% Overall that Share Data)
Germany: 55%
Japan: 44%
Australia: 41%
US: 46%
UK: 43%
• US researchers are highly likely to be sharing data as supplementary material in journals. The majority of US researchers say sharing data is standard practice within their communities.
• UK and Australian researchers are more comfortable sharing data at conferences rather than on publicly and permanently accessible platforms. Their primary motivation for sharing data is to increase the impact and visibility of their work.
• Australian and German researchers are more driven than their global counterparts to share their data to ensure preservation as well as to allow for transparency and reuse.
• Japanese researchers are significantly more likely to be using discipline-specific repositories (44% compared with 26% for the full pool).
14. Data Sharing in Developing Markets
India: 65%
China: 36%
Brazil: 52%
Data sharing practices vary across
developing markets, aligning largely
with the presence (or lack thereof) of
funder mandates.
• Chinese researchers share their data
when they are required to (by journals or
funders) but are less likely overall to
share their work because they don’t
believe it is their personal responsibility.
• Researchers in India are significantly
more likely to utilize institutional (46%)
and discipline-specific (41%)
repositories compared to the global pool
of respondents.
15. Life Sciences
[Chart: “I've shared my data publicly” vs. “I haven't shared my data publicly”]
The majority of Life Science researchers that share data are doing so as supplementary material
in a journal. Four in ten are utilizing institutional data repositories while 29% are sharing via
personal/institutional/lab webpages.
Top Motivations to Share
• Standard practice within their research community (64%)
• Journal requirement (56%)
• To increase the impact and visibility of their research (55%)
Top Reasons Not to Share
• Concerns that their research will be scooped (56%)
• Intellectual property or confidentiality issues (54%)
• Concerns about misinterpretation or misuse (43%)
16. Health Sciences
The majority of Health Science researchers that share data are doing so as supplementary
material in a journal (68%). About one in three researchers are utilizing institutional data
repositories or personal/institutional/lab webpages, while 21% are depositing into
discipline-specific repositories to share and archive their data.
Top Motivations to Share
• Data sharing is standard practice within their research community (57%)
• To increase the impact and visibility of their research (52%)
• For the public’s benefit (49%)
Top Reasons Not to Share
• Intellectual property or confidentiality issues (68%)
• Ethical concerns (36%)
• Concerns about misinterpretation or misuse (36%)
17. Physical Sciences
The majority of Physical Science researchers that share data are doing so as supplementary
material in a journal (69%) while four in ten are sharing via personal/institutional/lab webpages. Just
under a third utilize institutional data repositories (28%). Compared with the global average, physical
science researchers are significantly less likely to be utilizing discipline-specific repositories (10%) or
general purpose repositories (3%) to share and archive their data.
Top Motivations to Share
• Standard practice within their research community (61%)
• To increase the impact and visibility of their research (59%)
• For the public’s benefit (52%)
Top Reasons Not to Share
• Intellectual property or confidentiality issues (47%)
• No funder or institutional requirement (29%)
• Concerns that their research will be scooped (27%)
18. Social Science and Humanities (SSH)
The majority of SSH researchers that share data are doing so as supplementary material in a journal
(52%), or on personal, institutional or project websites (51%). A quarter are utilizing institutional data
repositories (25%) while, cumulatively, only 5% are sharing in general purpose or discipline-specific
repositories.
Top Motivations to Share
• To increase the impact and visibility of their research (53%)
• Data sharing is standard practice within their research community (53%)
• For the public’s benefit (46%)
Top Reasons Not to Share
• Intellectual property or confidentiality issues (47%)
• Concerns about being scooped (30%)
• No funder or institutional requirement (28%)
Where researchers are sharing their data:
As supplementary material in a journal - 67%
At a conference - 57%
Informal paths or upon request (email, direct contact, etc.) - 42%
Personal, institutional or project webpage - 37%
Institutional data repository (i.e. university or institute-sponsored) - 26%
Discipline-specific data repository (e.g. GenBank, OpenEI, Protein Data Bank, TreeBASE) - 19%
General-purpose data repository (e.g. Dryad, figshare) - 6%
Other - 5%
Motivations for sharing data:
Data sharing is standard practice within my research community - 57%
Increase the impact and visibility of my research - 55%
Public benefit - 50%
Journal requirement - 42%
Transparency and re-use - 37%
Personal trust in the requester - 30%
Discoverability and accessibility - 25%
Funder requirement - 23%
Institutional requirement - 18%
Freedom of information request - 13%
Preservation - 13%
Other - 2%
Images to reflect our support of the community determining standards (what and how) for sharing, author licensing options, author education; agent that drive
Publishers are not the best placed to host data, but we can provide the links and enable discoverability.
Data “is ready for its close-up” and could sit alongside the narrative layer as an equally important artifact worthy of its own DOI. MOVE BEYOND SUPPLEMENTARY INFO.