1. dans.knaw.nl
DANS is an institute of KNAW and NWO
Enabling data sharing in the Netherlands:
Contributions by DANS
Ingrid Dillo
Deputy Director DANS
Research Data Network Workshop
Parallel Sessions II: Research Data Solutions from SURF and DANS
York, 27 June 2017
2. Outline
RDN workshop: “... focus on innovative tools and approaches that
offer practical solutions to current and future RDM challenges”
NL RDM landscape issues: “funding, putting policy into practice”
• DANS
• Frontoffice-backoffice model
• Certification of digital repositories
• FAIR data assessment
• Business models for digital repositories
3. DANS organisation
• Institute of the Dutch Academy and Research Funding Organisation
(KNAW & NWO) since 2005
• First predecessor dates back to 1964 (Steinmetz Foundation);
Historical Data Archive 1989
• Mission: promote and provide permanent access to digital
research resources
4. DANS core services
• DataverseNL: to support data storage during research until
10 years after
• NARCIS: portal aggregating research information and
institutional repositories
• EASY: certified long-term archive
https://dans.knaw.nl
10. Data sharing incentives
• Influence of sharing norms within direct
research circle
• Professional rewards for data sharing
• External drivers:
• Publisher requirements (DAPs)
• Funder policies/mandates
http://repository.jisc.ac.uk/5662/1/KE_report-incentives-for-sharing-researchdata.pdf
11. Other data sharing challenges
Enabling the researcher to comply with open data requirements:
• awareness raising, training and support for data management
(DMPs, FAIR data)
• infrastructure for preservation of and long-term access to the
data
12. Sustainable support model
Frontoffice-backoffice model
• Division of labour
• Economies of scale
Backoffice
• Curation and preservation expertise
• Training of local data experts
• Long-term preservation infrastructure
13. “Perhaps the biggest challenge in sharing data is
trust: how do you create a system robust enough for
scientists to trust that, if they share, their data
won’t be lost, garbled, stolen or misused?”
14. Pillars of trust
• actions and attributes of the trustee (integrity, transparency,
competence, predictability, guarantees, positive intentions)
• external acknowledgements:
• reputation (researchers)
• third party endorsements (funders, publishers)
15. DANS and Data Seal of Approval
• 2005: DANS to promote and provide permanent access to
digital research resources
• Formulate quality guidelines for digital repositories including
DANS
• 2009: international DSA Board
• Almost 70 seals acquired around the globe, but with a focus
on Europe
• https://www.datasealofapproval.org/en/
17. Partnership with WDS under the umbrella
of RDA
• Goals:
• Realizing efficiencies
• Simplifying assessment options
• Stimulating more certifications
• Outcomes:
• Common catalogue of requirements for core repository
assessment
• Common procedures for self-assessment and review
process
• One new certification body: CoreTrustSeal Board
19. Requirements dealing with “data quality” or “fitness
for use” or “FAIRness”
R2. The repository maintains all applicable licenses covering data access and use and
monitors compliance.
R3. The repository has a continuity plan to ensure ongoing access to and preservation
of its holdings.
R4. The repository ensures, to the extent possible, that data are created, curated,
accessed, and used in compliance with disciplinary and ethical norms.
R7. The repository guarantees the integrity and authenticity of the data.
R8. The repository accepts data and metadata based on defined criteria to ensure
relevance and understandability for data users.
R10. The repository assumes responsibility for long-term preservation and manages
this function in a planned and documented way.
R11. The repository has appropriate expertise to address technical data and metadata
quality and ensures that sufficient information is available for end users to make
quality-related evaluations.
R13. The repository enables users to discover the data and refer to them in a
persistent way through proper citation.
R14. The repository enables reuse of the data over time, ensuring that appropriate
metadata are available to support the understanding and use of the data.
20. All data sets in a Trustworthy Repository are FAIR, but
some are more FAIR than others
21. Experiences with Data Reviews at DANS
started in 2011
• M. Grootveld, J. van Egmond and B. Sørensen
• https://goo.gl/Tf4HFN
22. FAIR badge scheme
• Proxy for data “quality” or “fitness
for (re-)use”
• Prevent interactions among
dimensions to ease scoring
• Consider Reusability as the
resultant of the other three:
• the average FAIRness as an indicator of
data quality
• (F+A+I)/3=R
• Manual and automatic scoring
[Badge mock-up: F, A, I and R scores; 2 user reviews; 1 archivist
assessment; 24 downloads]
23. Findable (defined by metadata (PID included) and documentation)
1. No PID and no metadata/documentation
2. PID without or with insufficient metadata
3. Sufficient/limited metadata without PID
4. PID with sufficient metadata
5. Extensive metadata and rich additional documentation available
Accessible (defined by presence of user license)
1. Neither metadata nor data are accessible
2. Metadata are accessible but data are not accessible (no clear terms of reuse in license)
3. User restrictions apply (i.e. privacy, commercial interests, embargo period)
4. Public access (after registration)
5. Open access unrestricted
Interoperable (defined by data format)
1. Proprietary (privately owned), non-open format data
2. Proprietary format, accepted by Certified Trustworthy Data Repository
3. Non-proprietary, open format = ‘preferred format’
4. In addition to being in the preferred format, data are standardised using a
standard vocabulary format (for the research field to which the data pertain)
5. Data additionally linked to other data to provide context
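The scoring rule from the badge scheme, (F+A+I)/3=R, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DANS implementation: the function name `reusability_score`, the range check against the five levels listed above, and the decision not to round the result are all assumptions made for the example.

```python
def reusability_score(findable: int, accessible: int, interoperable: int) -> float:
    """Return R as the plain average of the F, A and I scores."""
    scores = {"F": findable, "A": accessible, "I": interoperable}
    for name, value in scores.items():
        # Assumption: each dimension is scored on the five levels (1 to 5)
        # listed for Findable, Accessible and Interoperable.
        if not 1 <= value <= 5:
            raise ValueError(f"{name} score must be between 1 and 5, got {value}")
    return sum(scores.values()) / 3


# Example: PID with sufficient metadata (F=4), unrestricted open access (A=5),
# non-proprietary preferred format (I=3).
print(reusability_score(4, 5, 3))  # 4.0
```

Treating R as a derived value, rather than a separately scored dimension, is what keeps the three input dimensions independent of one another when scoring.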
24. Creating a FAIR data assessment tool
Using an online questionnaire system
Prototype:
https://www.surveymonkey.com/r/fairdat
25. Website FAIRDAT
• To contain FAIR data
assessments from any
repository or website,
linking to the location of
the data set via (persistent)
identifier
• The repository can show
the resultant badge, linking
back to the FAIRDAT
website
[Badge mock-up: F, A, I and R scores; 2 user reviews; 1 archivist
assessment; 24 downloads]
Neutral, Independent
Analogous to DSA website
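One way to picture the record such a website might hold is sketched below in Python. Everything here is hypothetical: the class `FairAssessment`, its field names, the badge text layout, the URL pattern in `badge_link`, and the example identifier and base URL are illustrative assumptions, not a description of the FAIRDAT prototype.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class FairAssessment:
    """One assessment of a data set, keyed by its (persistent) identifier."""

    dataset_pid: str  # identifier pointing at the location of the data set
    findable: int
    accessible: int
    interoperable: int

    @property
    def reusable(self) -> float:
        # R as the average of F, A and I, following the badge scheme.
        return (self.findable + self.accessible + self.interoperable) / 3

    def badge_text(self) -> str:
        # Hypothetical text rendering of the badge a repository could show.
        return (
            f"F{self.findable} A{self.accessible} "
            f"I{self.interoperable} R{self.reusable:.1f}"
        )

    def badge_link(self, base_url: str) -> str:
        # Hypothetical URL layout: the assessment page is addressed by the
        # percent-encoded identifier of the data set.
        return f"{base_url}/assessment?pid={quote(self.dataset_pid, safe='')}"


# Placeholder identifier and base URL, for illustration only.
assessment = FairAssessment("doi:10.1234/example", 4, 5, 3)
print(assessment.badge_text())
print(assessment.badge_link("https://fairdat.example.org"))
```

Keeping the identifier as the only link between the assessment and the data set means the assessment site stays independent of any one repository, which matches the "neutral, independent" positioning on the slide.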
26. Sustainable business models for data repositories
Increasing need for data repositories and data stewardship.
• Increasing volume presents a challenge.
• Requirements for stewardship present a greater challenge.
Sustaining digital data infrastructure is a major issue for
science policy
• current funding models will prove inelastic and not meet the
growing requirements – concern on the part of repositories
and funders
27. Sustainable business models for data repositories
RDA Cost Recovery Interest Group, also supported by WDS and CODATA
Report: “Income Streams for Data Repositories” (Feb 2016;
https://zenodo.org/record/46693#.WTUR-TOB2T8)
• based on 25 in-depth interviews, identifying topics and trends,
alternative revenue streams
28. Sustainable business models for data repositories
• Continuation of the work under the umbrella of OECD/GSF
• Around 50 interviews in total
• Thorough economic analysis
• Cost optimization
• Stakeholder workshops
• Presentation of report and stakeholder recommendations at
RDA Plenary Montreal
• Expected OECD publication end of 2017
https://www.innovationpolicyplatform.org/open-data-science-oecd-project
29. Elements of a Business Model for Data Repositories
User Base
• Data depositors
• Data users
• Research institutions
• Research funders
• Others
Products
• Research data
• Research facilities
• Value-adding services
• Contract services
• Research services
Revenue Sources
• Structural funding
• Host institutional funding
• Deposit-side charges
• Access charges
• Services charges
Financing
• Investment funding
• Development funding
• Operational revenue
Identifying the user base
Developing the product mix
Making the value proposition(s)
Understanding cost drivers & matching revenue streams
30. Thank you for listening
ingrid.dillo@dans.knaw.nl
www.dans.knaw.nl