Presentation on data citations for publishers, given by Jez Cope, Data Services Lead at the British Library/Crossref workshop in London on 5 February 2019.
4. www.bl.uk
Why cite data?
• For authors:
• Support validation, reproducibility and reuse
• Get credit for valuable data
• Improve provenance tracking
• Comply with funder requirements
• For reviewers:
• Evaluate submissions more easily
5. www.bl.uk
Why cite data?
• For publishers:
• Improve reader & author service
• Improve editor & reviewer service
• Help journals & authors comply easily with funder mandates
• Meet editorial goals to publish more open/reproducible research
• Make the most of repository partnerships
6. www.bl.uk
FAIR data
• The FAIR principles are now a cornerstone of most funder policies
• Findable
• Accessible
• Interoperable
• Reusable
• Data citation plays a key role in making data Findable and Accessible
7. www.bl.uk
How to do it: principles
• FORCE11 Joint Declaration of Data Citation Principles
• Key set of overarching guidelines
• Agreed between researchers, funders, publishers, information professionals, …
• Data Citation Synthesis Group. ‘Joint Declaration of Data Citation Principles’, 2014. https://doi.org/10.25490/a97f-egyk.
9. www.bl.uk
How to do it: practice
• A lot to consider! Take it one step at a time
• Valuable guide: “A Data Citation Roadmap for Scientific Publishers”:
• Cousijn, Helena, Amye Kenall, Emma Ganley, Melissa Harrison, David Kernohan, Thomas Lemberger, Fiona Murphy, et al. ‘A Data Citation Roadmap for Scientific Publishers’. Scientific Data 5 (20 November 2018): 180259. https://doi.org/10.1038/sdata.2018.259.
10. www.bl.uk
Information for authors
• Request data availability statements
• Include them in the author template
• Examples of citation format
• Many style guides and standards now have dataset guidance (e.g. ISO 690:2010)
• Guidance on suitable repositories
• Reminder of the corresponding author's responsibility for research validity, including proper data management
11. www.bl.uk
Data Availability Statements
• Data Availability (or Access) Statements are not formal citations but can provide more nuance in describing how to access data
• Some good examples collated by university libraries:
• University of Bristol
• University of Bath
• JATS4R recommendation on markup of data availability statements
12. www.bl.uk
Citation formats for data
Author(s), Year, Dataset Title, Data Repository or Archive, Version, Global Persistent Identifier
• Principle 2: Credit & attribution
• Principles 4, 5, 6: Unique Identifier, Access, Persistence
• Principle 7: Specificity & verification
Taken from https://www.force11.org/node/4771
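The element order above can be sketched in code. This is a minimal illustration only: the class name, field names, punctuation between elements and the sample values (including the placeholder DOI) are assumptions made for the example, not a format prescribed by FORCE11 or by any style guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the citation elements listed on the slide."""
    authors: List[str]
    year: int
    title: str
    repository: str                 # data repository or archive
    identifier: str                 # global persistent identifier, e.g. a DOI URL
    version: Optional[str] = None   # optional; omitted from the output if absent

    def format(self) -> str:
        # Order follows the slide: authors, year, title, repository,
        # version, persistent identifier. Punctuation is an assumption.
        parts = [
            "; ".join(self.authors),
            f"({self.year})",
            f"{self.title}.",
            f"{self.repository}.",
        ]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(self.identifier)
        return " ".join(parts)


# Invented sample values; the DOI is a placeholder, not a real dataset.
example = DatasetCitation(
    authors=["Doe, J.", "Roe, R."],
    year=2019,
    title="Example Survey Dataset",
    repository="Example Data Archive",
    identifier="https://doi.org/10.5555/example",
    version="1.2",
)
print(example.format())
```

A journal would normally adapt the punctuation to its own reference style; the point is only that every element from the slide has a defined place.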
13. www.bl.uk
re3data: Registry of Research Data Repositories
https://re3data.org
• Huge range of data repositories
• Search for keywords or browse by discipline
• Good to select a subset focused on your authors
14. www.bl.uk
Policy
• 4 broad policy types (with thanks to Springer Nature):
See https://www.springernature.com/gp/authors/research-data-policy/data-policy-types/12327096
Policy type | Policy summary | Example journal
Type 1 | Data sharing and data citation is encouraged | Photosynthesis Research
Type 2 | Data sharing and evidence of data sharing encouraged | Plant and Soil
Type 3 | Data sharing encouraged and statements of data availability required | Palgrave Communications (see Editorial policies)
Type 4 | Data sharing, evidence of data sharing and peer review of data required | Scientific Data (see Data policies)
15. www.bl.uk
Structured information
• Display data citations & data availability statements in the article
• Update DTDs etc. to correctly tag these
• E.g. JATS for Reuse (JATS4R) recommendation https://jats4r.org/data-citations
• Deliver data citation information to Crossref
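To make the tagging step concrete, here is a rough sketch that builds a data citation as a JATS-style reference using Python's standard library. The element and attribute names (`element-citation` with `publication-type="data"`, `data-title`, `source`, `version`, `pub-id`) are written from memory of JATS and the JATS4R recommendation and should be checked against https://jats4r.org/data-citations before use. The function name, its parameters and the sample values are hypothetical, and the deposit to Crossref is not shown.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_data_citation(ref_id: str, surname: str, given_names: str,
                        year: int, data_title: str, repository: str,
                        doi: str, version: Optional[str] = None) -> ET.Element:
    """Build a <ref> holding one data citation (illustrative helper only)."""
    ref = ET.Element("ref", {"id": ref_id})
    # publication-type="data" marks the reference as a dataset citation.
    citation = ET.SubElement(ref, "element-citation",
                             {"publication-type": "data"})
    group = ET.SubElement(citation, "person-group",
                          {"person-group-type": "author"})
    name = ET.SubElement(group, "name")
    ET.SubElement(name, "surname").text = surname
    ET.SubElement(name, "given-names").text = given_names
    ET.SubElement(citation, "year").text = str(year)
    ET.SubElement(citation, "data-title").text = data_title
    # <source> carries the repository or archive name.
    ET.SubElement(citation, "source").text = repository
    if version:
        ET.SubElement(citation, "version").text = version
    ET.SubElement(citation, "pub-id", {"pub-id-type": "doi"}).text = doi
    return ref


# Invented sample values; the DOI is a placeholder.
ref = build_data_citation("dataref1", "Doe", "Jane", 2019,
                          "Example Survey Dataset", "Example Data Archive",
                          "10.5555/example", version="1.2")
xml_text = ET.tostring(ref, encoding="unicode")
print(xml_text)
```

Once references are tagged consistently like this, the same structured fields can feed both the article display and the metadata sent downstream.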
17. www.bl.uk
How does the British Library help?
• Point of coordination for DataCite in the UK
• Provide UK research institutions with DOIs for data
• Advocate for data citation within library and research communities
• International collaborations to develop standard approaches to bibliographic metadata
• E.g. FREYA project: “Connected Open Identifiers for Discovery, Access and Use of Research Resources” https://www.project-freya.eu/en
18. www.bl.uk
Specialist issues
• Citing dynamic data
• Important for datasets under ongoing update
• E.g. long-term cohort studies, live streaming datasets
• What was the state of the dataset when the analysis was done?
• Andreas Rauber, Ari Asmi, Dieter van Uytvanck, and Stefan Proell. ‘Data Citation of Evolving Data: Recommendations of the Working Group on Data Citation (WGDC)’, 23 October 2016. http://dx.doi.org/10.15497/RDA00016.
• Citing software
• In some disciplines, software is the key research artefact
• In others, it still plays a key part in validation of the analysis
• Smith, Arfon M., Daniel S. Katz, and Kyle E. Niemeyer. ‘Software Citation Principles’. PeerJ Computer Science 2 (19 September 2016): e86. https://doi.org/10.7717/peerj-cs.86.
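The dynamic data question ("what was the state of the dataset when the analysis was done?") can be illustrated with a toy sketch. It follows the general idea of keeping timestamped versions of records and citing a query together with its execution time and a checksum of the result. It is a simplification written for this example, not an implementation of the WGDC recommendations, and all class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class RecordVersion:
    key: str
    value: float
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means still current


class VersionedDataset:
    """Toy store that never overwrites: updates close the old version."""

    def __init__(self) -> None:
        self._versions: List[RecordVersion] = []

    def upsert(self, key: str, value: float, when: datetime) -> None:
        for v in self._versions:
            if v.key == key and v.valid_to is None:
                v.valid_to = when
        self._versions.append(RecordVersion(key, value, when))

    def as_of(self, when: datetime) -> Dict[str, float]:
        # Records whose validity interval contains the given time.
        return {v.key: v.value for v in self._versions
                if v.valid_from <= when
                and (v.valid_to is None or when < v.valid_to)}


def run_query(ds: VersionedDataset, min_value: float,
              when: datetime) -> Dict[str, float]:
    return {k: v for k, v in ds.as_of(when).items() if v >= min_value}


def checksum(rows: Dict[str, float]) -> str:
    # Sort first so the checksum does not depend on row order.
    return hashlib.sha256(repr(sorted(rows.items())).encode()).hexdigest()


@dataclass(frozen=True)
class QueryCitation:
    min_value: float        # the stored query parameter
    timestamp: datetime     # when the query was executed
    result_checksum: str    # used to verify the re-executed result


def cite(ds: VersionedDataset, min_value: float,
         when: datetime) -> QueryCitation:
    return QueryCitation(min_value, when,
                         checksum(run_query(ds, min_value, when)))


def resolve(ds: VersionedDataset, c: QueryCitation) -> Dict[str, float]:
    rows = run_query(ds, c.min_value, c.timestamp)
    if checksum(rows) != c.result_checksum:
        raise ValueError("result no longer matches the cited state")
    return rows


t1 = datetime(2019, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2019, 2, 1, tzinfo=timezone.utc)
t3 = datetime(2019, 3, 1, tzinfo=timezone.utc)
ds = VersionedDataset()
ds.upsert("a", 1.0, t1)
ds.upsert("b", 5.0, t1)
citation = cite(ds, 2.0, t2)   # analysis done on the state at t2
ds.upsert("a", 9.0, t3)        # dataset changes afterwards
cited_rows = resolve(ds, citation)
current_rows = run_query(ds, 2.0, t3)
```

Here `cited_rows` still reflects the dataset as it was at `t2`, while `current_rows` includes the later update, which is the distinction a citation of evolving data needs to preserve.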