Workshop 4: Open Science & Open Data for Librarians/Ina Smith
1. Workshop 4:
Open Science & Open Data for Librarians
24 April 2018 14:00 – 17:30
XXIII SCECSAL Conference, Entebbe, Uganda
2. Programme
14:00 – 14:30 Introduction to Open Science/Open Data
14:30 – 15:00 Data informing the library profession
15:00 – 15:40 Data in support of research
15:40 – 16:00 Health Break
16:00 – 17:00 Working with data – tools & applications
17:00 – 17:30 Towards a data strategy for your library & institution
4. Data Stakeholders
• Governments (policy)
• Institutions (policy & strategy)
• Research Offices (reporting, impact)
• Researchers (collecting data in an ethical and
trusted way so that it can be re-used)
• Statisticians (processing, analysing and visualising
data)
• System engineers (to maintain a network and
allow for data to be digitally transmitted)
• Librarians (managing and organizing the data, and
making sure it is digitally preserved for the
unforeseeable future)
5. Why Librarians as Data Partners?
• Information standards
• Organizational skills
• Setting up file structures (organizing
information)
• Knowledge of workflows
• Knowledge of collection management
• Describing data using established metadata
schemes & controlled vocabulary
• Collection curation/preservation
6. Role of Librarians
• Advocate for transparency, openness in research,
access to data
• Initiate the conversation on Open Science/Open Data
policy & strategy, and help implement it
• Develop your own data skills (technical skills, plus
knowledge of copyright, licensing and citation)
• Increase visibility of research data
• Manage & register trusted data repositories
• Recommend trusted data repositories
• Promote & support proper research data
management planning among researchers
7. Data Skills for Librarians (1)
• Data terminology
• Unix-style command line interface, allowing librarians to
efficiently work with directories and files, and find and manipulate
data
• Cleaning and enhancing data in OpenRefine and spreadsheets
• Git version control system and the GitHub collaboration tool
• Web scraping and extracting data from websites
• Scientific writing in useful, powerful, and open mark-up
languages such as LaTeX, XML, and Markdown
• Formulating and managing citation data, publication lists, and
bibliographies in open formats such as BibTeX, JSON and XML, and
using open source reference management tools such as JabRef
and Zotero
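As a small illustration of managing citation data in open formats, the sketch below writes one citation as both a BibTeX entry and a JSON object. The record, the citation key and the helper name `to_bibtex` are invented for illustration; reference managers such as JabRef and Zotero handle escaping and many more field types than this minimal version does.

```python
import json

def to_bibtex(key, entry_type, fields):
    """Render one citation as a BibTeX entry, keeping fields in the given order."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)

# Hypothetical record, invented purely for illustration.
record = {
    "author": "Example, Author",
    "title": "An Example Dataset Description",
    "year": "2018",
}

# The same citation in two open, plain-text formats.
bibtex_text = to_bibtex("example2018", "misc", record)
json_text = json.dumps({"id": "example2018", "type": "misc", **record}, indent=2)

print(bibtex_text)
print(json_text)
```

Because both outputs are plain text, they can be kept under version control and exchanged between tools without a proprietary format.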
8. Data Skills for Librarians (2)
• Transforming metadata documenting research outputs into open plain
text formats for easy reuse in research information systems in support of
funder compliance mandates and institutional reporting
• Scholarly identity with ORCID and managing reputation with ORCID-
enabled scholarly sharing platforms such as ScienceOpen
• Authorship, contributorship, and copyright ownership in collaborative
research projects
• Demonstrating best practices in attribution, acknowledgement, and
citation, particularly for non-traditional research outputs (software,
datasets)
• Identifying reputable Open Access publications and Open
Institutional/Open Data repositories
• Scholarly annotation and open peer review
• Investigating and managing copyright status of a work, and evaluating
conditions for Fair Use
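Transforming research output metadata into an open plain-text format can be sketched as follows. This is a minimal example using CSV as the target format; the records, the field names and the helper `records_to_csv` are assumptions made for illustration, not a prescribed schema for any research information system.

```python
import csv
import io

# Hypothetical metadata records, invented for illustration only.
records = [
    {"title": "Survey of Library Data Services", "creator": "Example, A.",
     "year": 2017, "type": "dataset"},
    {"title": "Repository Usage Statistics", "creator": "Sample, B.",
     "year": 2018, "type": "software"},
]

def records_to_csv(rows, fieldnames):
    """Serialise a list of metadata dictionaries as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = records_to_csv(records, ["title", "creator", "year", "type"])
print(csv_text)
```

The `csv` module quotes values that contain commas (such as the creator names here), so the file can be read back without losing information.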
10. Types of data
• Government data
• Communication data (mobile phones)
• Internet data
• Statistical data
• Research data (social & natural sciences)
• Discipline specific
• And more …
14. Open Science Defined
“Open Science is the practice of science in such a
way that others can collaborate and contribute,
where research data, lab notes and other
research processes are freely available, under
terms that enable reuse, redistribution and
reproduction of the research and its
underlying data and methods.” - FOSTER Project,
funded by the European Commission
17. Research Data Lifecycle
[Figure: Original Research Data Lifecycle image from University of California, Santa Cruz,
http://guides.library.ucsc.edu/datamanagement/
Labels added to the image: Repositories, Tools, Plan, Policy & Infrastructure]
21. Fears Researchers Experience
• Getting scooped
• Time & effort required from the researcher
• Someone else finding a path-breaking application
of the data that the researcher hasn’t considered
• Fear of problems/errors in the measurement
process being exposed
• Confidentiality/privacy of respondents - ethics
clearance
• Intellectual Property Rights – signed away, little
understanding, no IP in place
22. • When should research data be open?
• When should research data be closed?
30. Data Repositories vs Social Media
• Social media sites/3rd party software:
• Connect researchers sharing interests
• Marketing data
• Sites, and the data on them, belong to third parties
• Repository:
• Supports export/harvesting of metadata
• Offers long-term preservation
• Non-profit – no advertisements
• Uses open standards and protocols
• Copyright
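Metadata harvesting from a repository can be sketched as below, assuming the repository exposes an OAI-PMH endpoint with Dublin Core (`oai_dc`) records; the slide itself does not name a protocol, so this is one common choice rather than the only one. The endpoint URL and the sample response are invented for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

# Hand-written sample response (assumption), trimmed to one record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Dataset</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# Namespace prefix used in the search path below.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def extract_titles(xml_text):
    """Return every Dublin Core title found in a harvested response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(".//dc:title", NS)]

print(list_records_url(BASE_URL))
print(extract_titles(SAMPLE_RESPONSE))
```

Social media sites generally do not offer an equivalent machine-readable export, which is one practical difference behind the comparison above.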
32. Register Data Initiatives
• re3data.org
https://www.re3data.org/
• Open Data Barometer
https://opendatabarometer.org/
• Global Open Data Index
https://index.okfn.org/
• African Open Science Platform
http://africanopenscience.org.za/
• Dataverse … and more …
38. Working with Data
• Using R, Python, ggplot and more …
• Collection e.g. Survey
• Normalisation & Cleaning e.g. OpenRefine
• Analysis
• Visualisation
• Preservation
• Mining
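The normalisation and cleaning step can be illustrated with a short sketch. It is loosely modelled on the key-collision ("fingerprint") clustering idea available in OpenRefine, in a simplified form; the survey responses are invented for illustration.

```python
import string
from collections import Counter

def fingerprint(value):
    """Reduce a free-text value to a normalised key so that variants collide."""
    cleaned = value.strip().lower()
    # Remove ASCII punctuation, then sort the unique words.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)

# Hypothetical survey answers to "Where do you work?", with typical inconsistencies.
responses = [
    "University Library",
    " university library",
    "Library, University",
    "Public Library",
    "public  library.",
]

# Analysis after cleaning: count responses per normalised key.
counts = Counter(fingerprint(response) for response in responses)
print(counts)
```

Five differently typed answers collapse into two groups, which can then be counted, visualised or reviewed by hand before a preferred spelling is chosen.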
42. Data Mining
• Set of methods to analyse data from various
dimensions and perspectives, finding previously
unknown hidden patterns, classifying and grouping
the data and summarizing the identified
relationships
Data mining has two main tasks:
• Prediction: use some features to predict unknown
or future values of the same or another feature
• Description: find interesting, human-interpretable
patterns that describe the data
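The two tasks can be contrasted in a minimal sketch. The predictive part fits a straight line by least squares and extrapolates one step; the descriptive part counts which keywords occur together most often. All figures and keywords are invented for illustration, and real projects would normally use dedicated libraries and far more data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: year number -> datasets deposited in a repository.
deposits_per_year = {1: 11, 2: 13, 3: 18, 4: 22, 5: 26}

def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    xs = list(points.keys())
    ys = list(points.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Predictive task: estimate a future value from known values.
slope, intercept = fit_line(deposits_per_year)
predicted_year_6 = intercept + slope * 6

# Descriptive task: find keyword pairs that appear together in records.
keyword_sets = [
    {"open data", "policy"},
    {"open data", "repositories"},
    {"open data", "policy", "licensing"},
    {"repositories", "preservation"},
]
pair_counts = Counter(
    pair
    for keywords in keyword_sets
    for pair in combinations(sorted(keywords), 2)
)
most_common_pair, frequency = pair_counts.most_common(1)[0]

print(round(predicted_year_6, 1))
print(most_common_pair, frequency)
```

The prediction is a number about the future; the description is a pattern a person can read directly, here that "open data" and "policy" are the keywords most often used together.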
46. Self- & Lifelong Learning
• Bachelor of Science in Data Science, Sol Plaatje University (South Africa)
• Coursera Data Science
• Coursera Research Data Management and Sharing
• FOSTER Open Science Courses
• Masters Program in Biodiversity Informatics, Prof Jean Ganglo, University of Abomey-
Calavi (Benin)
• MANTRA for Researchers
• MANTRA for Librarians
• Agricultural Information Management Standards (AIMS)
• Author Carpentry
• Data Carpentry
• Library Carpentry
• WDS Training Resources
• UCT eResearch
59. Awareness – start the conversation
• To begin ….
• What data repositories? Which data type? Which
metadata standards?
• Data web page
• Market the library's data support services
• Meet with stakeholders at institution
• Form a committee to implement strategy, policy,
etc.
• Implement Research Data Management Plans
• Implement Institutional Data Repository
61. Thank you
Ina Smith
Project Manager, African Open Science Platform Project, Academy of
Science of South Africa (ASSAf)
ina@assaf.org.za
Susan Veldsman
Director, Scholarly Publishing Programme, Academy of Science of
South Africa (ASSAf)
susan@assaf.org.za
Visit http://africanopenscience.org.za