Open Science - Global Perspectives/Simon Hodson
1. Open Science: Global Perspectives
Simon Hodson, Executive Director, CODATA
www.codata.org
SA-EU Open Science Dialogue Workshop
Birchwood Hotel & OR Tambo Conference Centre
Johannesburg, South Africa
30 November 2017
2. Why Open Science / FAIR Data?
• Good scientific practice depends on communicating the evidence.
• Open research data are essential for reproducibility, self-correction.
• Academic publishing has not kept up with the age of digital data.
• Danger of a replication / evidence / credibility gap.
• Boulton: to fail to communicate the data that supports scientific
assertions is malpractice
• Open data practices have transformed certain areas of research.
• Genomics and related biomedical sciences; crystallography;
astronomy; areas of earth systems science; various disciplines using
remote sensing data…
• FAIR data helps use of data at scale, by machines, harnessing
technological potential.
• Research data often have considerable potential for reuse,
reinterpretation, use in different studies.
• Open data foster innovation and accelerate scientific discovery through
reuse of data within and outside the academic system.
• Research data produced by publicly funded research are a public
asset.
3. Policy Push for
Open Research Data
• The three Bs (Budapest, Berlin and Bethesda) and Open Access, 2002-3
• OECD Principles and Guidelines on Access to Research Data, 2004, 2007
• UK Funder Data Policies, from 2001, accelerating from 2009
• NSF Data Management Plan Requirements, 2010
• Royal Society Report ‘Science as an Open Enterprise’, 2012
• OSTP Memo ‘Increasing Access to the Results of Federally Funded Scientific Research’,
Feb 2013
• G8 Science Ministers Statement, June 2013
• G8 Open Data Charter and Technical Appendix, June 2013
• EC H2020 Open Data Policy Pilot, 2014; Adoption of FAIR Data Principles, 2017.
• Science International Accord on Open Data in a Big Data World, Dec 2015:
http://bit.ly/opendata-bigdata
4. Developments: Journal Data Policies
Dryad Joint Data Archiving Policy, Feb 2010: http://datadryad.org/jdap
This journal requires, as a condition for publication, that data supporting the results in the
paper should be archived in an appropriate public archive, such as GenBank, TreeBASE,
Dryad, or the Knowledge Network for Biocomplexity.
PLOS Data Availability Policy, revised Feb 2014:
http://www.plosone.org/static/policies.action#sharing
PLOS journals require authors to make all data underlying the findings described in their
manuscript fully available without restriction, with rare exceptions.
Springer Nature initiative to standardise policies:
http://www.springernature.com/gp/group/data-policy/policy-types
FAIRsharing https://fairsharing.org ;
RDA Interest Group to encourage development and adoption of journal data policies
AGU COPDESS activity to promote greater data availability in earth system sciences.
5. Resources: Current Best Practice for Research Data
Management Policies
Expert report commissioned by CODATA member.
Provides comprehensive summary of best practice
in funder data policies.
Identifies key elements to be addressed:
1. Summary of policy drivers
2. Intelligent openness
3. Limits of openness
4. Definition of research data
5. Define data in scope
6. Criteria for selection
7. Summary of responsibilities
8. Infrastructure and costs
9. DMP requirements
10. Enabling discovery and reuse
11. Recognition and reward
12. Reporting requirements, compliance monitoring
Zenodo: http://dx.doi.org/10.5281/zenodo.27872
6. CODATA Data Policy Activities
New Data Policy Committee, chaired by Paul Uhlir,
international expert in Data Policies and member of
CODATA Executive Committee.
Current Best Practice for Research Data Management
Policies http://dx.doi.org/10.5281/zenodo.27872
The Value of Open Data Sharing, report for GEO
http://dx.doi.org/10.5281/zenodo.33830
Legal Interoperability, Principles and Implementation
Guidelines https://doi.org/10.5281/zenodo.162241
FAIR Data
Simon Hodson is chairing the European
Commission’s Expert Group on FAIR Data:
http://bit.ly/FAIR_Data_Expert_Group
OECD Global Science Forum and CODATA Project on
Business Models for Sustainable Data Repositories:
http://www.codata.org/working-groups/oecd-gsf-sustainable-business-models
7. The Case for Open Data
in a Big Data World
• Science International Accord on Open Data in a Big Data World:
http://www.science-international.org/
• Supported by four major international science organisations.
• Takes a global approach: data revolution and Open Science are
phenomena with global ramifications. Repeatedly stresses the
opportunities for LMICs and the negative consequences of being
left outside a system of data intensive research.
• Presents a powerful case that the profound transformations
mean that data should be:
• Open by default
• Intelligently open, FAIR data
• Lays out a framework of principles, responsibilities and enabling
practices for how the vision of Open Data in a Big Data World
can be achieved.
• Campaign for endorsements: over 150 organisations so far.
• Please consider endorsing the Accord: http://www.science-international.org/#endorse
8. The “Science International” Accord:
principles of open data
(www.icsu.org/science-international)
Responsibilities
1-2. Scientists
3. Research institutions & universities
4. Publishers
5. Funding agencies
6. Scholarly societies and academies
7. Libraries & repositories
8. Boundaries of openness
Enabling practices
9. Citation and provenance
10. Interoperability
11. Non-restrictive re-use
12. Linkability
9. Framework for Regional, National and
Institutional Data Strategies
National / Institutional Open Science and FAIR Data Strategy
Consultative forum, stakeholder engagement.
Open data policies and guidance at national and institutional
level.
Clarify the boundaries of open (particularly privacy, IPR).
Clarify the data in scope, guidelines on selection.
Develop incentives and reward systems.
Mechanisms (infrastructure and policy) to ensure
concurrent publication of data as research output.
Data ‘publication’ and citations of data included in
assessment of research contribution.
Promotion of data skills:
Essential data skills for researchers.
Develop skills and competencies for data stewards, data
scientists.
10. Framework for Regional, National and
Institutional Data Strategies
Scope, roadmap and implement data infrastructure.
Key components of national and regional
infrastructure (network / NREN, economies of scale
for storage and compute).
Development of regional, national and institutional
infrastructure(s) for research collaboration and data
stewardship/RDM, generic research
platforms/environments, trusted digital repositories.
Collaborative infrastructures for certain research
disciplines, nationally, regionally to pool expertise and
lower costs.
International infrastructure / data ecosystem
components: permanent identifiers, metadata
standards.
11. Establish African Open Data Forum / Platform
Funded Research Data Infrastructure Initiatives
Funded, co-designed transdisciplinary research
projects
Co-design African Open Data Policies
Develop Incentives Frameworks
Develop Research Data Science Training
African Research Data Infrastructure Roadmap
Activities require low funding for coordination, secondment, contributions in kind and evaluation.
Activities require higher investment for coordination, co-design implementation and evaluation.
African Open Science Platform
Pilot Project Workpackages
12. International ‘ecosystem’ of open science
components
Open Science infrastructure is not just the network, storage and compute.
Ecosystem of components which are created and governed internationally.
Reporting Research Outputs: information systems for research output reporting (CRIS), metadata
standards e.g. CERIF, managed by euroCRIS.
Persistent and Unique Identifiers: DOIs for articles (CrossRef); DOIs for data sets (DataCite); author IDs
(ORCID).
Data and Metadata Standards: CIF in crystallography, FITS in astronomy, DDI in social science surveys,
Darwin Core in biodiversity, etc, etc.
DCC Registry of Metadata Standards http://www.dcc.ac.uk/resources/metadata-standards ; now
maintained by RDA IG http://rd-alliance.github.io/metadata-directory/
Data Repositories: listed in Re3Data, registry of data repositories: https://www.re3data.org/
Trusted Data Repositories: CoreTrustSeal https://www.coretrustseal.org/, a merger of Data Seal of
Approval and the World Data System criteria.
Criteria for Trustworthy Digital Archives (DIN 31644) http://www.data-archive.ac.uk/curate/trusted-digital-repositories/standards-of-trust?index=3
Audit and certification of trustworthy digital repositories (ISO 16363) http://www.data-archive.ac.uk/curate/trusted-digital-repositories/standards-of-trust?index=2
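As a small illustration of the "Persistent and Unique Identifiers" component above, the sketch below checks whether a string is a well-formed ORCID author ID. ORCID iDs end in a check character computed with the ISO 7064 MOD 11-2 algorithm; the function names here are illustrative, not part of any ORCID library, and the check only confirms the format and checksum, not that the iD is actually registered.

```python
import re

# Four hyphen-separated groups of four characters; the last may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and checksum only; this does not query the ORCID registry."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    chars = orcid.replace("-", "")
    return orcid_check_digit(chars[:15]) == chars[15]


if __name__ == "__main__":
    print(is_valid_orcid("0000-0002-1825-0097"))  # True
    print(is_valid_orcid("0000-0002-1825-0098"))  # False
```

A check like this is the kind of lightweight validation a repository or CRIS could run before storing an author identifier.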
13. Global Registry of Data Repositories
Country coverage in Re3Data.org (registry of data repositories).
14. Data Seal of Approval
Location of repositories having acquired Data Seal of Approval
15. CODATA 2017 Session: ‘World Tour
of Open Data and Open Science’
Presentations from USA, Finland, Japan, China, Australia, Canada, Israel, South Africa, Kenya.
Presentations covered: National Policies, Key Policy Players, Most Significant Projects, Barriers.
Intention is for the session to be written up as a series of surveys.
16. Australia: Policies
Australian Research Council (ARC)
Encourages researchers to deposit data arising from research in publicly accessible repositories.
Since 2014, requires researchers to provide a DMP as part of the application process
Australian Code for the Responsible Conduct of Research (2007) includes the proper management
of research data
“Research data should be made available for use by the other researchers unless this is prevented
by ethical, privacy or confidentiality matters.”
References the OECD Principles and Guidelines for Access to Research Data from Public Funding (2007)
National Health and Medical Research Council (NHMRC)
Acknowledges the importance of making data publicly accessible.
Encourages data sharing and providing access to data and other research outputs arising from
NHMRC supported research.
2016 National Research Infrastructure Roadmap
Recommends Australian Research Data Cloud
Slide Credit: Jane Hunter
17. Australia: Key Initiatives
ANDS – Australian National Data Service
Research Data Australia https://researchdata.ands.org.au/ (currently 133539 records)
ANDS has been a major player nationally and internationally (AU$72M over 3.5 years from 2009-13)
NeCTAR – National eResearch Collaborative Tools and Resources
RDS – Research Data Services
CSIRO data activities; Data61 (digital and data innovation group).
Various Data Initiatives – particularly strong in ecosystem / biodiversity studies
TERN Terrestrial Ecosystem Research Network http://www.tern.org.au/
Atlas of Living Australia https://www.ala.org.au/
IMOS, Integrated Marine Observing System, a National Collaborative Research Infrastructure
Slide Credit: Jane Hunter
18. Canada: Policies
Canada does not have an overarching policy per se but a patchwork of policy initiatives by important
players is starting to emerge.
“Government and its information must be open by default. Simply put, it is time to shine more
light on government to make sure it remains focused on the people it was created to serve –
Canadians.” (PM Trudeau)
The 3 federal granting councils (NSERC, SSHRC, CIHR) have drafted a policy embracing the FAIR
principles for research data and are entering a consultation process. They embraced open
publications in 2015
Nine provinces/territories and 55 cities have embraced open data
Federal publications are available as part of the Depository Services Program.
Some individual universities are developing data repositories and support services to assist
researchers in research data management.
Open journal publishing is at an embryonic stage but is strongly supported by academic libraries.
Slide Credit: Jon Broome and Ernie Boyko
19. Canada: Policies
Canada’s Portage Network works with research libraries and other
stakeholders to coordinate expertise, services, and technology in
research data management: https://portagenetwork.ca/
Federated Research Data Repository: a scalable federated platform
for digital research data management and the discovery of Canadian
research data https://www.frdr.ca/repo/?locale=en
Research Data Canada: a stakeholder-driven and supported
organization dedicated to improving the management of research
data in Canada.
DataCite Canada: Canada's data registration service provided by
the National Research Council.
Genome Canada
Canadian Astronomy Data Centre: thirty years of open data research
GeoGratis: an early example of open data, when Natural
Resources Canada made its mapping data open for use by the public
FACETS: Canada's first multidisciplinary open access science
journal, which will have a data section edited by Chuck Humphrey,
Canada’s preeminent data scientist
Slide Credit: Jon Broome and Ernie Boyko
20. Indian National Data Sharing and
Accessibility Policy
Global experience has demonstrated convincingly that access to data leads to
breakthroughs in scientific understanding as well as to economic and public
good, in addition to several benefits to civil society. Given the deployment of
substantial level of investment of public funds in collection of data and the
untapped potentials of benefits to social society, it has become important to
make available non-sensitive data for legitimate and registered use.
National Data Sharing and Accessibility Policy (NDSAP), March 2012:
http://www.dst.gov.in/NDSAP.pdf
Places emphasis on a negative list of sensitive data types, rather than a
positive list of data to be released: i.e. the default is open, unless the data is
on the ‘stop’ list.
Allows for data to be Open, accessible to registered users and under
restricted access.
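The negative-list approach described above can be sketched in a few lines: every dataset category is treated as open unless it appears on the 'stop' list. The category names and the access class assigned to each are purely hypothetical examples, not taken from the NDSAP text; only the three access classes (open, registered, restricted) and the open-by-default rule come from the policy summary above.

```python
from enum import Enum


class Access(Enum):
    OPEN = "open"
    REGISTERED = "registered"
    RESTRICTED = "restricted"


# Hypothetical negative ('stop') list: only sensitive categories are listed.
# These entries are illustrative assumptions, not the policy's actual list.
NEGATIVE_LIST = {
    "defence-installations": Access.RESTRICTED,
    "individual-health-records": Access.REGISTERED,
}


def access_level(category: str) -> Access:
    """Return the access class for a data category; the default is open."""
    return NEGATIVE_LIST.get(category, Access.OPEN)


if __name__ == "__main__":
    for name in ("rainfall-measurements", "individual-health-records"):
        print(name, "->", access_level(name).value)
```

The design point is that nothing has to be added to a list for data to be released; only exceptions are enumerated.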
21. Indian National Data Sharing and
Accessibility Policy
Implementation Guidelines, Feb 2014:
http://data.gov.in/sites/default/files/NDSAP_
Implementation_Guidelines_2.2.pdf
Deals mostly with public data, but research
data produced by government institutes or
funded by the government is in scope.
For data to be reused, it needs to be
adequately described and linked to services
that disseminate the data to other
researchers and stakeholders. The current
methods of storing data are as diverse as
the disciplines that generate it. It is
necessary to develop institutional
repositories, data centers on domain and
national levels that all methods of storing
and sharing have to exist within the specific
infrastructure to enable all users to access
and use it.
22. India: Policies and Initiatives
Policies
Government of India National Data Sharing and Accessibility Policy (NDSAP),
March 2012
Followed by Implementation Guidelines, Feb 2014
The level of implementation is very unclear: lack of data infrastructure and
cultural barriers.
Key Initiatives
Lots of Open Access initiatives: e.g. in major universities; Librarians’ Digital
Library (LDL)
(https://drtc.isibang.ac.in); OpenMed@NIC (http://openmed.nic.in/) Open
Access self-archive for medical and allied sciences.
Specific initiatives: e.g. National Mission for Manuscripts
(http://namami.nic.in/) ; lots of activities around multilingual computing.
Little national infrastructure: activities around specific projects of
institutions and portals?
Slide Credit: Devika Madalli
23. China: Policies
MOST (Ministry of Science and Technology) has spent one year drafting a regulation
on open research data; publication is expected in 2018.
Previously there have been some data sharing policies at program level, for example,
NSTIC (National Science and Technology Infrastructure), CAS (Chinese Academy of
Sciences) scientific data program.
NSFC is pushing open research products, mainly focusing on OA for research papers.
Slide Credit: Jianhui LI
24. China: Key Initiatives
National Science and Technology Infrastructure, supported by MOST (the
Ministry of Science and Technology)
From an early stage (from 2001), supported the creation of 13
scientific data centres/sharing platforms covering the fields of agriculture,
forestry, seismicity, meteorology, marine science, earth system, population
and health, biology, chemistry, materials, energy, etc.
CAS Practice-Scientific Data Programme
Long-term mission started in 1986, funded by CAS
Many institutes involved; long-term, large-scale collaboration; data from
research, for research
Collecting multi-discipline research data and promoting data sharing
More than 350 research databases and 1350 datasets by 61 institutes
Over 600 TB of data available for open access and download
CAS Big Data Earth programme
The strategic Priority Research Program (Category A)
Almost 1.8 billion RMB for 5 years;
Biodiversity, ecology system, environment, natural resources, earth science
Slide Credit: Jianhui LI
25. RDA and Open Science
RDA is a unique global platform to synchronize between the national,
European, disciplinary and sector stakeholders to support the transition
towards Open Science, an Open World and Open Innovation. Our open,
neutral social platform where international research data experts meet,
covering 130 countries and more than 6000 individual members,
facilitates the exchange of views and alignment on topics at the heart of
Open Science, e.g. social hurdles on data culture, data stewardship and
training challenges, data management plans and certification of data
repositories to name just a few of the priorities addressed.
RDA is building the social and technical bridges that enable open sharing of
data to achieve its vision of researchers and innovators openly sharing data
across technologies, disciplines, and countries to address the grand challenges
of society.
rd-alliance.org/about-rda
WWW.RD-ALLIANCE.ORG
@RESDATALL
CC BY-SA 4.0
28. Digital Frontiers of Global Science
Frontier issues for research in a global and digital age.
Applications, progress and challenges of data intensive research.
Data infrastructure and enabling practices for international and
collaborative research.
Data, development and innovation: data as an interface between
research, industry, government, society and development.
30. Simon Hodson
Executive Director CODATA
www.codata.org
http://lists.codata.org/mailman/listinfo/codata-international_lists.codata.org
Email: simon@codata.org
Twitter: @simonhodson99
Tel (Office): +33 1 45 25 04 96 | Tel (Cell): +33 6 86 30 42 59
CODATA (ICSU Committee on Data for Science and Technology), 5 rue Auguste Vacquerie, 75016 Paris,
Thank you for your attention!
Slide Credits: Geoffrey Boulton, Jane Hunter, John Broome, Ernie Boyko, Devika Madalli, LI Jianhui
31. What Are the Barriers
Reasons why respondents are hesitant to share their data, n=7082
32. What Are the Barriers in Australia to
Advancing Open Data/Open Science
• Future funding situation – long term sustainable business models for “Aust
Research Data Cloud”
• Simple, fast, clear easy-to-use processes & services
• Data curation is expensive & time consuming
– Other priorities, no incentives
• University repositories vs national repositories (RDA) vs discipline
repositories ????
• Data licensing & agreements – govt & agency data
33. What are the key barriers in Canada to
advancing OR/OS?
• The lack of a national data services strategy. There have been
previous articulations of a strategy but all that exists is a patchwork.
• The tenure and promotion procedures used by research institutions
– Do not recognize data as a research product
– Competitive nature of obtaining grants
• Researcher reluctance is causing Tri-Agencies to move slowly on
implementing research data policies linking grants to data deposit
(even if data are not shared).
• Incomplete understanding of the issues by senior university
administration.
• Insufficient expertise and financial support to prepare data for
reuse.
• A shortage of suitable data repositories.
35. Göttingen-CODATA Symposium
18-20 March 2018
The critical role of university RDM infrastructure in transforming data to knowledge
http://conference.codata.org/2018-Goettingen-RDM/
An opportunity to share experiences, research and insights in the development
and implementation of RDM services in research institutions.
Special collection of Data Science Journal.
Themes: services and solutions; strategy; measuring success; skills and support;
sustainability; shared services and outsourcing / consortiums; service level, trust and FAIR;
champions and engaging with researchers.
Announcement of call for papers in the next week or so.
Deadline for abstracts, 15 December.