Jisc Research Data Shared Service Open Repositories 2018 Paper


  1. Jisc Research Data Shared Service: Looking at the past, looking to the future
  2. Needs
     » A better name… “What's in a name? That which we call a rose / By any other word would smell as sweet;” Prize available! @johnpkaye
     Image: http://pngimg.com/download/651/?i=1
  3. Content
     » Jisc Introduction
     » International Policy Context
     » Jisc Response
     » RDSS Pilot
     07/06/2018, Building a National Data Service
  4. Who we are
     Jisc is the UK higher, further education and skills sectors’ not-for-profit organisation for digital services and solutions. We…
     » Operate shared digital infrastructure and services
     » Provide trusted advice and practical assistance for universities, colleges and learning providers
     » Negotiate sector-wide deals and conditions with IT vendors and commercial publishers
  5. Jisc Digital Futures
     [Diagram labels] Store services; Playlists; Diagnostic tool builder; Curation and remix; Learner Analytics Services; Digital capability; Learning analytics; Digital launchpad; Apprentice workforce development; Digital leadership; Summer of student innovation; Analytics academy; Analytics labs; Qualification verification; App and content store; Research data discovery; Research data usage metrics; Equipment data; Repository and preservation platform; Research data shared service; ?
  6. Open science
     » Open Science: an umbrella term for a technology and data driven systemic change in how researchers work, collaborate, share ideas, disseminate and re-use results, by adopting the core values that knowledge should be reusable, modifiable and redistributable. This allows us to meet the increasing demand in society to address the societal challenges of our time.
     https://vimeo.com/161464468
     Open as possible & as closed as necessary
     PV 2018, open science research data
  7. Why open science, research data sharing?
     » Innovation and impact
     » Meet global challenges with new forms of research
     » Supports research integrity and better research
     » Accountability
     https://royalsociety.org/topics-policy/projects/science-public-enterprise/report/
     http://doi.org/10.1371/journal.pone.0026828
  8. G7 Science Ministers, Open Science, 2017
     “We recognize that ICT developments, the digitisation and the vast availability of data, efforts to push the science frontiers, and the need to address complex economic and societal challenges, are transforming the way in which science is performed towards Open Science paradigms. We agree that an international approach can help the speed and coherence of this transition, and that it should target in particular two aspects. First, the incentives for the openness of the research ecosystem: the evaluation of research careers should better recognize and reward Open Science activities. Secondly, the infrastructures for an optimal use of research data: all researchers should be able to deposit, access and analyse scientific data across disciplines and at the global scale, and research data should adhere to the FAIR principles of being findable, accessible, interoperable, and reusable.”
     September 2017, http://www.g8.utoronto.ca/science/2017-science-communique.html
  9. UK Policy
     » Awareness of regulatory environment
     » Data access statement
     » Policies and processes
     » Data storage
     » Structured metadata descriptions
     » DOIs for data
     » Data securely preserved for a minimum of 10 years from last use
     » University roadmaps in place 2012, mandate in place from 1 May 2015
  10. Solution
     Shared research data service to meet the requirements for universities to enable better management of research data
     Web: https://www.jisc.ac.uk/rd/projects/research-data-shared-service
     GitHub: https://github.com/JiscRDSS
     Blog: https://researchdata.jiscinvolve.org/wp/
  11. Research Data Preservation Challenge
     Implementing Archivematica for research data preservation at York and Hull
     Jenny Mitcham (Digital Archivist), University of York
     26 April 2018, Jisc ResearchSupport
  12. Co-Design
  13. Key researcher issues driving investment in Preservation
     Source: Jisc DAF Survey results 2016
     Stages: Capture & reuse; Preserve; Report; Advise & best practice
     » Filling a gap: 75% of respondents look first to their institution to preserve their data
     » Uptake of RDM: Only 40% of respondents have a Research Data Management plan
     » Advocacy: Only 16% of respondents are currently accessing university RDM support services
     » Metadata: Only 18% of respondents say they follow established metadata guidelines
     » Public datasets: >70% recognise that research is a public good and should be publicly released
     » Sensitive data: 41% of respondents have some form of sensitive data
  14. Preservation
     “Support is woeful in the university currently, in particular long-term data archiving is critically required. Most of my non-current data is rotting on CD's and hard-drives.”
  15. Preservation Challenges
     » Automated preservation workflow
       › Lack of resources
       › Preservation sausage machine
     » Interactive preservation workflow
       › Work with the researcher and their data
     » Appropriate Workflow
       › What is an appropriate workflow?
  16. This is it!
     [Architecture diagram; components are connected through APIs] Preservation Systems; Multi-tenant administration; Discovery User Interfaces and Portals; Tenant User Interfaces; Jisc Reporting; Tenant Storage; Jisc Repository; Core Infrastructure; Metadata Store; Publish Subscribe Messaging Service; Cloud Data Storage (Access and Archival); Tenant Repository, CRIS and research systems; Scholarly Communications, Service APIs
  17. Service workflow summary
     Components: Repository; Messaging Layer; Preservation service; Reporting and analytics; Archival data storage; National research data aggregation; Institutional or external services
     1.a. Researcher deposits data, or
     1.b. Record of data external deposit
     2. Data added to aggregation
     3. Data is automatically preserved
     4. Use of data and service is monitored
     5. Other services are updated
     6. Researchers find and reuse data
     7. Data stored long term
  18. Pilot Alpha MVP
  19. Demo
     Pre-recorded short demo showing data being uploaded into a test Samvera repository instance, with automatic ingest into Preservica.
     https://www.youtube.com/watch?v=d-l1ARNUwWA&t=13s
  20. Multi-tenant Research Repository: What we’re working on
  21. Data Model
     https://github.com/JiscRDSS/rdss-canonical-data-model
  22. Samvera Community Feedback
     » For discussion
     » Open for comment
     » https://goo.gl/g2BGkQ
  23. Production Service
     » Scalable
     » Sustainable
     » Intuitive
     Image: Nick Youngson, CC BY-SA, http://www.thebluediamondgallery.com/handwriting/i/insurance-requirements.html
  24. Production Priorities
     » First priority is research data
       › Research output (Article/Thesis etc.)
       › Research data
       › Research software/code
       › Provenance metadata (method)
     » But also…
       › Preservation systems tailored for multiple digital objects and data types
       › Use cases and pilots for objects beyond research data
     Image: https://creativecommons.org/licenses/by/2.0/ https://www.flickr.com/photos/cogdog/
  25. Production: Where are we now?
     [Architecture diagram; components are connected through APIs] Preservation Systems; Multi-tenant administration; Discovery User Interfaces and Portals; Tenant User Interfaces; Jisc Reporting; Tenant Storage; Jisc Repository; Core Infrastructure; Metadata Store; Publish Subscribe Messaging Service; Cloud Data Storage (Access and Archival); Tenant Repository, CRIS and research systems; Scholarly Communications, Service APIs
  26. What we’re working on
  27. Part of the Jisc family
     [Diagram: Research publication lifecycle and Jisc services]
     Lifecycle stages: Submission; Acceptance; Publication; Use
     Activities: Select Journal; Check compliance; Manage costs; Deposit in repository; Report on compliance; Maximise impact; Record reach; Record impact
     Jisc services: SHERPA JULIET; SHERPA RoMEO; SHERPA REF; SHERPA Fact; Monitor UK; Jisc collections; OpenDOAR; Publications Router; Monitor local; CORE; IRUS-UK; RIOXX; ORCID support; OpenAIRE NOAD; Research Data Shared Service; Metrics lab experiment
  28. 3 standard service options
     » End-to-end service
     » Repository service
     » Preservation service
     Service to be launched in Autumn 2018
     All 3 options include:
     » Financial benefits
     » Standards
     » Advisory
     » Network membership
  29. [Slide without extractable text]
  30. [Slide without extractable text]
  31. [Slide without extractable text]
  32. [Slide without extractable text]
  33. Preservation action registry (PAR)
     Arkivum, Artefactual, Preservica, Open Preservation Foundation & Jisc
     » Sharing information on digital formats, actions (e.g. conversion) and tools (e.g. JHOVE) for preservation via APIs
     » Exchange between systems and organisations on preservation policies and solutions
     » Capture best practice and give confidence for preservation actions, accelerate adoption of digital preservation
     » Prototype based on Archivematica’s Format Policy Register and Preservica’s Linked Data Registry as part of the Jisc research data shared service
  34. Jisc national shared research platform
     Information sources
     » Publications Router
     » Publishers
     » Crossref
     » ORCID
     » DataCite
     » PubMed
     » Sherpa policy tools
     University systems
     » (Single Sign-On, Finance, HR…)
     Information destinations
     » Google etc.
     » Discovery services
     » Jisc CORE (global OA aggregation)
     » Jisc Monitor (compliance checking)
     » Jisc Collections
     » Funders' systems
     » OpenAIRE + for EU
     Preservation services; Reports and dashboards
     University X repository: Open Access publications; Research datasets
     University Y repository: Open Access publications; Research datasets
     University Z repository: Open Access publications; Research datasets
  35. Under Development: Tiered Storage Service
  36. jisc.ac.uk
     John Kaye, Head of Change – Research
     john.kaye@jisc.ac.uk, @johnpkaye
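
The service workflow on slide 17 and the "Publish Subscribe Messaging Service" on slide 16 suggest an event-driven design: a deposit is announced once, and the aggregation, preservation, and reporting services each react to it. The sketch below illustrates that pattern only. It is a minimal in-memory example, not the RDSS implementation; the `MessageBus` class, the `dataset.deposited` topic name, and the message fields are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MessageBus:
    """Hypothetical in-memory publish-subscribe bus (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a handler to be called for every message on this topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to each subscriber in registration order.
        for handler in self._subscribers[topic]:
            handler(message)


def run_deposit_demo() -> List[str]:
    """Simulate one deposit and return what each service recorded."""
    log: List[str] = []
    bus = MessageBus()

    # Each downstream service subscribes independently of the repository.
    bus.subscribe("dataset.deposited",
                  lambda m: log.append(f"aggregation: indexed {m['id']}"))
    bus.subscribe("dataset.deposited",
                  lambda m: log.append(f"preservation: queued {m['id']}"))
    bus.subscribe("dataset.deposited",
                  lambda m: log.append(f"reporting: counted {m['id']}"))

    # The repository publishes a single event when a researcher deposits data.
    bus.publish("dataset.deposited", {"id": "dataset-001", "title": "Example"})
    return log


if __name__ == "__main__":
    for line in run_deposit_demo():
        print(line)
```

The point of the pattern is that the repository does not need to know which services consume the event, so a new consumer can be added by subscribing rather than by changing the repository.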

Editor's notes

  • A definition of open science and its scope: open science covers the whole of the research lifecycle, from creation through collaboration and analysis, while being as open as possible and as closed as necessary. For example, some data is sensitive, and there may be rights to first use.
    One of the key ways to describe open science is via the FAIR principles: findable, accessible, interoperable and reusable. So it is not just about having access to a paper or data; it is about being able to reuse it for multiple purposes.

  • The "Spanish cucumber" E. coli outbreak. The strain's genome was analysed within weeks of the outbreak because of a global and open effort; data about the genome sequence were released freely over the internet as soon as they were produced. We often see this in human-related emergencies, so imagine if this were the case for more research and applied more widely.
    Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results
  • This is an international, global effort, and the G7 sees it as important: the focus is on incentives, the change towards open practice, and infrastructure/FAIR.
    The vision here is also very similar to that of the European Open Science Cloud (EOSC). Jisc is active in the EOSC, and it is relevant to Jisc remaining an essential part of the global infrastructure for research.
    The French president recently stated that publicly funded research data should be open by default. This was alongside the AI and innovation agenda: for AI to meet its potential, open data is needed.
  • Research organisations have primary responsibility for ensuring that researchers manage their data effectively. They need established infrastructure and processes to ensure:
    Retained EPSRC-funded research data is preserved for a minimum of ten years
    Effective data curation is provided throughout the full data lifecycle
    Knowledge of publicly-funded research data holdings
    Discoverability; recording of third party access requests
    Notice and justification of access restrictions, for example ‘commercially confidential’
    Awareness and use of relevant law, for example FOI
    Awareness and compliance with research data policies
    Adequate RDM resource allocation, for example from quality-related research (QR) funding or research grants

  • More effective Research Data Management must happen to comply with Funder Mandates, ensure data is not lost, and to realise a whole range of positive benefits
    A shared service (provided by Jisc) seems to offer a number of benefits:
    Cost savings and efficiencies
    Common approaches and practice – do this together
    Research system standardisation and interoperability (do it once rather than many times, and address it across essential systems so we can key once and share)
    Address market gaps

  • The long tail

    The long tail of unidentifiable files that we will have to deal with
    Mention Jenny Mitcham's stats: around 60% of items in the RDM collection are unidentifiable using existing workflows
    PDFs: easy to deal with, as the problem is solved by global initiatives, e.g. JHOVE, veraPDF

  • We worked with a lot of people!

    How and why we’ve got to where we are
    Pilots
    Worked with 16 pilot institutions of various sizes 
    Created a procurement framework 

    Pilots
    17 institutions
    Cross spectrum use cases
    Large and small
    Collaborative development
    Multiple suppliers and solution vendors

    Drivers
    More than £5 million investment over 2 years
    Open access
    Sector defined requirements
    “R@R” co-design
    Over half the HEI sector involved
  • Automated preservation workflow:
    Institutions lack the resources (staff, skills, budget, time) to do a comprehensive job of preserving all their research datasets and will instead want a low-cost, fully automated, 'black box' approach to digital preservation of at least some of their data.
    They want a 'preservation sausage machine' whereby research data is fed in at one end and 'preservation packages' come out of the other, containing the research data in a form that is better described and structured for long-term usability.
    Interactive preservation workflow:
    Institution will want to work closely with both the Research Data and the Researcher as part of an iterative process of quality control and digital
    This is a more interactive and resource intensive process than 'automated' preservation, but can yield better results and may be more appropriate for specific types of research or institution.
    The most appropriate workflow to use will depend on many factors, e.g. the experience an institution has with digital preservation, the resources at its disposal, the research discipline or type of data involved, the requirements of the research funder, the institution's policy and so on.
  • How and why we’ve got to where we are

  • Where are we now
    Multi tenant user interface
    Workflows
  • Obligatory slide that can't be read by anyone
    It wouldn’t be an RDSS presentation without it
    See Dom if you want to read it

    Data model, APIs, and reference (example) APIs and data are available to develop against
  • First priority is research data
    However, research data is not good enough on its own for research publication; the ideal is to store, or link to, the complete research package:
    Research output (Article/Thesis etc.)
    Research data
    Research software/code
    Provenance metadata (method)
    Without this, research is often not reusable or repeatable.
    But… we are aware that we are providing preservation systems that are tailored to deal with all kinds of digital objects and data.
    We are exploring use cases and pilots for objects beyond research data.
  • » Scalable multi-tenant repository and preservation infrastructure for multiple content types
    » Intuitive user experience - automated workflows
    » Financially sustainable
    » Multiple storage options
    » Event based open APIs
    » Reporting APIs and dashboards
     
  • Where we are now
    Core Architecture
    multitenant database
    Interoperability layer
    Data model
    Proof of concept front end API
    Initial Front end design

  • Where are we now
    Multi tenant user interface
    Workflows
    Integrations with open access service

    Core architecture consisting of a multi-tenant database with technical, descriptive and event metadata.
    Proof-of-concept API showing that we can put a lightweight front end on the multi-tenant core architecture.
    Begun design of front-end development with our Design and User Experience specialists.
  • Integrations with open access service
  • This year, the University of Westminster will be piloting RDSS within their existing information infrastructure.
    Since 2013, the University has been using Haplo Research Manager as their Virtual Research Environment.
    As well as CRIS, postgraduate research workflows and research ethics functions, it provides the University's repository, integrating REF and Open Access management.
    It has excellent buy-in from researchers, as a "one stop shop" for research information with a compelling user experience.
    This is their current setup, with a custom hybrid Haplo/EPrints repository.
    Information flows from University systems into Haplo.
    Researchers self-deposit outputs into the Haplo Repository module, and a workflow helps them work with the metadata team to prepare their outputs for publication.
    When published, outputs are pushed to EPrints, which provides the public view of the repository.
    The researcher's full research profile, including publications, is pushed to the corporate web site.
  • By August 2018, Westminster will be using an "all Haplo" Repository.
    EPrints will be retired, and Haplo will provide the public interface of the Westminster repository.
    The project is led by the Research and Scholarly Communications team based in Library and Archives at UoW, in collaboration with University of Westminster academics.
    The move is motivated by a desire to use a standard repository, using the Haplo core product with a small amount of local configuration on top.
    Haplo Repository provides many "building blocks", and Westminster have selected the ones most appropriate for their research: in particular, collections workflow, support for practice-based research, and REF & OA reporting.
    Westminster are working with two other institutions to define common workflows and reporting, aiming to launch their new repositories at roughly the same time.
  • By the end of the year, Westminster will be using RDSS services for preservation.
    Haplo already has initial support for the RDSS message bus.
    The Haplo developers like the RDSS because a single interface allows integration with many different research data components: "we get many integrations for the price of one".
    The flexibility of the RDSS allows Westminster to pick the components they need, and easily integrate them into their research information infrastructure.
  • Actions = checksums, file conversions (for normalization or rendering, e.g. TIFF to JPEG), etc.
    These act on objects and create the AIP
    Actions are executed by tools, e.g. DROID, JHOVE
    Tools can be general purpose or discipline-specific

  • Building foundations for the future
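
The Preservation Action Registry notes above describe actions (such as checksums or format conversions) that act on objects and are executed by tools. As a rough sketch of that relationship, and not the PAR data model or API, the example below registers one action backed by one tool and applies it to an object's bytes. The `Action` class, the `REGISTRY` dictionary, and `apply_action` are hypothetical names; Python's standard `hashlib` stands in for a real preservation tool.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Action:
    """Hypothetical registry entry: a named action and the tool that runs it."""
    name: str
    tool: str
    run: Callable[[bytes], str]


def sha256_checksum(data: bytes) -> str:
    # Fixity check: hex-encoded SHA-256 digest of the object's bytes.
    return hashlib.sha256(data).hexdigest()


# Assumed in-memory registry; a real registry would be queried via an API.
REGISTRY: Dict[str, Action] = {
    "checksum": Action(name="checksum", tool="hashlib.sha256", run=sha256_checksum),
}


def apply_action(action_name: str, data: bytes) -> Dict[str, str]:
    """Look up an action, execute it with its tool, and record the outcome."""
    action = REGISTRY[action_name]
    return {"action": action.name, "tool": action.tool, "result": action.run(data)}


if __name__ == "__main__":
    record = apply_action("checksum", b"example research data")
    print(record["action"], record["tool"], record["result"])
```

Recording the tool alongside the result mirrors the idea in the notes that knowing which tool performed an action gives confidence in the preservation outcome.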
