Are you a researcher, citizen scientist, institution or community looking for data storage and value-added services? Do you want access to tools to make your research data more FAIR (findable, accessible, interoperable, and reusable)? Interested in seeing how the future European Open Science Cloud could support research data and practically foster cross-border, cross-disciplinary collaboration? Then this webinar is for you!
2. The EUDAT Collaborative Data Infrastructure (EUDAT CDI) is
one of the major e-infrastructures supporting research in Europe.
EUDAT’s vision:
Data is shared and preserved across borders and disciplines
In line with this vision, EUDAT provides a set of interoperable
services that support European researchers in different steps
of the research data life cycle.
6/1/2021 2
EUDAT CDI in a nutshell
3. The EUDAT members
A growing network of
29 European research
organisations, data and
computing centres from
16 countries
8. The EUDAT CDI supports the vision of the European Open
Science Cloud by offering open and seamless services for storage,
management, analysis and re-use of research data, across
borders and scientific disciplines.
EUDAT and EOSC
10. eudat.eu @eudat_eu
EUDAT SERVICES
- SERVING THE FULL RESEARCH DATA LIFECYCLE -
THE
Mark van de Sanden (SURF, The Netherlands)
29 APRIL 2021 @DICE – EUDAT WEBINAR
11. Users and Data Generators: user functionalities, data capture & transfer, virtual research environments
Community Support Services: data discovery & navigation, workflow generation, annotation, interpretability
Common Data Services: persistent storage, identification, authenticity, workflow execution, mining
Trust, Data Curation
… a framework for the future
12. … the problem space
Community repositories
Institute repositories
Scientists' personal data
Homeless scientists
Citizen scientists
13. The CDI – A Service Infrastructure & Network
14. 1. Research data management is complex (tailored solutions
are often needed)
2. Research communities are heterogeneously organised
(some mature, some just starting)
3. Data management and storage services cannot easily be
dissociated from computing services (but behave differently)
4. Storage services require a specific kind of commitment over
a long period (typically >10 years)
Challenges
15. If there are hundreds of Research Infrastructures, how many different
data management systems can be sustained?
16. B2ACCESS – Authentication and Authorisation
B2DROP – Data Workspace
B2SAFE – Distributed, Secure Policy Based Data Storage
B2SHARE – Searchable Data Repository
B2STAGE – (High Performance) Data Movement
B2FIND – Searchable Metadata Aggregator
B2HANDLE – Persistent Identifier Provider
B2NOTE – Semantic Metadata Annotation
easy.DMP – Data Management Planning Assistant
GitLab – Git repository and collaborative software
development platform
Building blocks - different service types
17.
18. CREATING DATA: designing, planning consent, collection and management, capturing and creating metadata
PROCESSING DATA: entering, transcribing, checking & validating, anonymising and describing
ANALYSING DATA: interpreting, deriving, producing outputs & publishing, preparing for sharing
PRESERVING DATA: migrating, backing up, storing, creating metadata and documentation, archiving
GIVING ACCESS TO DATA: distributing, sharing, controlling access, promoting
RE-USING DATA: for follow-ups, new research, research reviews, scrutinising, teaching & learning
Ref: UK Data Archive
Research Data Life Cycle
21. CompBioMed is a European Commission
H2020-funded Centre of Excellence focused on
the use and development of computational
methods for biomedical applications.
CompBioMed
Use case: Data Movement, Processing,
Publication and Discovery
Bring data generated at research facilities
close to large-scale computing facilities
for processing, and publish the data in a
FAIR way.
22. [Workflow diagram] Steps shown: Data Transfer (scp/rsync/gridftp), Quality checks, Processing, Ingest Data, Register PID, Replicate Data, Publish, Harvest, Discover
Sites shown: Hamburg, DE; London, UK; Munich, DE; Amsterdam, NL; Barcelona, ES; Grenoble, FR
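As an illustration of the data transfer and quality check steps on this slide: one common check after copying a file with scp, rsync or gridftp is to compare checksums at both ends. The slide does not say which checks are actually performed, so the sketch below is an assumption, and the function names `sha256_of` and `transfer_is_intact` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large research files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(source: Path, copy: Path) -> bool:
    """Hypothetical quality check: the copy must hash to the same digest as the source."""
    return sha256_of(source) == sha256_of(copy)
```

In practice the two digests would be computed on different machines and compared afterwards; here both paths are local only to keep the sketch self-contained.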
23. The Language Resource Switchboard
(LRS) supports users in processing and
analysing language resources. The
LRS has been integrated with B2DROP
to provide easy access.
Use case: Connecting data to
analysis tools
CLARIN makes digital language resources
available to scholars, researchers,
students and citizen-scientists from
all disciplines, especially in the
humanities and social sciences,
through single sign-on access. CLARIN
offers long-term solutions and
technology services for deploying,
connecting, analyzing and sustaining
digital language data and tools.
69 CLARIN Centres
CLARIN ERIC
25.
Conducts research to provide
comprehensive solutions to the
grand challenges facing society in
the fields of energy and
environment, information and brain
research. Our aim is to lay the
foundation for the key technologies
of tomorrow.
Organisational level out-of-the-box
FAIR data repository to support
local communities
Use case: FAIR Data Repository
26.
Organisational level out-of-the-box FAIR data
repository to support publication of data from a
secure environment
Use case: FAIR Data Repository for
sensitive data
TSD is a platform for researchers
at UiO and other public
research institutions.
It provides a secure project
area to collect, store and
analyze sensitive research
data in a secure
environment, accessible
from anywhere in the world.
27. CDI Service Offering (advanced)
Data Analysis and/or Processing: short-term storage resources, tightly coupled to computing resources, used during data processing and analysis
Personal and/or Project Workspace: mid-term storage resources during a project, with a good connection to computing facilities, to store active research data
Basic Data Archive: mid-term storage resources after/between projects, accessible from computing facilities, to store non-active research data; bitwise preservation
Advanced Data Archive: mid- to long-term data resources, accessible from computing facilities, to store non-active research data; high-level data management
Data Repository: long-term data resources, accessible from computing facilities, to store non-active FAIR data
Data Discovery: making FAIR research data findable
28. eudat.eu @eudat_eu
QUESTIONS?
Mark van de Sanden
mark.vandesanden@surf.nl
EUDAT Technical Coordinator
All credits go to the contributors from EUDAT partners and the EUDAT CDI
(Damien Lecarpentier, Debora Testi, Johannes Reetz, Rob Carrillo, …)