EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 www.eudat.eu
Data Management Plans –
EUDAT best practices and case study
Stéphane COUTIN (CINES)
26 April 2017
This work is licensed under the Creative
Commons CC-BY 4.0 licence
Objectives
High level presentation of research
data management and H2020 context
Present a simple approach and draft a
DMP for a given case.
Overview of EUDAT services
CINES
Based in Montpellier, France – approx. 60 people
(engineers, techs, admin)
Created in 1999; formerly CNUSC (Centre National
Universitaire Sud de Calcul), itself created in 1980
Administered and funded by the Ministry of Higher
Education & Research (MESR)
Provides the French public research
community with computing
resources, services and
expertise
3 main mandates / activities:
High performance computing
Digital preservation
Hosting
Centre Informatique National de l’Enseignement
Supérieur
Stéphane COUTIN coutin@cines.fr
Research engineer at CINES since 2013
Initially in digital preservation dept
Now working in HPC dept
Involved in EU projects
Leading the PRACE task on collaboration with other
e-Infrastructures
Working on EUDAT for collaboration with PRACE
Background in Information Systems projects and
programmes management
EUDAT – www.eudat.eu
Image CC-BY-NC ‘Data centre’ by Bob Mical
www.flickr.com/photos/small_realm/15995555571
A pan-European e-Infrastructure solution for pan-
European RI data Challenges
All RIs are facing data challenges
Where to store the growing amount of data?
How to find it?
How to make the most of it?
Solutions are needed at pan-European level
We need to promote synergies
Some services are common to many
communities
Costs and investments can be
optimised
Better integration of e-infras and
research infrastructures can be
achieved
e-Science Data Factory
EUDAT2020 - 35 Partners
A truly pan-European Infrastructure
EUDAT offers common data services, supporting
multiple research communities as well as individuals,
through a geographically distributed, resilient network
of 35 European organisations
The EUDAT vision is to enable
European researchers and
practitioners from any research
discipline to preserve, find,
access, and process data in a
trusted environment, as part of
a Collaborative Data
Infrastructure
PRACE – EUDAT collaboration
Joint Open Calls for proposals
EUDAT offering data services and resources
through regular PRACE calls
Review process is transparent to users
Joint training activities
Continuous technical discussion and
developments of new components
Definition of the EUDAT Workspace area
Synchronization of authentication credentials
for single sign-on
Quick question:
Think of your ongoing or upcoming HPC project
What are your data-related requirements?
What is the budget for this?
THE CHANGING DATA LANDSCAPE
Image CC-BY-SA ‘data.path Ryoji.Ikeda - 3’ by r2hox www.flickr.com/photos/rh2ox/9990016123
Data explosion
More and more data is
being created
Issue is not creating
data, but being able to
navigate and use it
Data management is
critical to make sure
data are well-organised,
understandable and
reusable
Digital data are fragile and susceptible to loss for a wide variety of reasons
Natural disaster
Facilities infrastructure failure
Storage failure
Server hardware/software failure
Application software failure
Format obsolescence
Human error
Malicious attack
Loss of staffing competencies
Loss of institutional commitment
Loss of financial stability
Changes in user expectations
Data loss
Image CC-BY ‘Hard Drive 016’ by Jon Ross www.flickr.com/photos/jon_a_ross/1482849745
Link rot – more 404 errors
generated over time
Reference rot* – link rot
plus content drift i.e.
webpages evolving and
no longer reflecting
original content cited
* Term coined by Hiberlink http://hiberlink.org
Data persistency issues
Jonathan D. Wren Bioinformatics 2008;24:1381-1385
Why manage research data?
To make your research easier!
To stop yourself drowning in irrelevant stuff
In case you need the data later
To avoid accusations of fraud or bad science
To share your data for others to use and learn from
To get credit for producing it
Because funders or your organisation require it
Well-managed data opens up opportunities
for re-use, integration and new science
H2020 open research data pilot
• Already expanded from a select pilot to all work
areas
• All need to consider which data can be made
open
• Mantra = “As open as possible, as closed as
necessary”
• Underlying driver is good (FAIR) data
management
Image CC-BY-SA by SangyaPundir
Key requirements of the open data pilot
Beneficiaries participating in the Pilot will:
Deposit data in a research data repository of
their choice
Take measures to make it possible for others to
access, mine, exploit, reproduce and
disseminate the data free of charge
Provide information about tools and instruments
necessary for validating the results (where
possible, provide the tools and instruments
themselves)
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Suggested DMP creation process
Simple diagram focusing on data dynamics
You can use other diagram types
DFD : Data Flow Diagram
You and your team are submitting a proposal for a project in the domain of smart cities.
The City has implemented a large set of sensors measuring traffic. The data are collected
in the City datacenter.
You want to develop an application able to forecast the traffic and how it will be
impacted by events such as planned roadworks. This application would run on a PRACE
site, not located in the City. On the PRACE site your storage space is limited to 10 TB.
The application uses the following inputs:
• Historical sensor data over the last 12 months: sensors produce 1 TB of data a day.
You implement a preprocessing module translating those data into a reduced data set
(10 MB per day), based on a format you have defined to describe the traffic.
• The results provided by the simulation. This enables comparison between forecasted
and actual traffic in order to ‘train’ the application.
• Weather data (historical and forecast) provided by the national meteorological agency,
in the SYNOP format. The volume is negligible.
Results will be accessible to the city council employees.
Create the project data flow diagram and fill in the data summary chapter using a
table.
What would help you use the weather data efficiently?
Exercise – Phase 1
Data summary table
Dataset | Description | Origin? Existing? | Format | Size | Who could use it?
Proposed data flow diagram
[Diagram. Zones: Sensors collection area; PRACE HPC Site. Elements: Raw sensor data, Data preprocessing, Reduced sensor data, Weather data, Data transfer, PRACE storage (Input files, Output files), Simulations, Extractor, City council employees.]
Data summary table
Dataset | Description | Origin? Existing? | Format | Size | Who could use it?
Raw sensor data | | Available, collected from sensors | Various | 1 TB per day |
Reduced sensor data | Actual traffic, … | Extracted from raw sensor data | Binary (specific) | 10 MB a day | Our simulation
Weather data | Actual and forecast | Existing. Meteo open data platform | SYNOP | 1 MB a week | Our simulation; citizens, scientists, ..
Simulation results | Forecasted traffic | Results of our simulation | Binary (specific) | 10 MB a day | City council employees, our application
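The sizes in the case study can be checked with a little arithmetic. The sketch below is only illustrative: it assumes decimal units, approximates "the last 12 months" as 365 days, and uses made-up names (`daily_volume`, `fits_in_quota`). It shows why the raw sensor data cannot be copied to the PRACE site, while the reduced data, the results and the weather data fit easily.

```python
# Back-of-the-envelope volumes for the smart-city case (decimal units).
TB = 10**12
MB = 10**6

DAYS = 365               # assumption: "last 12 months" taken as 365 days
PRACE_QUOTA = 10 * TB    # storage limit on the PRACE site (from the exercise)

# Daily volume per dataset, in bytes, as given in the data summary table.
daily_volume = {
    "raw sensor data": 1 * TB,
    "reduced sensor data": 10 * MB,
    "simulation results": 10 * MB,
    "weather data": 1 * MB / 7,   # 1 MB a week
}

# Volume accumulated over one year, per dataset.
yearly_volume = {name: size * DAYS for name, size in daily_volume.items()}


def fits_in_quota(names, quota=PRACE_QUOTA):
    """Return True if one year of the named datasets fits in the quota."""
    return sum(yearly_volume[name] for name in names) <= quota


for name, size in yearly_volume.items():
    print(f"{name:20s} {size / TB:10.4f} TB per year")

print("raw data fits on PRACE:", fits_in_quota(["raw sensor data"]))
print("reduced inputs and results fit:",
      fits_in_quota(["reduced sensor data", "simulation results", "weather data"]))
```

One year of raw data is about 365 TB, far above the 10 TB limit, which is why the preprocessing step sits on the City side of the data flow diagram.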
Research data lifecycle
[Diagram: cycle of six stages: Creating data, Processing data, Analysing data, Preserving data, Giving access to data, Re-using data]
CREATING DATA: designing research, DMPs, planning consent, locate existing data, data collection and management, capturing and creating metadata
PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, manage and store data
ANALYSING DATA: interpreting & deriving data, producing outputs, authoring publications, preparing for sharing
PRESERVING DATA: data storage, back-up & archiving, migrating to best format & medium, creating metadata and documentation
ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data
RE-USING DATA: follow-up research, new research, undertake research reviews, scrutinising findings, teaching & learning
Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
Findable
– Assign persistent IDs, provide rich metadata, register in a searchable
resource,...
Accessible
– Retrievable by their ID using a standard protocol, metadata remain accessible
even if data aren’t...
Interoperable
– Use formal, broadly applicable languages, use standard vocabularies,
qualified references...
Reusable
– Rich, accurate metadata, clear licences, provenance, use of community
standards...
FAIR for machines as well as people
www.force11.org/group/fairgroup/fairprinciples
Making data FAIR
Bitstream
Persistent Identifier
Metadata
Digital objects can be
aggregated to digital
collections
What is a digital object?
Digital object example
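As a rough illustration of the digital object idea (bitstream plus persistent identifier plus metadata, aggregated into collections), here is a minimal sketch. The class names, field names and the example identifiers are hypothetical and do not come from any EUDAT service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    """A bitstream together with its persistent identifier and metadata."""
    pid: str          # persistent identifier (hypothetical value in the example)
    bitstream: bytes  # the raw content
    metadata: dict    # descriptive information about the content


@dataclass
class DigitalCollection:
    """An aggregation of digital objects; it carries its own PID and metadata."""
    pid: str
    metadata: dict
    members: list = field(default_factory=list)

    def add(self, obj: DigitalObject) -> None:
        self.members.append(obj)

    def member_pids(self) -> list:
        return [obj.pid for obj in self.members]


# Example with made-up identifiers and content.
day1 = DigitalObject("hdl:0000/traffic-day-1", b"\x00\x01", {"title": "Reduced traffic data, day 1"})
day2 = DigitalObject("hdl:0000/traffic-day-2", b"\x02\x03", {"title": "Reduced traffic data, day 2"})

collection = DigitalCollection("hdl:0000/traffic-2017", {"title": "Reduced traffic data 2017"})
collection.add(day1)
collection.add(day2)
print(collection.member_pids())
```

The point of the structure is that the identifier and the metadata travel with the content, and that a collection is itself identifiable and described.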
Metadata and documentation are needed to locate
and understand research data
Think about what others would need in order to find,
evaluate, understand, and reuse your data.
Get others to check the metadata to improve quality
Use standards to enable interoperability
Metadata and documentation
Use of standards
Controlled vocabularies for unambiguous keywords
Simple, complete and consistent information
Appropriate description
Explanation of limitations to support reuse
Avoid special characters e.g. !@<~ etc...
Provide persistent identifiers such as DOIs
What makes metadata good?
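Some of these criteria can be checked automatically before a record is published. The following is a minimal sketch under stated assumptions: the field names, the keyword vocabulary and the function `check_metadata` are invented for illustration, and the DOI pattern is only a loose shape check, not a guarantee that the identifier resolves.

```python
import re

# Hypothetical controlled vocabulary for the smart-city example.
CONTROLLED_KEYWORDS = {"traffic", "weather", "simulation"}

# Characters the slide advises against, e.g. ! @ < ~
SPECIAL_CHARS = re.compile(r"[!@<~]")

# Loose shape of a DOI: "10.", a registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def check_metadata(record: dict) -> list:
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    for key in ("title", "description", "identifier", "keywords"):
        if not record.get(key):
            problems.append(f"missing field: {key}")
    if SPECIAL_CHARS.search(record.get("title", "")):
        problems.append("special characters in title")
    for keyword in record.get("keywords", []):
        if keyword not in CONTROLLED_KEYWORDS:
            problems.append(f"keyword not in vocabulary: {keyword}")
    identifier = record.get("identifier", "")
    if identifier and not DOI_PATTERN.match(identifier):
        problems.append("identifier does not look like a DOI")
    return problems


good = {
    "title": "Forecast traffic for the City, 2017",
    "description": "Daily forecast produced by the simulation.",
    "identifier": "10.1234/example",   # made-up DOI
    "keywords": ["traffic", "simulation"],
}
bad = {"title": "Traffic <draft>!", "keywords": ["cars"], "identifier": "abc"}

print(check_metadata(good))
print(check_metadata(bad))
```

Automated checks only cover the mechanical criteria; whether a description is appropriate still needs a human reviewer, as suggested earlier.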
The good and the bad
More precise and standardised | Ambiguous
Metres / seconds | Furlongs and fortnight
2015-09-10T15:00:01+01:00 | 10th Sept. 2015 15:00:01
Longitudinal wind speed | U
PDF 1.7 | PDF
2008 US Population statistics | Population statistics
Barcelona, Venezuela | Barcelona
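The timestamp pair above shows the practical benefit of a standard. As a small sketch using only the Python standard library, the ISO 8601 value parses directly and converts to UTC without guesswork, whereas the free-text form is rejected by the same parser and would need extra assumptions (which time zone? which date layout?).

```python
from datetime import datetime, timezone

precise = "2015-09-10T15:00:01+01:00"   # ISO 8601, with explicit UTC offset
ambiguous = "10th Sept. 2015 15:00:01"  # free text, no time zone

# The standardised form can be parsed and normalised mechanically.
moment = datetime.fromisoformat(precise)
as_utc = moment.astimezone(timezone.utc)
print(as_utc.isoformat())

# The free-text form is not accepted by the ISO parser.
try:
    datetime.fromisoformat(ambiguous)
    ambiguous_parsed = True
except ValueError:
    ambiguous_parsed = False
print("ambiguous form parsed:", ambiguous_parsed)
```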
Digital preservation context
Main risks deal with:
• Comprehension
• Integrity
• Exploitation
• Valorization
Quality assurance
procedures to be set up for
• Metadata
• File formats
• Representation information
• Storage
• Access
• Technology watching
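For the integrity risk, a common quality assurance procedure is a fixity check: record a checksum when the data are deposited and recompute it later. The sketch below assumes SHA-256 and invented function names; it is not a description of how CINES or any EUDAT service implements this.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of a bitstream as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_checksum: str) -> bool:
    """Return True if the data still match the checksum recorded at deposit."""
    return sha256_of(data) == recorded_checksum


# At deposit time: store the checksum alongside the metadata.
original = b"reduced traffic data, day 1"
recorded = sha256_of(original)

# Later: an unchanged copy passes, a corrupted copy fails.
corrupted = original + b"\x00"
print("unchanged copy ok:", verify_fixity(original, recorded))
print("corrupted copy ok:", verify_fixity(corrupted, recorded))
```

A fixity check detects storage failures and accidental changes; it does nothing for the comprehension risk, which depends on metadata and representation information.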
Exercise phase 2 updated DFD
EUDAT services
CDI Data Domain
EUDAT Data Domain modelled on the ANDS* Data Curation Continuum
* Australian National Data Service – www.ands.org.au
B2DROP – personal cloud
Sync and Share Research Data
b2drop.eudat.eu
An ideal solution for researchers and scientists to:
• store and exchange data with colleagues and team members, including research data not finalized for publishing
• share data with fine-grained access controls
• synchronize multiple versions of data across different devices
Features:
• 20 GB storage per user
• Living objects, so no PIDs
• Versioning and offline use
• Desktop synchronisation
B2SHARE - repository
Store and Publish Research Data
b2share.eudat.eu
A winning solution for researchers, scientists and communities to:
• store data safely at a trusted and certified data centre
• preserve data to guarantee long-term persistence
• control access and share data with colleagues and the world
Features:
• Metadata management
• Permanent PIDs
• Open Access support
B2SAFE - preservation
Replicate Research Data Safely
eudat.eu/b2safe
The ideal solution for communities with no facility for archival to:
• replicate research data into secure data stores
• archive and preserve research data in the long-term
• bring data close to powerful compute resources
• co-locate data with different communities
• benefit from economies of scale
Features:
• Large-scale storage
• Robust and highly available
• Permanent PIDs
B2STAGE - transfer
Get Data to Computation
eudat.eu/b2stage
Facilitating communities to:
• move large amounts of data between data stores and high-performance compute resources
• re-ingest computational results back into EUDAT
• deposit large data sets onto EUDAT resources for long-term preservation
Features:
• High-speed transfer
• Reliable and light-weight
• Manages permanent PIDs
B2FIND - catalogue
Find Research Data
b2find.eudat.eu
A metadata catalogue service to:
• seek data objects and collections using powerful metadata searches
• catalogue community data by means of selected metadata
• browse through multi-disciplinary data collections filtered by content, provenance and temporal keywords
Features:
• Simple to use
• Standards-based
• Comprehensive catalogue
Update the dataflow diagram with EUDAT services
you could use for preservation and metadata
publication.
Exercise phase 3
DFD with other EUDAT services
[Diagram. Zones: PRACE storage and Simulations; EUDAT B2SHARE; EUDAT Site. Elements: Input files, Output files, Extractor, Data extractor (uses API), Publication (uses API), Traffic data, Actual traffic, Forecast traffic, Results, B2SHARE storage, Web front end or API, Search and retrieve data (Citizens, researchers, companies, ...), Replication, EUDAT B2SAFE storage, Data and metadata, EUDAT B2FIND, Metadata, Metadata search.]
www.eudat.eu
Thanks – any questions?
Acknowledgements:
Thanks to Mark van de Sanden, Marjan Grootveld, Sarah Jones
and Giuseppe Fiameni for some of the slides

Contenu connexe

Tendances

Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...
Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...
Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...
semanticsconference
 

Tendances (20)

The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
 
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.euHow FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
 
B2FIND Integration | www.eudat.eu |
B2FIND Integration | www.eudat.eu | B2FIND Integration | www.eudat.eu |
B2FIND Integration | www.eudat.eu |
 
EUDAT Research Data Management | www.eudat.eu |
EUDAT Research Data Management | www.eudat.eu | EUDAT Research Data Management | www.eudat.eu |
EUDAT Research Data Management | www.eudat.eu |
 
EDF2014: Talk of Stefan Decker, Director, Insight Galway, Ireland & Anthony M...
EDF2014: Talk of Stefan Decker, Director, Insight Galway, Ireland & Anthony M...EDF2014: Talk of Stefan Decker, Director, Insight Galway, Ireland & Anthony M...
EDF2014: Talk of Stefan Decker, Director, Insight Galway, Ireland & Anthony M...
 
OSFair2017 Workshop | Towards a Policy Framework for the European Open Scienc...
OSFair2017 Workshop | Towards a Policy Framework for the European Open Scienc...OSFair2017 Workshop | Towards a Policy Framework for the European Open Scienc...
OSFair2017 Workshop | Towards a Policy Framework for the European Open Scienc...
 
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 7, 2016|...
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 7, 2016|...EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 7, 2016|...
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 7, 2016|...
 
Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...
Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...
Vassilios Peristeras | Promoting Semantic Interoperability for European Publi...
 
B2STAGE- how to shift large amounts of data| www.eudat.eu |
B2STAGE- how to shift large amounts of data| www.eudat.eu | B2STAGE- how to shift large amounts of data| www.eudat.eu |
B2STAGE- how to shift large amounts of data| www.eudat.eu |
 
Persistent Identifiers in EUDAT Services. B2HANDLE Python library | www.eudat...
Persistent Identifiers in EUDAT Services. B2HANDLE Python library | www.eudat...Persistent Identifiers in EUDAT Services. B2HANDLE Python library | www.eudat...
Persistent Identifiers in EUDAT Services. B2HANDLE Python library | www.eudat...
 
Webinar@AIMS: How to practically support Open Access: Guidelines for Data Pro...
Webinar@AIMS: How to practically support Open Access: Guidelines for Data Pro...Webinar@AIMS: How to practically support Open Access: Guidelines for Data Pro...
Webinar@AIMS: How to practically support Open Access: Guidelines for Data Pro...
 
Semantic interoperability courses training module 1 - introductory overview...
Semantic interoperability courses   training module 1 - introductory overview...Semantic interoperability courses   training module 1 - introductory overview...
Semantic interoperability courses training module 1 - introductory overview...
 
Open Data Support - bridging open data supply and demand
Open Data Support - bridging open data supply and demandOpen Data Support - bridging open data supply and demand
Open Data Support - bridging open data supply and demand
 
Horizon 2020 and the open research data pilot
Horizon 2020 and the open research data pilotHorizon 2020 and the open research data pilot
Horizon 2020 and the open research data pilot
 
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
Phil Ritchie | Putting Standards into Action: Multilingual and Semantic Enric...
 
What's all the data about? - Linking and Profiling of Linked Datasets
What's all the data about? - Linking and Profiling of Linked DatasetsWhat's all the data about? - Linking and Profiling of Linked Datasets
What's all the data about? - Linking and Profiling of Linked Datasets
 
Legal Issues in Research Data Collection and Sharing: An Introduction by EUDA...
Legal Issues in Research Data Collection and Sharing: An Introduction by EUDA...Legal Issues in Research Data Collection and Sharing: An Introduction by EUDA...
Legal Issues in Research Data Collection and Sharing: An Introduction by EUDA...
 
How EUDAT services support FAIR data - IDCC 2017| www.eudat.eu |
How EUDAT services support FAIR data - IDCC 2017| www.eudat.eu | How EUDAT services support FAIR data - IDCC 2017| www.eudat.eu |
How EUDAT services support FAIR data - IDCC 2017| www.eudat.eu |
 
Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)
Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)
Linking HPC to Data Management - EUDAT Summer School (Giuseppe Fiameni, CINECA)
 
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
 

Similaire à Data management plans – EUDAT Best practices and case study | www.eudat.eu

Open data for innovation, smart and sustainable prof muliaro
Open data for innovation, smart and sustainable prof muliaroOpen data for innovation, smart and sustainable prof muliaro
Open data for innovation, smart and sustainable prof muliaro
gyleodhis
 

Similaire à Data management plans – EUDAT Best practices and case study | www.eudat.eu (20)

H2020 data pilot openaire
H2020 data pilot openaireH2020 data pilot openaire
H2020 data pilot openaire
 
The Horizon 2020 Open Data Pilot - OpenAIRE webinar (Oct. 21 2014) by Sarah J...
The Horizon 2020 Open Data Pilot - OpenAIRE webinar (Oct. 21 2014) by Sarah J...The Horizon 2020 Open Data Pilot - OpenAIRE webinar (Oct. 21 2014) by Sarah J...
The Horizon 2020 Open Data Pilot - OpenAIRE webinar (Oct. 21 2014) by Sarah J...
 
Open data pilot
Open data pilotOpen data pilot
Open data pilot
 
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
 
Turning FAIR data into reality
Turning FAIR data into realityTurning FAIR data into reality
Turning FAIR data into reality
 
What is a DMP
What is a DMPWhat is a DMP
What is a DMP
 
Gergely Sipos (EGI): Exploiting scientific data in the international context ...
Gergely Sipos (EGI): Exploiting scientific data in the international context ...Gergely Sipos (EGI): Exploiting scientific data in the international context ...
Gergely Sipos (EGI): Exploiting scientific data in the international context ...
 
LIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into RealityLIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into Reality
 
Frankfurt Big Data Lab & Refugee Projeect
Frankfurt Big Data Lab & Refugee ProjeectFrankfurt Big Data Lab & Refugee Projeect
Frankfurt Big Data Lab & Refugee Projeect
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
H2020 Open Data Pilot
H2020 Open Data PilotH2020 Open Data Pilot
H2020 Open Data Pilot
 
Open data for innovation, smart and sustainable prof muliaro
Open data for innovation, smart and sustainable prof muliaroOpen data for innovation, smart and sustainable prof muliaro
Open data for innovation, smart and sustainable prof muliaro
 
Open data-for-innovation-smart-and-sustainable
Open data-for-innovation-smart-and-sustainableOpen data-for-innovation-smart-and-sustainable
Open data-for-innovation-smart-and-sustainable
 
H2020 Open Research Data pilot
H2020 Open Research Data pilotH2020 Open Research Data pilot
H2020 Open Research Data pilot
 
Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020
 
General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...
 
FAIR data: what it means, how we achieve it, and the role of RDA
FAIR data: what it means, how we achieve it, and the role of RDAFAIR data: what it means, how we achieve it, and the role of RDA
FAIR data: what it means, how we achieve it, and the role of RDA
 
Project management monitoring tools
Project management monitoring toolsProject management monitoring tools
Project management monitoring tools
 
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructureeROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
 
Seminario Sobre Datasets Consorcio Madrono
Seminario Sobre Datasets Consorcio Madrono Seminario Sobre Datasets Consorcio Madrono
Seminario Sobre Datasets Consorcio Madrono
 

Plus de EUDAT

Linking service capabilities to data stweardship competences for professional...
Linking service capabilities to data stweardship competences for professional...Linking service capabilities to data stweardship competences for professional...
Linking service capabilities to data stweardship competences for professional...
EUDAT
 
Training by EOSC-hub - Integrating and Managing services for the European Ope...
Training by EOSC-hub - Integrating and Managing services for the European Ope...Training by EOSC-hub - Integrating and Managing services for the European Ope...
Training by EOSC-hub - Integrating and Managing services for the European Ope...
EUDAT
 

Plus de EUDAT (20)

EUDAT_Brochure_Generica_Jan_UPDATED(5).pdf
EUDAT_Brochure_Generica_Jan_UPDATED(5).pdfEUDAT_Brochure_Generica_Jan_UPDATED(5).pdf
EUDAT_Brochure_Generica_Jan_UPDATED(5).pdf
 
EUDAT Booklet Mar22 (2).pdf
EUDAT Booklet Mar22 (2).pdfEUDAT Booklet Mar22 (2).pdf
EUDAT Booklet Mar22 (2).pdf
 
EUDAT_Brochure_Generica_Jan_UPDATED (1).pdf
EUDAT_Brochure_Generica_Jan_UPDATED (1).pdfEUDAT_Brochure_Generica_Jan_UPDATED (1).pdf
EUDAT_Brochure_Generica_Jan_UPDATED (1).pdf
 
EUDAT Brochure - B2HANDLE.pdf
EUDAT Brochure - B2HANDLE.pdfEUDAT Brochure - B2HANDLE.pdf
EUDAT Brochure - B2HANDLE.pdf
 
EUDAT Brochure - B2DROP.pdf
EUDAT Brochure - B2DROP.pdfEUDAT Brochure - B2DROP.pdf
EUDAT Brochure - B2DROP.pdf
 
EUDAT Brochure - B2SHARE.pdf
EUDAT Brochure - B2SHARE.pdfEUDAT Brochure - B2SHARE.pdf
EUDAT Brochure - B2SHARE.pdf
 
EUDAT Brochure - B2SAFE.pdf
EUDAT Brochure - B2SAFE.pdfEUDAT Brochure - B2SAFE.pdf
EUDAT Brochure - B2SAFE.pdf
 
EUDAT Brochure - B2FIND(1).pdf
EUDAT Brochure - B2FIND(1).pdfEUDAT Brochure - B2FIND(1).pdf
EUDAT Brochure - B2FIND(1).pdf
 
EUDAT Brochure - B2ACCESS.pdf
EUDAT Brochure - B2ACCESS.pdfEUDAT Brochure - B2ACCESS.pdf
EUDAT Brochure - B2ACCESS.pdf
 
Rob Carrillo - Writing effective service documentation for EUDAT services
Rob Carrillo - Writing effective service documentation for EUDAT servicesRob Carrillo - Writing effective service documentation for EUDAT services
Rob Carrillo - Writing effective service documentation for EUDAT services
 
Ariyo - EUDAT CDI B2 services documentation
Ariyo - EUDAT CDI B2 services documentationAriyo - EUDAT CDI B2 services documentation
Ariyo - EUDAT CDI B2 services documentation
 
Introduction to eudat and its services
Introduction to eudat and its servicesIntroduction to eudat and its services
Introduction to eudat and its services
 
Using B2NOTE: The U.Porto Pilot
Using B2NOTE: The U.Porto PilotUsing B2NOTE: The U.Porto Pilot
Using B2NOTE: The U.Porto Pilot
 
OpenAIRE Advance - Kick off last week
OpenAIRE Advance - Kick off last weekOpenAIRE Advance - Kick off last week
OpenAIRE Advance - Kick off last week
 
European Open Science Cloud - Skills workshop
European Open Science Cloud - Skills workshopEuropean Open Science Cloud - Skills workshop
European Open Science Cloud - Skills workshop
 
Linking service capabilities to data stweardship competences for professional...
Linking service capabilities to data stweardship competences for professional...Linking service capabilities to data stweardship competences for professional...
Linking service capabilities to data stweardship competences for professional...
 
FAIRness of training materials
FAIRness of training materialsFAIRness of training materials
FAIRness of training materials
 
Training by EOSC-hub - Integrating and Managing services for the European Ope...
Training by EOSC-hub - Integrating and Managing services for the European Ope...Training by EOSC-hub - Integrating and Managing services for the European Ope...
Training by EOSC-hub - Integrating and Managing services for the European Ope...
 
Draft Governance Framework for the EOSC
Draft Governance Framework for the EOSCDraft Governance Framework for the EOSC
Draft Governance Framework for the EOSC
 
Building Interoperable AAI for Researchers
Building Interoperable AAI for ResearchersBuilding Interoperable AAI for Researchers
Building Interoperable AAI for Researchers
 

Data management plans – EUDAT Best practices and case study | www.eudat.eu

  • 1. EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065 www.eudat.eu Data Management Plans – EUDAT best practices and case study Stéphane COUTIN (CINES) 26 April 2017 This work is licensed under the Creative Commons CC-BY 4.0 licence
  • 2. Objectives High level presentation of research data management and H2020 context Present a simple approach and draft a DMP for a given case. Overview of EUDAT services
  • 4. Centre Informatique National de l’Enseignement Supérieur (CINES). Based in Montpellier, France – approx. 60 people (engineers, techs, admin). Created in 1999, formerly CNUSC (Centre National Universitaire Sud de Calcul), which was created in 1980. Administered and funded by the Ministry of Higher Education & Research (MESR). Provides the French public research community with computing resources, services and expertise. 3 main mandates / activities: High performance computing, Digital preservation, Hosting.
  • 5. Stéphane COUTIN coutin@cines.fr Research engineer at CINES since 2013 Initially in digital preservation dept Now working in HPC dept Involved in EU projects Leading PRACE collaboration task with other eInfra Working on EUDAT for collaboration with PRACE Background in Information Systems projects and programmes management
  • 6. EUDAT – www.eudat.eu Image CC-BY-NC ‘Data centre’ by Bob Mical www.flickr.com/photos/small_realm/15995555571
  • 7. A pan-European e-Infrastructure solution for pan-European RI data challenges. All RIs are facing data challenges: Where to store the growing amount of data? How to find it? How to make the most of it? Solutions are needed at pan-European level. We need to promote synergies: Some services are common to many communities. Costs and investments can be optimised. Better integration of e-infras and research infrastructures can be achieved.
  • 9. A truly pan-European Infrastructure EUDAT offers common data services, supporting multiple research communities as well as individuals, through a geographically distributed, resilient network of 35 European organisations The EUDAT vision is to enable European researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment, as part of a Collaborative Data Infrastructure
  • 10. PRACE – EUDAT collaboration Joint Open Calls for proposals EUDAT offering data services and resources through regular PRACE calls Review process is transparent to users Joint training activities Continuous technical discussion and developments of new components Definition of the EUDAT Workspace area Synchronization of authentication credentials for single sign-on
  • 11. Quick question: Think of your ongoing or upcoming HPC project. What are your data-related requirements? What is the budget for this?
  • 12. THE CHANGING DATA LANDSCAPE Image CC-BY-SA ‘data.path Ryoji.Ikeda - 3’ by r2hox www.flickr.com/photos/rh2ox/9990016123
  • 13. Data explosion More and more data is being created Issue is not creating data, but being able to navigate and use it Data management is critical to make sure data are well-organised, understandable and reusable
  • 14. Digital data are fragile and susceptible to loss for a wide variety of reasons Natural disaster Facilities infrastructure failure Storage failure Server hardware/software failure Application software failure Format obsolescence Human error Malicious attack Loss of staffing competencies Loss of institutional commitment Loss of financial stability Changes in user expectations Data loss Image CC-BY ‘Hard Drive 016’ by Jon Ross www.flickr.com/photos/jon_a_ross/1482849745
  • 15. Link rot – more 404 errors generated over time Reference rot* – link rot plus content drift i.e. webpages evolving and no longer reflecting original content cited * Term coined by Hiberlink http://hiberlink.org Data persistency issues Jonathan D. Wren Bioinformatics 2008;24:1381-1385
  • 16. CINES
  • 17. Why manage research data? To make your research easier! To stop yourself drowning in irrelevant stuff In case you need the data later To avoid accusations of fraud or bad science To share your data for others to use and learn from To get credit for producing it Because funders or your organisation require it Well-managed data opens up opportunities for re-use, integration and new science
  • 18. H2020 open research data pilot • Already expanded from a select pilot to all work areas • All need to consider which data can be made open • Mantra = “As open as possible as closed as necessary” • Underlying driver is good (FAIR) data management Image CC-BY-SA by SangyaPundir
  • 19. Key requirements of the open data pilot. Beneficiaries participating in the Pilot will: Deposit data in a research data repository of their choice. Take measures to make it possible for others to access, mine, exploit, reproduce and disseminate the data free of charge. Provide information about tools and instruments necessary for validating the results (where possible, provide the tools and instruments themselves). http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
  • 20.
  • 22. DFD: Data Flow Diagram. A simple diagram focusing on data dynamics. You can use other diagram types.
  • 23. Exercise – Phase 1. You and your team are submitting a proposal for a project in the domain of smart cities. The City has deployed a large set of sensors measuring traffic. The data are collected in the City datacenter. You want to develop an application able to forecast the traffic and how it will be impacted by events such as planned roadworks. This application would run on a PRACE site, not located in the City. On the PRACE site your storage space is limited to 10 TB. The application uses the following inputs: (1) Historical sensor data over the last 12 months: the sensors produce 1 TB of data a day. You implement a preprocessing module translating those data into a reduced data set (10 MB per day), based on a format you have defined to describe the traffic. (2) The results provided by the simulation. This enables comparison between forecast and actual traffic in order to ‘train’ the application. (3) Weather data (historical and forecast) provided by the national meteo agency. They use the SYNOP format. The volume is negligible. Results will be accessible to the city council employees. Task: create the project data flow diagram and fill in the data summary chapter using a table. What would help you use the weather data efficiently?
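The volumes given in the exercise can be checked with a quick calculation. The sketch below is only an illustration: it assumes decimal units (1 TB = 10^12 bytes) and takes "the last 12 months" as 365 days, neither of which is stated in the slides.

```python
# Back-of-the-envelope sizing for the exercise.
# Assumptions (not from the slides): decimal units, 12 months = 365 days.
MB = 10**6
GB = 10**9
TB = 10**12

DAYS = 365
RAW_PER_DAY = 1 * TB        # raw sensor output per day (from the exercise)
REDUCED_PER_DAY = 10 * MB   # preprocessed output per day (from the exercise)
PRACE_QUOTA = 10 * TB       # storage limit on the PRACE site (from the exercise)

raw_total = DAYS * RAW_PER_DAY
reduced_total = DAYS * REDUCED_PER_DAY

print(f"Raw history:     {raw_total / TB:.0f} TB")
print(f"Reduced history: {reduced_total / GB:.2f} GB")
print(f"Raw history fits in quota:     {raw_total <= PRACE_QUOTA}")
print(f"Reduced history fits in quota: {reduced_total <= PRACE_QUOTA}")
print(f"Days of raw data the quota could hold: {PRACE_QUOTA // RAW_PER_DAY}")
```

Under these assumptions the raw history is far larger than the PRACE quota while the reduced history is a few gigabytes, which suggests running the preprocessing close to the sensors and transferring only the reduced data set.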
  • 24. Data summary table Dataset Description Origin? Existing? Format Size Who could use it?
  • 25. Proposed data flow diagram Sensors collection area PRACE HPC Site Simulations PRACE Storage Output files extractor Input files Raw sensor data Data Preprocessing Reduced sensor data Weather data City council employees Data transfer
  • 26. Data summary table
      Dataset | Description | Origin? Existing? | Format | Size | Who could use it?
      Raw sensor data | | Available, collected from sensors | Various | 1 TB per day |
      Reduced sensor data | Actual traffic, … | Extracted from raw sensor data | Binary (specific) | 10 MB a day | Our simulation
      Weather data | Actual and forecast | Existing. Meteo open data platform | SYNOP | 1 MB a week | Our simulation; citizens, scientists, ..
      Simulation results | Forecasted traffic | Results of our simulation | Binary (specific) | 10 MB a day | City council employees, our application
  • 27. Research data lifecycle: CREATING DATA, PROCESSING DATA, ANALYSING DATA, PRESERVING DATA, GIVING ACCESS TO DATA, RE-USING DATA
      CREATING DATA: designing research, DMPs, planning consent, locating existing data, data collection and management, capturing and creating metadata
      PROCESSING DATA: entering, transcribing, checking, validating and cleaning data, anonymising data, describing data, managing and storing data
      ANALYSING DATA: interpreting and deriving data, producing outputs, authoring publications, preparing for sharing
      PRESERVING DATA: data storage, back-up and archiving, migrating to best format and medium, creating metadata and documentation
      ACCESS TO DATA: distributing data, sharing data, controlling access, establishing copyright, promoting data
      RE-USING DATA: follow-up research, new research, undertaking research reviews, scrutinising findings, teaching and learning
      Ref: UK Data Archive: http://www.data-archive.ac.uk/create-manage/life-cycle
  • 28. Making data FAIR
      Findable – Assign persistent IDs, provide rich metadata, register in a searchable resource,...
      Accessible – Retrievable by their ID using a standard protocol, metadata remain accessible even if data aren’t...
      Interoperable – Use formal, broadly applicable languages, use standard vocabularies, qualified references...
      Reusable – Rich, accurate metadata, clear licences, provenance, use of community standards...
      FAIR for machines as well as people. www.force11.org/group/fairgroup/fairprinciples
  • 29. What is a digital object? Bitstream, Persistent Identifier, Metadata. Digital objects can be aggregated to digital collections.
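The three parts of a digital object named on this slide, and the aggregation into collections, can be pictured as a small data structure. This is a minimal sketch for illustration only: the class names, field names and the example identifiers are hypothetical and do not come from EUDAT or any of its services.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    """Hypothetical model: a bitstream plus its persistent identifier and metadata."""
    pid: str          # persistent identifier (placeholder values used below)
    bitstream: bytes  # the actual content
    metadata: dict    # descriptive information about the content


@dataclass
class DigitalCollection:
    """Hypothetical model: a collection aggregates digital objects and has its own PID."""
    pid: str
    members: list = field(default_factory=list)

    def add(self, obj: DigitalObject) -> None:
        self.members.append(obj)

    def member_pids(self) -> list:
        return [obj.pid for obj in self.members]


# Example usage with made-up identifiers.
day1 = DigitalObject("pid:example/traffic-day-1", b"\x00\x01", {"title": "Reduced traffic data, day 1"})
day2 = DigitalObject("pid:example/traffic-day-2", b"\x02\x03", {"title": "Reduced traffic data, day 2"})

collection = DigitalCollection("pid:example/traffic-collection")
collection.add(day1)
collection.add(day2)
print(collection.member_pids())
```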
  • 31. Metadata and documentation. Metadata and documentation are needed to locate and understand research data. Think about what others would need in order to find, evaluate, understand, and reuse your data. Get others to check the metadata to improve quality. Use standards to enable interoperability.
  • 32. What makes metadata good? Use of standards. Controlled vocabularies for unambiguous keywords. Simple, complete and consistent information. Appropriate description. Explanation of limitations to support reuse. Avoid special characters e.g. !@<~ etc... Provide persistent identifiers such as DOIs.
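Some of these criteria can be checked automatically before a record is published. The sketch below is an assumption-laden illustration: the required field names, the tiny controlled vocabulary and the set of rejected characters are all hypothetical and would be replaced by whatever metadata standard a community actually uses.

```python
import re

# Hypothetical choices, for illustration only.
REQUIRED_FIELDS = ("title", "description", "keywords", "identifier")
CONTROLLED_VOCABULARY = {"traffic", "weather", "simulation"}
SPECIAL_CHARS = re.compile(r"[!@<>~]")


def check_metadata(record: dict) -> list:
    """Return a list of human-readable problems found in a metadata record."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not record.get(name):
            problems.append(f"missing field: {name}")
    # Controlled vocabulary: keywords must come from the agreed list.
    for keyword in record.get("keywords", []):
        if keyword not in CONTROLLED_VOCABULARY:
            problems.append(f"keyword not in vocabulary: {keyword}")
    # Special characters: flag them in the title.
    if SPECIAL_CHARS.search(record.get("title", "")):
        problems.append("special characters in title")
    return problems


good = {
    "title": "Reduced traffic data 2016",
    "description": "Daily reduced sensor data",
    "keywords": ["traffic"],
    "identifier": "doi:10.0000/example",  # placeholder, not a real DOI
}
bad = {"title": "Traffic <v2>!", "keywords": ["cars"]}

print(check_metadata(good))
print(check_metadata(bad))
```

Such a check covers only the mechanical criteria; whether a description is appropriate or limitations are explained still needs a human reviewer.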
  • 33. The good and the bad
      Ambiguous | More precise and standardised
      Furlongs and fortnight | Metres / seconds
      10th Sept. 2015 15:00:01 | 2015-09-10T15:00:01+01:00
      U | Longitudinal wind speed
      PDF | PDF 1.7
      Population statistics | 2008 US Population statistics
      Barcelona | Barcelona, Venezuela
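The timestamp pair from this slide shows why the standardised form matters to software as well as to people. As a small illustration using only the Python standard library, the ISO 8601 value parses directly and converts to UTC without guesswork, while the ambiguous value is rejected by the same parser.

```python
from datetime import datetime, timezone

precise = "2015-09-10T15:00:01+01:00"   # from the slide: ISO 8601 with UTC offset
ambiguous = "10th Sept. 2015 15:00:01"  # from the slide: no standard format, no time zone

# The standardised value carries its own offset, so conversion to UTC is unambiguous.
moment = datetime.fromisoformat(precise)
print(moment.astimezone(timezone.utc).isoformat())

# The ambiguous value cannot be parsed as ISO 8601; a reader would have to guess
# both the layout and the time zone.
try:
    datetime.fromisoformat(ambiguous)
    parsed_ambiguous = True
except ValueError:
    parsed_ambiguous = False
print(f"Ambiguous value parsed as ISO 8601: {parsed_ambiguous}")
```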
  • 34. Digital preservation context. Main risks concern: • Comprehension • Integrity • Exploitation • Valorization. Quality assurance procedures are to be set up for: • Metadata • File formats • Representation information • Storage • Access • Technology watch
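For the integrity risk, one common quality assurance measure is a fixity check: record a checksum when a file is deposited and recompute it later to detect silent corruption. The slides do not prescribe a method, so the sketch below is only one possible approach, using SHA-256 from the Python standard library; the function names are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, recorded_checksum):
    """Compare the current checksum with the one recorded at deposit time."""
    return sha256_of(path) == recorded_checksum


# Example: deposit a file, record its checksum, then simulate corruption.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"reduced traffic data")
    path = tmp.name

recorded = sha256_of(path)
intact_before = is_intact(path, recorded)

with open(path, "ab") as handle:
    handle.write(b"\x00")  # one stray byte appended

intact_after = is_intact(path, recorded)
os.remove(path)

print(f"Intact after deposit:    {intact_before}")
print(f"Intact after corruption: {intact_after}")
```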
  • 41. Exercise phase 2 updated DFD
  • 43. CDI Data Domain. EUDAT Data Domain modelled on the ANDS [1] Data Curation Continuum. [1] Australian National Data Service organization – www.ands.org.au
  • 44. B2DROP – personal cloud (b2drop.eudat.eu): Sync and Share Research Data. An ideal solution for researchers and scientists to: store and exchange data with colleagues and team members, including research data not finalized for publishing; share data with fine-grained access controls; synchronize multiple versions of data across different devices. Features: 20 GB storage per user; living objects, so no PIDs; versioning and offline use; desktop synchronisation.
  • 45. B2SHARE – repository (b2share.eudat.eu): Store and Publish Research Data. A winning solution for researchers, scientists and communities to: store data safely at a trusted and certified data centre; preserve data to guarantee long-term persistence; control access and share data with colleagues and the world. Features: metadata management; permanent PIDs; Open Access support.
  • 46. B2SAFE – preservation (eudat.eu/b2safe): Replicate Research Data Safely. The ideal solution for communities with no facility for archival to: replicate research data into secure data stores; archive and preserve research data in the long term; bring data close to powerful compute resources; co-locate data with different communities; benefit from economies of scale. Features: large-scale storage; robust and highly available; permanent PIDs.
  • 47. B2STAGE – transfer (eudat.eu/b2stage): Get Data to Computation. Facilitating communities to: move large amounts of data between data stores and high-performance compute resources; re-ingest computational results back into EUDAT; deposit large data sets onto EUDAT resources for long-term preservation. Features: high-speed transfer; reliable and light-weight; manages permanent PIDs.
  • 48. B2FIND – catalogue (b2find.eudat.eu): Find Research Data. A metadata catalogue service to: seek data objects and collections using powerful metadata searches; catalogue community data by means of selected metadata; browse through multi-disciplinary data collections filtered by content, provenance and temporal keywords. Features: simple to use; standards-based; comprehensive catalogue.
  • 49. Update the dataflow diagram with EUDAT services you could use for preservation and metadata publication. Exercise phase 3
  • 50. DFD with other EUDAT services Simulations PRACE Storage Output files extractor Input files EUDAT B2SHARE B2SHARE storage Web front end Or API Results Traffic data Data extractor (uses API) Publication (uses API) Actual traffic Forecast traffic Citizens, researchers, companies, ... Search and retrieve data EUDAT Site EUDAT B2SAFE storage EUDAT B2SAFE Data and metadata EUDAT B2FIND Metadata Metadata search Replication
  • 51. www.eudat.eu Thanks – any questions? Acknowledgements: Thanks to Mark van de Sanden, Marjan Grootveld, Sarah Jones and Giuseppe Fiameni for some of the slides