PILOT PHASE KICK-OFF EVENT AND
AWARD CEREMONY
29 November 2021
Contact: info@archiver-project.eu
Project website: www.archiver-project.eu
Welcome to the Pilot phase!
Started in January 2019, ARCHIVER is a unique initiative currently
running in the EOSC framework that is competitively procuring
R&D services for archiving and digital preservation for data
generated in the petabyte range.
Between December 2020 and August 2021 three consortia worked
on innovative, prototype solutions for Long-term data
preservation, in close collaboration with CERN, EMBL-EBI, DESY
and PIC.
The selection process for proceeding to the next phase is now over,
and the consortia selected to continue with the pilot phase will be
officially announced at the public ceremony today!
Event Outline
14:30 - 14:40: Welcome from Sergey Yakubov (DESY)
14:40 - 15:00: Project overview / update - João Fernandes (CERN)
15:00 - 15:30: Expected outcomes of the Pilot Phase - Buyers Group representatives (CERN, DESY, EMBL-EBI, PIC)
15:30 - 15:40: Interactive brainstorming
15:40 - 15:50: Break
Award ceremony
15:50 - 16:20: Presentation from consortium 1
16:20 - 16:50: Presentation from consortium 2
16:50 - 17:00: Closing remarks - João Fernandes (CERN)
HOUSEKEEPING
This event is being recorded in its entirety. A link to the full recording
will be shared with participants afterwards.
Two Q&A sessions are foreseen, one before the break and one towards the end:
hold your questions and use the chat!
Please don’t activate your microphone or video unless the host gives
you permission.
Welcome!
Sergey Yakubov – DESY
Deutsches Elektronen-Synchrotron - DESY
Ein Forschungszentrum der Helmholtz-Gemeinschaft
Project overview / update
Pilot Phase Kick-Off
João Fernandes (CERN)
Contact: joao.fernandes@cern.ch
Focus: Archiving and Data Preservation Services using cloud services available via the European Open Science Cloud (EOSC)
Procurement R&D budget: 3.4M euro; Total Budget: 4.8M
Starting Date: 1 January 2019
Duration: 42 Months
Coordinator: CERN (Lead Procurer)
ARCHIVER Project
Buyers Group (BG) - Public organisations committing funds to
contribute to a joint-R&D-procurement, research data use cases and
R&D testing effort
Experts - Partner organisations bringing expertise in requirement
assessment and promotion activities
Buyers
Experts
Consortium
ARCHIVER has received funding from the European Union’s H2020 Research & Innovation programme under Grant Agreement No 824516.
Progress Beyond the State of the Art
R&D Scope
Layer 1 - Storage/Basic Archiving/Secure backup
Data integrity/security; cloud/hybrid deployment. Data volume in the PB range; high, sustained ingest data rates in Gb/s. ISO certification: 27000, 27040, 19086 and related standards. Archives connected to the GEANT network.
Layer 2 - Preservation
OAIS conformant services: data readability formats, normalization, obsolescence monitoring, file fixity, authenticity checks, etc. ISO 14721/16363, 26324 and related standards.
Layer 3 - Baseline user services
User services: search, discover, share, indexing, data removal, etc. Access under Federated IAM.
Layer 4 - Advanced services
High level services: visual representation of data (domain specific), reproducibility of scientific analyses, etc.
Scientific use cases:
• CERN 1 – CERN Open Data
• CERN 2 – CERN Digital Memory
• DESY 1 – Individual Scientist
• DESY 2 – Petra III Experiment
• DESY 3 – EUXFEL Experiment
• EMBL 1 – FIRE
• EMBL 2 – Cloud Caching
• PIC 1 – Large File Storage
• PIC 2 – Mix File Storage
• PIC 3 – Data Distribution
Scientific use cases deployments: https://www.archiver-project.eu/deployment-scenarios
ARCHIVER “current state of the art” report in the context of the EOSC: https://doi.org/10.5281/zenodo.3618215
ARCHIVER Timeline
Prototype Phase Highlights
• Functional objectives of the Prototype phase were successfully met.
• R&D challenge and scientific use case requirements were globally understood.
• Systematic, structured feedback between Buyers and Contractors gave the Contractors a full understanding of the
R&D challenge and the Buyer organizations a full understanding of the service capabilities.
• Prototype Phase organized into 13 bi-weekly sprints in an Agile-like process, with a regular rhythm of
feedback and adjustment between Buyers and Contractors.
• CERN, EMBL-EBI, DESY & PIC allocated significant effort to assessing and testing the prototypes, ingesting
several hundred TB of data.
From Prototypes to Pilot
• R&D produced by the Contractors
• More than 200 tests executed during the Prototype phase across three platforms
• Technical assessment in two steps:
  • R&D produced during the Prototype completed & of sufficient quality
  • Mini-competition (Call-off) for Pilot phase services R&D
• Main criteria considered in the R&D review process:
  • Performance to scale correctly in the PB data range
  • R&D validation progressing from functional to “Go-To-Market” ready
  • Expertise in supporting Data Stewards achieving certification of scientific repositories
  • Clear commercialisation plans for the resulting services after the project end
Pilot Phase Challenges
• Deployment model: on-premise vs cloud, portability, migration between different public
cloud infrastructure, strategies to avoid vendor lock-ins, disaster recovery
• Advanced AAI & Access management; Network performance: 100TB/day
• Scalability and Elasticity: High Volumes for data ingestion and data recall
• Security: simulated cyberattacks
• Layer-4 reproducibility of scientific analyses independent of infrastructure
• FAIRness measurements: F-UJI tool: https://www.f-uji.net/
• Total Cost of Ownership (TCO): SaaS on cloud vs SaaS on-premise
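To give a feel for what the 100TB/day network target implies, the following sketch converts a daily volume into the sustained line rate it would need. It assumes decimal units (1 TB = 10^12 bytes) and a perfectly even transfer over 24 hours; the function name is made up for this illustration and is not part of any ARCHIVER tooling.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_gbit_per_s(terabytes_per_day: float) -> float:
    """Sustained rate in Gb/s needed to move a daily volume (decimal TB)."""
    bits_per_day = terabytes_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e9


if __name__ == "__main__":
    rate = sustained_gbit_per_s(100)
    print(f"100 TB/day needs about {rate:.2f} Gb/s sustained")
```

Under these assumptions the target sits a little below a fully used 10 Gb/s link, so any real transfer with pauses or retries would need headroom above that.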
Early Adopters
• Participants:
• Demand side public sector organisations
• Key advantages
• Assess if resulting services address their archiving and preservation needs
• Contribute to shaping the R&D carried out in the project, identify additional use cases
• Have the option to purchase pilot-scale services by the end of the project
https://archiver-project.eu/early-adopters-programme
A Minimum Viable EOSC
• EOSC Association established to govern the European
Open Science Cloud
• Ambition to develop “Web of FAIR Data and Services”
for science in Europe for multiple domains
• Provide access to services for storage, computation,
analysis, preservation, etc. based on standards,
combining data and services
“FAIR Forever Report: A study to remember” with an
explicit mention of the ARCHIVER activities:
→ https://www.dpconline.org/blog/fair-forever-a-fair-study-to-remember
EOSC Long-term Data Preservation Task Force charter
→ https://www.eosc.eu/sites/default/files/tfcharters/eosca_tflongtermdatapreservation_draftcharter_20210614.pdf
ARCHIVER & EOSC
● Broad pan-European requirement analysis of the research sector
○ Analysis considered in the competitive R&D tender
○ Technical and organisational measures aligned with European legislation in the services being
developed (by default & by design)
● Early Adopters Programme established
○ Additional use cases expanding further the set of supported scientific domains
○ Publicly funded research actors external to the ARCHIVER consortium
● Model to facilitate procurement of sustainable pilot services
○ Consortium members and Early Adopter organisations
○ Beyond the lifetime of the project
ARCHIVER is the only EOSC related H2020 project focusing on Archiving & LTDP services for PB scale datasets
across multiple research domains and countries.
Summary
● The R&D challenge of digital archiving goes beyond data storage: keep
intellectual control of data and associated products for decades, make
research outputs reusable
● Extending FAIR to research associated products: software, workflows, services
and even infrastructures
● Acting as a template to commoditise archiving and preservation at scale in
research domains with expert European SMEs
● ARCHIVER is promoting a sustainable model with services that will exist
beyond the project lifetime in the context of the EOSC
● ARCHIVER pilot phase: validating services with end-users and early adopter
organisations to determine suitability to their needs.
Thank you!
info@archiver-project.eu
https://www.archiver-project.eu/
https://twitter.com/ArchiverProject
https://www.linkedin.com/company/archiver-project/
https://www.youtube.com/channel/UCCBIyLpUt-hWmQatqdIhIzw
Expected outcomes of the Pilot Phase
Buyers Group representatives
CERN, DESY, EMBL-EBI, PIC
CERN Requirements and Expectations
Tibor Simko, Jean-Yves Le Meur, Antonio Vivace, Ignacio Peluaga
CERN Open Data: From data preservation to data reuse
https://opendata.cern.ch https://www.reana.io
CERN Open Data portal
● more than 2.5 petabytes of particle
physics data
● education use cases
● independent research use cases
REANA reproducible analysis platform
● run containerised computational
workflows on remote clouds
● CWL, Snakemake, Yadage
● HTCondor, Kubernetes, Slurm
Nature 533 (2016), 452–454
https://doi.org/10.1038/533452a
CERN Open Data: From data preservation to data reuse
CMS Higgs-to-four-lepton example analysis CMS Higgs-to-tautau example analysis
● looking at “Compute” possibilities for content preserved in “Storage”
● reproducing several open data analysis examples
● reprocessing published simulated datasets
● challenges:
○ use containerised workflows (Docker, Kubernetes)
○ serve database content (CVMFS)
rule analyze_data:
    input:
        config["data"],
        config["code"],
        "results/scramdone.txt"
    output:
        "results/DoubleMuParked2012C_10000_Higgs.root"
    container:
        "docker://cmsopendata/cmssw_5_3_32"
    shell:
        "source /opt/cms/cmsset_default.sh "
        "&& cd CMSSW_5_3_32/src "
        "&& eval `scramv1 runtime -sh` "
        "&& cd HiggsExample20112012/HiggsDemoAnalyzer "
        "&& cd ../Level3 "
        "&& cmsRun demoanalyzer_cfg_level3data.py"
Running containerised workflows
CERN Digital Memory: the OAIS Archive
- Status:
  - Tool to harvest the main CERN Information Systems and create SIPs (following the BagIt specification).
  - Includes publications, preprints, presentations, photos, videos, and more
  - Control Interface to run “on-demand” archiving for people leaving CERN
- Challenges:
  - Duplication
  - Authorships
  - Integrity
  - Versioning
  - Running preservation services / file conversions to create AIPs
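As an illustration of what a SIP "following BagIt specs" looks like on disk, here is a minimal sketch that wraps a directory as a BagIt bag using only the Python standard library. It is not the CERN harvesting tool: the function name and layout choices are assumptions, and only the two mandatory tag files (`bagit.txt` and a SHA-256 payload manifest) are written.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir: Path, bag_dir: Path) -> Path:
    """Copy source_dir into bag_dir/data and write the basic BagIt tag files."""
    payload_dir = bag_dir / "data"
    shutil.copytree(source_dir, payload_dir)  # also creates bag_dir

    # Bag declaration required by the BagIt specification (RFC 8493).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to the bag>" line per file.
    # Files are read whole here; a PB-scale tool would hash them in chunks.
    lines = []
    for path in sorted(p for p in payload_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
    return bag_dir
```

Calling `make_bag(Path("record_123"), Path("sip_record_123"))` (both paths hypothetical) would produce a bag whose manifest can later be re-checked to detect the integrity problems listed among the challenges.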
CERN Digital Memory: feeding Archiver.eu
Archiver.eu solutions to
ingest CERN SIPs and store
artifacts (AIPs, DIPs...)
DESY Requirements and Expectations
Sergey Yakubov, Martin Gasthuber
Main sources of data to be archived and preserved - 2021
>30 PB annual
3-6 PB annual
● two sites
○ Hamburg
○ Zeuthen (near Berlin)
● science fields
○ particle physics (LHC, Belle 2, …)
○ photon science (EuXFEL, Petra III, FLASH)
○ accelerator research (wakefield, Petra IV, …)
○ astrophysics
● all areas “data intensive science”
use cases - selection and relations
● photon science focussed - pushed by 3 local independently running
accelerator facilities
○ ground zero of the data deluge
● three, partly nested, scaled use cases
○ (1) individual scientist/small working group
○ (2) large working group/experiment
○ (3) facility level (with data policy)
● existing entities/actors with data ownership
and stewardship responsibilities
More tangible / general expectations
● core functionalities completed in last phase
● focus on scaling and stability validations
● Tape integration - scaling requires this ‘low cost’ storage technology
● hybrid deployments - i.e. metadata (MD) and initial data copy on-site, secondary data and MD
backup (cold & encrypted) off-site - targeting:
○ off-site data co-located with public cloud computing for ‘open data’ and related analysis tasks
○ primary off-site data attractive to communities/groups not having an ‘IT-Home’
Case 1 - individual scientist / small working group
● decent volume & bandwidth requirements
○ a few tens of TB, 100 MB/s, 10K objects - per archive
○ ~0.2-0.5 PB annual
● access mainly via a web browser (GUI)
○ pre-defined MD schema & policy and templates by admins
● scientist (in person) - owner and archivist
● challenges
○ authentication - federated AAI
○ metadata schema - hierarchy of schemas, mandatory fields, controlled extension
○ DOI / publication
○ local & hybrid deployments
○ open-data management & access
Case 2 - mid size - Petra III experiment
● Case 1 plus…
● increased volume and bandwidth requirements
○ ~100 TB, 1-2GB/s, >150K objects per archive
○ ~3-6 PB annual
● mainly API access (non interactive)
○ agent - automated life cycle process integration
● challenges
○ templating - i.e. MD, lifetimes, access rights, etc.
■ compliant to data policy
○ data storage costs start to be important
■ volume and bandwidth amplification inside your solution - rising importance
Case 3 - facility level - EuXFEL
● Case 2 plus...
● largest scale
○ few 100 TB up to 1-2 PB, 2-10GB/s, >30K objects - per experiment
■ 1.3PB per day raw data production seen recently - and growing
○ >30 PB annual
● non interactive - full API access
○ full templating for creation and filling process of the archive
● challenges
○ scale
○ data storage costs become the most important criterion
■ volume and bandwidth amplification inside your solution becomes the dominating criterion
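The volume and bandwidth figures quoted for the three cases can be turned into rough transfer times. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: decimal units (1 TB = 1000 GB), no protocol overhead, and one illustrative volume/rate pair picked from within each case's quoted range. The function names are hypothetical.

```python
def hours_to_transfer(volume_tb: float, rate_gb_per_s: float) -> float:
    """Hours needed to move volume_tb (decimal TB) at rate_gb_per_s (GB/s)."""
    seconds = volume_tb * 1000 / rate_gb_per_s
    return seconds / 3600


def average_gb_per_s(petabytes_per_day: float) -> float:
    """Average GB/s implied by a daily raw-data volume (decimal PB)."""
    return petabytes_per_day * 1e6 / 86400


if __name__ == "__main__":
    # Illustrative values chosen from the ranges on the slides.
    print(f"Case 1: 10 TB at 0.1 GB/s -> {hours_to_transfer(10, 0.1):.1f} h")
    print(f"Case 2: 100 TB at 1 GB/s  -> {hours_to_transfer(100, 1):.1f} h")
    print(f"Case 3: 1000 TB at 2 GB/s -> {hours_to_transfer(1000, 2):.1f} h")
    print(f"1.3 PB/day averages {average_gb_per_s(1.3):.1f} GB/s")
```

With these inputs a single archive takes on the order of a day or more to ingest in every case, and the 1.3 PB/day production rate averages out above the 2-10 GB/s range quoted for Case 3, which is one way to read why scale dominates at facility level.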
EMBL-EBI Requirements and Expectations
Justin Clark-Casey
About EMBL-EBI
● “The home for big data in biology”
● Datasets, datasets, datasets: genomics, proteomics, metagenomics,
transcriptomics, imageomics …
○ You (researchers) submit them
○ We curate them
○ We provide them
○ You download them/(cloud)-access them/analyze them
○ We archive them
● Part of the European Molecular Biology Laboratory (27 member states,
1800 people, 80 independent research groups)
○ An intergovernmental organization like CERN.
Data growth at EMBL-EBI
EMBL-EBI Archiver Pilot Phase Plan
● Pilot preservation of selected large datasets from across EMBL-EBI as part of an integrated
data management system
EMBL-EBI Archiver Pilot Phase Plan
● Deploy a pilot in a real storage environment
● Taking forward the EMBL on FIRE use case
● No new feature requests
○ But may use features developed for other buyers (e.g. flexible metadata schemas)
● Assess in pilot conditions
○ Ease of integration
○ Scalability
○ Security
○ Robustness
○ Cost
PIC Requirements and Expectations
Jordi Casals, Manuel Delfino, Jordi Delgado
Port d’Informació Científica
Use cases will be based on MAGIC Telescope data
Observatorio del Roque de los Muchachos
(La Palma, Canary Islands)
• Collecting data 365 days a year
• Except when a volcano starts activity
Hint: NOW
• 300 TB per year over periods of 5-6 years
• Random recalls during the period
ALBA Synchrotron
• More than 10 beamlines (and growing in the coming years)
• Datasets ranging from 200TB up to 4PB
• Internal and external scientific users
In collaboration with
Pilot Requirements
● Petabyte level Storage → functional, reliable, good performance, reasonable cost
○ From 1PB in 2021 to 15PB in 2025
○ Scalability testing → upload and downloads with production environments, sizes and bandwidth
● Integrate and automate processes with production systems using the services' CLI or API-based scripts
● Create custom metadata schemas automatically on upload from data sources
○ Run production processes based on metadata searches
● Give users inside the collaboration access with different permissions based on their roles in the experiment
● Test in-archive data processing using co-located Cloud
○ Enables automatic processing of uploaded data and avoids moving the data twice
● Test future reprocessing for reusability, based on custom containers, Python notebooks or any
other system that may help keep the data from expiring
● Analyze production costs to evaluate whether it would be a usable option
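One way to picture the requirement to create metadata schemas automatically on upload is to derive a schema from the first metadata record a data source produces. The sketch below does this in a JSON-Schema-like shape. It is an assumption about how such a step could look, not a description of either pilot platform, and the function name and example fields are invented.

```python
from typing import Any

# Map Python types to JSON-Schema-style type names.
_TYPE_NAMES = {
    bool: "boolean",
    int: "integer",
    float: "number",
    str: "string",
    type(None): "null",
}


def infer_schema(record: dict[str, Any]) -> dict[str, Any]:
    """Derive a JSON-Schema-like description from one metadata record."""
    properties: dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, dict):
            properties[key] = infer_schema(value)  # nested object
        elif isinstance(value, list):
            properties[key] = {"type": "array"}
        else:
            # type() rather than isinstance() so that bool is not read as int.
            properties[key] = {"type": _TYPE_NAMES.get(type(value), "string")}
    return {"type": "object", "properties": properties, "required": sorted(record)}


if __name__ == "__main__":
    # Hypothetical metadata for one uploaded observation file.
    example = {"instrument": "MAGIC", "run": 1234, "site": {"name": "La Palma"}}
    print(infer_schema(example))
```

A schema derived this way could then drive the metadata searches mentioned above, with the caveat that a single record rarely shows every optional field.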
Everything will be tested by both PIC and ALBA and put in common to
better evaluate the final results and possibilities of the resulting services
Any questions?
Go to www.menti.com and use the code
3228 5110
BREAK
PILOT PHASE
AWARD
CEREMONY
And the winners are…
Arkivum - Google Cloud
Bringing archived data to life
SCALABLE AND SUSTAINABLE LONG TERM
DIGITAL PRESERVATION OF SCIENTIFIC DATASETS
Matthew Addis
Arkivum
ARCHIVER Pilot Kick-Off Event 29 Nov 2021
Contents
• Sustainability
  • Scalability, Performance and Cost Effectiveness
  • Portability, Open Standards, Data Sovereignty
  • FAIR Forever, Digital Preservation, Long-term Access and Reuse
  • Environmental Sustainability
• Summary
Scalability, Performance and Cost Effectiveness
ARCHIVER: requirements
https://zenodo.org/record/4574234#.YW2NyqDTWJQ
Demand side requirements
Long-term digital preservation (LTDP) factories to support sustainable FAIR data
LTDP factories that scale
• Automation
  • Minimal Effort Ingest / Minimal Viable Preservation
• Scalable and efficient use of IT resources
  • Serverless computing
  • Microservices
  • Scale-out parallel processing
• Robust workflows
  • Failures will happen
  • Record and audit everything
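The "robust workflows" points (failures will happen; record and audit everything) can be illustrated with a small retry wrapper that logs every attempt. This is a generic sketch, not Arkivum's implementation: the class and function names are hypothetical, and the audit trail is kept in memory where a real service would persist it.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass
class AuditLog:
    """In-memory audit trail; a real system would persist these records."""
    events: list[dict] = field(default_factory=list)

    def record(self, step: str, attempt: int, outcome: str) -> None:
        self.events.append(
            {"time": time.time(), "step": step, "attempt": attempt, "outcome": outcome}
        )


def run_step(name: str, action: Callable[[], T], log: AuditLog,
             max_attempts: int = 3, delay_s: float = 0.0) -> T:
    """Run one workflow step, retrying on failure and auditing every attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            result = action()
        except Exception as exc:
            log.record(name, attempt, f"failed: {exc}")
            if attempt == max_attempts:
                raise  # give up, but the failure is already on record
            time.sleep(delay_s)
        else:
            log.record(name, attempt, "ok")
            return result
    raise RuntimeError("unreachable")
```

Each microservice step in an ingest pipeline could be wrapped this way, so that a transient failure costs a retry rather than the whole run and the log shows exactly what happened to every file.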
Eugen Stoll, CC BY-NC 2.0, https://flic.kr/p/5c8cz
Microservices, serverless computing, cloud infrastructure
Ingest and archiving of a 50TB dataset
50,000 HDF5 files (50TB) ingested and stored at a rate of 150TB/day
[Chart: CPU cores in use over time, from ingest start to all done]
Costs and Benchmarking
• Execute real world scenarios on GCP
• Record parameters
  • execution time
  • data volumes, number of files
  • type of activity (ingest, export, preservation)
• Extract baseline costs from GCP
• Extract overhead of running scenario
• Normalise to create a ‘cost per TB’ metric
• Calculate short-term and long-term storage
  • Upload/export buckets, caching, archive buckets
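The benchmarking steps above can be sketched as a small calculation. The interpretation here is an assumption: the baseline is taken to be the cost of the idle platform over the run, and the scenario overhead is whatever was billed on top of it. The field names are invented and the cost figures in the example are hypothetical; only the 50 TB, 50,000 files and roughly 8 hours (50 TB at 150 TB/day) come from the ingest slide.

```python
from dataclasses import dataclass


@dataclass
class ScenarioRun:
    """Parameters recorded for one benchmark scenario."""
    activity: str                      # "ingest", "export" or "preservation"
    data_volume_tb: float              # data processed during the run
    file_count: int
    execution_hours: float
    total_cost_eur: float              # billed while the scenario ran (hypothetical)
    baseline_cost_eur_per_hour: float  # idle platform cost (hypothetical)


def cost_per_tb(run: ScenarioRun) -> float:
    """Scenario overhead (total minus baseline) normalised per TB processed."""
    baseline = run.baseline_cost_eur_per_hour * run.execution_hours
    overhead = run.total_cost_eur - baseline
    return overhead / run.data_volume_tb


if __name__ == "__main__":
    ingest = ScenarioRun("ingest", 50, 50_000, 8, 260.0, 7.5)
    print(f"{ingest.activity}: {cost_per_tb(ingest):.2f} EUR per TB")
```

Keeping the metric per activity type makes it possible to compare ingest, export and preservation runs separately before adding short-term and long-term storage costs.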
Cost Projections
Portability, Open Standards, Data Sovereignty
Open Standards, Open Specifications, Open Source: Deployment
Open Standards, Open Specifications, Open Source: Interaction
BagIt & BDBags
Open Standards, Open Specifications, Open Source: Content
FAIR Forever, Digital Preservation, Long-term Access and Reuse
Standards, External Assessment and Certification
Self Assessment
Peer Review
Formal Certification
Environmental Sustainability
Making LTDP more environmentally sustainable
• Keep less
• e.g. appraisal, don’t digitize everything
• Do less
• e.g. minimum effort ingest, don’t normalise
• Make smarter use of storage
• e.g. deep archive, small footprint access copies
• Make more efficient use of IT resources
• e.g. don’t leave idle servers running
• Use environmentally friendly infrastructures
• e.g. cloud with renewable energy
https://www.dpconline.org/blog/is-digital-preservation-bad-for-the-environment
https://dash.harvard.edu/handle/1/40741399
Cloud is greener than you might think
https://www.google.com/about/datacenters/gallery/#hamina-exterior-landscape
https://cloud.withgoogle.com/region-picker/ https://www.google.com/about/datacenters/gallery/#st-ghislain-solar-panels
Cloud providers can achieve very high energy efficiency
https://datacenters.lbl.gov/sites/default/files/Masanet_et_al_Science_2020.full_.pdf
https://www.stickermule.com/marketplace/3442-there-is-no-cloud
Strategy for energy efficiency and low carbon footprint
• Users can choose what processing to do on their
content
• Deep archive storage for infrequently accessed
data
• Serverless computing: only consume resources
when needed
• Deployment into energy efficient data centres
• Allow applications to run near to the data
Summary
• The scale of scientific datasets means that new approaches to LTDP are necessary
• Automation, microservices, serverless computing and cloud IaaS are a powerful combination
• Efficiency and costs can be measured, managed and optimized (€ and carbon)
• Reduced consumption and choice over cloud location/provider help environmental sustainability
• Open standards, open specifications and open source are key to portability and interoperability
https://www.archiver-project.eu/
QUESTIONS?
hello@arkivum.com
www.arkivum.com
matthew.addis@arkivum.com orcid.org/0000-0002-3837-2526 www.arkivum.com
Libnova - CSIC - University of Barcelona -
Giaretta Associates - AWS - Voxility - Bidaidea
LIBNOVA USA: 14 NE First Ave (2nd floor), Miami, Florida 33132, USA. Tel: +1 844-894-6532
LIBNOVA EU: Paseo de la Castellana 153, 28046 Madrid, Spain. Tel: +34 91 449 08 94
contact@libnova.com
PILOT PHASE KICK-OFF EVENT
2021-11-29
Big thanks to the ARCHIVER team, the EU representatives and the LIBNOVA consortium team.
How are you and the ARCHIVER project benefiting LIBNOVA?
Market
Time to market: Shorter development lifecycle leading to a faster time to market (from 5 to 2 years).
Market visibility: Several EU/USA universities and a large European pharmaceutical company signing contracts.
Product
Accelerated innovation: Being able to work with the 4 buyers maximizes the chances of success in the first iteration.
Standards: We are bringing several previously unknown standards on board.
Team
EUnthusiasm: LIBNOVA consortium’s team feels enthusiastic about being part of something larger, relevant to the EU. The EU is leading the R&D in this field.
Happiness: Everyone has been working really hard, but is really happy about it!
How are we benefiting the community?
Making the practice more efficient
- Direct: Optimized storage costs, low operational costs.
- Indirect: Less time/resources for producers/researchers to use it.
Making it available to a broader audience
- Demystifies preservation. Easy to understand and to use.
- Opens the practice to large volume datasets and to more advanced organizations.
Making it easier to apply best practices
- Fully conformant to ISO 14721 and ISO 16363 (and several other standards)
- Full support for FAIR/TRUST data models, workflows, etc.
How are we benefiting the community?
Reducing the environmental impact of the community's preservation activities
- Reuse what you have: on-premise deployments using existing infrastructure are possible.
- Consume only when needed: resources are consumed only when there is something to do. Scaling from
36 Kubernetes pods to ~5000 in 32 minutes, processing the workload and then going back to 36 pods.
- The “environmental impact” topic is included in every architecture specification/analysis. The team is
proud of many small and smart optimizations, for example in the hashing algorithm.
- Using carbon-neutral providers and data centres.
Contributing to European digital sovereignty, digital development and leadership
- Cloud provider independence
- Decouples content from providers
- Open to the emergence of EU-based cloud providers.
Thanks again!
Any questions?
Closing remarks
João Fernandes (CERN)
Closing Remarks
Upcoming Milestones for Buyers / Early Adopter organisations
● Today: Pilot kick-off meeting
● January 2022: Full capacity pilot platforms available
● April - June 2022: Project Review meetings with demos of the resulting services
○ Webinars for training administrators and end users in the Pilot platforms
Upcoming Public Events (February - June 2022)
● Industry related webinars about the advantages of PCP projects - February 2022
○ Procurement of research and development of new innovative solutions before they are commercially available
● Thematic webinars for potential Early Adopters of ARCHIVER - March-May 2022
Dates and Registration to be announced at https://archiver-project.eu/
www.archiver-project.eu
@ArchiverProject
company/archiver-project
Join our community!
Thank you!

Contenu connexe

Tendances

RNP Cloud Infrastructure model, services and challenges
RNP Cloud Infrastructure model, services and challengesRNP Cloud Infrastructure model, services and challenges
RNP Cloud Infrastructure model, services and challengesEUBrasilCloudFORUM .
 
To Choose, to Check, to Verify: The Quest for Open AV Standards
To Choose, to Check, to Verify: The Quest for Open AV StandardsTo Choose, to Check, to Verify: The Quest for Open AV Standards
To Choose, to Check, to Verify: The Quest for Open AV StandardsErwin Verbruggen
 
Joining Forces in Digitisation, Storage and Access. An interim assessment aft...
Joining Forces in Digitisation, Storage and Access. An interim assessment aft...Joining Forces in Digitisation, Storage and Access. An interim assessment aft...
Joining Forces in Digitisation, Storage and Access. An interim assessment aft...Brecht Declercq
 
Progress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectProgress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectHelix Nebula The Science Cloud
 
Europeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday GenereuxEuropeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday GenereuxEuropeana Newspapers
 
Phidias: Steps forward in detection and identification of anomalous atmospher...
Phidias: Steps forward in detection and identification of anomalous atmospher...Phidias: Steps forward in detection and identification of anomalous atmospher...
Phidias: Steps forward in detection and identification of anomalous atmospher...Phidias
 
Open Access Testbed Facilities Available to Software Companies in Europe
Open Access Testbed Facilities Available to Software Companies in EuropeOpen Access Testbed Facilities Available to Software Companies in Europe
Open Access Testbed Facilities Available to Software Companies in EuropeTEST Huddle
 
Refinement of Digitised Newspapers
Refinement of Digitised NewspapersRefinement of Digitised Newspapers
Refinement of Digitised Newspaperscneudecker
 
Intelligent tools-mitja-jermol-2013-bali-7 may2013
Intelligent tools-mitja-jermol-2013-bali-7 may2013Intelligent tools-mitja-jermol-2013-bali-7 may2013
Intelligent tools-mitja-jermol-2013-bali-7 may2013MediaMixerCommunity
 
Experience in managing service portfolio by Pasquale Pagano
Experience in managing service portfolio by Pasquale PaganoExperience in managing service portfolio by Pasquale Pagano
Experience in managing service portfolio by Pasquale PaganoBlue BRIDGE
 
Latif ladid gen6 overview
Latif ladid gen6 overviewLatif ladid gen6 overview
Latif ladid gen6 overviewGlobalForum
 
Presentation of Hans-Jörg Lieder, BnF Information Day
Presentation of Hans-Jörg Lieder, BnF Information DayPresentation of Hans-Jörg Lieder, BnF Information Day
Presentation of Hans-Jörg Lieder, BnF Information DayEuropeana Newspapers
 
ICN in the IRTF and IETF
ICN in the IRTF and IETFICN in the IRTF and IETF
ICN in the IRTF and IETFDirk Kutscher
 
20171021 outsourced digitisation guideline
20171021 outsourced digitisation guideline20171021 outsourced digitisation guideline
20171021 outsourced digitisation guidelineBrecht Declercq
 
Presentation of Clemens Neudecker, BnF Information Day
Presentation of Clemens Neudecker, BnF Information DayPresentation of Clemens Neudecker, BnF Information Day
Presentation of Clemens Neudecker, BnF Information DayEuropeana Newspapers
 
Quality and capacity expansion of thematic services in EOSC-SYNERGY
Quality and capacity expansion of thematic services in EOSC-SYNERGYQuality and capacity expansion of thematic services in EOSC-SYNERGY
Quality and capacity expansion of thematic services in EOSC-SYNERGYJisc
 

Tendances (20)

RNP Cloud Infrastructure model, services and challenges
RNP Cloud Infrastructure model, services and challengesRNP Cloud Infrastructure model, services and challenges
RNP Cloud Infrastructure model, services and challenges
 
To Choose, to Check, to Verify: The Quest for Open AV Standards
To Choose, to Check, to Verify: The Quest for Open AV StandardsTo Choose, to Check, to Verify: The Quest for Open AV Standards
To Choose, to Check, to Verify: The Quest for Open AV Standards
 
Joining Forces in Digitisation, Storage and Access. An interim assessment aft...
Joining Forces in Digitisation, Storage and Access. An interim assessment aft...Joining Forces in Digitisation, Storage and Access. An interim assessment aft...
Joining Forces in Digitisation, Storage and Access. An interim assessment aft...
 
Progress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP ProjectProgress of the Helix Nebula Science Cloud PCP Project
Progress of the Helix Nebula Science Cloud PCP Project
 
The OCRE project
The OCRE projectThe OCRE project
The OCRE project
 
Europeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday GenereuxEuropeana Newspapers LFT Infoday Genereux
Europeana Newspapers LFT Infoday Genereux
 
EOSC-hub Week - OCRE: Stimulating the Uptake of Commercial Digital Services i...
Archiver pilot phase kick off Award Ceremony

  • 1. PILOT PHASE KICK-OFF EVENT AND AWARD CEREMONY 29 November 2021 Contact: info@archiver-project.eu Project website: www.archiver-project.eu
  • 2. Welcome to the Pilot phase! Started in January 2019, ARCHIVER is a unique initiative currently running in the EOSC framework that is competitively procuring R&D services for archiving and digital preservation of data generated in the petabyte range. Between December 2020 and August 2021, three consortia worked on innovative prototype solutions for long-term data preservation, in close collaboration with CERN, EMBL-EBI, DESY and PIC.
  • 3. The selection process for proceeding to the next phase is now over, and the consortia selected to continue with the pilot phase will be officially announced at the public ceremony today!
  • 4. Event Outline 14:30 - 14:40: Welcome from Sergey Yakubov (DESY); 14:40 - 15:00: Project overview / update - João Fernandes (CERN); 15:00 - 15:30: Expected outcomes of the Pilot Phase - Buyers Group representatives (CERN, DESY, EMBL-EBI, PIC); 15:30 - 15:40: Interactive brainstorming; 15:40 - 15:50: Break. Award ceremony 15:50 - 16:20: Presentation from consortium 1; 16:20 - 16:50: Presentation from consortium 2; 16:50 - 17:00: Closing remarks - Sergey Yakubov (DESY)
  • 5. HOUSE KEEPING This event is being recorded in its entirety. A link to the full recording will be shared with participants afterwards. Two Q&A sessions are foreseen, before the break and towards the end: keep your questions and use the chat! Please don’t activate your microphone or video unless the host gives you permission.
  • 6. Welcome! Sergey Yakubov – DESY Deutsches Elektronen-Synchrotron - DESY Ein Forschungszentrum der Helmholtz-Gemeinschaft
  • 13. Project overview / update Pilot Phase Kick-Off João Fernandes (CERN) Contact: joao.fernandes@cern.ch
  • 14. ARCHIVER Project. Focus: Archiving and Data Preservation Services using cloud services available via the European Open Science Cloud (EOSC). Procurement R&D budget: 3.4M euro; total budget: 4.8M. Starting date: 1 January 2019. Duration: 42 months. Coordinator: CERN (Lead Procurer). Consortium: Buyers Group (BG) - public organisations committing funds to contribute to a joint R&D procurement, research data use cases and R&D testing effort; Experts - partner organisations bringing expertise in requirement assessment and promotion activities. ARCHIVER has received funding from the European Union’s H2020 Research & Innovation programme under Grant Agreement No 824516.
  • 15. Progress Beyond the State of the Art
  • 16. R&D Scope. Layer 1, Storage/Basic Archiving/Secure backup: data integrity/security; cloud/hybrid deployment; data volume in the PB range; high, sustained ingest data rates in Gb/s; ISO certification: 27000, 27040, 19086 and related standards; archives connected to the GEANT network. Layer 2, Preservation: OAIS conformant services: data readability formats, normalization, obsolescence monitoring, file fixity, authenticity checks, etc.; ISO 14721/16363, 26324 and related standards. Layer 3, Baseline user services: search, discover, share, indexing, data removal, etc.; access under Federated IAM. Layer 4, Advanced services: high-level services: visual representation of data (domain specific), reproducibility of scientific analyses, etc. Scientific use case deployments (https://www.archiver-project.eu/deployment-scenarios): EMBL 1 – FIRE; EMBL 2 – Cloud Caching; PIC 1 – Large File Storage; PIC 2 – Mix File Storage; PIC 3 – Data Distribution; DESY 1 – Individual Scientist; DESY 2 – Petra III Experiment; DESY 3 – EUXFEL Experiment; CERN 1 – CERN Open Data; CERN 2 – CERN Digital Memory. ARCHIVER “current state of the art” report in the context of the EOSC: https://doi.org/10.5281/zenodo.3618215
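The fixity checks named under the Preservation layer on slide 16 can be illustrated with a minimal sketch. It assumes SHA-256 as the checksum and a digest recorded at ingest time; the slides do not say which algorithm or tooling the consortia use, and the function names `sha256_of` and `fixity_ok` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_digest):
    """Compare the current digest with the one recorded at ingest (assumed to exist)."""
    return sha256_of(path) == recorded_digest.lower()
```

A repository would typically run such a comparison periodically and flag any file whose digest no longer matches the recorded value.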
  • 18. Prototype Phase Highlights • Functional objectives of the Prototype phase were successfully met. • R&D challenge and scientific use case requirements were globally understood. • Systematic and structured feedback between the Buyers and Contractors allowed the Contractors to fully understand the R&D challenge and the Buyer organizations to fully understand the service capabilities. • Prototype Phase organized into 13 bi-weekly sprints in an Agile-like process with a regular rhythm of feedback and adjustment between Buyers and Contractors. • CERN, EMBL-EBI, DESY & PIC allocated significant effort to assessing and testing the prototypes, ingesting several hundred TB of data.
  • 19. From Prototypes to Pilot • R&D produced by the Contractors • More than 200 tests executed during the Prototype phase across three platforms • Technical assessment in two steps: • R&D produced during the Prototype phase completed and of sufficient quality • Mini-competition (Call-off) for Pilot phase services R&D • Main criteria considered in the R&D review process: • Performance to scale correctly in the PB data region • R&D validation progressing from functional to “Go-To-Market” ready • Expertise in supporting Data Stewards in achieving certification of scientific repositories • Clear commercialisation plans for the resulting services after the project end
  • 20. Pilot Phase Challenges • Deployment model: on-premise vs cloud, portability, migration between different public cloud infrastructures, strategies to avoid vendor lock-in, disaster recovery • Advanced AAI & access management; network performance: 100 TB/day • Scalability and elasticity: high volumes for data ingestion and data recall • Security: simulated cyberattacks • Layer-4 reproducibility of scientific analyses independent of infrastructure • FAIRness measurements: F-UJI tool: https://www.f-uji.net/ • Total Cost of Ownership (TCO): SaaS on cloud vs SaaS on-premise
  • 21. Early Adopters • Participants: • Demand-side public sector organisations • Key advantages: • Assess if resulting services address their archiving and preservation needs • Contribute to shaping the R&D carried out in the project, identify additional use cases • Have the option to purchase pilot-scale services by the end of the project https://archiver-project.eu/early-adopters-programme
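To relate the 100 TB/day network target on slide 20 to the Gb/s ingest rates mentioned for Layer 1, a short conversion helps. This is a sketch under stated assumptions: decimal terabytes (1 TB = 10^12 bytes), a transfer spread evenly over 24 hours, and no protocol overhead; the function name `sustained_rate_gbps` is hypothetical.

```python
def sustained_rate_gbps(terabytes_per_day: float) -> float:
    """Convert a daily volume in decimal terabytes to an average rate in gigabits per second."""
    bits_per_day = terabytes_per_day * 1e12 * 8  # bytes -> bits
    seconds_per_day = 24 * 60 * 60
    return bits_per_day / seconds_per_day / 1e9


print(f"{sustained_rate_gbps(100):.2f} Gb/s")  # prints "9.26 Gb/s"
```

Under these assumptions the target corresponds to roughly 9.3 Gb/s sustained around the clock, so any real link would need headroom above that figure to absorb bursts and overhead.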
  • 22. A Minimum Viable EOSC • EOSC Association established to govern the European Open Science Cloud • Ambition to develop a “Web of FAIR Data and Services” for science in Europe across multiple domains • Provide access to services for storage, computation, analysis, preservation, etc. based on standards, combining data and services “FAIR Forever Report: A study to remember”, with an explicit mention of the ARCHIVER activities: → https://www.dpconline.org/blog/fair-forever-a-fair-study-to-remember EOSC Long-term Data Preservation Task Force charter: → https://www.eosc.eu/sites/default/files/tfcharters/eosca_tflongtermdatapreservation_draftcharter_20210614.pdf
  • 23. ARCHIVER & EOSC ● Broad pan-European requirement analysis of the research sector ○ Analysis considered in the competitive R&D tender ○ Technical and organisational measures aligned with European legislation in the services being developed (by default & by design) ● Early Adopters Programme established ○ Additional use cases further expanding the set of supported scientific domains ○ Publicly funded research actors external to the ARCHIVER consortium ● Model to facilitate procurement of sustainable pilot services ○ Consortium members and Early Adopter organisations ○ Beyond the lifetime of the project ARCHIVER is the only EOSC-related H2020 project focusing on Archiving & LTDP services for PB-scale datasets across multiple research domains and countries.
  • 24. Summary ● The R&D challenge of digital archiving goes beyond data storage: keep intellectual control of data and associated products for decades, make research outputs reusable ● Extending FAIR to research-associated products: software, workflows, services and even infrastructures ● Acting as a template to commoditise archiving and preservation at scale in research domains with expert European SMEs ● ARCHIVER is promoting a sustainable model with services that will exist beyond the project lifetime in the context of the EOSC ● ARCHIVER pilot phase: validating services with end-users and early adopter organisations to determine suitability to their needs.
  • 25. 14:30 - 14:40: Welcome from Sergey Yakubov (DESY) 14:40 - 15:00: Project overview / update - João Fernandes (CERN) 15:00 - 15:30: Expected outcomes of the Pilot Phase - Buyers Group representatives (CERN, DESY, EMBL-EBI, PIC) 15:30 - 15:40: Interactive brainstorming 15:40 - 15:50: Break Award ceremony 15:50 - 16:20: Presentation from consortium 1 16:20 - 16:50: Presentation from consortium 2 16:50 - 17:00: Closing remarks - João Fernandes (CERN) Event Outline 25
  • 27. Expected outcomes of the Pilot Phase Buyers Group representatives CERN, DESY, EMBL-EBI, PIC
  • 28. CERN Requirements and Expectations Tibor Simko, Jean-Yves Le Meur, Antonio Vivace, Ignacio Peluaga 28
  • 29. CERN Open Data: From data preservation to data reuse 29 https://opendata.cern.ch https://www.reana.io CERN Open Data portal ● more than 2.5 petabytes of particle physics data ● education use cases ● independent research use cases REANA reproducible analysis platform ● run containerised computational workflows on remote clouds ● CWL, Snakemake, Yadage ● HTCondor, Kubernetes, Slurm Nature 533 (2016), 452–454 https://doi.org/10.1038/533452a
  • 30. CERN Open Data: From data preservation to data reuse CMS Higgs-to-four-lepton example analysis; CMS Higgs-to-tautau example analysis ● looking at “Compute” possibilities for content preserved in “Storage” ● reproducing several open data analysis examples ● reprocessing published simulated datasets ● challenges: ○ use containerised workflows (Docker, Kubernetes) ○ serve database content (CVMFS) Running containerised workflows:
        rule analyze_data:
            input:
                config["data"],
                config["code"],
                "results/scramdone.txt"
            output:
                "results/DoubleMuParked2012C_10000_Higgs.root"
            container:
                "docker://cmsopendata/cmssw_5_3_32"
            shell:
                "source /opt/cms/cmsset_default.sh "
                "&& cd CMSSW_5_3_32/src "
                "&& eval `scramv1 runtime -sh` "
                "&& cd HiggsExample20112012/HiggsDemoAnalyzer "
                "&& cd ../Level3 "
                "&& cmsRun demoanalyzer_cfg_level3data.py"
  • 31. CERN Digital Memory: the OAIS Archive - Status: - Tool to harvest the main CERN information systems and create SIPs (following the BagIt specification) - Includes publications, preprints, presentations, photos, videos, and more - Control interface to run “on-demand” archiving for people leaving CERN - Challenges: - Duplication - Authorships - Integrity - Versioning - Running preservation services / file conversions to create AIPs
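As an illustration of what a SIP following the BagIt specification looks like on disk, and of how the integrity challenge can be checked, here is a minimal sketch using only the Python standard library. It is not the CERN harvesting tool: the function names, the bag name and the payload file are hypothetical, and only the required elements of a bag (the `bagit.txt` declaration, the `data/` payload directory and a `manifest-sha256.txt` payload manifest) are written.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> None:
    """Write payload files under data/ plus the bag declaration and manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}\n")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "".join(manifest_lines), encoding="utf-8"
    )


def verify_bag(bag_dir: Path) -> bool:
    """Recompute every payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "sip-example"  # hypothetical bag name
    make_bag(bag, {"preprint.pdf": b"example payload"})
    intact = verify_bag(bag)
    # Simulate corruption of a payload file after the bag was created.
    (bag / "data" / "preprint.pdf").write_bytes(b"tampered")
    tampered_detected = not verify_bag(bag)

print(intact, tampered_detected)
```

A production workflow would more likely rely on an established BagIt library and add bag metadata; the sketch only shows why a checksum manifest makes silent corruption detectable at ingest time.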
  • 32. CERN Digital Memory: feeding Archiver.eu Archiver.eu solutions to ingest CERN SIPs and store artifacts (AIPs, DIPs...)
  • 33. DESY Requirements and Expectations Sergey Yakubov, Martin Gasthuber
  • 34. Main sources of data to be archived and preserved - 2021 >30 PB annual 3-6 PB annual ● two sites ○ Hamburg ○ Zeuthen (near Berlin) ● science fields ○ particle physics (LHC, Belle 2, …) ○ photon science (EuXFEL, Petra III, FLASH) ○ accelerator research (wakefield, Petra IV, …) ○ astrophysics ● all areas “data intensive science” 34
  • 35. use cases - selection and relations ● photon science focussed - pushed by 3 local independently running accelerator facilities ○ ground zero of the data deluge ● three, partly nested, scaled use cases ○ (1) individual scientist/small working group ○ (2) large working group/experiment ○ (3) facility level (with data policy) ● existing entities/actors with data ownership and stewardship responsibilities
  • 36. More tangible / general expectations ● core functionalities completed in the last phase ● focus on scaling and stability validations ● Tape integration - scaling requires this ‘low cost’ storage technology ● hybrid deployments - i.e. MD and initial data copy on-site, secondary data and MD backup (cold & encrypted) off-site - targeting: ○ off-site data co-located with public cloud computing for ‘open data’ and related analysis tasks ○ primary off-site data attractive to communities/groups not having an ‘IT-Home’
  • 37. Case 1 - individual scientist / small working group ● decent volume & bandwidth requirements ○ a few tens of TB, 100 MB/s, 10K objects - per archive ○ ~0.2-0.5 PB annual ● access mainly via a web browser (GUI) ○ pre-defined MD schema & policy and templates by admins ● scientist (in person) - owner and archivist ● challenges ○ authentication - federated AAI ○ metadata scheme - hierarchy of schemes, mandatory fields, controlled extension ○ DOI / publication ○ local & hybrid deployments ○ open-data management & access
  • 38. Case 2 - mid size - Petra III experiment ● Case 1 plus… ● increased volume and bandwidth requirements ○ ~100 TB, 1-2 GB/s, >150K objects per archive ○ ~3-6 PB annual ● mainly API access (non interactive) ○ agent - automated life cycle process integration ● challenges ○ templating - i.e. MD, lifetimes, access rights, etc. ■ compliant with data policy ○ data storage costs start to be important ■ volume and bandwidth amplification inside your solution - rising importance
  • 39. Case 3 - facility level - EuXFEL ● Case 2 plus... ● largest scale ○ a few hundred TB up to 1-2 PB, 2-10 GB/s, >30K objects - per experiment ■ 1.3 PB per day raw data production seen recently - and growing ○ >30 PB annual ● non interactive - full API access ○ full templating for creation and filling process of the archive ● challenges ○ scale ○ data storage costs become the most important criterion ■ volume and bandwidth amplification inside your solution becomes the dominating criterion
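The three DESY cases pair each archive volume with a bandwidth. A rough way to compare them is to compute how long one archive takes to move at the quoted rate. The sketch below picks one illustrative point from each range (the smallest case is read here as 10 TB, Case 3 as 1 PB at 10 GB/s) and assumes decimal units; those picks are assumptions for illustration, not values stated by DESY.

```python
# Illustrative (size in bytes, rate in bytes/s) pairs picked from the ranges
# on the three use-case slides. The exact picks are assumptions.
CASES = {
    "Case 1 (individual scientist)": (10e12, 100e6),  # 10 TB at 100 MB/s
    "Case 2 (Petra III experiment)": (100e12, 1e9),   # 100 TB at 1 GB/s
    "Case 3 (EuXFEL facility)": (1e15, 10e9),         # 1 PB at 10 GB/s
}


def transfer_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s / 3600


hours = {name: transfer_hours(size, rate) for name, (size, rate) in CASES.items()}
for name, h in hours.items():
    print(f"{name}: {h:.1f} h")

# Sustained rate implied by the 1.3 PB/day raw production quoted for EuXFEL.
raw_gb_per_s = 1.3e15 / 86400 / 1e9
print(f"EuXFEL raw production: {raw_gb_per_s:.1f} GB/s")
```

With these picks every case needs roughly the same wall-clock time per archive, which suggests the bandwidth requirements were scaled together with the volumes. The quoted raw production rate, taken over a full day, sits above the upper end of the 2-10 GB/s range.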
  • 40. EMBL-EBI Requirements and Expectations Justin Clark-Casey 40
  • 41. About EMBL-EBI ● “The home for big data in biology” ● Datasets, datasets, datasets: genomics, proteomics, metagenomics, transcriptomics, imageomics … ○ You (researchers) submit them ○ We curate them ○ We provide them ○ You download them/(cloud)-access them/analyze them ○ We archive them ● Part of the European Molecular Biology Laboratory (27 member states, 1800 ppl, 80 independent research groups) ○ An intergovernmental organization like CERN. 41
  • 42. Data growth at EMBL-EBI 42
  • 43. EMBL-EBI Archiver Pilot Phase Plan ● Pilot preservation of selected large datasets from across EMBL-EBI as part of an integrated data management system 43
  • 44. EMBL-EBI Archiver Pilot Phase Plan ● Deploy a pilot in a real storage environment ● Taking forward the EMBL on FIRE use case ● No new feature requests ○ But may use features developed for other buyers (e.g. flexible metadata schemas) ● Assess in pilot conditions ○ Ease of integration ○ Scalability ○ Security ○ Robustness ○ Cost 44
  • 45. PIC Requirements and Expectations Jordi Casals, Manuel Delfino, Jordi Delgado
  • 46. Port d’Informació Científica Use cases will be based on MAGIC Telescope data Observatorio del Roque de los Muchachos (La Palma, Canary Islands) • Collecting data 365 days a year • Except when a volcano starts activity Hint: NOW • 300TB per year for ranges of 5-6 years • Random recalls during the period ALBA Synchrotron • More than 10 Beamlines (and growing for next years) • Datasets ranging from 200TB up to 4PB • Internal and external scientific users In collaboration with 46
  • 47. Pilot Requirements ● Petabyte level Storage → functional, reliable, good performance, reasonable cost ○ From 1 PB in 2021 to 15 PB in 2025 ○ Scalability testing → uploads and downloads with production environments, sizes and bandwidth ● Integrate and automate processes with production systems using the services' CLI or API-based scripts ● Create custom metadata schemas automatically on upload from data sources ○ Run production processes based on metadata searches ● Give access to users inside the collaboration with different permissions based on roles in the experiment ● Test in-archive data processing using co-located Cloud ○ Enables automatic processing of uploaded data and avoids moving the data twice ● Test future reprocessing for reusability, based on custom containers, Python notebooks or any other system that may help keep the data from expiring ● Analyze production costs to evaluate whether it would be a usable option Everything will be tested by both PIC and ALBA and the results pooled to better evaluate the final outcome and the possibilities of the resulting services
  • 50. Go to www.menti.com and use the code 3228 5110 50
  • 56. Bringing archived data to life SCALABLE AND SUSTAINABLE LONG TERM DIGITAL PRESERVATION OF SCIENTIFIC DATASETS Matthew Addis Arkivum ARCHIVER Pilot Kick-Off Event 29 Nov 2021
  • 57. Contents • Sustainability • Scalability, Performance and Cost Effectiveness • Portability, Open Standards, Data Sovereignty • FAIR Forever, Digital Preservation, Long-term Access and Reuse • Environmental Sustainability • Summary
  • 58. Scalability, Performance and Cost Effectiveness
  • 59. ARCHIVER: requirements https://zenodo.org/record/4574234#.YW2NyqDTWJQ
  • 60. Demand side requirements
  • 61. Long-term digital preservation (LTDP) factories to support sustainable FAIR data
  • 62. LTDP factories that scale • Automation • Minimal Effort Ingest / Minimal Viable Preservation • Scalable and efficient use of IT resources • Serverless computing • Microservices • Scale-out parallel processing • Robust workflows • Failures will happen • Record and audit everything Eugen Stoll, CC BY-NC 2.0, https://flic.kr/p/5c8cz
  • 63. Microservices, serverless computing, cloud infrastructure
  • 64. Ingest and archiving of a 50 TB dataset: 50,000 HDF5 files (50 TB) ingested and stored at a rate of 150 TB/day [Figure: CPU cores over time, from “Ingest starts” to “All done”]
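The figures on this slide imply a few derived quantities that are easy to check. The sketch below uses only the numbers quoted (50 TB, 50,000 files, 150 TB/day) and assumes decimal units; the variable names are illustrative.

```python
# Figures quoted on the slide; decimal units assumed (1 TB = 1000 GB).
DATASET_TB = 50
FILE_COUNT = 50_000
RATE_TB_PER_DAY = 150

ingest_hours = DATASET_TB / RATE_TB_PER_DAY * 24
mean_file_gb = DATASET_TB * 1000 / FILE_COUNT
files_per_second = FILE_COUNT / (ingest_hours * 3600)

print(f"{ingest_hours:.1f} h, {mean_file_gb:.1f} GB per file, "
      f"{files_per_second:.2f} files/s")
```

At the quoted rate the whole dataset would be ingested in about eight hours, with files averaging 1 GB each.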
  • 65. Costs and Benchmarking • Execute real world scenarios on GCP • Record parameters • execution time • data volumes, number of files • type of activity (ingest, export, preservation) • Extract baseline costs from GCP • Extract overhead of running scenario • Normalise to create a ‘cost per TB’ metric • Calculate short-term and long-term storage • Upload/export buckets, caching, archive buckets
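The benchmarking steps above can be sketched as a small calculation. This is one possible reading, not Arkivum's actual method: it assumes the baseline is what the platform costs per hour when idle, that the overhead of a scenario is the billed amount minus that baseline for the run's duration, and that the overhead is normalised by the data volume. The class, its field names and the numbers in the example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScenarioRun:
    """Parameters recorded for one benchmark scenario (hypothetical fields)."""
    activity: str                 # e.g. "ingest", "export", "preservation"
    data_tb: float                # data volume processed
    files: int                    # number of files processed
    hours: float                  # execution time
    billed_eur: float             # total cloud bill for the run window
    baseline_eur_per_hour: float  # assumed cost of the idle platform

    def overhead_eur(self) -> float:
        """Cost attributable to the scenario itself (assumed definition)."""
        return self.billed_eur - self.baseline_eur_per_hour * self.hours

    def cost_per_tb(self) -> float:
        """Overhead normalised by volume, giving a 'cost per TB' metric."""
        return self.overhead_eur() / self.data_tb


# Made-up numbers, for illustration only.
run = ScenarioRun("ingest", data_tb=50, files=50_000, hours=8,
                  billed_eur=500.0, baseline_eur_per_hour=10.0)
print(f"overhead: {run.overhead_eur():.2f} EUR, "
      f"cost per TB: {run.cost_per_tb():.2f} EUR")
```

Keeping the activity type and file count alongside the cost makes it possible to compare scenarios later, for example ingest against export, or many small files against a few large ones.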
  • 66. Costs Projections
  • 67. Portability, Open Standards, Data Sovereignty
  • 68. Open Standards, Open Specifications, Open Source: Deployment
  • 69. Open Standards, Open Specifications, Open Source: Interaction Bagit & BDbags
  • 70. Open Standards, Open Specifications, Open Source: Content
  • 71. FAIR Forever, Digital Preservation, Long-term Access and Reuse
  • 73. Standards, External Assessment and Certification Self Assessment Peer Review Formal Certification
  • 74. Environmental Sustainability
  • 75. Making LTDP more environmentally sustainable • Keep less • e.g. appraisal, don’t digitize everything • Do less • e.g. minimum effort ingest, don’t normalise • Make smarter use of storage • e.g. deep archive, small footprint access copies • Make more efficient use of IT resources • e.g. don’t leave idle servers running • Use environmentally friendly infrastructures • e.g. cloud with renewable energy https://www.dpconline.org/blog/is-digital-preservation-bad-for-the-environment https://dash.harvard.edu/handle/1/40741399
  • 76. Cloud is greener than you might think https://www.google.com/about/datacenters/gallery/#hamina-exterior-landscape https://cloud.withgoogle.com/region-picker/ https://www.google.com/about/datacenters/gallery/#st-ghislain-solar-panels
  • 77. Cloud providers can achieve very high energy efficiency https://datacenters.lbl.gov/sites/default/files/Masanet_et_al_Science_2020.full_.pdf https://www.stickermule.com/marketplace/3442-there-is-no-cloud
  • 78. Strategy for energy efficiency and low carbon footprint • Users can choose what processing to do on their content • Deep archive storage for infrequently accessed data • Serverless computing: only consume resources when needed • Deployment into energy efficient data centres • Allow applications to run near to the data
  • 79. Summary
  • 80. Summary • The scale of scientific datasets means that new approaches to LTDP are necessary • Automation, microservices, serverless computing and cloud IaaS are a powerful combination • Efficiency and costs can be measured, managed and optimized (€ and carbon) • Reduced consumption and choice over cloud location/provider help environmental sustainability • Open standards, open specifications and open source are key to portability and interoperability https://www.archiver-project.eu/
  • 81. QUESTIONS? hello@arkivum.com www.arkivum.com matthew.addis@arkivum.com orcid.org/0000-0002-3837-2526
  • 82. Libnova - CSIC - University of Barcelona - Giaretta Associates - AWS - Voxility - Bidaidea
  • 83. LIBNOVA USA 14 NE First Ave (2nd floor) -Miami, Florida 33132, USA -Tel: +1 844-894-6532 LIBNOVA EU Paseo de la Castellana 153 -28046 Madrid, Spain -Tel: +34 91 449 08 94 contact@libnova.com PILOT PHASE KICK-OFF EVENT 2021-11-29
  • 84. Big thanks to the Archiver team EU Representatives and the LIBNOVA consortium team.
  • 86. How are you and the ARCHIVER project benefiting LIBNOVA? Market - Time to market: Shorter development lifecycle leading to a faster time to market (from 5 to 2 years). - Market visibility: Several EU/USA universities and a large European pharmaceutical company signing contracts. Product - Accelerated innovation: Being able to work with the 4 buyers maximizes the chances of success in the first iteration. - Standards: We are bringing several previously unknown standards on board. Team - EUnthusiasm: The LIBNOVA consortium’s team feels enthusiastic about being part of something larger, relevant to the EU. The EU is leading the R&D in this field. - Happiness: Everyone has been working really hard, but is really happy about it!
  • 87. How are we benefiting the community? Making the practice more efficient - Direct: Optimized storage costs, low operational costs. - Indirect: Less time/resources for producers/researchers to use it. Making it available to a broader audience - Demystifies preservation. Easy to understand and to use. - Opens the practice to large volume datasets and to more advanced organizations. Making it easier to apply best practices - Fully conformant to ISO 14721 and ISO 16363 (and several other standards) - Full support for FAIR/TRUST data models, workflows, etc.
  • 88. How are we benefiting the community? Reducing the environmental impact of the community's preservation activities - Reuse what you have: on-premise deployments using existing infrastructure are possible. - Consume only when needed: Resources are consumed only when there is something to do. Scaling from 36 Kubernetes pods to ~5000 in 32 minutes. Process the workload and then back to 36 pods. - The “environmental impact” topic is included in every architecture specification/analysis. The team is proud of many small and smart optimizations. For example, the hashing algorithm. - Using carbon-neutral providers and data centres. Contributing to European digital sovereignty, digital development and leadership - Cloud provider independence - Decouples content from providers - Open to the emergence of EU-based cloud providers.
  • 89. Thanks again!
  • 92. Closing Remarks Upcoming Milestones for Buyers / Early Adopter organisations ● Today: Pilot kick-off meeting ● January 2022: Full capacity pilot platforms available ● April - June 2022: Project Review meetings with demos of the resulting services ○ Webinars for training administrators and end users in the Pilot platforms Upcoming Public Events (February - June 2022) ● Industry related webinars about the advantages of PCP projects - February 2022 ○ Procurement of research and development of new innovative solutions before they are commercially available ● Thematic webinars for potential Early Adopters of ARCHIVER - March-May 2022 Dates and Registration to be announced at https://archiver-project.eu/ 92