The main goal of the EGI-EUDAT collaboration is to harmonise the two e-infrastructures in terms of technical interoperability, authentication, authorisation and identity management, policy, and operations. The main objective is to give end users seamless access to an integrated infrastructure that offers both EGI and EUDAT services, thereby pairing data with high-throughput computing resources. Selected user communities bring requirements and help prioritise them, so the integration activity has been driven by end users from the start. The use case lets a user of either e-infrastructure instantiate a VM on the EGI Cloud Federation to run a computational job that consumes data preserved on EUDAT resources. The results of the analysis can be staged back to EUDAT storage and, if needed, assigned persistent identifiers (PIDs) for future use. Implementing all the steps of this use case requires two integration activities between the infrastructures: (1) harmonisation of the authentication and authorisation models, and (2) definition and implementation of the interfaces between the EGI and EUDAT services involved.
Visit: https://www.eudat.eu/eudat-summer-school
Linking EUDAT services to the EGI Fed-Cloud - EUDAT Summer School (Hans van Piggelen, SURFsara, Michaela Barth, KTH)
1. www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
Linking EUDAT services with
EGI
Three use cases
Michaela Barth caela@kth.se
Hans van Piggelen hans.vanpiggelen@surfsara.nl
This work is licensed under the Creative
Commons CC-BY 4.0 licence
2. Overall EUDAT WP7 objectives
Ensure the interoperability of EUDAT with other
public and private e-Infrastructures
Provide European researchers and industries with
seamless access to data and computing resources
Pave the way towards the interoperability of
e-Infrastructure tools and services beyond H2020
4. Expected Benefits
Lowered barriers, easier access for users to combine
computation and data
Reduced operational and administrative efforts for the
e-infrastructures
Additionally pooling effect: augmentation of services
from existing communities using EGI/EUDAT
Computation on EGI Federated Cloud and HTC
EUDAT services for transfer, syncing, sharing, staging and
preservation of data
5. WP7 Task 7.2: Joint Access to Data, HTC
and Cloud Computing Resources
EGI-EUDAT collaboration started in March 2016 and
continues until at least March 2018
Aiming at a production cross-infrastructure service
Starting with concrete community pilots:
EPOS
ICOS
IS-ENES (very recently)
6. Harmonization on all levels
Technical
Interoperability (e.g. workflow execution, data
discoverability and provenance)
Authentication, Authorization and Identity (AAI)
management
Combination of respective service catalogues
Policies
Access policy
Long-term perspective
Operational policies
Operational
Operational tools, technologies, best practices
Security
SLAs
7. EGI/EUDAT AAI Interoperability
Once authenticated, users should see the EGI and EUDAT
services as offered by a single infrastructure
Breaking this down into smaller steps:
Allowing users to access EGI and EUDAT web
services with the same credentials
Allowing users to access EGI and EUDAT non-web
services with the same credentials
Attributes harmonisation
Enabling EGI services to delegate users'
credentials to EUDAT services and vice versa
Data privacy issues and
policy harmonisation
8. EGI/EUDAT AAI Interoperability
Work done so far:
Get an understanding of each other’s AAI layers
Breaking the task down into smaller steps
Started draft documentation
Enabling accounts for some feasibility tests
Next step:
Complete roadmap
9. User community pilots
EGI and EUDAT selected a set of relevant user
communities
Started by gathering requirements from the
user communities, together with their
prioritization of those requirements:
Definition of universal use case
Integration activity has been driven by the end
users from the start!
The identified user communities are prominent
European research infrastructures in the fields of
Earth Science (EPOS and ICOS), Bioinformatics
(BBMRI and ELIXIR) and Space Physics (EISCAT-3D)
10. Definition of the universal use case
Demo at EGI Community Forum 2015
11. User community pilots
EPOS:
Fostering worldwide interoperability in Earth Sciences
and providing services to a broad community of users
ICOS:
Creating web-based service (“Footprint tool”) at
ICOS Carbon Portal, providing on-demand
computing facilities
IS-ENES (just recently):
Performing on-demand climate data analytics for
climate research and climate change impact
communities
12. European Plate Observing System (EPOS)
EPOS is the integrated solid Earth Sciences research
infrastructure
Aims to be an effective coordinated European-
scale monitoring facility for solid Earth dynamics
Aims to establish a long-term plan to facilitate the
integrated use of data, models and facilities from
existing, and new distributed research infrastructures
(RIs), for solid Earth science
14. EPOS use case status
First stage of the use case:
Defining the best strategy for access to
Federated Cloud partners and identifying secure
and efficient data transfer protocols towards the
iRODS system
Now in second stage:
Integration with third-party data storage services
(EUDAT) and cloud computing resources (EGI)
19. ICOS Carbon Portal use case status
Virtual machines with attached block storage instantiated in the EGI
Federated Cloud.
Docker container for computations with local VM storage
Data transfer between VM and B2SAFE using B2STAGE instance at
PDC/KTH Stockholm
Storing of ICOS data tested on the B2SAFE system at KTH
Robot certificates installed to allow for further automation of the
workflow
Next steps:
ICOS data replication in B2SAFE and access via B2STAGE
Access to common storage for several VMs (via the EGI DataHub)
Load balancing to distribute computations/user requests across several
VMs
Improve documentation, with a clear user perspective!
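The VM-to-B2SAFE transfer step can be illustrated with a small helper that only builds a GridFTP client command line. This is a minimal sketch, assuming that the B2STAGE instance is reached through a GridFTP endpoint with the `globus-url-copy` client; the host name, iRODS zone, file paths and number of parallel streams are made-up placeholders, not values from the ICOS pilot, and the command is printed rather than executed.

```python
import shlex


def gridftp_copy_command(local_path, host, remote_path, port=2811, streams=4):
    """Build (but do not run) a globus-url-copy command line.

    local_path  -- absolute path of the file on the VM
    host, port  -- GridFTP endpoint (placeholder values in the example below)
    remote_path -- absolute target path behind the endpoint (placeholder)
    streams     -- number of parallel data streams (the -p option)
    """
    if not local_path.startswith("/") or not remote_path.startswith("/"):
        raise ValueError("both paths must be absolute")
    source = f"file://{local_path}"
    destination = f"gsiftp://{host}:{port}{remote_path}"
    return ["globus-url-copy", "-p", str(streams), source, destination]


# Hypothetical endpoint and paths, for illustration only.
cmd = gridftp_copy_command(
    "/data/out/footprint.nc",
    "b2stage.example.org",
    "/exampleZone/home/icos/footprint.nc",
)
print(shlex.join(cmd))
```

Running such a command for real also requires a valid credential on the VM, which is the role the robot certificates play in the automated workflow.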
Slide thanks to Margareta Hellström
20. European Network for Earth System
Modelling (IS-ENES)
Spawned from work on EGI-EUDAT interoperability in
WP7/WP8 and the ICOS Carbon Portal use case
developed therein
Goal: enabling computation on CMIP5/CMIP6 data
stored in the Earth System Grid Federation (ESGF)
infrastructure
Calculations will be performed using the EUDAT
Generic Execution Framework (GEF) workflow API
combined with EUDAT B2 services and EGI
FedCloud
Results will be sent to the climate4impact.eu platform
21. IS-ENES use case overview
Adaptation of the current ESGF
environment at CINES,
consisting of one ESGF data
node and one B2SAFE node;
no GEF yet
22. http://climate4impact.eu Slide thanks to Christian Pagé and Xavier Pivan
Motivations
Societal
Provide climate projections data to climate change impact
researchers, facilitators, practitioners
Ease access with better intuitive interfaces
Provide more common data formats
Generate tailored products from data processing workflows
23. Climate Research Community
Slide thanks to Christian Pagé and Xavier Pivan
IS-ENES: Current situation
Data available for scientific analysis: a very large trend
Limitations in data access means limitations in data
analytics and scientific results
Download locally then analyze: a workflow that cannot be
sustained
Climate researchers
Impact researchers
24. Practical Example: Climate Community
Federation
Service
• Temperature at 850 hPa field (Aggregated files 30 levels)
• 10 climate models
• 1960-1990 & 2040-2070 = 60 years = 21 915 days
• Daily fields = 1 field per day
• Global spatial scale 100 km resolution
TOTAL: 6 574 500 fields to download
~100 KB per 2D field = 626 GB
After the analysis post-processing
• Anomaly of the average of the two periods over a specific
country for each climate model
• Result: 10 times 2D fields over a small domain
• Estimated data size after post-processing: 1 MB
Data reduction...
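The download estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes an average year length of 365.25 days (which gives the slide's 21 915 days for 60 years), one field per day, model and pressure level, and binary units (1 GB = 1024 × 1024 KB), which is the convention that yields roughly 626 GB. The variable names are illustrative only.

```python
DAYS_PER_YEAR = 365.25   # assumption: average year length including leap days
YEARS = 30 + 30          # 1960-1990 and 2040-2070
MODELS = 10              # climate models
LEVELS = 30              # levels in the aggregated files
FIELD_SIZE_KB = 100      # approximate size of one 2D field
RESULT_SIZE_MB = 1       # estimated size after post-processing

days = round(YEARS * DAYS_PER_YEAR)            # 21915
fields = days * MODELS * LEVELS                # 6574500
total_kb = fields * FIELD_SIZE_KB

total_gb_binary = total_kb / 1024**2           # 1 GB taken as 1024 * 1024 KB
total_gb_decimal = total_kb / 1000**2          # 1 GB taken as 1000 * 1000 KB
reduction_factor = (total_kb / 1024) / RESULT_SIZE_MB

print(f"days:             {days}")
print(f"fields:           {fields}")
print(f"download (bin.):  {total_gb_binary:.1f} GB")
print(f"download (dec.):  {total_gb_decimal:.1f} GB")
print(f"reduction factor: about {reduction_factor:,.0f}x")
```

Whichever unit convention is used, the input is several hundred gigabytes while the result is about one megabyte, which is the argument for moving the computation to the data instead of downloading first.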
Slide thanks to Christian Pagé and Xavier Pivan
Current situation
25. IS-ENES use case plan
Steps:
1. Researcher finds data in B2SHARE using B2FIND, or provides PIDs/
URLs
2. Researcher performs Data Analytics of selected data using GEF
backend deployed on EGI FedCloud. Output is stored into EGI
Volume.
3. Results are sent back to B2SHARE/B2DROP/B2SAFE for the researcher
to download, or are passed to another GEF run for further calculations or
to generate a figure
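The three steps can be pictured as a small find, compute and stage-back pipeline. The sketch below is purely illustrative: it uses in-memory stand-ins instead of the real B2FIND, GEF or B2SHARE interfaces, and every class, function and identifier in it (`Catalogue`, `run_analytics`, `stage_back`, the `demo/...` names) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    pid: str             # PID or URL identifying the data (made-up values below)
    values: list[float]  # toy payload standing in for a climate field


@dataclass
class Catalogue:
    """In-memory stand-in for data discovery (step 1)."""
    records: dict[str, Dataset] = field(default_factory=dict)

    def add(self, dataset: Dataset) -> None:
        self.records[dataset.pid] = dataset

    def find(self, keyword: str) -> list[Dataset]:
        # Return matching datasets in a stable (sorted by PID) order.
        return [d for pid, d in sorted(self.records.items()) if keyword in pid]


def run_analytics(datasets: list[Dataset]) -> dict[str, float]:
    """Stand-in for the analytics job (step 2): mean value of each dataset."""
    return {d.pid: sum(d.values) / len(d.values) for d in datasets}


def stage_back(results: dict[str, float],
               store: dict[str, dict[str, float]],
               name: str) -> str:
    """Stand-in for step 3: keep the results under a name and return that name."""
    store[name] = dict(results)
    return name


catalogue = Catalogue()
catalogue.add(Dataset("demo/tas-model-a", [1.0, 2.0, 3.0]))
catalogue.add(Dataset("demo/tas-model-b", [2.0, 4.0]))
catalogue.add(Dataset("demo/pr-model-a", [9.0]))

selected = catalogue.find("tas")                    # step 1: discover data
results = run_analytics(selected)                   # step 2: compute near the data
result_store: dict[str, dict[str, float]] = {}
handle = stage_back(results, result_store, "demo/tas-means")  # step 3: stage back
print(handle, result_store[handle])
```

Because the staged result is addressable by name, it can be fed into a further analytics run in the same way, which mirrors the option of chaining another calculation in step 3.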
So far: Using EGI FedCloud IaaS infrastructure with VMs with CPU,
RAM and storage, Docker engine; testing dockerized jOCCI API for
automatic instantiation of VMs
Next steps:
Data transfers via Globus GridFTP, getting test access on
B2STAGE instances (KTH-PDC, CINECA?, STFC?)
Follow ICOS example in testing interoperability B2SAFE-B2STAGE/
EGI
27. Challenges encountered so far
When implementing prototypes of use-cases of EPOS
and ICOS:
Scaling up
Managing co-existing support systems and channels
User-friendly documentation often missing or lacking
Steep learning curve for the user communities
3rd party dependencies, e.g. GridFTP
Large numbers of small files not yet suitable as input
On the plus side: personal contacts highly appreciated
28. Future continuation
Continued implementation of use cases
with result evaluation
Description of work and dataflow for the
use cases
Improved documentation
Final report (aim: end of Jan. 2018)
including recommendations for service
development and access policy
harmonization
29. www.eudat.eu
Authors
Michaela Barth, KTH
Hans van Piggelen, SURFsara
Contributors
Christian Pagé, CERFACS
Xavier Pivan, CERFACS
Margareta Hellström, LU
Ute Karstens, LU
Thank you!