4. The Federated Cloud
www.egi.eu EGI-InSPIRE RI-261323
A federation of Cloud resources from the public, academic and private sectors,
offering Cloud Services to all research communities
A ‘single’ cloud system to:
• scale
• integrate multiple providers irrespective of technology
• target the research community
Standards-based federation of IaaS clouds:
• Exposing a set of independent cloud services through a common
standards profile
• Allowing deployment of services across multiple providers and capacity
bursting
• Building on proven, world-class EGI core services
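The idea of deploying across multiple providers with capacity bursting can be illustrated with a toy placement routine. This is only a sketch under stated assumptions: the `Provider` class, the `place_with_bursting` function, the site names and the first-fit policy are all hypothetical and are not part of any EGI interface. Each request goes to the first provider in preference order that has enough free cores, so work overflows ("bursts") to the next provider once the preferred one is full.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    """Hypothetical resource provider with a simple free-core counter."""
    name: str
    free_cores: int


def place_with_bursting(requests: List[int],
                        providers: List[Provider]) -> List[Optional[str]]:
    """Place each request (cores needed) on the first provider, in
    preference order, that still has enough free cores.

    Returns the chosen provider name per request, or None when no
    provider in the federation can host it.
    """
    placements: List[Optional[str]] = []
    for cores in requests:
        # First-fit: earlier providers are preferred, later ones absorb bursts.
        target = next((p for p in providers if p.free_cores >= cores), None)
        if target is None:
            placements.append(None)
        else:
            target.free_cores -= cores
            placements.append(target.name)
    return placements


if __name__ == "__main__":
    federation = [Provider("home-site", 8), Provider("partner-site", 16)]
    print(place_with_bursting([4, 4, 4, 16], federation))
```

A real federation would take this decision from live information and accounting services rather than a local counter; the sketch only shows why a common interface across providers makes the overflow step possible.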
5. Usage Model
• Total control over deployed applications
• Elastic resource consumption based on real needs
• Workloads processed on demand
• Endorsed and accredited applications shared by multiple communities
• Single sign-on at multiple, independent providers
• Centralised access to service information across multiple providers
[Diagram labels: VM Operator, Resource Provider]
6. EGI Federated Cloud
[Diagram; recoverable labels follow]
Platforms: EGI Core Platform, Cloud Infrastructure Platform, Collaboration Platform
• Monitoring and control of utilisation
• Technical Consultancy and Support
• Uniform interfaces to Cloud Compute and Storage
• Secure endorsed Application and Service Deployment
User Community: Consumer, VM-Operator, Community Management, All
7. EGI Cloud Infrastructure
[Architecture diagram; recoverable labels follow]
EGI Core Platform: Federated AAI, Service Registry, Monitoring, Accounting
EGI Cloud Infrastructure Platform: Instance Mgmt, Information Discovery, Storage Management
Help and Support, Security Co-ordination, Training and Outreach, Sustainable Business Models
EGI Collaboration Tools, EGI Application DB, Image Repository, EGI Cloud Service Marketplace
User Community:
• Monitoring and control of utilisation
• Technical Consultancy and Support
• Uniform interfaces to Cloud Compute and Storage
• Secure endorsed Application and Service Deployment
Cloud Management Stacks (OpenStack, OpenNebula, Synnefo, …)
Other labels: GSI, GLUE2, Cloudinit, CDMI, SAM, UR, OVF, OCCI
9. Partnership
Resources
– 21 certified resource providers from 13 countries
– 9 resources in certification process
– Worldwide interest & integration
• Australia* (NeCTAR)
• South Africa* (SAGrid)
• South Korea* (KISTI)
• United States* (NIST, NSF A.C. Centres)
– Technology
• 12 x OpenStack
• 7 x OpenNebula
• 1 x Synnefo
• 1 x Emotive
* Not shown on map
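As a quick consistency check on the figures above, the per-technology counts can be tallied against the 21 certified providers. The dictionary below simply restates the slide's numbers; the variable names are illustrative.

```python
# Cloud management stacks among certified providers, as listed on the slide.
stacks = {"OpenStack": 12, "OpenNebula": 7, "Synnefo": 1, "Emotive": 1}

# The counts should add up to the number of certified resource providers.
total = sum(stacks.values())

# Share of each stack as a percentage of all certified providers.
shares = {name: round(100 * count / total, 1) for name, count in stacks.items()}

if __name__ == "__main__":
    print(total)
    for name, share in shares.items():
        print(f"{name}: {share}%")
```

The counts sum to 21, matching the number of certified providers, with OpenStack accounting for a little over half of them.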
11. DRIHM
• Scientific Discipline: Natural Science, Earth sciences, Hydrology
• Status: Test & Integration (drihm.eu VO)
DRIHM in the EGI FedCloud:
• Running various hydrological models in the
EGI Federated Cloud
• 1 VM: 1 core, 4/8 GB of RAM
• few GB of storage
• Windows OS
• Contextualisation for Windows OS VM image
• Licence issue
DRIHM:
• EC-funded project that aims to provide an open, fully
integrated workflow platform for predicting, managing and
mitigating the risks related to extreme weather
phenomena.
12. Chipster
Chipster in the EGI FedCloud:
• ‘light’ VM (datasets removed)
• Chipster VM configured through
contextualisation
• shared block storage exported as
NFS for tools (500 GB)
• block storage for output (500 GB)
• Scientific Discipline: Natural Science, Biological Sciences, Bioinformatics
• Status: Production
ELIXIR Pilot Action Proposal:
Using virtual machines and clouds in bioinformatics training
User-friendly analysis software for
high-throughput data:
• NGS
• Microarray
• Proteomics
• sequence data
13. Use Case Discipline Classification
Usage since launch
>600k VMs
>40M CPU hours
Use cases
- 12 @ Launch
- 60 to date
- 11 production
23. Thoughts on Hub
• People are increasingly transient
– Stop losing the unknown knowns
• Living data is often the forgotten component in data management
• All data will be born digital
• Data management requirements mean responsibility for storage of raw data is
increasingly important
• Laboratory equipment can directly record into Hub to ensure data
management from birth to death
• Connecting all of the experiment to ensure institutional knowledge capture,
– neat and rough notes, raw data, analysis applications and output
25. Bio-Linux: A scalable solution
• Comprehensive, free bioinformatics workstation based on
Ubuntu Linux
• 10 years & 8 major releases
• 200+ bioinformatics packages, including big integrative tools: QIIME,
Galaxy Server, PredictProtein, EMBOSS, ...
Incorporates all software
• >7000 users in >1600 locations
Dual Boot | Linux Live | Local Servers | Cloud
26. Why Cloud?
• Tools such as Bio-Linux are community enablers
• Data sets can be too big or too restricted to move easily
– move the compute to the data
– Researcher work patterns are maintained
• Need more efficient use of shared resources
• Central maintenance of infrastructure
• Lower barrier to entry (compared to traditional
HPC and Grid)
27. EOSCloud
• A NERC Big Data capital project
• A tenancy in the STFC JASMIN Unmanaged Cloud
• Each registered user receives two VMs
• Bio-Linux
• Ubuntu Docker hosting environment
– With total responsibility for instantiated system
– Accessible through standard remote desktop tools
• But,
– using a single scale of resources would be wasteful
– Can we scale the users' virtual services to take demand into
account?
28. Boosting Resource Capabilities
• Users' VMs operate in the native ‘Standard’ state
– Enough capability to access stored data
– Configure applications and workflows
– Free
• A user may boost a running VM to increased
capability
– Enough to run analysis applications on a useful
timescale
– Credit consumption only for Boosted instances
• Reference datasets available to users
through shared storage
Name        Cores   Memory (GB)   Cost (credits/hour)
Standard    1       16            0
Standard+   2       40            1
Big         8       140           4
Max         16      500           8
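The boost tiers in the table above lend themselves to a small worked example of credit consumption. The sketch below assumes billing is linear in running hours with no rounding or minimum charge, which the slide does not state; the `Flavour` class and the `boost_cost` and `cheapest_flavour` helpers are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flavour:
    """One row of the boost table: name, cores, memory and hourly cost."""
    name: str
    cores: int
    memory_gb: int
    credits_per_hour: int


# Values copied from the table on the slide.
FLAVOURS = {
    f.name: f
    for f in (
        Flavour("Standard", 1, 16, 0),
        Flavour("Standard+", 2, 40, 1),
        Flavour("Big", 8, 140, 4),
        Flavour("Max", 16, 500, 8),
    )
}


def boost_cost(flavour_name: str, hours: float) -> float:
    """Credits consumed by one VM at the given flavour for `hours`.

    Assumption: cost scales linearly with time (not confirmed by the slide).
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return FLAVOURS[flavour_name].credits_per_hour * hours


def cheapest_flavour(min_cores: int, min_memory_gb: int) -> Flavour:
    """Return the lowest-cost flavour that meets both requirements."""
    candidates = [
        f for f in FLAVOURS.values()
        if f.cores >= min_cores and f.memory_gb >= min_memory_gb
    ]
    if not candidates:
        raise ValueError("no flavour satisfies the request")
    return min(candidates, key=lambda f: f.credits_per_hour)


if __name__ == "__main__":
    print(boost_cost("Big", 6))            # six hours on the Big tier
    print(cheapest_flavour(4, 64).name)    # smallest tier with 4 cores, 64 GB
```

Under these assumptions a VM left in the Standard state costs nothing however long it runs, so credits are spent only while an analysis actually needs the larger tier.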
30. Desktop as a Service for research
• Giving researchers an environment they are confident
in by changing the infrastructure around them
• Location independent persistence of research
environments
• Launch for pilot user communities 31st Mar 2015
– Moving beyond pilot user communities (e.g. Ocean
Sampling Day)
• Investigating other key usage models such as teaching
or online learning
31. Conclusions
• Cloud is (obviously) an enabler for research
– Allowing flexibility in infrastructure hitherto not possible
– User control rather than provider control
• It's not just about infrastructure and not just about single cloud providers
• Cloud is a way of allowing higher level services to be built more easily and made
accessible
• Open standards
– allow a marketplace of services to develop
– allow diverse resource providers to participate
– move the value add from availability of service to quality of service
Stability of production resources, increasing number drawn to federated cloud with largish backlog of integrating sites
Tools on PCs for converting files
Blog3 or similar with additional functionality on the group systems
myExperiment for supporting interaction within the community