Over the past five years, cloud computing has gone from a curiosity to a
core scientific technology. The cloud's relative simplicity, instant
availability, and reasonable cost have made it attractive to
scientists, especially in domains relatively new to large-scale data
analysis. This trend will continue into the foreseeable future,
challenging resource providers to adapt their services, to provide
easy federation with other providers, and to accommodate many
different scientific disciplines. For developers of cloud services,
there are also many challenges. Efficient access to, and curation
of, large data sets remain largely unsolved problems. Image
management also raises new issues, especially if these images are to
be shared and trusted. This presentation reviews the current status
of cloud computing and presents some ideas on how the upcoming
challenges might be met.
Presented at CNAF in Bologna, Italy by Charles Loomis in May 2013.
2. 2
Cloud Marketing
“Cloud” is currently very trendy, used everywhere
Many definitions that are often incompatible
Used (often) to market pre-existing (non-cloud) software
Commodity Computing (Sun)
Utility Computing (IBM, HP, …)
Amazon EC2
Amazon EBS
Mature Virtualization
Simple APIs
Excess Capacity
3. 3
What is a Cloud?
In two pages NIST defines:
Essential characteristics
Deployment models
Service models
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
4. 4
Essential Characteristics
On-demand self-service
No human intervention
Broad network access
Fast, reliable remote access
Rapid elasticity
Scale based on app. needs
Resource pooling
Multi-tenant sharing
Measured service
Direct or indirect economic model with measured use
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
5. 5
Deployment Models
Private
Single administrative domain, limited number of users
Community
Different administrative domains with common interests & proc.
Public
People outside of institute’s administrative domain
Hybrid
Federation via combination of other deployment models
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
6. 6
Service Models
Software as a Service (SaaS)
Direct (scalable) hosting of end user applications
Platform as a Service (PaaS)
Framework and infrastructure for creating web applications
Infrastructure as a Service (IaaS)
Access to remote virtual machines with root access
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
7. 7
Software as a Service (SaaS)
Advantages
No software installation
Universally accessible
Disadvantages
Questions about data access, ownership, reliability, etc.
Integration of services & novel uses of data are (often) difficult
Trends
Social scientific computing
Service APIs to allow integration (moving toward PaaS)
8. 8
Platform as a Service (PaaS)
Advantages
Programmers take advantage of integrated load balancing, automatic failover, etc.
Disadvantages
Restricted number of languages
Applications strongly locked to a particular provider
Trends
Dearth of “pure” PaaS offers
Encroachment from both SaaS and IaaS sides
9. 9
Infrastructure as a Service (IaaS)
Advantages
Customized environment with “root” access
Easy access to scalable resources
Disadvantages
Variety of APIs and interfaces
VM image creation is difficult and time-consuming
Trends
Lots of specialized cloud providers appearing
Orchestration pushing into PaaS space
11. 11
State of the Art
Commercial Provider: Amazon Web Services (AWS)
Leading and largest IaaS service provider
Improving and adding new services at a phenomenal rate
Almost all IaaS providers use AWS-like service semantics, but
differentiate based on price, SLAs, location, etc.
Commercial Cloud Distribution: VMware
VMware: extremely good and complete, but very expensive
Provides the ESXi virtual machine host for free
Open Source Cloud Distributions
Essentially none in 2004; now easily a dozen different distributions
StratusLab, WNoDeS, …, OpenStack, OpenNebula, CloudStack
Very different levels of maturity, stability, scalability, etc.
12. 12
Why are cloud technologies useful?
For (scientific) users
Custom environment: no rewriting or porting applications to fit into a
resource provider’s environment
Simple access: most providers use a REST or RPC API allowing
simple access from all programming languages
Reasonable cost: pay only for the resources used, which is especially
attractive for individuals/groups that do not have a large, existing
hardware investment
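As a hedged illustration of the "simple access" point, the sketch below builds (but does not send) an HTTP request that would start a virtual machine. The endpoint URL, the `/vms` path, the JSON field names, and the token are all hypothetical; they do not correspond to any particular provider's API.

```python
import json
import urllib.request


def build_start_vm_request(endpoint, token, image_id, cpus, ram_mb):
    """Build a POST request for a hypothetical IaaS REST endpoint."""
    # Hypothetical request body: field names are assumptions for illustration.
    body = json.dumps({"image": image_id, "cpus": cpus, "ram_mb": ram_mb})
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/vms",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
    )


# The request is only constructed here; nothing is sent over the network.
request = build_start_vm_request(
    "https://cloud.example.org/api", "SECRET", "image-123", cpus=2, ram_mb=4096
)
print(request.get_method(), request.full_url)
```

Because the interface is plain HTTP plus JSON, the same request could be issued from any language with an HTTP client, which is the point the slide makes.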
13. 13
Why are cloud technologies useful?
Separation of responsibilities
Hardware / Services / Platforms / Users
People at each layer can focus on their responsibilities with minimal
interactions with people in the other layers.
Resource Providers
Better utilization of shared resources because a wider range of
applications (and disciplines) can use the cloud
With hybrid cloud infrastructures, providers can outsource excess
demand to other providers
14. 14
Trend for Scientific Computing
Will we all just be users of the Amazon cloud?
Pendulum swinging towards large data centers with “fat” machines
These can offer elastic cloud services at a reasonable price
With scientific clouds there is a low barrier to entry and users can
maintain administrative control of services and data
Sharing resources between scientific disciplines is much easier
because of virtualization
Migration will be gradual…
15. 15
Overcoming inertia…
Users
How to use virtual machines to get my work done?
How to structure, store, access, and protect data?
Realize shared infrastructures with customized env. are possible
Application Developers
How to use cloud techniques to improve my applications?
… and my development workflows?
Applications can be services (with assoc. pluses and minuses)
Data Centers
Reuse existing (commodity) hardware investments
Take advantage of (and train) existing system administrators
How to manage/use a (private, community, public) cloud?
Significant benefits from cloud even without large scale elasticity!
17. 17
Elasticity
Can we have infinite elasticity with limited resources?
“Local” solutions
Economic models to avoid hitting infrastructure limits?
Spot instances and/or different service classes?
Aside: IPv6 addressing is necessary for large (scientific) clouds
Federated solutions
Hybrid (scale-out) infrastructures?
Higher-level brokers or orchestrators?
Cannot have elasticity without some kind of accounting!
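To make the link between elasticity and accounting concrete, here is a minimal sketch of a quota check in core-hours. The `Account` and `Accountant` classes and the quota values are hypothetical; a real infrastructure would use its own accounting service and economic model.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    quota_core_hours: float
    used_core_hours: float = 0.0

    def remaining(self):
        return self.quota_core_hours - self.used_core_hours


@dataclass
class Accountant:
    # Maps a user name to that user's Account.
    accounts: dict = field(default_factory=dict)

    def can_start(self, user, cores, hours):
        """Admit a request only if it fits within the user's remaining quota."""
        account = self.accounts.get(user)
        return account is not None and cores * hours <= account.remaining()

    def record(self, user, cores, hours):
        """Charge measured usage against the user's quota."""
        self.accounts[user].used_core_hours += cores * hours


accountant = Accountant({"alice": Account(quota_core_hours=100.0)})
accountant.record("alice", cores=8, hours=10)  # 80 core-hours consumed
print(accountant.can_start("alice", cores=4, hours=4))  # 16 core-hours requested
print(accountant.can_start("alice", cores=4, hours=8))  # 32 core-hours requested
```

Without the measured usage recorded by `record`, the provider has no basis for deciding which requests to admit once the physical limits are near.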
18. 18
Data Management (Legal)
Transfer and treatment of data across borders
Differing legal protections in different jurisdictions
Legal constraints for data locality (banking, medical data)
Unclear responsibilities for data: guardian, custodian, owner, etc.
Europe working hard to come to a consistent legal framework
Protection of data in the cloud
Consistent access controls for all data locations
Guarantees about data protection from cloud provider personnel
Reliability of the provider’s storage
Knowledge about provider’s policies for data protection
19. 19
Data Management (Technical)
Efficient exploitation of large datasets
Need significant computing next to storage, AND/OR
High bandwidth remote access
Locality matters
Sometimes it is inconvenient to transfer raw data away from instrument
Clouds can be used to reduce data locally before transfer
“Open Data” requirements
Cloud may help meet such requirements
Does not remove need for well defined dataset metadata and format
Need long-term funding for the curation of those datasets
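A back-of-the-envelope calculation shows why locality matters. The sketch below computes an idealized transfer time (no protocol overhead or contention); the 100 TB dataset, the 1 Gb/s link, and the 100:1 reduction factor are purely illustrative assumptions.

```python
def transfer_hours(size_tb, bandwidth_gbps):
    """Ideal transfer time in hours for size_tb terabytes at bandwidth_gbps gigabits/s."""
    bits = size_tb * 1e12 * 8          # terabytes -> bits (decimal units)
    seconds = bits / (bandwidth_gbps * 1e9)
    return seconds / 3600.0


raw_hours = transfer_hours(100, 1)      # hypothetical raw dataset
reduced_hours = transfer_hours(1, 1)    # same data after a hypothetical 100:1 local reduction
print(f"raw: {raw_hours:.1f} h, reduced: {reduced_hours:.1f} h")
```

Under these assumptions the raw transfer takes more than nine days while the reduced dataset moves in a little over two hours, which is the argument for placing computing next to the storage or the instrument.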
20. 20
Security
How to maintain the security of a cloud infrastructure?
Shifting some responsibility to users:
Users have root access and must secure services within their VMs
Users have less security experience: need for education & help
Dynamic network configurations can help improve security
Changing expectations from administrators:
Leave firewall policies to users running VMs
Should not expect to run security software inside of VMs
Need to enhance monitoring to discover abnormal behavior
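If firewall policies are left to the users running VMs, one common pattern is a default-deny list of allowed (network, port) pairs attached to each VM. The sketch below is a hypothetical illustration of that idea, not the mechanism of any specific cloud; the `Rule` class and the example addresses (taken from documentation ranges) are assumptions.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    cidr: str   # source network allowed to connect
    port: int   # destination port on the VM


def is_allowed(rules, source_ip, port):
    """Default deny: a connection is accepted only if some rule matches it."""
    address = ipaddress.ip_address(source_ip)
    return any(
        port == rule.port and address in ipaddress.ip_network(rule.cidr)
        for rule in rules
    )


# Hypothetical policy: HTTPS from anywhere, SSH only from one network.
rules = [Rule("0.0.0.0/0", 443), Rule("192.0.2.0/24", 22)]
print(is_allowed(rules, "192.0.2.10", 22))
print(is_allowed(rules, "198.51.100.7", 22))
```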
21. 21
Image Management
Image metadata
What does an image contain (OS, services, configuration, etc.)?
What versions of the kernel, software, etc. are included?
Who is responsible and/or supports a given image?
How do I identify a given image?
Creating machine images
How can an image be created for multiple clouds?
What do I have to do to create a secure machine image?
Sharing images
How can I make my images available to others?
Can I parameterize my images to make them useful to more people?
Can the images be transported and used efficiently?
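One way to approach the metadata questions above is to describe each image with a small record whose identifier is derived from the image contents. The sketch below is illustrative only: the field names and the choice of a SHA-256 digest as identifier are assumptions, not a description of any existing metadata schema.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMetadata:
    identifier: str   # content-derived, so identical bytes give the same identifier
    os_name: str
    os_version: str
    kernel: str
    maintainer: str   # who is responsible for and/or supports the image


def describe_image(contents, os_name, os_version, kernel, maintainer):
    """Build a metadata record for the given image bytes."""
    identifier = hashlib.sha256(contents).hexdigest()
    return ImageMetadata(identifier, os_name, os_version, kernel, maintainer)


# Placeholder bytes and values stand in for a real image file and its description.
image = describe_image(
    b"placeholder image bytes", "ExampleOS", "1.0", "0.0.0-example", "admin@example.org"
)
print(image.identifier[:12], image.os_name, image.os_version)
```

A content-derived identifier answers "How do I identify a given image?" independently of where the image is stored or what it is called.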
22. 22
Vendor Lock-in vs. Federation
Common API
Fully interoperable API avoids duplication in cloud control software
Has very limited impact on applications and services in the cloud
Current (quasi-) standards: EC2, OCCI, CIMI, CDMI, …
Common Semantics
Semantics determine how apps and services operate in the cloud
For IaaS, cloud providers have broadly similar semantics, but…
File-based versus block storage is one difference.
Contextualization
Can a user run the same validated image on all clouds?
Can a user share the same parameterized image with others?
Neglected issue in standardization; CloudInit becoming de facto std.
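As a sketch of what a parameterized image looks like with CloudInit, the function below renders a `#cloud-config` user-data document from a few parameters. The `packages`, `ssh_authorized_keys`, and `runcmd` keys are standard cloud-config keys; the function name, the package, the key placeholder, and the command are hypothetical. Values are written as JSON strings, which are also valid YAML scalars.

```python
import json


def render_user_data(packages, ssh_keys, commands):
    """Render a minimal cloud-config document; empty sections are omitted."""
    lines = ["#cloud-config"]
    sections = (
        ("packages", packages),
        ("ssh_authorized_keys", ssh_keys),
        ("runcmd", commands),
    )
    for key, values in sections:
        if values:
            lines.append(key + ":")
            # JSON double-quoted strings are valid YAML, which avoids quoting bugs.
            lines.extend("  - " + json.dumps(value) for value in values)
    return "\n".join(lines) + "\n"


user_data = render_user_data(
    packages=["httpd"],
    ssh_keys=["ssh-rsa AAAA... user@example.org"],  # placeholder, not a real key
    commands=["service httpd start"],
)
print(user_data)
```

The same validated image can then be shared unchanged, with each user supplying different user data at start-up.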
24. 24
StratusLab History
Informal collaboration to investigate running grid services on Amazon EC2 (2007)
StratusLab Project (6/2010 to 5/2012) co-funded by EC with 6 partners from 5 countries
Open collaboration to continue the development and support of the StratusLab software
Website: http://stratuslab.eu
Twitter: @StratusLab
Support: support@stratuslab.eu
Source: http://github.com/StratusLab
Identified need for open source cloud distribution.
Production dist. with academic & commercial deployments.
25. 25
StratusLab
Complete Infrastructure as a Service cloud distribution
Developed within EU project, software maintained by partners
Focus: Simple to install and simple to use
Services
Compute: Virtual machine management (currently uses OpenNebula)
Storage: Volume-based storage service
Network: Simple configuration for public, local, and private VM access
Image mgt.: Complete system for trusted sharing of VM images
Tools (Python CLI) and APIs (Libcloud) to facilitate use of cloud
Tools to facilitate installation of services
26. 26
SlipStream
Cloud orchestrator and deployment engine
Facilitates testing, deployment, and maintenance of complex systems
Transparent access to multiple cloud infrastructures
Allows automated deployment of systems in one or more clouds
30. 30
Marketplace
Priorities
Mechanism for sharing and trusting images
Possible to distribute fixed, read-only data sets as well
Split the storage of image metadata and image contents
Define roles for creator, user, administrator, and validator
Implementation
Marketplace API: Proprietary REST API for create, read, search
Marketplace acts as image registry and handles only metadata
Image contents can be located on any public (web) server
‘Private’ images can also be held in cloud storage
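The split between metadata and contents can be sketched as a registry that stores only descriptions and locations, while the image bytes live elsewhere and are verified after download. This is an in-memory illustration under stated assumptions: the class, its method names, and the SHA-256 identifier are hypothetical and are not the actual Marketplace API.

```python
import hashlib


class MetadataRegistry:
    """Holds image metadata only; image contents stay on external servers."""

    def __init__(self):
        self._entries = {}

    def create(self, contents, location, creator, **attributes):
        # Only the digest of the contents is kept, never the contents themselves.
        identifier = hashlib.sha256(contents).hexdigest()
        self._entries[identifier] = dict(attributes, location=location, creator=creator)
        return identifier

    def read(self, identifier):
        return self._entries[identifier]

    def search(self, **criteria):
        return [
            identifier
            for identifier, entry in self._entries.items()
            if all(entry.get(key) == value for key, value in criteria.items())
        ]


def verify(identifier, downloaded):
    """Check that bytes fetched from the image location match the registered entry."""
    return hashlib.sha256(downloaded).hexdigest() == identifier


registry = MetadataRegistry()
image_id = registry.create(
    b"placeholder image bytes",
    location="https://images.example.org/base.img",  # hypothetical web server
    creator="alice",
    os="ExampleOS",
)
print(registry.search(os="ExampleOS") == [image_id])
print(verify(image_id, b"tampered bytes"))
```

Because trust rests on the registered metadata rather than on the storage location, the contents can sit on any web server or in cloud storage without weakening the check.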
31. 31
Marketplace: Key service in larger ecosystem
Trust
Marketplace metadata plays key role in providing information about the
image contents and provenance
Factories
stratus-create-image facilitates customization of images
VirtualBox and other local virtual tools
Bitnami and many similar services for building standard OS services
Quattor and other fabric management can be used
Transport (& Storage)
StratusLab uses simple HTTP(S) for image transport
Could imagine using ftp, gridftp, bittorrent, etc.
vm-caster/catcher in EGI federated cloud task force
34. 34
StratusLab Federated Cloud Infrastructure
Features
Two sites operating (LAL and GRNET) for ~3 years
Common user authentication
Ability to use the same images across resources
StratusLab client allows easy switching between sites
StratusLab Libcloud binding allows common view of both sites
Need to go further…
To sites running different cloud software
Helix Nebula and EGI Fed. Cloud Task Force active in this area
35. 35
Federation Models
Transparent Federation
Site operators “outsource” to other providers
Completely transparent to end users
Difficult to achieve in practice because of data protection concerns and
network access/performance
Brokered Federation
Variety of different cloud infrastructures are visible to users
Users choose to place virtual machines in particular locations
Simple clients can handle federation if differences are small
Orchestrators are needed for larger differences between clouds
Both Helix Nebula and EGI take the brokered approach
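In the brokered model, a simple client can handle placement when the differences between sites are small. The sketch below picks a site from a user-visible list according to the user's own constraints; the `Site` fields, the site names, and the capacities are hypothetical and do not describe any real infrastructure.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    country: str
    free_cores: int


def choose_site(sites, cores, allowed_countries=None):
    """Pick the site with the most free cores that satisfies the user's constraints."""
    candidates = [
        site
        for site in sites
        if site.free_cores >= cores
        and (allowed_countries is None or site.country in allowed_countries)
    ]
    if not candidates:
        raise LookupError("no site satisfies the request")
    return max(candidates, key=lambda site: site.free_cores)


# Hypothetical sites; the user, not the operator, decides where the VM goes.
sites = [Site("site-a", "FR", 16), Site("site-b", "GR", 64)]
print(choose_site(sites, cores=8).name)
print(choose_site(sites, cores=8, allowed_countries={"FR"}).name)
```

The `allowed_countries` constraint reflects the data-protection concerns that make transparent federation hard: here the decision stays visible to, and under the control of, the user.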
36. 36
Vendor Lock-in vs. Federation
Common API — minor issue
Fully interoperable API avoids duplication in cloud control software
Has very limited impact on applications and services in the cloud
Current (quasi-) standards: EC2, OCCI, CIMI, CDMI, …
Common Semantics — important issue
Semantics determine how apps and services operate in the cloud
For IaaS, cloud providers have broadly similar semantics, but…
File-based versus block storage is one difference.
Contextualization — critical issue
Can a user run the same validated image on all clouds?
Can a user share the same parameterized image with others?
Neglected in standardization but CloudInit becoming de facto std.
37. 37
“Our” Contributions
StratusLab
Adopt CIMI as the standard interface to services
Provide Libcloud (Python) driver for StratusLab
Provide EC2, OCCI, etc. as adaptors to CIMI interface
Move to CloudInit for contextualization framework
Flexible authentication system to support different methods
SlipStream
Chosen as main interface to supported clouds in Helix Nebula
Already supports multi-cloud deployments
Support for all clouds within Helix Nebula
39. 39
Cloud Experience at LAL
Private cloud for laboratory services
Works well, plan to migrate all services including grid worker nodes
and experiment-specific servers
Services switched to VMs without users being aware of change
Very different way of working, need to change administrator habits
Have seen some stability issues related to SL6 kernel/virtualization
Public cloud open to university
Very positive reaction to cloud; LAL resources nearly 100% used
Variety of disciplines: biology, software eng., statistics, astrophysics,
bioinformatics, …
After initial introduction, users require only a low level of support
Other labs offering StratusLab training without our direct involvement
41. 41
Summary
Cloud for scientific computing
Already a common tool for many scientific disciplines
Cloud technologies will become more pervasive with time
Associated swing back to larger, centralized data centers
Many challenges for cloud
Elasticity with limited resources
Data management (legal & technical)
Security
Image management — unique holistic approach from StratusLab
Federation — brokered federation with StratusLab & SlipStream
LAL cloud experience
Very positive feedback from both administrators and users