Presentation of the Ph.D. dissertation "SLA-Driven Cloud Computing Domain Representation and Management". It explains a new methodology for the representation and management of Cloud services using SLA fragments. Cloud resources are described as independent SLA fragments, which are composed on the fly to create complete Cloud services.
An architecture for the management of Cloud services is also presented.
Cloudcompaas, an open source SLA-driven framework, is introduced. Cloudcompaas implements this methodology and architecture and enables the management of the complete lifecycle of Cloud services.
Finally, a set of experiments to validate the utility and performance of the contributions is presented.
3. INTRODUCTION
• Cloud computing has become widely adopted in recent
years.
• Many concerns arise regarding Cloud services.
Specifically, we focus on the following.
– Representation of Cloud services (e.g. an application that
depends on a software stack deployed on a VM).
– Delivery of QoS on multitier Clouds.
• These concerns motivate the Ph.D. dissertation.
– Definition and implementation of a methodology for the
representation of Cloud services and QoS rules.
• Service Level Agreements (SLAs) are proposed as a vehicle for the
management of QoS in the Cloud.
• SLAs can be extended to define Cloud Services.
4. CONCEPTS
• Cloud resource.
– A resource (e.g. network, server, storage, etc.) served by a
Cloud system.
• Cloud service.
– A capability provided by a Cloud system. Can range from a
single resource (VM) to a complete system formed by
multiple resources.
• QoS.
– Measure of the performance of a system according to a set of
indicators.
• SLA.
– Contract between a provider and a user defining the
delivered service, as well as conditions and guarantees in the
QoS.
5. OBJECTIVES
• Define a generic and extensible methodology for the
description of Cloud services.
– Model of Cloud resources.
– SLAs as a unified representation of Cloud services.
• Design an SLA-driven architecture for the management of
Cloud services.
– Performs provision, scheduling and allocation of resources (passive
features).
– Performs assessment of QoS and elasticity operations (active
features).
• Implement the Cloudcompaas framework, an open source
implementation of the methodology and architecture.
• Evaluate the performance and benefits of the developments
with a set of experiments.
6. THREE ACTOR ROLES
• Service developer.
– Person who describes and
registers a service in
Cloudcompaas (e.g. application
software).
• Service provider.
– Person who creates an instance
of the Cloud service, and pays
for its deployment and
management.
• Service user.
– Person who directly makes use
of the Cloud service capabilities.
• These roles may be filled by
the same or different persons.
8. HIERARCHICAL MODEL OF RESOURCES
AT DIFFERENT LEVELS
• A simple and extensible model of resources has
been defined to support the Cloudcompaas
methodology.
• It defines resources according to the three NIST
levels of Cloud computing.
• It uses a hierarchical organization of resources.
Some resources are aggregations of others.
• It also includes metadata. Metadata defines
information about the Cloud service or resources
(e.g. number of replicas).
9. HIERARCHICAL MODEL OF RESOURCES
AT DIFFERENT LEVELS: IAAS
• A VM is composed of physical resources.
• Default resources: Cores, Memory, Network and
Architecture.
10. HIERARCHICAL MODEL OF RESOURCES
AT DIFFERENT LEVELS: PAAS
• A Virtual Container is
composed of a
hierarchy of software
components.
• The same software
component cannot
appear twice in the
same composition.
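The hierarchical model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `Resource`, its fields and the helper `duplicated_components` are hypothetical names, not the Cloudcompaas schema. It shows resources that aggregate other resources, metadata attached to a resource, and a check for the rule that a software component cannot appear twice in a composition.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Resource:
    """A node of the resource hierarchy (hypothetical model, for illustration only)."""
    kind: str                      # e.g. "VirtualMachine", "SoftwareComponent"
    name: str
    value: Optional[str] = None    # leaf value, e.g. "1024" for Memory
    children: List["Resource"] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def walk(self) -> Iterator["Resource"]:
        """Yield this resource and every resource aggregated below it."""
        yield self
        for child in self.children:
            yield from child.walk()


def duplicated_components(root: Resource) -> List[str]:
    """Return the software component names that appear more than once under root."""
    seen, duplicates = set(), []
    for node in root.walk():
        if node.kind != "SoftwareComponent":
            continue
        if node.name in seen and node.name not in duplicates:
            duplicates.append(node.name)
        seen.add(node.name)
    return duplicates


# IaaS level: a VM aggregates physical resources.
example_vm = Resource("VirtualMachine", "large", children=[
    Resource("PhysicalResource", "Cores", "2"),
    Resource("PhysicalResource", "Memory", "1024"),
])

# PaaS level: a Virtual Container aggregates a hierarchy of software components.
example_container = Resource("VirtualContainer", "web", metadata={"replicas": "2"}, children=[
    Resource("SoftwareComponent", "openjdk-6-jre", children=[
        Resource("SoftwareComponent", "tomcat"),
    ]),
])

print(duplicated_components(example_container))
```

Walking the tree rather than checking only direct children is a design choice of this sketch: it treats the whole container as one composition.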
11. METHODOLOGY FOR THE
DESCRIPTION OF CLOUD SERVICES
• WS-Agreement used as SLA language.
• WS-Agreement describes each service as an SLA
document.
– Defines SLA template, offer and instance documents.
– Defines a schema with the different sections of an SLA.
• Our methodology maps each section of an SLA to a
part of a Cloud service.
– ServiceTerms describe passive features (e.g. resources).
– GuaranteeTerms describe active features (e.g. QoS
rules).
– CreationConstraints represent relationships and
dependencies between elements.
12. VM DESCRIPTION
<ServiceDescriptionTerm>
<VirtualMachine Name="large">
<PhysicalResource Name="Memory">
1024
</PhysicalResource>
<PhysicalResource Name="Cores">
2
</PhysicalResource>
<PhysicalResource Name="Network">
public
</PhysicalResource>
<PhysicalResource Name="Architecture">
x86_64
</PhysicalResource>
</VirtualMachine>
</ServiceDescriptionTerm>
• Defined by
Cloudcompaas.
• Represents a large VM
with 1024 MB of RAM, 2
cores and a public
network.
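As a hedged illustration of how such a fragment could be consumed, the sketch below parses the VM description with Python's standard `xml.etree.ElementTree` module. The function name `read_vm` is hypothetical, and the XML namespaces that a real WS-Agreement document would carry are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# The VM description from the slide, without XML namespaces (an assumption of this sketch).
VM_FRAGMENT = """
<ServiceDescriptionTerm>
  <VirtualMachine Name="large">
    <PhysicalResource Name="Memory">1024</PhysicalResource>
    <PhysicalResource Name="Cores">2</PhysicalResource>
    <PhysicalResource Name="Network">public</PhysicalResource>
    <PhysicalResource Name="Architecture">x86_64</PhysicalResource>
  </VirtualMachine>
</ServiceDescriptionTerm>
"""


def read_vm(fragment: str) -> dict:
    """Return the VM name and its physical resources as a flat dictionary."""
    root = ET.fromstring(fragment)
    vm = root.find("VirtualMachine")
    resources = {
        res.get("Name"): (res.text or "").strip()
        for res in vm.findall("PhysicalResource")
    }
    return {"name": vm.get("Name"), **resources}


large_vm = read_vm(VM_FRAGMENT)
print(large_vm)
```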
14. QOS RULE DESCRIPTION
<GuaranteeTerm Name="SCALE_OUT">
<QualifyingCondition>
MAX_REPLICAS gt ACT_REPLICAS
</QualifyingCondition>
<ServiceLevelObjective>
<KPITarget>
<CustomServiceLevel>
list.avg(CPUPERC) le 90
</CustomServiceLevel>
</KPITarget>
</ServiceLevelObjective>
</GuaranteeTerm>
• Defined by
Cloudcompaas.
• Represents an elasticity
rule.
• If the average CPU
usage of all replicas is
higher than 90%, deploy
a new replica.
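The elasticity rule can be read as two checks: the qualifying condition says whether the rule applies, and the service level objective says whether the service is healthy. The sketch below is one possible reading, not the Cloudcompaas evaluator; the function name `should_scale_out` and its parameters are hypothetical.

```python
from statistics import mean


def should_scale_out(cpu_percentages, act_replicas, max_replicas, threshold=90.0):
    """Return True when the SCALE_OUT rule would ask for a new replica.

    cpu_percentages holds one CPU usage value (CPUPERC) per running replica.
    """
    # QualifyingCondition: MAX_REPLICAS gt ACT_REPLICAS
    rule_applies = max_replicas > act_replicas
    # ServiceLevelObjective: list.avg(CPUPERC) le 90
    objective_met = mean(cpu_percentages) <= threshold
    # A new replica is deployed only when the rule applies and the objective is violated.
    return rule_applies and not objective_met


print(should_scale_out([95.0, 92.0], act_replicas=2, max_replicas=7))
```

With the values above the average is 93.5%, which violates the objective while replicas are still available, so the call returns `True`.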
16. REQUIREMENTS DESCRIPTION
<CreationConstraints>
<Item Name="hardware">
<Location>
/VirtualMachine[@Name='large']
</Location>
</Item>
<Item Name="java">
<Location>
/VirtualContainer/VirtualRuntime[@Name]
</Location>
<ItemConstraint>
<ExactlyOne>
<enumeration value="openjdk-7-jre"/>
<enumeration value="openjdk-6-jre"/>
</ExactlyOne>
</ItemConstraint>
</Item>
</CreationConstraints>
• Defined by the Service
developer.
• Describes the requirements of
a Cloud resource. This
resource requires a large
VM and a Java
runtime, either version 7
or 6.
• The Location tag points to
the element that is being
restricted. The ItemConstraint
tag defines the possible values.
17. METHODOLOGY FOR THE
DESCRIPTION OF CLOUD SERVICES
• SLA languages represent services as complete documents, predefined
by the provider.
– Services must be manually predefined by the provider.
– Produces a combinatorial explosion of services.
• Our methodology introduces the novel concept of SLA fragment.
– An SLA fragment is a section of the SLA defined as a stand-alone
document.
– SLA fragments define individual resources, not complete services.
– Can be combined together to describe services.
• Our methodology composes SLA fragments in response to a Service
provider query for Cloud resources, in order to generate a Cloud
service. It has the following advantages.
– Reduces operational and maintenance expenses.
– Each element is self-contained.
– Improves flexibility.
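The combinatorial explosion can be made concrete with a small count. The option lists below are hypothetical and chosen only for illustration: with fragments the provider maintains one document per option, whereas with complete SLA documents it must predefine one document per combination.

```python
from itertools import product
from math import prod

# Hypothetical options offered by a provider (illustrative values only).
options = {
    "cores": [1, 2, 4, 8],
    "memory_mb": [256, 512, 1024, 2048],
    "runtime": ["java", "python"],
}

# One stand-alone fragment per option.
fragments_to_maintain = sum(len(values) for values in options.values())

# One complete SLA document per combination of options.
complete_documents = prod(len(values) for values in options.values())

print(fragments_to_maintain, complete_documents)
print(len(list(product(*options.values()))))
```

Here 10 fragments cover the same space as 32 predefined services, and the gap widens multiplicatively with every additional resource type.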
23. SLA FRAGMENT COMPOSITION
• Service providers query the system requesting Cloud resources.
• SLA fragments are composed according to a set of constraints.
– Semantic constraints introduced by the data model.
– The query parameters introduced by the provider.
• Deciding which fragments to include is a combinatorial decision
problem. Solved by exhaustive search, it takes exponential time in
the number of fragments.
[Figure: SLA fragments representing Cloud resources (Cores: 1, Cores: 2; Runtime Java, Runtime Python; VM: small, VM: medium; RAM: 256 MB, RAM: 512 MB), each marked Yes or No in answer to the question "Should this fragment be added to the solution?"]
24. SLA COMPOSITION ALGORITHM
• We have designed an algorithm that explores the SLA
fragments as a search tree.
– The algorithm is recursive. Certain SLA fragments are an
aggregation of other fragments, and therefore spawn
composition subproblems.
– Non-terminal elements are fragments that aggregate other
fragments. Terminal elements do not.
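A minimal sketch of such a recursive search follows. It is not the Cloudcompaas algorithm: the catalog layout, the function `compose` and the `allowed` parameter are assumptions made for illustration. Each non-terminal fragment spawns one composition subproblem per resource kind it requires, and the query restrictions prune the candidates.

```python
from itertools import product

# Hypothetical catalog: resource kind -> list of (fragment name, required child kinds).
# Fragments with an empty requirement list are terminal.
CATALOG = {
    "Service": [("jLinpack", ["VirtualRuntime"])],
    "VirtualRuntime": [
        ("openjdk-6-jre", ["VirtualMachine"]),
        ("python", ["VirtualMachine"]),
    ],
    "VirtualMachine": [("small", []), ("medium", []), ("large", [])],
}


def compose(kind, allowed):
    """Return every composition rooted at `kind` as nested (name, children) tuples.

    `allowed` maps a kind to the set of fragment names the query accepts;
    a kind that is absent from `allowed` is unrestricted.
    """
    solutions = []
    for name, required in CATALOG[kind]:
        if kind in allowed and name not in allowed[kind]:
            continue  # pruned by the query restrictions
        # Non-terminal fragment: each required kind is a composition subproblem.
        subproblems = [compose(child, allowed) for child in required]
        # Combine one solution of each subproblem; a terminal fragment yields
        # exactly one combination (the empty one).
        for children in product(*subproblems):
            solutions.append((name, list(children)))
    return solutions


services = compose("Service", {"VirtualRuntime": {"openjdk-6-jre"}})
for service in services:
    print(service)
```

With this catalog the query yields three compositions, one per VM size.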
26. OPTIMIZATIONS TO THE ALGORITHM
• The complexity can be reduced using heuristics and
focusing on particular instances.
– Dynamic programming: Prevents the recursive combinatorial
problems from repeating themselves. Reuses the solutions
from previous searches.
– Branch and bound: The number of fragments is used as an
estimator to guide the search. Stops as soon as a local
minimum is found.
• Ad-hoc optimizations.
– Using semantic restrictions and data structure.
– Using provider restrictions.
• With these optimizations, the running time measured in the
experiments grows polynomially instead of exponentially.
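The two general optimizations can be illustrated on a toy problem. The sketch below is a simplification under stated assumptions, not the Cloudcompaas implementation: the catalog and the function `fewest_fragments` are hypothetical. Memoization with `functools.lru_cache` stands in for dynamic programming (each subproblem is solved once and reused), and the fragment count is used as the bound that cuts an alternative as soon as it cannot beat the best composition found so far.

```python
from functools import lru_cache

# Hypothetical catalog: kind -> alternatives, each a (name, required child kinds) pair.
CATALOG = {
    "Service": (("app", ("Runtime",)),),
    "Runtime": (("java", ("VM",)), ("java-on-appserver", ("AppServer", "VM"))),
    "AppServer": (("tomcat", ()),),
    "VM": (("small", ()), ("large", ())),
}


@lru_cache(maxsize=None)  # dynamic programming: reuse solved subproblems
def fewest_fragments(kind):
    """Return the smallest number of fragments in any composition rooted at kind."""
    best = float("inf")
    for _name, required in CATALOG[kind]:
        count = 1  # the fragment itself
        for child in required:
            if count >= best:
                break  # bound: this alternative cannot improve on the best one
            count += fewest_fragments(child)
        best = min(best, count)
    return best


print(fewest_fragments("Service"))
print(fewest_fragments.cache_info())
```

The sketch prunes alternatives rather than stopping at the first local minimum, so it is a more conservative variant of the strategy named on the slide.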
28. ARCHITECTURE
• Architecture composed of
distributed, loosely coupled
components, each of which
fulfills a specific role.
• SLA-driven means that all
the interactions in the
system are performed by
means of SLAs.
• As a framework, it relies on
third party providers to
deploy resources.
• It provides an SLA-driven
layer on top of existing
tools.
[Figure: Cloudcompaas architecture, showing the Service provider, SLA Manager, Orchestrator, Catalog, Monitor, the Infrastructure, Platform and Service Connectors, ONE, Virtual Container and User-defined services.]
29. ARCHITECTURE
• Components are implemented
as Java Web Services running in
Apache Tomcat.
• They provide RESTful
interfaces using Apache Wink.
• The SLA Manager and Monitor
components use the WSAG4J
framework to implement
WS-Agreement.
• The Infrastructure Connector
interfaces with ONE using its
API.
• The Catalog implements the
database using HSQLDB.
30. ARCHITECTURE
• SLA Manager
– Search: retrieves a new
SLA.
– Create: checks an SLA
offer, requests the service
deployment and registers
the SLA.
– Query: retrieves the state
of a running SLA.
– Delete: deallocates a
service and deletes the
instance.
34. ARCHITECTURE
• Monitor
– Registers SLAs for
assessing active features.
– Assesses the state of the
SLAs periodically, in
monitoring intervals.
35. DYNAMIC SERVICE MANAGEMENT
• The Monitor performs three operations
periodically while an SLA is active.
– Updates the SLA state.
Retrieves the monitoring information (e.g. CPU and
memory usage) and updates the state of the service.
– Evaluates the QoS rules.
Uses the monitoring information to evaluate the QoS rules.
– Performs self-management operations.
If a QoS rule is violated, executes corrective actions. Accounts
for the usage of resources and bills the user.
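The three periodic operations can be sketched as a single function that is called once per monitoring interval. Everything here is an assumption made for illustration: the dictionary layout of `sla`, the callbacks `read_metrics` and `scale_out`, and the choice to bill after the corrective action rather than before it.

```python
from statistics import mean


def monitoring_cycle(sla, read_metrics, scale_out):
    """Run one monitoring interval for an active SLA and report whether QoS was violated."""
    # 1. Update the SLA state with fresh monitoring information.
    sla["cpu"] = read_metrics()

    # 2. Evaluate the QoS rule (here: average CPU usage must not exceed the limit).
    violated = mean(sla["cpu"]) > sla["cpu_limit"]

    # 3. Self-management: corrective action, then accounting and billing.
    if violated and sla["replicas"] < sla["max_replicas"]:
        scale_out()
        sla["replicas"] += 1
    sla["billed"] += sla["price_per_replica"] * sla["replicas"]
    return violated


deployed = []
example_sla = {
    "cpu": [], "cpu_limit": 90.0,
    "replicas": 1, "max_replicas": 2,
    "price_per_replica": 0.001, "billed": 0.0,
}
print(monitoring_cycle(example_sla, lambda: [95.0], lambda: deployed.append("replica")))
print(example_sla["replicas"], example_sla["billed"])
```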
36. ARCHITECTURE
• Orchestrator
– Assesses the passive
features of Cloud services.
– Global coordinator of
resources and services.
– View of all Cloud providers.
– Scheduling of Cloud
services.
– Delegates the deployment
of services on a specific
provider to the corresponding
connector.
– If no resources are
available, the SLA is
rejected.
37. ARCHITECTURE
• Connectors
– Façade to underlying Cloud
providers.
– Provides a uniform
interface.
– Uses plug-ins to support
different providers.
– Checks for
compatibility/availability of
resources.
– Translates from the SLA
representation to the
provider specific
representation.
– Configures the resources.
Relies on third party tools
to perform these actions.
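The façade-plus-plug-in idea can be sketched as follows. The class names (`ConnectorPlugin`, `FakeInfrastructurePlugin`, `Connector`) and the translated output format are hypothetical; in particular, the `KEY=value` text produced here is a stand-in and not the template syntax of any real provider.

```python
from abc import ABC, abstractmethod


class ConnectorPlugin(ABC):
    """Provider-specific behaviour hidden behind the uniform connector interface."""

    @abstractmethod
    def translate(self, sla_terms: dict) -> str:
        """Translate the SLA representation into the provider-specific one."""

    @abstractmethod
    def deploy(self, description: str) -> str:
        """Deploy the described resource and return its identifier."""


class FakeInfrastructurePlugin(ConnectorPlugin):
    """Stand-in plug-in that records deployments instead of contacting a provider."""

    def __init__(self):
        self.deployed = []

    def translate(self, sla_terms):
        return "\n".join(f"{key.upper()}={value}" for key, value in sorted(sla_terms.items()))

    def deploy(self, description):
        self.deployed.append(description)
        return f"vm-{len(self.deployed)}"


class Connector:
    """Façade: callers name a provider and never see provider-specific details."""

    def __init__(self):
        self._plugins = {}

    def register(self, provider: str, plugin: ConnectorPlugin) -> None:
        self._plugins[provider] = plugin

    def deploy(self, provider: str, sla_terms: dict) -> str:
        plugin = self._plugins[provider]
        return plugin.deploy(plugin.translate(sla_terms))


connector = Connector()
fake_plugin = FakeInfrastructurePlugin()
connector.register("fake", fake_plugin)
print(connector.deploy("fake", {"memory": 1024, "cores": 2}))
```

Supporting another provider then only requires registering another plug-in; the callers of `Connector.deploy` stay unchanged.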
38. ARCHITECTURE
• Catalog
– Stores relevant
information regarding
every element in the
framework
(SLAs, monitoring
information, etc.).
– Globally accessible.
– Provides a RESTful API.
Third-party monitoring
systems use it to store information.
40. USE CASE
• Resolution of a complete use case by
Cloudcompaas.
– Description, deployment and management of a
Cloud service.
• Validation of the Cloudcompaas methodology.
– Shows the qualitative benefits of this approach.
• Measure of the performance of the QoS
assessment capabilities of Cloudcompaas.
– Calculates the benefit of providing elasticity to a
Cloud service.
41. SERVICE DEVELOPER
• A developer registers an
application in
Cloudcompaas.
– jLinpack, a Java
implementation of
Linpack.
• The developer registers
the application bundle
and SLA fragment in
Cloudcompaas.
<Template>
<Service Name="jLinpack">
<ServiceDescription>
A Java implementation of the Linpack benchmark.
</ServiceDescription>
<CreationConstraints>
<Item Name="JavaVR">
<Location>
/VirtualContainer/VirtualRuntime[@Name='openjdk-6-jre']
</Location>
</Item>
</CreationConstraints>
</Service>
<GuaranteeTerm Name="JLINPACK-PRICE">
<ServiceLevelObjective>
<KPITarget>
<KPIName>STATE</KPIName>
<CustomServiceLevel>
JLINPACK_STATE eq 'Ready'
</CustomServiceLevel>
</KPITarget>
</ServiceLevelObjective>
<BusinessValueList>
<Reward>
<AssessmentInterval>
<TimeInterval>PT1M</TimeInterval>
</AssessmentInterval>
<ValueExpression>0.001*ACT_REPLICAS</ValueExpression>
</Reward>
</BusinessValueList>
</GuaranteeTerm>
</Template>
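One way to read the JLINPACK-PRICE term is that, for every one-minute assessment interval (PT1M) in which the service is Ready, the provider is charged 0.001 per active replica. Under that assumption, the price of a deployment is a simple sum. The function `price` and the replica schedules below are hypothetical; the currency unit is not stated in the template.

```python
from decimal import Decimal

RATE_PER_REPLICA_MINUTE = Decimal("0.001")  # from the ValueExpression 0.001*ACT_REPLICAS


def price(replicas_per_minute, rate=RATE_PER_REPLICA_MINUTE):
    """Sum the reward over one-minute intervals, given the active replicas in each."""
    return sum(rate * replicas for replicas in replicas_per_minute)


# One hour with 7 replicas all the time.
fixed_price = price([7] * 60)

# One hour with 1 replica for the first half and 4 replicas for the second half.
elastic_price = price([1] * 30 + [4] * 30)

print(fixed_price, elastic_price)
```

`Decimal` is used so that the sums stay exact: the fixed schedule costs 0.42 and the variable one 0.15.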
42. SERVICE PROVIDER
• A provider deploys jLinpack instances to serve
users.
• He queries the system in order to retrieve SLAs
that describe a jLinpack Cloud service.
• The provider issues a query using the SLA
Manager REST interface.
GET slamanager/agreement/template?include=Service+jLinpack
• Cloudcompaas returns three SLAs, each one
with a different VM configuration.
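The query string can be built with Python's standard library, as sketched below. The host and port in `BASE_URL` and the function name `template_query_url` are assumptions; only the path and the `include` parameter come from the slide. The HTTP request itself is left out so that the sketch runs without a Cloudcompaas deployment.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "http://localhost:8080/"  # hypothetical location of the SLA Manager


def template_query_url(include: str, base_url: str = BASE_URL) -> str:
    """Build the URL of the SLA template search for the given 'include' expression."""
    endpoint = urljoin(base_url, "slamanager/agreement/template")
    # urlencode encodes the space as '+', which matches the query on the slide.
    return endpoint + "?" + urlencode({"include": include})


query_url = template_query_url("Service jLinpack")
print(query_url)
```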
43. ELASTICITY RULES
• The provider wants to add
elasticity capabilities to
jLinpack.
• He chooses the QoS rules that
control the elasticity for the
service from an ontology.
• These rules are predefined by
Cloudcompaas.
• The rules determine when new
replicas should be deployed
based on monitoring
information.
– E.g. if the average CPU load is
higher than 90%, deploy a new
replica.
<GuaranteeTerm Name="SCALE_OUT">
<QualifyingCondition>
MAX_REPLICAS gt ACT_REPLICAS
</QualifyingCondition>
<ServiceLevelObjective>
<KPITarget>
<CustomServiceLevel>
list.avg(CPUPERC) le 90
</CustomServiceLevel>
</KPITarget>
</ServiceLevelObjective>
</GuaranteeTerm>
44. SERVICE USERS
• A user sends a request to the jLinpack service.
• Several users can be served concurrently by a
replica. The higher the number of users, the
higher the response time.
• If a request takes more than 10 seconds to
complete, it times out and counts as
a failure.
• Replicas are balanced by an ad-hoc
load-balancing service.
45. QOS ASSESSMENT
• The jLinpack Cloud service has been deployed in a local Cloudcompaas
deployment. The service VMs are deployed in a ONE 3.2 on-premise Cloud.
• The number of user requests per unit of time has been modelled after the user
load profile of different EGI scenarios (Chemistry, Fusion). These load profiles
have been scaled to fit the experiment size.
• Two experiments for different user loads.
• Two configurations, fixed and elastic.
– Fixed: 7 replicas for the complete experiment.
– Elastic: Variable number of replicas (1-7) managed by Cloudcompaas elasticity
rules.
• Metrics:
– Price of the service.
– Number of failed user requests.
– Average revenue per user (ARPU): Metric used in telecommunications to measure
the revenue produced by a single user.
– Break-even point (Be): Profit per user that yields the same profit for both
configurations.
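The break-even point can be derived under a simple model that is an assumption of this sketch, not a formula taken from the dissertation: every successfully served request earns the same profit Be, a failed request earns nothing, and the profit of a configuration is Be times the served requests minus the price of the service. Equating the profit of both configurations gives Be = (price_fixed - price_elastic) / (failed_elastic - failed_fixed). The function name and the numbers in the example are hypothetical.

```python
def break_even_profit_per_user(price_fixed, price_elastic, failed_fixed, failed_elastic):
    """Return the profit per served request at which both configurations earn the same.

    Model (assumed): profit = Be * (requests - failed) - price, same requests for both.
    """
    extra_failures = failed_elastic - failed_fixed
    if extra_failures == 0:
        raise ValueError("both configurations fail the same number of requests")
    return (price_fixed - price_elastic) / extra_failures


# Hypothetical measurements: the elastic configuration is cheaper but fails more requests.
be = break_even_profit_per_user(
    price_fixed=0.42, price_elastic=0.15, failed_fixed=10, failed_elastic=40
)
print(be)
```

Under this model, a service whose profit per request is below Be earns more with the elastic configuration, and one above Be earns more with the fixed configuration.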
48. METHODOLOGY
• Simulates an on-premise cloud of 20 machines that allocates VMs
for users.
• Users request a certain quantity of CPU and memory and the
system provides them with the VM that most closely matches their
request.
• Two scenarios:
• Static templates: The system provides users with 7 predefined VM
templates.
• Composed templates: The system composes a template for each user
request, using 64 fragments for CPU and memory.
• Metrics.
• Average number of active nodes.
• Rejected users.
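The difference between the two scenarios can be sketched with a toy matcher. The template and fragment values below are hypothetical and much smaller than the 7 templates and 64 fragments of the simulation, and "closest match" is interpreted here as the smallest offer that still covers the request.

```python
# Hypothetical predefined VM templates as (cores, memory in MB).
TEMPLATES = [(1, 512), (1, 1024), (2, 2048), (4, 4096)]

# Hypothetical independent fragments for CPU and memory.
CPU_FRAGMENTS = [1, 2, 3, 4]
MEMORY_FRAGMENTS = [256, 512, 768, 1024, 2048, 4096]


def static_match(cores, memory):
    """Smallest predefined template that covers the request, or None to reject it."""
    fitting = [t for t in TEMPLATES if t[0] >= cores and t[1] >= memory]
    return min(fitting, default=None)


def composed_match(cores, memory):
    """Compose a template from the smallest CPU and memory fragments that fit."""
    cpu = min((c for c in CPU_FRAGMENTS if c >= cores), default=None)
    mem = min((m for m in MEMORY_FRAGMENTS if m >= memory), default=None)
    return None if cpu is None or mem is None else (cpu, mem)


print(static_match(3, 600))    # over-provisions: only the largest template fits
print(composed_match(3, 600))  # closer fit: fewer wasted cores and less wasted memory
```

Because the composed offer wastes less capacity per VM, more VMs fit on each node, which is consistent with the lower number of active nodes and rejections reported for the composed scenario.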
49. EXPERIMENTAL RESULTS
METHODOLOGY UTILITY
• Parameters of the simulation to produce different configurations.
– Node capacity (memory and cores).
– Arrival rate λ.
– VM time to live (TTL).
• The values for number of active nodes and rejection rate are
consistently lower for the composed scenario.
50. EXPERIMENT CONCLUSIONS
• The use case highlights the qualitative benefits of Cloudcompaas.
– The Service developer can describe his service independently of other resources.
– The Service provider only needs to specify his requirements to search for Cloud
services.
– Cloudcompaas doesn’t need to explicitly predefine Cloud services for each
resource. This avoids the combinatorial explosion.
• The elastic configuration yields a lower price and a higher number
of failures. For services that expect a small profit per user the
tradeoff is positive.
• The performance of the elastic configuration highly depends on the
load profile.
– The same configuration produces different results depending on the load.
– Works best for highly variable or unpredictable loads.
• The Cloudcompaas methodology is able to improve the utilization
of resources in a Cloud deployment by better adjusting the resource
assignment to users.
52. CONCLUDING REMARKS
• Generic and extensible methodology for the
representation of Cloud services using SLAs.
• SLA-driven architecture for Cloud Services management.
• Cloudcompaas, an open source framework
implementation.
• Experiments validating the benefits of the methodology
and framework.
• Future work:
– Restriction representation system.
– Negotiation protocol.
– Decision making system.
53. CONTRIBUTIONS
• Journal papers
– Andrés García and Ignacio Blanquer, "Cloud domain
representation using SLA composition", Journal of Grid
Computing, Accepted, 2014, Impact factor 1.603, Q1
– Miguel Caballer et al., “CodeCloud: A Platform to Enable
Execution of Programming Models on the Clouds”
Journal of Systems and Software, DOI 10.1016/j.jss.2014.02.005,
2014, Impact factor 1.135, Q2
– Andrés García, Ignacio Blanquer and Vicente Hernández, "SLA-driven
dynamic cloud resource management". Future Generation
Computer Systems (2013), 10.1016/j.future.2013.10.005, Impact
factor 1.864, Q1
– Andrés García et al., “Performance enhancement of a GIS-based
facility location problem using desktop grid infrastructure”. Earth
Science Informatics pp. 1-9 (2013). DOI 10.1007/s12145-013-0119-1,
Impact factor 0.404, Q4
54. CONTRIBUTIONS
• Conference papers
– Toni Mastelic, Ivona Brandic and Andrés García, “Towards Uniform Management of
Cloud Services by applying Model-Driven Development”, COMPSAC, Under Review
– Miguel Caballer, Andrés García, Germán Moltó and Carlos de Alfonso, “Towards SLA-driven
Management of Cloud Infrastructures to Elastically Execute Scientific
Applications”, Ibergrid 2012.
– Andrés García, Carlos de Alfonso, and Vicente Hernández, “Overview of current
commercial PaaS platforms”, “IWCCTA 2011 - International Workshop on Cloud
Computing, Technology and Applications”, inside the framework of the conference
“ICSOFT 2011 – 6th International Conference on Software and Data Technologies”, July
2011
– Andrés García et al., “Biomass@UPV: Computacional Resources Optimization of GIS-based
Applications using a BOINC Infraestructure”, 3rd Iberian Grid Infrastructure
Conference Proceedings, May 2009
– Andrés García, Carlos de Alfonso and Vicente Hernández, “Design of a Platform of
Virtual Service Containers for Service Oriented Cloud Computing”, CGW 2009
Proceedings. March 2010
55. CONTRIBUTIONS
• Research projects
– (2011-2013) Servicio avanzados para el despliegue y contextualización de aplicaciones
virtualizadas para dar soporte a modelos de programación en entornos Cloud. Ministerio de
Educación y Ciencia, Gobierno de España. Ref. TIN2010-17804
– (2006-2008) Supporting and structuring Healthgrid activities & research in Europe:
Developing a roadmap. European Commission. Ref. 027694
• Research visit
– 01/02/2013~30/04/2013. Distributed Systems Group, Technische Universität Wien. Integration
of the M4Cloud tool with the Cloudcompaas framework. Supervisor: Ivona Brandić.
• Marie Curie ITN postdoc position at IBM Haifa Labs. 18 months from May 2014 to
October 2015.
• Cloudcompaas framework.
– http://www.grycap.upv.es/compaas/
56. CONTRIBUTIONS
• GitHub repository of
source code.
• All code available under
a BSD 3-clause license.
• Redmine used to track
bugs and features.