A presentation from an MSc seminar course at the University of Cyprus on cloud computing, elasticity, and the research interests of the CELAR project (http://www.celarcloud.eu).
CELAR Components Featured
=======================
- Cloud Application Management Framework (CAMF)
- JCatascopia Cloud Monitoring Framework
- ADVISE Cloud Elasticity Evaluation Framework
2. Demetris Trihinas
Back in the old days…
10 March 2015, University of Cyprus
you had an idea… but no money… difficult to start an online business…
“In the early days (20 years ago), most new e-commerce sites, for example, cost a million
dollars to set up. Now the price is closer to $100” [M. Zwilling, NY Times, Jul 2014]
Why was it so difficult in the past?
3. Demetris Trihinas
Motivation
Meet John!
• He is an ambitious CS student
• He wants to develop a “youtube” alternative
• John has no money, so he must use his knowledge and open-source tools to develop his system
4. Demetris Trihinas
Online Video Streaming Service #1
• From CS courses John learns about web service development
• Client-Server model
– Clients (John’s family): desktop client (e.g. Java) to connect with the server
– Server (John’s computer): hosts the database (e.g. MySQL)
– Clients send upload/download video requests and the server sends a response
• Processing done on client side (thick clients)
• Updating application logic code is not easy
5. Demetris Trihinas
Online Video Streaming Service #2
• Video streaming service is attracting more users (family & friends)
• John’s software development skills are getting better
• 3-tier web application
– Presentation layer: CMS website (e.g. Joomla), HTML/CSS
– Application logic layer: RESTful API, Apache Tomcat
– Data storage backend: e.g. MySQL
[Figure: clients (John’s friends and family) upload/download video through the application server, which stores/extracts video in the database]
6. Demetris Trihinas
Online Video Streaming Service #2
• John’s family and friends like the video service. They are telling
their friends about it.
• Scalability
– John’s system cannot sustain the increasing number of users
– Replace application server, database server with “stronger” ones
– Buying new servers is expensive
• Maintenance
– Software/hardware updates/upgrades
– Cooling, Security, Backups, etc.
7. Demetris Trihinas
Back in the old days…
Why was it so difficult in the past?
• Infrastructure
– Hardware
– Software licences
• Maintenance
– Software updates
– Hardware upgrades
– Cooling
– Security
– Backups
10. Demetris Trihinas
Cloud Computing
A model for enabling ubiquitous, convenient, on-demand
network access to a shared pool of configurable computing
resources (e.g., networks, servers, storage, applications, and
services) that can be rapidly provisioned and released with
minimal management effort or service provider interaction.
NIST definition, 2011
source: http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
11. Demetris Trihinas
http://cloudtweaks.com/2012/09/true-facts-to-help-you-talk-about-cloud-computing-in-the-social-scene/
14. Demetris Trihinas
Online Video Streaming Service #3
• John moves video service to the Cloud!
• He has learnt about Cloud application development
• Video service is now scalable
[Figure: clients upload/download video through a Load Balancer, which distributes client requests across the Application Server Tier; the application servers store/extract video in the Database Backend; all tiers are hosted at a Cloud Provider]
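The request distribution step can be sketched as a round-robin balancer. This is only an illustrative sketch: the class name, the server names and the round-robin policy itself are assumptions, since the slides do not say which balancing strategy John's service uses.

```python
class RoundRobinBalancer:
    """Hand out application servers in turn (hypothetical sketch)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._count = 0  # number of requests routed so far

    def add_server(self, server):
        # Scaling out: new servers join the rotation immediately.
        self.servers.append(server)

    def route(self, request_id):
        if not self.servers:
            raise RuntimeError("no application servers available")
        server = self.servers[self._count % len(self.servers)]
        self._count += 1
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])
routed = [balancer.route(r) for r in range(4)]
print(routed)  # ['app-1', 'app-2', 'app-1', 'app-2']
```

Adding a server with `add_server` is what makes the tier scalable in this sketch: later requests are spread over the larger pool without any change on the client side.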
15. Demetris Trihinas
Elasticity in Cloud Computing
• Ability of a system to expand or contract its dedicated resources
to meet the current demand
[Figure: workload (req/s) over time (s). Allocate resources to increase throughput; de-allocate unused resources to reduce cost; provision only the required resources]
Stakeholders state that elasticity
(54%) and cost reduction (48%)
are driving cloud adoption
[FOC Survey 2013]
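Provisioning only the required resources can be illustrated with a small sizing function. The per-server capacity of 100 req/s and the workload trace are made-up numbers for illustration only.

```python
import math


def required_servers(workload_rps, capacity_rps, minimum=1):
    """Smallest number of servers that can absorb the workload."""
    if capacity_rps <= 0:
        raise ValueError("capacity_rps must be positive")
    return max(minimum, math.ceil(workload_rps / capacity_rps))


# Hypothetical workload trace in requests per second.
trace = [40, 120, 310, 90]
plan = [required_servers(w, capacity_rps=100) for w in trace]
print(plan)  # [1, 2, 4, 1]
```

The plan expands while the workload rises and contracts again when it falls, which is the behaviour the definition above describes.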
16. Demetris Trihinas
Elasticity Control
• MAPE-K control loop (Monitoring, Analysing, Planning, Executing using Knowledge)
[Figure: MAPE-K loop around an Elastic Cloud Service: Monitor -> Analyse -> Plan -> Execute, all sharing Knowledge; the monitored inputs are Resource Utilization and Application Behaviour]
“…automatic resource provisioning is challenging due to the fact that monitoring and managing elastic cloud services is not a trivial task…”
"Managing and Monitoring Elastic Cloud Applications", D. Trihinas and C. Sofokleous and N. Loulloudes and A. Foudoulis
and G. Pallis and M. D. Dikaiakos, 14th International Conference on Web Engineering (ICWE 2014), Toulouse, France 2014
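One iteration of a MAPE-K loop can be sketched as four small functions around a shared knowledge object. Everything here is an assumption made for illustration (the CPU thresholds, the dictionary standing in for the cloud service, the function names); it is not the CELAR controller.

```python
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    """Shared state: thresholds plus a history of past decisions."""
    high_cpu: float = 80.0  # assumed scale-out threshold (percent)
    low_cpu: float = 20.0   # assumed scale-in threshold (percent)
    history: list = field(default_factory=list)


def monitor(service):
    # Take a snapshot of the metrics we care about.
    return {"cpu": service["cpu"], "vms": service["vms"]}


def analyse(metrics, knowledge):
    if metrics["cpu"] > knowledge.high_cpu:
        return "overloaded"
    if metrics["cpu"] < knowledge.low_cpu and metrics["vms"] > 1:
        return "underused"
    return "ok"


def plan(symptom):
    # Map a symptom to a change in the number of VMs.
    return {"overloaded": 1, "underused": -1}.get(symptom, 0)


def execute(service, delta):
    service["vms"] += delta


def mape_k_step(service, knowledge):
    metrics = monitor(service)
    symptom = analyse(metrics, knowledge)
    delta = plan(symptom)
    execute(service, delta)
    knowledge.history.append((metrics, symptom, delta))
    return delta


service = {"cpu": 93.0, "vms": 2}
knowledge = Knowledge()
print(mape_k_step(service, knowledge), service["vms"])  # 1 3
```

Keeping the history in the knowledge object is what lets later analysis and planning steps learn from earlier decisions instead of reacting to each sample in isolation.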
17. Demetris Trihinas
Online Video Streaming Service #5
• John decides to use an elasticity controller to scale his application
[Figure: clients upload/download video through a Load Balancer that distributes client requests across the Application Server Tier, which stores/extracts video in the Database Backend at the Cloud Provider; an Elasticity Controller adds/removes resources using rules such as:]
if (metricA > X) then add VM
else if (metricA < X) then remove VM
else if (metricB > Y) then increase RAM
…
Elasticity constraints are too complex for users and are based on low-level metrics
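The IF-THEN-ELSE rules on this slide can be sketched as a rule table evaluated in order, where the first matching rule wins. The metric names and threshold values below are hypothetical stand-ins for metricA, metricB, X and Y.

```python
import operator

# (metric name, comparison, threshold, action); all values are hypothetical.
RULES = [
    ("cpu_usage", operator.gt, 80.0, "add VM"),
    ("cpu_usage", operator.lt, 20.0, "remove VM"),
    ("mem_usage", operator.gt, 90.0, "increase RAM"),
]


def decide(metrics, rules=RULES):
    """Return the action of the first rule whose condition holds, else None."""
    for name, compare, threshold, action in rules:
        if name in metrics and compare(metrics[name], threshold):
            return action
    return None


print(decide({"cpu_usage": 91.0, "mem_usage": 95.0}))  # add VM
print(decide({"cpu_usage": 50.0, "mem_usage": 95.0}))  # increase RAM
print(decide({"cpu_usage": 50.0, "mem_usage": 40.0}))  # None
```

Even in this tiny form the user has to pick metric names, thresholds and rule order by hand, which illustrates why such low-level constraints become hard to manage.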
18. Demetris Trihinas
Current Elasticity Controllers
• Manual or semi-automated elasticity control
• Vendor-specific (e.g. AutoScaling)
• Elasticity modelled as a one-dimensional property
– No control over cost, performance and quality
• Only fine-grained elasticity control
– e.g. add/remove virtual instances
19. Demetris Trihinas
www.celarcloud.eu
• Fully Automated
• Intelligent Decision Making Algorithms
• Application Management
• Vendor Neutrality
• Multi-layer Scalable Monitoring
• Multi-Dimensional Control
• Open-Source
• Multi-Grain Elasticity Control
22. Demetris Trihinas
CAMF
• A Cloud Application Management Framework providing developers with a complete set of graphical tools for:
– Describing cloud application topologies
– Defining elasticity requirements and scaling actions
– Deploying cloud application description(s) on any cloud platform by adopting the open OASIS TOSCA standard
– Managing the complete lifecycle of a cloud application
• Open-source (on top of the Eclipse Rich Client Platform)
23. Demetris Trihinas
Cloud Application Management
• Emerging technology
• CSC acquired ServiceMesh for $350M
[Logo: CloudFormation]
• Current frameworks lack:
– Application portability – “describe once, deploy anywhere”
– “vendor neutrality (interoperability) is one of the main challenges in cloud application management” [Gartner, CMP Landscape 2012]
24. Demetris Trihinas
CAMF
“c-Eclipse: An Open-Source Management Framework for Cloud Applications”, C. Sofokleous, N. Loulloudes, D. Trihinas, G. Pallis and M. D. Dikaiakos, 20th International Conference on Parallel Processing (Euro-Par 2014), Porto, Portugal 2014
[Figure: CAMF diagram; visible label: CSAR]
27. Demetris Trihinas
Elasticity Policy Definition
• Multi-grain elasticity policy definition
– Application’s constraints related to cost, performance and quality metrics
– Express specific strategies to be enforced when constraints are violated
• Based on the powerful and flexible SYBL definition language
Elasticity Policy View -> No knowledge of SYBL is required!
28. Demetris Trihinas
Cloud Provider Selection
• Users can:
– Select a Cloud provider to deploy their application(s)
– Add a new provider to the list by providing their CELAR endpoint and authentication
credentials
30. Demetris Trihinas
John’s Video Service Description via CAMF
The status of the deployments is shown in the Application Deployments View
32. Demetris Trihinas
Cloud Monitoring Challenges
• Monitor heterogeneous types of information and resources
• Extract metrics from multiple levels of the cloud
– Low-level metrics (e.g. CPU usage, network traffic)
– High-level metrics (e.g. application throughput, latency, availability)
• Metrics collected at different time granularities
• Non-intrusiveness
33. Demetris Trihinas
Cloud Monitoring Challenges
• Cloud Platform Independence
– If a cloud service is portable then it can be moved to another platform due to better pricing schemes, availability, QoS, etc.
• Monitoring System?
– Portable
– Easily configurable on new platform
[Figure: a Cloud Service and its Monitoring moving from Provider A to Provider B]
Vendor lock-in concerns have dropped 45% [GIGAOM 2014]
34. Demetris Trihinas
Cloud Monitoring Challenges
• Interoperability
– Distribute a cloud service across multiple providers due to better resource locality, availability or security concerns
• Monitoring System?
– Operate and collect metrics seamlessly across multiple providers
[Figure: one Cloud Service, with its Monitoring, distributed across Provider A, Provider B and Provider C]
42% are interested in adopting hybrid cloud. Estimated to rise to 55% by 2016 [GIGAOM 2014]
35. Demetris Trihinas
Cloud Monitoring Challenges
• Elasticity Support
– Detect configuration changes in a cloud service
• Monitoring System?
– Detect configuration changes automatically, without restarting the monitoring process or part of it and without any human intervention
[Figure: a Cloud Service made of VMs before and after a change: application topology changes (e.g. new VM added) and allocated resource changes (e.g. new disk attached to VM)]
36. Demetris Trihinas
Cloud Specific Monitoring Tools
• Public and Private cloud providers offer monitoring capabilities
– Fully documented
– Well integrated with underlying platform
– REST APIs and graphical web interfaces
– Automated notification and alerting mechanisms
• Commercial and proprietary -> limited portability and interoperability
37. Demetris Trihinas
JCatascopia Monitoring System
• Open-source
• Multi-Layer Cloud Monitoring
– Customizable and Extensible by Users
– Metric Subscription Rule Language and Mechanism
• Platform Independent
– Operates on any cloud platform since neither metric collection, distribution nor storage depends on the underlying infrastructure
38. Demetris Trihinas
JCatascopia Monitoring System
• Interoperable
– Support for applications distributed across multiple cloud platforms
• Capable of Supporting Elastic Cloud Services
– JCatascopia Pub/Sub Message Communication Protocol
• Scalable
"JCatascopia: Monitoring Elastically Adaptive Applications in the Cloud", D. Trihinas and G. Pallis and M. D.
Dikaiakos, 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), 2014
39. Demetris Trihinas
JCatascopia Pub/Sub Message Communication Protocol
• Elasticity support
– Automatic monitoring instance discovery and removal
– Dynamic resource configuration (e.g. new disk is attached at runtime)
– Dynamic network interface change at runtime (e.g. elastic IP)
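Automatic discovery and removal of monitoring instances can be sketched with a registry that learns about agents from the messages they publish and forgets agents that fall silent. This is a simplified illustration, not the JCatascopia protocol: the class, the 30-second timeout and the message shape are all assumptions.

```python
class AgentRegistry:
    """Track monitoring agents purely from the messages they publish."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s
        self._last_seen = {}  # agent id -> time of last message
        self._metrics = {}    # agent id -> set of metric names seen

    def on_message(self, agent_id, metric_names, now):
        """Register unknown agents on first contact; refresh known ones."""
        is_new = agent_id not in self._last_seen
        self._last_seen[agent_id] = now
        # New metric names (e.g. for a disk attached at runtime) are
        # picked up without restarting anything.
        self._metrics.setdefault(agent_id, set()).update(metric_names)
        return is_new

    def expire(self, now):
        """Drop agents that have been silent for longer than the timeout."""
        dead = [a for a, t in self._last_seen.items()
                if now - t > self.timeout_s]
        for agent_id in dead:
            del self._last_seen[agent_id]
            del self._metrics[agent_id]
        return dead

    def agents(self):
        return sorted(self._last_seen)


registry = AgentRegistry(timeout_s=30.0)
registry.on_message("vm-1", ["cpu"], now=0.0)
registry.on_message("vm-2", ["cpu"], now=10.0)
registry.on_message("vm-1", ["disk2"], now=25.0)
print(registry.expire(now=45.0))  # ['vm-2']
print(registry.agents())          # ['vm-1']
```

Because registration happens on first contact, a newly added VM needs no manual configuration on the server side, and a removed VM disappears once it stops publishing.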
40. Demetris Trihinas
Multi-Tier Monitoring
avgActiveConnections = AVG(busyThreads)
MEMBERS = [id1, ... ,idN]
ACTION = NOTIFY(<70, >=140)
avgCPUUsage = AVG(1-cpuIdle)
MEMBERS = [id1, ... ,idN]
ACTION = NOTIFY(<30, >=85)
JCatascopia Metric Rule Language and Mechanism
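What a rule such as avgCPUUsage expresses can be sketched in a few lines: aggregate a raw metric over the listed members and notify when the result leaves a band. This is not JCatascopia code. It assumes cpuIdle is reported as a percentage (so usage is 100 minus idle) and reads NOTIFY(<30, >=85) as "notify below 30 or at 85 and above"; the function name and return shape are hypothetical.

```python
from statistics import mean


def evaluate_cpu_rule(cpu_idle_by_member, members, low=30.0, high=85.0):
    """Average CPU usage over members; return (value, notification or None)."""
    usage = [100.0 - cpu_idle_by_member[m]
             for m in members if m in cpu_idle_by_member]
    if not usage:
        return None, None  # no member has reported yet
    value = mean(usage)
    if value < low:
        return value, "NOTIFY: below lower bound"
    if value >= high:
        return value, "NOTIFY: at or above upper bound"
    return value, None


# Hypothetical cpuIdle readings (percent) keyed by agent id.
readings = {"id1": 10.0, "id2": 20.0, "id3": 90.0}
print(evaluate_cpu_rule(readings, ["id1", "id2"]))
print(evaluate_cpu_rule(readings, ["id1", "id3"]))
```

Aggregating on the monitoring side means a subscriber receives one derived value per rule instead of one raw sample per member.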
41. Demetris Trihinas
JCatascopia: Portability and Interoperability
• XDB In-Memory Data Analytics
• SCAN Genome Pipeline
• Multi-Graph Clustering in the Cloud
• Online Gaming
• Multi-Tier Video Streaming
42. Demetris Trihinas
JCatascopia: Advance over State-of-the-Art
[Figure: Monitoring Agent Runtime Footprint for a 3-tier Video Streaming Service; panels: HAProxy Load Balancer, Cassandra DB Node, Tomcat Application Server, Online Directory Node]
As metric count increases, Ganglia doubles its runtime footprint since custom application-specific metrics are external processes, in contrast to JCatascopia where Probes are loaded as lightweight Java threads
43. Demetris Trihinas
JCatascopia: Advance over State-of-the-Art
[Figure: Network Utilization for 3-tier Video Streaming Service]
When in need of application-level monitoring, for a small runtime overhead, JCatascopia can reduce monitoring network traffic and consequently monitoring cost
48. Demetris Trihinas
JCatascopia: Scalability Evaluation
When archiving time is high, we can direct monitoring metric traffic through multiple Monitoring Servers, allowing the monitoring system to scale
[Figure: Elastically Control JCatascopia. Cluster nodes (#1a … #M, #1b … #K, and a newly added node #K+1) each run a Monitoring Agent (A) and send metrics to multiple Monitoring Servers (MS), with a Web Service on top]
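One simple way to spread agents over several Monitoring Servers is a stable hash of the agent id. The slides do not say how JCatascopia assigns agents to servers, so the hashing scheme, function name and server names below are assumptions used only to illustrate the idea.

```python
import zlib


def assign_server(agent_id, servers):
    """Pick a Monitoring Server for an agent using a stable CRC32 hash."""
    if not servers:
        raise ValueError("at least one Monitoring Server is required")
    # zlib.crc32 is deterministic across runs, unlike the built-in hash()
    # for strings, so an agent keeps its server after a restart.
    index = zlib.crc32(agent_id.encode("utf-8")) % len(servers)
    return servers[index]


servers = ["ms-1", "ms-2", "ms-3"]
placement = {a: assign_server(a, servers) for a in ["node-1a", "node-1b", "node-k"]}
print(placement)
```

A plain modulo scheme like this reshuffles many agents whenever the number of servers changes; consistent hashing would be the usual refinement if that churn matters.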
49. Demetris Trihinas
JCatascopia: Release and Exploitation
• Open-source under Apache 2.0 Licence
• JCatascopia Website (docs, examples, videos, publications, etc.)
• Packaging (JARs, tarballs, RPMs and Chef recipes) available in CELAR repo
• JCatascopia Probe Library and Java Probe API
– System-level monitoring probes (for both Linux and Windows)
– Application-specific probes (Tomcat, Cassandra DB, HAProxy, Postgres DB, RabbitMQ)
• Supporting 2 Different Database Backends (MySQL, Cassandra DB)
https://github.com/CELAR/cloud-ms
http://linc.ucy.ac.cy/CELAR/jcatascopia
https://github.com/dtrihinas/JCatascopia-Probe-Library
50. Demetris Trihinas
So is simple elasticity control based on user-defined directives enough?
51. Demetris Trihinas
Elasticity Control Estimation and Evaluation
• How should we interpret a sudden drop in request throughput at the business tier of a 3-tier cloud service?
– There are fewer clients, which leaves the business tier inefficiently utilized
  Right Decision: Remove an Application Server
– The video storage backend is under-provisioned, so requests are getting queued at the business tier
  Right Decision: Add another Database Node
An Elasticity Controller with simple IF-THEN-ELSE policies based on metric violations cannot determine the right Elasticity Control Process (ECP) to improve QoS or cost
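The ambiguity above can be made concrete: the same throughput drop calls for opposite actions depending on a second signal. The sketch below separates the two cases using the number of requests queued at the business tier; the function name, parameters and queue limit are hypothetical and chosen only for illustration.

```python
def interpret_throughput_drop(client_delta, queued_requests, queue_limit=100):
    """Choose an action for a throughput drop using two extra signals.

    client_delta: change in the number of active clients (negative = fewer).
    queued_requests: requests currently waiting at the business tier.
    """
    if queued_requests > queue_limit:
        # Requests pile up: the backend is the bottleneck.
        return "add database node"
    if client_delta < 0:
        # Demand itself went down: the business tier is over-provisioned.
        return "remove application server"
    return "no action"


print(interpret_throughput_drop(client_delta=-40, queued_requests=3))   # remove application server
print(interpret_throughput_drop(client_delta=0, queued_requests=250))   # add database node
```

A rule that looks only at business-tier throughput cannot tell these cases apart, which is the gap that correlating metrics across tiers is meant to close.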
52. Demetris Trihinas
ADVISE Framework
Input
• Cloud Service topology description (CAMF)
• Multi-layer monitoring metric evolution (JCatascopia)
• Elasticity Control Processes (rSYBL)
• Cloud specific info (Info Service)
Processing
• Project metric evolution on n-dimensional space
• Cluster metrics and discover (or better learn) metric correlations
• Create execution plan based on historic info to improve resource utilization, QoS and reduce cost
Knowledge Base
• Metric evolution
• Metric correlations
• ECPs and possible plans
-> Collect more metrics
-> Refine clusters and discover new correlations
-> Increase our knowledge base
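As a much simpler stand-in for the correlation-discovery step, the sketch below computes pairwise Pearson correlation between metric time series and reports strongly related pairs. ADVISE's actual clustering of metric evolution is more involved and is not reproduced here; the metric names, sample values and the 0.9 threshold are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (dev_x * dev_y)


def correlated_pairs(series, threshold=0.9):
    """Return metric pairs whose absolute correlation reaches the threshold."""
    names = sorted(series)
    return [(a, b)
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if abs(pearson(series[a], series[b])) >= threshold]


# Hypothetical samples collected at the same four points in time.
samples = {
    "db_queue": [1, 2, 3, 4],
    "latency": [10, 20, 30, 40],
    "cpu": [5, 1, 4, 2],
}
print(correlated_pairs(samples))  # [('db_queue', 'latency')]
```

Each newly collected sample can be appended to the series and the pairs recomputed, mirroring the refine-and-learn cycle listed above.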
53. Demetris Trihinas
Elasticity Control Estimation and Evaluation with ADVISE
"ADVISE: a Framework for Evaluating Cloud Service Elasticity Behavior [Best Paper]", G. Copil, D. Trihinas, H.L Truong, D. Moldovan, G.
Pallis, S. Dustdar, M. D. Dikaiakos, 12th International Conference on Service Oriented Computing (ICSOC 2014), Paris, France 2014.
54. Demetris Trihinas
ADVISE-based Multi-Dimensional Control
[Figure: ADVISE-based Control]
A single peak causes a “ping-pong” effect which is billing users for resources they aren’t really consuming
AWS uses an hourly charge rate
“Evaluating Cloud Service Elasticity Behavior", G. Copil, D. Trihinas, H.L Truong, D. Moldovan, G. Pallis, S. Dustdar, M. D. Dikaiakos,
International Journal of Cooperative Information Systems (IJCIS), 2015.
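The cost of the ping-pong effect under hourly charging can be worked through with a small calculation. It assumes every instance run is rounded up to whole billing hours, which is how the hourly charge rate noted above is read here; the start and stop times are invented.

```python
import math


def billed_hours(runs, billing_unit_s=3600):
    """Total billed units when each run is rounded up to whole units."""
    return sum(math.ceil((stop - start) / billing_unit_s)
               for start, stop in runs)


# Ping-pong: a VM is added and removed three times, 5 minutes each (seconds).
ping_pong = [(0, 300), (900, 1200), (1800, 2100)]
# Alternative: keep a single VM running across the same 35 minutes.
steady = [(0, 2100)]

print(billed_hours(ping_pong))  # 3
print(billed_hours(steady))     # 1
```

In this example 15 minutes of actual use is billed as 3 hours, while holding one VM for the full 35 minutes is billed as 1 hour.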
55. Demetris Trihinas
So is CELAR applicable anywhere else other than video streaming?
56. Demetris Trihinas
Use Case: Cancer Genome Detection
• Process large amounts of genomic and proteomic data
[Figure: pipeline stages labelled: CPU and disk I/O intensive; Memory intensive; Disk I/O and memory intensive; Disk I/O, CPU and network intensive]
• Old approach
– Provision HPC cluster with max capacity
Acknowledgements
www.celarcloud.eu
co-funded by the European Commission
source code: https://github.com/CELAR/
website: http://linc.ucy.ac.cy/CELAR/