The RECAP project aims to develop an architecture for reliable capacity provisioning and enhanced remediation for distributed cloud/edge/fog applications. The architecture includes a feedback loop consisting of a collector, application modeler, workload modeler, optimizer, and simulator. The collector gathers metrics across the infrastructure. The modelers develop models of application behavior and workloads. The optimizer performs optimization tasks like application placement based on these models. The simulator assists the optimizer by simulating different deployments. The project will apply this architecture to use cases from partners in telecommunications, analytics, IoT, and infrastructure management.
The RECAP Project: Large Scale Simulation Framework
1. Reliable Capacity Provisioning and Enhanced
Remediation for Distributed Cloud Applications
http://recap-project.eu recap2020
THIS PROJECT HAS RECEIVED FUNDING FROM THE EUROPEAN UNION’S HORIZON 2020
RESEARCH AND INNOVATION PROGRAMME UNDER GRANT AGREEMENT NUMBER 732667
REliable CApacity Provisioning for
Distributed Cloud/Edge/Fog
Computing Applications
(RECAP)
Sergej Svorobej
Dublin City University
3. What is RECAP?
• Reliable Capacity Provisioning and Enhanced Remediation for
Distributed Cloud/Edge/Fog Applications
• Funded under the European Horizon 2020 framework
• Jan 2017 – Dec 2019
• Consortium comprises many academic and industrial partners
⁃ 9 partners from 5 countries (Germany, Ireland, Spain, Sweden, UK)
• http://recap-project.eu
5. Project Motivation
• Large-scale systems are typically built as distributed systems
• Tradeoffs in the placement of application components:
⁃ Data center: high latency, high power consumption
⁃ Fog/Edge: low latency, low power consumption
• Cloud computing capacity is provisioned using best-effort models and coarse-grained QoS mechanisms
⁃ Not sustainable as the number of connected "things" increases
6. Realisation Approach (1/2)
• An architecture for cloud-edge computing capacity provisioning and
remediation
⁃ Fine-grained and accurate models of application behaviour and deployment
⁃ Model of QoS requirements at application component-level
⁃ Model of workloads
• To understand and predict the behaviour of applications and their users
• To enhance the proactive remediation of systems
9. The RECAP Collector
• Gathers, synthesizes and analyzes metrics to be monitored across
the infrastructure
• Acquires, characterizes, and analyzes data
⁃ Workload patterns and their relationships
⁃ Status of the infrastructure
⁃ …
• Visualizes, annotates, archives, and manages the collected data
Knowledge Discovery
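As a rough illustration (not the project's actual implementation), the Collector's gather-and-summarise step might look like the following minimal Python sketch; the node and metric names are invented:

```python
from collections import defaultdict
from statistics import mean

class MetricCollector:
    """Toy collector: stores metric samples per infrastructure node
    and summarises them for downstream modelling."""

    def __init__(self):
        self._samples = defaultdict(list)  # (node, metric) -> list of values

    def record(self, node, metric, value):
        self._samples[(node, metric)].append(value)

    def summary(self, node, metric):
        values = self._samples[(node, metric)]
        return {"min": min(values), "max": max(values), "avg": mean(values)}

collector = MetricCollector()
for v in (0.40, 0.55, 0.70):
    collector.record("edge-node-1", "cpu_util", v)
stats = collector.summary("edge-node-1", "cpu_util")
```

In the real system the summaries would feed the Modelers rather than being read directly.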
10. The RECAP Application Modeler
• With the knowledge provided by the Collector, the Modeler
discovers and defines
⁃ The internal structure of cloud applications
⁃ The QoS requirements for each application
• To support intelligent decision making
⁃ Application/Component placement and autoscaling
Application Modeling
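To make "internal structure plus per-component QoS" concrete, here is a hypothetical three-tier application expressed as a small Python model (component names and latency figures are invented, not from RECAP):

```python
# Hypothetical three-tier application: component names, per-component QoS,
# and directed call edges along which load propagates.
app_model = {
    "components": {
        "load_balancer": {"max_latency_ms": 5},
        "web_tier": {"max_latency_ms": 50},
        "db_tier": {"max_latency_ms": 20},
    },
    "edges": [("load_balancer", "web_tier"), ("web_tier", "db_tier")],
}

def downstream(model, component):
    """Components directly called by `component` (one hop of load propagation)."""
    return [dst for src, dst in model["edges"] if src == component]
```

A placement or autoscaling decision can then be checked against each component's QoS entry rather than a single application-wide target.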
11. The RECAP Workload Modeler
• Decomposes, classifies, and predicts the workloads in the network,
and the load propagation in applications
⁃ CPU, memory, network traffic, …
• Models the workload distribution and
load propagation patterns
• To improve planning decisions and ensure QoS
• To support the construction of an artificial workload generation tool
⁃ Validating and training the learning models
Workload Modeling
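The artificial workload generation mentioned above could, in its simplest form, look like this Python sketch: a diurnal sinusoidal cycle plus bounded noise (the shape and parameters are illustrative assumptions, not the project's generator):

```python
import math
import random

def synthetic_workload(hours, base=100.0, amplitude=60.0, noise=0.1, seed=42):
    """Artificial request-rate trace: a 24-hour sinusoidal cycle plus
    bounded random noise, one sample per hour."""
    rng = random.Random(seed)
    trace = []
    for h in range(hours):
        diurnal = base + amplitude * math.sin(2 * math.pi * (h % 24) / 24)
        trace.append(max(0.0, diurnal * (1.0 + rng.uniform(-noise, noise))))
    return trace

trace = synthetic_workload(48)  # two simulated days of hourly samples
```

Traces like this can be replayed against a learned model to validate its predictions before trusting it with live data.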
12. The RECAP Optimizer
• With the models provided by the Modelers, the Optimizer performs
optimization tasks
⁃ Application/Component placement: scaling vs. migration
⁃ Infrastructure management decisions: energy, utilization rate,
load balancing
• To improve the efficiency of resource utilization while maintaining QoS
Optimization
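As one possible shape for a placement decision (a greedy first-fit-decreasing heuristic, chosen here for illustration; RECAP's actual optimizer is not specified at this level), capacity-respecting placement can be sketched in a few lines of Python:

```python
def place_components(components, capacities):
    """Greedy first-fit-decreasing placement: place the most demanding
    component first, on the node with the most remaining capacity."""
    free = dict(capacities)  # node -> remaining capacity units
    placement = {}
    for name, demand in sorted(components.items(), key=lambda kv: -kv[1]):
        candidates = [n for n, cap in free.items() if cap >= demand]
        if not candidates:
            raise ValueError(f"no node can host {name!r}")
        best = max(candidates, key=lambda n: free[n])
        placement[name] = best
        free[best] -= demand
    return placement

# Invented demands/capacities: the database is too big for the edge node.
placement = place_components(
    {"db": 4, "web": 2, "lb": 1},
    {"edge-1": 3, "dc-1": 8},
)
```

Real placement must also weigh energy, utilization rate, and load balancing, which is why the Optimizer leans on the Simulator rather than a closed-form heuristic.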
13. The RECAP Simulator
• Simulation is needed due to the size and complexity of the
intended target systems
⁃ Simulate interactions of distributed cloud application behaviors
⁃ Emulate data center and network systems
• To assist the Optimizer with the evaluation of different
deployments of applications and infrastructures
• To feed back to the Data Collector with simulation results
Testing & Improvement
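The "evaluate different deployments" role can be illustrated with a deliberately tiny Python example: score candidate placements of the same components by summed node access latency (node names and latency figures are invented):

```python
def path_latency_ms(path, placement, node_latency_ms):
    """End-to-end latency of a request traversing `path`, summing the
    access latency of the node each component is placed on."""
    return sum(node_latency_ms[placement[c]] for c in path)

node_latency_ms = {"dc": 40, "edge": 5}  # illustrative figures
deployments = {
    "all-cloud": {"lb": "dc", "web": "dc", "db": "dc"},
    "edge-web":  {"lb": "edge", "web": "edge", "db": "dc"},
}
path = ["lb", "web", "db"]  # the request's route through the application
best = min(
    deployments,
    key=lambda d: path_latency_ms(path, deployments[d], node_latency_ms),
)
```

A full simulator replaces the single-number latency table with simulated queues, networks, and contention, but the comparison loop is the same idea.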
14. Use Cases from Industry
• BT
⁃ NFV, QoS Management and
Remediation
• Linknovate
⁃ Complex Big Data Analytics
Engine
• Satec
⁃ Fog Computing and Large
Scale IoT Scenario for
supporting Smart Cities
• Tieto
⁃ Infrastructure and Network
Management
15. WP7 – Large Scale Simulation Framework
Sergej Svorobej
Dublin City University
17. Role of simulation
• Reason on implementation effectiveness
• Show optimisation effects
⁃ Placement
⁃ Consolidation
⁃ Elasticity
⁃ Infrastructure and Application configuration
⁃ Cost
• Provide value to the use case owners
18. High level model
[Figure: high-level simulation model. A device/user issues requests through networks of nodes (Tier 1, Metro, Core) to load-balanced application components (Component 1/1′ and Component 2/2′) behind APIs; each component lists per-request costs for request classes A, B, and C.]
19. Simulation challenges
• Experiment model size
⁃ Network topology (Graph)
⁃ Infrastructure (Physical and Virtual)
⁃ Workload
⁃ Application
• Speed and resource demand
• Accuracy and granularity
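The "network topology (graph)" part of the experiment model can be made concrete with a small sketch that builds the kind of tiered hierarchy shown earlier (Tier 1 → Metro → Core) as an adjacency list; the tier and node names are invented:

```python
from collections import defaultdict

def build_tiered_topology(tiers):
    """Adjacency list for a chain of network tiers: every node in one
    tier links to every node in the next (a crude hierarchical graph)."""
    adj = defaultdict(set)
    for (_, lower), (_, upper) in zip(tiers, tiers[1:]):
        for a in lower:
            for b in upper:
                adj[a].add(b)
                adj[b].add(a)
    return adj

topology = build_tiered_topology([
    ("tier1", ["t1-a", "t1-b", "t1-c"]),
    ("metro", ["m-a", "m-b"]),
    ("core",  ["c-a"]),
])
```

Even this toy graph hints at the scale problem: realistic topologies multiply nodes per tier, and the workload and application models grow on top of them.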
20. Research direction
• Discrete Event Simulation (DES)
⁃ Event Queue simulation engine
⁃ Java 8
⁃ CloudSimPlus based (http://cloudsimplus.org/)
• “Time step” simulation
⁃ Calculation loop
⁃ C/C++, MPI based
⁃ CLSim (https://cloudlightning.eu/)
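The core of the event-queue approach above can be sketched in a few lines. This is a minimal Python illustration of the DES pattern (the project's engine is Java 8 / CloudSimPlus-based, so names here are invented): a priority queue of timestamped events, popped in time order, where handlers may schedule further events.

```python
import heapq

class EventQueueSim:
    """Minimal event-queue engine: events are (time, seq, handler) tuples
    popped in timestamp order; handlers may schedule further events."""

    def __init__(self):
        self._queue = []
        self._seq = 0  # tie-breaker so equal-time events stay FIFO
        self.now = 0.0

    def schedule(self, delay, handler):
        heapq.heappush(self._queue, (self.now + delay, self._seq, handler))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, handler = heapq.heappop(self._queue)
            handler(self)

log = []
sim = EventQueueSim()
sim.schedule(2.0, lambda s: log.append(("arrive-b", s.now)))
sim.schedule(1.0, lambda s: log.append(("arrive-a", s.now)))
sim.run()
```

Note that events fire in timestamp order regardless of scheduling order, and simulated time jumps between events instead of advancing in fixed steps — the key difference from the "time step" approach.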
21. THANK YOU
RECAP Project ■ H2020 ■ Grant Agreement #732667
Call: H2020-ICT-2016-2017 ■ Topic: ICT-06-2016
23. BT
• NFV infrastructure and virtual CDNs
⁃ Softwarization of network appliances
⁃ Multiple distributed applications per CDN operator
• RECAP:
⁃ Automated decision making
• Optimization of placement and scaling of vCDN systems
• Monitoring and Remediation
⁃ Improves resource utilization while
guaranteeing SLAs
24. Linknovate
• Big data analytics engine
⁃ Data acquisition, data aggregation,
data processing, data visualization
• RECAP:
⁃ Characterizes workload and
models workload distribution
⁃ Automatically and dynamically
allocates computing resources
⁃ To reduce costs and improve
performance
25. Satec
• IoT, smart city
⁃ Data management system: collects sensor data and provides it in a distributed manner
• RECAP:
⁃ Optimizes the placement of IoT resources such as computation and storage for cost and latency
⁃ Automated reallocation of resources
⁃ Automated deployment of IoT
applications