With the increasing availability of “edge” resources, understanding how computation should be split across edge devices and data centre-based systems remains an important challenge. Latency-sensitive applications (e.g. stream processing and video analytics) are often constrained by the network connectivity between the edge resource (involved in data collection) and the analytics platform (often reachable over a multi-hop network). We describe how edge resources can be used more effectively in combination with data centre-based resources to support such applications. This work builds on a recent survey paper covering various applications (generally in scientific computing) that could benefit from more effective edge resource/data centre integration. The survey paper is available at: https://arxiv.org/abs/1609.03647
Approximate Analysis via In-transit and Edge Resources
1. Omer F. Rana
School of Computer Science & Informatics
Cardiff University, UK
ranaof@cardiff.ac.uk Twitter:@omerfrana
In collaboration with:
Rajiv Ranjan (Newcastle Univ., UK), Massimo Villari (U. Messina, Italy)
Manish Parashar, Javier Diaz-Montes, Ali Zamani, Mengsong Zou (Rutgers University, USA),
Ioan Petri, Tom Beach, Yacine Rezgui (Cardiff University),
Rafael Tolosana-Calasanz (Univ. of Zaragoza, Spain)
Luiz Bittencourt (UNICAMP, Brazil)
Systems Research Challenges in
the Internet of Things Workshop
16th – 17th January 2017
3. The Rise of “Edge Computing”
• Edge devices can have varying capability, data rates and
programmability
• Capability to also undertake some processing on these devices
– Increasing availability of programming support – “software defined environments”
As data volume and velocity increase, we need to rethink our Cloud/Data Centre architecture
Move away from centralised solutions to more Peer-2-Peer Edge Clouds (Distributed Clouds)
How do we support service provisioning and orchestration at the Network Edge?
From: Manish Parashar (Rutgers Univ.)
Summary
4. CONTEXT
Petri, I. et al. 2016. Coordinating data analysis and management in multi-layered clouds. Proceedings of the EAI International Conference on Cloud, Networking for IoT Systems, Rome, Italy, 26-27 October 2015. Springer. Available at: http://orca.cf.ac.uk/78658/1/cn4iot15.pdf
5. • “Cloud of Things” (CoT) & “Fog Computing” (Cloudlets)
– Extending computing to the edges of the network;
– Overcoming latency constraints
• Real world/pervasive systems benefiting from Cloud
infrastructure (keep this much more open, not Telco Centric)
– Mobile & task off-loading (balancing energy usage with computation capability)
– Interest in reverse of Mobile Offloading (cf. Cloud offloading)
• Edge clouds via:
– “Cloudlets” (Fog Computing) + Mobile offloading (device clone in
Cloud/annotated program call tree (local/Cloud) – e.g. MAUI,
CloneCloud, ThinkAir, Moitree, EMCO, etc)
– Limitations of the “last mile” network (common in Content Distribution
Networks) – Akamai (web traffic @ 25Terabits/s, 2 trillion daily internet
interactions) – “edge server” ensemble (175k servers, 85% one net
hop)
Defining “Edge Services” …
14. Osmotic Computing
M. Villari, M. Fazio, S. Dustdar, O. Rana & R. Ranjan, “Osmotic Computing: A New
Paradigm for Edge/ Cloud Integration”, IEEE Cloud Computing Magazine, December 2016
https://www.computer.org/csdl/mags/cd/2016/06/mcd2016060076-abs.html
• Migration of microservices from Edge to Data Centers
• Services hosted on lightweight containers (e.g. Docker)
• Migration triggered by monitored events (e.g. latency)
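A minimal sketch of a latency-triggered migration decision, as one way to read the "migration triggered by monitored events" bullet. The class name `MigrationTrigger`, the sliding-window mean, and the threshold values are assumptions for illustration; the slides do not describe the actual trigger logic.

```python
from collections import deque
from statistics import mean


class MigrationTrigger:
    """Signals a migration when the mean of recent latency samples exceeds a threshold.

    Hypothetical helper: the window size and threshold are illustrative assumptions.
    """

    def __init__(self, threshold_ms: float, window: int = 5):
        self.threshold_ms = threshold_ms
        self.samples = deque(maxlen=window)  # keeps only the latest `window` samples

    def observe(self, latency_ms: float) -> bool:
        """Record a sample; return True if the window is full and its mean is too high."""
        self.samples.append(latency_ms)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and mean(self.samples) > self.threshold_ms


trigger = MigrationTrigger(threshold_ms=100.0, window=3)
decisions = [trigger.observe(x) for x in (80.0, 90.0, 95.0, 150.0, 160.0)]
print(decisions)
```

Averaging over a window rather than reacting to a single sample avoids migrating a container because of one transient latency spike.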
15. Osmotic Computing
• Microservices can be both: management services and user owned and
managed services
• Aligns with work on EPCaaS/5G + user-owned network management
17. • Real-time optimisation of building energy use
– Sensors provide readings at intervals of 15-30 minutes
– Optimisation is run over this interval
• The efficiency of the optimisation process depends on the capacity of the computing infrastructure
– Deploying multiple EnergyPlus simulations
• Closed-loop optimisation
– Set control set points
– Monitor/acquire sensor data + perform analysis with EnergyPlus
– Update HVAC and actuators in physical infrastructure
EnergyPlus is a whole building energy simulation program that engineers, architects, and
researchers use to model energy and water use in buildings. Modelling the performance of a
building with EnergyPlus enables building professionals to optimize building design to reduce
energy usage – http://apps1.eere.energy.gov/buildings/energyplus/
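The closed-loop optimisation cycle on this slide (set set points, acquire sensor data and analyse, update actuators) can be sketched roughly as follows. This is an illustration only: `closed_loop_step`, `fake_sensors` and `fake_energy_model` are hypothetical names, and the toy cost function stands in for an EnergyPlus run, which is not invoked here.

```python
from typing import Callable, Sequence


def closed_loop_step(
    read_sensors: Callable[[], dict],
    simulate: Callable[[dict, float], float],
    candidate_set_points: Sequence[float],
    apply_set_point: Callable[[float], None],
) -> float:
    """One optimisation cycle: acquire data, evaluate candidates, actuate the best one."""
    readings = read_sensors()
    # Evaluate each candidate set point with the stand-in simulation; keep the cheapest.
    best = min(candidate_set_points, key=lambda sp: simulate(readings, sp))
    apply_set_point(best)
    return best


# Hypothetical stand-ins so the sketch runs without a building or EnergyPlus.
def fake_sensors() -> dict:
    return {"outdoor_temp_c": 10.0, "target_temp_c": 21.0}


def fake_energy_model(readings: dict, set_point: float) -> float:
    # Toy cost: heating effort plus a penalty for missing the comfort target.
    heating = max(0.0, set_point - readings["outdoor_temp_c"])
    discomfort = abs(set_point - readings["target_temp_c"])
    return heating + 5.0 * discomfort


applied = []  # records what would be sent to the actuators
chosen = closed_loop_step(
    fake_sensors, fake_energy_model, [18.0, 20.0, 21.0, 23.0], applied.append
)
print(chosen, applied)
```

Each candidate evaluation is independent, which is why running multiple simulations in parallel within the 15-30 minute sensor interval depends on the available computing capacity.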
18. Instrumented Facility
CENTRO SPORTIVO FIDIA ROMA (http://www.asfidia.it/)
Pool (indoor) – size: 25m x 16m, depth: 1.60m to 2.10m, Capacity: 760 m³
Learning Pool (indoor) – size: 16m x 4m, depth: 1m, Capacity: 64 m³
1 Gym (indoor) equipped with electric equipment (electric bicycles, etc.)
1 Fitness room (indoor) size: 18m x 9m x 3m, Volume: 486m³
1 Volleyball court (indoor) – size: 40m x 28m x 8m, Volume: 8960 m³
2 Tennis/Five-a-side courts (outdoor, with changing rooms) – size: 30m x 20m
19. Federated Clouds in Building Optimisation
I. Petri, O. Rana, J. Diaz-Montes, M. Zou, M. Parashar, T. Beach, Y. Rezgui, and H. Li, "In-transit Data Analysis and
Distribution in a Multi-Cloud Environment using CometCloud," The International Workshop on Energy Management for
Sustainable Internet-of-Things and Cloud Computing. Co-located with International Conference on Future Internet of Things
and Cloud (FiCloud 2014), Barcelona, Spain, August 2014.
21.
Ioan Petri, Omer Rana, Yacine Rezgui, Haijiang Li, Tom Beach, Mengsong Zou, Javier Diaz Montes, Manish
Parashar: “Cloud Supported Building Data Analytics”. DPMSS workshop alongside CCGRID 2014: pp 641-
650, Chicago, USA. IEEE Computer Society Press.
22. • In the context of a single cloud federation (3 workers), only 37 out of 72 tasks are completed within the deadline of 1 hour. Extend deadline to 1 h 30 min
• Exchanging 15 tuples between the two federation sites, with increased cost for execution and storage.
23.
M. Zou, A. Zamani, J. Diaz-Montes, I. Petri, O. Rana, M. Parashar, “Leveraging in-transit computational
capabilities in federated ecosystems”. IEEE Symposium on Service-Oriented System Engineering (SOSE),
Oxford, UK, March 29 -April 2 2016.
[Figure labels: In-transit node, In-transit node, Edge Devices]
24. • Can we characterise behaviour of in-transit nodes?
– Network Data Centers vs. Edge Data Centers
– Goes beyond the use of simple programmable
network characterisation
• Consider job (J) consisting of (k) tasks
– Deadline(J); Budget(J); CRatio(J) – with k’ <= k
• Consider that there is some waiting time W(J) before a
job J can be executed at resource provider.
– Job is idle (queued) and it is using storage space at
the destination resource.
• Identify & configure a data path that leverages in-transit
computation to take advantage of W(J) for a job.
Characterising “In-Transit” Nodes
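The job description on this slide (k tasks, Deadline(J), Budget(J), CRatio(J), waiting time W(J)) could be represented as below. This is a sketch under stated assumptions: the field names, the `Job` class, and the simplification that tasks take a uniform time and run sequentially on one in-transit node are all illustrative, not taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    """A job J of k tasks with the per-job limits named on the slide (field names assumed)."""
    k: int              # total number of tasks
    deadline_s: float   # Deadline(J)
    budget: float       # Budget(J)
    c_ratio: float      # CRatio(J): minimum fraction of tasks that must complete
    wait_s: float       # W(J): queueing time at the destination provider

    def min_tasks(self) -> int:
        """Smallest k' (with k' <= k) that satisfies the completion ratio."""
        return math.ceil(self.c_ratio * self.k)

    def in_transit_capacity(self, task_time_s: float) -> int:
        """Tasks that fit into W(J) on one in-transit node, assuming uniform task time."""
        return min(self.k, int(self.wait_s // task_time_s))


job = Job(k=72, deadline_s=3600.0, budget=100.0, c_ratio=0.6, wait_s=600.0)
print(job.min_tasks(), job.in_transit_capacity(task_time_s=50.0))
```

The second method captures the idea of the last bullet: time a job would otherwise spend queued at the destination is a budget of seconds that in-transit nodes can spend on tasks.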
25. Characterising the problem:
To leverage in-transit computation and minimise the amount of time a job is idle at the destination, the objective of our problem becomes maximising the number of tasks completed in-transit
An Optimisation Problem
26. To leverage in-transit computation and minimise the amount of time a job is idle at the destination, the objective of our problem becomes maximising the number of tasks completed in-transit, subject to being ready to compute at destination resource d at the scheduled time (2), performing computation within the given deadline (3), keeping costs within the given budget (4), and making sure that the completion ratio is satisfied (5):
An Optimisation Problem … constraints
Completion ratio: the ratio between completed tasks and the total number of tasks composing job J.
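The equations that accompanied this slide are not preserved in this transcript. One possible formalisation consistent with the wording of the slide is sketched below. Apart from Deadline(J), Budget(J), CRatio(J), Cost(J), k and k', every symbol is an assumption introduced for illustration: x_t is 1 if task t is completed in transit and 0 otherwise, T_arrive(J, d) is the time at which J is ready at destination d, T_sched(J, d) is the scheduled start time at d, and T_finish(J) is the time at which the computation of J ends. The constraint numbers (2) to (5) follow the slide text; numbering the objective as (1) is assumed.

```latex
\begin{align}
\max \quad & \sum_{t=1}^{k} x_t, \qquad x_t \in \{0, 1\} \tag{1} \\
\text{subject to} \quad
  & T_{\mathrm{arrive}}(J, d) \le T_{\mathrm{sched}}(J, d) \tag{2} \\
  & T_{\mathrm{finish}}(J) \le \mathrm{Deadline}(J) \tag{3} \\
  & \mathrm{Cost}(J) \le \mathrm{Budget}(J) \tag{4} \\
  & \frac{k'}{k} \ge \mathrm{CRatio}(J), \qquad k' \le k \tag{5}
\end{align}
```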
27. Cost(J) is the overall cost of computing job J,
Cost Analysis
M. Zou, A. Zamani, J. Diaz-Montes, I. Petri, O. Rana, M. Parashar & A. Anjum, “Deadline Constrained
Video Analysis via In-Transit Computational Environments”, IEEE Transactions on Services Computing,
2017 (to appear)
28.
Leveraging In-Transit Computational Capabilities in Federated Ecosystems.
IEEE SOSE 2016
• Sites implemented as VMs on Amazon
• SDN capability emulated via Mininet; each VM had one Mininet host and one Mininet switch
– Routing tables managed via POX SDN controller
• Switches were connected to each other using Generic Routing Encapsulation (GRE) tunnelling, with bandwidth allocation via a token bucket filter
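For readers unfamiliar with the token bucket filter mentioned above, a minimal sketch of the general technique follows. The slides do not say which implementation or parameters were used in the experiments; the class `TokenBucket`, the rate and the burst size here are illustrative assumptions. Time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Token bucket: tokens accrue at `rate` per second up to `capacity`; sending spends tokens."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = 0.0         # time of the last refill, in seconds

    def allow(self, size: float, now: float) -> bool:
        """Return True and spend tokens if `size` units may be sent at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


# Hypothetical parameters: 100 bytes/s sustained, bursts of up to 200 bytes.
bucket = TokenBucket(rate=100.0, capacity=200.0)
results = [
    bucket.allow(200.0, now=0.0),  # the initial burst drains the bucket
    bucket.allow(50.0, now=0.1),   # only about 10 tokens have been refilled
    bucket.allow(50.0, now=1.0),   # enough tokens have accumulated again
]
print(results)
```

The capacity bounds the burst size while the rate bounds the long-run average, which is what makes the scheme suitable for allocating bandwidth on a link.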
(i) Base: in-transit resources and sites have the same computational power;
(ii) Higher: in-transit resources are less powerful than those at the resource providers’ sites; and
(iii) Highest: in-transit resources are much less powerful than site resources.
29.
Job Properties & Resource Types
SLA: at least 60% of the tasks, within a job, must be
completed before the deadline
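The SLA stated above can be checked with a few lines of code. This is a sketch only: the function name `sla_met`, the use of `None` for tasks that never finished, and the sample finish times are assumptions for illustration.

```python
from typing import Optional, Sequence


def sla_met(
    finish_times_s: Sequence[Optional[float]],
    deadline_s: float,
    min_ratio: float = 0.6,
) -> bool:
    """True if at least `min_ratio` of the job's tasks finished by the deadline.

    A finish time of None marks a task that was dropped or never completed.
    """
    if not finish_times_s:
        raise ValueError("a job must contain at least one task")
    on_time = sum(1 for t in finish_times_s if t is not None and t <= deadline_s)
    return on_time / len(finish_times_s) >= min_ratio


# Hypothetical job with five tasks and a one-hour deadline.
ok = sla_met([100.0, 200.0, 300.0, 4000.0, None], deadline_s=3600.0)
late = sla_met([100.0, 200.0, 4000.0, None, None], deadline_s=3600.0)
print(ok, late)
```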
30. Considered Scenarios
• Traditional
– Request resources from a cloud provider – no awareness of in-transit
resources
– Traditionally, all computation at the data center
• Traditional + In-transit:
– An intermediate controller allocates resources within the network (without
client involvement or awareness)
• In-transit aware (In-transit 2)
– Client aware of available in-transit nodes
– Willing to accept a wider range of offers from Cloud providers
– Requests controller to perform in-transit optimisation
34. Edge-based Approximation … 1
• “Performing exact computation or operating at peak-level
service demand require a high amount of resources, allowing
selective approximation or occasional violation of the
specification can provide disproportionate gains in efficiency.”
• Techniques used:
– Precision scaling
– Loop perforation
– Load value approximation
– Task dropping/skipping
– Memory access skipping
– Data sampling
– Program versions of different accuracy
– Using inexact hardware (SRAM, eDRAM, DRAM, GPU, etc)
– Voltage scaling
– Refresh rate reduction
– Inexact reads/writes
– Lossy compression
– Neural networks
– Compiler-based strategies
S. Mittal, “A Survey of Techniques for Approximate Computing”,
ACM Computing Surveys, Vol. 48, Issue 4, May 2016
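As a concrete instance of one technique from the list, loop perforation skips a fraction of loop iterations and accepts a less accurate result in exchange for less work. The sketch below applies it to a mean; the function names and the data are illustrative and not taken from the survey.

```python
def mean_exact(values):
    """Visit every element."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def mean_perforated(values, stride=2):
    """Loop perforation: visit only every `stride`-th element and average those."""
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


data = list(range(1, 101))
exact = mean_exact(data)
approx = mean_perforated(data, stride=4)  # roughly a quarter of the iterations
relative_error = abs(approx - exact) / exact
print(exact, approx, relative_error)
```

With a stride of 4 the loop body runs 25 times instead of 100, and on this data the result is within 3% of the exact mean. How much error is acceptable is application specific.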
35. Edge-based Approximation … 2
• Combine capability in Data Centre with “approximate”
algorithms in transit or at the edge
• EnergyPlus (as at present) + a trained neural network
(as a function approximator for EnergyPlus behaviour)
• But why?
– EnergyPlus ~ Execution time (Minutes)
– Neural Network Training ~ Execution time (Minutes)
– Trained (FF) Neural Network ~ Execution time (Seconds)
• Combine more accurate model execution with
approximate model via a learned neural network
• Trigger re-training when input parameters change
significantly
– Each EnergyPlus execution provides potential training data for
the neural network
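One way to combine the accurate and approximate models is a dispatcher that answers from the fast approximation while inputs stay inside the range it was trained on, and falls back to the accurate model otherwise, keeping the new sample for re-training. The sketch below is an assumption-laden illustration: `SurrogateDispatcher` is a hypothetical name, a single scalar input stands in for the real parameter set, and two simple functions stand in for EnergyPlus and the trained network.

```python
from typing import Callable, List, Tuple


class SurrogateDispatcher:
    """Use the approximate model inside the trained input range, else the accurate one."""

    def __init__(
        self,
        accurate: Callable[[float], float],
        approximate: Callable[[float], float],
        low: float,
        high: float,
    ):
        self.accurate = accurate
        self.approximate = approximate
        self.low, self.high = low, high  # input range covered by the training data
        self.training_data: List[Tuple[float, float]] = []
        self.retrain_needed = False

    def evaluate(self, x: float) -> float:
        if self.low <= x <= self.high:
            return self.approximate(x)
        # Input left the trained range: run the accurate model, keep the
        # sample as training data, and flag that re-training is due.
        y = self.accurate(x)
        self.training_data.append((x, y))
        self.retrain_needed = True
        return y


# Stand-ins: a slow "accurate" model and a linear approximation valid near x = 1.
dispatcher = SurrogateDispatcher(
    accurate=lambda x: x * x,
    approximate=lambda x: 2.0 * x - 1.0,
    low=0.0,
    high=2.0,
)
inside = dispatcher.evaluate(1.0)
outside = dispatcher.evaluate(5.0)
print(inside, outside, dispatcher.retrain_needed)
```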
36. Three-phased execution
• Phase 1: EnergyPlus simulations – 30
simulations to acquire initial data
• Phase 2: Co-schedule EnergyPlus with ANN
training – wait for ANN training threshold to be
reached
• Phase 3: Deploy trained ANN on edge and in-
transit nodes
– Change in input data parameter range would trigger Phase 1 again
• Can in-transit and edge resources be used more effectively?
– Increase job acceptance and completion
– Increase potential revenue earned by edge and in-transit resources
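The three phases and the return to Phase 1 can be viewed as a small state machine, sketched below. The names `Phase` and `PhaseController` are hypothetical, and interpreting the "ANN training threshold" as a training-error bound is an assumption; the slides do not define it.

```python
from enum import Enum


class Phase(Enum):
    SIMULATE = 1   # Phase 1: run EnergyPlus to acquire initial data
    TRAIN = 2      # Phase 2: co-schedule EnergyPlus with ANN training
    DEPLOYED = 3   # Phase 3: trained ANN serves edge and in-transit nodes


class PhaseController:
    def __init__(self, initial_runs: int = 30, error_threshold: float = 0.05):
        self.initial_runs = initial_runs        # 30 simulations, as on the slide
        self.error_threshold = error_threshold  # assumed meaning of the training threshold
        self.phase = Phase.SIMULATE
        self.runs = 0

    def on_simulation_done(self) -> None:
        if self.phase is Phase.SIMULATE:
            self.runs += 1
            if self.runs >= self.initial_runs:
                self.phase = Phase.TRAIN

    def on_training_error(self, error: float) -> None:
        if self.phase is Phase.TRAIN and error <= self.error_threshold:
            self.phase = Phase.DEPLOYED

    def on_input_range_changed(self) -> None:
        # A change in the input parameter range restarts data acquisition.
        self.phase = Phase.SIMULATE
        self.runs = 0


controller = PhaseController()
for _ in range(30):
    controller.on_simulation_done()
after_simulations = controller.phase
controller.on_training_error(0.20)   # not yet good enough
still_training = controller.phase
controller.on_training_error(0.01)
after_training = controller.phase
controller.on_input_range_changed()
after_change = controller.phase
print(after_simulations, still_training, after_training, after_change)
```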
39. Conclusion …
• Emergence of data-driven + data intensive
applications
• Use of Cloud/data centres and edge nodes
collectively
• Pipeline-based enactment a common theme
– Various characteristics – buffer management and
data coordination
– Model development that can be integrated into a
workflow environment
• Automating application adaptation
– … as infrastructure changes
– … as application characteristics change
Editor's notes
Modularization to build a flexible network architecture natively supporting Network Slicing
In experiments:
1. We evaluate how the price of the SDN affects the decision of where to execute the workload.
2. We study factors that influence whether jobs should be computed locally or remotely.