Approximate Analysis via In-transit and Edge Resources

Omer F. Rana
School of Computer Science & Informatics
Cardiff University, UK
ranaof@cardiff.ac.uk, Twitter: @omerfrana
In collaboration with:
Rajiv Ranjan (Newcastle Univ., UK), Massimo Villari (U. Messina, Italy)
Manish Parashar, Javier Diaz-Montes, Ali Zamani, Mengsong Zou (Rutgers University, USA),
Ioan Petri, Tom Beach, Yacine Rezgui (Cardiff University),
Rafael Tolosana-Calasanz (Univ. of Zaragoza, Spain)
Luiz Bittencourt (UNICAMP, Brazil)
Systems Research Challenges in
the Internet of Things Workshop
16th – 17th January 2017

The Rise of “Edge Computing”
• Edge devices can have varying capability, data rates and
programmability
• Capability to also undertake some processing on these devices
– Increasing availability of programming support – “software defined environments”
As data volume and velocity increase, we need to
rethink our Cloud/Data Centre architecture
Move away from centralised solutions to more
Peer-2-Peer Edge Clouds (Distributed Clouds)
How do we support service provisioning and
orchestration at the Network Edge?
From: Manish Parashar (Rutgers Univ.)
Summary

CONTEXT
Petri, I. et al. 2016. Coordinating data analysis and management in multi-layered
clouds. EAI International Conference on Cloud, Networking for IoT
systems, Rome, Italy, 26-27 October 2015. Proceedings of EAI International
Conference on Cloud, Networking for IoT systems, Springer. Available at:
http://orca.cf.ac.uk/78658/1/cn4iot15.pdf

• “Cloud of Things” (CoT) & “Fog Computing” (Cloudlets)
– Extending computing to the edges of the network;
– Overcoming latency constraints
• Real world/pervasive systems benefiting from Cloud
infrastructure (keep this much more open, not Telco Centric)
– Mobile & task off-loading (balancing energy usage with computation capability)
– Interest in reverse of Mobile Offloading (cf. Cloud offloading)
• Edge clouds via:
– “Cloudlets” (Fog Computing) + Mobile offloading (device clone in
Cloud/annotated program call tree (local/Cloud) – e.g. MAUI,
CloneCloud, ThinkAir, Moitree, EMCO, etc)
– Limitations of the “last mile” network (common in Content Distribution
Networks) – Akamai (web traffic @ 25Terabits/s, 2 trillion daily internet
interactions) – “edge server” ensemble (175k servers, 85% one net
hop)
Defining “Edge Services” …

[Figure: legend entries Aggregator, Sensor, Cluster, Communication channel; nodes s1–s15, c1–c3 and eUtility eU1–eU3 along a Time axis; a Decision trigger over g(x,y) with inputs x and y: “If g(x,y) > 100 then buy Z shares of stock S”]
Where should this be hosted?
From Jeff Voas, Bob Marcus (NIST) – Terminology for IoT/Clouds
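As an illustration of the decision trigger in the figure, the following is a minimal Python sketch. The function `g`, the helper `make_trigger` and the returned action string are hypothetical stand-ins, not part of the NIST terminology; the same callable could be hosted on a sensor gateway, an edge node or in a data centre, which is the hosting question raised above.

```python
from typing import Callable, Optional


def make_trigger(
    g: Callable[[float, float], float],
    threshold: float,
    action: Callable[[], str],
) -> Callable[[float, float], Optional[str]]:
    """Build a decision trigger: run `action` only when g(x, y) > threshold."""

    def on_inputs(x: float, y: float) -> Optional[str]:
        return action() if g(x, y) > threshold else None

    return on_inputs


# Hypothetical eUtility: combine two aggregated inputs into one value.
def g(x: float, y: float) -> float:
    return x * y


trigger = make_trigger(g, 100.0, lambda: "BUY Z shares of stock S")

print(trigger(5.0, 10.0))   # g = 50, below the threshold, so no action
print(trigger(20.0, 10.0))  # g = 200, above the threshold, so the action fires
```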

Amazon Lambda – 100 millisecond billing
• User creates a “lambda” function (e.g. Python code) that is triggered
based on events (custom code, dependencies, uploaded to AWS) – as a
“handler” method
– Event source (e.g. AWS S3, DynamoDB stream associated with a table,
HTTP/Amazon API Gateway, Amazon Cognito Auth./Mobile Services etc) publishes
object-created events
– Amazon Kinesis Stream Events (poll stream, generate events when new record
detected)
– User generated events
• Resource allocation
– Identify memory requirements => CPU requirements
– Identify execution timeout (to prevent indefinite execution)
– Can be sync./async.
– Monitored metrics: Invocation rate, Errors, Durations (Latency), Throttles (via
CloudWatch)
• Environment
– Container-based enactment/execution (AWS Lambda automatically does this)
– Container maintained (“frozen”) after lambda function finishes (no init. needed) + /tmp
space kept (transient cache for multiple executions)
– Background processes/callbacks maintained when container resumes
AWS Lambda: compute nodes charged per 100 ms, not per hour. First 1M Node.js executions/month free. A monitoring challenge (http://aws.amazon.com/lambda/)
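To make the handler model concrete, here is a small sketch of a Python handler reacting to S3 object-created events. The handler name `lambda_handler`, the cache file name and the sample event are assumptions for illustration; only the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields of the S3 notification format are relied on. The counter kept in the temporary directory hints at how /tmp can serve as a transient cache while a container stays frozen between invocations.

```python
import json
import os
import tempfile

# Hypothetical cache file; on Lambda the temporary directory is /tmp.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "invocation_count.json")


def lambda_handler(event, context):
    """Entry point invoked once per event."""
    # Reuse the counter left behind by an earlier invocation, if any.
    try:
        with open(CACHE_PATH) as fh:
            count = int(json.load(fh)["count"])
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        count = 0
    count += 1
    with open(CACHE_PATH, "w") as fh:
        json.dump({"count": count}, fh)

    # S3 object-created notifications carry a list of records.
    objects = [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
        if "s3" in rec
    ]
    return {"objects": objects, "invocations_in_container": count}


# Local dry run with a hand-written event (bucket and key are made up).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "sensor-data"},
                "object": {"key": "readings/2017-01-16.csv"}}}
    ]
}
print(lambda_handler(sample_event, None))
```

Running the handler locally like this only exercises the event parsing; memory size, timeout and the sync/async invocation mode are set in the function configuration rather than in the code.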

ETSI – Mobile Edge Computing (MEC) & NFV
http://www.etsi.org/technologies-clusters/technologies/mobile-edge-computing
VM-based (QoS/QoE rules – resource need, latency, etc)
Discover, advertise services – DNS proxy
Traffic rules & state

ETSI – Mobile Edge Computing (MEC) & NFV
http://www.etsi.org/technologies-clusters/technologies/mobile-edge-computing
Edge orchestrator (available resources, hosts, topology, etc)
Application management
Operations Support System (OSS)
EPC/EPCaaS
E-UTRAN access network: LTE, LTE-A + legacy
Cloud-RAN

Modularization to build a flexible network
architecture
Enabling Concept: Architecture Modularization
Kashif Mehmood, Telenor Research, Norway (Dec 2016)

Modularization natively supports Network slicing
IoT Network slice with no mobility and relaxed
security requirements
[Figure modules: 5G control; Flow Management; Access Function; Security and AAA management; Mobility Management; Connectivity Management; Context Aware Engine]
Kashif Mehmood, Telenor Research, Norway (Dec 2016)

What’s on the “Edge”?
[Figure labels: Edge Boundary; Network processor; Ad hoc/mesh network]

Osmotic Computing
M. Villari, M. Fazio, S. Dustdar, O. Rana & R. Ranjan, “Osmotic Computing: A New
Paradigm for Edge/ Cloud Integration”, IEEE Cloud Computing Magazine, December 2016
https://www.computer.org/csdl/mags/cd/2016/06/mcd2016060076-abs.html
• Migration of microservices from Edge to Data Centers
• Services hosted on lightweight containers (e.g. Docker)
• Migration triggered by monitored events (e.g. latency)

Osmotic Computing
• Microservices can be both management services and user-owned, user-managed
services
• Aligns with work on EPCaaS/5G + user-owned network management
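A latency-triggered migration decision could be sketched as below. This is one hypothetical policy, not the mechanism defined in the Osmotic Computing paper: the `Microservice` class, the two thresholds and the direction of each move (towards the edge when latency is high, towards the data centre when there is ample headroom) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Microservice:
    name: str
    location: str  # "edge" or "datacentre"


def place(service: Microservice, recent_latency_ms: List[float],
          high_ms: float = 100.0, low_ms: float = 20.0) -> str:
    """Move the service when the mean of recent latency samples crosses a threshold."""
    mean_latency = sum(recent_latency_ms) / len(recent_latency_ms)
    if mean_latency > high_ms and service.location == "datacentre":
        service.location = "edge"        # assumed: bring the service closer to users
    elif mean_latency < low_ms and service.location == "edge":
        service.location = "datacentre"  # assumed: free up scarce edge capacity
    return service.location


svc = Microservice("analytics", "edge")
print(place(svc, [5.0, 10.0, 15.0]))      # low latency: moves to the data centre
print(place(svc, [150.0, 120.0, 200.0]))  # high latency: moves back to the edge
```

Using two thresholds rather than one leaves a band in which the service stays where it is, which avoids repeated migrations when latency hovers around a single cut-off.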

• Real time optimisation of building energy use
– sensors provide readings within an interval of 15-30
minutes,
– Optimisation run over this interval
• The efficiency of the optimisation process
depends on the capacity of the computing
infrastructure
– deploying multiple EnergyPlus simulations
• Closed loop optimisation
– Set control set points
– Monitor/acquire sensor data + perform analysis with
EnergyPlus
– Update HVAC and actuators in physical infrastructure
EnergyPlus is a whole building energy simulation program that engineers, architects, and
researchers use to model energy and water use in buildings. Modelling the performance of a
building with EnergyPlus enables building professionals to optimize building design to reduce
energy usage – http://apps1.eere.energy.gov/buildings/energyplus/
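The closed loop described above (acquire sensor data, evaluate candidate set points, update the actuators) might be sketched as follows. `simulate_energy` is a toy cost function standing in for an EnergyPlus run and is not the real simulator; its coefficients, the candidate set points and the single outdoor-temperature input are assumptions chosen only to make the loop runnable.

```python
from typing import List


def simulate_energy(setpoint_c: float, outdoor_c: float) -> float:
    """Toy stand-in for one simulation run (NOT EnergyPlus)."""
    comfort_penalty = (setpoint_c - 21.0) ** 2          # assumed comfort target of 21 C
    conditioning_cost = abs(setpoint_c - outdoor_c) * 2.5
    return comfort_penalty + conditioning_cost


def optimise_setpoint(outdoor_c: float, candidates: List[float]) -> float:
    """Evaluate every candidate set point and keep the cheapest one."""
    return min(candidates, key=lambda s: simulate_energy(s, outdoor_c))


def control_loop(sensor_readings: List[float], candidates: List[float]) -> List[float]:
    """One optimisation per sensor interval (15-30 minutes in the slide)."""
    applied = []
    for outdoor_c in sensor_readings:
        best = optimise_setpoint(outdoor_c, candidates)
        applied.append(best)  # a real system would update HVAC actuators here
    return applied


print(control_loop([5.0, 30.0, 21.0], [19.0, 20.0, 21.0, 22.0, 23.0]))
```

Each call to `optimise_setpoint` evaluates every candidate, so the number of simulations that fit in one interval bounds how many candidates can be explored; this is where the capacity of the computing infrastructure enters.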

Instrumented Facility
CENTRO SPORTIVO FIDIA ROMA (http://www.asfidia.it/)
Pool (indoor) – size: 25 m x 16 m, depth: 1.60 m to 2.10 m, Capacity: 760 m³
Learning Pool (indoor) – size: 16 m x 4 m, depth: 1 m, Capacity: 64 m³
1 Gym (indoor) equipped with electrical equipment (electric bicycles, etc.)
1 Fitness room (indoor) – size: 18 m x 9 m x 3 m, Volume: 486 m³
1 Volleyball court (indoor) – size: 40 m x 28 m x 8 m, Volume: 8960 m³
2 Tennis/Five-a-side courts (outdoor, with changing rooms) – size: 30 m x 20 m

Federated Clouds in Building Optimisation
I. Petri, O. Rana, J. Diaz-Montes, M. Zou, M. Parashar, T. Beach, Y. Rezgui, and H. Li, "In-transit Data Analysis and
Distribution in a Multi-Cloud Environment using CometCloud," The International Workshop on Energy Management for
Sustainable Internet-of-Things and Cloud Computing. Co-located with International Conference on Future Internet of Things
and Cloud (FiCloud 2014), Barcelona, Spain, August 2014.

Ioan Petri, Omer Rana, Yacine Rezgui, Haijiang Li, Tom Beach, Mengsong Zou, Javier Diaz Montes, Manish
Parashar: “Cloud Supported Building Data Analytics”. DPMSS workshop alongside CCGRID 2014: pp 641-
650, Chicago, USA. IEEE Computer Society Press.

• In the context of single cloud federation (3 workers) only 37 out of 72 tasks
are completed within the deadline of 1 hour. Extend deadline to 1 h 30 min
• Exchanging 15 tuples between the two federation sites, with increased cost
for execution and storage.

M. Zou, A. Zamani, J. Diaz-Montes, I. Petri, O. Rana, M. Parashar, “Leveraging in-transit computational
capabilities in federated ecosystems”. IEEE Symposium on Service-Oriented System Engineering (SOSE),
Oxford, UK, March 29 -April 2 2016.
[Figure labels: In-transit node (two shown); Edge Devices]

Characterising “In-Transit” Nodes
• Can we characterise the behaviour of in-transit nodes?
– Network Data Centers vs. Edge Data Centers
– Goes beyond the use of simple programmable network characterisation
• Consider a job J consisting of k tasks
– Deadline(J); Budget(J); CRatio(J) – with k’ <= k
• Consider that there is some waiting time W(J) before a job J can be executed at the resource provider.
– The job is idle (queued) and is using storage space at the destination resource.
• Identify & configure a data path that leverages in-transit computation to take advantage of W(J) for a job.

An Optimisation Problem
Characterising the problem:
To leverage in-transit computation and minimize the amount of time a
job is idle at destination, the objective of our problem becomes
maximizing the number of tasks completed in-transit

An Optimisation Problem … constraints
To leverage in-transit computation and minimize the amount of time a
job is idle at destination, the objective of our problem becomes
maximizing the number of tasks completed in-transit,
subject to being ready to compute at destination resource d at the
scheduled time (2), performing computation within the given deadline
(3), keeping costs within the given budget (4), and making sure that the
completion ratio is satisfied (5).
Completion ratio: ratio between completed tasks and total number of
tasks composing job J.
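The shape of the problem can be illustrated with a small greedy heuristic. This is not the formulation used in the cited work: it assumes tasks run one after another on a single in-transit node, that all in-transit work must finish within the waiting time W(J) so the job is ready at the destination on schedule, and that `budget` is the share of Budget(J) available for in-transit work. The `Task` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    transit_time: float  # assumed run time on an in-transit node
    transit_cost: float  # assumed cost of running it there


def plan_in_transit(tasks: List[Task], waiting_time: float, budget: float) -> List[str]:
    """Pick as many tasks as fit in W(J) and the budget, shortest first."""
    chosen: List[str] = []
    used_time = 0.0
    used_cost = 0.0
    for task in sorted(tasks, key=lambda t: t.transit_time):
        fits_time = used_time + task.transit_time <= waiting_time
        fits_budget = used_cost + task.transit_cost <= budget
        if fits_time and fits_budget:
            chosen.append(task.name)
            used_time += task.transit_time
            used_cost += task.transit_cost
    return chosen


def completion_ratio_met(completed: int, total: int, c_ratio: float) -> bool:
    """Check the completion-ratio constraint: completed / total >= CRatio(J)."""
    return completed / total >= c_ratio


tasks = [Task("a", 2, 1), Task("b", 5, 1), Task("c", 1, 1), Task("d", 4, 3)]
plan = plan_in_transit(tasks, waiting_time=8, budget=3)
print(plan)  # ['c', 'a', 'b']
print(completion_ratio_met(len(plan), len(tasks), 0.6))
```

Shortest-first is optimal for maximising the task count under a time limit alone; with the additional budget constraint it is only a heuristic, so an exact solver would be needed to guarantee the maximum.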

Cost Analysis
Cost(J) is the overall cost of computing job J.
M. Zou, A. Zamani, J. Diaz-Montes, I. Petri, O. Rana, M. Parashar & A. Anjum, “Deadline Constrained
Video Analysis via In-Transit Computational Environments”, IEEE Transactions on Services Computing,
2017 (to appear)

Leveraging In-Transit Computational Capabilities in Federated Ecosystems. IEEE SOSE 2016
• Sites implemented as VMs on Amazon
• SDN capability emulated via Mininet. Each VM had one Mininet host and one Mininet switch
• Routing tables managed via POX SDN controller
• Switches were connected to each other using Generic Routing Encapsulation (GRE) tunnelling, B/W allocation via a token bucket filter
(i) Base: in-transit resources and sites have the same computational power;
(ii) Higher: in-transit resources are less powerful than those at the resource providers’ sites; and
(iii) Highest: in-transit resources are much less powerful than site resources.
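The bandwidth allocation mentioned above relies on a token bucket filter. The sketch below shows the general token bucket algorithm only; it is not the Mininet or traffic-control configuration used in the experiments, and the class name, rate and capacity values are assumptions. Time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Tokens accrue at `rate` per second up to `capacity`; sending consumes them."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = 0.0         # time of the previous update, in seconds

    def allow(self, size: float, now: float) -> bool:
        """Return True and consume tokens if `size` units may be sent at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate=100.0, capacity=200.0)
print(bucket.allow(150, now=0.0))  # True: burst served from the full bucket
print(bucket.allow(100, now=0.0))  # False: only 50 tokens remain
print(bucket.allow(100, now=0.5))  # True: 50 more tokens accrued in 0.5 s
```

The capacity bounds the burst size while the rate bounds the long-run throughput, which is what makes the scheme suitable for capping the bandwidth of a link between sites.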

Job Properties & Resource Types
SLA: at least 60% of the tasks within a job must be completed before the deadline

Considered Scenarios
• Traditional
– Request resources from a cloud provider – no awareness of in-transit
resources
– Traditionally, all computation at the data center
• Traditional + In-transit:
– An intermediate controller allocates resources within the network (without
client involvement or awareness)
• In-transit aware (In-transit 2)
– Client aware of available in-transit nodes
– Willing to accept a wider range of offers from Cloud providers
– Requests controller to perform in-transit optimisation

Approximate Analysis via In-transit and Edge Resources

Edge-based Approximation … 1
• “Performing exact computation or operating at peak-level
service demand require a high amount of resources, allowing
selective approximation or occasional violation of the
specification can provide disproportionate gains in efficiency.”
• Techniques used:
– Precision Scaling
– Loop Perforation
– Load value approximation
– Task dropping/skipping
– Memory access skipping
– Data sampling
– Program versions of different accuracy
– Using inexact hardware (SRAM, eDRAM, DRAM, GPU, etc)
– Voltage scaling
– Refresh rate reduction
– Inexact reads/writes
– Lossy compression
– Neural networks
– Compiler-based strategies
S. Mittal, “A Survey of Techniques for Approximate Computing”,
ACM Computing Surveys, Vol. 48, Issue 4, May 2016
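Two of the listed techniques, loop perforation and data sampling, can be illustrated with a very small example: computing a mean while visiting only every `stride`-th element. The function names and the input data are made up for illustration; the point is the trade between work done and accuracy.

```python
from typing import Sequence


def exact_mean(values: Sequence[float]) -> float:
    """Reference result: every element is visited."""
    return sum(values) / len(values)


def perforated_mean(values: Sequence[float], stride: int) -> float:
    """Loop perforation: skip iterations, visiting only every `stride`-th element."""
    total = 0.0
    count = 0
    for i in range(0, len(values), stride):
        total += values[i]
        count += 1
    return total / count


readings = list(range(100))          # stand-in for a stream of sensor readings
print(exact_mean(readings))          # 49.5 using 100 iterations
print(perforated_mean(readings, 4))  # 48.0 using 25 iterations
```

Here a quarter of the iterations gives a result within about 3% of the exact mean; how much error is acceptable depends on the application, which is why the approximation has to be selective.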

Edge-based Approximation … 2
• Combine capability in Data Centre with “approximate”
algorithms in transit or at the edge
• EnergyPlus (as at present) + a trained neural network
(as a function approximator for EnergyPlus behaviour)
• But why?
– EnergyPlus ~ Execution time (Minutes)
– Neural Network Training ~ Execution time (Minutes)
– Trained (FF) Neural Network ~ Execution time (Seconds)
• Combine more accurate model execution with
approximate model via a learned neural network
• Trigger re-training when input parameters change
significantly
– Each EnergyPlus execution provides potential training data for
the neural network

Three-phased execution
• Phase 1: EnergyPlus simulations – 30
simulations to acquire initial data
• Phase 2: Co-schedule EnergyPlus with ANN
training – wait for ANN training threshold to be
reached
• Phase 3: Deploy trained ANN on edge and in-transit nodes
– Change in input data parameter range would trigger Phase 1 again
• Can in-transit and edge resources be used more effectively?
– Increase job acceptance and completion
– Increase potential revenue earned by edge and in-transit resources
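The three phases can be read as a small state machine, sketched below. Several details are assumptions rather than facts from the slides: inputs are reduced to a single scalar, the "training threshold" is taken to be an error threshold reported by the trainer, and a "change in input data parameter range" is taken to mean an input outside the minimum and maximum seen so far. The class and method names are hypothetical.

```python
from typing import List, Optional, Tuple


class ThreePhaseController:
    def __init__(self, initial_runs: int = 30, error_threshold: float = 0.05) -> None:
        self.initial_runs = initial_runs
        self.error_threshold = error_threshold
        self.phase = 1
        self.samples: List[Tuple[float, float]] = []
        self.lo: Optional[float] = None
        self.hi: Optional[float] = None

    def record_simulation(self, x: float, y: float) -> None:
        """Each simulation run contributes one (input, output) training sample."""
        self.samples.append((x, y))
        self.lo = x if self.lo is None else min(self.lo, x)
        self.hi = x if self.hi is None else max(self.hi, x)
        if self.phase == 1 and len(self.samples) >= self.initial_runs:
            self.phase = 2  # enough initial data: co-schedule ANN training

    def report_training_error(self, error: float) -> None:
        """Phase 2 ends once the reported training error is low enough."""
        if self.phase == 2 and error <= self.error_threshold:
            self.phase = 3  # deploy the trained ANN on edge/in-transit nodes

    def observe_input(self, x: float) -> int:
        """In Phase 3, an input outside the trained range restarts Phase 1."""
        if self.phase == 3 and not (self.lo <= x <= self.hi):
            self.phase = 1
            self.samples.clear()
            self.lo = self.hi = None
        return self.phase


ctl = ThreePhaseController(initial_runs=3, error_threshold=0.1)
for x in (1.0, 2.0, 3.0):
    ctl.record_simulation(x, x * 10.0)
ctl.report_training_error(0.05)
print(ctl.phase)               # 3: the approximate model is in use
print(ctl.observe_input(10.0))  # 1: input outside [1.0, 3.0] restarts Phase 1
```

Discarding the old samples on restart is a design choice made here for simplicity; keeping them and extending the training set would be an equally plausible policy.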

Edge-based Approximation … overhead

Conclusion …
• Emergence of data-driven + data intensive
applications
• Use of Cloud/data centres and edge nodes
collectively
• Pipeline-based enactment a common theme
– Various characteristics – buffer management and
data coordination
– Model development that can be integrated into a
workflow environment
• Automating application adaptation
– … as infrastructure changes
– … as application characteristics change
