Software-Defined Systems for
Network-Aware Service Composition and
Workflow Placement
Pradeeban Kathiravelu
Supervisors: Prof. Luís Veiga
Prof. Peter Van Roy
Louvain-la-Neuve, Belgium.
August 23rd, 2019
2/37
Promising Trends of the Internet
●
Growth of the Internet bandwidth.
– ↑ demand and ↓ $$$.
●
Innovation with cloud ecosystems.
– Service providers and tenants.
– Dedicated connectivity* of the cloud providers.
●
Increasing geographical presence.
●
Well-provisioned network → Low latency links.
* James Hamilton, VP, AWS (AWS re:Invent 2016).
3/37
Challenges in Practice
●
Disparities
– Pricing.
●
E.g. IP Transit price per Mbps, 2014
– USA: 0.94 $
– Kazakhstan: 15 $
– Uzbekistan: 347 $
– Latency.
●
Multi-domain Workflows?
– Interoperability
– Control
4/37
A Case for Cloud-Assisted Networks
●
A large-scale overlay network built over cloud VMs
●
Can a network overlay built over cloud instances be
a better connectivity provider?
– High-performance
– Cost effectiveness
– Network Services on the fly!
5/37
Low Latency with Cloud Routes
• Cloud-based data transfer A → Z via the path A → B → Z.
– Cloud region B is closer to the origin server A.
– B and Z are cloud VMs connected by a cloud overlay.
6/37
Network Services Dilemma
●
Network Services: On-Premise vs. Centralized Cloud? Edge!
●
Network Service Chaining (NSC)
●
Service chain placement abiding by tenant Service Level Objectives (SLOs).
7/37
Can Network Softwarization help?
●
Network Control, Reusability, and Interoperability.
●
Typically focuses on a single provider.
●
Network-awareness for multi-domain workflows.
8/37
Enablers of Network Softwarization
●
Software-Defined Networking (SDN)
●
Network Functions Virtualization (NFV)
– Network middleboxes → Virtual Network Functions (VNFs)
●
Software-Defined Systems (SDS)
– Storage, Security, Data center, ...
– Improved configurability
9/37
Motivation
●
Better control for tenants composing service workflows.
– Optimal placement abiding by their policies & SLOs.
●
Challenges: technical, economic, and policy.
10/37
Thesis Goals
Network-Aware
Service Composition and
Workflow Placement
Scale
Intra-Domain
Multi-Domain
Edge
The Internet
11/37
Q1: Execution Migration Across
Development Stages
Can we seamlessly scale and migrate network applications through network softwarization across development and deployment stages?
Scale: Data center
(CoopIS’16, SDS’15, and IC2E’16)
12/37
Q2: Economic & Performance Benefits
Can network softwarization offer economic and performance benefits to the end users?
Scale: Data center → Inter-cloud
(Networking’18 and IM’17)
13/37
Q3: Service Chain Placement
Can we efficiently chain services from several edge and cloud providers to compose tenant workflows, by federating SDN deployments of the providers, using SOA?
Scale: Multi-domain → Edge
(ETT’18, ICWS’16, and SDS’16)
14/37
Q4: Interoperability
Can we enhance the interoperability of diverse network applications, by leveraging network softwarization and SOA?
Scale: Data center → Multi-domain and Edge
(CLUSTER’18, DAPD’19, SDS’17, and CoopIS’15)
15/37
Q5: Application to Big Data
Can we improve the performance, modularity, and reusability of big data applications, by leveraging network softwarization and SOA?
Scale: Data center → the Internet
(CCPE’19 and SDS’18)
16/37
Thesis Contributions
17/37
[1] SENDIM: Unified Network
Modeling and Deployment
18/37
[2] SMART: Application-level tenant
policies to network with middleboxes
19/37
[3] NetUber: Cloud-Assisted
Networks as a Connectivity Provider
• A third-party virtual connectivity provider with no fixed
infrastructure.
– Better network paths compared to public Internet paths.
20/37
NetUber Application Scenarios
• Cheaper data transfers between two endpoints.
• Higher throughput and lower latency.
• Network services.
• Alternative to Software-as-a-Service replication.
21/37
NetUber Inter-Cloud Architecture
• Deploy SaaS applications in one or a few regions.
– Fast access from more regions with NetUber.
[Figure: regions Ohio, London, and Belgium; providers AWS and GCP]
22/37
Monetary Costs to Operate NetUber
A. Cost of Cloud VMs (per second).
– Spot instances: volatile, but up to 90% savings.
B. Cost of Bandwidth (per transferred data volume).
C. Cost to connect to the cloud provider (per port-hour).
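The three cost components above can be combined into a simple estimate. The following is a minimal sketch only: the function name, its parameters, and every price in the usage example are hypothetical placeholders, not NetUber's measured values.

```python
def operating_cost(
    vm_count: int,
    vm_price_per_second: float,
    seconds_in_use: float,
    transferred_gb: float,
    bandwidth_price_per_gb: float,
    port_count: int,
    port_price_per_hour: float,
    port_hours: float,
) -> float:
    """Sum the three cost components (A, B, C) listed on the slide."""
    vm_cost = vm_count * vm_price_per_second * seconds_in_use   # A: per second
    bandwidth_cost = transferred_gb * bandwidth_price_per_gb    # B: per data volume
    port_cost = port_count * port_price_per_hour * port_hours   # C: per port-hour
    return vm_cost + bandwidth_cost + port_cost


# Hypothetical prices, for illustration only.
example_cost = operating_cost(
    vm_count=2, vm_price_per_second=0.0001, seconds_in_use=3600,
    transferred_gb=100, bandwidth_price_per_gb=0.02,
    port_count=1, port_price_per_hour=0.30, port_hours=10,
)
```

Keeping the three terms separate makes it easy to see which component dominates for a given transfer volume.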
23/37
Evaluation
• NetUber prototype with AWS r4.8xlarge spot instances.
• Cheaper point-to-point connectivity.
●
Better throughput and reduced latency & jitter.
– Origin: RIPE Atlas Probes and our distributed servers.
– Destination: VMs of multiple AWS regions.
●
Network Services: Compression
24/37
1) Cost: NetUber vs. connectivity providers
• 10 Gbps point-to-point connectivity: from EU & USA.
– Cheaper for data transfers <50 TB/month.
25/37
2) Latency (Ping times): ISP vs. NetUber
(via region, % Improvement)
• NetUber cuts Internet latencies by up to 30%.
• Direct Connect would make NetUber even faster.
26/37
3) Throughput: ISP, NetUber, and
Selectively Using NetUber
●
Better throughput with NetUber via near cloud region.
– Selective use of overlay when no proximate region.
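The selective use of the overlay can be sketched as a per-transfer decision. This is an illustrative sketch, assuming the throughput of the overlay path is bounded by its slowest segment; the function name and the sample figures are hypothetical.

```python
def choose_path(direct_mbps: float, to_region_mbps: float, overlay_mbps: float) -> str:
    """Pick the overlay only when it is expected to beat the direct ISP path.

    to_region_mbps: throughput from the origin to the nearest cloud region.
    overlay_mbps:   throughput between cloud regions over the overlay.
    """
    # Assumption: the end-to-end rate is limited by the slowest segment.
    via_overlay_mbps = min(to_region_mbps, overlay_mbps)
    return "overlay" if via_overlay_mbps > direct_mbps else "direct"


# A nearby region gives a fast first hop, so the overlay wins.
near_region = choose_path(direct_mbps=100, to_region_mbps=400, overlay_mbps=1000)
# No proximate region: the first hop is slow, so the direct path is kept.
far_region = choose_path(direct_mbps=100, to_region_mbps=50, overlay_mbps=1000)
```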
27/37
4) Jitter: ISP vs. NetUber
●
NetUber for latency-sensitive web applications.
28/37
Related Work
• Connectivity provider that does not own the infrastructure.
– Low latency cloud-assisted overlay network.
– Better data rate than ISPs.
• Previous research does not consider economic aspects.
– A cheaper alternative (< 50 TB/month).
• Similar industrial efforts.
– Voxility, an alternative to transit providers.
– Teridion, Internet fast lanes for SaaS providers.
29/37
[4] Évora: Service Chain
Orchestration
●
SDN with Message-Oriented Middleware (MOM).
– For multi-domain edge environments.
●
Graph-based algorithm
– To incrementally construct user workflows
●
as service chains at the edge.
– Place and migrate user service chains.
●
Adhering to the user policies.
30/37
Évora Orchestration:
Deployment Architecture
31/37
1) Initialize Orchestrator in
each User Device
●
Construct a service graph in the user device.
― As a snapshot of the service instances at the edge.
32/37
2) Identify Potential Workflow
Placements
●
Construct potential chains incrementally.
– Subgraphs from service graph to match user chain.
– Noting individual service properties.
●
A complete match?
– Save as a potential service chain placement.
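The incremental construction above can be sketched as follows. This is a minimal illustration, not the Évora implementation: the `(service type, edge node)` encoding of a service instance, the function name, and the sample graph are all assumptions.

```python
from typing import Dict, List, Set, Tuple

Instance = Tuple[str, str]  # (service type, hosting edge node): hypothetical encoding


def potential_placements(
    service_graph: Dict[Instance, Set[Instance]],
    user_chain: List[str],
) -> List[List[Instance]]:
    """Grow partial chains one service at a time; return only complete matches."""
    if not user_chain:
        return []
    # Start from every instance that offers the first service of the chain.
    partial = [[inst] for inst in sorted(service_graph) if inst[0] == user_chain[0]]
    for wanted in user_chain[1:]:
        extended = []
        for chain in partial:
            # Follow links in the service graph to instances of the next service.
            for nxt in sorted(service_graph.get(chain[-1], ())):
                if nxt[0] == wanted:
                    extended.append(chain + [nxt])
        partial = extended
    return partial


graph = {
    ("firewall", "n1"): {("dpi", "n2"), ("dpi", "n3")},
    ("dpi", "n2"): {("lb", "n4")},
    ("dpi", "n3"): set(),   # dead end: no load balancer reachable
    ("lb", "n4"): set(),
}
placements = potential_placements(graph, ["firewall", "dpi", "lb"])
```

Partial chains that cannot be extended are dropped at each step, so only complete matches survive as potential placements.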
33/37
3) Service Chain Placement
●
Calculate a penalty value for potential placements.
– Normalized values: Cost, Latency, and Throughput.
– α,β,γ ← User-specified weights.
●
Place NSC on composition with minimal penalty value.
– Mixed Integer Linear Problem.
– Extensible with powers and more properties.
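The penalty-based selection can be illustrated with a short sketch. The slide text does not give the exact penalty expression, so the weighted sum below (cost and latency add to the penalty, throughput reduces it) is an assumption, as are all names and sample values. The sketch scans an already enumerated candidate set instead of invoking a MILP solver.

```python
from typing import Dict, NamedTuple


class Placement(NamedTuple):
    cost: float        # normalized to [0, 1]; lower is better
    latency: float     # normalized to [0, 1]; lower is better
    throughput: float  # normalized to [0, 1]; higher is better


def penalty(p: Placement, alpha: float, beta: float, gamma: float) -> float:
    """Assumed form: alpha*C + beta*L + gamma*(1 - T)."""
    return alpha * p.cost + beta * p.latency + gamma * (1.0 - p.throughput)


def best_placement(
    candidates: Dict[str, Placement], alpha: float, beta: float, gamma: float
) -> str:
    """Return the name of the candidate with the minimal penalty."""
    return min(candidates, key=lambda name: penalty(candidates[name], alpha, beta, gamma))


candidates = {
    "edge-a": Placement(cost=0.2, latency=0.8, throughput=0.5),
    "edge-b": Placement(cost=0.6, latency=0.1, throughput=0.9),
}
cost_sensitive = best_placement(candidates, alpha=10, beta=3, gamma=3)
latency_sensitive = best_placement(candidates, alpha=3, beta=10, gamma=3)
```

Changing only the weights changes the chosen composition, which is how the user policy steers the placement.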
34/37
Evaluation
●
Model sample edge environment.
– Service nodes and a user device.
– User policies for the service workflow.
●
Microbenchmark Évora workflow placement.
– Effectiveness in satisfying user policies.
– Efficacy in closeness to optimal results
●
↓ Penalty value → ↑ Quality of Experience
35/37
User Policies with Two Properties
●
Equal weights to 2 properties among T, C, and L.
●
Darker circles – compositions with minimal penalty.
– The ones that Évora chooses (circled).
T↑ & C↓ T↑ & L↓ C↓ & L↓
36/37
User Policies with Three Properties
T C L
●
T↑, C ↓, and L ↓ with weighted properties:
– More prominence to one (w=10) than to the other two (w=3).
●
Radius – Monthly Cost
●
Effectively satisfying the user policies.
37/37
Conclusion
●
Seamless migration across development and deployments.
●
A case for Cloud-Assisted Networks as a connectivity provider.
●
Composing & placing workflows in multi-domain networks.
●
Increased interoperability with network softwarization & SOA.
●
Applicability of our contributions in the context of Big Data.
Future Work
●
NetUber as an enterprise connectivity provider.
●
Adaptive network service chains on hybrid networks.
Thank you! Questions?
38/37
Additional Slides
39/37
(0)
*Overview*
40/37
Publications
41/37
Multitenancy and the Tenant Users
of a Cloud Environment
42/37
Contributions and Relationships
43/37
Why SOA for our SDS?
●
Beyond data center scale.
– Thanks to the standardization of services.
●
SOA and RESTful reference architectures.
– Multiple implementation approaches such as Message-
Oriented Middleware (MOM).
●
Publish/subscribe to a message broker over the Internet.
●
Service endpoints to handover messages to the broker.
●
Flexibility, modularity, loose-coupling, and adaptability.
44/37
OpenDaylight
●
Incremental development of OSGi bundles
– Checkpointing and versioning of the modules.
●
State of executions and transactions
– Stored in the controller distributed data tree.
45/37
What has MOM got to do with the controller?
●
Expose the internals of the controller (e.g. OpenDaylight)
– Through a message-based northbound API
●
e.g. AMQP (Advanced Message Queuing Protocol).
– Publish/Subscribe with a broker (e.g. ActiveMQ).
●
What can be exposed
– Data tree (internal data structures of the controller)
– Remote procedure calls (RPCs)
– Notifications.
●
Thanks to Model-Driven Service Abstraction Layer (MD-SAL) of OpenDaylight.
– Compatible internal representation of data plane.
– Messaging4Transport Project.
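The publish/subscribe pattern above can be illustrated with an in-memory stand-in for a broker. This sketch does not use AMQP, ActiveMQ, OpenDaylight, or the Messaging4Transport API; the class, topic names, and message fields are hypothetical and only show how a controller notification reaches its subscribers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy topic-based broker: a stand-in for a real message broker."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver to every handler registered for this topic, if any.
        for handler in list(self._subscribers.get(topic, ())):
            handler(message)


broker = InMemoryBroker()
received: List[dict] = []
broker.subscribe("controller/notifications", received.append)

# The controller side publishes; subscribers never call the controller directly.
broker.publish("controller/notifications", {"event": "link-down", "node": "s1"})
broker.publish("controller/rpc", {"call": "no-subscriber"})
```

The publisher and the subscriber only share a topic name, which is the loose coupling that lets the exposure work across domains.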
46/37
State-aware Adaptive Scaling
●
Adaptive scaling through shared state.
– Horizontal scalability through In-Memory Data
Grids (IMDGs).
– State of the executions for scaling decisions.
●
Pause-and-resume executions.
– Parallel multi-tenant executions.
47/37
(1) *SENDIM*
●
Simulation, Emulation, aNd Deployment Integration
Middleware
●
CoopIS’16, SDS’15, and IC2E’16
48/37
Introduction
●
Networks are simulated or emulated at early stages.
●
Programmable networks → continuous development.
– Native integration of emulators into SDN.
– Network simulators supporting SDN and emulation.
– Cloud simulators extended for clouds with SDN.
●
Lack of “Software-Defined” network simulators.
– Policy/algorithms locked in simulator-imperative code.
●
Demand for easy migration and programmability.
49/37
Motivation
●
An integrated network simulation and emulation.
●
Extend SDN controllers for cloud network simulations.
– Bring the benefits of SDN to its own simulations!
●
Reusability, Scalability, Easy migration, . . .
– Run control plane code in controller itself
(portability).
– Simulate the data plane (scalability, efficiency).
●
by programmatically invoking the southbound.
50/37
Integrated Modeling and Development
51/37
Our Proposal: SENDIM
●
Separation of the Application Logic From the
Execution Environment.
52/37
SENDIM
Execution
53/37
“Software-Defined Simulations”
●
Application Logic expressed in “descriptors”.
– Deployed into the SDN controller, with a Java API.
●
System simulated in the simulation sandbox.
54/37
55/37
56/37
Prototype Implementation
●
Oracle Java 1.8.0 - Development language.
●
Apache Maven 3.1.1 - Build the bundles and execute the
scripts.
●
Infinispan 7.2.0.Final - Distributed cluster.
●
Apache Karaf 3.0.3 - OSGi run time.
●
OpenDaylight Beryllium - Default controller.
●
Multiple deployment options:
– As a stand-alone simulator.
– Distributed execution with an SDN controller.
– As a bundle in an OSGi-based SDN controller.
57/37
Evaluation
●
A cluster of up to 6 identical computers.
– Intel Core™ i7-4700MQ CPU @ 2.40 GHz × 8.
– 8 GB memory, Ubuntu 14.04 LTS 64 bit.
●
Simulating routing algorithms in fat-tree topology.
– Up to 100,000 nodes and changing degrees.
●
Simulation Performance: Benchmark
against CloudSimSDN/Cloud2Sim.
●
Evaluate the migration performance.
– Emulation (Mininet) → Simulation (SENDIM)
– Simulation (SENDIM) → Emulation (Mininet)
58/37
Automated Code Migration:
Simulation → Emulation
●
Time taken to programmatically convert a SENDIM
simulation script into a Mininet script.
59/37
Modeling Performance
●
Network Construction Efficiency and Adaptiveness.
– Simulate when resources are scarce for emulation.
60/37
Simulation Performance and Scalability
●
Higher performance for larger simulations.
●
Smart scale-out → Higher horizontal scalability
61/37
Performance with Incremental Updates
●
Smaller simulations: up to 1000 nodes.
●
SENDIM: controller and middleware execution
completion time.
62/37
Performance with Incremental Updates
●
Initial execution takes longer - Initializations.
63/37
Performance with Incremental Updates
●
Faster, once SENDIM & controller initialized.
64/37
Test-driven Development
●
Faster executions once the system is initialized.
65/37
Subsequent Incremental Updates
●
Even faster executions for subsequent simulations.
66/37
Deploy Changesets to the Controller
●
No change in simulated environment
67/37
Revert Changesets from the Controller
●
No change in simulated environment
68/37
Scale/Migrate Simulated Environment
●
No change in controller.
69/37
70/37
Key Findings
●
SENDIM, Separation of execution from the infrastructure.
– Easy migration between simulations and emulations.
– Enabling an incremental modeling of cloud networks.
●
Performance and scalability.
– Reuse the same controller code to simulate larger deployments.
– Adaptive parallel and distributed simulations.
Future Work
●
Extension points for easy migrations.
– More emulator and controller integrations.
71/37
(2) *NetUber*
(Complementary Slides)
●
Networking’18
72/37
Cost of Cloud Spot VMs
●
10 Gbps R4 instance pairs offered a maximum of only 1.2 Gbps for inter-region data transfer.
– 10 Gbps only inside a placement group.
73/37
Price disparity is real!
Cost of Bandwidth
Regions 1 - 9 (US, Canada, and EU) much cheaper than the others.
74/37
Potential for Network Services
●
NetUber uses memory-optimized R4 spot instances.
– Each with 244 GB memory, 32 vCPU, and 10 GbE interface.
●
Deploy network services at the instances
– Value-added services for the customer.
●
Encryption, WAN-Optimizer, load balancer, ..
– Services for cost-efficiency.
●
Compression.
75/37
(3) *SMART*
●
SDN Middlebox Architecture for Reliable
Transfers.
●
EI2N’16 and IM’17
76/37
Introduction
●
Differentiated QoS in multi-tenant cloud networks.
– Different priorities among tenant processes.
– Application-level user preferences and system policies.
– Performance guarantees at the network-level.
●
Network is shared among the tenants.
– SLA guarantee despite congestion for critical flows.
77/37
Motivation
●
Cross-layer optimization of clouds with SDN.
– Centralized network-as-a-service control plane.
78/37
Our Proposal: SMART
●
Cross-layer architecture for differentiated QoS of flows.
●
FlowTags - Software middlebox to tag the network flows with
contextual information.
– Application-level preferences to the control plane as tags.
– Dynamic flow routing modifications based on the tags.
●
Timely delivery of priority flows by dynamically diverting
them or cloning them to a less congested path.
– Selective Redundancy
– Adaptive approach in cloning and diverting.
79/37
SMART Approach
●
Divert or clone subflows by setting breakpoints
in the priority flows, to avert congestion.
– Trade-off of redundancy to ensure the SLA.
– Adaptiveness with contextual information.
80/37
81/37
82/37
SMART Deployment
83/37
SMART Workflow
84/37
I: Tag Generation for Priority Flows
●
Tag generation query and response.
– between hosts and FlowTags controller.
●
A centralized controller for FlowTags.
●
Tag the flows at the origin.
●
FlowTagger software middlebox.
– A generator of the tags.
– Invoked by the host application layer.
– Similar to the FlowTags-capable
middleboxes for NATs
85/37
II: Regular Routing until Policy Violation
86/37
III: When a Threshold is Met
●
Controller is triggered through the OpenFlow API.
●
A series of control flows inside the control plane.
●
Modify flow entries in the relevant switches.
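The controller-side reaction to a threshold can be sketched as a small decision function. The slides do not state the actual thresholds or the rule for choosing between diverting and cloning, so everything below is an assumption: cloning (more redundancy) is reserved for flows closest to violating the SLA, and no action is taken unless the alternate path is less congested.

```python
def enhancement_decision(
    elapsed_s: float,
    sla_s: float,
    current_path_load: float,
    alternate_path_load: float,
    divert_threshold: float = 0.5,  # hypothetical fraction of the SLA budget
    clone_threshold: float = 0.8,   # hypothetical fraction of the SLA budget
) -> str:
    """Return 'none', 'divert', or 'clone' for a priority flow."""
    progress = elapsed_s / sla_s  # fraction of the permitted completion time used
    if progress < divert_threshold:
        return "none"   # regular routing until the threshold is met
    if alternate_path_load >= current_path_load:
        return "none"   # no less congested path to move or copy the subflow to
    return "clone" if progress >= clone_threshold else "divert"


early = enhancement_decision(0.2, 1.0, current_path_load=0.9, alternate_path_load=0.1)
at_risk = enhancement_decision(0.6, 1.0, current_path_load=0.9, alternate_path_load=0.1)
critical = enhancement_decision(0.9, 1.0, current_path_load=0.9, alternate_path_load=0.1)
```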
87/37
SMART Control Flows: Rules Manager
●
A software middlebox in the control plane.
●
Consumes the tags from the packet.
– Similar to FlowTags-capable firewalls.
88/37
Rules Manager Tags Consumption
●
Interprets the tags
– as input to the SMART Enhancer
89/37
SMART Enhancer
●
Gets the input to the enhancement algorithms.
●
Decides the flow modifications.
– Breakpoint node and packet.
– Clone/divert decisions.
90/37
Prototype Implementation
●
Developed in Oracle Java 1.8.0.
●
OpenDaylight Beryllium as the core SDN controller.
●
Enhancer & Rules Manager middlebox: controller extensions.
– Deployed in OpenDaylight Karaf runtime as OSGi bundles.
●
FlowTags middlebox controller deployed with SDN controller.
– FlowTags, originally a POX extension.
●
Network nodes and flows emulated with Mininet.
– Larger scale cloud deployments simulated.
91/37
Evaluation Strategy
●
Data center network with 1024 nodes and leaf-spine topology.
– Path lengths of more than two hops.
– Up to 100,000 short flows.
●
Flow completion time < 1 s.
●
A few non-priority elephant flows.
– SLA → maximum permitted flow completion time for priority flows
– Uniformly randomized congestion.
●
hitting a few uplinks of nodes concurrently.
●
overwhelming amount of flows through the same nodes and links.
●
Benchmark: SMART enhancements over base routing algorithms.
– Performance (SLA-awareness), redundancy, and overhead.
92/37
SMART Adaptive Clone/Replicate
●
Replicate subsequent flows once a previous flow was
cloned.
– Shortest path and Equal-Cost Multi-Path (ECMP)
93/37
Related Work
●
Multipath TCP (MPTCP) uses the available multiple paths
between the nodes concurrently to route the flows.
– Performance, bandwidth utilization, & congestion control
– through a distributed load balancing.
●
ProgNET: WS-Agreement and SDN for SLA-aware cloud.
●
pFabric for deadline-constrained data flows with minimal
completion time.
●
QJump Linux traffic control module for latency-sensitive
applications.
94/37
Key Findings
●
SMART leverages redundancy in the flows
– Improve the SLA of the priority flows.
●
Cross-layer optimizations through tagging the flows.
– For differentiated QoS.
Future Work
●
Implementation of SMART on a real data center
network.
●
Evaluate against the related work quantitatively.
95/37
(4) *Mayan*
●
Software-Defined Service Compositions
●
ICWS’16 and SDS’16
96/37
Introduction
●
eScience workflows
– Computation-intensive.
– Execute on highly distributed networks.
●
Complex service composition workflows
– To automate scientific and enterprise business
processes.
97/37
Motivation
●
Better orchestration of service workflow
compositions in wide area networks.
●
Software-Defined Service Composition
98/37
Our Proposal: Mayan
●
SDN-based approach for adaptively composing
multi-domain service workflows
– An efficient service instance selection.
– Loose coupling of service definitions and
implementations.
– Availability of a logically centralized control plane.
●
State of executions and transactions stored in the
controller distributed data tree.
– Clustered and federated deployments with MOM.
99/37
Alternative Representations
100/37
Mayan Services Registry:
Modeling Language
101/37
Service Composition Representation
●
<Service3,(<Service1, Input1>, <Service2, Input2>)>
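The nested representation above reads as "Service3 applied to the outputs of Service1 and Service2". A minimal sketch of evaluating such a composition follows; the `Call` class, the registry of callables, and the sample services are hypothetical and are not Mayan's modeling language.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Call:
    """<Service, inputs>: inputs are raw values or nested Call objects."""
    service: str
    inputs: tuple


def execute(call: Call, registry: Dict[str, Callable[..., Any]]) -> Any:
    """Resolve nested calls first, then invoke the named service."""
    args = [execute(i, registry) if isinstance(i, Call) else i for i in call.inputs]
    return registry[call.service](*args)


# Stand-ins for service implementations; a registry could map a name
# to any of several alternative instances.
registry = {
    "Service1": lambda x: x + 1,
    "Service2": lambda x: x * 2,
    "Service3": lambda a, b: a + b,
}

# <Service3,(<Service1, Input1>, <Service2, Input2>)> with Input1 = 1, Input2 = 5
workflow = Call("Service3", (Call("Service1", (1,)), Call("Service2", (5,))))
result = execute(workflow, registry)
```

Because the composition names services rather than endpoints, the registry is free to bind each name to a different instance without changing the workflow.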
102/37
Service Instances: Alternative
Implementations and Deployments
103/37
Solution Architecture
●
Mayan Controller Farm: Inter-Domain Compositions
104/37
105/37
106/37
107/37
Evaluation
●
Evaluation Environment:
– Smaller physical deployments in a cluster.
– Larger deployments as simulations and emulations (Mininet).
●
Evaluation Strategy:
– A workflow performing distributed data cleaning and
consolidation.
●
A distributed web service composition.
vs.
●
Mayan approach with the extended SDN architecture.
108/37
Speedup and Horizontal Scalability
●
No performance degradation for larger deployments.
109/37
Controller Throughput
●
No. of messages entirely processed by the controller.
– Publisher → Controller → Receiver.
●
5000 messages/s at a concurrency of 10 million messages.
110/37
Processing Time
●
Total time to process the complete set of messages
– Against a varying number of messages.
●
Linear scaling with the number of parallel messages.
– 10 million messages in 40 minutes.
111/37
Success Rate
●
Success rate of the controller vs. number of
messages processed in parallel.
– 99.5% for up to 10 million parallel messages.
112/37
Scalability of the Mayan Controller
●
Presented results for a single stand-alone controller.
●
Mayan is designed as a federated deployment.
– Scales horizontally to
●
manage a wider area with a more substantial number of
service nodes and improved latency.
●
handle more concurrent messages in each controller
domain.
113/37
Key Findings
●
SDN-based approach that enables efficient and
flexible large-scale service composition workflows.
– Multi-tenant and multi-domain executions.
– Service composition with web services and
distributed execution frameworks.
●
Related Works on SDN for distributed frameworks
and service workflows.
– Palantir: SDN for MapReduce performance with
the network proximity data.
114/37
(5) *Évora*
(Complementary Slides)
●
ETT’18
115/37
A User-Defined NSC Among the
Edge Nodes
116/37
Problem Scale: Representation of the
Service Graph from the Node Graph
●
The number of links in this service graph grows
– linearly with the number of edges or links between the edge nodes.
– exponentially with the average number of services per edge node.
117/37
Two properties given more prominence (weight = 10) than the third (weight = 3).
118/37
MILP and Graph Matching can be
Computation Intensive
●
But initialization happens once per user chain with a given policy.
– This procedure does not repeat once initialized,
– unless updates are received from the edge network:
●
New node with the service offering at the edge.
●
An existing node or a service offering fails to respond.
●
The number of services in each NSC is typically 5 – 10.
– Évora algorithm follows a greedy approach, rather than a
typical graph matching.
119/37
Performance and Scalability of Évora
Orchestrator Algorithms
120/37
121/37
122/37
123/37
(6) *SD-CPS*
●
Software-Defined Cyber-Physical Systems
●
CLUSTER’18, SDS’17, and M4IoT’15
124/37
Cyber-Physical System (CPS)
●
A system composed of cyber and physical elements.
●
Challenges in CPS.
– Modeling
– Large-scale heterogeneous execution environments.
– Decision making: communication and coordination.
– Management and orchestration of the intelligent agents.
125/37
Motivation
●
An SDS to address the challenges of CPS.
Desired Properties in a new CPS framework
●
Easy to adopt from current CPS approaches.
●
Should not introduce more/new challenges.
126/37
Our Proposal: Software-Defined
Cyber-Physical Systems (SD-CPS)
●
An SDS framework for CPS workflows at the edge.
– CPS workload as edge service workflows.
●
A dual (physical and virtual/cyber) execution environment for
CPS executions.
– Efficient CPS modelling and simulations.
– Mitigate the unpredictability of the physical execution
environment.
●
Resilience for critical flows with a differentiated QoS.
– End-to-end delivery guarantees.
127/37
SD-CPS Controller Architecture
128/37
Controller Farm and
Software-Defined Sensor Networks
129/37
Modeling and Simulating CPS
●
Cyberspace to model the smart devices as virtual intelligent agents.
●
Mapped interactions between the actors in physical & cyber spaces.
●
Incrementally model and load from the controller farm.
130/37
Evaluation Environment
●
Edge nodes and service resource requirements
– with properties normalized.
●
Resource requirement
– Negative value – even the smallest node satisfies.
– High positive value – higher demand for resource.
131/37
Service Deployment Over the Nodes
●
How each service is deployed across nodes.
●
How each node hosts several services.
132/37
Parallel Execution of 1 million workflows
●
Minimal idling nodes.
●
High resource utilization.
133/37
Related Work
●
SDN for Heterogeneous Devices.
– Sensor OpenFlow: SD-Wireless Sensor Networks.
●
Scaling SDN: Clustering SDN controller with Akka.
●
OpenDaylight Federation
●
Conceptual Data Tree projects.
●
SDS for Smart Environments.
●
Albatross: Taming challenges of distributed systems
134/37
Key Findings
●
Increased resource efficiency using edge workflows.
●
An approach to mitigate the design and operations
challenges in CPS.
●
Benefits of SDN to CPS.
– Unified and centralized control.
– Improved QoS, management, and resilience.
– Reduced repeated effort in modeling.
135/37
(7) *Óbidos*
●
On-demand Big Data Integration, Distribution, and
Orchestration System
●
DAPD’19, CoopIS’15, and DMAH’17
136/37
Introduction
●
Volume, variety, and distribution of big data are rising.
– Structured, semi-structured, unstructured, or ill-formed.
●
Integration of data is crucial for data science.
– Multiple types of data: Imaging, clinical, and genomic.
– Numerous data sources: No shared messaging protocol.
– Do we really need to integrate all the data?
●
Sharing of integrated data and results for reproducibility.
137/37
Human-in-the-loop On-Demand Data Integration
●
Service-based data access through APIs.
– Thanks to specifications such as HL7 FHIR.
●
The researchers possess domain knowledge.
●
Integrate On-Demand.
– Avoid eager loading of binary data or its textual metadata.
– Use the researcher query as an input in loading data.
●
Scalable storage in-house.
– Load, integrate, index, and query unstructured data.
138/37
Data Sharing
Intra-Organization
●
Load data only once per organization.
– Bandwidth and storage efficiency.
139/37
Data Sharing
Inter-Organization
●
Do not duplicate data!
– We “own” our interest; not the data.
●
Point to the data in the data sources.
– Pointers to data like Dropbox Shared Links.
●
Avoids outdated duplicate data.
●
Easy to maintain.
●
APIs – Access the list of research data sets.
140/37
Problems
●
How to..
– Load data from several big data sources.
●
Avoid repeated loading and near duplicate data.
– Integrate disparate data and persist for future
accesses.
– Share pointers to data internally and externally.
141/37
Our Proposal: Óbidos
●
Define subsets of data that are of interest.
– using hierarchical structure of medical data.
●
Medical Images (DICOM), Clinical data, ..
●
User query → Narrow down the search space.
On-demand Big Data Integration, Distribution & Orchestration System
142/37
Óbidos Approach
●
Hybrid of virtual and materialized data integration approaches.
– Lazy load of metadata: Load the matching subset of metadata.
– Store integrated data and query results → scalable storage.
●
Track already loaded data.
– Near duplicate detection.
– Download only updates (changesets).
●
Efficient SQL queries on NoSQL storage.
●
Share pointers to the datasets rather than the dataset itself.
●
Generic design; implementation for medical research data.
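The "lazy load" and "track already loaded data" points can be sketched as a small loader. This is an illustration under simplifying assumptions, not the Óbidos implementation: items are tracked by exact identifier only (no near duplicate detection or changesets), a dictionary stands in for the scalable storage, and the `fetch` callable is a hypothetical accessor to a remote data source.

```python
from typing import Callable, Dict, Iterable


class OnDemandLoader:
    """Load only the items a query asks for, and each item at most once."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch                 # hypothetical remote-source accessor
        self._store: Dict[str, dict] = {}   # stands in for scalable storage
        self.fetch_count = 0                # number of remote loads performed

    def load(self, ids_matching_query: Iterable[str]) -> Dict[str, dict]:
        result = {}
        for item_id in ids_matching_query:
            if item_id not in self._store:  # track already loaded data
                self._store[item_id] = self._fetch(item_id)
                self.fetch_count += 1
            result[item_id] = self._store[item_id]
        return result


loader = OnDemandLoader(fetch=lambda item_id: {"id": item_id})
first = loader.load(["study-1", "study-2"])
second = loader.load(["study-2", "study-3"])  # study-2 is served from the store
```

The second query triggers one remote load instead of two, which is the bandwidth saving the approach aims for.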
143/37
Óbidos
Architecture
144/37
145/37
Data Sharing with Óbidos
146/37
147/37
Data Structures of the Replicaset Holder
148/37
Evaluation
●
Evaluation Data:
– Clinical data and TCIA DICOM imaging collections.
●
Benchmark Óbidos against eager and lazy ETL.
– Performance of loading and querying data.
●
Óbidos (inter- and intra- org) against binary data sharing.
– Space/bandwidth efficiency of data sharing.
149/37
Workload Characterization
Various Entries in Evaluated Collections
150/37
Data Load Time
Change in total data volume (Same query and same interest)
●
Load time for eager & lazy ETL ↑ with total volume.
●
Load time for Óbidos remains constant.
151/37
Data Load Time
Change in studies of interest
(Same query and constant total data volume)
●
Load time for eager and lazy ETL remains constant.
●
Load time increases for Óbidos with the interest.
– Converges to the load time of lazy ETL.
152/37
Load Time from the
Remote Data Sources
●
Eager and lazy ETL take much longer
– To load more data and metadata over the Internet.
153/37
Query Completion Time
for the Integrated Data Repository
●
Corresponding data already loaded in Óbidos.
●
Indexed scalable NoSQL architecture of Óbidos
→ Better performance.
154/37
Efficiency in Sharing
Medical Research Data
●
Replicaset – Pointers of marginal size, yet the size increases
with the number of entries of the same granularity.
155/37
Key Findings
●
Óbidos offers on-demand service-based big data integration.
– Fast and resource-efficient data analysis.
– SQL queries over NoSQL data store for the integrated data.
– Efficient data sharing without replicating the actual data.
Future Work
– Consume data from repositories beyond medical domain.
●
EUDAT
– Óbidos distributed virtual data warehouses.
●
Leverage the proximity in data integration and sharing.
156/37
(8) *Mayan-DS*
●
Software-Defined Data Services (SDDS)
●
CCPE’19 and SDS’18 (Best Paper Award)
●
Work-in-progress
Introduction
●
Data services: Service APIs to big data → Interoperability.
●
Related data and services distributed far from each other
→ Bad performance with scale.
●
Chaining of data services.
– Composing chains of numerous data services.
– Data access → Data cleaning → Data integration.
●
How to scale out efficiently?
– How to minimize communication overheads?
158/37
Motivation
●
Software-Defined Networking (SDN).
– A unified controller to the data plane devices.
– Brings network awareness to the applications.
●
Data services
– Make big data executions interoperable.
●
Can we bring SDN to the Data Services?
– Software-Defined Data Services (SDDS)
159/37
Our Proposal:
Software-Defined Data Services (SDDS)
●
SDDS as a generic approach for data services.
– Extending and leveraging SDN.
●
Mayan-DS, an SDDS framework.
– Efficient management of data services.
– Interoperability and scalability.
160/37
Solution Architecture
161/37
SDDS Approach
●
Define all the data operations as interoperable services.
●
SDN for distributing data and service executions
– Inside a data center (e.g. Software-Defined Data
Centers).
– Beyond data centers (extend SDN with MOM).
●
Optimal placement of data and service execution.
– Minimize communication overhead and data movements.
●
Execute data service on the best-fit server, until
interrupted.
162/37
Efficient Data and Execution Placement
{i, j} – related data objects
D – datasets of interest
n – execution node
ξ – spread of the related data objects
163/37
Prototype Implementation
164/37
Simulated Environment
(with Modeled Latency in ms)
165/37
Ping Times (ms) Between Two Nodes:
Regular Internet vs. Mayan-DS
166/37
Latency: Ping Times of Mayan-DS
●
Up to 33% reduction in latency
– with a fraction of the path through a direct link.
●
75% or more reduction with a significant portion of the path through a direct link.
167/37
Key Findings
●
Software-Defined Data Services (SDDS) for
interoperability and scalability in big data executions.
●
Mayan-DS leverages SDN for big data workflows at
Internet-scale.
●
Limited focus of industrial offerings.
– Storage or one or a few specific services.
Future Work
●
Extend Mayan-DS for edge and IoT/CPS
environments.

The UCLouvain Public Defense of my EMJD-DC Double Doctorate Ph.D. degree

  • 1. Software-Defined Systems for Network-Aware Service Composition and Workflow Placement Pradeeban Kathiravelu Supervisors: Prof. Luís Veiga Prof. Peter Van Roy Louvain-la-Neuve, Belgium. August 23rd, 2019
  • 2. 2/37 Promising Trends of the Internet ● Growth of the Internet bandwidth. – ↑ demand and ↓ $$$. ● Innovation with cloud ecosystems. – Service providers and tenants. – Dedicated connectivity* of the cloud providers. ● Increasing geographical presence. ● Well-provisioned network → Low latency links. * James Hamilton, VP, AWS (AWS re:Invent 2016).
  • 3. 3/37 Challenges in Practice ● Disparities – Pricing. ● E.g. IP Transit price per Mbps, 2014 – USA: 0.94 $ – Kazakhstan: 15 $ – Uzbekistan: 347 $ – Latency. ● Multi-domain Workflows? – Interoperability – Control
  • 4. 4/37 A Case for Cloud-Assisted Networks ● A large-scale overlay network built over cloud VMs ● Can a network overlay built over cloud instances be a better connectivity provider? – High-performance – Cost effectiveness – Network Services on the fly!
  • 5. 5/37 Low Latency with Cloud Routes • Cloud-based data transfer A → Z via the path A → B → Z. – Cloud region B is closer to the origin server A. – B and Z are cloud VMs connected by a cloud overlay.
  • 6. 6/37 Network Services Dilemma ● Network Services: On-Premise vs. Centralized Cloud? Edge! ● Network Service Chaining (NSC) ● Service chain placement abiding by tenant Service Level Objectives (SLOs).
  • 7. 7/37 Can Network Softwarization help? ● Network Control, Reusability, and Interoperability. ● Typically focuses on a single provider. ● Network-awareness for multi-domain workflows.
  • 8. 8/37 Enablers of Network Softwarization ● Software-Defined Networking (SDN) ● Network Functions Virtualization (NFV) – Network middleboxes → Virtual Network Functions (VNFs) ● Software-Defined Systems (SDS) – Storage, Security, Data center, .. – Improved configurability
  • 9. 9/37 Motivation ● Better control for tenants composing service workflows. – Optimal placement abiding by their policies & SLOs. ● Challenges: technical, economic, and policy.
  • 10. 10/37 Thesis Goals Network-Aware Service Composition and Workflow Placement Scale Intra-Domain Multi-Domain Edge The Internet
  • 11. 11/37 Q1: Execution Migration Across Development Stages Can we seamlessly scale and migrate network applications through network softwarization across development and deployment stages? Scale: Data center (CoopIS’16, SDS’15, and IC2E’16)
  • 12. 12/37 Q2: Economic & Performance Benefits Can network softwarization offer economic and performance benefits to the end users? Scale: Data center → Inter-cloud (Networking’18 and IM’17)
  • 13. 13/37 Q3: Service Chain Placement Can we efficiently chain services from several edge and cloud providers to compose tenant workflows, by federating SDN deployments of the providers, using SOA? Scale: Multi-domain → Edge (ETT’18, ICWS’16, and SDS’16)
  • 14. 14/37 Q4: Interoperability Can we enhance the interoperability of diverse network applications, by leveraging network softwarization and SOA? Scale: Data center → Multi-domain and Edge (CLUSTER’18, DAPD’19, SDS’17, and CoopIS’15)
  • 15. 15/37 Q5: Application to Big Data Can we improve the performance, modularity, and reusability of big data applications, by leveraging network softwarization and SOA? Scale: Data center → the Internet (CCPE’19 and SDS’18)
  • 17. 17/37 [1] SENDIM: Unified Network Modeling and Deployment
  • 18. 18/37 [2] SMART: Application-level tenant policies to network with middleboxes
  • 19. 19/37 [3] NetUber: Cloud-Assisted Networks as a Connectivity Provider • A third-party virtual connectivity provider with no fixed infrastructure. – Better network paths compared to public Internet paths.
  • 20. 20/37 NetUber Application Scenarios • Cheaper data transfers between two endpoints. • Higher throughput and lower latency. • Network services. • Alternative to Software-as-a-Service replication.
  • 21. 21/37 NetUber Inter-Cloud Architecture • Deploy SaaS applications in one or a few regions. – Fast access from more regions with NetUber. Ohio London Belgium AWS GCP
  • 22. 22/37 Monetary Costs to Operate NetUber A.Cost of Cloud VMs (per second) – Spot instances: volatile, but up to 90% savings. B.Cost of Bandwidth (per transferred data volume). C.Cost to connect to the cloud provider (per port-hour).
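The three cost components listed on this slide could be combined as in the following minimal Python sketch. The `Prices` fields, the `monthly_cost` function, and every numeric value are illustrative assumptions, not NetUber's or any cloud provider's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    vm_per_second: float     # A: spot VM price, per VM per second (assumed value)
    bandwidth_per_gb: float  # B: price per GB of transferred data (assumed value)
    port_per_hour: float     # C: price per port-hour to connect to the provider (assumed value)


def monthly_cost(prices, vm_count, seconds, transferred_gb, ports, hours):
    """Sum the three cost components: VMs, bandwidth, and connection ports."""
    vm = prices.vm_per_second * vm_count * seconds
    bandwidth = prices.bandwidth_per_gb * transferred_gb
    connect = prices.port_per_hour * ports * hours
    return vm + bandwidth + connect


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
example_prices = Prices(vm_per_second=0.0001, bandwidth_per_gb=0.02, port_per_hour=0.3)
example_total = monthly_cost(example_prices, vm_count=2, seconds=3600,
                             transferred_gb=100, ports=1, hours=1)
```

Keeping the components separate makes it easy to see which one dominates as the transferred volume grows.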
  • 23. 23/37 Evaluation • NetUber prototype with AWS r4.8xlarge spot instances. • Cheaper point-to-point connectivity. ● Better throughput and reduced latency & jitter. – Origin: RIPE Atlas Probes and our distributed servers. – Destination: VMs of multiple AWS regions. ● Network Services: Compression
  • 24. 24/37 1) Cost: NetUber vs. connectivity providers • 10 Gbps point-to-point connectivity: from EU & USA. – Cheaper for data transfers <50 TB/month.
  • 25. 25/37 2) Latency (Ping times): ISP vs. NetUber (via region, % Improvement) • NetUber cuts Internet latencies by up to 30%. • Direct Connect would make NetUber even faster.
  • 26. 26/37 3) Throughput: ISP, NetUber, and Selectively Using NetUber ● Better throughput with NetUber via near cloud region. – Selective use of overlay when no proximate region.
  • 27. 27/37 4) Jitter: ISP vs. NetUber ● NetUber for latency-sensitive web applications.
  • 28. 28/37 Related Work • Connectivity provider that does not own the infrastructure. – Low latency cloud-assisted overlay network. – Better data rate than ISPs. • Previous research does not consider economic aspects. – A cheaper alternative (< 50 TB/month). • Similar industrial efforts. – Voxility, an alternative to transit providers. – Teridion, Internet fast lanes for SaaS providers.
  • 29. 29/37 [4] Évora: Service Chain Orchestration ● SDN with Message-Oriented Middleware (MOM). – For multi-domain edge environments. ● Graph-based algorithm – To incrementally construct user workflows ● as service chains at the edge. – Place and migrate user service chains. ● Adhering to the user policies.
  • 31. 31/37 1) Initialize Orchestrator in each User Device ● Construct a service graph in the user device. ― As a snapshot of the service instances at the edge.
  • 32. 32/37 2) Identify Potential Workflow Placements ● Construct potential chains incrementally. – Subgraphs from service graph to match user chain. – Noting individual service properties. ● A complete match? – Save as a potential service chain placement.
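The incremental construction of potential chains can be illustrated with a short Python sketch. It enumerates every path in a small service graph whose service types match the user chain in order. The graph encoding, the instance names, and the `potential_placements` function are assumptions made for illustration. This exhaustive enumeration is a simplification and is not the Évora algorithm itself.

```python
def potential_placements(service_graph, instance_type, user_chain):
    """Return all paths in the service graph matching user_chain, built one hop at a time."""
    # Start from every service instance that offers the first service in the chain.
    partial = [[inst] for inst in service_graph
               if instance_type[inst] == user_chain[0]]
    for wanted in user_chain[1:]:
        extended = []
        for chain in partial:
            # Extend each partial chain through neighbours offering the next service.
            for neighbour in service_graph[chain[-1]]:
                if instance_type[neighbour] == wanted and neighbour not in chain:
                    extended.append(chain + [neighbour])
        partial = extended
    # Only complete matches survive the last iteration.
    return partial


# Hypothetical snapshot of service instances at the edge (instance -> reachable instances).
service_graph = {
    "fw@n1": ["lb@n2"],
    "fw@n2": ["lb@n2"],
    "lb@n2": ["dpi@n3"],
    "dpi@n3": [],
}
instance_type = {
    "fw@n1": "firewall",
    "fw@n2": "firewall",
    "lb@n2": "load_balancer",
    "dpi@n3": "dpi",
}
chains = potential_placements(service_graph, instance_type,
                              ["firewall", "load_balancer", "dpi"])
```

Each returned list is one potential service chain placement that a later step can score against the user policy.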
  • 33. 33/37 3) Service Chain Placement ● Calculate a penalty value for potential placements. – Normalized values: Cost, Latency, and Throughput. – α,β,γ ← User-specified weights. ● Place NSC on composition with minimal penalty value. – Mixed Integer Linear Problem. – Extensible with powers and more properties.
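A minimal Python sketch of the penalty-based selection follows. The transcript does not give the exact penalty formula, so the linear form below (cost and latency increase the penalty, throughput decreases it) is an assumption, as are the function names and the candidate values. It picks the minimum by direct comparison over a small candidate list rather than by solving a mixed integer linear problem.

```python
def penalty(placement, alpha, beta, gamma):
    """Weighted penalty of one candidate placement (assumed linear form)."""
    return (alpha * placement["cost"]
            + beta * placement["latency"]
            - gamma * placement["throughput"])


def best_placement(placements, alpha, beta, gamma):
    """Pick the candidate placement with the minimal penalty value."""
    return min(placements, key=lambda p: penalty(p, alpha, beta, gamma))


# Hypothetical candidates with properties already normalized to [0, 1].
candidates = [
    {"name": "A", "cost": 0.2, "latency": 0.8, "throughput": 0.5},
    {"name": "B", "cost": 0.6, "latency": 0.3, "throughput": 0.9},
]

equal_weights_choice = best_placement(candidates, 1, 1, 1)["name"]
cost_heavy_choice = best_placement(candidates, 10, 3, 3)["name"]
```

With equal weights the sketch prefers the faster candidate B, while weighting cost more heavily switches the choice to the cheaper candidate A.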
  • 34. 34/37 Evaluation ● Model sample edge environment. – Service nodes and a user device. – User policies for the service workflow. ● Microbenchmark Évora workflow placement. – Effectiveness in satisfying user policies. – Efficacy in closeness to optimal results ● ↓ Penalty value ➡ ↑ Quality of Experience
  • 35. 35/37 User Policies with Two Properties ● Equal weights to 2 properties among T, C, and L. ● Darker circles – compositions with minimal penalty. – The ones that Évora chooses (circled). T↑ & C↓ T↑ & L↓ C↓ & L↓
  • 36. 36/37 User Policies with Three Properties T C L ● T↑, C ↓, and L ↓ with weighted properties: – More prominence to one (w=10) than to the other two (w=3). ● Radius – Monthly Cost ● Effectively satisfying the user policies.
  • 37. 37/37 Conclusion ● Seamless migration across development and deployments. ● A case for Cloud-Assisted Networks as a connectivity provider. ● Composing & placing workflows in multi-domain networks. ● Increased interoperability with network softwarization & SOA. ● Applicability of our contributions in the context of Big Data. Future Work ● NetUber as an enterprise connectivity provider. ● Adaptive network service chains on hybrid networks. Thank you! Questions?
  • 41. 41/37 Multitenancy and the Tenant Users of a Cloud Environment
  • 43. 43/37 Why SOA for our SDS? ● Beyond data center scale. – Thanks to the standardization of services. ● SOA and RESTful reference architectures. – Multiple implementation approaches such as Message-Oriented Middleware (MOM). ● Publish/subscribe to a message broker over the Internet. ● Service endpoints to hand over messages to the broker. ● Flexibility, modularity, loose-coupling, and adaptability.
  • 44. 44/37 OpenDaylight ● Incremental development of OSGi bundles – Checkpointing and versioning of the modules. ● State of executions and transactions – Stored in the controller distributed data tree.
  • 45. 45/37 What has MOM got to do with the controller? ● Expose the internals of the controller (e.g. OpenDaylight) – Through a message-based northbound API ● e.g. AMQP (Advanced Message Queuing Protocol). – Publish/Subscribe with a broker (e.g. ActiveMQ). ● What can be exposed – Data tree (internal data structures of the controller) – Remote procedure calls (RPCs) – Notifications. ● Thanks to the Model-Driven Service Abstraction Layer (MD-SAL) of OpenDaylight. – Compatible internal representation of data plane. – Messaging4Transport Project.
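The publish/subscribe pattern behind a message-based northbound API can be sketched with a toy in-process broker in Python. This is only a stand-in for a real broker such as ActiveMQ speaking AMQP. The `Broker` class, the topic name, and the message fields are hypothetical and do not reflect the OpenDaylight or Messaging4Transport APIs.

```python
from collections import defaultdict


class Broker:
    """Toy in-process broker: routes each published message to topic subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message and return the number of subscribers reached."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
received = []

# An external application subscribes to notifications exposed by the controller.
broker.subscribe("controller/notifications", received.append)

# The controller side publishes a notification without knowing who consumes it.
delivered = broker.publish("controller/notifications",
                           {"event": "link-down", "node": "s1"})
```

The publisher and the subscriber only share a topic name, which is the loose coupling that lets the controller internals be consumed across domains.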
  • 46. 46/37 State-aware Adaptive Scaling ● Adaptive scaling through shared state. – Horizontal scalability through In-Memory Data Grids (IMDGs). – State of the executions for scaling decisions. ● Pause-and-resume executions. – Parallel multi-tenant executions.
  • 47. 47/37 (1) *SENDIM* ● Simulation, Emulation, aNd Deployment Integration Middleware ● CoopIS’16, SDS’15, and IC2E’16
  • 48. 48/37 Introduction ● Networks simulated or emulated at early stages. ● Programmable networks → continuous development. – Native integration of emulators into SDN. – Network simulators supporting SDN and emulation. – Cloud simulators extended for clouds with SDN. ● Lack of “Software-Defined” network simulators. – Policy/algorithms locked in simulator-imperative code. ● Demand for easy migration and programmability.
  • 49. 49/37 Motivation ● An integrated network simulation and emulation. ● Extend SDN controllers for cloud network simulations. – Bring the benefits of SDN to its own simulations! ● Reusability, Scalability, Easy migration, . . . – Run control plane code in controller itself (portability). – Simulate the data plane (scalability, efficiency). ● by programmatically invoking the southbound.
  • 51. 51/37 Our Proposal: SENDIM ● Separation of the Application Logic From the Execution Environment.
  • 53. 53/37 “Software-Defined Simulations” ● Application Logic expressed in “descriptors”. – Deployed into the SDN controller, with a Java API. ● System simulated in the simulation sandbox.
  • 54. 54/37
  • 55. 55/37
  • 56. 56/37 Prototype Implementation ● Oracle Java 1.8.0 - Development language. ● Apache Maven 3.1.1 - Build the bundles and execute the scripts. ● Infinispan 7.2.0.Final - Distributed cluster. ● Apache Karaf 3.0.3 - OSGi run time. ● OpenDaylight Beryllium - Default controller. ● Multiple deployment options: – As a stand-alone simulator. – Distributed execution with an SDN controller. – As a bundle in an OSGi-based SDN controller.
  • 57. 57/37 Evaluation ● A cluster of up to 6 identical computers. – Intel Core™ i7-4700MQ CPU @ 2.40GHz 8 CPU. – 8 GB memory, Ubuntu 14.04 LTS 64 bit. ● Simulating routing algorithms in fat-tree topology. – Up to 100,000 nodes and changing degrees. ● Simulation Performance: Benchmark against CloudSimSDN/Cloud2Sim. ● Evaluate the migration performance. – Emulation (Mininet) → Simulation (SENDIM) – Simulation (SENDIM) → Emulation (Mininet)
  • 58. 58/37 Automated Code Migration: Simulation → Emulation ● Time taken to programmatically convert a SENDIM simulation script into a Mininet script.
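To give a feel for what a programmatic simulation-to-emulation conversion involves, the Python sketch below renders a topology descriptor as the source text of a Mininet custom topology. The descriptor format and the `to_mininet_script` function are assumptions for illustration and are not SENDIM's actual descriptor or converter. The generated text uses Mininet's `Topo` class with its `addHost`, `addSwitch`, and `addLink` methods.

```python
def to_mininet_script(descriptor):
    """Render a (hypothetical) topology descriptor as a Mininet custom topology script."""
    lines = [
        "from mininet.topo import Topo",
        "",
        "",
        "class GeneratedTopo(Topo):",
        "    def build(self):",
    ]
    for host in descriptor["hosts"]:
        lines.append(f"        self.addHost('{host}')")
    for switch in descriptor["switches"]:
        lines.append(f"        self.addSwitch('{switch}')")
    for a, b in descriptor["links"]:
        lines.append(f"        self.addLink('{a}', '{b}')")
    lines.append("")
    lines.append("")
    # Mininet's --custom option looks up topologies in a module-level 'topos' dict.
    lines.append("topos = {'generated': (lambda: GeneratedTopo())}")
    return "\n".join(lines) + "\n"


example_descriptor = {
    "hosts": ["h1", "h2"],
    "switches": ["s1"],
    "links": [("h1", "s1"), ("h2", "s1")],
}
script = to_mininet_script(example_descriptor)
```

The generated file could then be passed to Mininet with its `--custom` and `--topo` options on a machine where Mininet is installed.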
  • 59. 59/37 Modeling Performance ● Network Construction Efficiency and Adaptiveness. – Simulate when resources are scarce for emulation.
  • 60. 60/37 Simulation Performance and Scalability ● Higher performance for larger simulations. ● Smart scale-out → Higher horizontal scalability
  • 61. 61/37 Performance with Incremental Updates ● Smaller simulations: up to 1000 nodes. ● SENDIM: controller and middleware execution completion time.
  • 62. 62/37 Performance with Incremental Updates ● Initial execution takes longer - Initializations.
  • 63. 63/37 Performance with Incremental Updates ● Faster, once SENDIM & controller initialized.
  • 64. 64/37 Test-driven Development ● Faster executions once the system is initialized.
  • 65. 65/37 Subsequent Incremental Updates ● Even faster executions for subsequent simulations.
  • 66. 66/37 Deploy Changesets to the Controller ● No change in simulated environment
  • 67. 67/37 Revert Changesets from the Controller ● No change in simulated environment
  • 69. 69/37
  • 70. 70/37 Key Findings ● SENDIM, Separation of execution from the infrastructure. – Easy migration between simulations and emulations. – Enabling an incremental modeling of cloud networks. ● Performance and scalability. – Reuse the same controller code to simulate larger deployments. – Adaptive parallel and distributed simulations. Future Work ● Extension points for easy migrations. – More emulator and controller integrations.
  • 72. 72/37 Cost of Cloud Spot VMs ● 10 Gbps R4 instance pairs offered only a maximum of 1.2 Gbps of inter-region data transfer. – 10 Gbps only inside a placement group.
  • 73. 73/37 Price disparity is real! Cost of Bandwidth Regions 1 - 9 (US, Canada, and EU) much cheaper than the others.
  • 74. 74/37 Potential for Network Services ● NetUber uses memory-optimized R4 spot instances. – Each with 244 GB memory, 32 vCPU, and 10 GbE interface. ● Deploy network services at the instances – Value-added services for the customer. ● Encryption, WAN-Optimizer, load balancer, .. – Services for cost-efficiency. ● Compression.
  • 75. 75/37 (3) *SMART* ● SDN Middlebox Architecture for Reliable Transfers. ● EI2N’16 and IM’17
  • 76. 76/37 Introduction ● Differentiated QoS in multi-tenant cloud networks. – Different priorities among tenant processes. – Application-level user preferences and system policies. – Performance guarantees at the network-level. ● Network is shared among the tenants. – SLA guarantee despite congestion for critical flows.
  • 77. 77/37 Motivation ● Cross-layer optimization of clouds with SDN. – Centralized network-as-a-service control plane.
  • 78. 78/37 Our Proposal: SMART ● Cross-layer architecture for differentiated QoS of flows. ● FlowTags - Software middlebox to tag the network flows with contextual information. – Application-level preferences to the control plane as tags. – Dynamic flow routing modifications based on the tags. ● Timely delivery of priority flows by dynamically diverting them or cloning them to a less congested path. – Selective Redundancy – Adaptive approach in cloning and diverting.
  • 79. 79/37 SMART Approach ● Divert or clone subflows by setting breakpoints in the priority flows, to avert congestion. – Trade-off of redundancy to ensure the SLA. – Adaptiveness with contextual information.
  • 80. 80/37
  • 81. 81/37
  • 84. 84/37 I: Tag Generation for Priority Flows ● Tag generation query and response. – between hosts and FlowTags controller. ● A centralized controller for FlowTags. ● Tag the flows at the origin. ● FlowTagger software middlebox. – A generator of the tags. – Invoked by the host application layer. – Similar to the FlowTags-capable middleboxes for NATs
  • 85. 85/37 II: Regular Routing until Policy Violation
  • 86. 86/37 III: When a Threshold is Met ● Controller is triggered through the OpenFlow API. ● A series of control flows inside the control plane. ● Modify flow entries in the relevant switches.
  • 87. 87/37 SMART Control Flows: Rules Manager ● A software middlebox in the control plane. ● Consumes the tags from the packet. – Similar to FlowTags-capable firewalls.
  • 88. 88/37 Rules Manager Tags Consumption ● Interprets the tags – as input to the SMART Enhancer
  • 89. 89/37 SMART Enhancer ● Gets the input to the enhancement algorithms. ● Decides the flow modifications. – Breakpoint node and packet. – Clone/divert decisions.
  • 90. 90/37 Prototype Implementation ● Developed in Oracle Java 1.8.0. ● OpenDaylight Beryllium as the core SDN controller. ● Enhancer & Rules Manager middlebox: controller extensions. – Deployed in OpenDaylight Karaf runtime as OSGi bundles. ● FlowTags middlebox controller deployed with SDN controller. – FlowTags, originally a POX extension. ● Network nodes and flows emulated with Mininet. – Larger scale cloud deployments simulated.
  • 91. 91/37 Evaluation Strategy ● Data center network with 1024 nodes and leaf-spine topology. – Path lengths of more than two hops. – Up to 100,000 short flows. ● Flow completion time < 1 s. ● A few non-priority elephant flows. – SLA → maximum permitted flow completion time for priority flows – Uniformly randomized congestion. ● hitting a few uplinks of nodes concurrently. ● overwhelming number of flows through the same nodes and links. ● Benchmark: SMART enhancements over base routing algorithms. – Performance (SLA-awareness), redundancy, and overhead.
  • 92. 92/37 SMART Adaptive Clone/Replicate ● Replicate subsequent flows once a previous flow was cloned. – Shortest path and Equal-Cost Multi-Path (ECMP)
  • 93. 93/37 Related Work ● Multipath TCP (MPTCP) uses the available multiple paths between the nodes concurrently to route the flows. – Performance, bandwidth utilization, & congestion control – through distributed load balancing. ● ProgNET: WS-Agreement and SDN for SLA-aware cloud. ● pFabric for deadline-constrained data flows with minimal completion time. ● QJump Linux traffic control module for latency-sensitive applications.
  • 94. 94/37 Key Findings ● SMART leverages redundancy in the flows – Improve the SLA of the priority flows. ● Cross-layer optimizations through tagging the flows. – For differentiated QoS. Future Work ● Implementation of SMART on a real data center network. ● Evaluate against the related work quantitatively.
  • 95. 95/37 (4) *Mayan* ● Software-Defined Service Compositions ● ICWS’16 and SDS’16
  • 96. 96/37 Introduction ● eScience workflows – Computation-intensive. – Execute on highly distributed networks. ● Complex service composition workflows – To automate scientific and enterprise business processes.
  • 97. 97/37 Motivation ● Better orchestration of service workflow compositions in wide area networks. ● Software-Defined Service Composition
  • 98. 98/37 Our Proposal: Mayan ● SDN-based approach for adaptively composing multi-domain service workflows – An efficient service instance selection. – Loose coupling of service definitions and implementations. – Availability of a logically centralized control plane. ● State of executions and transactions stored in the controller distributed data tree. – Clustered and federated deployments with MOM.
  • 103. 103/37 Solution Architecture ● Mayan Controller Farm: Inter-Domain Compositions
  • 104. 104/37
  • 105. 105/37
  • 106. 106/37
  • 107. 107/37 Evaluation ● Evaluation Environment: – Smaller physical deployments in a cluster. – Larger deployments as simulations and emulations (Mininet). ● Evaluation Strategy: – A workflow performing distributed data cleaning and consolidation. ● A distributed web service composition. vs. ● Mayan approach with the extended SDN architecture.
  • 108. 108/37 Speedup and Horizontal Scalability ● No performance degradation for larger deployments.
  • 109. 109/37 Controller Throughput ● No. of messages entirely processed by the controller. – Publisher → Controller → Receiver. ● 5000 messages/s in a concurrency of 10 million msg.
  • 110. 110/37 Processing Time ● Total time to process the complete set of messages – Against a varying number of messages. ● Linear scaling with the number of parallel messages. – 10 million messages in 40 minutes.
  • 111. 111/37 Success Rate ● Success rate of the controller vs. number of messages processed in parallel. – 99.5% for up to 10 million parallel messages.
  • 112. 112/37 Scalability of the Mayan Controller ● Presented results for a single stand-alone controller. ● Mayan is designed as a federated deployment. – Scales horizontally to ● manage a wider area with a more substantial number of service nodes and improved latency. ● handle more concurrent messages in each controller domain.
  • 113. 113/37 Key Findings ● SDN-based approach that enables efficient and flexible large-scale service composition workflows. – Multi-tenant and multi-domain executions. – Service composition with web services and distributed execution frameworks. ● Related Works on SDN for distributed frameworks and service workflows. – Palantir: SDN for MapReduce performance with the network proximity data.
  • 115. 115/37 A User-Defined NSC Among the Edge Nodes
  • 116. 116/37 Problem Scale: Representation of the Service Graph from the Node Graph ● The number of links in this service graph grows – linearly with the number of edges or links between the edge nodes. – exponentially with the average number of services per edge node.
  • 117. 117/37 Two properties given more prominence (weight = 10) than the third (weight = 3).
  • 118. 118/37 MILP and Graph Matching can be Computation Intensive ● But initialization is once per user chain with a given policy. – This procedure does not repeat once initialized, – unless updates are received from the edge network. ● New node with the service offering at the edge. ● An existing node or a service offering fails to respond. ● The number of services in each NSC is typically 5 – 10. – Évora algorithm follows a greedy approach, rather than a typical graph matching.
  • 119. 119/37 Performance and Scalability of Évora Orchestrator Algorithms
  • 120. 120/37
  • 121. 121/37
  • 122. 122/37
  • 123. 123/37 (6) *SD-CPS* ● Software-Defined Cyber-Physical Systems ● CLUSTER’18, SDS’17, and M4IoT’15
  • 124. 124/37 Cyber-Physical System (CPS) ● A system composed of cyber and physical elements. ● Challenges in CPS. – Modeling – Large-scale heterogeneous execution environments. – Decision making: communication and coordination. – Management and orchestration of the intelligent agents.
  • 125. 125/37 Motivation ● An SDS to address the challenges of CPS. Desired Properties in a new CPS framework ● Easy to adopt from current CPS approaches. ● Should not introduce more/new challenges.
  • 126. 126/37 Our Proposal: Software-Defined Cyber-Physical Systems (SD-CPS) ● An SDS framework for CPS workflows at the edge. – CPS workload as edge service workflows. ● A dual (physical and virtual/cyber) execution environment for CPS executions. – Efficient CPS modelling and simulations. – Mitigate the unpredictability of the physical execution environment. ● Resilience for critical flows with a differentiated QoS. – End-to-end delivery guarantees.
  • 129. 129/37 Modeling and Simulating CPS ● Cyberspace to model the smart devices as virtual intelligent agents. ● Mapped interactions between the actors in physical & cyber spaces. ● Incrementally model and load from the controller farm.
  • 130. 130/37 Evaluation Environment ● Edge nodes and service resource requirements – with properties normalized. ● Resource requirement – Negative value – even the smallest node satisfies. – High positive value – higher demand for resource.
  • 131. 131/37 Service Deployment Over the Nodes ● How each service is deployed across nodes. ● How each node hosts several services.
  • 132. 132/37 Parallel Execution of 1 million workflows ● Minimal idling nodes. ● High resource utilization.
  • 133. 133/37 Related Work ● SDN for Heterogeneous Devices. – Sensor OpenFlow: SD-Wireless Sensor Networks. ● Scaling SDN: Clustering SDN controller with Akka. ● OpenDaylight Federation ● Conceptual Data Tree projects. ● SDS for Smart Environments. ● Albatross: Taming challenges of distributed systems
  • 134. 134/37 Key Findings ● Increased resource efficiency using edge workflows. ● An approach to mitigate the design and operations challenges in CPS. ● Benefits of SDN to CPS. – Unified and centralized control. – Improved QoS, management, and resilience. – Reduced repeated effort in modeling.
  • 135. 135/37 (7) *Obidos* ● On-demand Big Data Integration, Distribution, and Orchestration System ● DAPD’19, CoopIS’15, and DMAH’17
  • 136. 136/37 Introduction ● Volume, variety, and distribution of big data are rising. – Structured, semi-structured, unstructured, or ill-formed. ● Integration of data is crucial for data science. – Multiple types of data: Imaging, clinical, and genomic. – Numerous data sources: No shared messaging protocol. – Do we really need to integrate all the data? ● Sharing of integrated data and results for reproducibility.
  • 137. 137/37 Human-in-the-loop On-Demand Data Integration ● Service-based data access through APIs. – Thanks to specifications such as HL7 FHIR. ● The researchers possess domain knowledge. ● Integrate On-Demand. – Avoid eager loading of binary data or its textual metadata. – Use the researcher query as an input in loading data. ● Scalable storage in-house. – Load, integrate, index, and query unstructured data.
  • 138. 138/37 Data Sharing Intra-Organization ● Load data only once per organization. – Bandwidth and storage efficiency.
  • 139. 139/37 Data Sharing Inter-Organization ● Do not duplicate data! – We "own" our interest; not the data. ● Point to the data in the data sources. – Pointers to data like Dropbox Shared Links. ● Avoids outdated duplicate data. ● Easy to maintain. ● APIs – Access the list of research data sets.
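Sharing pointers instead of copies can be illustrated with a small Python sketch. The `Pointer` fields, the granularity labels, and the `resolve` function are assumptions for illustration and do not describe the actual data structures of the replicaset holder. The key property shown is that a pointer is dereferenced at access time, so the consumer always sees the current data in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    source: str       # name of the data source holding the data
    granularity: str  # level in the data hierarchy, e.g. "study" (assumed labels)
    identifier: str   # identifier of the data item within the source


def resolve(replicaset, sources):
    """Dereference each pointer against the data sources at access time."""
    return [sources[p.source][p.identifier] for p in replicaset]


# Hypothetical source content and a shared set of pointers into it.
sources = {"tcia": {"study-1": "imaging metadata v1"}}
replicaset = [Pointer(source="tcia", granularity="study", identifier="study-1")]

before_update = resolve(replicaset, sources)
sources["tcia"]["study-1"] = "imaging metadata v2"  # the source is updated later
after_update = resolve(replicaset, sources)
```

Because only the small `Pointer` records are shared, an update at the source never leaves an outdated duplicate with the consumer.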
  • 140. 140/37 Problems ● How to.. – Load data from several big data sources. ● Avoid repeated loading and near duplicate data. – Integrate disparate data and persist for future accesses. – Share pointers to data internally and externally.
  • 141. 141/37 Our Proposal: Óbidos (On-demand Big Data Integration, Distribution & Orchestration System) ● Define subsets of data that are of interest. – using hierarchical structure of medical data. ● Medical Images (DICOM), Clinical data, .. ● User query → Narrow down the search space.
  • 142. 142/37 Óbidos Approach ● Hybrid of virtual and materialized data integration approaches. – Lazy load of metadata: Load the matching subset of metadata. – Store integrated data and query results → scalable storage. ● Track already loaded data. – Near duplicate detection. – Download only updates (changesets). ● Efficient SQL queries on NoSQL storage. ● Share pointers to the datasets rather than the dataset itself. ● Generic design; implementation for medical research data.
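The combination of lazy loading and tracking of already loaded data can be sketched in Python as follows. The `LazyLoader` class, the dictionary standing in for a remote data source, and the record fields are hypothetical. The sketch covers only exact matches by identifier and omits near-duplicate detection and changesets.

```python
class LazyLoader:
    """Load only the records matching a query, and each record at most once."""

    def __init__(self, remote):
        self.remote = remote  # stand-in for a remote data source (id -> metadata)
        self.loaded = {}      # integrated local store of already loaded records
        self.downloads = 0    # number of records actually fetched from the source

    def query(self, predicate):
        # Stand-in for a search API on the source that returns matching identifiers.
        matching_ids = [rid for rid, rec in self.remote.items() if predicate(rec)]
        for rid in matching_ids:
            if rid not in self.loaded:  # skip records that were loaded earlier
                self.loaded[rid] = self.remote[rid]
                self.downloads += 1
        return [self.loaded[rid] for rid in matching_ids]


remote = {
    "img-1": {"modality": "MR"},
    "img-2": {"modality": "CT"},
    "img-3": {"modality": "MR"},
}
loader = LazyLoader(remote)

first = loader.query(lambda rec: rec["modality"] == "MR")
downloads_after_first = loader.downloads
loader.query(lambda rec: rec["modality"] == "MR")  # repeated query, nothing new to load
downloads_after_repeat = loader.downloads
```

The download counter stays flat on the repeated query, which is the behaviour that keeps load time tied to the data of interest rather than to the total data volume.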
  • 144. 144/37
  • 146. 146/37
  • 147. 147/37 Data Structures of the Replicaset Holder
  • 148. 148/37 Evaluation ● Evaluation Data: – Clinical data and TCIA DICOM imaging collections. ● Benchmark Óbidos against eager and lazy ETL. – Performance of loading and querying data. ● Óbidos (inter- and intra- org) against binary data sharing. – Space/bandwidth efficiency of data sharing.
  • 150. 150/37 Data Load Time: Change in total data volume (Same query and same interest) ● Load time for eager & lazy ETL increases with the total volume. ● Load time for Óbidos remains constant.
  • 151. 151/37 Data Load Time: Change in studies of interest (Same query and constant total data volume) ● Load time for eager and lazy ETL remains constant. ● Load time increases for Óbidos with the interest. – Converges to the load time of lazy ETL.
  • 152. 152/37 Load Time from the Remote Data Sources ● Eager and lazy ETL take much longer – To load more data and metadata over the Internet.
  • 153. 153/37 Query Completion Time for the Integrated Data Repository ● Corresponding data already loaded in Óbidos. ● Indexed scalable NoSQL architecture of Óbidos → Better performance.
  • 154. 154/37 Efficiency in Sharing Medical Research Data ● Replicaset – Pointers of marginal size, yet increases with entries of same granularity.
  • 155. 155/37 Key Findings ● Óbidos offers on-demand service-based big data integration. – Fast and resource-efficient data analysis. – SQL queries over NoSQL data store for the integrated data. – Efficient data sharing without replicating the actual data. Future Work – Consume data from repositories beyond medical domain. ● EUDAT – Óbidos distributed virtual data warehouses. ● Leverage the proximity in data integration and sharing.
  • 156. 156/37 (8) *Mayan-DS* ● Software-Defined Data Services (SDDS) ● CCPE’19 and SDS’18 (Best Paper Award) ● Work-in-progress
  • 157. Introduction ● Data services: Service APIs to big data → Interoperability. ● Related data and services distributed far from each other → Bad performance with scale. ● Chaining of data services. – Composing chains of numerous data services. – Data access → Data cleaning → Data integration. ● How to scale out efficiently? – How to minimize communication overheads?
  • 158. 158/37 Motivation ● Software-Defined Networking (SDN). – A unified controller to the data plane devices. – Brings network awareness to the applications. ● Data services – Make big data executions interoperable. ● Can we bring SDN to the Data Services? – Software-Defined Data Services (SDDS)
  • 159. 159/37 Our Proposal: Software-Defined Data Services (SDDS) ● SDDS as a generic approach for data services. – Extending and leveraging SDN. ● Mayan-DS, an SDDS framework. – Efficient management of data services. – Interoperability and scalability.
  • 161. 161/37 SDDS Approach ● Define all the data operations as interoperable services. ● SDN for distributing data and service executions – Inside a data center (e.g. Software-Defined Data Centers). – Beyond data centers (extend SDN with MOM). ● Optimal placement of data and service execution. – Minimize communication overhead and data movements. ● Execute data service on the best-fit server, until interrupted.
  • 162. 162/37 Efficient Data and Execution Placement {i, j} – related data objects D – datasets of interest n – execution node ξ – spread of the related data objects
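The transcript lists the symbols for this slide but not the placement formula itself. As a rough illustration only, the Python sketch below assumes that the spread seen from a candidate execution node is the sum of its distances to the nodes holding the related data objects, and picks the node that minimizes it. The `best_execution_node` function, the distance table, and this definition of spread are all assumptions and may differ from the ξ used in the thesis.

```python
def best_execution_node(candidates, data_locations, distance):
    """Pick the execution node with the minimal total distance to the related data."""

    def spread(node):
        # Assumed measure: sum of distances from the node to each data object's location.
        return sum(distance[node][location] for location in data_locations)

    return min(candidates, key=spread)


# Hypothetical symmetric distances (e.g. latency in ms) between three nodes.
distance = {
    "n1": {"n1": 0, "n2": 5, "n3": 9},
    "n2": {"n1": 5, "n2": 0, "n3": 3},
    "n3": {"n1": 9, "n2": 3, "n3": 0},
}
nodes = ["n1", "n2", "n3"]

# Two related objects on n3 and one on n1: executing on n3 moves the least data.
skewed_choice = best_execution_node(nodes, ["n1", "n3", "n3"], distance)
# One object each on n1 and n3: the middle node n2 is the best compromise.
balanced_choice = best_execution_node(nodes, ["n1", "n3"], distance)
```

The same selection could be extended with per-object sizes as weights if data volume, rather than object count, drives the communication overhead.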
  • 165. 165/37 Ping Times (ms) Between Two Nodes: Regular Internet vs. Mayan-DS
  • 166. 166/37 Latency: Ping Times of Mayan-DS ● Up to 33% reduction in latency – with a fraction of the path through a direct link. ● 75% or more reduction with significant portion of direct link.
  • 167. 167/37 Key Findings ● Software-Defined Data Services (SDDS) for interoperability and scalability in big data executions. ● Mayan-DS leverages SDN for big data workflows at Internet-scale. ● Limited focus of industrial offerings. – Storage or one or a few specific services. Future Work ● Extend Mayan-DS for edge and IoT/CPS environments.