(R)evolution of the computing continuum - A few challenges

Frederic Desprez
Deputy Scientific Director at Inria
International Symposium on Stabilization, Safety, and Security of Distributed Systems
F. Desprez (INRIA), A. Lebre (IMT Atlantique)
Agenda
• Introduction, context, and research issues
• Some recent challenges/scientific issues addressed by the Stack team
1. How to operate a geo-distributed infrastructure
2. Services placement
3. Decentralized indexing
• Experimental infrastructures
• Conclusions
Why do we need a computing continuum? (Mahadev Satyanarayanan)
Introduction and context
• Huge increase in generated data (2.5 exabytes of new data generated each day)
• More than 50 billion connected devices around the world
• Moving the data from IoT devices to the cloud is an issue
• New applications (time-sensitive, location-aware) with ultra-low latency requirements
• Privacy issues
• Solution: a computing paradigm closer to where the data is generated and used
Impossible!
Edge computing applications
• Autonomous driving
• Security apps
• IoT applications
• Location services
• Network functions
• Industry 4.0
• Edge intelligence
Several ‘flavors’ of distributed computing
• Cloud computing
• Ubiquitous, on-demand access to shared computing resources. Virtualization. Elasticity. IaaS, PaaS, SaaS.
• Fog computing
• « Horizontal system-level architecture that distributes computing, storage, control, and networking closer to the
users along a cloud-to-thing continuum » (OpenFog consortium).
• Mobile computing
• Mobile devices, resource-constrained devices, connected through Bluetooth, Wi-Fi, ZigBee, …
• Mobile cloud computing (MCC)
• An infrastructure where both the data storage and data processing occur outside of the mobile device, bringing
mobile computing applications to not just smartphone users but a much broader range of mobile subscribers.
• Mobile and ad hoc cloud computing
• Mobile devices form an ad hoc network whose topology is highly dynamic; it must accommodate devices that continuously join or leave the network.
All one needs to know about fog computing and related edge computing paradigms: A complete survey, A. Yousefpour et al., Journal of Syst. Arch., Vol 98, Sep. 2019
Several ‘flavors’ of distributed computing, contd
• Edge computing
• « Computation done at the edge of the network through small data centers that are close to users »
(OpenEdge Computing).
• Multi-access Edge Computing (MEC)
• « A platform that provides IT and cloud-computing capabilities within the Radio Access Network (RAN) in
4G and 5G, in close proximity to mobile subscribers » (ETSI).
• Cloudlet computing
• Trusted resource-rich computer or a cluster of computers with strong connection to the Internet that is
utilized by nearby mobile devices (Carnegie Mellon University)
• Mist computing
• Dispersed computing at the extreme edge (the IoT devices themselves).
Some common characteristics
• Low Latency
• Nodes are closer to the end users and can offer a faster analysis and response to the data generated and
requested by the users
• Geographic Distribution
• Geo-distributed deployment and management
• Heterogeneity
• Collection and processing of information obtained from different sources and collected by several means of network communication
• Interoperability and Federation
• Resources must be able to interoperate with each other, and services and applications must be federated across domains
• Real-Time Interactions
• Services and applications involve real-time interaction, not just batch processing
• Scalability
• Fast detection of variation in workload’s response time and of changes in network and device conditions,
supporting elasticity of resources.
Orchestration in Fog Computing: A Comprehensive Survey, B. Costa et al., ACM Computing Surveys, Vol. 55, No. 2, Jan. 2022.
Some research issues
Application lifecycle management (initial deployment, configuration, reconfiguration,
maintenance)
• Abstracting the description of the whole application structure, globally optimize the resources used with respect to multi-criteria
objectives (price, deadline, performance, energy, etc.), models and associated languages to describe applications, their
objective functions, placement and scheduling algorithms supporting system and application-level criteria, ...
Infrastructure management
• Virtualization (hyper-converged 2.0 architecture, complexity, heterogeneity, dynamicity, scaling and locality), storage (trade-off between moving computation and moving data, files, BLOBs, key/value systems, geo-distributed graph databases, …), and administration (intelligent orchestrator, geo-distributed scale, automatic adaptation to users' needs, ...)
Hardware
• Trusted hardware solutions, architectural support for high level features, energy reduction solutions, new accelerators, …
Security
• Vulnerabilities in VMs, hypervisors and orchestrators, virtual network technologies (SDN, NFV), programming or access
interfaces, adapting security policies to a more complex environment, ...
Energy
• End-to-end energy analysis and management of large-scale hierarchical Cloud/Edge/Fog infrastructures on processing, network and storage aspects, trade-offs between energy efficiency and other performance metrics in virtualized infrastructures, eco-design of digital applications and services, ...
• …
CLOUDLET/FOG/EDGE/CLOUD-TO-IOT/CONTINUUM COMPUTING
[Figure: the computing continuum. Labels: Cloud Computing, Micro/Nano DC, Edge Frontier, Extreme Edge Frontier, domestic, enterprise, and hybrid networks, wired and wireless links. Latencies: intra-DC < 10 ms, inter micro DCs 50 ms to 100 ms, cloud > 100 ms.]
CHALLENGE 1: HOW TO GEO-DISTRIBUTE
CLOUD APPLICATIONS TO THE EDGE
De facto open-source standard to administer/virtualize/use the resources of one DC
Scalability?
Latency/throughput impact?
Network partitioning issues?
…
From LAN to WAN? ⇒
Bring Cloud applications to the Edge
INITIATING THE DEBATE WITH OPENSTACK (2016-2021)
WAN-wide
Collaborative?
13 million LOC, 186 subservices
Designed for a single location
OPENSTACK (THE DEVIL IS IN THE DETAILS)
[Figure: Nova and Glance instances; Nova issues a GET to Glance for the image foo.]
VM a = openstack server create --image foo
COLLABORATION: ADDITIONAL PIECES OF CODE ARE REQUIRED
• Collaboration code is required in every service
• A broker per service must be implemented
• DB values might be location-dependent
Geo-distributed principles
Collaboration kinds
A SERVICE DEDICATED TO ON-DEMAND COLLABORATIONS
The scope language: the user (Andy) defines the scope of a request in the CLI.
The scope specifies where the request applies.
openstack server create my-vm --flavor m1.tiny --image cirros.uec --scope {compute: Nantes, image: Paris}
OpenStack Summit Berlin - Nov 2018
Hacking the Edge hosted by Open Telekom Cloud
• A complete model to enrich the scope description with site compositions (e.g., AND, OR)
• List VMs on Nantes and Paris
openstack server list --scope {compute:Nantes&Paris}
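To make the scope syntax above concrete, here is a minimal Python sketch that turns a scope string into a mapping from service to sites. It is an illustration only, not the Cheops parser: the function name `parse_scope` is hypothetical, and it handles only the two forms shown on the slides (comma-separated `service: site` pairs and the `&` composition). The OR composition is not shown in the source, so it is left out.

```python
def parse_scope(scope: str) -> dict[str, list[str]]:
    """Parse a scope such as '{compute: Nantes, image: Paris}'.

    Hypothetical helper: returns {service: [site, ...]}, where several
    sites joined by '&' mean the request applies to all of them.
    """
    body = scope.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError(f"scope must be wrapped in braces: {scope!r}")
    result: dict[str, list[str]] = {}
    for part in body[1:-1].split(","):
        if not part.strip():
            continue  # tolerate a trailing comma
        service, sep, sites = part.partition(":")
        site_list = [s.strip() for s in sites.split("&")]
        if not sep or not service.strip() or not all(site_list):
            raise ValueError(f"malformed scope entry: {part!r}")
        result[service.strip()] = site_list
    return result


if __name__ == "__main__":
    print(parse_scope("{compute: Nantes, image: Paris}"))
    print(parse_scope("{compute:Nantes&Paris}"))
```

With such a mapping, a request can be routed per service: the compute part goes to the listed compute sites, the image part to the listed image sites.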
https://gitlab.inria.fr/discovery/cheops (Work in Progress)
CHEOPS AS A BUILDING BLOCK TO DEAL WITH GEO-DISTRIBUTION
• Expose consistency policies at the user level (extend the scope syntax)
• Manage the dependencies between resources
• Notion of replication set: manage a fixed pool of resources with an automatic control loop
(implemented in a geo-distributed way at the Cheops level).
Replication overview/challenge
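The replication-set idea (a fixed pool of resources kept alive by a control loop) can be sketched as a single reconciliation step. This is a simplified, centralized illustration under stated assumptions, not how Cheops implements it: the function `reconcile`, its parameters, and the notion of a `reachable` site list are all hypothetical.

```python
def reconcile(desired: int, replicas: dict[str, bool], reachable: list[str]) -> list[str]:
    """One pass of a hypothetical control loop for a replication set.

    desired:   number of replicas the set must maintain
    replicas:  site -> True if the replica hosted there is alive
    reachable: sites that can currently be contacted
    Returns the sites where a new replica should be created.
    """
    alive = [site for site, ok in replicas.items() if ok]
    missing = desired - len(alive)
    if missing <= 0:
        return []  # the pool is already at (or above) its target size
    # Only reachable sites that do not already host a live replica qualify.
    free = [site for site in reachable if site not in alive]
    return free[:missing]


if __name__ == "__main__":
    # Paris has failed or is partitioned away; Rennes is the only free site.
    print(reconcile(2, {"Nantes": True, "Paris": False}, ["Nantes", "Rennes"]))
```

Running this step periodically gives the "automatic control loop" behavior; doing so without a central coordinator, and across network partitions, is the part the slides identify as the challenge.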
Manage partition issues using appropriate replication/aggregation policies
Cross overview/challenge
A bit more complicated than it looks…
Delavergne, Marie; Antony, Geo Johns; Lebre, Adrien
Cheops, a service to blow away Cloud applications to the Edge, To appear in ICSOC 2022
TOWARD A GENERALISATION OF THE SERVICE (OpenStack/Kubernetes/…)
CHALLENGE 2: SERVICE PLACEMENT
Service placement problems
How to assign IoT applications to the computing nodes (Fog nodes) distributed in a Fog environment?
• Different kinds of applications
• Monolithic service, data pipeline, set of inter-dependent components, Directed Acyclic Graphs (DAGs)
• Several constraints
• Computing and networking resources are heterogeneous, dynamic, and not always available; a service cannot be processed everywhere
• Different approaches
• Centralized or distributed approaches
• Online or offline placement
• Static or dynamic
• Mobility support
• Different performance criteria
• Execution time, quality of service, latency, energy consumption
• Problem formulations
• Mathematical programming: Integer Linear Programming (ILP), Integer Nonlinear Programming (INLP), Mixed Integer Linear Programming (MILP), Mixed Integer Nonlinear Programming (MINLP), Mixed Integer Quadratic Programming (MIQP)
• Constraint programming, Markov decision process, stochastic optimization, potential games, …
An overview of service placement problems in Fog and Edge Computing. F. Ait-Salaht, F. Desprez, and A. Lebre. ACM Computing Surveys, Vol. 53, Issue 3, May 2021
Service Placement Problem using Constraint programming
and Choco solver
• Goals
• Elaborate a generic and easy to upgrade model
• Define a new formulation of the placement problem considering a general definition of service and
infrastructure network through graphs using constraint programming
Service Placement in Fog Computing Using Constraint Programming. F. Ait-Salaht, F. Desprez, A. Lebre, C. Prud’homme and M. Abderrahim. IEEE
System model and problem formulation
Infrastructure
• A directed graph G = <V,E> represents the network
• V: set of vertices or nodes (servers)
• E: set of edges or arcs (connections)
• Each node defines CPU and RAM capacities
• Each arc defines a latency and a bandwidth capacity
Application
• An application is an ordered set of components
• A component requires CPU/RAM to work
• A component can send data (bandwidth, latency)
• Some components are fixed (e.g., cameras)
Placement (mapping)
Assign services (each component and each edge) to the network infrastructure (nodes and links) such that:
• The CPU capacity of each node is respected
• The same goes for the RAM capacity
• The bandwidth capacity of each arc is respected
• Latencies are satisfied
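The placement conditions above can be checked on a toy instance with an exhaustive search. This is a minimal sketch, not the authors' model: all node names, capacities, and application figures are invented for illustration, arcs are treated as usable in both directions, and each application link is mapped to a single direct arc (or to no arc when both components share a node).

```python
from itertools import product

# Hypothetical infrastructure: node -> (cpu, ram); arc -> (latency_ms, bandwidth)
NODES = {"n1": (4, 8), "n2": (2, 4), "n3": (8, 16)}
ARCS = {("n1", "n2"): (5, 100), ("n2", "n3"): (20, 50), ("n1", "n3"): (30, 100)}

# Hypothetical application: component -> (cpu, ram); link -> (max_latency_ms, bandwidth)
COMPONENTS = {"camera": (1, 1), "detector": (4, 4), "storage": (2, 8)}
LINKS = {("camera", "detector"): (10, 20), ("detector", "storage"): (40, 30)}
FIXED = {"camera": "n2"}  # fixed components cannot be moved


def evaluate(placement):
    """Return the total link latency of a placement, or None if infeasible."""
    cpu_used = {n: 0 for n in NODES}
    ram_used = {n: 0 for n in NODES}
    for comp, node in placement.items():
        cpu, ram = COMPONENTS[comp]
        cpu_used[node] += cpu
        ram_used[node] += ram
    for n, (cpu_cap, ram_cap) in NODES.items():
        if cpu_used[n] > cpu_cap or ram_used[n] > ram_cap:
            return None  # CPU or RAM capacity violated
    bw_used = {a: 0 for a in ARCS}
    total = 0
    for (a, b), (max_lat, bw) in LINKS.items():
        u, v = placement[a], placement[b]
        if u == v:
            continue  # co-located components use no network resource
        key = (u, v) if (u, v) in ARCS else (v, u)
        if key not in ARCS:
            return None  # no direct arc between the two hosting nodes
        lat, cap = ARCS[key]
        bw_used[key] += bw
        if lat > max_lat or bw_used[key] > cap:
            return None  # latency or bandwidth requirement violated
        total += lat
    return total


def best_placement():
    """Enumerate every mapping and keep the feasible one with minimal latency."""
    names = list(COMPONENTS)
    best = None
    for nodes in product(NODES, repeat=len(names)):
        placement = dict(zip(names, nodes))
        if any(placement[c] != n for c, n in FIXED.items()):
            continue
        cost = evaluate(placement)
        if cost is not None and (best is None or cost < best[0]):
            best = (cost, placement)
    return best


if __name__ == "__main__":
    print(best_placement())
```

Exhaustive enumeration grows exponentially with the number of components, which is why a dedicated solver is needed for realistic instances.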
Constraint Programming model (CP)
What is CP?
• CP stands for Constraint Programming
• CP is a general-purpose implementation of Mathematical Programming (MP)
• MP is the theoretical study of optimization problems and resolution techniques
• CP aims at describing real combinatorial problems in the form of Constraint Satisfaction Problems and solving them with Constraint Programming techniques
• The problem is solved by alternating constraint filtering algorithms with a search mechanism
• Modeling steps (3)
• Declare variables and their domains
• Find the relations between them
• Declare an objective function, if any
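The alternation between filtering and search can be sketched with a small backtracking solver that uses forward checking. This is a teaching sketch in Python under stated assumptions, not the Choco solver or its API: the functions `solve` and `filter_domains` are hypothetical, constraints are limited to binary predicates, and the toy problem has no objective function.

```python
def filter_domains(domains, constraints, var, value, assignment):
    """Filtering step: after var := value, prune the unassigned neighbors.

    Returns the reduced domains, or None if some domain becomes empty.
    """
    new = {v: set(d) for v, d in domains.items()}
    new[var] = {value}
    for (x, y), ok in constraints.items():
        if x == var and y not in assignment:
            new[y] = {vy for vy in new[y] if ok(value, vy)}
            if not new[y]:
                return None
        elif y == var and x not in assignment:
            new[x] = {vx for vx in new[x] if ok(vx, value)}
            if not new[x]:
                return None
    return new


def solve(domains, constraints, assignment=None):
    """Search step: depth-first search, smallest domain first."""
    assignment = assignment or {}
    if len(assignment) == len(domains):
        return assignment
    var = min((v for v in domains if v not in assignment),
              key=lambda v: len(domains[v]))
    for value in sorted(domains[var]):
        pruned = filter_domains(domains, constraints, var, value, assignment)
        if pruned is not None:
            result = solve(pruned, constraints, {**assignment, var: value})
            if result is not None:
                return result
    return None  # every value failed: backtrack


if __name__ == "__main__":
    # Step 1: declare variables and their domains.
    domains = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}}
    # Step 2: declare the relations between them.
    constraints = {("x", "y"): lambda a, b: a < b,
                   ("y", "z"): lambda a, b: a < b}
    # Step 3: no objective here, so any satisfying assignment is accepted.
    print(solve(domains, constraints))
```

Production solvers apply much stronger filtering (global constraints, propagation to a fixed point) and support optimization, but the filter-then-branch structure is the same.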
Constraint Programming model (CP)
• Variables and domains
• Constraints on nodes
• Constraints on arcs
• Constraints between nodes and arcs
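The formulas of these slides did not survive extraction. As a reading aid only, one plausible way to write the three families of constraints, consistent with the system model described earlier, is given below. The notation is an assumption and not the authors' exact model: x_c is the node hosting component c, y_{cc'} is the arc carrying the application link (c, c'), and each link is assumed to use a single arc.

```latex
% Assumed notation (not taken from the slides):
%   x_c \in V      : node hosting component c
%   y_{cc'} \in E  : arc carrying application link (c, c') when x_c \neq x_{c'}
\begin{align}
  % Constraints on nodes: CPU and RAM capacities
  \sum_{c \,:\, x_c = n} \mathrm{cpu}_c &\le \mathrm{CPU}_n && \forall n \in V \\
  \sum_{c \,:\, x_c = n} \mathrm{ram}_c &\le \mathrm{RAM}_n && \forall n \in V \\
  % Constraints on arcs: bandwidth capacity and latency requirement
  \sum_{(c,c') \,:\, y_{cc'} = e} \mathrm{bw}_{cc'} &\le \mathrm{BW}_e && \forall e \in E \\
  \mathrm{lat}_{y_{cc'}} &\le L_{cc'} && \forall (c,c') \text{ with } x_c \neq x_{c'} \\
  % Constraints between nodes and arcs: the arc must join the hosting nodes
  y_{cc'} = (u, v) &\Rightarrow x_c = u \wedge x_{c'} = v
\end{align}
```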
Experiment 1
• Infrastructure: 91 fog nodes, 86 sensors
• Application: Smart Bell
• Requirements
‣ Resources: CPU, RAM, DISK
‣ Networking: Latency and Bandwidth
‣ Locality
• Objective
‣ Minimize average latency
Implementation of the model on the Choco solver (Free Open-Source Java library dedicated to Constraint Program
Experiment 2
• Infrastructure: Greek Forthnet topology (60 PoPs, 59 links)
• Applications: (a) storage application, (b) Smart Bell application, and (c) face recognition application
[Figure: results for G with (a) 120 nodes, (b) 300 nodes, (c) 600 nodes, and (d) 1200 nodes]
CHALLENGE 3: THE INDEXING PROBLEM
Where is the content I’m looking for?
Locating the closest replica of a specific content requires indexing every live replica along with
its location
Existing solutions
• Remote services (centralized index, DHT)
• In contradiction with the objectives of Edge infrastructures: the indexing information might be stored on a node that is far away (or even unreachable) while the replica could be in the vicinity
• Broadcast
• Maintaining such an index at every node would prove overly costly in terms of memory and traffic (it does not confine the traffic)
• Epidemic propagation
Epidemic Propagation and Dynamic logical partitioning
Challenges
How to maintain such a logical partitioning in a dynamic environment
where…
• Nodes can ADD or DELETE content any time (no synchronization)
• Nodes can join or leave the system at any time (without any warning)
…while limiting the scope of transferred information as much as possible
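To illustrate what a logical partitioning looks like, the sketch below floods advertisements from every replica holder and lets each node keep only its closest source, forwarding a message only when it improves the receiver's knowledge. It covers the static case only and is not the algorithm of the cited paper: hop count is assumed as the distance, and the function `partition` and the example topology are hypothetical. Handling concurrent ADD and DELETE operations and node churn is precisely what this naive scheme does not address.

```python
from collections import deque


def partition(neighbors, sources):
    """Assign every node to its closest replica holder (hop-count distance).

    neighbors: {node: [adjacent nodes]}
    sources:   nodes that hold a replica of the content
    Returns ({node: (distance, closest_source)}, number_of_messages_sent).
    """
    best = {s: (0, s) for s in sources}
    queue = deque(sources)
    messages = 0
    while queue:
        node = queue.popleft()
        dist, src = best[node]
        for nb in neighbors[node]:
            messages += 1  # one advertisement sent to this neighbor
            # A neighbor adopts (and later forwards) only a strictly better offer,
            # so traffic stops at the border between two partitions.
            if nb not in best or dist + 1 < best[nb][0]:
                best[nb] = (dist + 1, src)
                queue.append(nb)
    return best, messages


if __name__ == "__main__":
    # A chain a - b - c - d - e with replicas on both ends.
    chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
             "d": ["c", "e"], "e": ["d"]}
    index, sent = partition(chain, ["a", "e"])
    print(index, sent)
```

Each node ends up storing a single entry per content (its closest replica), which is what keeps memory and traffic bounded compared with broadcasting a full index.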
Propagating messages naively is not sufficient to guarantee consistent partit
Lock Down the Traffic of Decentralized Content Indexing at the Edge, B. Nedelec et al., ICA3PP 2022
A preliminary step toward a complete solution
• Definitions of the properties that guarantee
decentralized consistent partitioning in dynamic
infrastructures.
• Demonstration that concurrent creation and
removal of partitions may impair the propagation
of control information
• Proposal of a first algorithm solving this dynamic
partitioning problem (and its evaluation by
simulations)
RESEARCH INFRASTRUCTURES
Experimental infrastructures
SILECS/SLICES: Super Infrastructure for Large-Scale Experimental Computer Science
• The Discipline of Computing: An Experimental Science
• Studied objects are more and more complex (Hardware, Systems, Networks, Programs, Protocols, Data,
Algorithms, …)
• A good experiment should fulfill the following properties
• Reproducibility: must give the same result with the same input
• Extensibility: must target possible comparisons with other works and extensions (more/other processors,
larger data sets, different architectures)
• Applicability: must define realistic parameters and must allow for an easy calibration
• “Revisability”: when an implementation does not perform as expected, must help to identify the reasons
• ACM Artifact Review and Badging
SILECS/Grid’5000
• Testbed for research on distributed systems
• Born in 2003 from the observation that we need a better and larger testbed
• HPC, Grids, P2P, and now Cloud computing, and BigData systems
• A complete access to the nodes’ hardware in an exclusive mode (from one node to
the whole infrastructure)
• Dedicated network (RENATER)
• Reconfigurable: nodes with Kadeploy and network with KaVLAN
• Current status
• 8 sites, 36 clusters, 838 nodes, 15116 cores
• Memory: ~100 TiB RAM + 6.0 TiB PMEM, Storage: 1.42 PB (1515 SSDs and 953
HDDs on nodes), 617.0 TFLOPS (excluding GPUs)
• Diverse technologies/resources (Intel, AMD, Myrinet, Infiniband, two GPU clusters,
energy probes)
• Some experiment examples
• In Situ analytics, Big Data Management,
• HPC Programming approaches, Batch scheduler optimization
• Network modeling and simulation
• Energy consumption evaluation
• Large virtual machines deployments
SILECS/FIT
Providing Internet players access to a
variety of fixed and mobile
technologies and services, thus
accelerating the design of advanced
technologies for the Future Internet
Experiments
• Discovering resources from their description
• Reconfiguring the testbed to meet experimental
needs
• Monitoring experiments, extracting and
analyzing data
• Controlling experiments: API
Target infrastructure
Distributed Storage for a Fog/Edge infrastructure
based on a P2P and a Scale-Out NAS
FogIoT Orchestrator: an Orchestration System for IoT
Applications in Fog Environment
European dimension
ESFRI project/infrastructure since 2021
CONCLUSIONS
Conclusion
Disconnection is the norm
• High latency, unreliable connections,
• Logical partitioning (Edge areas/zones)
A (r)evolution of distributed systems and networks?
• Algorithms and (distributed) system building blocks should be revised to satisfy geo-distributed constraints
• Decentralized vs collaborative (e.g. DHT, network ASes)
Questions / THANKS
Post-scriptum
• We are looking for students, PhD candidates, postdocs, engineers, researchers, associate professors (AI/infrastructure experts, this is trendy ;-)), use cases, funding, collaborations…
• We propose … a lot of fun and work!
http://stack.inria.fr
James Anderson156 vues
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online par ShapeBlue
KVM Security Groups Under the Hood - Wido den Hollander - Your.OnlineKVM Security Groups Under the Hood - Wido den Hollander - Your.Online
KVM Security Groups Under the Hood - Wido den Hollander - Your.Online
ShapeBlue181 vues
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue par ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue176 vues
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ... par ShapeBlue
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
Import Export Virtual Machine for KVM Hypervisor - Ayush Pandey - University ...
ShapeBlue79 vues
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O... par ShapeBlue
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
ShapeBlue88 vues
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue par ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
ShapeBlue93 vues
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ... par ShapeBlue
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
Live Demo Showcase: Unveiling Dell PowerFlex’s IaaS Capabilities with Apache ...
ShapeBlue85 vues
Confidence in CloudStack - Aron Wagner, Nathan Gleason - Americ par ShapeBlue
Confidence in CloudStack - Aron Wagner, Nathan Gleason - AmericConfidence in CloudStack - Aron Wagner, Nathan Gleason - Americ
Confidence in CloudStack - Aron Wagner, Nathan Gleason - Americ
ShapeBlue88 vues
NTGapps NTG LowCode Platform par Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu365 vues
The Role of Patterns in the Era of Large Language Models par Yunyao Li
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language Models
Yunyao Li80 vues
Data Integrity for Banking and Financial Services par Precisely
Data Integrity for Banking and Financial ServicesData Integrity for Banking and Financial Services
Data Integrity for Banking and Financial Services
Precisely78 vues
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P... par ShapeBlue
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...
Developments to CloudStack’s SDN ecosystem: Integration with VMWare NSX 4 - P...
ShapeBlue154 vues
Igniting Next Level Productivity with AI-Infused Data Integration Workflows par Safe Software
Igniting Next Level Productivity with AI-Infused Data Integration Workflows Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Safe Software385 vues
Business Analyst Series 2023 - Week 4 Session 7 par DianaGray10
Business Analyst Series 2023 -  Week 4 Session 7Business Analyst Series 2023 -  Week 4 Session 7
Business Analyst Series 2023 - Week 4 Session 7
DianaGray10126 vues

(R)evolution of the computing continuum - A few challenges

  • 1. (R)evolution of the computing continuum A few challenges… International Symposium on Stabilization, Safety, and Security of Distributed Systems F. Desprez (INRIA), A. Lebre (IMT Atlantique)
  • 2. Agenda • Introduction, context, and research issues • Some recent challenges/scientific issues addressed by the Stack team 1. How to operate a geo-distributed infrastructure 2. Services placement 3. Decentralized indexation • Experimental infrastructures • Conclusions
  • 3. Why do we need a computing continuum ? Mahadev Satyanarayanan
  • 4. Introduction and context • Huge increase of data generated (2.5 exabytes of new data generated each day) • More than 50 billion connected devices around the world • Moving the data from IoT devices to the cloud is an issue • New applications (time-sensitive, location-aware) with ultra-low latency requirements • Privacy issues • Solution: A computing paradigm closer to where the data is generated and used Impossible !
  • 5. Edge computing applications • Autonomous driving • Security apps • IoT applications • Location services • Network functions • Industry 4.0 • Edge intelligence
  • 6. Several ‘flavors’ of distributed computing • Cloud computing • Ubiquitous, on-demand access to shared computing resources. Virtualization. Elasticity. IaaS, PaaS, SaaS. • Fog computing • « Horizontal system-level architecture that distributes computing, storage, control, and networking closer to the users along a cloud-to-thing continuum » (OpenFog consortium). • Mobile computing • Mobile devices, resource-constrained devices, connected through Bluetooth, Wi-Fi, ZigBee, … • Mobile cloud computing (MCC) • An infrastructure where both the data storage and data processing occur outside of the mobile device, bringing mobile computing applications to not just smartphone users but a much broader range of mobile subscribers. • Mobile and ad hoc cloud computing • Mobile devices in an ad hoc mobile network form a highly dynamic network topology; the network formed by the mobile devices is highly dynamic and must accommodate devices that continuously join or leave the network. All one needs to know about fog computing and related edge computing paradigms: A complete survey, A. Yousefpour et al., Journal of Syst. Arch., Vol 98, Sep. 2019
  • 7. Several ‘flavors’ of distributed computing, contd • Edge computing • « Computation done at the edge of the network through small data centers that are close to users » (OpenEdge Computing). • Multi-access Edge Computing (MEC) • « A platform that provides IT and cloud-computing capabilities within the Radio Access Network (RAN) in 4G and 5G, in close proximity to mobile subscribers » (ETSI). • Cloudlet computing • Trusted resource-rich computer or a cluster of computers with strong connection to the Internet that is utilized by nearby mobile devices (Carnegie Mellon University) • Mist computing • Dispersed computing at the extreme edge (the IoT devices themselves). All one needs to know about fog computing and related edge computing paradigms: A complete survey, A. Yousefpour et al., Journal of Syst. Arch., Vol 98, Sep. 2019
  • 8. Some common characteristics • Low Latency • Nodes are closer to the end users and can offer a faster analysis and response to the data generated and requested by the users • Geographic Distribution • Geo-distributed deployment and management, • Heterogeneity • Collection and processing of information obtained from different sources and collected by several means of network communication, • Interoperability and Federation • Resources must be able to interoperate with each other and services and applications must be federated across domains, • Real-Time Interactions • Services and applications involve real-time interaction, not just batch processing, • Scalability • Fast detection of variation in workload’s response time and of changes in network and device conditions, supporting elasticity of resources. Orchestration in Fog Computing: A Comprehensive Survey, B. Costa et al., ACM Computing Surveys, Vol. 55, No. 2, Jan. 2022.
  • 9. Some research issues Application lifecycle management (initial deployment, configuration, reconfiguration, maintenance) • Abstracting the description of the whole application structure, globally optimizing the resources used with respect to multi-criteria objectives (price, deadline, performance, energy, etc.), models and associated languages to describe applications, their objective functions, placement and scheduling algorithms supporting system and application-level criteria, ... Infrastructure management • Virtualization (hyper-converged 2.0 architecture, complexity, heterogeneity, dynamicity, scaling and locality), storage (compromise between moving computation vs. data, files, BLOB, key/value systems, geo-distributed graph database, …), and administration (intelligent orchestrator, geo-distributed scale, automatic adaptation to users' needs, ...) Hardware • Trusted hardware solutions, architectural support for high-level features, energy reduction solutions, new accelerators, … Security • Vulnerabilities in VMs, hypervisors and orchestrators, virtual network technologies (SDN, NFV), programming or access interfaces, adapting security policies to a more complex environment, ... Energy • End-to-end energy analysis and management of large-scale hierarchical Cloud/Edge/Fog infrastructures on processing, network and storage aspects, trade-offs between energy efficiency and other performance metrics in virtualized infrastructures, eco-design of digital applications and services, ... • …
  • 10. CLOUDLET/FoG/Edge/CLOUD-To-IOT/CONTINUUM Computing Inter Micro DCs latency [50ms-100ms] Edge Frontier Edge Frontier Extreme Edge Frontier Domestic network Enterprise network Wired link Wireless link Cloud Latency > 100ms Cloud Computing Micro/Nano DC Intra DC latency < 10ms Hybrid network
  • 11. CHALLENGE 1: HOW TO GEO-DISTRIBUTE CLOUD APPLICATIONS TO THE EDGE
  • 12. De facto open-source standard to administer/virtualize/use resources of one DC Scalability? Latency/throughput impact? Network partitioning issues? … From LAN to WAN? ⇒ Bring Cloud applications to the Edge INITIATING THE DEBATE WITH OPENSTACK (2016-2021)
  • 13. Inter Micro DCs latency [50ms-100ms] Edge Frontier Edge Frontier Extreme Edge Frontier Domestic network Enterprise network Wired link Wireless link Cloud Latency > 100ms Cloud Computing Micro/Nano DC Intra DC latency < 10ms Hybrid network WANWIDE Collaborative? Bring Cloud applications to the Edge INITIATING THE DEBATE WITH OPENSTACK (2016-2021)
  • 14. 13 million LOCs, 186 subservices Designed for a single location OPENSTACK (THE DEVIL IN DETAILS)
  • 15. NOVA GLANCE foo GET foo NOVA GLANCE NOVA GLANCE VM a = openstack server create --image foo Bring Cloud applications to the Edge COLLABORATION: ADDITIONAL PIECES OF CODE ARE REQUIRED
  • 16. Collaboration code is required in every Service A broker per service must be implemented DB values might be location dependent Bring Cloud applications to the Edge COLLABORATION: ADDITIONAL PIECES OF CODE ARE REQUIRED
  • 17. Geo-distributed principles Collaborations kinds Bring Cloud applications to the Edge A SERVICE DEDICATED TO ON DEMAND COLLABORATIONS
  • 18. The SCOPE lang: Andy defines the scope of the request into the CLI. The scope specifies where the request applies. Bring Cloud applications to the Edge A SERVICE DEDICATED TO ON DEMAND COLLABORATIONS
  • 19. openstack server create my-vm --flavor m1.tiny --image cirros.uec --scope {compute: Nantes, image: Paris} OpenStack Summit Berlin - Nov 2018 Hacking the Edge hosted by Open Telekom Cloud • A complete model in order to enhance the scope description with site compositions (e.g., AND, OR) • List VMs on Nantes and Paris openstack server list --scope {compute:Nantes&Paris} Bring Cloud applications to the Edge A SERVICE DEDICATED TO ON DEMAND COLLABORATIONS
  • 20. https://gitlab.inria.fr/discovery/cheops (Work in Progress) Bring Cloud applications to the Edge A CHEOPS AS A BUILDING BLOCK TO DEAL WITH GEO-DISTRIBUTION
  • 21. • Expose consistency policies at the user level (extend the scope syntax) • Manage the dependencies between resources • Notion of replication set: manage a fixed pool of resources with an automatic control loop (implemented in a geo-distributed way at the Cheops level). Replication overview/challenge Bring Cloud applications to the Edge A CHEOPS AS A BUILDING BLOCK TO DEAL WITH GEO-DISTRIBUTION
  • 22. Manage partition issues using appropriate replication/aggregation policies Cross overview/challenge Bring Cloud applications to the Edge A CHEOPS AS A BUILDING BLOCK TO DEAL WITH GEO-DISTRIBUTION
  • 23. A bit more complicated than it looks… Delavergne, Marie; Antony, Geo Johns; Lebre, Adrien Cheops, a service to blow away Cloud applications to the Edge, To appear in ICSOC 2022 Bring Cloud applications to the Edge TOWARD A GENERALISATION OF THE SERVICE (OpenStack/Kubernetes/…)
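The scope expressions shown on slide 19 (`{compute: Nantes, image: Paris}` and `{compute:Nantes&Paris}`) can be illustrated with a small parser. This is only a sketch written for this transcript, not code from Cheops: the function name `parse_scope` and the returned dictionary shape are assumptions, and only the `&` composition that appears on the slide is handled (the slide mentions OR as well, but does not show its syntax).

```python
from typing import Dict, List


def parse_scope(text: str) -> Dict[str, List[str]]:
    """Parse '{service: Site[&Site...], ...}' into {service: [sites]}.

    Hypothetical helper: the real Cheops grammar may differ.
    """
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError(f"scope must be wrapped in braces: {text!r}")
    scope: Dict[str, List[str]] = {}
    for entry in body[1:-1].split(","):
        if not entry.strip():
            continue  # tolerate an empty scope or a trailing comma
        service, sep, sites = entry.partition(":")
        site_list = [s.strip() for s in sites.split("&")]
        if not sep or not service.strip() or not all(site_list):
            raise ValueError(f"malformed scope entry: {entry!r}")
        scope[service.strip()] = site_list
    return scope


if __name__ == "__main__":
    # Create a VM on Nantes using an image stored in Paris.
    print(parse_scope("{compute: Nantes, image: Paris}"))
    # List VMs on both Nantes and Paris.
    print(parse_scope("{compute:Nantes&Paris}"))
```

A proxy in front of each service could use such a mapping to decide which site receives each part of a request; how Cheops actually routes requests is not described in the slides.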
  • 24. CHALLENGE 2: SERVICE PLACEMENT
  • 25. Service placement problems How to assign the IoT applications to computing nodes (Fog nodes) which are distributed in a Fog environment? • Different kinds of applications • Monolithic service, data pipeline, set of inter-dependent components, Directed Acyclic Graphs (DAGs) • Several constraints • Computing and networking resources are heterogeneous and dynamic, Computing and network resources are not always available, Service cannot be processed everywhere • Different approaches • Centralized or distributed approaches • Online or offline placement • Static or dynamic • Mobility support • Different performance criteria • Execution time, quality of service, latency, energy consumption • Problem formulations • Linear programming: Integer Linear Programming (ILP), Integer Nonlinear Programming (INLP), Mixed Integer Linear Programming (MILP), Mixed Integer Nonlinear Programming (MINLP), Mixed Integer Quadratic Programming (MIQP) • Constraint programming, Markov decision process, stochastic optimization, potential games, … An overview of service placement problems in Fog and Edge Computing. F. Ait-Salaht, F. Desprez, and A. Lebre. ACM Computing Surveys, Vol. 53, Issue 3, May 2021
  • 26. Service Placement Problem using Constraint programming and Choco solver • Goals • Elaborate a generic and easy to upgrade model • Define a new formulation of the placement problem considering a general definition of service and infrastructure network through graphs using constraint programming Service Placement in Fog Computing Using Constraint Programming. F. Ait-Salaht, F. Desprez, A. Lebre, C. Prud’homme and M. Abderrahim. IEEE
  • 27. System model and problem formulation • Infrastructure • A directed graph G = <V,E> represents the network • V: set of vertices or nodes (servers) • E: set of edges or arcs (connections) • Each node defines CPU and RAM capacities • Each arc defines a latency and a bandwidth capacity • Application • An application is an ordered set of components • A component requires CPU/RAM to work • A component can send data (bandwidth, latency) • Some components are fixed (e.g., cameras)
  • 28. System model and problem formulation Placement (mapping) Assign services (each component and each edge) to network infrastructure (node and link) such that: • CPU capacity of each node is respected • Same goes for RAM capacity • Bandwidth capacity is respected on arcs too • Latencies are satisfied
  • 29. Constraint Programming model (CP) What is CP ? • CP stands for Constraint Programming • CP is a general purpose implementation of Mathematical Programming • MP theoretically studies optimization problems and resolution techniques • It aims at describing real combinatorial problems in the form of Constraint Satisfaction Problems and solving them with Constraint Programming techniques • The problem is solved by alternating constraint filtering algorithms with a search mechanism • Modeling steps (3) • Declare variables and their domain • Find relation between them • Declare an objective function, if any
  • 33. Constraint Programming model (CP) Variables and domains
  • 34. Constraint Programming model (CP) Variables and domains
  • 39. Experiment 1 Infrastructure Smart bell application 91 fog nodes 86 sensors • Requirements ‣ Resources: CPU, RAM, DISK ‣ Networking: Latency and Bandwidth ‣ Locality • Objective ‣ Minimize average latency Implementation of the model on the Choco solver (Free Open-Source Java library dedicated to Constraint Program
  • 40. Infrastructure Smart bell application 91 fog nodes 86 sensors Experiment 1
  • 41. Infrastructure Applications (a) Storage Application, (b) Smart Bell application, and (c) A face recognition application Greek Forthnet topology 60 PoPs 59 links Experiment 2
  • 42. (a) For G with 120 nodes (b) For G with 300 nodes (c) For G with 600 nodes (d) For G with 1200 nodes Experiment 2
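The three modeling steps of slide 29 (variables and domains, relations, objective) and the placement constraints of slides 27 and 28 can be tried out on a toy instance. The sketch below is not the authors' Choco model: it is a plain Python backtracking search with simple pruning, and every name in it (`Node`, `Arc`, `Component`, `Flow`, `place`, the three-node example) is made up for illustration. It also assumes that two communicating components sit either on the same node or on two directly connected nodes, so no multi-hop routing is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    cpu: int
    ram: int


@dataclass(frozen=True)
class Arc:
    latency: int    # ms
    bandwidth: int  # Mb/s


@dataclass(frozen=True)
class Component:
    cpu: int
    ram: int
    fixed_on: Optional[str] = None  # e.g. a camera pinned to one node


@dataclass(frozen=True)
class Flow:
    bandwidth: int
    max_latency: int


def check(assign, nodes, arcs, components, flows):
    """Summed latency of a (partial) assignment, or None if a constraint fails."""
    cpu = {n: 0 for n in nodes}
    ram = {n: 0 for n in nodes}
    for comp, node in assign.items():
        cpu[node] += components[comp].cpu
        ram[node] += components[comp].ram
    if any(cpu[n] > nodes[n].cpu or ram[n] > nodes[n].ram for n in nodes):
        return None  # CPU or RAM capacity exceeded
    used_bw, total = {}, 0
    for (src, dst), flow in flows.items():
        if src not in assign or dst not in assign:
            continue  # flow not fully placed yet
        a, b = assign[src], assign[dst]
        if a == b:
            continue  # co-located components use no network link
        arc = arcs.get((a, b))
        if arc is None or arc.latency > flow.max_latency:
            return None  # no direct arc, or latency bound violated
        used_bw[(a, b)] = used_bw.get((a, b), 0) + flow.bandwidth
        if used_bw[(a, b)] > arc.bandwidth:
            return None  # bandwidth capacity exceeded
        total += arc.latency
    return total


def place(nodes, arcs, components, flows):
    """Return (placement, average flow latency), or (None, None) if infeasible."""
    names = list(components)
    # Step 1: one variable per component; its domain is the set of candidate nodes.
    domains = {c: [components[c].fixed_on] if components[c].fixed_on else list(nodes)
               for c in names}
    best = {"cost": None, "placement": None}

    def search(i, assign):
        # Step 2: constraints are checked on every partial assignment (pruning).
        cost = check(assign, nodes, arcs, components, flows)
        # Step 3: objective; latencies only add up, so the partial cost is a lower bound.
        if cost is None or (best["cost"] is not None and cost >= best["cost"]):
            return
        if i == len(names):
            best["cost"], best["placement"] = cost, dict(assign)
            return
        for node in domains[names[i]]:
            assign[names[i]] = node
            search(i + 1, assign)
            del assign[names[i]]

    search(0, {})
    if best["placement"] is None:
        return None, None
    return best["placement"], best["cost"] / max(len(flows), 1)


if __name__ == "__main__":
    nodes = {"edge1": Node(2, 2), "edge2": Node(4, 4), "cloud": Node(16, 32)}
    arcs = {("edge1", "edge2"): Arc(5, 100), ("edge2", "cloud"): Arc(50, 1000),
            ("edge1", "cloud"): Arc(60, 100)}
    components = {"camera": Component(1, 1, fixed_on="edge1"),
                  "detector": Component(2, 2), "storage": Component(2, 8)}
    flows = {("camera", "detector"): Flow(50, 10), ("detector", "storage"): Flow(10, 100)}
    print(place(nodes, arcs, components, flows))
```

On this toy instance the only feasible placement keeps the camera on `edge1`, puts the detector on `edge2` and the storage in `cloud`, for an average flow latency of 27.5 ms. A real CP solver such as Choco would interleave constraint filtering with search, as slide 29 describes, rather than re-checking every constraint at each step.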
  • 43. CHALLENGE 3: THE INDEXING PROBLEM
  • 44. Where is the content I’m looking for? Locating the closest replica of a specific content requires indexing every live replica along with its location Existing solutions • Remote services (centralized index, DHT) In contradiction with the objectives of Edge infrastructures: The indexing information might be stored in a node that is far away (or even unreachable) while the replica could be in the vicinity • Broadcast • Maintaining such an index at every node would prove overly costly in terms of memory and traffic (it does not confine the traffic) • Epidemic propagation
  • 45. Epidemic Propagation and Dynamic logical partitioning
  • 46. Challenges How to maintain such a logical partitioning in a dynamic environment where… • Nodes can ADD or DELETE content any time (no synchronization) • Nodes can join or leave the system at any time (without any warning) …while limiting the scope of transferred information as much as possible
  • 47. Challenges Propagating messages naively is not sufficient to guarantee consistent partit
  • 48. Lock Down the Traffic of Decentralized Content Indexing at the Edge, B. Nedelec et al., ICA3PP 2022 A preliminary step toward a complete solution • Definitions of the properties that guarantee decentralized consistent partitioning in dynamic infrastructures. • Demonstration that concurrent creation and removal of partitions may impair the propagation of control information • Proposal of a first algorithm solving this dynamic partitioning problem (and its evaluation by simulations)
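The target state of the logical partitioning described on slides 44 to 46 (every node points to its closest live replica) can be computed centrally with a multi-source shortest-path search. The sketch below only shows what a consistent partitioning looks like; it is not the decentralized algorithm of the ICA3PP 2022 paper, which has to reach this state through local messages only. The function name `partition`, the graph encoding, and the three-node chain with link weights 2 and 1 are illustrative assumptions.

```python
import heapq
from typing import Dict, Hashable, Iterable, Tuple


def partition(graph: Dict[Hashable, Dict[Hashable, int]],
              replicas: Iterable[Hashable]) -> Dict[Hashable, Tuple[Hashable, int]]:
    """Map every reachable node to (closest replica holder, distance).

    graph is an adjacency map {node: {neighbour: link weight}}.
    Ties are broken by the heap ordering, which makes the result deterministic.
    """
    best: Dict[Hashable, Tuple[Hashable, int]] = {}
    # Multi-source Dijkstra: every replica holder starts at distance 0.
    heap = [(0, r, r) for r in sorted(replicas)]
    heapq.heapify(heap)
    while heap:
        dist, node, source = heapq.heappop(heap)
        if node in best:
            continue  # already claimed by a closer (or equally close) replica
        best[node] = (source, dist)
        for neighbour, weight in graph[node].items():
            if neighbour not in best:
                heapq.heappush(heap, (dist + weight, neighbour, source))
    return best


if __name__ == "__main__":
    # Chain a - b - c with distances a-b = 2 and b-c = 1.
    graph = {"a": {"b": 2}, "b": {"a": 2, "c": 1}, "c": {"b": 1}}
    print(partition(graph, {"a"}))       # a single partition around a
    print(partition(graph, {"a", "c"}))  # b moves to c's partition (distance 1 < 2)
    print(partition(graph, set()))       # no replica left: no partition at all
```

Comparing the output before and after a replica is added or removed shows which nodes have to be notified: only those whose closest replica changes, which is the traffic confinement the slides aim for.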
  • 50. Experimental infrastructures SILECS/SLICES: Super Infrastructure for Large-Scale Experimental Computer Science • The Discipline of Computing: An Experimental Science • Studied objects are more and more complex (Hardware, Systems, Networks, Programs, Protocols, Data, Algorithms, …) • A good experiment should fulfill the following properties • Reproducibility: must give the same result with the same input • Extensibility: must target possible comparisons with other works and extensions (more/other processors, larger data sets, different architectures) • Applicability: must define realistic parameters and must allow for an easy calibration • “Revisability”: when an implementation does not perform as expected, must help to identify the reasons • ACM Artifact Review and Badging
  • 51. SILECS/Grid’5000 • Testbed for research on distributed systems • Born in 2003 from the observation that we need a better and larger testbed • HPC, Grids, P2P, and now Cloud computing, and BigData systems • A complete access to the nodes’ hardware in an exclusive mode (from one node to the whole infrastructure) • Dedicated network (RENATER) • Reconfigurable: nodes with Kadeploy and network with KaVLAN • Current status • 8 sites, 36 clusters, 838 nodes, 15116 cores • Memory: ~100 TiB RAM + 6.0 TiB PMEM, Storage: 1.42 PB (1515 SSDs and 953 HDDs on nodes), 617.0 TFLOPS (excluding GPUs) • Diverse technologies/resources (Intel, AMD, Myrinet, Infiniband, two GPU clusters, energy probes) • Some Experiments examples • In Situ analytics, Big Data Management, • HPC Programming approaches, Batch scheduler optimization • Network modeling and simulation • Energy consumption evaluation • Large virtual machines deployments
  • 52. SILECS/FIT Providing Internet players access to a variety of fixed and mobile technologies and services, thus accelerating the design of advanced technologies for the Future Internet
  • 53. Experiments • Discovering resources from their description • Reconfiguring the testbed to meet experimental needs • Monitoring experiments, extracting and analyzing data • Controlling experiments: API
  • 55. Distributed Storage for a Fog/Edge infrastructure based on a P2P and a Scale-Out NAS
  • 56. FogIoT Orchestrator: an Orchestration System for IoT Applications in Fog Environment
  • 59. Conclusion The disconnection is the norm • High latency, unreliable connections, • Logical partitioning (Edge areas/zones) A (r)evolution of distributed systems and networks? • Algorithms, (distributed) system building blocks should be revised to satisfy geo-distributed constraints • Decentralized vs collaborative (e.g. DHT, network ASes)
  • 60. Questions / THANKS Post-scriptum • We are looking for students, PhD candidates, postdocs, engineers, researchers, associate professors (AI/infrastructure experts, this is trendy ;-)), use-cases, funding, collaborations… • We propose … a lot of fun and work! http://stack.inria.fr

Editor's notes

  1. CDF = cumulative distribution function of response times (3 runs). E2E = eye to eye. VR < 20 ms
  2. Todo: comparative orders of magnitude
  3. A simplified version of the edge, but can we already operate such an infrastructure?
  4. First application: OpenStack. Can we operate an edge infrastructure with a single instance (aka a single control plane) of OpenStack?
  5. A simplified version of the edge, but can we already operate such an infrastructure? Good results with OpenStack, but impossible to provision VMs during a network disconnection. One version of OpenStack on every site
  6. Studying a biological system (one that changes drastically every 6 months)
  7. Nova is the OpenStack project that provides a way to provision compute instances (aka virtual servers). Glance: OpenStack Image service Last scenario when Andy wants to launch a VM instance which is only available on site 2
  8. Having a solution without changing the code
  9. Studying a biological system (one that changes drastically every 6 months)
  10. Cheops: new dedicated service acting as a proxy
  11. Studying a biological system (one that changes drastically every 6 months)
  12. Studying a biological system (one that changes drastically every 6 months)
  13. K8S = Kubernetes
  14. ILP = CPLEX, First fit with backtrack, Genetic alg., Xia et al., Choco
  15. In the state of the art, when a client wants to access a specific content, it has to ask a remote node for at least one node identity from which to retrieve this content. After retrieving the content, the client can create another replica to improve the performance of future accesses, but it must then contact the indexing service again to notify it of the creation of this new replica. This approach has two drawbacks. First, accessing a remote node to request content location(s) raises hot spots and availability issues; most importantly, it results in additional delays [3,12] that occur even before the actual download starts. Second, the client gets a list of content locations at the discretion of content indexing services. Without information about these locations, it often ends up downloading from multiple hosts, yet only keeping the fastest answer. In turn, clients either waste network resources or face slower response times. A naive approach would be for every node to index and rank every live replica along with its location information. When creating or destroying a replica, a node would notify all other nodes by broadcasting its operation. This flooding approach is counterproductive, as a node may learn of replicas at the other side of the network while a replica already exists next to it. A promising approach would be to use epidemic propagation, limiting the propagation to a subset of relevant nodes only. To better understand this idea, let’s discuss a concrete example.
  16. In this example, we consider a Node R that creates a new content and efficiently advertises it by epidemic propagation. At the end of the epidemic phase, every node can request R to get the content if needed. Let’s consider that Node G gets the content and creates a second replica, splitting the red set in two: we now have a set of nodes in red that should request R and a set in green that should request G in order to reach the closest replica host. (In this example we consider the geographical distance, but the notion of distance can be defined in a more advanced way, considering latency, throughput, robustness, etc.) Now, let’s consider that Node B creates another replica. Node B needs to notify only a small subset of nodes, resulting in 3 sets: red, green and blue. Finally, let’s consider that G destroys its replica. Nodes that belonged to its partition must find the closest remaining partition, resulting in two sets at the end (red and blue). While it makes sense for Node G to broadcast its removal, Node B and Node R cannot afford to continuously advertise their replica to fill the gap left open by Node G. A better approach would consist in triggering notifications once again at the bordering nodes of the red and blue partitions. In other words, the indexing problem can be seen as a distributed and dynamic partitioning problem.
  17. This dynamic partitioning raises additional challenges related to concurrent operations where removed partitions could block the propagation of other partitions. So the problem that should be tackled is: how can we maintain such a logical partitioning in a dynamic environment where nodes can add…. Node can join… While limiting the scope of the messages between nodes as much as possible (network confinement)
  18. Just to give you an idea of the consistency issue, let’s consider the following example. a) Nodes a and c each create a new replica of the same content. The colors illustrate which partition a node belongs to (black: no partition). So node a belongs to the blue partition and node c to the green one. Both nodes send a creation message to their neighbour (here b). b) Let’s consider that node b receives the notification of node a, so it joins the blue partition and forwards the notification towards c (alpha a3, 3 since the distance equals AB+BC: 2+1). Meanwhile, node a and node c delete the replica and so send a new notification related to the removal to b (respectively deltaA and deltaB). Once again, nodes evolve independently from the broadcast messages. c) The creation notification from c to b (alpha c1) is finally received on b, so b joins the green partition, since the distance is better, and forwards this message to a (alpha c3). The removal message from node a is received on b. Since node b belongs to the green partition, it does not consider the notification related to the removal of the replica sent by node a (deltaA is discarded). Recall that the goal is to mitigate the network traffic as much as possible. Meanwhile, node c receives the initial creation of node a. Since it does not have the replica anymore, it joins the blue partition (here we have a first inconsistency, since b believes its closest replica is on node c while node c believes it is on node a going through node b, which is obviously not possible). Anyway, let’s continue the scenario. d) Node a receives the initial notification of node c; since it does not have the replica anymore, it joins the green partition (although the content has already been deleted at c, there is no way for node a to be aware of that). Node b receives the removal notification from node c, so it leaves the green partition and forwards the notification to node a.
e) Node a receives the removal notification and in turn leaves the green partition. f) At the end, node c belongs to a partition that does not exist anymore, and if c has children, they stay in the wrong partition too.
  19. Without diving into details or presenting the algorithm, the idea is to echo creation and removal notifications. If a node receives a notification that it has already processed and that it knows to be deprecated, it echoes the previous message (in the previous case, the removal notification that was discarded on node b is sent once again to node c). For further details, please refer to the article.