The Distributed Cloud
A Foundation for Planetary-Scale Computing
The emergence of computing clouds has put a renewed emphasis
on the issue of scale in computing. The enormous size of the Web,
together with ever more demanding requirements such as
freshness (results in seconds, not weeks), means that massive
resources are required to handle enormous datasets in a timely
fashion. Datacenters are now considered to be the new units of
computing power, e.g. Google's Warehouse-Scale Computer. The
number of organizations able to deploy such resources is ever
shrinking. Wowd aims to demonstrate that there is an even bigger
scale of computing than any yet imagined: planetary-sized
distributed clouds. Such clouds can be deployed by motivated
collections of users, instead of a handful of gigantic
organizations.




Mark Drummond
Background

Clouds have emerged as a major trend in computing, as an answer to the ever-
increasing scale of resources required to handle Web-sized tasks. The definition
of a cloud is still not firmly established, so let us start with ours. We consider a cloud
to be a collection of computing resources, where it is possible to allocate and
provision additional resources in an incremental and seamless way, with no
disruption to the deployed applications.

In this key respect, a cloud is not simply a group of servers co-located at some
data center, since with such a collection it is neither simple nor clear how to
deploy additional machines for many tasks. Consider, for example, a server
supporting a Relational Database Management System. A large increase in
the number of records in the database cannot be handled simply by adding
machines, since the underlying database needs to be partitioned such
that all underlying operations and queries perform in a satisfactory fashion across
all of the machines. The solution in this situation requires significant re-
engineering of the database application.
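
As a rough illustration of why adding machines is not enough, the sketch below
assumes a naive scheme in which each record is assigned to a machine by hashing
its key modulo the number of machines. This scheme, the key names and the
machine counts are assumptions made for illustration only; the text does not
prescribe any particular partitioning method.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a machine index using a stable hash (illustrative)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of records whose machine changes when the machine count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical record keys; any stable identifiers would do.
keys = [f"record-{i}" for i in range(10_000)]
moved = fraction_moved(keys, old_shards=4, new_shards=5)
print(f"{moved:.0%} of records change machines when growing from 4 to 5")
```

Under this naive scheme, most records must be relocated when a single machine
is added, which is one reason that growing a partitioned database tends to
require re-engineering rather than just more hardware.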

Clouds are considered to be collections of machines where it is possible to
dynamically scale and provision additional resources for the underlying
application(s) with no change or disruption to their operation. Some, such as
Google, consider datacenters, which are the basis for clouds, to be a new form of
"warehouse-scale computer" (source: "The Datacenter as a Computer", Google Inc.,
2009).

Clearly, the number of organizations capable of deploying such resources is small,
and getting smaller, due to prohibitive cost.

Consider, as an example, P2P networks. For the longest time, indeed, since the
very inception of P2P, these networks have been associated with a rather narrow
scope of activities: principally, the sharing of media content. The scale of computing
occurring in such networks at every moment is truly staggering. However, there is a
common (mis)perception that such massive distributed systems are good only for
a very limited set of activities, specifically, the sharing of (often illicit) content.

Our goal is to demonstrate that distributed networks can be a basis for
tremendously powerful distributed clouds, quite literally of planetary scale. At that
scale, the power provided by such a cloud actually dwarfs the power of even the
biggest proprietary clouds.

Distributed vs. Proprietary Clouds

Planetary-scale distributed clouds have different properties than proprietary
clouds. First, note that proprietary clouds appear to be much more homogeneous
and (very) tightly coupled compared to distributed ones.

In their white paper on datacenter-scale computing, Luiz Andre Barroso and Urs
Hoelzle consider each datacenter to be a monolithic warehouse-scale computer.



The key properties of any computer are the coupling, communication bandwidth and
latency among its components. In a datacenter, one might consider individual servers to
be very tightly coupled through a very reliable network. Yet such a network has
severe design limitations, e.g. all machines must communicate over links with a
fixed bandwidth of 1 or 10 Gbps. The network imposes very significant constraints
on adding machines to the cloud and places a limit on how far the scaling can
go.

An important point about the scaling of proprietary clouds is the very clear distinction
between computing within a single datacenter and computing across
multiple datacenters. A very good real-world example of this distinction is Google
search: a query is always answered from a single datacenter, never across
multiple ones. In the above-referenced white paper, the authors do not treat
multiple datacenters as a single computer, viewing them instead as a set of
networked computers, with the caveat that in the future they might need to
re-examine how they draw boundaries.

We do not enforce such a distinction, and indeed we view the connectivity limits
across datacenters, or individual machines across the planet, as the constraints
among components of a planetary-size computer. As such, we simply have to live
with wide variance in the performance capabilities of the parts of the cloud, much
like the huge speed differential between RAM and disk within an individual
computer.

Proper performance of an individual computer is predicated on good design
balance among its components: CPU, RAM, disk, peripherals and the connectivity
constraints among them. The constraints within a datacenter, or a warehouse-
scale computer, are very similar in spirit, in that it is essential to balance
the design of the components and the communication channels among them to achieve
optimal performance.

What we are saying is that it is only natural to extend such design thinking
well beyond a datacenter, to a planetary scale. For example,
the connectivity constraints of individual nodes may be much stricter than those in
a datacenter, yet the aggregate bandwidth and path redundancy of the
communication medium of a planetary-scale computer (the Internet) are vastly
larger.




[Figure: Centralized cloud (left) vs. distributed cloud (right), at small scale.]

[Figure: Centralized cloud (left) vs. distributed cloud (right), at large scale.]



Aggregate Resources

The resources contained in a distributed cloud are truly staggering. Consider a
collection of 1M users. Such a group, while significant, would not be among the
largest distributed networks in existence today. Consider the case of users
running an application that uses 200MB of RAM on their machines. The aggregate
amount of the RAM available in the system would be 200TB. Assuming a
contribution of 1Mbps of bandwidth per user, the aggregate bandwidth of the
system would be 1000 Gbps. Assuming disk space of 10GB / user, then the
aggregate available disk space would be 10 PB.
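
The arithmetic above can be checked with a few lines of Python. This is only a
sketch: the per-user figures are the ones quoted in the text, decimal units
(1 TB = 1,000,000 MB, 1 PB = 1,000,000 GB) are assumed, and the function name
is illustrative.

```python
def aggregate_resources(users: int, ram_mb: float, mbps: float, disk_gb: float) -> dict:
    """Sum per-user contributions, assuming decimal (SI) unit prefixes."""
    return {
        "ram_tb": users * ram_mb / 1_000_000,       # MB -> TB
        "bandwidth_gbps": users * mbps / 1_000,     # Mbps -> Gbps
        "disk_pb": users * disk_gb / 1_000_000,     # GB -> PB
    }


totals = aggregate_resources(users=1_000_000, ram_mb=200, mbps=1, disk_gb=10)
print(totals)
```

Because every total is linear in the number of users, a system with ten times
as many participants would have ten times the aggregate RAM, bandwidth and
disk.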

Of course, one need not stop at 1M users! Indeed, the largest existing systems,
such as Skype, now surpass 10M simultaneous on-line machines. With the
worldwide Internet population having already passed 1B people, it’s easy to
envision a system with tens of millions or perhaps even hundreds of millions of
participants. The computing power of such a system would be truly planetary-
class!

At first, it might seem that such a system would necessarily be unreliable and
inconsistent, because any given participant can choose to join the
network or to leave it at any time. But this impression is wrong, as any user of
BitTorrent can attest. In fact, the aggregate reliability of such a system is
unsurpassed, because of the massive redundancy employed. The key to achieving
this is designing the interaction of system components in a way that leverages
strengths, such as aggregate resources, and deals effectively with constraints,
such as the unreliability and bandwidth limitations of individual nodes and
communication latencies.
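
The effect of redundancy can be illustrated with a simple model. The sketch
below assumes that each node holding a copy of some piece of data is online
independently with the same probability; both the independence assumption and
the example probability of 0.3 are hypothetical and not taken from any
measurement.

```python
def availability(p_online: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies is online."""
    return 1.0 - (1.0 - p_online) ** replicas


# Hypothetical: each node is online only 30% of the time.
for replicas in (1, 5, 10, 20, 40):
    print(f"{replicas:>2} replicas -> {availability(0.3, replicas):.6f}")
```

Under these assumptions, even quite unreliable nodes yield a highly available
system once each piece of data is held by a few dozen of them.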

New Alternatives

The scaling requirements of more traditional architectures are driving the
development of new approaches. For example, Relational Database Management
Systems have been the backbone of data access for decades. However, their
scaling limitations have resulted in the development of approaches such as key-
value stores. Another example is Random Access Memory (RAM) – the
development of in-memory data grids has arisen from the need to effectively
leverage the aggregate RAM of large collections of machines.

Our Solution

Wowd is building a distributed cloud with the goal of achieving a planetary scale,
with (tens, or hundreds of) millions of participants. We are also developing a set of
key applications to work on that cloud, including search, discovery and
recommendation. We want to demonstrate that a planetary-scale distributed
cloud is the perfect platform for the development of applications able to process
data from the entire web, in real time.

Some of these applications may be surprising, for instance, search. The overall
latency in computing an answer to a search query is a key parameter. It might be
(very) surprising that an answer to a query can be computed in under,
say, 1 second on a planetary-scale cloud. It turns out that, with the right design,
this is entirely possible.

We mentioned in the paragraphs above some of the common and perceived issues
with distributed clouds. We have developed methods to deal with these limitations
very effectively, specifically:

   •   Unreliability of individual nodes is handled by large redundancy.

   •   Individual bandwidth limitations are dealt with by partitioning data into a
       large number of small pieces.

   •   Communication latencies are handled by limiting the number of hops on
       critical paths.
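
The second and third points in the list above can be made concrete with a
back-of-the-envelope model. This is a sketch under stated assumptions, not a
description of Wowd's actual design: each piece is fetched from a different
node in parallel, the requester's downlink is not the bottleneck, and the
per-hop latency figure is hypothetical.

```python
def parallel_fetch_seconds(total_mb: float, pieces: int, uplink_mbps: float) -> float:
    """Time to fetch data split into equal pieces, one piece per node, in parallel."""
    piece_megabits = (total_mb / pieces) * 8  # megabytes -> megabits
    return piece_megabits / uplink_mbps


def critical_path_ms(hops: int, per_hop_ms: float) -> float:
    """Latency of a request that crosses `hops` sequential network hops."""
    return hops * per_hop_ms


# 100 MB served by a single 1 Mbps node vs. 100 nodes serving 1 MB each.
print(parallel_fetch_seconds(100, pieces=1, uplink_mbps=1.0))
print(parallel_fetch_seconds(100, pieces=100, uplink_mbps=1.0))

# Hypothetical 100 ms per hop: three hops fit in a 1-second budget, twelve do not.
print(critical_path_ms(3, 100.0), critical_path_ms(12, 100.0))
```

In this model, splitting the data across many nodes divides the transfer time
by the number of pieces, while latency grows with every additional hop, which
is why the number of hops on critical paths has to stay small.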

Conclusion

In summary, our goal is to demonstrate that clouds are not the province of the
chosen and powerful few, but that massive clouds – capable of supporting even
web search – can be created from much more diverse individual machines on a
much wider scale. The aggregate power of such distributed clouds will dwarf the
size and power of proprietary clouds. We strongly believe that the advent of
distributed clouds will usher in a new era of cloud computing. This new era will be
characterized by decentralized and democratic access to computational
resources.




                                         6

Contenu connexe

Tendances

Clouds, Grids and Data
Clouds, Grids and DataClouds, Grids and Data
Clouds, Grids and DataGuy Coates
 
Data Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big DataData Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big Dataijccsa
 
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUESANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUESneirew J
 
Cloud computing tarea
Cloud computing tareaCloud computing tarea
Cloud computing tareasaullopes24
 
Demystifying cloud
Demystifying cloudDemystifying cloud
Demystifying cloudsriramr
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...
Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...
Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...Mark Conrad
 
Distributed Large Dataset Deployment with Improved Load Balancing and Perform...
Distributed Large Dataset Deployment with Improved Load Balancing and Perform...Distributed Large Dataset Deployment with Improved Load Balancing and Perform...
Distributed Large Dataset Deployment with Improved Load Balancing and Perform...IJERA Editor
 
A request skew aware heterogeneous distributed
A request skew aware heterogeneous distributedA request skew aware heterogeneous distributed
A request skew aware heterogeneous distributedJoão Gabriel Lima
 
Talk at West Coast Association of Shared Resource Directors
Talk at West Coast Association of Shared Resource DirectorsTalk at West Coast Association of Shared Resource Directors
Talk at West Coast Association of Shared Resource DirectorsDeepak Singh
 
Exploiting dynamic resource allocation for
Exploiting dynamic resource allocation forExploiting dynamic resource allocation for
Exploiting dynamic resource allocation foringenioustech
 
Improving Utilization of Infrastructure Cloud
Improving Utilization of Infrastructure CloudImproving Utilization of Infrastructure Cloud
Improving Utilization of Infrastructure CloudIJASCSE
 
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...IJSRD
 
How Containers are Becoming The New Basic Currency For Pay as You Go Hybrid IT
How Containers are Becoming The New Basic Currency For Pay as You Go Hybrid ITHow Containers are Becoming The New Basic Currency For Pay as You Go Hybrid IT
How Containers are Becoming The New Basic Currency For Pay as You Go Hybrid ITDana Gardner
 

Tendances (18)

CLOUD BIOINFORMATICS Part1
 CLOUD BIOINFORMATICS Part1 CLOUD BIOINFORMATICS Part1
CLOUD BIOINFORMATICS Part1
 
Clouds, Grids and Data
Clouds, Grids and DataClouds, Grids and Data
Clouds, Grids and Data
 
Data Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big DataData Distribution Handling on Cloud for Deployment of Big Data
Data Distribution Handling on Cloud for Deployment of Big Data
 
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUESANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
ANALYSIS OF ATTACK TECHNIQUES ON CLOUD BASED DATA DEDUPLICATION TECHNIQUES
 
A REVIEW ON RESOURCE ALLOCATION MECHANISM IN CLOUD ENVIORNMENT
A REVIEW ON RESOURCE ALLOCATION MECHANISM IN CLOUD ENVIORNMENTA REVIEW ON RESOURCE ALLOCATION MECHANISM IN CLOUD ENVIORNMENT
A REVIEW ON RESOURCE ALLOCATION MECHANISM IN CLOUD ENVIORNMENT
 
Cloud computing tarea
Cloud computing tareaCloud computing tarea
Cloud computing tarea
 
Demystifying cloud
Demystifying cloudDemystifying cloud
Demystifying cloud
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...
Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...
Distributed Computing - Cloud Computing and Other Buzzwords: Implications for...
 
Distributed Large Dataset Deployment with Improved Load Balancing and Perform...
Distributed Large Dataset Deployment with Improved Load Balancing and Perform...Distributed Large Dataset Deployment with Improved Load Balancing and Perform...
Distributed Large Dataset Deployment with Improved Load Balancing and Perform...
 
A request skew aware heterogeneous distributed
A request skew aware heterogeneous distributedA request skew aware heterogeneous distributed
A request skew aware heterogeneous distributed
 
Talk at West Coast Association of Shared Resource Directors
Talk at West Coast Association of Shared Resource DirectorsTalk at West Coast Association of Shared Resource Directors
Talk at West Coast Association of Shared Resource Directors
 
Exploiting dynamic resource allocation for
Exploiting dynamic resource allocation forExploiting dynamic resource allocation for
Exploiting dynamic resource allocation for
 
CLOUD COMPUTING
CLOUD COMPUTINGCLOUD COMPUTING
CLOUD COMPUTING
 
Improving Utilization of Infrastructure Cloud
Improving Utilization of Infrastructure CloudImproving Utilization of Infrastructure Cloud
Improving Utilization of Infrastructure Cloud
 
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...Survey on Division and Replication of Data in Cloud for Optimal Performance a...
Survey on Division and Replication of Data in Cloud for Optimal Performance a...
 
WJCAT2-13707877
WJCAT2-13707877WJCAT2-13707877
WJCAT2-13707877
 
How Containers are Becoming The New Basic Currency For Pay as You Go Hybrid IT
How Containers are Becoming The New Basic Currency For Pay as You Go Hybrid ITHow Containers are Becoming The New Basic Currency For Pay as You Go Hybrid IT
How Containers are Becoming The New Basic Currency For Pay as You Go Hybrid IT
 

En vedette

Real Time Search
Real Time SearchReal Time Search
Real Time SearchWowd
 
CSS3 cores e transparencia
CSS3 cores e transparenciaCSS3 cores e transparencia
CSS3 cores e transparenciagabrielaspinola
 
Pa waalsprong presentation_20101007_v2
Pa waalsprong presentation_20101007_v2Pa waalsprong presentation_20101007_v2
Pa waalsprong presentation_20101007_v2Eugene Borshch
 
How To Present An Apg Paper Final
How To Present An Apg Paper FinalHow To Present An Apg Paper Final
How To Present An Apg Paper FinalLoz Horner
 
Mastering Mock Objects - Advanced Unit Testing for Java
Mastering Mock Objects - Advanced Unit Testing for JavaMastering Mock Objects - Advanced Unit Testing for Java
Mastering Mock Objects - Advanced Unit Testing for JavaDenilson Nastacio
 

En vedette (9)

Real Time Search
Real Time SearchReal Time Search
Real Time Search
 
CSS3 cores e transparencia
CSS3 cores e transparenciaCSS3 cores e transparencia
CSS3 cores e transparencia
 
Sosushi formazione
Sosushi formazioneSosushi formazione
Sosushi formazione
 
Pa waalsprong presentation_20101007_v2
Pa waalsprong presentation_20101007_v2Pa waalsprong presentation_20101007_v2
Pa waalsprong presentation_20101007_v2
 
Agile enterprise
Agile enterpriseAgile enterprise
Agile enterprise
 
How To Present An Apg Paper Final
How To Present An Apg Paper FinalHow To Present An Apg Paper Final
How To Present An Apg Paper Final
 
Agile dashboard
Agile dashboardAgile dashboard
Agile dashboard
 
Mastering Mock Objects - Advanced Unit Testing for Java
Mastering Mock Objects - Advanced Unit Testing for JavaMastering Mock Objects - Advanced Unit Testing for Java
Mastering Mock Objects - Advanced Unit Testing for Java
 
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job? Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
Succession “Losers”: What Happens to Executives Passed Over for the CEO Job?
 

Similaire à The Distributed Cloud

云计算及其应用
云计算及其应用云计算及其应用
云计算及其应用lantianlcdx
 
Comparative study of Data management for cloud computing deployment
Comparative study of Data management for cloud computing deploymentComparative study of Data management for cloud computing deployment
Comparative study of Data management for cloud computing deploymentAkanksha Chandel
 
Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32jujukoko
 
Megastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive servicesMegastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive servicesJoão Gabriel Lima
 
Cloud computing overview
Cloud computing overviewCloud computing overview
Cloud computing overviewKHANSAFEE
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL TechnologiesAmit Singh
 
50 C o m m u n i C At i o n S o f t h E A C m A P.docx
50    C o m m u n i C At i o n S  o f  t h E  A C m       A P.docx50    C o m m u n i C At i o n S  o f  t h E  A C m       A P.docx
50 C o m m u n i C At i o n S o f t h E A C m A P.docxalinainglis
 
Cloud ready reference
Cloud ready referenceCloud ready reference
Cloud ready referenceHelly Patel
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data CentersGina Buck
 
A viewof cloud computing
A viewof cloud computingA viewof cloud computing
A viewof cloud computingpurplesea
 
AViewofCloudComputing.ppt
AViewofCloudComputing.pptAViewofCloudComputing.ppt
AViewofCloudComputing.pptMrGopirajanPV
 
A View of Cloud Computing.ppt
A View of Cloud Computing.pptA View of Cloud Computing.ppt
A View of Cloud Computing.pptAriaNasi
 
Cloud computing..
Cloud computing..Cloud computing..
Cloud computing..manoj kumar
 
CouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataCouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataDebajani Mohanty
 

Similaire à The Distributed Cloud (20)

云计算及其应用
云计算及其应用云计算及其应用
云计算及其应用
 
Scaling Up vs. Scaling-out
Scaling Up vs. Scaling-outScaling Up vs. Scaling-out
Scaling Up vs. Scaling-out
 
Comparative study of Data management for cloud computing deployment
Comparative study of Data management for cloud computing deploymentComparative study of Data management for cloud computing deployment
Comparative study of Data management for cloud computing deployment
 
Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32
 
Megastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive servicesMegastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive services
 
Cloud computing overview
Cloud computing overviewCloud computing overview
Cloud computing overview
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
 
Cooper1
Cooper1Cooper1
Cooper1
 
50 C o m m u n i C At i o n S o f t h E A C m A P.docx
50    C o m m u n i C At i o n S  o f  t h E  A C m       A P.docx50    C o m m u n i C At i o n S  o f  t h E  A C m       A P.docx
50 C o m m u n i C At i o n S o f t h E A C m A P.docx
 
Cloud ready reference
Cloud ready referenceCloud ready reference
Cloud ready reference
 
computing
computingcomputing
computing
 
The Growth Of Data Centers
The Growth Of Data CentersThe Growth Of Data Centers
The Growth Of Data Centers
 
A viewof cloud computing
A viewof cloud computingA viewof cloud computing
A viewof cloud computing
 
AViewofCloudComputing.ppt
AViewofCloudComputing.pptAViewofCloudComputing.ppt
AViewofCloudComputing.ppt
 
AViewofCloudComputing.ppt
AViewofCloudComputing.pptAViewofCloudComputing.ppt
AViewofCloudComputing.ppt
 
A View of Cloud Computing.ppt
A View of Cloud Computing.pptA View of Cloud Computing.ppt
A View of Cloud Computing.ppt
 
Cloud vs grid
Cloud vs gridCloud vs grid
Cloud vs grid
 
Cloud computing..
Cloud computing..Cloud computing..
Cloud computing..
 
CouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big DataCouchBase The Complete NoSql Solution for Big Data
CouchBase The Complete NoSql Solution for Big Data
 

Dernier

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 

Dernier (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost

The Distributed Cloud

  • 1. The Distributed Cloud: A Foundation for Planetary-Scale Computing
The emergence of computing clouds has put a renewed emphasis on the issue of scale in computing. The enormous size of the Web, together with ever more demanding requirements such as freshness (results in seconds, not weeks), means that massive resources are required to handle enormous datasets in a timely fashion. Datacenters are now considered to be the new units of computing power, e.g. Google's Warehouse-Scale Computer. The number of organizations able to deploy such resources is ever shrinking. Wowd aims to demonstrate that there is an even bigger scale of computing than any yet imagined: planetary-sized distributed clouds. Such clouds can be deployed by motivated collections of users, instead of a handful of gigantic organizations.
Mark Drummond
  • 2. Background
Clouds have emerged as a major trend in computing, as an answer to the ever-increasing scale of resources required to handle Web-sized tasks. The definition of a cloud is still not firmly established, so let us start with ours. We consider a cloud to be a collection of computing resources in which it is possible to allocate and provision additional resources in an incremental and seamless way, with no disruption to the deployed applications.
In this key respect, a cloud is not simply a group of servers co-located at some data center, since with such a collection it is neither simple nor clear how to deploy additional machines for many tasks. Consider, for example, a server supporting a Relational Database Management System. A large increase in the number of records in the database cannot be handled simply by adding machines, since the underlying database needs to be partitioned such that all operations and queries perform satisfactorily across all of the machines. The solution in this situation requires significant re-engineering of the database application.
Clouds are considered to be collections of machines where it is possible to dynamically scale and provision additional resources for the underlying application(s) with no change or disruption to their operation. Some, such as Google, consider datacenters, which are the basis for clouds, to be a new form of "warehouse-scale computer" (source: "The Datacenter as a Computer", Google Inc., 2009). Clearly, the number of organizations capable of deploying such resources is small, and getting smaller, due to prohibitive cost.
Consider, as an example, P2P networks. For the longest time, indeed since the very inception of P2P, these networks have been associated with a rather narrow scope of activities, principally the sharing of media content. The scale of computing occurring in such networks at every moment is truly staggering.
However, there is a common (mis)perception that such massive distributed systems are good only for a very limited set of activities, specifically the sharing of (often illicit) content. Our goal is to demonstrate that distributed networks can be a basis for tremendously powerful distributed clouds, quite literally of planetary scale. At that scale, the power provided by such a cloud dwarfs the power of even the biggest proprietary clouds.
Distributed vs. Proprietary Clouds
Planetary-scale distributed clouds have different properties than proprietary clouds. First, note that proprietary clouds appear to be much more homogeneous and much more tightly coupled than distributed ones. In their white paper on datacenter-scale computing, Luiz Andre Barroso and Urs Hoelzle consider each datacenter to be a monolithic warehouse-scale computer.
  • 3. Key to any computer are the coupling, communication bandwidth, and latency among its components. In a datacenter, one might consider individual servers to be very tightly coupled through a very reliable network. Yet such a network has severe design limitations, e.g. all machines must communicate at a fixed bandwidth of 1 or 10 Gbps. The network imposes very significant constraints on adding machines to the cloud and places a limit on how far the scaling can go.
An important point about the scaling of proprietary clouds is the very clear distinction between computing within a single datacenter and computing across multiple datacenters. A very good real-world example of this distinction is Google search: a query is always answered from a single datacenter, never across multiple ones. In the above-referenced white paper, the authors do not consider multiple datacenters to be one computer, viewing them instead as a set of networked computers, while noting that in the future they might need to re-examine how they draw boundaries.
We do not enforce such a distinction. Indeed, we view the connectivity limits across datacenters, or across individual machines around the planet, as constraints among the components of a planetary-size computer. As such, we simply have to live with wide variance in the performance capabilities of the parts of the cloud, much like the huge speed differential between RAM and disk within an individual computer.
Proper performance of an individual computer is predicated on a good design balance among its components (CPU, RAM, disk, and peripherals) and the connectivity constraints among them. The constraints within a datacenter, or a warehouse-scale computer, are very similar in spirit: it is essential to balance the design of the components and the communication channels among them to achieve optimal performance.
What we are saying is that it is only natural to extend such design thinking well beyond a datacenter, to a planetary scale. For example, the connectivity constraints of individual nodes may be much stricter than in a datacenter, yet the aggregate bandwidth and path redundancy of the communication medium of a planetary-scale computer (the Internet) are vastly larger.
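To make the contrast between individual and aggregate capacity concrete, here is a rough back-of-the-envelope sketch in Python. The per-node figures (200 MB of RAM, 1 Mbps of bandwidth, 10 GB of disk, one million nodes) are the illustrative values used in the Aggregate Resources discussion of this paper; the function name and the use of decimal (SI) units are assumptions made for the example.

```python
# Back-of-the-envelope aggregate capacity of a distributed cloud.
# Per-node defaults are the paper's illustrative values; decimal (SI)
# unit conversions are an assumption of this sketch.

def aggregate(nodes, ram_mb=200, bandwidth_mbps=1, disk_gb=10):
    """Return aggregate RAM (TB), bandwidth (Gbps) and disk (PB)."""
    return {
        "ram_tb": nodes * ram_mb / 1_000_000,              # MB -> TB
        "bandwidth_gbps": nodes * bandwidth_mbps / 1_000,  # Mbps -> Gbps
        "disk_pb": nodes * disk_gb / 1_000_000,            # GB -> PB
    }

totals = aggregate(1_000_000)
for name, value in totals.items():
    print(f"{name}: {value:g}")
```

Under these assumptions, a single node contributes far less bandwidth than a datacenter server on a 1 Gbps link, yet one million such nodes together match the nominal bandwidth of a thousand of those links.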
  • 4. [Figure: centralized cloud (left) vs. distributed cloud (right), at small scale]
[Figure: centralized cloud (left) vs. distributed cloud (right), at large scale]
Aggregate Resources
The resources contained in a distributed cloud are truly staggering. Consider a collection of 1M users. Such a group, while significant, would not be among the largest distributed networks in existence today. Suppose each user runs an application that uses 200 MB of RAM on their machine. The aggregate amount of RAM available in the system would be 200 TB. Assuming a contribution of 1 Mbps of bandwidth per user, the aggregate bandwidth of the system would be 1000 Gbps. Assuming disk space of 10 GB per user, the aggregate available disk space would be 10 PB.
Of course, one need not stop at 1M users! Indeed, the largest existing systems, such as Skype, now surpass 10M simultaneous on-line machines. With the worldwide Internet population having already passed 1B people, it is easy to envision a system with tens of millions or perhaps even hundreds of millions of
  • 5. participants. The computing power of such a system would be truly planetary-class!
At first, it might seem that such a system would necessarily be unreliable and inconsistent, because any given participant can choose to join or leave the network at any time. But this impression is wrong, as any user of BitTorrent can attest. In fact, the aggregate reliability of such a system is unsurpassed, because of the massive redundancy employed. The key to achieving this is designing the interaction of system components in a way that leverages strengths, such as aggregate resources, and deals effectively with constraints, such as the unreliability and bandwidth limitations of individual nodes and communication latencies.
New Alternatives
The scaling requirements of more traditional architectures are driving the development of new approaches. For example, Relational Database Management Systems have been the backbone of data access for decades. However, their scaling limitations have resulted in the development of approaches such as key-value stores. Another example is Random Access Memory (RAM): in-memory data grids have arisen from the need to effectively leverage the aggregate RAM of large collections of machines.
Our Solution
Wowd is building a distributed cloud with the goal of achieving planetary scale, with tens or hundreds of millions of participants. We are also developing a set of key applications to work on that cloud, including search, discovery and recommendation. We want to demonstrate that a planetary-scale distributed cloud is the perfect platform for the development of applications able to process data from the entire Web in real time. Some of these applications may be surprising, for instance, search. The overall latency in computing an answer to a search query is a key parameter.
It might be very surprising to expect that an answer to a query can be computed in under, say, 1 second on a planetary-scale cloud. It turns out that, with the right design, this is entirely possible. We mentioned in the paragraphs above some of the commonly perceived issues with distributed clouds. We have developed methods to deal with these limitations very effectively, specifically:
• Unreliability of individual nodes is handled by large redundancy.
• Individual bandwidth limitations are dealt with by partitioning data into a large number of small pieces.
• Communication latencies are handled by limiting the number of hops on
  • 6. critical paths.
Conclusion
In summary, our goal is to demonstrate that clouds are not the province of the chosen and powerful few, but that massive clouds, capable of supporting even web search, can be created from much more diverse individual machines on a much wider scale. The aggregate power of such distributed clouds will dwarf the size and power of proprietary clouds. We strongly believe that the advent of distributed clouds will usher in a new era of cloud computing. This new era will be characterized by decentralized and democratic access to computational resources.
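As a rough illustration of the first two methods listed under Our Solution (large redundancy and partitioning data into many small pieces), the following Python sketch may help. It is not Wowd's implementation: the function names, the hash-based placement rule, the assumption that nodes are online independently of one another, and the example probability of 0.3 are all hypothetical choices made for this example. Hop limiting is not covered here.

```python
import hashlib
import math

def availability(p_online, replicas):
    """Probability that at least one of `replicas` copies is online,
    assuming each node is online independently with probability p_online."""
    return 1 - (1 - p_online) ** replicas

def replicas_needed(p_online, target):
    """Smallest replica count whose availability reaches `target`."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_online))

def split(data, piece_size):
    """Partition a byte string into pieces of at most `piece_size` bytes."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]

def place(piece_index, nodes, replicas):
    """Choose `replicas` consecutive nodes for a piece, starting at a
    hash-derived offset (a hypothetical placement rule)."""
    digest = hashlib.sha256(str(piece_index).encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + r) % len(nodes)] for r in range(replicas)]

# Example: nodes that are online only 30% of the time (assumed figure).
k = replicas_needed(0.3, 0.999)
pieces = split(b"x" * 1000, 64)
nodes = [f"node-{n}" for n in range(100)]
layout = {i: place(i, nodes, k) for i in range(len(pieces))}
print(k, len(pieces), round(availability(0.3, k), 4))
```

The point of the sketch is only that availability grows quickly with the number of replicas even when individual nodes are unreliable, and that small pieces keep each transfer within a single node's bandwidth budget.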