Resource Overbooking and Application Profiling in Shared Hosting Platforms

Bhuvan Urgaonkar, Prashant Shenoy and Timothy Roscoe

Department of Computer Science, University of Massachusetts, Amherst MA 01003
{bhuvan,shenoy}@cs.umass.edu

Intel Research at Berkeley, 2150 Shattuck Avenue Suite 1300, Berkeley CA 94704
troscoe@intel-research.net
Abstract

In this paper, we present techniques for provisioning CPU and network resources in shared hosting platforms running potentially antagonistic third-party applications. The primary contribution of our work is to demonstrate the feasibility and benefits of overbooking resources in shared platforms. Since an accurate estimate of an application's resource needs is necessary when overbooking resources, we present techniques to profile applications on dedicated nodes, possibly while in service, and use these profiles to guide the placement of application components onto shared nodes. We then propose techniques to overbook cluster resources in a controlled fashion such that the platform can provide performance guarantees to applications even when overbooked. We show how our techniques can be combined with commonly used QoS resource allocation mechanisms to provide application isolation and performance guarantees at run-time. We implement our techniques in a Linux cluster and evaluate them using common server applications. We find that the efficiency (and consequently revenue) benefits from controlled overbooking of resources can be dramatic. Specifically, we find that overbooking resources by as little as 1% can increase the utilization of the cluster by a factor of two, and a 5% overbooking yields a 300-500% improvement, while still providing useful resource guarantees to applications.

1 Introduction and Motivation

Server clusters built using commodity hardware and software are an increasingly attractive alternative to traditional large multiprocessor servers for many applications, in part due to rapid advances in computing technologies and falling hardware prices.

This paper addresses challenges in the design of a particular type of server cluster we call shared hosting platforms. These can be contrasted with dedicated hosting platforms, where either the entire cluster runs a single application (such as a web search engine), or each individual processing element in the cluster is dedicated to a single application (as in the "managed hosting" services provided by some data centers). In contrast, shared hosting platforms run a large number of different third-party applications (web servers, streaming media servers, multi-player game servers, e-commerce applications, etc.), and the number of applications typically exceeds the number of nodes in the cluster. More specifically, each application runs on a subset of the nodes and these subsets can overlap with one another. Whereas dedicated hosting platforms are used for many niche applications that warrant their additional cost, economic considerations of space, power, cooling and cost make shared hosting platforms an attractive choice for many application hosting environments.

Shared hosting platforms imply a business relationship between the platform provider and application providers: the latter pay the former for resources on the platform. In return, the platform provider gives some kind of guarantee of resource availability to applications [20].

Perhaps the central challenge in building such a shared hosting platform is resource management: the ability to reserve resources for individual applications, the ability to isolate applications from other misbehaving or overloaded applications, and the ability to provide performance guarantees to applications. Arguably, the widespread deployment of shared hosting platforms has been hampered by the lack of effective resource management mechanisms that meet these requirements. Consequently, most hosting platforms in use today adopt one of two approaches.

The first avoids resource sharing altogether by employing a dedicated model. This delivers useful resources to application providers, but is expensive in machine resources. The second approach is to share resources in a best-effort manner among applications, which consequently receive no resource guarantees. While this is cheap in resources, the value delivered to application providers is limited. Consequently, both approaches imply an economic disincentive to deploy viable hosting platforms.

Some recent research efforts have proposed resource management mechanisms for shared hosting platforms [2, 7, 29]. While these efforts take an initial step towards the design of effective shared hosting platforms, many challenges remain to be addressed. This is the context of the present work.

* Portions of this research were done when Timothy Roscoe was a researcher at Sprint ATL and Bhuvan Urgaonkar was a summer intern at Sprint ATL. This research was supported in part by NSF grants CCR-9984030, EIA-0080119 and a gift from Sprint Corporation.
1.1 Research Contributions

The contribution of this paper is threefold. First, we show how the resource requirements of an application can be derived using online profiling and modeling. Second, we demonstrate the efficiency benefits to the platform provider of overbooking resources on the platform, and how this can be usefully done without adversely impacting the guarantees offered to application providers. Third, we show how untrusted and/or mutually antagonistic applications in the platform can be isolated from one another.

Automatic derivation of QoS requirements

Recent work on resource management mechanisms for clusters (e.g., [2, 29]) implicitly assumes that the resource requirements of an application are either known in advance or can be derived, but does not specifically address the problem of how to determine these requirements. However, the effectiveness of these techniques is crucially dependent on the ability to reserve an appropriate amount of resources for each application—overestimating an application's resource needs can result in idling of resources, while underestimating them can degrade application performance.

A shared hosting platform can significantly enhance its utility to users by automatically deriving the QoS requirements of an application. Automatic derivation of QoS requirements involves (i) monitoring an application's resource usage, and (ii) using these statistics to derive QoS requirements that conform to the observed behavior.

In this paper, we employ kernel-based profiling mechanisms to empirically monitor an application's resource usage and propose techniques to derive QoS requirements from this observed behavior. We then use these techniques to experimentally profile several server applications such as web, streaming, game, and database servers. Our results show that the bursty resource usage of server applications makes it feasible to extract statistical multiplexing gains by overbooking resources on the hosting platform.

Revenue maximization through overbooking

The goal of the owner of a hosting platform is to maximize revenue, which implies that the cluster should strive to maximize the number of applications that can be housed on a given hardware configuration. One approach is to overbook resources—a technique routinely used to maximize yield in airline reservation systems [24].

Provisioning cluster resources solely based on the worst-case needs of an application results in low average utilization, since the average resource requirements of an application are typically smaller than its worst-case (peak) requirements, and resources tend to idle when the application does not utilize its peak reserved share. In contrast, provisioning a cluster based on a high percentile of the application needs yields statistical multiplexing gains that significantly increase the average utilization of the cluster at the expense of a small amount of overbooking, and increases the number of applications that can be supported on a given hardware configuration.

A well-designed hosting platform should be able to provide performance guarantees to applications even when overbooked, with the proviso that this guarantee is now probabilistic instead of deterministic (for instance, an application might be provided a 99% guarantee (0.99 probability) that its resource needs will be met). Since different applications have different tolerances to such overbooking (e.g., the latency requirements of a game server make it less tolerant to violations of performance guarantees than a web server), an overbooking mechanism should take diverse application needs into account.

The primary contribution of this paper is to demonstrate the feasibility and benefits of overbooking resources in shared hosting platforms. We propose techniques to overbook (i.e., under-provision) resources in a controlled fashion based on application resource needs. Although such overbooking can result in transient overloads where the aggregate resource demand temporarily exceeds capacity, our techniques limit the chances of transient overload of resources to predictably rare occasions, and provide useful performance guarantees to applications in the presence of overbooking.

The techniques we describe are general enough to work with many commonly used OS resource allocation mechanisms. Experimental results demonstrate that overbooking resources by amounts as small as 1% yields a factor of two increase in the number of applications supported by a given platform configuration, while a 5-10% overbooking yields a 300-500% increase in effective platform capacity. In general, we find that the burstier the application resource needs, the higher are the benefits of resource overbooking. We also find that collocating CPU-bound and network-bound applications as well as bursty and non-bursty applications yields additional multiplexing gains when overbooking resources.

Placement and isolation of antagonistic applications

In a shared hosting platform, it is assumed that third-party applications may be antagonistic to each other and/or the platform itself, either through malice or bugs. A hosting platform should address these issues by isolating applications from one another and preventing malicious, misbehaving, or overloaded applications from affecting the performance of other applications.

A third contribution of our work is to demonstrate how untrusted third-party applications can be isolated from one another in shared hosting platforms. This isolation takes two forms. Local to a machine, each processing node in the platform employs resource management techniques that "sandbox" applications by restricting the resources consumed by an application to its reserved share. Globally, the process of placement, whereby components of an application are assigned to individual processing nodes, can be constrained by externally imposed policies—for instance, by risk assessments made by the provider about each application, which may prevent an application from being collocated with certain other applications. Since a manual placement of applications onto nodes is infeasibly complex in large clusters, the design of automated placement techniques that allow a platform provider to exert sufficient control over the placement process is a key issue.
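The statistical multiplexing argument above can be made concrete with a small simulation. This is an illustrative sketch only: the bursty demand model, burst sizes, and node capacity below are hypothetical numbers chosen for the example, not measurements from the paper.

```python
import random

random.seed(42)

# Hypothetical bursty demand: an application usually needs 5% of a node's
# CPU but occasionally bursts to 50% (all parameters are illustrative).
def demand_trace(n, mean=0.05, burst=0.50, burst_prob=0.005):
    return [burst if random.random() < burst_prob else mean for _ in range(n)]

def percentile(trace, pct):
    s = sorted(trace)
    return s[min(len(s) - 1, int(pct * len(s)))]

traces = [demand_trace(10_000) for _ in range(100)]

peak_need = max(max(t) for t in traces)              # worst-case reservation
p99_need = max(percentile(t, 0.99) for t in traces)  # 99th-percentile reservation

capacity = 1.0  # one node's CPU
apps_worst_case = int(capacity // peak_need)  # apps per node, no overbooking
apps_overbooked = int(capacity // p99_need)   # apps per node, overbooked

# Fraction of intervals in which the overbooked node is actually overloaded.
overload = sum(
    1 for i in range(10_000)
    if sum(traces[j][i] for j in range(apps_overbooked)) > capacity
) / 10_000

print(f"apps per node (worst-case provisioning): {apps_worst_case}")
print(f"apps per node (99th-percentile provisioning): {apps_overbooked}")
print(f"fraction of overloaded intervals: {overload:.4f}")
```

In this toy model, percentile-based provisioning packs far more applications onto a node than worst-case provisioning, at the cost of occasional overload intervals whose frequency can be measured and bounded.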
Figure 1: Architecture of a shared hosting platform. Each application runs on one or more nodes and shares resources with other applications.

1.2 System Model and Terminology

The shared hosting platform assumed in our research consists of a cluster of N nodes, each of which consists of processor, memory, and storage resources as well as one or more network interfaces. Platform nodes are allowed to be heterogeneous, with different amounts of these resources on each node. The nodes in the hosting platform are assumed to be interconnected by a high-speed LAN such as gigabit ethernet (see Figure 1). Each cluster node is assumed to run an operating system kernel that supports some notion of quality of service, such as reservations or shares. Such mechanisms have been extensively studied over the past decade, and many deployed commercial and open-source operating systems such as Solaris [26], IRIX [22], Linux [27], and FreeBSD [5] already support such features. In this paper, we primarily focus on managing two resources—CPU and network interface bandwidth—in shared hosting platforms. The challenges of managing other resources in hosting environments, such as memory and storage, are beyond the scope of this paper. Nevertheless, we believe the techniques developed here are also applicable to these other resources.

We use the term application for a complete service running on behalf of an application provider; since an application will frequently consist of multiple distributed components, we use the term capsule (borrowed from [13]) to refer to that component of an application running on a single node. Each application has at least one capsule, possibly more if the application is distributed. Capsules provide a useful abstraction for logically partitioning an application into sub-components and for exerting control over the distribution of these components onto different nodes. To illustrate, consider an e-commerce application consisting of a web server, a Java application server and a database server. If all three components need to be collocated on a single node, then the application will consist of a single capsule with all three components. On the other hand, if each component needs to be placed on a different node, then the application should be partitioned into three capsules. Depending on the number of its capsules, each application runs on a subset of the platform nodes and these subsets can overlap with one another, resulting in resource sharing (see Figure 1).

The rest of this paper is structured as follows. Section 2 discusses techniques for empirically deriving an application's resource needs, while Section 3 discusses our resource overbooking techniques and capsule placement strategies. We discuss implementation issues in Section 4 and present our experimental results in Section 5. Section 6 discusses related work, and finally, Section 7 presents concluding remarks.

2 Automatic Derivation of Application QoS Requirements

The first step in hosting a new application is to derive its resource requirements. While the problem of QoS-aware resource management has been studied extensively in the literature [4, 8, 14, 15], the problem of how much resource to allocate to each application has received relatively little attention. In this section, we address this issue by proposing techniques to automatically derive the QoS requirements of an application (the terms resource requirements and QoS requirements are used interchangeably in this paper). Deriving the QoS requirements is a two-step process: (i) we first use profiling techniques to monitor application behavior, and (ii) we then use our empirical measurements to derive QoS requirements that conform to the observed behavior.

2.1 Application QoS Requirements: Definitions

The QoS requirements of an application are defined on a per-capsule basis. For each capsule, the QoS requirements specify the intrinsic rate of resource usage, the variability in the resource usage, the time period over which the capsule desires resource guarantees, and the level of overbooking that the application (capsule) is willing to tolerate. As explained earlier, in this paper we are concerned with two key resources, namely CPU and network interface bandwidth. For each of these resources, we define the QoS requirements along the above dimensions in an OS-independent manner. In Section 4.1, we show how to map these requirements to various OS-specific resource management mechanisms that have been developed.

More formally, we represent the QoS requirements of an application capsule by a quintuple (σ, ρ, τ, U, O):

• Token Bucket Parameters (σ, ρ): We capture the basic resource requirements of a capsule by modeling resource usage as a token bucket (σ, ρ) [28]. The parameter ρ denotes the intrinsic rate of resource consumption, while σ denotes the variability in the resource consumption. More specifically, ρ denotes the rate at which the capsule consumes CPU cycles or network interface bandwidth, while
σ captures the maximum burst size. By definition, a token bucket bounds the resource usage of the capsule to σ + ρ·t over any interval of length t.

• Period τ: The third parameter τ denotes the time period over which the capsule desires guarantees on resource availability. Put another way, the system should strive to meet the QoS requirements of the capsule over each interval of length τ. The smaller the value of τ, the more stringent are the desired guarantees (since the capsule needs to be guaranteed resources over a finer time scale). In particular, for the above token bucket parameters, the capsule requires that it be allocated at least σ + ρ·τ resources every τ time units.

• Usage Distribution U: While the token bucket parameters succinctly capture the capsule's resource requirements, they are not sufficiently expressive by themselves to denote the QoS requirements in the presence of overbooking. Consequently, we use two additional parameters—U and O—to specify resource requirements in the presence of overbooking. The first parameter, U, denotes the probability distribution of resource usage. Note that U is a more detailed specification of resource usage than the token bucket parameters (σ, ρ), and indicates the probability with which the capsule is likely to use a certain fraction of the resource (i.e., U(x) is the probability that the capsule uses a fraction x of the resource, 0 ≤ x ≤ 1). A probability distribution of resource usage is necessary so that the hosting platform can provide (quantifiable) probabilistic guarantees even in the presence of overbooking.

• Overbooking Tolerance O: The parameter O is the overbooking tolerance of the capsule. It specifies the probability with which the capsule's requirements may be violated due to resource overbooking (by providing it with less resources than the required amount). Thus, the overbooking tolerance indicates the minimum level of service that is acceptable to the capsule. To illustrate, if O = 0.01, the capsule's resource requirements should be met 99% of the time (or with a probability of 0.99 in each interval τ).

In general, we assume that the parameters τ and O are specified by the application provider. This may be based on a contract between the platform provider and the application provider (e.g., the more the application provider is willing to pay for resources, the stronger are the provided guarantees), or on the particular characteristics of the application (e.g., a streaming media server requires more stringent guarantees and is less tolerant to violations of these guarantees). In the rest of this section, we show how to derive the remaining three parameters σ, ρ and U using profiling, given values of τ and O.
2.2 Kernel-based Profiling of Resource Usage
                                                                                                                          cesses get scheduled and the corresponding quantum durations)
Our techniques for empirically deriving the QoS requirements                                                              as well as network transmission times and packet sizes. Given
of an application rely on profiling mechanisms that monitor ap-                                                            such a trace of CPU and network activity, we now discuss the
plication behavior. Recently, a number of application profiling                                                            derivation of the capsule’s QoS requirements.

                                                                                                                      4
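To make the derivation concrete, the scheduling-quantum records described above can be folded into the busy/idle (On-Off) form used for deriving the QoS requirements. A minimal sketch, assuming a simplified (start, duration) record format in integer microseconds rather than the Linux trace toolkit's actual output format:

```python
# Sketch: convert CPU-quantum records from a kernel trace into an On-Off
# trace. The (start, duration) record format (integer microseconds) is an
# assumption, not the Linux trace toolkit's actual output format.

def to_on_off_trace(quanta):
    """quanta: iterable of (start, duration) pairs, in microseconds.
    Returns a list of (state, length) pairs with state in {"on", "off"}."""
    trace = []
    prev_end = 0
    for start, duration in sorted(quanta):
        if start > prev_end:                    # gap => idle (Off) period
            trace.append(("off", start - prev_end))
        trace.append(("on", duration))          # busy (On) period
        prev_end = max(prev_end, start + duration)
    return trace

# Two quanta (10 ms and 5 ms) separated by a 15 ms idle gap:
print(to_on_off_trace([(0, 10000), (25000, 5000)]))
# -> [('on', 10000), ('off', 15000), ('on', 5000)]
```

Back-to-back quanta produce consecutive On entries here; for deriving the usage distribution this is harmless, since only the busy fraction per interval matters.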
Figure 2: An example of an On-Off trace. Busy (ON) periods correspond to CPU quanta or network transmissions; the intervening idle (OFF) periods correspond to idle time or non-capsule-related activity.

Figure 3: Derivation of the usage distribution and token bucket parameters. (a) Usage distribution: probability versus fraction of the resource used in each measurement interval I. (b) Token bucket parameters: cumulative resource usage versus time.

2.3 Empirical Derivation of the QoS Requirements

We use the trace of kernel events obtained from the profiling process to model CPU and network activity as a simple On-Off process. This is achieved by examining the time at which each event occurs and its duration, and deriving a sequence of busy (On) and idle (Off) periods from this information (see Figure 2). This trace of busy and idle periods can then be used to derive both the resource usage distribution U and the token bucket parameters (σ, ρ).

Determining the usage distribution U: Recall that the usage distribution U denotes the probability with which the capsule uses a certain fraction of the resource. To derive U, we simply partition the trace into measurement intervals of length I and measure the fraction of time for which the capsule was busy in each such interval. This value, which represents the fractional resource usage in that interval, is histogrammed, and each bucket is then normalized with respect to the number of measurement intervals in the trace to obtain the probability distribution U. Figure 3(a) illustrates this process.

Deriving token bucket parameters (σ, ρ): Recall that a token bucket limits the resource usage of a capsule to σ·t + ρ over any interval t. A given On-Off trace can, in general, have many (σ, ρ) pairs that satisfy this bound. To intuitively understand why, let us compute the cumulative resource usage for the capsule over time. The cumulative resource usage is simply the total resource consumption thus far and is computed by incrementing the cumulative usage after each On period. Thus, the cumulative resource usage is a step function, as depicted in Figure 3(b).

Our objective is to find a line σ·t + ρ that bounds the cumulative resource usage; the slope of this line is the token bucket rate σ and its Y-intercept is the burst size ρ. As shown in Figure 3(b), there are in general many such lines, all of which are valid descriptions of the observed resource usage.

Several algorithms that mechanically compute all valid (σ, ρ) pairs for a given On-Off trace have been proposed recently. We use a variant of one such algorithm [28] in our research—for each On-Off trace, the algorithm produces a range of σ values (i.e., σ_min ≤ σ ≤ σ_max) that constitute valid token bucket rates for the observed behavior. For each σ within this range, the algorithm also computes the corresponding burst size ρ. Although any pair within this range conforms to the observed behavior, the choice of a particular (σ, ρ) pair has important practical implications.

Since the overbooking tolerance O for the capsule is given, we can use O to choose a particular (σ, ρ) pair. To illustrate, if O = 0.05, the capsule's needs must be met 95% of the time, which can be achieved by reserving resources corresponding to the 95th percentile of the usage distribution. Consequently, a good policy for shared hosting platforms is to pick a σ that corresponds to the (1 − O) · 100th percentile of the resource usage distribution U, and to pick the corresponding ρ as computed by the above algorithm. This ensures that we provision resources based on a high percentile of the capsule's needs, and that this percentile is chosen based on the specified overbooking tolerance O.

2.4 Profiling Server Applications: Experimental Results

In this section, we profile several commonly used server applications to illustrate the process of deriving an application's QoS requirements. Our experimentally derived profiles not only illustrate the inherent nature of various server applications but also demonstrate the utility and benefits of resource overbooking in shared hosting platforms.

The test bed for our profiling experiments consists of a cluster of five Dell PowerEdge 1550 servers, each with a 966 MHz Pentium III processor and 512 MB of memory, running Red Hat Linux 7.0. All servers run version 2.2.17 of the Linux kernel patched with the Linux trace toolkit version 0.9.5, and are connected by 100 Mbps Ethernet links to a Dell PowerConnect (model no. 5012) Ethernet switch.

To profile an application, we run it on one of our servers and use the remaining servers to generate the workload for profiling. We assume that all machines are lightly loaded and that all non-essential system services (e.g., mail services, the X windows server) are turned off to prevent interference during profiling. We profile the following server applications in our experiments:

- Apache web server: We use the SPECWeb99 benchmark [25] to generate the workload for the Apache web server (version 1.3.24). The SPECWeb benchmark allows control along two dimensions—the number of concurrent clients and the percentage of dynamic (cgi-bin) HTTP requests.
  We vary both parameters to study their impact on Apache's resource needs.

- MPEG streaming media server: We use a home-grown streaming server to stream MPEG-1 video files to multiple concurrent clients over UDP. Each client in our experiment requests a 15-minute-long variable bit rate MPEG-1 video with a mean bit rate of 1.5 Mb/s. We vary the number of concurrent clients and study its impact on the resource usage at the server.

- Quake game server: We use the publicly available Linux Quake server to understand the resource usage of a multi-player game server; our experiments use the standard version of Quake I—a popular multi-player game on the Internet. The client workload is generated using a bot—an autonomous software program that emulates a human player. We use the publicly available "terminator" bot to emulate each player; we vary the number of concurrent players connected to the server and study its impact on the resource usage.

- PostgreSQL database server: We profile the postgreSQL database server (version 7.2.1) using the pgbench 1.2 benchmark. This benchmark is part of the postgreSQL distribution and emulates the TPC-B transactional benchmark [19]. The benchmark provides control over the number of concurrent clients as well as the number of transactions performed by each client. We vary both parameters and study their impact on the resource usage of the database server.

We now present some results from our profiling study.

Figure 4: Profile of the Apache web server using the default SPECWeb99 configuration. (a) Probability distribution (PDF). (b) Cumulative distribution function (CDF). (c) Valid token bucket pairs: burst size ρ (msec) versus rate σ (fraction of CPU).

Figure 4(a) depicts the CPU usage distribution of the Apache web server obtained using the default settings of the SPECWeb99 benchmark (50 concurrent clients, 30% dynamic cgi-bin requests). Figure 4(b) plots the corresponding cumulative distribution function (CDF) of the resource usage. As shown in the figure (and summarized in Table 1), the worst-case CPU usage (the 100th percentile profile) is 25% of CPU capacity. Further, the 99th and 95th percentiles of CPU usage are 10% and 4% of capacity, respectively. These results indicate that CPU usage is bursty in nature and that the worst-case requirements are significantly higher than a high percentile of the usage. Consequently, under-provisioning (i.e., overbooking) by a mere 1% reduces the CPU requirements of Apache by a factor of 2.5, while overbooking by 5% yields a factor of 6.25 reduction (implying that 2.5 and 6.25 times as many web servers can be supported when provisioning based on the 99th and 95th percentiles, respectively, instead of the 100th percentile profile). Thus, even small amounts of overbooking can potentially yield significant increases in platform capacity. Figure 4(c) depicts the possible valid (σ, ρ) pairs for Apache's CPU usage. Depending on the specified overbooking tolerance O, we can set σ to an appropriate percentile of the usage distribution U, and the corresponding ρ can then be chosen using this figure.

Figures 5(a)-(d) depict the CPU or network bandwidth distributions, as appropriate, for various server applications. Specifically, the figure shows the usage distribution for the Apache web server with 50% dynamic SPECWeb requests, the streaming media server with 20 concurrent clients, the Quake game server with 4 clients, and the postgreSQL server with 10 clients. Table 1 summarizes our results and also presents profiles for several additional scenarios (only a small subset of the three dozen profiles obtained from our experiments is presented due to space constraints). Table 1 also lists the worst-case resource needs as well as the 99th and 95th percentiles of the resource usage.

Together, Figure 5 and Table 1 demonstrate that all server applications exhibit burstiness in their resource usage, albeit to different degrees. This burstiness causes the worst-case resource needs to be significantly higher than a high percentile of the usage distribution. Consequently, we find that the 99th percentile is smaller by a factor of 1.1-2.5, while the 95th percentile yields a factor of 1.3-6.25 reduction, when compared to the 100th percentile. Together, these results illustrate the potential gains that can be realized by overbooking resources in shared hosting platforms.
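The percentile-based gains described above are easy to reproduce mechanically. A minimal sketch, using our own nearest-rank percentile rule and a toy trace of per-interval busy fractions (both are assumptions that mirror, rather than reproduce, the Apache numbers):

```python
import math

# Sketch of percentile-based provisioning: derive the usage distribution from
# per-interval busy fractions and pick the rate at the (1 - O) * 100th
# percentile. Our own illustration, not the paper's code.

def percentile(busy_fractions, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of observed usage."""
    s = sorted(busy_fractions)
    rank = max(1, math.ceil(len(s) * p / 100))
    return s[rank - 1]

def provisioned_rate(busy_fractions, overbooking_tolerance):
    """Provision at the (1 - O) * 100th percentile of the usage distribution."""
    return percentile(busy_fractions, (1 - overbooking_tolerance) * 100)

# Toy bursty trace: mostly ~4% busy, occasionally 10%, with a 25% worst case.
trace = [0.04] * 95 + [0.10] * 4 + [0.25]
worst = provisioned_rate(trace, 0.0)    # 100th percentile -> 0.25
p99   = provisioned_rate(trace, 0.01)   # 99th percentile  -> 0.10
p95   = provisioned_rate(trace, 0.05)   # 95th percentile  -> 0.04
# Overbooking by 1% and 5% cuts the reservation by 2.5x and 6.25x respectively.
```

The toy trace makes the burstiness argument visible: the worst case is dominated by a single interval, so discarding even 1% of the distribution's tail shrinks the reservation substantially.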
¨¦¤¢ 
                                                                                                                                                                                                                                                                                                                                                            § ¥ £ ¡
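The percentile-based provisioning described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: the helper names (`quantile`, `provision`) and the toy usage profile (chosen to mirror the default Apache numbers) are our own.

```python
# Sketch: deriving percentile-based capacity requirements from a
# measured usage profile. An overbooking tolerance O corresponds to
# provisioning at the (1 - O)-quantile instead of the worst case.

def quantile(samples, q):
    """Empirical quantile of a list of usage samples (0 <= q <= 1)."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(q * (len(s) - 1)))
    return s[idx]

def provision(samples, tolerance):
    """Fraction of capacity to reserve given overbooking tolerance O."""
    return quantile(samples, 1.0 - tolerance)

# Toy usage profile: mostly idle with occasional bursts (100 samples,
# expressed as fractions of CPU capacity).
profile = [0.04] * 95 + [0.10] * 4 + [0.25]

worst_case = provision(profile, 0.0)   # 100th percentile -> 0.25
p99 = provision(profile, 0.01)         # 99th percentile  -> 0.10
p95 = provision(profile, 0.05)         # 95th percentile  -> 0.04
```

For this profile, tolerating overbooking 1% of the time cuts the reservation by a factor of 2.5, and tolerating it 5% of the time by a factor of 6.25, matching the kind of reductions reported in Table 1.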
[Figure 5: Profiles of Various Server Applications. The four panels plot probability distributions of resource usage: (a) Apache: dynamic requests (Apache Web Server, 50% cgi-bin), (b) Streaming media server (20 clients), (c) Quake Server (4 clients), and (d) PostgreSQL Server (10 clients); the x-axes show the fraction of CPU or network bandwidth used.]

Application        Resource    Res. usage at percentile      (C, T) for O=0.01
                               100th     99th      95th
WS, default        CPU         0.25      0.10      0.04      (0.10, 0.218)
WS, 50% dynamic    CPU         0.69      0.29      0.12      (0.29, 0.382)
SMS, k=4           Net         0.19      0.16      0.11      (0.16, 1.89)
SMS, k=20          Net         0.63      0.49      0.43      (0.49, 6.27)
GS, k=2            CPU         0.011     0.010     0.009     (0.010, 0.00099)
GS, k=4            CPU         0.018     0.016     0.014     (0.016, 0.00163)
DBS, k=1 (def)     CPU         0.33      0.27      0.20      (0.27, 0.184)
DBS, k=10          CPU         0.85      0.81      0.79      (0.81, 0.130)

Table 1: Summary of profiles. Although we profiled both CPU and network usage for each application, we only present results for the more constraining resource due to space constraints. Abbreviations: WS=Apache, SMS=streaming media server, GS=Quake game server, DBS=database server, k=num. clients.

3   Resource Overbooking and Capsule Placement in Hosting Platforms

Having derived the QoS requirements of each capsule, the next step is to determine which platform node will run each capsule. Several considerations arise when making such placement decisions. First, since platform resources are being overbooked, the platform should ensure that the QoS requirements of a capsule will be met even in the presence of overbooking. Second, since multiple nodes may have the resources necessary to house each application capsule, the platform will need to pick a specific mapping from the set of feasible mappings. This choice will be constrained by issues such as trust among competing applications. In this section, we present techniques for overbooking platform resources in a controlled manner. The aim is to ensure that: (i) the QoS requirements of the application are satisfied and (ii) overbooking tolerances as well as external policy constraints are taken into account while making placement decisions.

3.1   Resource Overbooking Techniques

A platform node can accept a new application capsule so long as the resource requirements of existing capsules are not violated, and sufficient unused resources exist to meet the requirements of the new capsule. However, if the node resources are overbooked, another requirement is added: the overbooking tolerances of individual capsules already placed on the node should not be exceeded as a result of accepting the new capsule. Verifying these conditions involves two tests:

Test 1: Resource requirements of the new and existing capsules can be met. To verify that a node can meet the requirements of all capsules, we simply sum the requirements of individual capsules and ensure that the aggregate requirements do not exceed node capacity. For each capsule i on the node, we use its QoS parameters (C_i, T_i) and require that the capsule be allocated (C_i / T_i) * tau resources in each interval of duration tau. Further, since the capsule has an overbooking tolerance O_i, in the worst case, the node can allocate only (1 - O_i) * (C_i / T_i) * tau resources and yet satisfy the capsule needs (thus, the overbooking tolerance represents the fraction by which the allocation may be reduced if the node saturates due to overbooking). Consequently, even in the worst case scenario, the resource requirements of all capsules can be met so long as the total resource requirements do not exceed the capacity:

    sum_{i=1}^{k+1} (1 - O_i) * (C_i / T_i) * tau  <=  R * tau        (1)

where R denotes the CPU or network interface capacity on the node, k denotes the number of existing capsules on the node, k+1 is the new capsule, and tau = min(T_1, T_2, ..., T_{k+1}) is the period of the capsule that desires the most stringent guarantees.

Test 2: Overbooking tolerances of all capsules are met. The overbooking tolerance of a capsule is met only if the total amount of overbooking is smaller than its specified tolerance. To compute the aggregate overbooking on a node, we must first compute the total resource usage on the node. Since the usage distributions U_i of individual capsules are known, the total resource usage on a node is simply the sum of the individual usages. That is, Y = sum_i U_i, where Y denotes the aggregate resource usage distribution on the node. Assuming each U_i is independent, the resulting distribution Y can be computed from elementary probability theory.[1] Given the total resource usage distribution Y, the probability that the total demand exceeds the

[1] This is done using the z-transform. The z-transform of a random variable U is the polynomial A(z) = a_0 + a_1*z + a_2*z^2 + ..., where the coefficient a_i of the z^i term represents the probability that the random variable equals i (i.e., U = i). If U_1, U_2, ..., U_k are independent random variables, the z-transform of their sum is the product of their individual z-transforms.
 
Flexible channel allocation using best Secondary user detection algorithm
Flexible channel allocation using best Secondary user detection algorithmFlexible channel allocation using best Secondary user detection algorithm
Flexible channel allocation using best Secondary user detection algorithmijsrd.com
 
New York City and Baltimore Semantic Web Meetups 20130221/20120226
New York City and Baltimore Semantic Web Meetups 20130221/20120226New York City and Baltimore Semantic Web Meetups 20130221/20120226
New York City and Baltimore Semantic Web Meetups 20130221/201202263 Round Stones
 
Rate adaptive resource allocation in ofdma using bees algorithm
Rate adaptive resource allocation in ofdma using bees algorithmRate adaptive resource allocation in ofdma using bees algorithm
Rate adaptive resource allocation in ofdma using bees algorithmeSAT Publishing House
 

What's hot (19)

H03401049054
H03401049054H03401049054
H03401049054
 
A REVIEW OF MEMORY UTILISATION AND MANAGEMENT RELATED ISSUES IN WIRELESS SENS...
A REVIEW OF MEMORY UTILISATION AND MANAGEMENT RELATED ISSUES IN WIRELESS SENS...A REVIEW OF MEMORY UTILISATION AND MANAGEMENT RELATED ISSUES IN WIRELESS SENS...
A REVIEW OF MEMORY UTILISATION AND MANAGEMENT RELATED ISSUES IN WIRELESS SENS...
 
A DENIAL OF SERVICE STRATEGY TO ORCHESTRATE STEALTHY ATTACK PATTERNS IN CLOUD...
A DENIAL OF SERVICE STRATEGY TO ORCHESTRATE STEALTHY ATTACK PATTERNS IN CLOUD...A DENIAL OF SERVICE STRATEGY TO ORCHESTRATE STEALTHY ATTACK PATTERNS IN CLOUD...
A DENIAL OF SERVICE STRATEGY TO ORCHESTRATE STEALTHY ATTACK PATTERNS IN CLOUD...
 
Adaptive middleware of context aware application in smart homes
Adaptive middleware of context aware application in smart homesAdaptive middleware of context aware application in smart homes
Adaptive middleware of context aware application in smart homes
 
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENTDYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
DYNAMIC TASK SCHEDULING BASED ON BURST TIME REQUIREMENT FOR CLOUD ENVIRONMENT
 
Hm2413291336
Hm2413291336Hm2413291336
Hm2413291336
 
Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...
Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...
Fault Tolerance in Big Data Processing Using Heartbeat Messages and Data Repl...
 
Quality of service parameter centric resource allocation for lte advanced
Quality of service parameter centric resource allocation for lte advancedQuality of service parameter centric resource allocation for lte advanced
Quality of service parameter centric resource allocation for lte advanced
 
Quality of service parameter centric resource allocation for lte advanced
Quality of service parameter centric resource allocation for lte advancedQuality of service parameter centric resource allocation for lte advanced
Quality of service parameter centric resource allocation for lte advanced
 
Enhancement of qos in lte downlink systems using
Enhancement of qos in lte downlink systems usingEnhancement of qos in lte downlink systems using
Enhancement of qos in lte downlink systems using
 
presentation on reducing Cost in Cloud Computing
 presentation on reducing Cost in Cloud Computing presentation on reducing Cost in Cloud Computing
presentation on reducing Cost in Cloud Computing
 
Survey on caching and replication algorithm for content distribution in peer ...
Survey on caching and replication algorithm for content distribution in peer ...Survey on caching and replication algorithm for content distribution in peer ...
Survey on caching and replication algorithm for content distribution in peer ...
 
17 51-1-pb
17 51-1-pb17 51-1-pb
17 51-1-pb
 
International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...
 
Transparent Caching of Virtual Stubs for Improved Performance in Ubiquitous E...
Transparent Caching of Virtual Stubs for Improved Performance in Ubiquitous E...Transparent Caching of Virtual Stubs for Improved Performance in Ubiquitous E...
Transparent Caching of Virtual Stubs for Improved Performance in Ubiquitous E...
 
IRJET- HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
IRJET-  	  HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...IRJET-  	  HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
IRJET- HHH- A Hyped-up Handling of Hadoop based SAMR-MST for DDOS Attacks...
 
Flexible channel allocation using best Secondary user detection algorithm
Flexible channel allocation using best Secondary user detection algorithmFlexible channel allocation using best Secondary user detection algorithm
Flexible channel allocation using best Secondary user detection algorithm
 
New York City and Baltimore Semantic Web Meetups 20130221/20120226
New York City and Baltimore Semantic Web Meetups 20130221/20120226New York City and Baltimore Semantic Web Meetups 20130221/20120226
New York City and Baltimore Semantic Web Meetups 20130221/20120226
 
Rate adaptive resource allocation in ofdma using bees algorithm
Rate adaptive resource allocation in ofdma using bees algorithmRate adaptive resource allocation in ofdma using bees algorithm
Rate adaptive resource allocation in ofdma using bees algorithm
 

Similar to Resource Overbooking and Application Profiling in Shared ...

Hybrid Based Resource Provisioning in Cloud
Hybrid Based Resource Provisioning in CloudHybrid Based Resource Provisioning in Cloud
Hybrid Based Resource Provisioning in CloudEditor IJCATR
 
A survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environmentA survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environmenteSAT Journals
 
A survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environmentA survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environmenteSAT Publishing House
 
B02120307013
B02120307013B02120307013
B02120307013theijes
 
Allocation Strategies of Virtual Resources in Cloud-Computing Networks
Allocation Strategies of Virtual Resources in Cloud-Computing NetworksAllocation Strategies of Virtual Resources in Cloud-Computing Networks
Allocation Strategies of Virtual Resources in Cloud-Computing NetworksIJERA Editor
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
IRJET- Improving Data Availability by using VPC Strategy in Cloud Environ...
IRJET-  	  Improving Data Availability by using VPC Strategy in Cloud Environ...IRJET-  	  Improving Data Availability by using VPC Strategy in Cloud Environ...
IRJET- Improving Data Availability by using VPC Strategy in Cloud Environ...IRJET Journal
 
An Efficient Queuing Model for Resource Sharing in Cloud Computing
	An Efficient Queuing Model for Resource Sharing in Cloud Computing	An Efficient Queuing Model for Resource Sharing in Cloud Computing
An Efficient Queuing Model for Resource Sharing in Cloud Computingtheijes
 
IRJET- Dynamic Resource Allocation of Heterogeneous Workload in Cloud
IRJET- Dynamic Resource Allocation of Heterogeneous Workload in CloudIRJET- Dynamic Resource Allocation of Heterogeneous Workload in Cloud
IRJET- Dynamic Resource Allocation of Heterogeneous Workload in CloudIRJET Journal
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD Editor
 
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET Journal
 
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTINGGROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTINGAIRCC Publishing Corporation
 
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTINGGROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTINGijcsit
 
IRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas Clouds
IRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas CloudsIRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas Clouds
IRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas CloudsIRJET Journal
 
A STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENT
A STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENTA STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENT
A STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENTpharmaindexing
 
A Virtual Machine Resource Management Method with Millisecond Precision
A Virtual Machine Resource Management Method with Millisecond PrecisionA Virtual Machine Resource Management Method with Millisecond Precision
A Virtual Machine Resource Management Method with Millisecond PrecisionIRJET Journal
 
Survey on Dynamic Resource Allocation Strategy in Cloud Computing Environment
Survey on Dynamic Resource Allocation Strategy in Cloud Computing EnvironmentSurvey on Dynamic Resource Allocation Strategy in Cloud Computing Environment
Survey on Dynamic Resource Allocation Strategy in Cloud Computing EnvironmentEditor IJCATR
 

Similar to Resource Overbooking and Application Profiling in Shared ... (20)

B03410609
B03410609B03410609
B03410609
 
Hybrid Based Resource Provisioning in Cloud
Hybrid Based Resource Provisioning in CloudHybrid Based Resource Provisioning in Cloud
Hybrid Based Resource Provisioning in Cloud
 
A survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environmentA survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environment
 
A survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environmentA survey on various resource allocation policies in cloud computing environment
A survey on various resource allocation policies in cloud computing environment
 
B02120307013
B02120307013B02120307013
B02120307013
 
Allocation Strategies of Virtual Resources in Cloud-Computing Networks
Allocation Strategies of Virtual Resources in Cloud-Computing NetworksAllocation Strategies of Virtual Resources in Cloud-Computing Networks
Allocation Strategies of Virtual Resources in Cloud-Computing Networks
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
CPET- Project Report
CPET- Project ReportCPET- Project Report
CPET- Project Report
 
IRJET- Improving Data Availability by using VPC Strategy in Cloud Environ...
IRJET-  	  Improving Data Availability by using VPC Strategy in Cloud Environ...IRJET-  	  Improving Data Availability by using VPC Strategy in Cloud Environ...
IRJET- Improving Data Availability by using VPC Strategy in Cloud Environ...
 
An Efficient Queuing Model for Resource Sharing in Cloud Computing
	An Efficient Queuing Model for Resource Sharing in Cloud Computing	An Efficient Queuing Model for Resource Sharing in Cloud Computing
An Efficient Queuing Model for Resource Sharing in Cloud Computing
 
IRJET- Dynamic Resource Allocation of Heterogeneous Workload in Cloud
IRJET- Dynamic Resource Allocation of Heterogeneous Workload in CloudIRJET- Dynamic Resource Allocation of Heterogeneous Workload in Cloud
IRJET- Dynamic Resource Allocation of Heterogeneous Workload in Cloud
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
 
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTINGGROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
 
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTINGGROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
GROUP BASED RESOURCE MANAGEMENT AND PRICING MODEL IN CLOUD COMPUTING
 
IRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas Clouds
IRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas CloudsIRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas Clouds
IRJET- Efficient Resource Allocation for Heterogeneous Workloads in Iaas Clouds
 
A STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENT
A STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENTA STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENT
A STUDY ON JOB SCHEDULING IN CLOUD ENVIRONMENT
 
A Virtual Machine Resource Management Method with Millisecond Precision
A Virtual Machine Resource Management Method with Millisecond PrecisionA Virtual Machine Resource Management Method with Millisecond Precision
A Virtual Machine Resource Management Method with Millisecond Precision
 
Survey on Dynamic Resource Allocation Strategy in Cloud Computing Environment
Survey on Dynamic Resource Allocation Strategy in Cloud Computing EnvironmentSurvey on Dynamic Resource Allocation Strategy in Cloud Computing Environment
Survey on Dynamic Resource Allocation Strategy in Cloud Computing Environment
 
Availability in Cloud Computing
Availability in Cloud ComputingAvailability in Cloud Computing
Availability in Cloud Computing
 

More from webhostingguy

Running and Developing Tests with the Apache::Test Framework
Running and Developing Tests with the Apache::Test FrameworkRunning and Developing Tests with the Apache::Test Framework
Running and Developing Tests with the Apache::Test Frameworkwebhostingguy
 
MySQL and memcached Guide
MySQL and memcached GuideMySQL and memcached Guide
MySQL and memcached Guidewebhostingguy
 
Novell® iChain® 2.3
Novell® iChain® 2.3Novell® iChain® 2.3
Novell® iChain® 2.3webhostingguy
 
Load-balancing web servers Load-balancing web servers
Load-balancing web servers Load-balancing web serversLoad-balancing web servers Load-balancing web servers
Load-balancing web servers Load-balancing web serverswebhostingguy
 
SQL Server 2008 Consolidation
SQL Server 2008 ConsolidationSQL Server 2008 Consolidation
SQL Server 2008 Consolidationwebhostingguy
 
Master Service Agreement
Master Service AgreementMaster Service Agreement
Master Service Agreementwebhostingguy
 
PHP and MySQL PHP Written as a set of CGI binaries in C in ...
PHP and MySQL PHP Written as a set of CGI binaries in C in ...PHP and MySQL PHP Written as a set of CGI binaries in C in ...
PHP and MySQL PHP Written as a set of CGI binaries in C in ...webhostingguy
 
Dell Reference Architecture Guide Deploying Microsoft® SQL ...
Dell Reference Architecture Guide Deploying Microsoft® SQL ...Dell Reference Architecture Guide Deploying Microsoft® SQL ...
Dell Reference Architecture Guide Deploying Microsoft® SQL ...webhostingguy
 
Managing Diverse IT Infrastructure
Managing Diverse IT InfrastructureManaging Diverse IT Infrastructure
Managing Diverse IT Infrastructurewebhostingguy
 
Web design for business.ppt
Web design for business.pptWeb design for business.ppt
Web design for business.pptwebhostingguy
 
IT Power Management Strategy
IT Power Management Strategy IT Power Management Strategy
IT Power Management Strategy webhostingguy
 
Excel and SQL Quick Tricks for Merchandisers
Excel and SQL Quick Tricks for MerchandisersExcel and SQL Quick Tricks for Merchandisers
Excel and SQL Quick Tricks for Merchandiserswebhostingguy
 
Parallels Hosting Products
Parallels Hosting ProductsParallels Hosting Products
Parallels Hosting Productswebhostingguy
 
Microsoft PowerPoint presentation 2.175 Mb
Microsoft PowerPoint presentation 2.175 MbMicrosoft PowerPoint presentation 2.175 Mb
Microsoft PowerPoint presentation 2.175 Mbwebhostingguy
 

More from webhostingguy (20)

File Upload
File UploadFile Upload
File Upload
 
Running and Developing Tests with the Apache::Test Framework
Running and Developing Tests with the Apache::Test FrameworkRunning and Developing Tests with the Apache::Test Framework
Running and Developing Tests with the Apache::Test Framework
 
MySQL and memcached Guide
MySQL and memcached GuideMySQL and memcached Guide
MySQL and memcached Guide
 
Novell® iChain® 2.3
Novell® iChain® 2.3Novell® iChain® 2.3
Novell® iChain® 2.3
 
Load-balancing web servers Load-balancing web servers
Load-balancing web servers Load-balancing web serversLoad-balancing web servers Load-balancing web servers
Load-balancing web servers Load-balancing web servers
 
SQL Server 2008 Consolidation
SQL Server 2008 ConsolidationSQL Server 2008 Consolidation
SQL Server 2008 Consolidation
 
What is mod_perl?
What is mod_perl?What is mod_perl?
What is mod_perl?
 
What is mod_perl?
What is mod_perl?What is mod_perl?
What is mod_perl?
 
Master Service Agreement
Master Service AgreementMaster Service Agreement
Master Service Agreement
 
Notes8
Notes8Notes8
Notes8
 
PHP and MySQL PHP Written as a set of CGI binaries in C in ...
PHP and MySQL PHP Written as a set of CGI binaries in C in ...PHP and MySQL PHP Written as a set of CGI binaries in C in ...
PHP and MySQL PHP Written as a set of CGI binaries in C in ...
 
Dell Reference Architecture Guide Deploying Microsoft® SQL ...
Dell Reference Architecture Guide Deploying Microsoft® SQL ...Dell Reference Architecture Guide Deploying Microsoft® SQL ...
Dell Reference Architecture Guide Deploying Microsoft® SQL ...
 
Managing Diverse IT Infrastructure
Managing Diverse IT InfrastructureManaging Diverse IT Infrastructure
Managing Diverse IT Infrastructure
 
Web design for business.ppt
Web design for business.pptWeb design for business.ppt
Web design for business.ppt
 
IT Power Management Strategy
IT Power Management Strategy IT Power Management Strategy
IT Power Management Strategy
 
Excel and SQL Quick Tricks for Merchandisers
Excel and SQL Quick Tricks for MerchandisersExcel and SQL Quick Tricks for Merchandisers
Excel and SQL Quick Tricks for Merchandisers
 
OLUG_xen.ppt
OLUG_xen.pptOLUG_xen.ppt
OLUG_xen.ppt
 
Parallels Hosting Products
Parallels Hosting ProductsParallels Hosting Products
Parallels Hosting Products
 
Microsoft PowerPoint presentation 2.175 Mb
Microsoft PowerPoint presentation 2.175 MbMicrosoft PowerPoint presentation 2.175 Mb
Microsoft PowerPoint presentation 2.175 Mb
 
Reseller's Guide
Reseller's GuideReseller's Guide
Reseller's Guide
 

Resource Overbooking and Application Profiling in Shared ...

Abstract

In this paper, we present techniques for provisioning CPU and network resources in shared hosting platforms running potentially antagonistic third-party applications. The primary contribution of our work is to demonstrate the feasibility and benefits of overbooking resources in shared platforms. Since an accurate estimate of an application's resource needs is necessary when overbooking resources, we present techniques to profile applications on dedicated nodes, possibly while in service, and use these profiles to guide the placement of application components onto shared nodes. We then propose techniques to overbook cluster resources in a controlled fashion such that the platform can provide performance guarantees to applications even when overbooked. We show how our techniques can be combined with commonly used QoS resource allocation mechanisms to provide application isolation and performance guarantees at run-time. We implement our techniques in a Linux cluster and evaluate them using common server applications. We find that the efficiency (and consequently revenue) benefits from controlled overbooking of resources can be dramatic. Specifically, we find that by overbooking resources by as little as 1%, we can increase the utilization of the cluster by a factor of two, and a 5% overbooking yields a 300-500% improvement, while still providing useful resource guarantees to applications.

1 Introduction and Motivation

Server clusters built using commodity hardware and software are an increasingly attractive alternative to traditional large multiprocessor servers for many applications, in part due to rapid advances in computing technologies and falling hardware prices.

This paper addresses challenges in the design of a particular type of server cluster we call shared hosting platforms. These can be contrasted with dedicated hosting platforms, where either the entire cluster runs a single application (such as a web search engine), or each individual processing element in the cluster is dedicated to a single application (as in the "managed hosting" services provided by some data centers). In contrast, shared hosting platforms run a large number of different third-party applications (web servers, streaming media servers, multi-player game servers, e-commerce applications, etc.), and the number of applications typically exceeds the number of nodes in the cluster. More specifically, each application runs on a subset of the nodes, and these subsets can overlap with one another. Whereas dedicated hosting platforms are used for many niche applications that warrant their additional cost, considerations of space, power, cooling, and cost make shared hosting platforms an attractive choice for many application hosting environments.

Shared hosting platforms imply a business relationship between the platform provider and application providers: the latter pay the former for resources on the platform. In return, the platform provider gives some kind of guarantee of resource availability to applications [20].

Perhaps the central challenge in building such a shared hosting platform is resource management: the ability to reserve resources for individual applications, the ability to isolate applications from other misbehaving or overloaded applications, and the ability to provide performance guarantees to applications. Arguably, the widespread deployment of shared hosting platforms has been hampered by the lack of effective resource management mechanisms that meet these requirements. Consequently, most hosting platforms in use today adopt one of two approaches.

The first avoids resource sharing altogether by employing a dedicated model. This delivers useful resources to application providers, but is expensive in machine resources. The second approach is to share resources in a best-effort manner among applications, which consequently receive no resource guarantees. While this is cheap in resources, the value delivered to application providers is limited. Consequently, both approaches imply an economic disincentive to deploy viable hosting platforms.

Some recent research efforts have proposed resource management mechanisms for shared hosting platforms [2, 7, 29]. While these efforts take an initial step towards the design of effective shared hosting platforms, many challenges remain to be addressed. This is the context of the present work.

[Footnote: Portions of this research were done when Timothy Roscoe was a researcher at Sprint ATL and Bhuvan Urgaonkar was a summer intern at Sprint ATL. This research was supported in part by NSF grants CCR-9984030 and EIA-0080119 and a gift from Sprint Corporation.]
1.1 Research Contributions

The contribution of this paper is threefold. First, we show how the resource requirements of an application can be derived using online profiling and modeling. Second, we demonstrate the efficiency benefits to the platform provider of overbooking resources on the platform, and show how this can be done without adversely impacting the guarantees offered to application providers. Third, we show how untrusted and/or mutually antagonistic applications on the platform can be isolated from one another.

Automatic derivation of QoS requirements

Recent work on resource management mechanisms for clusters (e.g., [2, 29]) implicitly assumes that the resource requirements of an application are either known in advance or can be derived, but does not specifically address the problem of how to determine these requirements. However, the effectiveness of these techniques depends crucially on the ability to reserve an appropriate amount of resources for each application: overestimating an application's resource needs results in idling of resources, while underestimating them can degrade application performance.

A shared hosting platform can significantly enhance its utility to users by automatically deriving the QoS requirements of an application. Automatic derivation of QoS requirements involves (i) monitoring an application's resource usage, and (ii) using these statistics to derive QoS requirements that conform to the observed behavior.

In this paper, we employ kernel-based profiling mechanisms to empirically monitor an application's resource usage and propose techniques to derive QoS requirements from this observed behavior. We then use these techniques to experimentally profile several server applications, such as web, streaming, game, and database servers. Our results show that the bursty resource usage of server applications makes it feasible to extract statistical multiplexing gains by overbooking resources on the hosting platform.

Revenue maximization through overbooking

The goal of the owner of a hosting platform is to maximize revenue, which implies that the cluster should strive to maximize the number of applications that can be housed on a given hardware configuration. One approach is to overbook resources, a technique routinely used to maximize yield in airline reservation systems [24].

Provisioning cluster resources solely on the basis of an application's worst-case needs results in low average utilization, since the average resource requirements of an application are typically smaller than its worst-case (peak) requirements, and resources tend to idle when the application does not utilize its peak reserved share. In contrast, provisioning a cluster based on a high percentile of the application's needs yields statistical multiplexing gains that significantly increase the average utilization of the cluster at the expense of a small amount of overbooking, and increases the number of applications that can be supported on a given hardware configuration.

A well-designed hosting platform should be able to provide performance guarantees to applications even when overbooked, with the proviso that the guarantee is now probabilistic instead of deterministic (for instance, an application might be provided a 99% guarantee, i.e., a 0.99 probability, that its resource needs will be met). Since different applications have different tolerance to such overbooking (e.g., the latency requirements of a game server make it less tolerant to violations of performance guarantees than a web server), an overbooking mechanism should take diverse application needs into account.

The primary contribution of this paper is to demonstrate the feasibility and benefits of overbooking resources in shared hosting platforms. We propose techniques to overbook (i.e., under-provision) resources in a controlled fashion based on application resource needs. Although such overbooking can result in transient overloads, where the aggregate resource demand temporarily exceeds capacity, our techniques limit transient overloads to predictably rare occasions and provide useful performance guarantees to applications in the presence of overbooking.

The techniques we describe are general enough to work with many commonly used OS resource allocation mechanisms. Experimental results demonstrate that overbooking resources by amounts as small as 1% yields a factor-of-two increase in the number of applications supported by a given platform configuration, while a 5-10% overbooking yields a 300-500% increase in effective platform capacity. In general, we find that the burstier an application's resource needs, the greater the benefits of resource overbooking. We also find that collocating CPU-bound and network-bound applications, as well as bursty and non-bursty applications, yields additional multiplexing gains when overbooking resources.
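To make percentile-based provisioning concrete, the sketch below (our illustration, not the paper's implementation; the function name and trace data are invented) derives a CPU reservation from an observed usage trace. Provisioning at a high percentile of the empirical demand distribution, rather than at the peak, trades a bounded overbooking tolerance for a much smaller reservation:

```python
# Illustrative sketch, not the paper's implementation: derive a CPU
# reservation from a measured usage trace by provisioning at a high
# percentile of the empirical demand distribution rather than the peak.
# `overbooking_tolerance` is the fraction of measurement intervals in
# which demand is allowed to exceed the reservation.

def derive_cpu_share(usage, overbooking_tolerance):
    """usage: per-interval CPU utilizations in [0, 1].
    Returns the smallest observed share that covers demand in at least
    a (1 - overbooking_tolerance) fraction of the intervals."""
    ordered = sorted(usage)
    # Number of intervals allowed to overflow the reservation.
    allowed = int(round(overbooking_tolerance * len(ordered)))
    return ordered[len(ordered) - 1 - allowed]

# A bursty application: 99 intervals at 10% CPU, one spike to 90%.
trace = [0.10] * 99 + [0.90]

peak_share = derive_cpu_share(trace, 0.0)   # worst-case provisioning -> 0.90
p99_share = derive_cpu_share(trace, 0.01)   # 99% probabilistic guarantee -> 0.10

print(peak_share, p99_share)
```

With a 1% tolerance the reservation tracks the application's typical demand rather than its peak, which is the source of the statistical multiplexing gain quantified above.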
Placement and isolation of antagonistic applications

In a shared hosting platform, it is assumed that third-party applications may be antagonistic to each other and/or to the platform itself, either through malice or bugs. A hosting platform should address these issues by isolating applications from one another and preventing malicious, misbehaving, or overloaded applications from affecting the performance of other applications.

A third contribution of our work is to demonstrate how untrusted third-party applications can be isolated from one another in shared hosting platforms. This isolation takes two forms. Local to a machine, each processing node in the platform employs resource management techniques that "sandbox" applications by restricting the resources consumed by an application to its reserved share. Globally, the process of placement, whereby components of an application are assigned to individual processing nodes, can be constrained by externally imposed policies, for instance by risk assessments made by the provider about each application, which may prevent an application from being collocated with certain other applications. Since manual placement of applications onto nodes is infeasibly complex in large clusters, the design of automated placement techniques that allow a platform provider to exert sufficient control over the placement process is a key issue.
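A toy greedy placer illustrates how such policy constraints interact with capacity during placement. This is our sketch, not the paper's placement algorithm; the node count, capacities, capsule sizes, and anti-collocation policy below are invented for illustration:

```python
# Toy sketch of policy-constrained placement (illustrative data and
# algorithm, not the paper's): greedily assign capsules to nodes while
# honoring per-node capacity and externally imposed anti-collocation
# policies, e.g. risk assessments that keep two applications apart.

def place(capsules, num_nodes, capacity, forbidden_pairs):
    """capsules: (app_name, required_share) pairs.
    forbidden_pairs: set of frozensets naming apps that must not share a node.
    Returns a per-node list of app names, or None if no placement is found."""
    nodes = [{"load": 0.0, "apps": []} for _ in range(num_nodes)]
    for app, share in capsules:
        for node in nodes:
            collocation_ok = all(frozenset((app, other)) not in forbidden_pairs
                                 for other in node["apps"])
            if collocation_ok and node["load"] + share <= capacity:
                node["load"] += share
                node["apps"].append(app)
                break
        else:
            return None  # this capsule fits on no node
    return [node["apps"] for node in nodes]

# A risk policy forbids collocating the game server with the web server,
# so the game server is pushed to the second node even though it would
# fit on the first by capacity alone.
capsules = [("web", 0.5), ("game", 0.4), ("db", 0.3)]
policy = {frozenset(("web", "game"))}
print(place(capsules, num_nodes=2, capacity=1.0, forbidden_pairs=policy))
```

A real placer would also handle backtracking and load balancing; the point here is only that collocation policy is a hard constraint checked alongside capacity.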
Globally, the process of place- typically smaller than its worst case (peak) requirements, and ment, whereby components of an application are assigned to resources tend to idle when the application does not utilize its individual processing nodes, can be constrained by externally peak reserved share. In contrast, provisioning a cluster based on imposed policies—for instance, by risk assessments made by 2
the provider about each application, which may prevent an application from being collocated with certain other applications. Since a manual placement of applications onto nodes is infeasibly complex in large clusters, the design of automated placement techniques that allow a platform provider to exert sufficient control over the placement process is a key issue.

1.2 System Model and Terminology

The shared hosting platform assumed in our research consists of a cluster of N nodes, each of which consists of processor, memory, and storage resources as well as one or more network interfaces. Platform nodes are allowed to be heterogeneous, with different amounts of these resources on each node. The nodes in the hosting platform are assumed to be interconnected by a high-speed LAN such as gigabit ethernet (see Figure 1).

[Figure 1: Architecture of a shared hosting platform. Each application runs on one or more nodes and shares resources with other applications. The figure shows applications A through H distributed across nodes, each node with CPU and NIC resources, connected by a gigabit ethernet cluster interconnect.]

Each cluster node is assumed to run an operating system kernel that supports some notion of quality of service, such as reservations or shares. Such mechanisms have been extensively studied over the past decade, and many deployed commercial and open-source operating systems such as Solaris [26], IRIX [22], Linux [27], and FreeBSD [5] already support such features. In this paper, we primarily focus on managing two resources—CPU and network interface bandwidth—in shared hosting platforms. The challenges of managing other resources in hosting environments, such as memory and storage, are beyond the scope of this paper. Nevertheless, we believe the techniques developed here are also applicable to these other resources.

We use the term application for a complete service running on behalf of an application provider; since an application will frequently consist of multiple distributed components, we use the term capsule (borrowed from [13]) to refer to that component of an application running on a single node. Each application has at least one capsule, possibly more if the application is distributed. Capsules provide a useful abstraction for logically partitioning an application into sub-components and for exerting control over the distribution of these components onto different nodes. To illustrate, consider an e-commerce application consisting of a web server, a Java application server, and a database server. If all three components need to be collocated on a single node, then the application will consist of a single capsule with all three components. On the other hand, if each component needs to be placed on a different node, then the application should be partitioned into three capsules. Depending on the number of its capsules, each application runs on a subset of the platform nodes, and these subsets can overlap with one another, resulting in resource sharing (see Figure 1).

The rest of this paper is structured as follows. Section 2 discusses techniques for empirically deriving an application's resource needs, while Section 3 discusses our resource overbooking techniques and capsule placement strategies. We discuss implementation issues in Section 4 and present our experimental results in Section 5. Section 6 discusses related work, and finally, Section 7 presents concluding remarks.

2 Automatic Derivation of Application QoS Requirements

The first step in hosting a new application is to derive its resource requirements. While the problem of QoS-aware resource management has been studied extensively in the literature [4, 8, 14, 15], the problem of how much resource to allocate to each application has received relatively little attention. In this section, we address this issue by proposing techniques to automatically derive the QoS requirements of an application (the terms resource requirements and QoS requirements are used interchangeably in this paper). Deriving the QoS requirements is a two-step process: (i) we first use profiling techniques to monitor application behavior, and (ii) we then use our empirical measurements to derive QoS requirements that conform to the observed behavior.

2.1 Application QoS Requirements: Definitions

The QoS requirements of an application are defined on a per-capsule basis. For each capsule, the QoS requirements specify the intrinsic rate of resource usage, the variability in the resource usage, the time period over which the capsule desires resource guarantees, and the level of overbooking that the application (capsule) is willing to tolerate. As explained earlier, in this paper we are concerned with two key resources, namely CPU and network interface bandwidth. For each of these resources, we define the QoS requirements along the above dimensions in an OS-independent manner. In Section 4.1, we show how to map these requirements to various OS-specific resource management mechanisms that have been developed.

More formally, we represent the QoS requirements of an application capsule by a quintuple (σ, ρ, τ, U, O):

Token Bucket Parameters (σ, ρ): We capture the basic resource requirements of a capsule by modeling resource usage as a token bucket (σ, ρ) [28]. The parameter σ denotes the intrinsic rate of resource consumption, while ρ denotes the variability in the resource consumption. More specifically, σ denotes the rate at which the capsule consumes CPU cycles or network interface bandwidth, while
ρ captures the maximum burst size. By definition, a token bucket bounds the resource usage of the capsule to σ·t + ρ over any interval t.

Period τ: The third parameter, τ, denotes the time period over which the capsule desires guarantees on resource availability. Put another way, the system should strive to meet the QoS requirements of the capsule over each interval of length τ. The smaller the value of τ, the more stringent are the desired guarantees (since the capsule needs to be guaranteed resources over a finer time scale). In particular, for the above token bucket parameters, the capsule requires that it be allocated at least σ·τ + ρ resources every τ time units.

Usage Distribution U: While the token bucket parameters succinctly capture the capsule's resource requirements, they are not sufficiently expressive by themselves to denote the QoS requirements in the presence of overbooking. Consequently, we use two additional parameters—U and O—to specify resource requirements in the presence of overbooking. The first parameter, U, denotes the probability distribution of resource usage. Note that U is a more detailed specification of resource usage than the token bucket parameters (σ, ρ), and indicates the probability with which the capsule is likely to use a certain fraction of the resource (i.e., U(x) is the probability that the capsule uses a fraction x of the resource, 0 ≤ x ≤ 1). A probability distribution of resource usage is necessary so that the hosting platform can provide (quantifiable) probabilistic guarantees even in the presence of overbooking.

Overbooking Tolerance O: The parameter O is the overbooking tolerance of the capsule. It specifies the probability with which the capsule's requirements may be violated due to resource overbooking (by providing it with less resources than the required amount). Thus, the overbooking tolerance indicates the minimum level of service that is acceptable to the capsule. To illustrate, if O = 0.01, the capsule's resource requirements should be met 99% of the time (or with a probability of 0.99 in each interval τ).

In general, we assume that the parameters τ and O are specified by the application provider. This may be based on a contract between the platform provider and the application provider (e.g., the more the application provider is willing to pay for resources, the stronger are the provided guarantees), or on the particular characteristics of the application (e.g., a streaming media server requires more stringent guarantees and is less tolerant to violations of these guarantees). In the rest of this section, we show how to derive the remaining three parameters—σ, ρ, and U—using profiling, given values of τ and O.

2.2 Kernel-based Profiling of Resource Usage

Our techniques for empirically deriving the QoS requirements of an application rely on profiling mechanisms that monitor application behavior. Recently, a number of application profiling mechanisms, ranging from OS-kernel-based profiling [1] to run-time profiling using specially linked libraries [23], have been proposed.

We use kernel-based profiling mechanisms in the context of shared hosting platforms for a number of reasons. First, being kernel-based, these mechanisms work with any application and require no changes to the application at the source or binary levels. This is especially important in hosting environments, where the platform provider may have little or no access to third-party applications. Second, accurate estimation of an application's resource needs requires detailed information about when and how much resources are used by the application at a fine time scale. Whereas detailed resource allocation information is difficult to obtain using application-level techniques, kernel-based techniques can provide precise information about various kernel events such as CPU scheduling instances and network packet transmission times.

The profiling process involves running the application on a set of isolated platform nodes (the number of nodes required for profiling depends on the number of capsules). By isolated, we mean that each node runs only the minimum number of system services necessary for executing the application, and no other applications are run on these nodes during the profiling process—such isolation is necessary to minimize interference from unrelated tasks when determining the application's resource usage. The application is then subjected to a realistic workload, and the kernel profiling mechanism is used to track its resource usage. It is important to emphasize that the workload used during profiling should be both realistic and representative of real-world workloads. While techniques for generating such realistic workloads are orthogonal to our current research, we note that a number of different workload-generation techniques exist, ranging from trace replay of actual workloads to running the application in a "live" setting, and from the use of synthetic workload generators to the use of well-known benchmarks. Any such technique suffices for our purpose as long as it realistically emulates real-world conditions, although we note that, from a business perspective, running the application "for real" on an isolated machine to obtain a profile may be preferable to other workload-generation techniques.

We use the Linux trace toolkit as our kernel profiling mechanism [16]. The toolkit provides flexible, low-overhead mechanisms to trace a variety of kernel events such as system call invocations and process, memory, file system, and network operations. The user can specify the kernel events of interest, as well as the processes being profiled, to selectively log events. For our purposes, it is sufficient to monitor the CPU and network activity of capsule processes—we monitor CPU scheduling instances (the time instants at which capsule processes get scheduled and the corresponding quantum durations) as well as network transmission times and packet sizes. Given such a trace of CPU and network activity, we now discuss the derivation of the capsule's QoS requirements.
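The quintuple (σ, ρ, τ, U, O) can be made concrete with a short sketch. This is our own illustration rather than code from the paper; the class and field names are hypothetical, and U is represented as a discretized histogram:

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class CapsuleQoS:
    """One capsule's OS-independent QoS quintuple for a single
    resource (CPU fraction or network interface bandwidth)."""
    sigma: float        # intrinsic rate of resource consumption
    rho: float          # maximum burst size
    tau: float          # period over which guarantees are desired
    U: Sequence[float]  # usage distribution: U[j] = P(usage falls in bin j)
    O: float            # overbooking tolerance (acceptable violation probability)

    def demand_per_period(self) -> float:
        # Token bucket bound: the capsule must be allocated at least
        # sigma*tau + rho resources in every interval of length tau.
        return self.sigma * self.tau + self.rho
```

For example, a capsule with σ = 0.1 (10% of a CPU), ρ = 0.218, and τ = 1.0 second requires up to 0.318 CPU-seconds in any one-second interval.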
[Figure 2: An example of an On-Off trace. The beginnings and ends of CPU quanta or network transmissions partition time into busy (On) periods and idle periods of non-capsule-related activity (Off); the trace is divided into measurement intervals of length I.]

[Figure 3: Derivation of the usage distribution and token bucket parameters. (a) Usage distribution: a histogram of fractional resource usage per measurement interval. (b) Token bucket parameters: a line of slope σ and Y-intercept ρ bounding the cumulative resource usage.]

2.3 Empirical Derivation of the QoS Requirements

We use the trace of kernel events obtained from the profiling process to model CPU and network activity as a simple On-Off process. This is achieved by examining the time at which each event occurs and its duration, and deriving a sequence of busy (On) and idle (Off) periods from this information (see Figure 2). This trace of busy and idle periods can then be used to derive both the resource usage distribution U and the token bucket parameters (σ, ρ).

Determining the usage distribution U: Recall that the usage distribution U denotes the probability with which the capsule uses a certain fraction of the resource. To derive U, we simply partition the trace into measurement intervals of length I and measure the fraction of time for which the capsule was busy in each such interval. This value, which represents the fractional resource usage in that interval, is histogrammed, and then each bucket is normalized with respect to the number of measurement intervals in the trace to obtain the probability distribution U. Figure 3(a) illustrates this process.

Deriving token bucket parameters (σ, ρ): Recall that a token bucket limits the resource usage of a capsule to σ·t + ρ over any interval t. A given On-Off trace can have, in general, many (σ, ρ) pairs that satisfy this bound. To intuitively understand why, let us compute the cumulative resource usage for the capsule over time. The cumulative resource usage is simply the total resource consumption thus far, and is computed by incrementing the cumulative usage after each On period. Thus, the cumulative resource usage is a step function, as depicted in Figure 3(b). Our objective is to find a line σ·t + ρ that bounds the cumulative resource usage; the slope of this line is the token bucket rate σ, and its Y-intercept is the burst size ρ. As shown in Figure 3(b), there are in general many such lines, all of which are valid descriptions of the observed resource usage.

Several algorithms that mechanically compute all valid (σ, ρ) pairs for a given On-Off trace have been proposed recently. We use a variant of one such algorithm [28] in our research—for each On-Off trace, the algorithm produces a range of σ values (i.e., σ_min ≤ σ ≤ σ_max) that constitute valid token bucket rates for the observed behavior. For each σ within this range, the algorithm also computes the corresponding burst size ρ. Although any pair within this range conforms to the observed behavior, the choice of a particular (σ, ρ) pair has important practical implications. Since the overbooking tolerance O for the capsule is given, we can use O to choose a particular pair. To illustrate, if O = 0.05, the capsule's needs must be met 95% of the time, which can be achieved by reserving resources corresponding to the 95th percentile of the usage distribution. Consequently, a good policy for shared hosting platforms is to pick a σ that corresponds to the (1−O)·100th percentile of the resource usage distribution U, and to pick the corresponding ρ as computed by the above algorithm. This ensures that we provision resources based on a high percentile of the capsule's needs, and that this percentile is chosen based on the specified overbooking tolerance O.

2.4 Profiling Server Applications: Experimental Results

In this section, we profile several commonly used server applications to illustrate the process of deriving an application's QoS requirements. Our experimentally derived profiles not only illustrate the inherent nature of various server applications but also demonstrate the utility and benefits of resource overbooking in shared hosting platforms.

The test bed for our profiling experiments consists of a cluster of five Dell PowerEdge 1550 servers, each with a 966 MHz Pentium III processor and 512 MB memory running Red Hat Linux 7.0. All servers run the 2.2.17 version of the Linux kernel patched with the Linux trace toolkit version 0.9.5, and are connected by 100 Mb/s Ethernet links to a Dell PowerConnect (model no. 5012) ethernet switch.

To profile an application, we run it on one of our servers and use the remaining servers to generate the workload for profiling. We assume that all machines are lightly loaded and that all non-essential system services (e.g., mail services, X windows server) are turned off to prevent interference during profiling. We profile the following server applications in our experiments:

Apache web server: We use the SPECWeb99 benchmark [25] to generate the workload for the Apache web server (version 1.3.24). The SPECWeb benchmark allows control along two dimensions—the number of concurrent clients and the percentage of dynamic (cgi-bin) HTTP requests.
[Figure 4: Profile of the Apache web server using the default SPECWeb99 configuration. (a) Probability distribution (PDF) and (b) cumulative distribution function (CDF) of the fraction of CPU used; (c) valid token bucket pairs—burst ρ (msec) versus rate σ (fraction of CPU).]

We vary both parameters to study their impact on Apache's resource needs.

MPEG streaming media server: We use a home-grown streaming server to stream MPEG-1 video files to multiple concurrent clients over UDP. Each client in our experiment requests a 15-minute-long variable bit rate MPEG-1 video with a mean bit rate of 1.5 Mb/s. We vary the number of concurrent clients and study its impact on the resource usage at the server.

Quake game server: We use the publicly available Linux Quake server to understand the resource usage of a multi-player game server; our experiments use the standard version of Quake I—a popular multi-player game on the Internet. The client workload is generated using a bot—an autonomous software program that emulates a human player. We use the publicly available "terminator" bot to emulate each player; we vary the number of concurrent players connected to the server and study its impact on the resource usage.

PostgreSQL database server: We profile the PostgreSQL database server (version 7.2.1) using the pgbench 1.2 benchmark. This benchmark is part of the PostgreSQL distribution and emulates the TPC-B transactional benchmark [19]. The benchmark provides control over the number of concurrent clients as well as the number of transactions performed by each client. We vary both parameters and study their impact on the resource usage of the database server.

We now present some results from our profiling study. Figure 4(a) depicts the CPU usage distribution of the Apache web server obtained using the default settings of the SPECWeb99 benchmark (50 concurrent clients, 30% dynamic cgi-bin requests). Figure 4(b) plots the corresponding cumulative distribution function (CDF) of the resource usage. As shown in the figure (and summarized in Table 1), the worst-case CPU usage (100th percentile) is 25% of CPU capacity. Further, the 99th and the 95th percentiles of CPU usage are 10% and 4% of capacity, respectively. These results indicate that CPU usage is bursty in nature and that the worst-case requirements are significantly higher than a high percentile of the usage. Consequently, under-provisioning (i.e., overbooking) by a mere 1% reduces the CPU requirements of Apache by a factor of 2.5, while overbooking by 5% yields a factor of 6.25 reduction (implying that 2.5 and 6.25 times as many web servers can be supported when provisioning based on the 99th and 95th percentiles, respectively, instead of the 100th percentile). Thus, even small amounts of overbooking can potentially yield significant increases in platform capacity. Figure 4(c) depicts the possible valid (σ, ρ) pairs for Apache's CPU usage. Depending on the specified overbooking tolerance O, we can set σ to an appropriate percentile of the usage distribution U, and the corresponding ρ can then be chosen using this figure.

Figures 5(a)-(d) depict the CPU or network bandwidth distributions, as appropriate, for various server applications. Specifically, the figure shows the usage distribution for the Apache web server with 50% dynamic SPECWeb requests, the streaming media server with 20 concurrent clients, the Quake game server with 4 clients, and the PostgreSQL server with 10 clients. Table 1 summarizes our results and also presents profiles for several additional scenarios (only a small subset of the three dozen profiles obtained from our experiments are presented due to space constraints). Table 1 also lists the worst-case resource needs as well as the 99th and the 95th percentiles of the resource usage.

Together, Figure 5 and Table 1 demonstrate that all server applications exhibit burstiness in their resource usage, albeit to different degrees. This burstiness causes the worst-case resource needs to be significantly higher than a high percentile of the usage distribution. Consequently, we find that the 99th percentile is smaller by a factor of 1.1-2.5, while the 95th percentile yields a factor of 1.3-6.25 reduction, when compared to the 100th percentile. Together, these results illustrate the potential gains that can be realized by overbooking resources in shared hosting platforms.
[Figure 5: Profiles of various server applications: (a) Apache web server with 50% cgi-bin requests, (b) streaming media server with 20 clients, (c) Quake game server with 4 clients, (d) PostgreSQL server with 10 clients. Panels (a)-(c) plot the probability distribution of the fraction of CPU or network bandwidth used.]

Table 1: Summary of profiles. Although we profiled both CPU and network usage for each application, we only present results for the more constraining resource due to space constraints. Abbreviations: WS = Apache web server, SMS = streaming media server, GS = Quake game server, DBS = database server, k = number of clients.

Application       Resource   Usage at percentile          (σ, ρ) for O = 0.01
                             100th    99th     95th
WS, default       CPU        0.25     0.10     0.04        (0.10, 0.218)
WS, 50% dynamic   CPU        0.69     0.29     0.12        (0.29, 0.382)
SMS, k=4          Net        0.19     0.16     0.11        (0.16, 1.89)
SMS, k=20         Net        0.63     0.49     0.43        (0.49, 6.27)
GS, k=2           CPU        0.011    0.010    0.009       (0.010, 0.00099)
GS, k=4           CPU        0.018    0.016    0.014       (0.016, 0.00163)
DBS, k=1 (def)    CPU        0.33     0.27     0.20        (0.27, 0.184)
DBS, k=10         CPU        0.85     0.81     0.79        (0.81, 0.130)

3 Resource Overbooking and Capsule Placement in Hosting Platforms

Having derived the QoS requirements of each capsule, the next step is to determine which platform node will run each capsule. Several considerations arise when making such placement decisions. First, since platform resources are being overbooked, the platform should ensure that the QoS requirements of a capsule will be met even in the presence of overbooking. Second, since multiple nodes may have the resources necessary to house each application capsule, the platform will need to pick a specific mapping from the set of feasible mappings. This choice will be constrained by issues such as trust among competing applications. In this section, we present techniques for overbooking platform resources in a controlled manner. The aim is to ensure that: (i) the QoS requirements of the application are satisfied, and (ii) overbooking tolerances as well as external policy constraints are taken into account while making placement decisions.

3.1 Resource Overbooking Techniques

A platform node can accept a new application capsule so long as the resource requirements of existing capsules are not violated, and sufficient unused resources exist to meet the requirements of the new capsule. However, if the node resources are overbooked, another requirement is added: the overbooking tolerances of individual capsules already placed on the node should not be exceeded as a result of accepting the new capsule. Verifying these conditions involves two tests:

Test 1: Resource requirements of the new and existing capsules can be met. To verify that a node can meet the requirements of all capsules, we simply sum the requirements of individual capsules and ensure that the aggregate requirements do not exceed node capacity. For each capsule i on the node, the QoS parameters (σ_i, ρ_i) and τ_i require that the capsule be allocated (σ_i·τ_i + ρ_i) resources in each interval of duration τ_i. Further, since the capsule has an overbooking tolerance O_i, in the worst case the node can allocate only (σ_i·τ_i + ρ_i)·(1 − O_i) resources and yet satisfy the capsule's needs (thus, the overbooking tolerance represents the fraction by which the allocation may be reduced if the node saturates due to overbooking). Consequently, even in the worst-case scenario, the resource requirements of all capsules can be met so long as the total resource requirements do not exceed the capacity:

    Σ_{i=1}^{k+1} (σ_i·τ_min + ρ_i)·(1 − O_i)  ≤  C·τ_min        (1)

where C denotes the CPU or network interface capacity on the node, k denotes the number of existing capsules on the node, k+1 is the new capsule, and τ_min = min(τ_1, τ_2, ..., τ_{k+1}) is the period of the capsule that desires the most stringent guarantees.

Test 2: Overbooking tolerances of all capsules are met. The overbooking tolerance of a capsule is met only if the total amount of overbooking is smaller than its specified tolerance. To compute the aggregate overbooking on a node, we must first compute the total resource usage on the node. Since the usage distributions U_i of individual capsules are known, the total resource usage on a node is simply the sum of the individual usages. That is, Y = Σ_{i=1}^{k+1} Y_i, where Y_i is the resource usage of capsule i (distributed according to U_i) and Y denotes the aggregate resource usage on the node. Assuming each Y_i is independent, the resulting distribution of Y can be computed from elementary probability theory.¹ Given the total resource usage distribution, the probability that the total demand exceeds the

¹ This is done using the z-transform. The z-transform of a random variable Y_i is the polynomial A(z) = a_0 + a_1·z + a_2·z² + ..., where the coefficient a_j of the j-th term represents the probability that the random variable equals j (i.e., Y_i = j). If Y_1, Y_2, ..., Y_{k+1} are independent random variables, then the z-transform of their sum is the product of their individual z-transforms.