SlideShare une entreprise Scribd logo
1  sur  15
Télécharger pour lire hors ligne
Resilient Overlay Networks

                    David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris
                                                        MIT Laboratory for Computer Science
                                                                ron@nms.lcs.mit.edu
                                                             http://nms.lcs.mit.edu/ron/




Abstract                                                                            and its constituent networks, usually operated by some network ser-
A Resilient Overlay Network (RON) is an architecture that allows                    vice provider. The information shared with other providers and
distributed Internet applications to detect and recover from path                   AS’s is heavily filtered and summarized using the Border Gateway
outages and periods of degraded performance within several sec-                     Protocol (BGP-4) running at the border routers between AS’s [21],
onds, improving over today’s wide-area routing protocols that take                  which allows the Internet to scale to millions of networks.
at least several minutes to recover. A RON is an application-layer                      This wide-area routing scalability comes at the cost of re-
overlay on top of the existing Internet routing substrate. The RON                  duced fault-tolerance of end-to-end communication between Inter-
nodes monitor the functioning and quality of the Internet paths                     net hosts. This cost arises because BGP hides many topological
among themselves, and use this information to decide whether to                     details in the interests of scalability and policy enforcement, has
route packets directly over the Internet or by way of other RON                     little information about traffic conditions, and damps routing up-
nodes, optimizing application-specific routing metrics.                              dates when potential problems arise to prevent large-scale oscil-
   Results from two sets of measurements of a working RON de-                       lations. As a result, BGP’s fault recovery mechanisms sometimes
ployed at sites scattered across the Internet demonstrate the benefits               take many minutes before routes converge to a consistent form [12],
of our architecture. For instance, over a 64-hour sampling period in                and there are times when path outages even lead to significant dis-
March 2001 across a twelve-node RON, there were 32 significant                       ruptions in communication lasting tens of minutes or more [3, 18,
outages, each lasting over thirty minutes, over the 132 measured                    19]. The result is that today’s Internet is vulnerable to router and
paths. RON’s routing mechanism was able to detect, recover, and                     link faults, configuration errors, and malice—hardly a week goes
route around all of them, in less than twenty seconds on average,                   by without some serious problem affecting the connectivity pro-
showing that its methods for fault detection and recovery work well                 vided by one or more Internet Service Providers (ISPs) [15].
at discovering alternate paths in the Internet. Furthermore, RON                        Resilient Overlay Networks (RONs) are a remedy for some of
was able to improve the loss rate, latency, or throughput perceived                 these problems. Distributed applications layer a “resilient overlay
by data transfers; for example, about 5% of the transfers doubled                   network” over the underlying Internet routing substrate. The nodes
their TCP throughput and 5% of our transfers saw their loss prob-                   comprising a RON reside in a variety of routing domains, and co-
ability reduced by 0.05. We found that forwarding packets via at                    operate with each other to forward data on behalf of any pair of
most one intermediate RON node is sufficient to overcome faults                      communicating nodes in the RON. Because AS’s are independently
and improve performance in most cases. These improvements, par-                     administrated and configured, and routing domains rarely share in-
ticularly in the area of fault detection and recovery, demonstrate the              terior links, they generally fail independently of each other. As
benefits of moving some of the control over routing into the hands                   a result, if the underlying topology has physical path redundancy,
of end-systems.                                                                     RON can often find paths between its nodes, even when wide-area
                                                                                    routing Internet protocols like BGP-4 cannot.
                                                                                        The main goal of RON is to enable a group of nodes to commu-
1.     Introduction                                                                 nicate with each other in the face of problems with the underlying
  The Internet is organized as independently operating au-                          Internet paths connecting them. RON detects problems by aggres-
tonomous systems (AS’s) that peer together. In this architecture,                   sively probing and monitoring the paths connecting its nodes. If
detailed routing information is maintained only within a single AS                  the underlying Internet path is the best one, that path is used and no
                                                                                    other RON node is involved in the forwarding path. If the Internet
                                                                                    path is not the best one, the RON will forward the packet by way of
This research was sponsored by the Defense Advanced Research
Projects Agency (DARPA) and the Space and Naval Warfare Sys-                        other RON nodes. In practice, we have found that RON can route
tems Center, San Diego, under contract N66001-00-1-8933.                            around most failures by using only one intermediate hop.
                                                                                        RON nodes exchange information about the quality of the paths
                                                                                    among themselves via a routing protocol and build forwarding ta-
Permission to make digital or hard copies of all or part of this work for           bles based on a variety of path metrics, including latency, packet
personal or classroom use is granted without fee provided that copies are           loss rate, and available throughput. Each RON node obtains the
not made or distributed for profit or commercial advantage and that copies           path metrics using a combination of active probing experiments
bear this notice and the full citation on the first page. To copy otherwise, to      and passive observations of on-going data transfers. In our imple-
republish, to post on servers or to redistribute to lists, requires prior specific   mentation, each RON is explicitly designed to be limited in size—
permission and/or a fee.
18th ACM Symp. on Operating Systems Principles (SOSP) October 2001,
                                                                                    between two and fifty nodes—to facilitate aggressive path main-
Banff, Canada.                                                                      tenance via probing without excessive bandwidth overhead. This
Copyright 2001 ACM
To vu.nl
                                                           Lulea.se                and do not depend on the existence of non-commercial or private
OR−DSL                                                                             networks (such as the Internet2 backbone that interconnects many
                                                     CMU
                                                                        MIT
                                                                                   educational institutions); our ability to determine this was enabled
                                                                        MA−Cable
                                                                        Cisco
                                                                                   by RON’s policy routing feature that allows the expression and im-
                                                                        Mazu
                                                                        Cornell
                                                                                   plementation of sophisticated policies that determine how paths are
CA−T1
                   CCI                                                 NYU         selected for packets.
PDI                Aros
                   Utah                                                               We also found that RON successfully routed around performance
                                                                                                ¦¤¢ 
                                                                                               ¥ £ ¡
                                                                      NC−Cable     failures: in        , the loss probability improved by at least 0.05
                                                                                   in 5% of the samples, end-to-end communication latency reduced
                                                                                   by 40ms in 11% of the samples, and TCP throughput doubled in
                                                                                   5% of all samples. In addition, we found cases when RON’s loss,
                                                                                   latency, and throughput-optimizing path selection mechanisms all
                                                                                   chose different paths between the same two nodes, suggesting that
                                                                                   application-specific path selection techniques are likely to be use-
Figure 1: The current sixteen-node RON deployment. Five sites                      ful in practice. A noteworthy finding from the experiments and
are at universities in the USA, two are European universities                      analysis is that in most cases, forwarding packets via at most one
(not shown), three are “broadband” home Internet hosts con-                        intermediate RON node is sufficient both for recovering from fail-
nected by Cable or DSL, one is located at a US ISP, and five are                    ures and for improving communication latency.
at corporations in the USA.
                                                                                   2. Related Work
                                                                                      To our knowledge, RON is the first wide-area network overlay
allows RON to recover from problems in the underlying Internet in                  system that can detect and recover from path outages and periods of
several seconds rather than several minutes.                                       degraded performance within several seconds. RON builds on pre-
   The second goal of RON is to integrate routing and path selec-                  vious studies that quantify end-to-end network reliability and per-
tion with distributed applications more tightly than is traditionally              formance, on IP-based routing techniques for fault-tolerance, and
done. This integration includes the ability to consult application-                on overlay-based techniques to enhance performance.
specific metrics in selecting paths, and the ability to incorporate
application-specific notions of what network conditions constitute a                2.1 Internet Performance Studies
“fault.” As a result, RONs can be used in a variety of ways. A mul-                   Labovitz et al. [12] use a combination of measurement and anal-
timedia conferencing program may link directly against the RON                     ysis to show that inter-domain routers in the Internet may take tens
library, transparently forming an overlay between all participants                 of minutes to reach a consistent view of the network topology after
in the conference, and using loss rates, delay jitter, or application-             a fault, primarily because of routing table oscillations during BGP’s
observed throughput as metrics on which to choose paths. An ad-                    rather complicated path selection process. They find that during
ministrator may wish to use a RON-based router application to                      this period of “delayed convergence,” end-to-end communication
form an overlay network between multiple LANs as an “Overlay                       is adversely affected. In fact, outages on the order of minutes cause
VPN.” This idea can be extended further to develop an “Overlay                     active TCP connections (i.e., connections in the ESTABLISHED
ISP,” formed by linking (via RON) points of presence in different                  state with outstanding data) to terminate when TCP does not re-
traditional ISPs after buying bandwidth from them. Using RON’s                     ceive an acknowledgment for its outstanding data. They also find
routing machinery, an Overlay ISP can provide more resilient and                   that, while part of the convergence delays can be fixed with changes
failure-resistant Internet service to its customers.                               to the deployed BGP implementations, long delays and temporary
   The third goal of RON is to provide a framework for the imple-                  oscillations are a fundamental consequence of the BGP path vector
mentation of expressive routing policies, which govern the choice                  routing protocol.
of paths in the network. For example, RON facilitates classifying                     Paxson’s probe experiments show that routing pathologies pre-
packets into categories that could implement notions of acceptable                 vent selected Internet hosts from communicating up to 3.3% of the
use, or enforce forwarding rate controls.                                          time averaged over a long time period, and that this percentage has
   This paper describes the design and implementation of RON,                      not improved with time [18]. Labovitz et al. find, by examining
and presents several experiments that evaluate whether RON is a                    routing table logs at Internet backbones, that 10% of all considered
good idea. To conduct this evaluation and demonstrate the ben-                     routes were available less than 95% of the time, and that less than
efits of RON, we have deployed a working sixteen-node RON at                        35% of all routes were available more than 99.99% of the time [13].
sites sprinkled across the Internet (see Figure 1). The RON client                 Furthermore, they find that about 40% of all path outages take more
we experiment with is a resilient IP forwarder, which allows us to                 than 30 minutes to repair and are heavy-tailed in their duration.
compare connections between pairs of nodes running over a RON                      More recently, Chandra et al. find using active probing that 5%
against running straight over the Internet.                                        of all detected failures last more than 10,000 seconds (2 hours, 45
   We have collected a few weeks’ worth of experimental results of                 minutes), and that failure durations are heavy-tailed and can last
path outages and performance failures and present a detailed analy-
                               ¦¤¢ 
                              ¥ £ ¡                                                for as long as 100,000 seconds before being repaired [3]. These
sis of two separate datasets:
                  ©¤§ 
                 ¨ £ ¡                  with twelve nodes measured in              findings do not augur well for mission-critical services that require
March 2001 and             with sixteen nodes measured in May 2001.                a higher degree of end-to-end communication availability.
In both datasets, we found that RON was able to route around be-                      The Detour measurement study made the observation, using Pax-
tween 60% and 100% of all significant outages. Our implementa-                      son’s and their own data collected at various times between 1995
tion takes 18 seconds, on average, to detect and route around a path               and 1999, that path selection in the wide-area Internet is sub-
failure and is able to do so in the face of an active denial-of-service            optimal from the standpoint of end-to-end latency, packet loss rate,
attack on a path. We also found that these benefits of quick fault de-              and TCP throughput [23]. This study showed the potential long-
tection and successful recovery are realized on the public Internet                term benefits of “detouring” packets via a third node by comparing
the long-term average properties of detoured paths against Internet-    phasis on high performance packet classification and routing. It
chosen paths.                                                           uses IP-in-IP encapsulation to send packets along alternate paths.
                                                                           While RON shares with Detour the idea of routing via other
2.2 Network-layer Techniques                                            nodes, our work differs from Detour in three significant ways. First,
   Much work has been done on performance-based and fault-              RON seeks to prevent disruptions in end-to-end communication in
tolerant routing within a single routing domain, but practical mech-    the face of failures. RON takes advantage of underlying Internet
anisms for wide-area Internet recovery from outages or badly per-       path redundancy on time-scales of a few seconds, reacting respon-
forming paths are lacking.                                              sively to path outages and performance failures. Second, RON is
   Although today’s wide-area BGP-4 routing is based largely on         designed as an application-controlled routing overlay; because each
AS hop-counts, early ARPANET routing was more dynamic, re-              RON is more closely tied to the application using it, RON more
sponding to the current delay and utilization of the network. By        readily integrates application-specific path metrics and path selec-
1989, the ARPANET evolved to using a delay- and congestion-             tion policies. Third, we present and analyze experimental results
based distributed shortest path routing algorithm [11]. However,        from a real-world deployment of a RON to demonstrate fast re-
the diversity and size of today’s decentralized Internet necessitated   covery from failure and improved latency and loss-rates even over
the deployment of protocols that perform more aggregation and           short time-scales.
fewer updates. As a result, unlike some interior routing protocols         An alternative design to RON would be to use a generic overlay
within AS’s, BGP-4 routing between AS’s optimizes for scalable          infrastructure like the X-Bone and port a standard network routing
operation over all else.                                                protocol (like OSPF or RIP) with low timer values. However, this
   By treating vast collections of subnetworks as a single entity for   by itself will not improve the resilience of Internet communications
global routing purposes, BGP-4 is able to summarize and aggregate       for two reasons. First, a reliable and low-overhead outage detection
enormous amounts of routing information into a format that scales       module is required, to distinguish between packet losses caused by
to hundreds of millions of hosts. To prevent costly route oscilla-      congestion or error-prone links from legitimate problems with a
tions, BGP-4 explicitly damps changes in routes. Unfortunately,         path. Second, generic network-level routing protocols do not utilize
while aggregation and damping provide good scalability, they in-        application-specific definitions of faults.
terfere with rapid detection and recovery when faults occur. RON           Various Content Delivery Networks (CDNs) use overlay tech-
handles this by leaving scalable operation to the underlying Inter-     niques and caching to improve the performance of content delivery
net substrate, moving fault detection and recovery to a higher layer    for specific applications such as HTTP and streaming video. The
overlay that is capable of faster response because it does not have     functionality provided by RON may ease future CDN development
to worry about scalability.                                             by providing some routing components required by these services.
   An oft-cited “solution” to achieving fault-tolerant network con-
nectivity for a small- or medium-sized customer is to multi-home,
advertising a customer network through multiple ISPs. The idea          3. Design Goals
is that an outage in one ISP would leave the customer connected            The design of RON seeks to meet three main design goals: (i)
via the other. However, this solution does not generally achieve        failure detection and recovery in less than 20 seconds; (ii) tighter
fault detection and recovery within several seconds because of the      integration of routing and path selection with the application; and
degree of aggregation used to achieve wide-area routing scalabil-       (iii) expressive policy routing.
ity. To limit the size of their routing tables, many ISPs will not
accept routing announcements for fewer than 8192 contiguous ad-         3.1 Fast Failure Detection and Recovery
dresses (a “/19” netblock). Small companies, regardless of their           Today’s wide-area Internet routing system based on BGP-4 does
fault-tolerance needs, do not often require such a large address        not handle failures well. From a network perspective, we define
block, and cannot effectively multi-home. One alternative may be        two kinds of failures. Link failures occur when a router or a link
“provider-based addressing,” where an organization gets addresses       connecting two routers fails because of a software error, hardware
from multiple providers, but this requires handling two distinct sets   problem, or link disconnection. Path failures occur for a variety of
of addresses on its hosts. It is unclear how on-going connections       reasons, including denial-of-service attacks or other bursts of traffic
on one address set can seamlessly switch on a failure in this model.    that cause a high degree of packet loss or high, variable latencies.
                                                                           Applications perceive all failures in one of two ways: outages or
2.3 Overlay-based Techniques                                            performance failures. Link failures and extreme path failures cause
   Overlay networks are an old idea; in fact, the Internet itself was   outages, when the average packet loss rate over a sustained period
developed as an overlay on the telephone network. Several Inter-        of several minutes is high (about 30% or higher), causing most pro-
net overlays have been designed in the past for various purposes,       tocols including TCP to degrade by several orders of magnitude.
including providing OSI network-layer connectivity [10], easing         Performance failures are less extreme; for example, throughput, la-
IP multicast deployment using the MBone [6], and providing IPv6         tency, or loss-rates might degrade by a factor of two or three.
connectivity using the 6-Bone [9]. The X-Bone is a recent infras-          BGP-4 takes a long time, on the order of several minutes, to con-
tructure project designed to speed the deployment of IP-based over-     verge to a new valid route after a link failure causes an outage [12].
lay networks [26]. It provides management functions and mecha-          In contrast, RON’s goal is to detect and recover from outages and
nisms to insert packets into the overlay, but does not yet support      performance failures within several seconds. Compounding this
fault-tolerant operation or application-controlled path selection.      problem, IP-layer protocols like BGP-4 cannot detect problems
   Few overlay networks have been designed for efficient fault de-       such as packet floods and persistent congestion on links or paths
tection and recovery, although some have been designed for better       that greatly degrade end-to-end performance. As long as a link is
end-to-end performance. The Detour framework [5, 22] was mo-            deemed “live” (i.e., the BGP session is still alive), BGP’s AS-path-
tivated by the potential long-term performance benefits of indirect      based routing will continue to route packets down the faulty path;
routing [23]. It is an in-kernel packet encapsulation and routing       unfortunately, such a path may not provide adequate performance
architecture designed to support alternate-hop routing, with an em-     for an application using it.
Data
                         vBNS / Internet 2                                                    Node 1                    Node 2              Node 3




                                                                          External Probes
     Utah                                        130        MIT
                            155Mbps / 60ms     Mbps
                                                                                               Conduits                 Conduits            Conduits
                      155                                                                         Forwarder                 Forwarder           Forwarder

                         Qwest                BBN                                             Probes      Router        Probes     Router   Probes     Router
            Private                                      Private
            Peering                                      Peering
                                                                                                 Performance Database
             3Mbps                                      45Mbps
              6ms           UUNET            AT&T        5ms              Figure 3: The RON system architecture. Data enters the RON
                                                                          from RON clients via a conduit at an entry node. At each node,
            ArosNet         6Mbps                                         the RON forwarder consults with its router to determine the best
                                  1Mbps, 3ms                              path for the packet, and sends it to the next node. Path selec-
                                                MediaOne                  tion is done at the entry node, which also tags the packet, sim-
                 Cable Modem                                              plifying the forwarding path at other nodes. When the packet
                                                                          reaches the RON exit node, the forwarder there hands it to the
                                                                          appropriate output conduit, which passes the data to the client.
Figure 2: Internet interconnections are often complex. The dot-           To choose paths, RON nodes monitor the quality of their vir-
ted links are private and are not announced globally.                     tual links using active probing and passive observation. RON
                                                                          nodes use a link-state routing protocol to disseminate the topol-
                                                                          ogy and virtual-link quality of the overlay network.
3.2 Tighter Integration with Applications
   Failures and faults are application-specific notions: network con-      4. Design
ditions that are fatal for one application may be acceptable for an-
                                                                             The conceptual design of RON, shown in Figure 3, is quite sim-
other, more adaptive one. For instance, a UDP-based Internet audio
                                                                          ple. RON nodes, deployed at various locations on the Internet,
application not using good packet-level error correction may not
                                                                          form an application-layer overlay to cooperatively route packets
work at all at loss rates larger than 10%. At this loss rate, a bulk
                                                                          for each other. Each RON node monitors the quality of the Internet
transfer application using TCP will continue to work because of
                                                                          paths between it and the other nodes, and uses this information to
TCP’s adaptation mechanisms, albeit at lower performance. How-
                                                                          intelligently select paths for packets. Each Internet path between
ever, at loss rates of 30% or more, TCP becomes essentially unusable because it times out for most packets [16]. RON allows applications to independently define and react to failures.

In addition, applications may prioritize some metrics over others (e.g., latency over throughput, or low loss over latency) in their path selection. They may also construct their own metrics to select paths. A routing system may not be able to optimize all of these metrics simultaneously; for example, a path with a one-second latency may appear to be the best throughput path, but this degree of latency may be unacceptable to an interactive application. Currently, RON's goal is to allow applications to influence the choice of paths using a single metric. We plan to explore multi-criteria path selection in the future.

3.3 Expressive Policy Routing

Despite the need for policy routing and enforcement of acceptable use and other policies, today's approaches are primitive and cumbersome. For instance, BGP-4 is incapable of expressing fine-grained policies aimed at users or hosts. This lack of precision not only reduces the set of paths available in the case of a failure, but also inhibits innovation in the use of carefully targeted policies, such as end-to-end per-user rate controls or enforcement of acceptable use policies (AUPs) based on packet classification. Because RONs will typically run on relatively powerful end-points, we believe they are well-suited to providing fine-grained policy routing.

Figure 2 shows the AS-level network connectivity between four of our RON hosts; the full graph for (only) 12 hosts traverses 36 different autonomous systems. The figure gives a hint of the considerable underlying path redundancy available in the Internet—the reason RON works—and shows situations where BGP's blunt policy expression inhibits fail-over. For example, if the Aros-UUNET connection failed, users at Aros would be unable to reach MIT even if they were authorized to use Utah's network resources to get there. This is because it is impossible to announce a BGP route only to particular users, so the Utah-MIT link is kept completely private.

two nodes is called a virtual link. To discover the topology of the overlay network and obtain information about all virtual links in the topology, every RON node participates in a routing protocol to exchange information about a variety of quality metrics. Most of RON's design supports routing through multiple intermediate nodes, but our results (Section 6) show that using at most one intermediate RON node is sufficient most of the time. Therefore, parts of our design focus on finding better paths via a single intermediate RON node.

4.1 Software Architecture

Each program that communicates with the RON software on a node is a RON client. The overlay network is defined by a single group of clients that collaborate to provide a distributed service or application. This group of clients can use service-specific routing metrics when deciding how to forward packets in the group. Our design accommodates a variety of RON clients, ranging from a generic IP packet forwarder that improves the reliability of IP packet delivery, to a multi-party conferencing application that incorporates application-specific metrics in its route selection.

A RON client interacts with RON across an API called a conduit, which the client uses to send and receive packets. On the data path, the first node that receives a packet (via the conduit) classifies it to determine the type of path it should use (e.g., low-latency, high-throughput, etc.). This node is called the entry node: it determines a path from its topology table, encapsulates the packet into a RON header, tags it with some information that simplifies forwarding by downstream RON nodes, and forwards it on. Each subsequent RON node simply determines the next forwarding hop based on the destination address and the tag. The final RON node that delivers the packet to the RON application is called the exit node.

The conduits access RON via two functions:

   1. send(pkt, dst, via_ron) allows a node to forward a packet to a destination RON node either along the RON or using the direct Internet path. RON's delivery, like UDP, is best-effort and unreliable.

   2. recv(pkt, via_ron) is a callback function that is called when a packet arrives for the client program. This callback is invoked after the RON conduit matches the type of the packet in the RON header to the set of types pre-registered by the client when it joins the RON. The RON packet type is a demultiplexing field for incoming packets.
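The conduit interface above can be illustrated with a small sketch. This is our own minimal illustration, not the paper's actual API: the names Conduit, Packet, and register_type are assumptions, and the transport is stubbed out so only the type-based demultiplexing behavior is shown.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of the conduit described in Section 4.1. A client
// pre-registers the RON packet types it handles when it joins the RON;
// recv() demultiplexes incoming packets to the matching callback.
struct Packet {
    uint16_t type;       // RON packet type: the demultiplexing field
    std::string payload;
};

using RecvCallback = std::function<void(const Packet&, bool via_ron)>;

class Conduit {
public:
    // Client registers interest in a packet type when joining the RON.
    void register_type(uint16_t type, RecvCallback cb) {
        handlers_[type] = std::move(cb);
    }

    // Best-effort, unreliable send: either along the RON overlay or over
    // the direct Internet path. Here we only record the routing choice.
    void send(const Packet& pkt, const std::string& dst, bool via_ron) {
        sent_.push_back(SendRecord{dst, via_ron, pkt.type});
    }

    // Invoked when a packet arrives; dispatch only if its type was
    // pre-registered, mirroring the demultiplexing described above.
    bool recv(const Packet& pkt, bool via_ron) {
        auto it = handlers_.find(pkt.type);
        if (it == handlers_.end()) return false;  // unknown type: drop
        it->second(pkt, via_ron);
        return true;
    }

    struct SendRecord { std::string dst; bool via_ron; uint16_t type; };
    std::vector<SendRecord> sent_;  // exposed for illustration only

private:
    std::map<uint16_t, RecvCallback> handlers_;
};
```

A client would register one callback per packet type it understands; packets with unregistered types are simply dropped, which is consistent with RON's best-effort delivery model.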
   The basic RON functionality is provided by the forwarder object, which implements the above functions. It also provides a timer registration and callback mechanism to perform periodic operations, and a similar service for network socket data availability.

Each client must instantiate a forwarder and hand to it two modules: a RON router and a RON membership manager. The RON router implements a routing protocol. The RON membership manager implements a protocol to maintain the list of members of a RON. By default, RON provides a few different RON router and membership manager modules for clients to use.

RON routers and membership managers exchange packets using RON as their forwarding service, rather than over direct IP paths. This feature of our system is beneficial because it allows these messages to be forwarded even when some underlying IP paths fail.

4.2 Routing and Path Selection

Routing is the process of building up the forwarding tables that are used to choose paths for packets. In RON, the entry node has more control over subsequent path selection than in traditional datagram networks. This node tags the packet's RON header with an identifier that identifies the flow to which the packet belongs; subsequent routers attempt to keep a flow ID on the same path it first used, barring significant link changes. Tagging, like the IPv6 flow ID, helps support multi-hop routing by speeding up the forwarding path at intermediate nodes. It also helps tie a packet flow to a chosen path, making performance more predictable, and provides a basis for future support of multi-path routing in RON. By tagging at the entry node, the application is given maximum control over what the network considers a "flow."

The small size of a RON relative to the Internet allows it to maintain information about multiple alternate routes and to select the path that best suits the RON client according to a client-specified routing metric. By default, it maintains information about three specific metrics for each virtual link: (i) latency, (ii) packet loss rate, and (iii) throughput, as might be obtained by a bulk-transfer TCP connection between the end-points of the virtual link. RON clients can override these defaults with their own metrics, and the RON library constructs the appropriate forwarding table to pick good paths. The router builds up forwarding tables for each combination of policy routing and chosen routing metric.

4.2.1 Link-State Dissemination

The default RON router uses a link-state routing protocol to disseminate topology information between routers, which in turn is used to build the forwarding tables. Each node in an N-node RON has N - 1 virtual links. Each node's router periodically requests summary information of the different performance metrics to the N - 1 other nodes from its local performance database and disseminates its view to the others.

This information is sent via the RON forwarding mesh itself, to ensure that routing information is propagated in the event of path outages and heavy loss periods. Thus, the RON routing protocol is itself a RON client, with a well-defined RON packet type. This leads to an attractive property: The only time a RON router has incomplete information about any other one is when all paths in the RON from the other RON nodes to it are unavailable.

4.2.2 Path Evaluation and Selection

The RON routers need an algorithm to determine if a path is still alive, and a set of algorithms with which to evaluate potential paths. The responsibility of these metric evaluators is to provide a number quantifying how "good" a path is according to that metric. These numbers are relative, and are only compared to other numbers from the same evaluator. The two important aspects of path evaluation are the mechanism by which the data for two links are combined into a single path, and the formula used to evaluate the path.

Every RON router implements outage detection, which it uses to determine if the virtual link between it and another node is still working. It uses an active probing mechanism for this. On detecting the loss of a probe, the normal low-frequency probing is replaced by a sequence of consecutive probes, sent in relatively quick succession spaced by PROBE_TIMEOUT seconds. If OUTAGE_THRESH probes in a row elicit no response, then the path is considered "dead." If even one of them gets a response, then the subsequent higher-frequency probes are canceled. Paths experiencing outages are rated on their packet loss rate history; a path having an outage will always lose to a path not experiencing an outage. The OUTAGE_THRESH and the frequency of probing (PROBE_INTERVAL) permit a trade-off between outage detection time and the bandwidth consumed by the (low-frequency) probing process (Section 6.2 investigates this).

By default, every RON router implements three different routing metrics: the latency-minimizer, the loss-minimizer, and the TCP throughput-optimizer. The latency-minimizer forwarding table is computed by computing an exponential weighted moving average (EWMA) of round-trip latency samples with parameter alpha. For any link l, its latency estimate lat_l is updated as:

    lat_l <- alpha * lat_l + (1 - alpha) * new_sample          (1)

We use alpha = 0.9, which means that 10% of the current latency estimate is based on the most recent sample. This number is similar to the values suggested for TCP's round-trip time estimator [20]. For a RON path, the overall latency is the sum of the individual virtual link latencies: Latency(path) = sum over links l in path of lat_l.

To estimate loss rates, RON uses the average of the last k = 100 probe samples as the current average. Like Floyd et al. [7], we found this to be a better estimator than EWMA, which retains some memory of samples obtained in the distant past as well. It might be possible to further improve our estimator by unequally weighting some of the k samples [7].

Loss metrics are multiplicative on a path: if we assume that losses are independent, the probability of success on the entire path is roughly equal to the probability of surviving all hops individually: success(path) = product over links l in path of (1 - lossrate_l).

RON does not attempt to find optimal throughput paths, but strives to avoid paths of low throughput when good alternatives are available. Given the time-varying and somewhat unpredictable nature of available bandwidth on Internet paths [2, 19], we believe this is an appropriate goal. From the standpoint of improving the reliability of path selection in the face of performance failures, avoiding bad paths is more important than optimizing to eliminate small throughput differences between paths. While a characterization of the utility received by programs at different available bandwidths may help determine a good path selection threshold, we believe that more than a 50% bandwidth reduction is likely to reduce the utility of many programs. This threshold also falls outside the typical variation observed on a given path over time-scales of tens of minutes [2]. We therefore concentrate on avoiding throughput faults of this order of magnitude.
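The latency and loss estimators above, and their combination across a path's virtual links, can be sketched as follows. This is a minimal illustration of ours, not RON's actual code: the class and constant names (LinkEstimator, kAlpha, kWindow) are assumptions, with alpha = 0.9 per Equation (1) and a 100-sample loss window as described in the text.

```cpp
#include <cassert>
#include <cmath>
#include <deque>
#include <numeric>
#include <vector>

// Sketch of the per-link estimators of Section 4.2.2: an EWMA latency
// estimator and a loss estimator averaging the last k probe outcomes.
class LinkEstimator {
public:
    // Equation (1): 10% of the estimate comes from the newest sample.
    void add_latency_sample(double rtt_s) {
        lat_ = initialized_ ? kAlpha * lat_ + (1 - kAlpha) * rtt_s : rtt_s;
        initialized_ = true;
    }
    double latency() const { return lat_; }

    // Record a probe outcome (lost or not); keep the last k samples.
    void add_probe(bool lost) {
        samples_.push_back(lost ? 1.0 : 0.0);
        if (samples_.size() > kWindow) samples_.pop_front();
    }
    double loss_rate() const {
        if (samples_.empty()) return 0.0;
        double lost = std::accumulate(samples_.begin(), samples_.end(), 0.0);
        return lost / samples_.size();
    }

private:
    static constexpr double kAlpha = 0.9;  // EWMA parameter
    static constexpr size_t kWindow = 100; // loss-averaging window
    double lat_ = 0.0;
    bool initialized_ = false;
    std::deque<double> samples_;
};

// Path-level combination: latencies add along the path, while loss is
// multiplicative (the path succeeds only if every virtual link does).
inline double path_latency(const std::vector<LinkEstimator>& links) {
    double sum = 0.0;
    for (const auto& l : links) sum += l.latency();
    return sum;
}
inline double path_success(const std::vector<LinkEstimator>& links) {
    double p = 1.0;
    for (const auto& l : links) p *= 1.0 - l.loss_rate();
    return p;
}
```

The windowed average matches the text's observation that, unlike an EWMA, it forgets samples from the distant past entirely once they fall out of the window.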
Throughput-intensive applications typically use TCP or TCP-like congestion control, so the throughput optimizer focuses on this type of traffic. The performance of a bulk TCP transfer is a function of the round-trip latency and the packet loss rate it observes. Throughput optimization combines the latency and loss metrics using a simplified version of the TCP throughput equation [16], which provides an upper-bound on TCP throughput. Our granularity of loss rate detection is 1%, and the throughput equation is more sensitive at lower loss rates. We set a minimum packet loss rate of 2% to prevent an infinite bandwidth and to prevent large route oscillations from single packet losses. The formula used is:

    score = sqrt(1.5) / (rtt * sqrt(p))          (2)

where p is the one-way end-to-end packet loss probability and rtt is the end-to-end round-trip time estimated from the hop-by-hop samples as described above.

The non-linear combination of loss and round-trip time makes this scoring difficult to optimize with standard shortest-path algorithms; for instance, the TCP throughput between nodes A and B via a node C is not usually the smaller of the TCP throughputs between A and C and between C and B. While more complicated search algorithms can be employed to handle this, RON routing currently takes advantage of the resulting simplicity when only single-intermediate paths are considered to obtain throughput-optimized paths.

Our probe samples give us the two-way packet loss probability, from which we estimate the one-way loss probability by (unrealistically, for some paths) assuming symmetric loss rates and solving the resulting quadratic equation. Assuming that losses are independent on the two directions of a path, the maximum absolute error in the one-way loss rate estimate occurs when all of the loss is in only one direction, resulting in an error equal to p - (1 - sqrt(1 - p)), where p is the measured two-way loss rate. Assuming a minimum loss rate of 2% ensures that, when choosing between two high-quality links, our loss rate estimate is within 50% of the true value. At large loss rates, highly asymmetric loss patterns cause this method to disregard a potentially good path because the reverse direction of that path has a high loss rate. However, the benefits of avoiding the high loss rate path seem to outweigh the cost of missing one good link, but we intend to remedy this problem soon by explicitly estimating one-way loss rates.

This method of throughput comparison using Equation 2 is not without faults. For instance, a slow but relatively unused bottleneck link on a path would almost never observe packet losses, and none of our probes would be lost. The equation we use would predict an unreasonably high bandwidth estimate. One way of tackling this would be to send pairs of probe packets to estimate an upper-bound on link bandwidth, while another might be to use non-invasive probes to predict throughput [8].

Oscillating rapidly between multiple paths is harmful to applications sensitive to packet reordering or delay jitter. While each of the evaluation metrics applies some smoothing, this is not enough to avoid "flapping" between two nearly equal routes: RON routers therefore employ hysteresis. Based on an analysis of 5000 snapshots from a RON node's link-state table, we chose to apply a simple 5% hysteresis bonus to the "last good" route for the three metrics. This simple method appears to provide a reasonable trade-off between responsiveness to changes in the underlying paths and unnecessary route flapping.

4.2.3 Performance Database

Figure 4: The RON performance database. (Diagram omitted: probers insert latency, loss, and throughput samples into the performance database; clients issue pdb_request() and receive results via pdb_callback().)

To make good routing decisions RON needs to have detailed performance information. It is impractical to send large performance histories to all participants in the RON. A reliable performance repository must handle participants that crash, reboot, or rejoin the RON. Measurement data is often noisy, and different clients may have different ways in which they will use that data; for instance, an outage detector may want to know if any packets were successfully sent in the last 15 seconds, but a throughput improver may be interested in a longer-term packet loss average. Therefore, the system needs a flexible summarization mechanism.

To support these requirements, each RON node or local group of nodes uses a separate performance database to store samples (Figure 4). The database is a generalization of SPAND [24], supporting different data types and summarization mechanisms. Sources of information and clients that use it agree on a class name for the data and the meaning of the values inserted into the database. The probes insert data into the database by calling pdb_update(class, src, dst, type, value). The src and dst fields contain the IP addresses of the sampled data.

4.3 Policy Routing

RON allows users or administrators to define the types of traffic allowed on particular network links. In traditional Internet policy routing, "type" is typically defined only by the packet's source and destination addresses [4]; RON generalizes this notion to include other information about the packet. RON separates policy routing into two components: classification and routing table formation.

When a packet enters the RON, it is classified and given a policy tag; this tag is used to perform lookups in the appropriate set of routing tables at each RON router. A separate set of routing tables is constructed for each policy by re-running the routing computation, removing the links disallowed by the corresponding policy. The routing computation computes the shortest path from the RON node to all other nodes, for any given metric. This construction applies to multi-hop or single-hop indirection; because a standard shortest-paths algorithm may not work for all metrics (specifically, TCP throughput), our implementation, described in Section 5.2, is specific to single-hop indirection. The result of running the table construction is the multi-level routing tables shown in Figure 5.

The construction of these routing tables and the production of packet tags is facilitated by the policy classifier component. The policy classifier produces the permits function that determines if a given policy is allowed to use a particular virtual link. It also provides a conduit-specific data classifier module that helps the RON node decide which policy is appropriate for the incoming packet.
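The permits check and the per-policy pruning of virtual links might be sketched as follows. This is a hypothetical illustration of ours, not RON's implementation: the names Link, PolicyTag, and links_for_policy are assumptions, and the real classifier is richer (e.g., BPF-based packet matching).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using NodeId = int;
using Link = std::pair<NodeId, NodeId>;  // a virtual link in the RON mesh
using PolicyTag = uint32_t;

// permits(policy, link): may traffic under this policy use this link?
// In RON this function is produced by the policy classifier component.
using Permits = std::function<bool(PolicyTag, const Link&)>;

// Routing table formation per policy: re-run the routing computation on
// the subgraph that keeps only the links the policy permits.
std::vector<Link> links_for_policy(const std::vector<Link>& all,
                                   PolicyTag policy,
                                   const Permits& permits) {
    std::vector<Link> out;
    for (const auto& l : all)
        if (permits(policy, l)) out.push_back(l);  // drop disallowed links
    return out;
}
```

For example, a clique-style policy could be expressed as a permits function that denies one inter-clique virtual link to a given policy tag while leaving other tags unrestricted.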
Figure 5: The forwarding control and data paths. The forwarder passes the packet header to the routing table, which first checks if a valid flow cache entry exists. If not, it performs three levels of lookups. The first level is based on the policy type, the second is based on the routing preference, and the third is a hash lookup of next hops indexed by destination. (Diagram omitted: from net -> Forwarder -> Routing Lookup -> Policy Demux -> Routing Pref Demux -> destination/next-hop table -> to next hop via net.)

Figure 6: The RON packet header. The routing flags and flow ID are set by the input conduit. The packet type is a demultiplexing key indicating the appropriate conduit or protocol at the receiver. (Header fields: Version, Hop Limit, Routing Flags; RON Source Address; RON Destination Address; Source port, Dest port; Flow ID; Policy Tag; Packet Type.)

For example, if packets from a commercial site are not allowed to go across the Internet2 educational backbone, and we received a packet from such a site at an MIT RON node, the data classifier module would determine which policy to use and set the corresponding tag on the packet. Subsequent RON nodes examine the policy tag instead of reclassifying the packet.

We have designed two policy mechanisms: exclusive cliques and general policies. In an exclusive clique, only data originating from and destined to other members of the clique may traverse inter-clique links. This mimics, for instance, the "educational only" policy on the Internet2 backbone. This policy classifier takes only a list of networks and subnet masks to match against.

Our general policy component is more powerful; it accepts a BPF-like packet matcher [14], and a list of links that are denied by this policy. It returns the first policy that matches a packet's fields to the stored BPF-based information. We do not currently address policy composition, although users may create composed versions of their own policies using the general policy component. In Section 7, we discuss possible abuse of AUPs and network transit policies.

4.4 Data Forwarding

The forwarder at each RON node examines every incoming packet to determine if it is destined for a local client or a remote destination. If it requires further delivery, the forwarder passes the RON packet header to the routing table, as shown in Figure 5.

The RON packet header is shown in Figure 6. It is inspired by the design of IPv6 [17]. Because RON needs to support more than simply IP-in-IP encapsulation, RON uses its own header. RON does not fragment packets, but does inform applications if they exceed the Maximum Transmission Unit (MTU). As in IPv6, applications must perform end-to-end path MTU discovery. RON also provides a policy tag that is interpreted by the forwarders to decide which network routing policies apply to the packet. If the packet is destined for the local node, the forwarder uses the packet type field to demultiplex the packet to the RON client.

If the packet's flow ID has a valid flow cache entry, the forwarder short-cuts the routing process with this entry. Otherwise, the routing table lookup occurs in three stages. The first stage examines the policy tag, and locates the proper routing preference table. There is one routing preference table for each known policy tag. Policy adherence supersedes all other routing preferences. Next, the lookup procedure examines the routing preference flags to find a compatible route selection metric for the packet. The flags are examined from the right, and the first flag understood by the router directs the packet to the next table. All RON routers understand the basic system metrics of latency, loss, and throughput; this provides a way for a shared RON environment to support users who may also have client-defined metrics, and still provide them with good default metrics. A lookup in the routing preference table leads to a hash table of next-hops based upon the destination RON node. The entry in the next-hop table is returned to the forwarder, which places the packet on the network destined for the next hop.

4.5 Bootstrap and Membership Management

In addition to allowing clients to define their own membership mechanisms, RON provides two system membership managers: a simple static membership mechanism that loads its peers from a file, and a dynamic announcement-based, soft-state membership protocol. To bootstrap the dynamic membership protocol, a new node needs to know the identity of at least one peer in the RON. The new node uses this neighbor to broadcast its existence using a flooder, which is a special RON client that implements a general-purpose, resilient flooding mechanism using a RON forwarder.

The main challenge in the dynamic membership protocol is to avoid confusing a path outage to a node with its having left the RON. Each node builds up and periodically (every five minutes on average in our implementation) floods to all other nodes its list of peer RON nodes. If a node has not heard about a peer in sixty minutes, it assumes that the peer is no longer participating in the RON. Observe that this mechanism allows two nodes in the same RON to have a non-functioning direct Internet path for long periods of time: as long as there is some path in the RON between the two nodes, neither will think that the other has left.

The overhead of broadcasting is minuscule compared to the traffic caused by active probing and routing updates, especially given the limited size of a RON. The resulting robustness is high: each node receives up to N - 1 copies of the peer list sent from a given other node. This redundancy causes a node to be deleted from another node's view of the RON only if the former node is genuinely partitioned for over an hour from every other node in the RON. A RON node will re-bootstrap from a well-known peer after a long partition.

5. Implementation

The RON system is implemented at user-level as a flexible set of C++ libraries to which user-level RON clients link. Each client can pick and choose the components that best suit its needs. To provide a specific example of a client using the RON library, we describe the implementation of the resilient IP forwarder. This forwarder improves IP packet delivery without any modification to the transport protocols and applications running at end-nodes. The architecture of the resilient IP forwarder is shown in Figure 7.

RON uses UDP to forward data, since TCP's reliable byte-stream is mismatched to many RON clients, and IP-based encapsulation
To RON
   nodes                                           FreeBSD Networking
                                                                                         To/From      5.2 Routers
                                                                                         Local Site
                                                             IP                                          RON routers implement the router virtual interface, which
                               UDP                  Divert Socket  Raw Socket                         has only a single function call, lookup(pkt *mypkt). The
                                                                                                      RON library provides a trivial static router, and a dynamic router
                                                                     Yes            Emit              that routes based upon different metric optimizations. The dynamic
                                                    Must_Fragment?
                   send()

                               recv()
                                                                                    ICMP              router is extensible by linking it with a different set of metric de-
                                         Conduit      Classify   flags
                                                                 policy                               scriptions. Metric descriptions provide an evaluation function that
                                                     Encapsulate       Outbound                       returns the “score” of a link, and a list of metrics that the routing
                                                                                                      table needs to generate and propagate. The implementation’s rout-
                                                                                                      ing table creation is specific to single-hop indirection, which also
                                                    Yes
                              is_local?                                                               eliminates the need for the flow cache. The following algorithm
       Forwarder




                                                           Type Demux             Local
                                        No                                        Data                fills in the multi-level routing table at the bottom of Figure 7:
                                                                                  Protocols
                                                                                  and                      M AKE ROUTING TABLE (P OLICIES , M ETRICS , P EERS )
                              Route Lookup
                                                                                  Conduits
                                                                                                              foreach in P OLICIES
                                                                                                                         
                                                                                                                  foreach    ¡in M ETRICS

                                                                                                                                                                                       vV y4T£ R ¢  
                            Policy                   Routing Pref Demux                                               foreach       in P EERS
                            Demux                    Latency Loss B/W                                                   foreach        in P EERS
                            vBNS                                                                                           if .permits                 GX vV ¦££ ¥R €UD  
       Router




                                                                                                                                                                      ¤
                                                            Dst   Next
                                                                  Hop
                                                                                                                           AND .permits    G6QR ¦¨ vV §D
                                                                                                                                          © 4 T ¢ ¤
                            default
                                                                                                                             if
                                                                                                                                     G 4yTR ¢¤ vV £© ¨w¤R R yv €UDT y4TQRFT` T
                                                                                                                                      .eval
                                                                                                                                       ¡                  ;                          ¡
                                                                                                                                             ¡
                                                                                                                                                 ;
                                                                                                                                                   ;
                                                                                                                                                         ¡       £` ` ¨wyvvV  T 4 y4 !TR QRP
                                                                                                                                                                   ¡

                                                                                                                                                             vV T # R
                                                                                                                                                                   £
                                                                                                                                                                             
                                                                                                                                                                                  

                                                                                                                                                                                                    $
                                                                                                                                 $
Figure 7: The resilient IP forwarder. Packets enter the sys-
tem through FreeBSD’s divert sockets. They are handed to the                                                                table[ ][          vV  4 !8P ` y4TR ¢ ¡  
                                                                                                                                                        R
                                                                                                                                                       ][              ]                         ;
RON forwarder, which looks up the next hop. When the packet
arrives at its destination, the conduit writes the packet out on a                                       We have implemented the clique classifier discussed in Sec-
local raw socket, where it re-enters IP processing.                                                   tion 4.3, and are implementing a general policy classifier. Both
                                                                                                      provide classifiers for the resilient IP forwarder.

                                                                                                      5.3 Monitoring Virtual Links
would restrict its application-specific forwarding capabilities. The
RON core services run without special kernel support or elevated                                                             Node A                                        Node B
                                                                                                                                                     Initial P
privileges. The conduit at the entry node is the client’s gateway to                                            ID 5: time 10                                   ing
the RON, and classifies every packet. This classification is client-                                                                                      se 1                     ID 5: time 33
                                                                                                                                                 Respon
specific; it labels the packet with information that decides what                                                ID 5: time 15                      Respon
routing metric is later used to route the packet. As an example,                                                                                           se 2
                                                                                                                                                                                 ID 5: time 39
a RON IP forwarder conduit might label DNS and HTTP traffic
as “latency-sensitive” and FTP traffic as “throughput-intensive,”
                                                                                                               Record success with RTT 5
which would cause the downstream RON forwarders to use the                                                                          Record success with RTT 6
appropriate routing metric for each packet type. Non-entry RON
nodes route the packet based only on the attached label and desti-
nation of the packet.                                                                                 Figure 8: The active prober’s probing mechanism. With three
   This design ensures that all client- and application-specific rout-                                 packets, both participants get an RTT sample without requir-
ing functions occur only at the entry and exit conduits, and that for-                                ing synchronized clocks.
warding inside the RON is independent of client-specific logic. In
addition to reducing the per-packet overhead at intermediate nodes,                                                                              £
                                                                                                         Each RON node in an -node RON monitors its
                                                                                                                                                                                                        £   ¢  
it also localizes the client-specific computation required on each                                     virtual links using randomized periodic probes. The active
packet. This means, for example, that a RON node implementing                                         prober component maintains a copy of a peers table with a
an IP forwarder may also participate in an overlay with a confer-                                     next probe time field per peer. When this field expires, the
encing application, and help improve the reliability of data delivery                                 prober sends a small UDP probe packet to the remote peer. Each
for the conference. The conference node conduits implement the                                        probe packet has a random 64-bit ID. The process used by the
application-specific methods that label packets to tell the IP for-                                    prober is shown in Figure 8. When a node receives an initial probe
warder which routing metrics to use for conference packets.                                           request from a peer, it sends response 1 to that peer, and resets its
                                                                                                      probe timer for that peer. When the originating node sees response
5.1 The IP Forwarder                                                                                  1, it sends response 2 back to the peer, so that both sides get reach-
   We implemented the resilient IP forwarder using FreeBSD’s di-                                      ability and RTT information from 3 packets.
vert sockets to automatically send IP traffic over the RON, and emit                                      The probing protocol is implemented as a RON client, which
it at the other end. The resilient IP forwarder provides classifica-                                   communicates with performance database (implemented as a stand-
tion, encapsulation, and decapsulation of IP packets through a spe-                                   alone application running on the Berkeley DB3 backend) using a
cial conduit called the ip conduit (top of Figure 7).                                                 simple UDP-based protocol.
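The probe bookkeeping lends itself to a compact sketch. The code below is illustrative (the class and method names are ours, not the RON library's); it shows how each side of the three-packet exchange closes its own RTT sample using only its local clock, as in Figure 8.

```cpp
#include <cstdint>
#include <map>

// Illustrative bookkeeping for the three-packet probe exchange.
// Each side timestamps events with its own clock only, so no clock
// synchronization is required between the two nodes.
enum class PktType { Probe, Response1, Response2 };

struct ProbePkt {
    uint64_t id;   // random 64-bit probe ID
    PktType type;
};

class Prober {
public:
    // Originator: remember when we sent the initial probe.
    void on_send_probe(const ProbePkt& p, double now) { sent_[p.id] = now; }

    // Responder: remember when we answered with response 1.
    ProbePkt on_probe(const ProbePkt& p, double now) {
        answered_[p.id] = now;
        return ProbePkt{p.id, PktType::Response1};
    }

    // Originator: response 1 closes our RTT sample; emit response 2.
    ProbePkt on_response1(const ProbePkt& p, double now) {
        last_rtt_ = now - sent_[p.id];
        return ProbePkt{p.id, PktType::Response2};
    }

    // Responder: response 2 closes the responder's RTT sample.
    void on_response2(const ProbePkt& p, double now) {
        last_rtt_ = now - answered_[p.id];
    }

    double last_rtt() const { return last_rtt_; }

private:
    std::map<uint64_t, double> sent_;     // probe ID -> local send time
    std::map<uint64_t, double> answered_; // probe ID -> local response-1 time
    double last_rtt_ = -1;                // -1 until a sample completes
};
```

Replaying the timeline of Figure 8 through these handlers (probe sent at 10, response 1 sent at 33 and received at 15, response 2 received at 39) yields RTT 5 at the originator and RTT 6 at the responder.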
   - RON was able to successfully detect and recover from 100%
     (in RON1) and 60% (in RON2) of all complete outages and all
     periods of sustained high loss rates of 30% or more. (Section 6.2)
   - RON takes 18 seconds, on average, to route around a failure
     and can do so in the face of a flooding attack. (Section 6.2)
   - RON successfully routed around bad throughput failures,
     doubling TCP throughput in 5% of all samples. (Section 6.3)
   - In 5% of the samples, RON reduced the loss probability by
     0.05 or more. (Section 6.3)
   - Single-hop route indirection captured the majority of benefits
     in our RON deployment, for both outage recovery and latency
     optimization. (Section 6.4)

Table 1: Major results from measurements of the RON testbed.

   Name       Description
   Aros       ISP in Salt Lake City, UT
   CCI        .com in Salt Lake City, UT
   Cisco-MA   .com in Waltham, MA
   * CMU      Pittsburgh, PA
   * Cornell  Ithaca, NY
   Lulea      Lulea University, Sweden
   MA-Cable   MediaOne Cable in Cambridge, MA
   * MIT      Cambridge, MA
   CA-T1      .com in Foster City, CA
   * NYU      New York, NY
   * Utah     Salt Lake City, UT
   VU-NL      Vrije Univ., Amsterdam, Netherlands
   Additional hosts used in the RON2 dataset:
   OR-DSL     DSL in Corvallis, OR
   NC-Cable   MediaOne Cable in Durham, NC
   PDI        .com in Palo Alto, CA
   Mazu       .com in Boston, MA

Table 2: The hosts in our sixteen-node RON deployment, which
we study in detail to determine the effectiveness of RON in
practice. Asterisks indicate U.S. universities on the Internet2
backbone. The two European universities, Lulea and VU-NL,
are classified as non-Internet2.

6. Evaluation

   The goal of the RON system is to overcome path outages and per-
formance failures, without introducing excessive overhead or new
failure modes. In this section, we present an evaluation of how
well RON meets these goals. We evaluate the performance of the
resilient IP forwarder, which uses RON for outage detection and
loss, latency, and throughput optimization.
   Our evaluation has three main parts. First, we study
RON's ability to detect outages and recover quickly from them.
Next, we investigate performance failures and RON's ability to im-
prove the loss rate, latency, and throughput of badly performing
paths. Finally, we investigate two important aspects of RON's rout-
ing, showing the effectiveness of its one-intermediate-hop strategy
compared to more general alternatives and the stability of RON-
generated routes. Table 1 summarizes our key findings.

6.1 Methodology

   Most of our results come from experiments with a wide-area
RON deployed at several Internet sites. There are N x (N - 1) differ-
ent paths between the hosts in an N-site RON deployment. Table 2
shows the location of sites for our experiments. We analyze two
distinct datasets: RON1, with N = 12 nodes and 132 distinct
paths, and RON2, with N = 16 nodes and 240 distinct paths. In
RON1, traceroute data shows that 36 different AS's were tra-
versed, with at least 74 distinct inter-AS links; for RON2, 50 AS's
and 118 inter-AS links were traversed. The O(N^2) scaling of path
diversity suggests that even small numbers of confederating hosts
expose a large number of Internet paths to examination [18, 19].
We do not claim that our experiments and results are typical or rep-
resentative of anything other than our deployment, but present and
analyze them to demonstrate the kinds of gains one might realize
with RONs.
   Several of our host sites are Internet2-connected educational
sites, and we found that this high-speed experimental network
presents opportunities for path improvement not available in the
public Internet. To demonstrate that our policy routing module
works, and to make our measurements closer to what Internet hosts
in general would observe, all the measurements reported herein
were taken with a policy that prohibited sending traffic to or from
commercial sites over the Internet2. Therefore, for instance, a
packet could travel from Utah to Cornell to NYU, but not from
Aros to Utah to NYU. This is consistent with the AUP of Internet2
that precludes commercial traffic. As a result, in our measurements,
all path improvements involving a commercial site came from only
commercial links, and never from the more reliable Internet2 links.
   The raw measurement data used in this paper consists of probe
packets, throughput samples, and traceroute results. To probe,
each RON node independently repeated the following steps:

   1. Pick a random node, D.
   2. Pick a probe-type from one of {direct, latency, loss} using
      round-robin selection. Send a probe to node D.
   3. Delay for a random time interval between 1 and 2 seconds.

   This characterizes all N x (N - 1) paths. For RON1, we analyze
10.9 million packet departure and arrival times, collected over 64
hours between 21 March 2001 and 23 March 2001. This produces
2.6 million individual RTT, one-way loss, and jitter data samples,
for which we calculated time-averaged samples over a 30-minute
duration.(1) We also took 8,855 throughput samples from 1 MByte
bulk transfers (or 30 seconds, whichever came sooner), recording
the time at which each power of two's worth of data was sent and
the duration of the transfer. The RON2 data was collected over 85
hours between 7 May 2001 and 11 May 2001, and consists of 13.8
million packet arrival and departure times. Space constraints allow
us to present only the path outage analysis of RON2; the perfor-
mance failure results from RON2 are similar to those from RON1 [1].
   Most of the RON hosts were Intel Celeron/733-based machines
running FreeBSD, with 256MB of RAM and 9GB of disk space. The
processing capability of a host was never a bottleneck in any of
our wide-area experiments. Our measurement data is available at
http://nms.lcs.mit.edu/ron/.

6.2 Overcoming Path Outages

   Precisely measuring a path outage is harder than one might think.
One possibility is to define an outage as the length of time during
which no packets get through the path. As a yardstick, we could
use the ranges of time in which TCP implementations will time out
and shut down a connection; these vary from about 120 seconds

(1) Some hosts came and went during the study, so the number of
samples in the averages is not always as large as one might expect
if all hosts were continually present. More details are available in
the documentation accompanying the data.

Analysis of Selfish behavior in energy consumption model based Multihop cellu...Analysis of Selfish behavior in energy consumption model based Multihop cellu...
Analysis of Selfish behavior in energy consumption model based Multihop cellu...
 
Comparison of routing protocols with performance parameters for different num...
Comparison of routing protocols with performance parameters for different num...Comparison of routing protocols with performance parameters for different num...
Comparison of routing protocols with performance parameters for different num...
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
New adaptation method based on cross layer and TCP over protocols to improve ...
New adaptation method based on cross layer and TCP over protocols to improve ...New adaptation method based on cross layer and TCP over protocols to improve ...
New adaptation method based on cross layer and TCP over protocols to improve ...
 
Dvr based hybrid routing protocols in mobile ad-hoc network application and c...
Dvr based hybrid routing protocols in mobile ad-hoc network application and c...Dvr based hybrid routing protocols in mobile ad-hoc network application and c...
Dvr based hybrid routing protocols in mobile ad-hoc network application and c...
 
Multipath qos aware routing protocol
Multipath qos aware routing protocolMultipath qos aware routing protocol
Multipath qos aware routing protocol
 
11.a review of improvement in tcp congestion control using route failure det...
11.a  review of improvement in tcp congestion control using route failure det...11.a  review of improvement in tcp congestion control using route failure det...
11.a review of improvement in tcp congestion control using route failure det...
 
Performance Evaluation of Routing Protocols for MAC Layer Models
Performance Evaluation of Routing Protocols for MAC Layer ModelsPerformance Evaluation of Routing Protocols for MAC Layer Models
Performance Evaluation of Routing Protocols for MAC Layer Models
 
eaodv
eaodveaodv
eaodv
 
Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...Research Inventy : International Journal of Engineering and Science is publis...
Research Inventy : International Journal of Engineering and Science is publis...
 
S26117122
S26117122S26117122
S26117122
 
Comparative Review for Routing Protocols in Mobile Ad-Hoc Networks
Comparative Review for Routing Protocols in Mobile Ad-Hoc NetworksComparative Review for Routing Protocols in Mobile Ad-Hoc Networks
Comparative Review for Routing Protocols in Mobile Ad-Hoc Networks
 

Similaire à Resilient Overlay Networks Detects and Recovers from Internet Path Outages in Seconds

Quick Routing for Communication in MANET using Zone Routing Protocol
Quick Routing for Communication in MANET using Zone Routing ProtocolQuick Routing for Communication in MANET using Zone Routing Protocol
Quick Routing for Communication in MANET using Zone Routing Protocolijceronline
 
Load aware and load balancing using aomdv routing in manet
Load aware and load balancing using aomdv routing in manetLoad aware and load balancing using aomdv routing in manet
Load aware and load balancing using aomdv routing in manetijctet
 
IRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc Network
IRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc NetworkIRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc Network
IRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc NetworkIRJET Journal
 
Packet Transfer Rate & Robust Throughput for Mobile Adhoc Network
Packet Transfer Rate & Robust Throughput for Mobile Adhoc NetworkPacket Transfer Rate & Robust Throughput for Mobile Adhoc Network
Packet Transfer Rate & Robust Throughput for Mobile Adhoc NetworkEswar Publications
 
Multipath Fault Tolerant Routing Protocol in MANET
Multipath Fault Tolerant Routing Protocol in MANET  Multipath Fault Tolerant Routing Protocol in MANET
Multipath Fault Tolerant Routing Protocol in MANET pijans
 
Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...
Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...
Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...IDES Editor
 
Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...
Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...
Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...IOSR Journals
 
Cross-layer based performance optimization for different mobility and traffic...
Cross-layer based performance optimization for different mobility and traffic...Cross-layer based performance optimization for different mobility and traffic...
Cross-layer based performance optimization for different mobility and traffic...IOSR Journals
 
Comparative Review for Routing Protocols in Mobile Ad-Hoc Networks
Comparative Review for Routing Protocols in Mobile Ad-Hoc NetworksComparative Review for Routing Protocols in Mobile Ad-Hoc Networks
Comparative Review for Routing Protocols in Mobile Ad-Hoc Networksjake henry
 
A Literature Survey of MANET.pdf
A Literature Survey of MANET.pdfA Literature Survey of MANET.pdf
A Literature Survey of MANET.pdfLaurie Smith
 
A Literature Survey of MANET
A Literature Survey of MANETA Literature Survey of MANET
A Literature Survey of MANETIRJET Journal
 
cse Wireless Mesh Networks ppt.pptx
cse  Wireless Mesh Networks ppt.pptxcse  Wireless Mesh Networks ppt.pptx
cse Wireless Mesh Networks ppt.pptxSANTHAKUMARP5
 
Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...
Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...
Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...IRJET Journal
 
Cooperative Data Forwarding In AODV Routing Protocol
Cooperative Data Forwarding In AODV Routing ProtocolCooperative Data Forwarding In AODV Routing Protocol
Cooperative Data Forwarding In AODV Routing ProtocolIOSR Journals
 
IRJET- Survey on Enhancement of Manet Routing Protocol
IRJET- Survey on Enhancement of Manet Routing ProtocolIRJET- Survey on Enhancement of Manet Routing Protocol
IRJET- Survey on Enhancement of Manet Routing ProtocolIRJET Journal
 

Similaire à Resilient Overlay Networks Detects and Recovers from Internet Path Outages in Seconds (20)

Quick Routing for Communication in MANET using Zone Routing Protocol
Quick Routing for Communication in MANET using Zone Routing ProtocolQuick Routing for Communication in MANET using Zone Routing Protocol
Quick Routing for Communication in MANET using Zone Routing Protocol
 
Load aware and load balancing using aomdv routing in manet
Load aware and load balancing using aomdv routing in manetLoad aware and load balancing using aomdv routing in manet
Load aware and load balancing using aomdv routing in manet
 
IRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc Network
IRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc NetworkIRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc Network
IRJET-A_AODV: A Modern Routing Algorithm for Mobile Ad-Hoc Network
 
Packet Transfer Rate & Robust Throughput for Mobile Adhoc Network
Packet Transfer Rate & Robust Throughput for Mobile Adhoc NetworkPacket Transfer Rate & Robust Throughput for Mobile Adhoc Network
Packet Transfer Rate & Robust Throughput for Mobile Adhoc Network
 
Multipath Fault Tolerant Routing Protocol in MANET
Multipath Fault Tolerant Routing Protocol in MANET  Multipath Fault Tolerant Routing Protocol in MANET
Multipath Fault Tolerant Routing Protocol in MANET
 
O26087092
O26087092O26087092
O26087092
 
Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...
Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...
Simulation Based Routing Protocols Evaluation for IEEE 802.15.4 enabled Wirel...
 
Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...
Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...
Detecting Good Neighbor Nodes and Finding Reliable Routing Path Based on AODV...
 
Cross-layer based performance optimization for different mobility and traffic...
Cross-layer based performance optimization for different mobility and traffic...Cross-layer based performance optimization for different mobility and traffic...
Cross-layer based performance optimization for different mobility and traffic...
 
Comparative Review for Routing Protocols in Mobile Ad-Hoc Networks
Comparative Review for Routing Protocols in Mobile Ad-Hoc NetworksComparative Review for Routing Protocols in Mobile Ad-Hoc Networks
Comparative Review for Routing Protocols in Mobile Ad-Hoc Networks
 
A Literature Survey of MANET.pdf
A Literature Survey of MANET.pdfA Literature Survey of MANET.pdf
A Literature Survey of MANET.pdf
 
A Literature Survey of MANET
A Literature Survey of MANETA Literature Survey of MANET
A Literature Survey of MANET
 
International R/E Routing (v1.0)
International R/E Routing (v1.0)International R/E Routing (v1.0)
International R/E Routing (v1.0)
 
Hs2413641369
Hs2413641369Hs2413641369
Hs2413641369
 
cse Wireless Mesh Networks ppt.pptx
cse  Wireless Mesh Networks ppt.pptxcse  Wireless Mesh Networks ppt.pptx
cse Wireless Mesh Networks ppt.pptx
 
Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...
Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...
Performance Analysis of Optimization Techniques for OLSR Routing Protocol for...
 
Cooperative Data Forwarding In AODV Routing Protocol
Cooperative Data Forwarding In AODV Routing ProtocolCooperative Data Forwarding In AODV Routing Protocol
Cooperative Data Forwarding In AODV Routing Protocol
 
B010130508
B010130508B010130508
B010130508
 
IRJET- Survey on Enhancement of Manet Routing Protocol
IRJET- Survey on Enhancement of Manet Routing ProtocolIRJET- Survey on Enhancement of Manet Routing Protocol
IRJET- Survey on Enhancement of Manet Routing Protocol
 
A340105
A340105A340105
A340105
 

Dernier

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 

Dernier (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Resilient Overlay Networks Detects and Recovers from Internet Path Outages in Seconds

  • 1. Resilient Overlay Networks

David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris
MIT Laboratory for Computer Science
ron@nms.lcs.mit.edu
http://nms.lcs.mit.edu/ron/

Abstract

A Resilient Overlay Network (RON) is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds, improving over today's wide-area routing protocols that take at least several minutes to recover. A RON is an application-layer overlay on top of the existing Internet routing substrate. The RON nodes monitor the functioning and quality of the Internet paths among themselves, and use this information to decide whether to route packets directly over the Internet or by way of other RON nodes, optimizing application-specific routing metrics.

Results from two sets of measurements of a working RON deployed at sites scattered across the Internet demonstrate the benefits of our architecture. For instance, over a 64-hour sampling period in March 2001 across a twelve-node RON, there were 32 significant outages, each lasting over thirty minutes, over the 132 measured paths. RON's routing mechanism was able to detect, recover, and route around all of them, in less than twenty seconds on average, showing that its methods for fault detection and recovery work well at discovering alternate paths in the Internet. Furthermore, RON was able to improve the loss rate, latency, or throughput perceived by data transfers; for example, about 5% of the transfers doubled their TCP throughput and 5% of our transfers saw their loss probability reduced by 0.05. We found that forwarding packets via at most one intermediate RON node is sufficient to overcome faults and improve performance in most cases. These improvements, particularly in the area of fault detection and recovery, demonstrate the benefits of moving some of the control over routing into the hands of end-systems.

1. Introduction

The Internet is organized as independently operating autonomous systems (AS's) that peer together. In this architecture, detailed routing information is maintained only within a single AS and its constituent networks, usually operated by some network service provider. The information shared with other providers and AS's is heavily filtered and summarized using the Border Gateway Protocol (BGP-4) running at the border routers between AS's [21], which allows the Internet to scale to millions of networks.

This wide-area routing scalability comes at the cost of reduced fault-tolerance of end-to-end communication between Internet hosts. This cost arises because BGP hides many topological details in the interests of scalability and policy enforcement, has little information about traffic conditions, and damps routing updates when potential problems arise to prevent large-scale oscillations. As a result, BGP's fault recovery mechanisms sometimes take many minutes before routes converge to a consistent form [12], and there are times when path outages even lead to significant disruptions in communication lasting tens of minutes or more [3, 18, 19]. The result is that today's Internet is vulnerable to router and link faults, configuration errors, and malice—hardly a week goes by without some serious problem affecting the connectivity provided by one or more Internet Service Providers (ISPs) [15].

Resilient Overlay Networks (RONs) are a remedy for some of these problems. Distributed applications layer a "resilient overlay network" over the underlying Internet routing substrate. The nodes comprising a RON reside in a variety of routing domains, and cooperate with each other to forward data on behalf of any pair of communicating nodes in the RON. Because AS's are independently administrated and configured, and routing domains rarely share interior links, they generally fail independently of each other. As a result, if the underlying topology has physical path redundancy, RON can often find paths between its nodes, even when wide-area Internet routing protocols like BGP-4 cannot.

The main goal of RON is to enable a group of nodes to communicate with each other in the face of problems with the underlying Internet paths connecting them. RON detects problems by aggressively probing and monitoring the paths connecting its nodes. If the underlying Internet path is the best one, that path is used and no other RON node is involved in the forwarding path. If the Internet path is not the best one, the RON will forward the packet by way of other RON nodes. In practice, we have found that RON can route around most failures by using only one intermediate hop.

RON nodes exchange information about the quality of the paths among themselves via a routing protocol and build forwarding tables based on a variety of path metrics, including latency, packet loss rate, and available throughput. Each RON node obtains the path metrics using a combination of active probing experiments and passive observations of on-going data transfers. In our implementation, each RON is explicitly designed to be limited in size—between two and fifty nodes—to facilitate aggressive path maintenance via probing without excessive bandwidth overhead. This

This research was sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Space and Naval Warfare Systems Center, San Diego, under contract N66001-00-1-8933.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. 18th ACM Symp. on Operating Systems Principles (SOSP) October 2001, Banff, Canada. Copyright 2001 ACM
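As a rough illustration of the selection rule described above (not the authors' implementation), a node holding probe-derived metrics for each virtual link could choose between the direct Internet path and every one-intermediate-hop alternative as follows. The node names and the additive latency metric are assumptions for the sketch; loss rates, for instance, would combine multiplicatively rather than additively.

```python
def best_path(metrics, src, dst, nodes):
    """Pick the direct path or the best one-intermediate-hop detour.

    metrics[(a, b)] is the probed latency for the virtual link a->b;
    a missing entry models an unreachable (failed) link.
    Returns ('direct',) or ('via', n) for an intermediate node n.
    """
    def cost(a, b):
        return metrics.get((a, b), float("inf"))

    best = ("direct",)
    best_cost = cost(src, dst)
    for n in nodes:
        if n in (src, dst):
            continue
        # Latency over a two-link detour is approximately additive.
        c = cost(src, n) + cost(n, dst)
        if c < best_cost:
            best, best_cost = ("via", n), c
    return best
```

For example, if the direct link has failed (no metric entry) but probes show working links through a third node, the function returns the one-hop detour, mirroring the paper's observation that a single intermediate hop usually suffices.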
  • 2. allows RON to recover from problems in the underlying Internet in several seconds rather than several minutes.

[Figure 1 (network map) shows the deployment; node labels include vu.nl, Lulea.se, OR-DSL, CMU, MIT, MA-Cable, Cisco, Mazu, Cornell, CA-T1, CCI, NYU, PDI, Aros, Utah, and NC-Cable.]

Figure 1: The current sixteen-node RON deployment. Five sites are at universities in the USA, two are European universities (not shown), three are "broadband" home Internet hosts connected by Cable or DSL, one is located at a US ISP, and five are at corporations in the USA.

The second goal of RON is to integrate routing and path selection with distributed applications more tightly than is traditionally done. This integration includes the ability to consult application-specific metrics in selecting paths, and the ability to incorporate application-specific notions of what network conditions constitute a "fault." As a result, RONs can be used in a variety of ways. A multimedia conferencing program may link directly against the RON library, transparently forming an overlay between all participants in the conference, and using loss rates, delay jitter, or application-observed throughput as metrics on which to choose paths. An administrator may wish to use a RON-based router application to form an overlay network between multiple LANs as an "Overlay VPN." This idea can be extended further to develop an "Overlay ISP," formed by linking (via RON) points of presence in different traditional ISPs after buying bandwidth from them. Using RON's routing machinery, an Overlay ISP can provide more resilient and failure-resistant Internet service to its customers.

The third goal of RON is to provide a framework for the implementation of expressive routing policies, which govern the choice of paths in the network. For example, RON facilitates classifying packets into categories that could implement notions of acceptable use, or enforce forwarding rate controls.

This paper describes the design and implementation of RON, and presents several experiments that evaluate whether RON is a good idea. To conduct this evaluation and demonstrate the benefits of RON, we have deployed a working sixteen-node RON at sites sprinkled across the Internet (see Figure 1). The RON client we experiment with is a resilient IP forwarder, which allows us to compare connections between pairs of nodes running over a RON against running straight over the Internet.

We have collected a few weeks' worth of experimental results of path outages and performance failures and present a detailed analysis of two separate datasets: RON1 with twelve nodes measured in March 2001 and RON2 with sixteen nodes measured in May 2001. In both datasets, we found that RON was able to route around between 60% and 100% of all significant outages. Our implementation takes 18 seconds, on average, to detect and route around a path failure and is able to do so in the face of an active denial-of-service attack on a path. We also found that these benefits of quick fault detection and successful recovery are realized on the public Internet and do not depend on the existence of non-commercial or private networks (such as the Internet2 backbone that interconnects many educational institutions); our ability to determine this was enabled by RON's policy routing feature that allows the expression and implementation of sophisticated policies that determine how paths are selected for packets.

We also found that RON successfully routed around performance failures: in RON2, the loss probability improved by at least 0.05 in 5% of the samples, end-to-end communication latency reduced by 40 ms in 11% of the samples, and TCP throughput doubled in 5% of all samples. In addition, we found cases when RON's loss, latency, and throughput-optimizing path selection mechanisms all chose different paths between the same two nodes, suggesting that application-specific path selection techniques are likely to be useful in practice. A noteworthy finding from the experiments and analysis is that in most cases, forwarding packets via at most one intermediate RON node is sufficient both for recovering from failures and for improving communication latency.

2. Related Work

To our knowledge, RON is the first wide-area network overlay system that can detect and recover from path outages and periods of degraded performance within several seconds. RON builds on previous studies that quantify end-to-end network reliability and performance, on IP-based routing techniques for fault-tolerance, and on overlay-based techniques to enhance performance.

2.1 Internet Performance Studies

Labovitz et al. [12] use a combination of measurement and analysis to show that inter-domain routers in the Internet may take tens of minutes to reach a consistent view of the network topology after a fault, primarily because of routing table oscillations during BGP's rather complicated path selection process. They find that during this period of "delayed convergence," end-to-end communication is adversely affected. In fact, outages on the order of minutes cause active TCP connections (i.e., connections in the ESTABLISHED state with outstanding data) to terminate when TCP does not receive an acknowledgment for its outstanding data. They also find that, while part of the convergence delays can be fixed with changes to the deployed BGP implementations, long delays and temporary oscillations are a fundamental consequence of the BGP path vector routing protocol.

Paxson's probe experiments show that routing pathologies prevent selected Internet hosts from communicating up to 3.3% of the time averaged over a long time period, and that this percentage has not improved with time [18]. Labovitz et al. find, by examining routing table logs at Internet backbones, that 10% of all considered routes were available less than 95% of the time, and that less than 35% of all routes were available more than 99.99% of the time [13]. Furthermore, they find that about 40% of all path outages take more than 30 minutes to repair and are heavy-tailed in their duration. More recently, Chandra et al. find using active probing that 5% of all detected failures last more than 10,000 seconds (2 hours, 45 minutes), and that failure durations are heavy-tailed and can last for as long as 100,000 seconds before being repaired [3]. These findings do not augur well for mission-critical services that require a higher degree of end-to-end communication availability.

The Detour measurement study made the observation, using Paxson's and their own data collected at various times between 1995 and 1999, that path selection in the wide-area Internet is sub-optimal from the standpoint of end-to-end latency, packet loss rate, and TCP throughput [23]. This study showed the potential long-term benefits of "detouring" packets via a third node by comparing
This study showed the potential long-term benefits of "detouring" packets via a third node by comparing the long-term average properties of detoured paths against Internet-chosen paths.

2.2 Network-layer Techniques

Much work has been done on performance-based and fault-tolerant routing within a single routing domain, but practical mechanisms for wide-area Internet recovery from outages or badly performing paths are lacking.

Although today's wide-area BGP-4 routing is based largely on AS hop-counts, early ARPANET routing was more dynamic, responding to the current delay and utilization of the network. By 1989, the ARPANET had evolved to using a delay- and congestion-based distributed shortest-path routing algorithm [11]. However, the diversity and size of today's decentralized Internet necessitated the deployment of protocols that perform more aggregation and fewer updates. As a result, unlike some interior routing protocols within AS's, BGP-4 routing between AS's optimizes for scalable operation over all else. By treating vast collections of subnetworks as a single entity for global routing purposes, BGP-4 is able to summarize and aggregate enormous amounts of routing information into a format that scales to hundreds of millions of hosts. To prevent costly route oscillations, BGP-4 explicitly damps changes in routes. Unfortunately, while aggregation and damping provide good scalability, they interfere with rapid detection and recovery when faults occur. RON handles this by leaving scalable operation to the underlying Internet substrate, moving fault detection and recovery to a higher-layer overlay that is capable of faster response because it does not have to worry about scalability.

An oft-cited "solution" to achieving fault-tolerant network connectivity for a small- or medium-sized customer is to multi-home, advertising a customer network through multiple ISPs. The idea is that an outage in one ISP would leave the customer connected via the other. However, this solution does not generally achieve fault detection and recovery within several seconds, because of the degree of aggregation used to achieve wide-area routing scalability. To limit the size of their routing tables, many ISPs will not accept routing announcements for fewer than 8192 contiguous addresses (a "/19" netblock). Small companies, regardless of their fault-tolerance needs, do not often require such a large address block, and cannot effectively multi-home. One alternative may be "provider-based addressing," where an organization gets addresses from multiple providers, but this requires handling two distinct sets of addresses on its hosts. It is unclear how on-going connections on one address set can seamlessly switch on a failure in this model.

2.3 Overlay-based Techniques

Overlay networks are an old idea; in fact, the Internet itself was developed as an overlay on the telephone network. Several Internet overlays have been designed in the past for various purposes, including providing OSI network-layer connectivity [10], easing IP multicast deployment using the MBone [6], and providing IPv6 connectivity using the 6-Bone [9]. The X-Bone is a recent infrastructure project designed to speed the deployment of IP-based overlay networks [26]. It provides management functions and mechanisms to insert packets into the overlay, but does not yet support fault-tolerant operation or application-controlled path selection.

Few overlay networks have been designed for efficient fault detection and recovery, although some have been designed for better end-to-end performance. The Detour framework [5, 22] was motivated by the potential long-term performance benefits of indirect routing [23]. It is an in-kernel packet encapsulation and routing architecture designed to support alternate-hop routing, with an emphasis on high-performance packet classification and routing. It uses IP-in-IP encapsulation to send packets along alternate paths.

While RON shares with Detour the idea of routing via other nodes, our work differs from Detour in three significant ways. First, RON seeks to prevent disruptions in end-to-end communication in the face of failures. RON takes advantage of underlying Internet path redundancy on time-scales of a few seconds, reacting responsively to path outages and performance failures. Second, RON is designed as an application-controlled routing overlay; because each RON is more closely tied to the application using it, RON more readily integrates application-specific path metrics and path selection policies. Third, we present and analyze experimental results from a real-world deployment of a RON to demonstrate fast recovery from failure and improved latency and loss-rates even over short time-scales.

An alternative design to RON would be to use a generic overlay infrastructure like the X-Bone and port a standard network routing protocol (like OSPF or RIP) with low timer values. However, this by itself will not improve the resilience of Internet communications, for two reasons. First, a reliable and low-overhead outage detection module is required to distinguish packet losses caused by congestion or error-prone links from legitimate problems with a path. Second, generic network-level routing protocols do not utilize application-specific definitions of faults.

Various Content Delivery Networks (CDNs) use overlay techniques and caching to improve the performance of content delivery for specific applications such as HTTP and streaming video. The functionality provided by RON may ease future CDN development by providing some routing components required by these services.

3. Design Goals

The design of RON seeks to meet three main design goals: (i) failure detection and recovery in less than 20 seconds; (ii) tighter integration of routing and path selection with the application; and (iii) expressive policy routing.

3.1 Fast Failure Detection and Recovery

Today's wide-area Internet routing system based on BGP-4 does not handle failures well. From a network perspective, we define two kinds of failures. Link failures occur when a router or a link connecting two routers fails because of a software error, hardware problem, or link disconnection. Path failures occur for a variety of reasons, including denial-of-service attacks or other bursts of traffic that cause a high degree of packet loss or high, variable latencies. Applications perceive all failures in one of two ways: outages or performance failures. Link failures and extreme path failures cause outages, when the average packet loss rate over a sustained period of several minutes is high (about 30% or higher), causing most protocols, including TCP, to degrade by several orders of magnitude. Performance failures are less extreme; for example, throughput, latency, or loss-rates might degrade by a factor of two or three.

BGP-4 takes a long time, on the order of several minutes, to converge to a new valid route after a link failure causes an outage [12]. In contrast, RON's goal is to detect and recover from outages and performance failures within several seconds. Compounding this problem, IP-layer protocols like BGP-4 cannot detect problems such as packet floods and persistent congestion on links or paths that greatly degrade end-to-end performance. As long as a link is deemed "live" (i.e., the BGP session is still alive), BGP's AS-path-based routing will continue to route packets down the faulty path; unfortunately, such a path may not provide adequate performance for an application using it.
Figure 3: The RON system architecture. Data enters the RON from RON clients via a conduit at an entry node. At each node, the RON forwarder consults with its router to determine the best path for the packet, and sends it to the next node. Path selection is done at the entry node, which also tags the packet, simplifying the forwarding path at other nodes. When the packet reaches the RON exit node, the forwarder there hands it to the appropriate output conduit, which passes the data to the client. To choose paths, RON nodes monitor the quality of their virtual links using active probing and passive observation. RON nodes use a link-state routing protocol to disseminate the topology and virtual-link quality of the overlay network.

Figure 2: Internet interconnections are often complex. The dotted links are private and are not announced globally.

3.2 Tighter Integration with Applications

Failures and faults are application-specific notions: network conditions that are fatal for one application may be acceptable for another, more adaptive one. For instance, a UDP-based Internet audio application not using good packet-level error correction may not work at all at loss rates larger than 10%. At this loss rate, a bulk-transfer application using TCP will continue to work because of TCP's adaptation mechanisms, albeit at lower performance. However, at loss rates of 30% or more, TCP becomes essentially unusable because it times out for most packets [16]. RON allows applications to independently define and react to failures.

In addition, applications may prioritize some metrics over others (e.g., latency over throughput, or low loss over latency) in their path selection. They may also construct their own metrics to select paths. A routing system may not be able to optimize all of these metrics simultaneously; for example, a path with a one-second latency may appear to be the best throughput path, but this degree of latency may be unacceptable to an interactive application. Currently, RON's goal is to allow applications to influence the choice of paths using a single metric. We plan to explore multi-criteria path selection in the future.

3.3 Expressive Policy Routing

Despite the need for policy routing and enforcement of acceptable use and other policies, today's approaches are primitive and cumbersome. For instance, BGP-4 is incapable of expressing fine-grained policies aimed at users or hosts. This lack of precision not only reduces the set of paths available in the case of a failure, but also inhibits innovation in the use of carefully targeted policies, such as end-to-end per-user rate controls or enforcement of acceptable use policies (AUPs) based on packet classification. Because RONs will typically run on relatively powerful end-points, we believe they are well-suited to providing fine-grained policy routing.

Figure 2 shows the AS-level network connectivity between four of our RON hosts; the full graph for (only) 12 hosts traverses 36 different autonomous systems. The figure gives a hint of the considerable underlying path redundancy available in the Internet—the reason RON works—and shows situations where BGP's blunt policy expression inhibits fail-over. For example, if the Aros-UUNET connection failed, users at Aros would be unable to reach MIT even if they were authorized to use Utah's network resources to get there. This is because it is impossible to announce a BGP route only to particular users, so the Utah-MIT link is kept completely private.

4. Design

The conceptual design of RON, shown in Figure 3, is quite simple. RON nodes, deployed at various locations on the Internet, form an application-layer overlay to cooperatively route packets for each other. Each RON node monitors the quality of the Internet paths between it and the other nodes, and uses this information to intelligently select paths for packets. Each Internet path between two nodes is called a virtual link. To discover the topology of the overlay network and obtain information about all virtual links in the topology, every RON node participates in a routing protocol to exchange information about a variety of quality metrics. Most of RON's design supports routing through multiple intermediate nodes, but our results (Section 6) show that using at most one intermediate RON node is sufficient most of the time. Therefore, parts of our design focus on finding better paths via a single intermediate RON node.

4.1 Software Architecture

Each program that communicates with the RON software on a node is a RON client. The overlay network is defined by a single group of clients that collaborate to provide a distributed service or application. This group of clients can use service-specific routing metrics when deciding how to forward packets in the group. Our design accommodates a variety of RON clients, ranging from a generic IP packet forwarder that improves the reliability of IP packet delivery, to a multi-party conferencing application that incorporates application-specific metrics in its route selection.

A RON client interacts with RON across an API called a conduit, which the client uses to send and receive packets. On the data path, the first node that receives a packet (via the conduit) classifies it to determine the type of path it should use (e.g., low-latency, high-throughput, etc.). This node is called the entry node: it determines a path from its topology table, encapsulates the packet into a RON header, tags it with some information that simplifies forwarding by downstream RON nodes, and forwards it on. Each subsequent RON node simply determines the next forwarding hop based on the destination address and the tag. The final RON node that delivers the packet to the RON application is called the exit node.
The conduits access RON via two functions:

1. send(pkt, dst, via_ron) allows a node to forward a packet to a destination RON node, either along the RON or using the direct Internet path. RON's delivery, like UDP, is best-effort and unreliable.

2. recv(pkt, via_ron) is a callback function that is called when a packet arrives for the client program. This callback is invoked after the RON conduit matches the type of the packet in the RON header to the set of types pre-registered by the client when it joins the RON. The RON packet type is a demultiplexing field for incoming packets.

The basic RON functionality is provided by the forwarder object, which implements the above functions. It also provides a timer registration and callback mechanism to perform periodic operations, and a similar service for network socket data availability. Each client must instantiate a forwarder and hand to it two modules: a RON router and a RON membership manager. The RON router implements a routing protocol. The RON membership manager implements a protocol to maintain the list of members of a RON. By default, RON provides a few different RON router and membership manager modules for clients to use.

RON routers and membership managers exchange packets using RON as their forwarding service, rather than over direct IP paths. This feature of our system is beneficial because it allows these messages to be forwarded even when some underlying IP paths fail.

4.2 Routing and Path Selection

Routing is the process of building up the forwarding tables that are used to choose paths for packets. In RON, the entry node has more control over subsequent path selection than in traditional datagram networks. This node tags the packet's RON header with an identifier that identifies the flow to which the packet belongs; subsequent routers attempt to keep a flow ID on the same path it first used, barring significant link changes. Tagging, like the IPv6 flow ID, helps support multi-hop routing by speeding up the forwarding path at intermediate nodes. It also helps tie a packet flow to a chosen path, making performance more predictable, and provides a basis for future support of multi-path routing in RON. By tagging at the entry node, the application is given maximum control over what the network considers a "flow."

The small size of a RON relative to the Internet allows it to maintain information about multiple alternate routes and to select the path that best suits the RON client according to a client-specified routing metric. By default, it maintains information about three specific metrics for each virtual link: (i) latency, (ii) packet loss rate, and (iii) throughput, as might be obtained by a bulk-transfer TCP connection between the end-points of the virtual link. RON clients can override these defaults with their own metrics, and the RON library constructs the appropriate forwarding table to pick good paths. The router builds up forwarding tables for each combination of policy routing and chosen routing metric.

4.2.1 Link-State Dissemination

The default RON router uses a link-state routing protocol to disseminate topology information between routers, which in turn is used to build the forwarding tables. Each node in an N-node RON has N−1 virtual links. Each node's router periodically requests summary information of the different performance metrics to the N−1 other nodes from its local performance database and disseminates its view to the others.

This information is sent via the RON forwarding mesh itself, to ensure that routing information is propagated in the event of path outages and heavy loss periods. Thus, the RON routing protocol is itself a RON client, with a well-defined RON packet type. This leads to an attractive property: the only time a RON router has incomplete information about any other one is when all paths in the RON from the other RON nodes to it are unavailable.

4.2.2 Path Evaluation and Selection

The RON routers need an algorithm to determine if a path is still alive, and a set of algorithms with which to evaluate potential paths. The responsibility of these metric evaluators is to provide a number quantifying how "good" a path is according to that metric. These numbers are relative, and are only compared to other numbers from the same evaluator. The two important aspects of path evaluation are the mechanism by which the data for two links are combined into a single path, and the formula used to evaluate the path.

Every RON router implements outage detection, which it uses to determine if the virtual link between it and another node is still working. It uses an active probing mechanism for this. On detecting the loss of a probe, the normal low-frequency probing is replaced by a sequence of consecutive probes, sent in relatively quick succession spaced by PROBE_TIMEOUT seconds. If OUTAGE_THRESH probes in a row elicit no response, then the path is considered "dead." If even one of them gets a response, then the subsequent higher-frequency probes are canceled. Paths experiencing outages are rated on their packet loss rate history; a path having an outage will always lose to a path not experiencing an outage. The OUTAGE_THRESH and the frequency of probing (PROBE_INTERVAL) permit a trade-off between outage detection time and the bandwidth consumed by the (low-frequency) probing process (Section 6.2 investigates this).

By default, every RON router implements three different routing metrics: the latency-minimizer, the loss-minimizer, and the TCP throughput-optimizer. The latency-minimizer forwarding table is computed using an exponential weighted moving average (EWMA) of round-trip latency samples with parameter α. For any link l, its latency estimate lat(l) is updated from a new sample s as:

    lat(l) ← α · lat(l) + (1 − α) · s    (1)

We use α = 0.9, which means that 10% of the current latency estimate is based on the most recent sample. This number is similar to the values suggested for TCP's round-trip time estimator [20]. For a RON path, the overall latency is the sum of the individual virtual link latencies: lat(path) = Σ lat(l), summing over the links l on the path.

To estimate loss rates, RON uses the average of the last 100 probe samples as the current average. Like Floyd et al. [7], we found this to be a better estimator than EWMA, which retains some memory of samples obtained in the distant past as well. It might be possible to further improve our estimator by unequally weighting some of the samples [7]. Loss metrics are multiplicative on a path: if we assume that losses are independent, the probability of success on the entire path is roughly equal to the probability of surviving all hops individually: 1 − loss(path) = Π (1 − loss(l)), taking the product over the links l on the path.

RON does not attempt to find optimal throughput paths, but strives to avoid paths of low throughput when good alternatives are available. Given the time-varying and somewhat unpredictable nature of available bandwidth on Internet paths [2, 19], we believe this is an appropriate goal. From the standpoint of improving the reliability of path selection in the face of performance failures, avoiding bad paths is more important than optimizing to eliminate small throughput differences between paths. While a characterization of the utility received by programs at different available bandwidths may help determine a good path selection threshold, we believe that more than a 50% bandwidth reduction is likely to reduce the utility of many programs. This threshold also falls outside the typical variation observed on a given path over time-scales of tens of minutes [2]. We therefore concentrate on avoiding throughput faults of this order of magnitude.

Throughput-intensive applications typically use TCP or TCP-like congestion control, so the throughput optimizer focuses on this type of traffic. The performance of a bulk TCP transfer is a function of the round-trip latency and the packet loss rate it observes. Throughput optimization combines the latency and loss metrics using a simplified version of the TCP throughput equation [16], which provides an upper bound on TCP throughput. Our granularity of loss rate detection is 1%, and the throughput equation is more sensitive at lower loss rates. We set a minimum packet loss rate of 2% to prevent an infinite bandwidth estimate and to prevent large route oscillations from single packet losses. The formula used is:

    score = √1.5 / (rtt · √p)    (2)

where p is the one-way end-to-end packet loss probability and rtt is the end-to-end round-trip time estimated from the hop-by-hop samples as described above.

The non-linear combination of loss and round-trip time makes this scoring difficult to optimize with standard shortest-path algorithms; for instance, the TCP throughput between nodes A and B via a node C is not usually the smaller of the TCP throughputs between A and C and between C and B. While more complicated search algorithms can be employed to handle this, RON routing currently takes advantage of the resulting simplicity when only single-intermediate paths are considered to obtain throughput-optimized paths.

Our probe samples give us the two-way packet loss probability, from which we estimate the one-way loss probability by (unrealistically, for some paths) assuming symmetric loss rates and solving the resulting quadratic equation. Assuming that losses are independent in the two directions, the maximum absolute error in the one-way loss rate estimate occurs when all of the loss is in only one direction, resulting in an error of roughly half the measured two-way loss rate. Assuming a minimum loss rate of 2% ensures that, when choosing between two high-quality links, our loss rate estimate is within 50% of the true value. At large loss rates, highly asymmetric loss patterns cause this method to disregard a potentially good path because the reverse direction of that path has a high loss rate. However, the benefits of avoiding the high loss rate path seem to outweigh the cost of missing one good link, but we intend to remedy this problem soon by explicitly estimating one-way loss rates.

This method of throughput comparison using Equation 2 is not without faults. For instance, a slow but relatively unused bottleneck link on a path would almost never observe packet losses, and none of our probes would be lost. The equation we use would predict an unreasonably high bandwidth estimate. One way of tackling this would be to send pairs of probe packets to estimate an upper bound on link bandwidth, while another might be to use non-invasive probes to predict throughput [8].

Oscillating rapidly between multiple paths is harmful to applications sensitive to packet reordering or delay jitter. While each of the evaluation metrics applies some smoothing, this is not enough to avoid "flapping" between two nearly equal routes: RON routers therefore employ hysteresis. Based on an analysis of 5000 snapshots from a RON node's link-state table, we chose to apply a simple 5% hysteresis bonus to the "last good" route for the three metrics. This simple method appears to provide a reasonable trade-off between responsiveness to changes in the underlying paths and unnecessary route flapping.

4.2.3 Performance Database

Figure 4: The RON performance database.

To make good routing decisions, RON needs to have detailed performance information. It is impractical to send large performance histories to all participants in the RON. A reliable performance repository must handle participants that crash, reboot, or rejoin the RON. Measurement data is often noisy, and different clients may have different ways in which they will use that data; for instance, an outage detector may want to know if any packets were successfully sent in the last 15 seconds, but a throughput improver may be interested in a longer-term packet loss average. Therefore, the system needs a flexible summarization mechanism.

To support these requirements, each RON node or local group of nodes uses a separate performance database to store samples (Figure 4). The database is a generalization of SPAND [24], supporting different data types and summarization mechanisms. Sources of information and clients that use it agree on a class name for the data and the meaning of the values inserted into the database. The probes insert data into the database by calling pdb_update(class, src, dst, type, value). The src and dst fields contain the IP addresses of the sampled data.

4.3 Policy Routing

RON allows users or administrators to define the types of traffic allowed on particular network links. In traditional Internet policy routing, "type" is typically defined only by the packet's source and destination addresses [4]; RON generalizes this notion to include other information about the packet. RON separates policy routing into two components: classification and routing table formation.

When a packet enters the RON, it is classified and given a policy tag; this tag is used to perform lookups in the appropriate set of routing tables at each RON router. A separate set of routing tables is constructed for each policy by re-running the routing computation, removing the links disallowed by the corresponding policy. The routing computation computes the shortest path from the RON node to all other nodes, for any given metric. This construction applies to multi-hop or single-hop indirection; because a standard shortest-paths algorithm may not work for all metrics (specifically, TCP throughput), our implementation, described in Section 5.2, is specific to single-hop indirection. The result of running the table construction is the multi-level routing tables shown in Figure 5.

The construction of these routing tables and the production of packet tags is facilitated by the policy classifier component. The policy classifier produces the permits function that determines if a given policy is allowed to use a particular virtual link. It also provides a conduit-specific data classifier module that helps the RON node decide which policy is appropriate for the incoming packet.
module would determine which policy to use and set the corresponding tag on the packet. Subsequent RON nodes examine the policy tag instead of reclassifying the packet.

We have designed two policy mechanisms: exclusive cliques and general policies. In an exclusive clique, only data originating from and destined to other members of the clique may traverse inter-clique links. This mimics, for instance, the "educational only" policy on the Internet2 backbone. This policy classifier takes only a list of networks and subnet masks to match against.

Our general policy component is more powerful; it accepts a BPF-like packet matcher [14], and a list of links that are denied by this policy. It returns the first policy that matches a packet's fields to the stored BPF-based information. We do not currently address policy composition, although users may create composed versions of their own policies using the general policy component. In Section 7, we discuss possible abuse of AUPs and network transit policies.

4.4 Data Forwarding

The forwarder at each RON node examines every incoming packet to determine if it is destined for a local client or a remote destination. If it requires further delivery, the forwarder passes the RON packet header to the routing table, as shown in Figure 5.

Figure 5: The forwarding control and data paths. The forwarder passes the packet header to the routing table, which first checks if a valid flow cache entry exists. If not, it performs three levels of lookups. The first level is based on the policy type, the second is based on the routing preference, and the third is a hash lookup of next hops indexed by destination.

The RON packet header is shown in Figure 6. It is inspired by the design of IPv6 [17]. Each packet carries a policy tag that is interpreted by the forwarders to decide which network routing policies apply to the packet. If the packet is destined for the local node, the forwarder uses the packet type field to demultiplex the packet to the RON client.

Figure 6: The RON packet header (version, hop limit, routing flags, RON source and destination addresses, source and destination ports, flow ID, policy tag, and packet type). The routing flags and flow ID are set by the input conduit. The packet type is a demultiplexing key indicating the appropriate conduit or protocol at the receiver.

If the packet's flow ID has a valid flow cache entry, the forwarder short-cuts the routing process with this entry. Otherwise, the routing table lookup occurs in three stages. The first stage examines the policy tag and locates the proper routing preference table; there is one routing preference table for each known policy tag. Policy adherence supersedes all other routing preferences. Next, the lookup procedure examines the routing preference flags to find a compatible route selection metric for the packet. The flags are examined from the right, and the first flag understood by the router directs the packet to the next table. All RON routers understand the basic system metrics of latency, loss, and throughput; this provides a way for a shared RON environment to support users who may also have client-defined metrics, and still provide them with good default metrics. A lookup in the routing preference table leads to a hash table of next hops indexed by the destination RON node. The entry in the next-hop table is returned to the forwarder, which places the packet on the network destined for the next hop.

4.5 Bootstrap and Membership Management

In addition to allowing clients to define their own membership mechanisms, RON provides two system membership managers: a simple static membership mechanism that loads its peers from a file, and a dynamic announcement-based, soft-state membership protocol. To bootstrap the dynamic membership protocol, a new node needs to know the identity of at least one peer in the RON. The new node uses this neighbor to broadcast its existence using a flooder, which is a special RON client that implements a general-purpose, resilient flooding mechanism using a RON forwarder.

The main challenge in the dynamic membership protocol is to avoid confusing a path outage to a node with its having left the RON. Each node builds up and periodically (every five minutes on average in our implementation) floods to all other nodes its list of peer RON nodes. If a node has not heard about a peer in sixty minutes, it assumes that the peer is no longer participating in the RON. Observe that this mechanism allows two nodes in the same RON to have a non-functioning direct Internet path for long periods of time: as long as there is some path in the RON between the two nodes, neither will think that the other has left.

The overhead of broadcasting is minuscule compared to the traffic caused by active probing and routing updates, especially given the limited size of a RON. The resulting robustness is high: each node receives up to N−1 copies of the peer list sent from a given other node. This redundancy causes a node to be deleted from another node's view of the RON only if the former node is genuinely partitioned for over an hour from every other node in the RON. A RON node will re-bootstrap from a well-known peer after a long partition.

5. Implementation

The RON system is implemented at user-level as a flexible set of C++ libraries to which user-level RON clients link. Each client can pick and choose the components that best suit its needs. To provide a specific example of a client using the RON library, we describe the implementation of the resilient IP forwarder. This for-
Because RON needs to support more than sim- warder improves IP packet delivery without any modification to ply IP-in-IP encapsulation, RON uses its own header. RON does the transport protocols and applications running at end-nodes. The not fragment packets, but does inform applications if they exceed architecture of the resilient IP forwarder is shown in Figure 7. the Maximum Transmission Unit (MTU). As in IPv6, applications RON uses UDP to forward data, since TCP’s reliable byte-stream must perform end-to-end path MTU discovery. RON also provides is mismatched to many RON clients, and IP-based encapsulation
would restrict its application-specific forwarding capabilities. The RON core services run without special kernel support or elevated privileges.

The conduit at the entry node is the client’s gateway to the RON, and it classifies every packet. This classification is client-specific; it labels the packet with information that determines which routing metric is later used to route the packet. As an example, a RON IP forwarder conduit might label DNS and HTTP traffic as “latency-sensitive” and FTP traffic as “throughput-intensive,” which would cause the downstream RON forwarders to use the appropriate routing metric for each packet type. Non-entry RON nodes route the packet based only on the attached label and the destination of the packet.

This design ensures that all client- and application-specific routing functions occur only at the entry and exit conduits, and that forwarding inside the RON is independent of client-specific logic. In addition to reducing the per-packet overhead at intermediate nodes, it also localizes the client-specific computation required for each packet. This means, for example, that a RON node implementing an IP forwarder may also participate in an overlay with a conferencing application, and help improve the reliability of data delivery for the conference. The conference node conduits implement the application-specific methods that label packets to tell the IP forwarder which routing metrics to use for conference packets.

5.1 The IP Forwarder

We implemented the resilient IP forwarder using FreeBSD’s divert sockets to automatically send IP traffic over the RON and emit it at the other end. The resilient IP forwarder provides classification, encapsulation, and decapsulation of IP packets through a special conduit called the ip conduit (top of Figure 7).

Figure 7: The resilient IP forwarder. Packets enter the system through FreeBSD’s divert sockets. They are handed to the RON forwarder, which looks up the next hop. When the packet arrives at its destination, the conduit writes the packet out on a local raw socket, where it re-enters IP processing.

5.2 Routers

RON routers implement the router virtual interface, which has only a single function call, lookup(pkt *mypkt). The RON library provides a trivial static router, and a dynamic router that routes based upon different metric optimizations. The dynamic router is extensible by linking it with a different set of metric descriptions. Metric descriptions provide an evaluation function that returns the “score” of a link, and a list of metrics that the routing table needs to generate and propagate. The implementation’s routing table creation is specific to single-hop indirection, which also eliminates the need for the flow cache. The following algorithm fills in the multi-level routing table at the bottom of Figure 7:

    MAKE-ROUTING-TABLE(POLICIES, METRICS, PEERS)
      foreach policy in POLICIES
        foreach metric in METRICS
          foreach dst in PEERS
            best_score := worst possible score; best_hop := none
            foreach hop in PEERS
              if policy.permits(me, hop) AND policy.permits(hop, dst)
                score := metric.eval(me, hop, dst)
                if score is better than best_score
                  best_score := score; best_hop := hop
            table[policy][metric][dst] := best_hop

We have implemented the clique classifier discussed in Section 4.3, and are implementing a general policy classifier. Both provide classifiers for the resilient IP forwarder.

5.3 Monitoring Virtual Links

Each RON node in an n-node RON monitors its n − 1 virtual links using randomized periodic probes. The active prober component maintains a copy of a peers table with a next probe time field per peer. When this field expires, the prober sends a small UDP probe packet to the remote peer. Each probe packet has a random 64-bit ID. The process used by the prober is shown in Figure 8. When a node receives an initial probe request from a peer, it sends response 1 to that peer, and resets its probe timer for that peer. When the originating node sees response 1, it sends response 2 back to the peer, so that both sides get reachability and RTT information from three packets.

Figure 8: The active prober’s probing mechanism. With three packets, both participants get an RTT sample without requiring synchronized clocks. In the example exchange, node A sends probe ID 5 at local time 10 and sees response 1 at local time 15, recording a successful RTT of 5; node B sends response 1 at local time 33 and sees response 2 at local time 39, recording a successful RTT of 6.

The probing protocol is implemented as a RON client, which communicates with the performance database (implemented as a standalone application running on the Berkeley DB3 backend) using a simple UDP-based protocol.
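The three-packet exchange of Figure 8 can be sketched in a few lines: each side times the round trip with its own clock, so no clock synchronization is needed. The types and names below (Probe, Peer, on_packet) are illustrative assumptions for this sketch, not RON's actual C++ API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

enum class Kind { Initial, Response1, Response2 };

struct Probe { uint64_t id; Kind kind; };

struct Peer {
    std::map<uint64_t, double> sent_at;  // probe ID -> local send time
    double last_rtt = -1;                // most recent RTT sample

    // Record the local send time for a packet we originate.
    Probe send(uint64_t id, Kind k, double now) {
        sent_at[id] = now;
        return Probe{id, k};
    }

    // Handle an arriving packet; returns true if a reply should be sent.
    // Initial -> reply with response 1 (and start timing it).
    // Response 1 -> record RTT from our own clock, reply with response 2.
    // Response 2 -> record RTT; the exchange is complete.
    bool on_packet(const Probe& p, double now, Probe* reply) {
        switch (p.kind) {
        case Kind::Initial:
            *reply = send(p.id, Kind::Response1, now);
            return true;
        case Kind::Response1:
            last_rtt = now - sent_at[p.id];
            *reply = Probe{p.id, Kind::Response2};
            return true;
        case Kind::Response2:
            last_rtt = now - sent_at[p.id];
            return false;
        }
        return false;
    }
};
```

Replaying Figure 8's timeline (A sends ID 5 at time 10, hears response 1 at 15; B sends response 1 at its local time 33, hears response 2 at 39) yields RTT samples of 5 and 6 on the two sides, each computed purely from local timestamps.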
6. Evaluation

The goal of the RON system is to overcome path outages and performance failures without introducing excessive overhead or new failure modes. In this section, we present an evaluation of how well RON meets these goals. We evaluate the performance of the resilient IP forwarder, which uses RON for outage detection and for loss, latency, and throughput optimization. Our evaluation has three main parts. First, we study RON’s ability to detect outages and recover quickly from them. Next, we investigate performance failures and RON’s ability to improve the loss rate, latency, and throughput of badly performing paths. Finally, we investigate two important aspects of RON’s routing: the effectiveness of its one-intermediate-hop strategy compared to more general alternatives, and the stability of RON-generated routes. Table 1 summarizes our key findings.

Table 1: Major results from measurements of the RON testbed.
  - RON was able to successfully detect and recover from 100% (in RON1) and 60% (in RON2) of all complete outages and all periods of sustained high loss rates of 30% or more. (Section 6.2)
  - RON takes 18 seconds, on average, to route around a failure, and can do so in the face of a flooding attack. (Section 6.2)
  - RON successfully routed around bad throughput failures, doubling TCP throughput in 5% of all samples. (Section 6.3)
  - In 5% of the samples, RON reduced the loss probability by 0.05 or more. (Section 6.3)
  - Single-hop route indirection captured the majority of the benefits in our RON deployment, for both outage recovery and latency optimization. (Section 6.4)

6.1 Methodology

Most of our results come from experiments with a wide-area RON deployed at several Internet sites. There are n × (n − 1) different paths between the hosts in an n-site RON deployment. Table 2 shows the locations of the sites for our experiments. We analyze two distinct datasets: RON1, with 12 nodes and 132 distinct paths, and RON2, with 16 nodes and 240 distinct paths. In RON1, traceroute data shows that 36 different AS’s were traversed, with at least 74 distinct inter-AS links; for RON2, 50 AS’s and 118 inter-AS links were traversed. This O(n²) scaling of path diversity suggests that even small numbers of confederating hosts expose a large number of Internet paths to examination [18, 19]. We do not claim that our experiments and results are typical or representative of anything other than our deployment, but present and analyze them to demonstrate the kinds of gains one might realize with RONs.

Table 2: The hosts in our sixteen-node RON deployment, which we study in detail to determine the effectiveness of RON in practice. Asterisks indicate U.S. universities on the Internet2 backbone. The two European universities, Lulea and VU-NL, are classified as non-Internet2.
  Aros       ISP in Salt Lake City, UT
  CCI        .com in Salt Lake City, UT
  Cisco-MA   .com in Waltham, MA
  * CMU      Pittsburgh, PA
  * Cornell  Ithaca, NY
  Lulea      Lulea University, Sweden
  MA-Cable   MediaOne Cable in Cambridge, MA
  * MIT      Cambridge, MA
  CA-T1      .com in Foster City, CA
  * NYU      New York, NY
  * Utah     Salt Lake City, UT
  VU-NL      Vrije Univ., Amsterdam, Netherlands
  Additional hosts used in dataset RON2:
  OR-DSL     DSL in Corvallis, OR
  NC-Cable   MediaOne Cable in Durham, NC
  PDI        .com in Palo Alto, CA
  Mazu       .com in Boston, MA

Most of the RON hosts were Intel Celeron/733-based machines running FreeBSD, with 256MB of RAM and 9GB of disk space. The processing capability of a host was never a bottleneck in any of our wide-area experiments. Our measurement data is available at http://nms.lcs.mit.edu/ron/.

The raw measurement data used in this paper consists of probe packets, throughput samples, and traceroute results. To probe, each RON node independently repeated the following steps:

1. Pick a random node j.
2. Pick a probe type from the configured set of probe types using round-robin selection. Send a probe to node j.
3. Delay for a random time interval between 1 and 2 seconds.

This characterizes all n × (n − 1) paths. For RON1, we analyze 10.9 million packet departure and arrival times, collected over 64 hours between 21 March 2001 and 23 March 2001. This produces 2.6 million individual RTT, one-way loss, and jitter data samples, from which we calculated time-averaged samples over a 30-minute duration.¹ We also took 8,855 throughput samples from 1 MByte bulk transfers (or 30 seconds, whichever came sooner), recording the time at which each power of two’s worth of data was sent and the duration of the transfer. The RON2 data was collected over 85 hours between 7 May 2001 and 11 May 2001, and consists of 13.8 million packet arrival and departure times. Space constraints allow us to present only the path outage analysis of RON2, but the performance failure results from RON2 are similar to those from RON1 [1].

Several of our host sites are Internet2-connected educational sites, and we found that this high-speed experimental network presents opportunities for path improvement not available in the public Internet. To demonstrate that our policy routing module works, and to make our measurements closer to what Internet hosts in general would observe, all the measurements reported herein were taken with a policy that prohibited sending traffic to or from commercial sites over the Internet2. Therefore, for instance, a packet could travel from Utah to Cornell to NYU, but not from Aros to Utah to NYU. This is consistent with the AUP of Internet2, which precludes commercial traffic. As a result, in our measurements, all path improvements involving a commercial site came from only commercial links, and never from the more reliable Internet2 links.

6.2 Overcoming Path Outages

Precisely measuring a path outage is harder than one might think. One possibility is to define an outage as the length of time during which no packets get through the path. As a yardstick, we could use the ranges of time in which TCP implementations will time out and shut down a connection; these vary from about 120 seconds

¹Some hosts came and went during the study, so the number of samples in the averages is not always as large as one might expect if all hosts were continually present. More details are available in the documentation accompanying the data.
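The simple outage definition above, an interval during which no packets get through a path, can be sketched as a gap scan over the timestamps of probes that succeeded. This is an illustrative reconstruction under that definition, not the paper's actual measurement code; the names (Outage, find_outages) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Outage { double start, end; };

// ok_times: send times (seconds) of probes that got through, sorted
// ascending. threshold: minimum silent gap counted as an outage.
// Returns each gap between consecutive successful probes that is at
// least `threshold` seconds long.
std::vector<Outage> find_outages(const std::vector<double>& ok_times,
                                 double threshold) {
    std::vector<Outage> out;
    for (std::size_t i = 1; i < ok_times.size(); ++i) {
        double gap = ok_times[i] - ok_times[i - 1];
        if (gap >= threshold)
            out.push_back({ok_times[i - 1], ok_times[i]});
    }
    return out;
}
```

With a 120-second threshold (the lower end of the TCP timeout yardstick mentioned above), a probe trace with successes at t = 0, 1, 2, 200, 201, 400 would yield two outages: one spanning (2, 200) and one spanning (201, 400).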