TCP and Congestion Control
Supplemental slides
02/21/07
Aditya Akella
Introduction to TCP
• Communication abstraction:
– Reliable
– Ordered
– Point-to-point
– Byte-stream
– Full duplex
– Flow and congestion controlled
• Protocol implemented entirely at the ends
– Fate sharing
• Sliding window with cumulative acks
– Ack field contains last in-order packet received
– Duplicate acks sent when out-of-order packet received
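A minimal sketch of the receiver side of cumulative acking, assuming packets are numbered 1, 2, 3, ... and the ack carries the last in-order packet received, as on the slide (real TCP acks byte sequence numbers). The function name `receiver_acks` is hypothetical.

```python
def receiver_acks(arrivals):
    """Return the cumulative ack emitted for each arriving packet number.

    The ack names the last in-order packet received; 0 means that
    nothing has arrived in order yet.
    """
    received = set()
    last_in_order = 0
    acks = []
    for pkt in arrivals:
        received.add(pkt)
        # Advance past every packet that is now contiguous.
        while last_in_order + 1 in received:
            last_in_order += 1
        acks.append(last_in_order)
    return acks


# Packet 3 is delayed: packets 4 and 5 each trigger a duplicate ack for 2.
print(receiver_acks([1, 2, 4, 5, 3]))  # [1, 2, 2, 2, 5]
```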
Evolution of TCP
• 1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
• 1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
• 1982: TCP & IP, RFC 793 & 791
• 1983: BSD Unix 4.2 supports TCP/IP
• 1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
• 1986: Congestion collapse observed
• 1987: Karn’s algorithm to better estimate round-trip time
• 1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
• 1990: 4.3BSD Reno: fast retransmit, delayed ACKs
TCP Through the 1990s
• 1993: TCP Vegas (Brakmo et al): real congestion avoidance
• 1994: ECN (Floyd): Explicit Congestion Notification
• 1994: T/TCP (Braden): Transaction TCP
• 1996: SACK TCP (Floyd et al): Selective Acknowledgement
• 1996: Hoe: improving TCP startup
• 1996: FACK TCP (Mathis et al): extension to SACK
Timeout-based Recovery
• Wait at least one RTT before retransmitting
• Importance of accurate RTT estimators:
– RTT estimate too low → unneeded retransmissions
– RTT estimate too high → poor throughput
• RTT estimator must adapt to change in RTT
– But not too fast, or too slow!
• Spurious timeouts
– Violate the “conservation of packets” principle: more than a window worth of packets ends up in flight
Initial Round-trip Estimator
• Round trip times exponentially averaged:
– New RTT = α (old RTT) + (1 - α) (new sample)
– Recommended value for α: 0.8 - 0.9
• 0.875 for most TCP’s
• Retransmit timer set to β RTT, where β = 2
– Every time the timer expires, the RTO is exponentially backed off
– Like Ethernet
• Not good at preventing spurious timeouts
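The estimator above can be sketched as follows. The values α = 0.875 and β = 2 come from the slide; doubling the timer on each expiry is an assumption about what "exponentially backed off" means here, and the function names are hypothetical.

```python
ALPHA = 0.875  # smoothing weight from the slide
BETA = 2.0     # RTO multiplier from the slide


def update_rtt(old_rtt, sample, alpha=ALPHA):
    """Exponentially weighted average of round-trip time samples."""
    return alpha * old_rtt + (1 - alpha) * sample


def retransmit_timer(rtt, expiries=0, beta=BETA):
    """Timer value after `expiries` consecutive expirations.

    Assumption: each expiry doubles the timer (exponential backoff).
    """
    return beta * rtt * (2 ** expiries)


rtt = update_rtt(100.0, 200.0)    # one slow sample moves the estimate a little
print(rtt)                        # 112.5
print(retransmit_timer(rtt))      # 225.0
print(retransmit_timer(rtt, 2))   # 900.0 after two expirations
```

Because the timer is a fixed multiple of the mean, it ignores how much the samples vary, which is why this estimator handles high-variance paths poorly.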
Jacobson’s Retransmission
Timeout
• Key observation:
– At high loads round trip variance is high
• Solution:
– Base RTO on RTT and standard deviation of RTT
– rttvar = χ * dev + (1 - χ) * rttvar
• dev = linear deviation
• Inappropriately named: rttvar is actually a smoothed linear deviation
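A sketch of a deviation-based timer in the spirit of this slide. The slide gives only the smoothing form for rttvar; the gain of 0.25, the RTT weight of 0.875, and the rule RTO = srtt + 4 * rttvar are assumptions taken from the standard TCP retransmission timer computation, not from the slides. The function name is hypothetical.

```python
def jacobson_update(srtt, rttvar, sample, alpha=0.875, gain=0.25):
    """Update smoothed RTT and smoothed deviation, then derive the RTO.

    Assumptions (not stated on the slide): gain = 0.25 and
    RTO = srtt + 4 * rttvar.
    """
    dev = abs(sample - srtt)                      # linear deviation of this sample
    rttvar = gain * dev + (1 - gain) * rttvar     # smoothed linear deviation
    srtt = alpha * srtt + (1 - alpha) * sample    # smoothed RTT
    rto = srtt + 4 * rttvar
    return srtt, rttvar, rto


# A sample far from the mean widens the timer much more than it moves the mean.
print(jacobson_update(100.0, 10.0, 140.0))  # (105.0, 17.5, 175.0)
```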
TCP Flavors
• Tahoe, Reno, Vegas → differ in data-driven reliability
• TCP Tahoe (distributed with 4.3BSD Unix)
– Original implementation of Van Jacobson’s
mechanisms (VJ paper)
– Includes:
• Slow start
• Congestion avoidance
• Fast retransmit
Fast Retransmit
• What are duplicate acks (dupacks)?
– Repeated acks for the same sequence
• When can duplicate acks occur?
– Loss
– Packet re-ordering
– Window update – advertisement of new flow control
window
• Assume re-ordering is infrequent and not of
large magnitude
– Use receipt of 3 or more duplicate acks as indication
of loss
– Don’t wait for timeout to retransmit packet
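The dupack rule can be sketched as a small sender-side counter. This assumes acks name the last in-order packet received, so the packet to resend is the one after the repeated ack number; the threshold of 3 is from the slide and the function name is hypothetical.

```python
DUPACK_THRESHOLD = 3  # from the slide


def fast_retransmits(acks, threshold=DUPACK_THRESHOLD):
    """Return the packets a sender would fast-retransmit for a stream of acks."""
    retransmitted = []
    last_ack = None
    dupacks = 0
    for ack in acks:
        if ack == last_ack:
            dupacks += 1
            # Fire once, exactly when the threshold is reached.
            if dupacks == threshold:
                retransmitted.append(ack + 1)
        else:
            last_ack = ack
            dupacks = 0
    return retransmitted


print(fast_retransmits([1, 2, 2, 2, 2, 5]))  # [3]: three dupacks for 2
print(fast_retransmits([1, 2, 2, 2, 5]))     # []: only two dupacks, treated as reordering
```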
Fast Retransmit
[Figure: sequence number vs. time; one loss (X), duplicate acks, retransmission]
Multiple Losses
[Figure: sequence number vs. time; four losses (X), duplicate acks, retransmission. Now what?]
Tahoe
[Figure: sequence number vs. time; four losses (X)]
TCP Reno (1990)
• All mechanisms in Tahoe
• Addition of fast-recovery
– Opening up congestion window after fast retransmit
• Delayed acks
• Header prediction
– Implementation designed to improve performance
– Has common case code inlined
• With multiple losses, Reno typically times out because it does not receive enough duplicate acknowledgements
Reno
[Figure: sequence number vs. time; four losses (X). Now what? → timeout]
NewReno
• The ack that arrives after retransmission
(partial ack) should indicate that a second
loss occurred
• When does NewReno time out?
– When there are fewer than three dupacks for
first loss
– When partial ack is lost
• How fast does it recover losses?
– One per RTT
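A minimal sketch of the partial-ack decision during recovery. It assumes acks name the last in-order packet received and that the sender remembers `recover`, the highest packet sent when the loss was detected; both names are hypothetical.

```python
def on_ack_in_recovery(ack, recover):
    """Decide what a NewReno-style sender does with an ack during recovery.

    A partial ack (ack < recover) means another packet from the same
    window was lost, so the next missing packet is resent immediately.
    """
    if ack < recover:
        return ("retransmit", ack + 1)
    return ("exit_recovery", None)


# Window of packets 1..10 with 3 and 6 lost: after 3 is resent, the ack
# advances only to 5, which exposes the second loss.
print(on_ack_in_recovery(5, 10))   # ('retransmit', 6)
print(on_ack_in_recovery(10, 10))  # ('exit_recovery', None)
```

Each partial ack arrives about one RTT after the previous retransmission, which is why recovery proceeds at one loss per RTT.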
NewReno
[Figure: sequence number vs. time; four losses (X). Now what? → partial ack recovery]
SACK
• Basic problem is that cumulative acks
provide little information
– Ack for just the packet received
• What if acks are lost? → carry cumulative also
• Not used
– Bitmask of packets received
• Selective acknowledgement (SACK)
• How to deal with reordering
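To illustrate the extra information a bitmask gives the sender, here is a sketch that combines a cumulative ack with a received-packet bitmask to list the holes. The bitmask layout is an assumption for illustration only (deployed SACK reports ranges of received data rather than a bitmask), and the function name is hypothetical.

```python
def sack_holes(cum_ack, received_mask):
    """List packets known to be missing.

    received_mask[i] is True if packet cum_ack + 1 + i has arrived.
    Only gaps below the highest received packet count as holes.
    """
    if True not in received_mask:
        return []
    # Index of the highest packet the receiver reports as received.
    last = len(received_mask) - 1 - received_mask[::-1].index(True)
    return [cum_ack + 1 + i for i in range(last) if not received_mask[i]]


# Cumulative ack 2; packets 4, 5 and 7 have arrived, so 3 and 6 are missing.
print(sack_holes(2, [False, True, True, False, True]))  # [3, 6]
```

With cumulative acks alone the sender would learn only that packet 3 is missing.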
Congestion Collapse
• Definition: Increase in network load results in
decrease of useful work done
• Many possible causes
– Spurious retransmissions of packets still in flight
• Classical congestion collapse
• How can this happen with packet conservation?
• Solution: better timers and TCP congestion control
– Undelivered packets
• Packets consume resources and are dropped elsewhere in
network
• Solution: congestion control for ALL traffic
Other Congestion Collapse
Causes
• Fragments
– Mismatch of transmission and retransmission units
– Solutions
• Make network drop all fragments of a packet (early packet
discard in ATM)
• Do path MTU discovery
• Control traffic
– Large percentage of traffic is for control
• Headers, routing messages, DNS, etc.
• Stale or unwanted packets
– Packets that are delayed on long queues
– “Push” data that is never used
Where to Prevent Collapse?
• Can end hosts prevent problem?
– Yes, but must trust end hosts to do right thing
– E.g., sending host must adjust amount of data
it puts in the network based on detected
congestion
• Can routers prevent collapse?
– No, not all forms of collapse
– Doesn’t mean they can’t help
– Sending accurate congestion signals
– Isolating well-behaved from ill-behaved
sources
Congestion Control and
Avoidance
• A mechanism which:
– Uses network resources efficiently
– Preserves fair network resource allocation
– Prevents or avoids collapse
• Congestion collapse is not just a theory
– Has been frequently observed in many
networks
TCP Congestion Control
• Motivated by ARPANET congestion collapse
• Underlying design principle: packet conservation
– At equilibrium, inject packet into network only when
one is removed
– Basis for stability of physical systems
• Why was this not working?
– Connection doesn’t reach equilibrium
– Spurious retransmissions
– Resource limitations prevent equilibrium
TCP Congestion Control -
Solutions
• Reaching equilibrium
– Slow start
• Eliminates spurious retransmissions
– Accurate RTO estimation
– Fast retransmit
• Adapting to resource availability
– Congestion avoidance
TCP Congestion Control
• Changes to TCP motivated by
ARPANET congestion collapse
• Basic principles
–AIMD
–Packet conservation
–Reaching steady state quickly
–ACK clocking
AIMD
• Distributed, fair and efficient
• Packet loss is seen as sign of congestion and
results in a multiplicative rate decrease
– Factor of 2
• TCP periodically probes for available bandwidth
by increasing its rate
[Figure: rate vs. time]
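A toy sketch of why AIMD tends toward a fair split. It assumes two flows that see the same loss signal whenever their combined rate exceeds the link capacity; the synchronized-loss model, the unit increase per round, and the function name are all assumptions for illustration.

```python
def aimd_two_flows(x, y, capacity, rounds):
    """Run two AIMD flows sharing one link and return their final rates."""
    for _ in range(rounds):
        if x + y > capacity:
            # Multiplicative decrease: both halve, so their gap halves too.
            x, y = x / 2, y / 2
        else:
            # Additive increase: both add 1, so their gap is unchanged.
            x, y = x + 1, y + 1
    return x, y


x, y = aimd_two_flows(1.0, 80.0, capacity=100.0, rounds=200)
print(abs(x - y))  # far smaller than the initial gap of 79
```

The gap between the flows shrinks only on decreases, so repeated congestion episodes pull the rates together.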
Implementation Issue
• Operating system timers are very coarse – how to pace
packets out smoothly?
• Implemented using a congestion window that limits how
much data can be in the network.
– TCP also keeps track of how much data is in transit
• Data can only be sent when the amount of outstanding
data is less than the congestion window.
– The amount of outstanding data is increased on a “send” and
decreased on “ack”
– (last sent – last acked) < congestion window
• Window limited by both congestion and buffering
– Sender’s maximum window = Min (advertised window, cwnd)
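The send test on this slide can be written directly. A sketch in packet units, with a hypothetical function name:

```python
def can_send(last_sent, last_acked, cwnd, advertised_window):
    """True if one more packet may be put into the network."""
    window = min(advertised_window, cwnd)   # limited by congestion and buffering
    outstanding = last_sent - last_acked    # data currently in transit
    return outstanding < window


print(can_send(last_sent=10, last_acked=6, cwnd=8, advertised_window=5))  # True: 4 < 5
print(can_send(last_sent=10, last_acked=5, cwnd=8, advertised_window=5))  # False: 5 is not < 5
```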
Congestion Avoidance
• If loss occurs when cwnd = W
– Network can handle 0.5W ~ W segments
– Set cwnd to 0.5W (multiplicative decrease)
• Upon receiving ACK
– Increase cwnd by (1 packet)/cwnd
• What is 1 packet? → 1 MSS worth of bytes
• After cwnd packets have passed by → increase of approximately 1 MSS
• Implements AIMD
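The two update rules can be sketched in packet units (1 packet = 1 MSS), with cwnd kept as a float. The function names are hypothetical.

```python
def on_ack(cwnd):
    """Additive increase: each ack grows the window by 1/cwnd packets."""
    return cwnd + 1.0 / cwnd


def on_loss(cwnd):
    """Multiplicative decrease: halve the window."""
    return cwnd / 2.0


cwnd = 10.0
for _ in range(10):      # one window's worth of acks, roughly one RTT
    cwnd = on_ack(cwnd)
print(cwnd)              # just under 11: about one packet gained per RTT
print(on_loss(20.0))     # 10.0
```

The gain per window is slightly less than one packet because cwnd grows while the acks arrive.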
Congestion Avoidance Sequence Plot
[Figure: sequence number vs. time, showing packets and acks]
Congestion Avoidance Behavior
[Figure: congestion window vs. time; packet loss + timeout cuts the congestion window and rate, followed by grabbing back bandwidth]
Packet Conservation
• At equilibrium, inject packet into network only
when one is removed
– Sliding window and not rate controlled
– But still need to avoid sending a burst of packets → would overflow links
• Need to carefully pace out packets
• Helps provide stability
• Need to eliminate spurious retransmissions
– Accurate RTO estimation
– Better loss recovery techniques (e.g. fast retransmit)
TCP Packet Pacing
• Congestion window helps to “pace” the
transmission of data packets
• In steady state, a packet is sent when an ack is
received
– Data transmission remains smooth, once it is smooth
– Self-clocking behavior
[Figure: self-clocking between sender and receiver; packet spacings labeled Pb and Pr, ack spacings labeled Ar, Ab and As]
Reaching Steady State
• Doing AIMD is fine in steady state but
slow…
• How does TCP know what is a good initial
rate to start with?
– Should work both for CDPD links (10s of Kbps or less) and for supercomputer links (10 Gbps and growing)
• Quick initial phase to help get up to speed
(slow start)
Slow Start Packet Pacing
• How do we get this clocking behavior to start?
– Initialize cwnd = 1
– Upon receipt of every ack, cwnd = cwnd + 1
• Implications
– Window actually increases to W in RTT * log2(W)
– Can overshoot window and cause packet loss
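A small sketch of why the window reaches W in about log2(W) round trips: every packet in flight is acked once per RTT and each ack adds 1, so the window doubles each RTT. This assumes no losses and no delayed acks; the function name is hypothetical.

```python
def slow_start_rtts(target_window):
    """Count the round trips slow start needs to reach target_window packets."""
    cwnd = 1
    rtts = 0
    while cwnd < target_window:
        cwnd += cwnd   # cwnd acks arrive this RTT, each adding 1 packet
        rtts += 1
    return rtts


print(slow_start_rtts(64))   # 6, which is log2(64)
print(slow_start_rtts(100))  # 7: the last doubling reaches 128 and overshoots 100
```

The second call shows the overshoot noted above: the final doubling can jump well past the window the path can carry.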
TCP Saw Tooth
[Figure: congestion window vs. time; initial slow start, fast retransmit and recovery, slow start to pace packets; timeouts may still occur]
TCP Modeling
• Given the congestion behavior of TCP can we
predict what type of performance we should get?
• What are the important factors
– Loss rate
• Affects how often window is reduced
– RTT
• Affects increase rate and relates BW to window
– RTO
• Affects performance during loss recovery
– MSS
• Affects increase rate
Overall TCP Behavior
[Figure: window vs. time]
• Let’s concentrate on steady state behavior
with no timeouts and perfect loss recovery
Simple TCP Model
• Some additional assumptions
– Fixed RTT
– No delayed ACKs
• In steady state, TCP loses a packet each time the window reaches W packets
– Window drops to W/2 packets
– Each RTT the window increases by 1 packet → W/2 * RTT before next loss
– BW = MSS * avg window / RTT = MSS * (W + W/2) / (2 * RTT) = .75 * MSS * W / RTT
Simple Loss Model
• What was the loss rate?
– Packets transferred = (.75 W/RTT) * (W/2 * RTT) = 3W^2/8
– 1 packet lost → loss rate = p = 8/(3W^2)
– W = sqrt(8 / (3p))
• BW = .75 * MSS * W / RTT
– BW = MSS / (RTT * sqrt(2p/3))
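A quick numeric check of the model, a sketch that evaluates both forms of the bandwidth expression and confirms they agree. The example values (1460-byte MSS, 100 ms RTT, 1% loss) and the function names are assumptions chosen only for illustration.

```python
import math


def model_window(p):
    """Peak window W in packets for loss rate p: W = sqrt(8 / (3p))."""
    return math.sqrt(8.0 / (3.0 * p))


def model_bandwidth(mss_bytes, rtt_s, p):
    """Steady-state throughput in bytes/s: MSS / (RTT * sqrt(2p/3))."""
    return mss_bytes / (rtt_s * math.sqrt(2.0 * p / 3.0))


mss, rtt, p = 1460, 0.1, 0.01
w = model_window(p)
via_window = 0.75 * mss * w / rtt          # BW = .75 * MSS * W / RTT
direct = model_bandwidth(mss, rtt, p)
print(w, via_window, direct)               # the two bandwidth values agree
```

Under this model, throughput falls with the square root of the loss rate and inversely with RTT.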
TCP Vegas Slow Start
• ssthresh estimation via packet pair
• Only increase every other RTT
– Tests new window size before increasing
Packet Pair
• What would happen if a source transmitted a pair of packets back-to-back?
• Spacing of these packets would be
determined by bottleneck link
– Basis for ack clocking in TCP
• What type of bottleneck router behavior would affect this spacing?
– Queuing and scheduling
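The packet-pair idea reduces to one division: if the bottleneck spreads two back-to-back packets apart, the gap at the receiver is the time the bottleneck took to transmit one packet. A sketch assuming FIFO queuing and no cross traffic between the pair; the function name is hypothetical.

```python
def bottleneck_bandwidth(packet_size_bytes, arrival_gap_s):
    """Estimate bottleneck bandwidth in bits/s from packet-pair spacing."""
    return packet_size_bytes * 8 / arrival_gap_s


# Two 1500-byte packets arriving 1.2 ms apart suggest a link of about 10 Mbps.
print(bottleneck_bandwidth(1500, 0.0012))
```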
Packet Pair in Practice
• Most Internet routers are FIFO/Drop-Tail
• Easy to measure link bandwidths
– Bprobe, pathchar, pchar, nettimer, etc.
• How can this be used?
– NewReno and Vegas use it to initialize
ssthresh
– Prevents large overshoot of available
bandwidth
– Want a high estimate – otherwise will take a
long time in linear growth to reach desired
bandwidth
TCP Vegas Congestion
Avoidance
• Only reduce cwnd if packet sent after last
such action
– Reaction per congestion episode not per loss
• Congestion avoidance vs. control
• Use change in observed end-to-end delay to
detect onset of congestion
– Compare expected to actual throughput
– Expected = window size / round trip time
– Actual = acks / round trip time
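A sketch of the expected-versus-actual comparison. Several details are assumptions not stated on the slide: expected throughput uses the smallest RTT seen (`base_rtt`), actual throughput assumes a full window is acked per measured RTT, and the thresholds `low` and `high` (in packets) are hypothetical tuning values.

```python
def vegas_adjust(cwnd, base_rtt, rtt, low=1.0, high=3.0):
    """Return the next cwnd (in packets) from one round-trip measurement."""
    expected = cwnd / base_rtt    # throughput if nothing is queued
    actual = cwnd / rtt           # throughput actually achieved
    # Convert the throughput gap into an estimate of packets queued in the network.
    backlog = (expected - actual) * base_rtt
    if backlog < low:
        return cwnd + 1           # little queuing: probe for more bandwidth
    if backlog > high:
        return cwnd - 1           # delay is growing: back off before loss occurs
    return cwnd


print(vegas_adjust(20, base_rtt=0.1, rtt=0.1))    # 21
print(vegas_adjust(20, base_rtt=0.1, rtt=0.11))   # 20
print(vegas_adjust(20, base_rtt=0.1, rtt=0.125))  # 19
```

The window reacts to rising delay rather than to loss, which is the sense in which this is congestion avoidance rather than control.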
TCP Vegas
• Fine grain timers
– Check RTO every time a dupack is received or for
“partial ack”
– If RTO expired, then re-xmit packet
– Standard Reno only checks at 500ms
• Allows packets to be retransmitted earlier
– Not the real source of performance gain
• Allows retransmission of packet that would have
timed-out
– Small windows/loss of most of window
– Real source of performance gain
– Shouldn’t the comparison be against NewReno/SACK?
TCP Vegas
• Flaws
– Sensitivity to delay variation
– Paper did not do a great job of explaining where performance gains came from
• Some ideas have been incorporated into
more recent implementations
• Overall
– Some very intriguing ideas
– Controversies killed it
