© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Network Performance:
Making Every Packet Count
Mike Furr, Principal Engineer, EC2
November 29, 2017
NET401
What to expect from this session
Tuning TCP on Linux
TCP Performance Application
TCP
• Transmission Control Protocol
• Underlies SSH, HTTP, *SQL, SMTP
• Stream delivery, flow control
[Diagram: a TCP connection between two hosts, Jack and Jill]
Limiting in-flight data
[Diagram: Jack and Jill each have a receive window and a congestion window]
Bandwidth delay product
[Diagram: the path between Jack and Jill at a 2 ms round-trip time, then at a 100 ms round-trip time]
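The bandwidth delay product is the amount of data that has to be in flight to keep a path full: bandwidth multiplied by round-trip time. A minimal sketch for the two round-trip times on the slides, assuming a hypothetical 10 Gbps link (the slides do not state a bandwidth here):

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep the path fully utilized."""
    return bandwidth_bps * rtt_seconds / 8


# Assumption: 10 Gbps link, chosen only for illustration.
LINK_BPS = 10e9

for rtt_ms in (2, 100):  # round-trip times taken from the slides
    bdp = bandwidth_delay_product_bytes(LINK_BPS, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> BDP {bdp / 1e6:.1f} MB")
```

At the same bandwidth, the 100 ms path needs 50 times more data in flight than the 2 ms path, which is why the window limits matter more as RTT grows.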
Receive window
Receiver controlled, signaled to sender
Congestion window
• Sender controlled
• Window is managed by the congestion control algorithm
• Inputs—vary by algorithm

Initial congestion window
$ ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
3 segments × 1448 bytes = 4344 bytes
Initial congestion window
# ip route change 10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
$ ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
169.254.169.254 dev eth0 scope link
16 segments × 1448 bytes = 23168 bytes
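The byte counts on these two slides are the initial congestion window in segments multiplied by the 1448-byte MSS. A small sketch of that arithmetic (the function name is illustrative only):

```python
MSS_BYTES = 1448  # maximum segment size shown on the slides


def initial_window_bytes(initcwnd_segments: int, mss_bytes: int = MSS_BYTES) -> int:
    """Bytes a sender may transmit before the first ACK arrives."""
    return initcwnd_segments * mss_bytes


for segments in (3, 16):
    print(f"initcwnd {segments:>2} -> {initial_window_bytes(segments)} bytes")
```

Any response smaller than this value can be sent in a single flight, without waiting a round trip for the window to grow.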
Impact of loss on TCP throughput
[Figure: throughput (axis 0 to 100) versus loss rate (0% to 10%)]
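To get a feel for why throughput falls so quickly with loss, one widely cited approximation is the Mathis model for Reno-style congestion control: throughput ≈ (MSS / RTT) × (C / √p). This model is not from the slides and does not describe CUBIC or BBR precisely; the sketch below is only an illustration, and the RTT and loss values are borrowed from the test setups later in the deck.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float,
                          c: float = math.sqrt(3 / 2)) -> float:
    """Approximate steady-state throughput of a Reno-style flow (Mathis model)."""
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))


# Assumption: MSS 1448 bytes and 80 ms RTT, with a few sample loss rates.
for loss in (0.001, 0.005, 0.02):
    mbps = mathis_throughput_bps(1448, 0.080, loss) / 1e6
    print(f"loss {loss:.1%} -> about {mbps:.2f} Mbps per flow")
```

In this model throughput scales with 1/√p, so quadrupling the loss rate halves the achievable rate of a single flow.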
Loss is visible as TCP retransmissions
$ netstat -s | grep retransmit
58496 segments retransmitted
52788 fast retransmits
135 forward retransmits
3659 retransmits in slow start
392 SACK retransmits failed
Socket level diagnostic
$ ss -ite
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 3829960 10.16.16.18:https 10.16.16.75:52008
timer:(on,012ms,0) uid:498 ino:7116021 sk:0001c286 <->
ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40
mss:1448 cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138
retrans:0/11737 rcv_space:26847
Fields called out in this output:
• TCP state: ESTAB
• Bytes queued for transmission: Send-Q 3829960
• Congestion control algorithm: cubic
• Retransmission timeout: rto:204
• Congestion window: cwnd:138
• Retransmissions: retrans:0/11737
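The fields in the `ss -ite` output relate to each other: the reported send rate is approximately the congestion window times the MSS, delivered once per smoothed round-trip time. A sketch that parses the sample line from the slides and reproduces that figure (the helper names are illustrative, and the parser only handles the `key:value` tokens seen in this sample):

```python
SS_SAMPLE = (
    "ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40 "
    "mss:1448 cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138 "
    "retrans:0/11737 rcv_space:26847"
)


def parse_ss_info(line: str) -> dict:
    """Extract a few key:value fields from one line of `ss -i` output."""
    kv = dict(tok.split(":", 1) for tok in line.split() if ":" in tok)
    srtt_ms, rttvar_ms = (float(x) for x in kv["rtt"].split("/"))
    return {
        "rto_ms": float(kv["rto"]),
        "srtt_ms": srtt_ms,
        "rttvar_ms": rttvar_ms,
        "mss": int(kv["mss"]),
        "cwnd": int(kv["cwnd"]),
        "total_retrans": int(kv["retrans"].split("/")[1]),
    }


def estimated_send_rate_mbps(info: dict) -> float:
    """One congestion window of data per smoothed RTT, in Mbps."""
    bits_per_rtt = info["cwnd"] * info["mss"] * 8
    return bits_per_rtt / (info["srtt_ms"] / 1000) / 1e6


info = parse_ss_info(SS_SAMPLE)
print(f"{estimated_send_rate_mbps(info):.1f} Mbps")  # prints 1123.4 Mbps
```

The result matches the `send 1123.4Mbps` value in the sample, which shows how directly `cwnd` and `rtt` bound the rate of a single connection.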
Monitoring retransmissions in real time
Observable using Linux kernel tracing
# tcpretrans
TIME PID LADDR:LPORT -- RADDR:RPORT STATE
03:31:07 106588 10.16.16.18:443 R> 10.16.16.75:52291 ESTABLISHED
https://github.com/brendangregg/perf-tools/
Congestion control algorithm
Congestion control algorithms in Linux
• New Reno: Pre-2.6.8
• BIC: 2.6.8–2.6.18
• CUBIC: 2.6.19+
• Pluggable architecture
• Other algorithms often available
• BBR, Vegas, Illinois, Westwood, Highspeed, Scalable
Tuning congestion control algorithm
$ sysctl net.ipv4.tcp_available_congestion_control
net.ipv4.tcp_available_congestion_control = cubic reno
$ find /lib/modules -name 'tcp_*'
[…]
# modprobe tcp_illinois
$ sysctl net.ipv4.tcp_available_congestion_control
net.ipv4.tcp_available_congestion_control = cubic reno illinois
# sysctl net.ipv4.tcp_congestion_control=illinois
net.ipv4.tcp_congestion_control = illinois
# echo "net.ipv4.tcp_congestion_control = illinois" > /etc/sysctl.d/01-tcp.conf
[Restart network processes]
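The sysctl above changes the system-wide default. On Linux an application can also pick an algorithm for a single socket with the `TCP_CONGESTION` socket option, which is not covered in the slides. A minimal sketch, assuming Linux and Python 3.6 or later; the helper names are illustrative, and unprivileged processes are limited to the algorithms listed in `net.ipv4.tcp_allowed_congestion_control`:

```python
import socket


def set_congestion_control(sock: socket.socket, algorithm: str) -> bool:
    """Try to select a congestion control algorithm for one socket."""
    opt = getattr(socket, "TCP_CONGESTION", None)  # only defined on Linux
    if opt is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, algorithm.encode())
    except OSError:
        # Algorithm not loaded, not allowed, or option unsupported.
        return False
    return True


def get_congestion_control(sock: socket.socket):
    """Return the algorithm name in use on this socket, or None if unknown."""
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return None
    try:
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    return raw.split(b"\x00", 1)[0].decode()


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        changed = set_congestion_control(s, "reno")
        print("changed:", changed, "in use:", get_congestion_control(s))
```

This keeps the experiment scoped to one application, which can be useful when comparing algorithms side by side on the same host.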
TCP-BBR
• Available in Linux 4.9
• Uses pacing and active probing to estimate Bandwidth and RTT
• Starting in 4.13, fq no longer required
# modprobe sch_fq
# modprobe tcp_bbr
# sysctl net.core.default_qdisc=fq
# sysctl net.ipv4.tcp_congestion_control=bbr
Retransmission timer
• Input to when the congestion control algorithm considers a packet lost
• Too low: spurious retransmission; congestion control can over-react and be slow to re-open the congestion window
• Too high: increased latency while the algorithm determines a packet is lost and retransmits
Tuning retransmission timer minimum
• Default minimum: 200 ms
# ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
Route to other instances in our subnet (same AZ): 10.16.16.0/24
# ip route change 10.16.16.0/24 dev eth0 proto kernel scope link rto_min 50ms
# ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link rto_min lock 50ms
169.254.169.254 dev eth0 scope link
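On a low-RTT path the minimum dominates the retransmission timeout. The sketch below uses a simplified RFC 6298 style formula (smoothed RTT plus four times the RTT variance, clamped to a floor); Linux's actual calculation differs in detail, so treat the numbers as illustrative. The RTT values come from the earlier `ss` sample (`rtt:1.423/0.14`).

```python
def simplified_rto_ms(srtt_ms: float, rttvar_ms: float, rto_min_ms: float = 200.0) -> float:
    """Simplified RTO: SRTT + 4 * RTTVAR, never below the configured minimum."""
    return max(rto_min_ms, srtt_ms + 4 * rttvar_ms)


srtt_ms, rttvar_ms = 1.423, 0.14  # from the ss sample output

for floor_ms in (200.0, 50.0):
    rto = simplified_rto_ms(srtt_ms, rttvar_ms, floor_ms)
    print(f"rto_min {floor_ms:>5.0f} ms -> RTO {rto:.0f} ms")
```

Without the floor the computed timeout would be about 2 ms, so on this path the `rto_min` setting effectively is the timeout.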
Queueing along the network path
• Intermediate routers along a path have interface buffers
• High load leads to more packets in buffer
• Latency increases due to queue time
• Can trigger retransmission timeouts
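The added latency from a buffer is the time needed to drain the bytes ahead of a packet at the link rate. A rough sketch with hypothetical numbers (neither the buffer size nor the link speed comes from the slides):

```python
def queueing_delay_ms(queued_bytes: int, link_bps: float) -> float:
    """Time for a packet to wait behind queued_bytes on a link of link_bps."""
    return queued_bytes * 8 / link_bps * 1000


# Assumption: 1.25 MB already queued on a 1 Gbps interface.
delay = queueing_delay_ms(1_250_000, 1e9)
print(f"queueing delay: {delay:.1f} ms")
```

A delay of this size is several times the 1 to 2 ms round-trip times seen inside a subnet, which is how a full buffer can push a connection toward its retransmission timeout.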
Active queue management
$ tc qdisc list
qdisc mq 0: dev eth0 root
qdisc pfifo_fast 0: dev eth0 parent :1 bands 3 […]
qdisc pfifo_fast 0: dev eth0 parent :2 bands 3 […]
# tc qdisc add dev eth0 root fq_codel
$ tc qdisc list
qdisc fq_codel 8006: dev eth0 root refcnt 9 limit 10240p flows 1024 quantum 9015 target 5.0ms interval 100.0ms ecn
www.bufferbloat.net/projects/codel/wiki
Maximum transmission unit
1500-byte MTU: 1448 B payload, 3.47% overhead
9001-byte MTU: 8949 B payload, 0.58% overhead
Improvement seen among instances in your VPC
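The overhead percentages follow from a fixed per-packet header cost. The sketch below assumes 52 bytes of headers per packet (20 bytes of IPv4 plus 32 bytes of TCP with the timestamp option), which is the difference between the 1448 B and 8949 B payloads on the slide and MTUs of 1500 and 9001 bytes:

```python
HEADER_BYTES = 52  # assumption: 20 B IPv4 + 32 B TCP with timestamps


def header_overhead_percent(mtu_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Share of each full-size packet spent on IP and TCP headers."""
    return header_bytes / mtu_bytes * 100


for mtu in (1500, 9001):
    payload = mtu - HEADER_BYTES
    print(f"MTU {mtu}: payload {payload} B, overhead {header_overhead_percent(mtu):.2f}%")
```

Larger frames also mean fewer packets per byte transferred, so per-packet processing cost drops along with the header overhead.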
Tuning maximum transmission unit
# ip link list
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 06:f1:b7:e1:3b:e7
# ip route list
default via 10.16.16.1 dev eth0
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
# ip route change default via 10.16.16.1 dev eth0 mtu 1500
# ip route list
default via 10.16.16.1 dev eth0 mtu 1500
10.16.16.0/24 dev eth0 proto kernel scope link
169.254.169.254 dev eth0 scope link
Amazon EC2 enhanced networking
[Diagram: Xen-PV, where each instance reaches the hardware NIC through the virtualization layer]
[Diagram: Intel 82599, where each instance uses a virtual function (VF) of the hardware NIC alongside the virtualization layer; 10 Gbps]
Amazon EC2 Elastic Network Adapter
[Diagram: ENA, where each instance uses a virtual function (VF) of the ENA device alongside the virtualization layer; 20 Gbps and 25 Gbps]
Verifying ENA is enabled
PV-XEN
$ ethtool -i eth0
driver: vif
Enhanced Networking: C3, C4, D2, I2, R3, M4 (not m4.16xlarge)
$ ethtool -i eth0
driver: ixgbevf
Elastic Network Adapter: F1, G3, I3, P2, P3, R4, X1, m4.16xlarge
$ ethtool -i eth0
driver: ena
https://github.com/amzn/amzn-drivers
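The driver name is the piece of `ethtool -i` output that identifies the networking mode. A small sketch that maps the three driver names from this slide to their modes; it works on captured text rather than invoking `ethtool`, and the function and table names are illustrative:

```python
DRIVER_TO_MODE = {
    "vif": "PV-XEN",
    "ixgbevf": "Enhanced Networking (Intel 82599 VF)",
    "ena": "Elastic Network Adapter",
}


def networking_mode(ethtool_i_output: str) -> str:
    """Return the networking mode implied by the 'driver:' line, or 'unknown'."""
    for line in ethtool_i_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "driver":
            return DRIVER_TO_MODE.get(value.strip(), "unknown")
    return "unknown"


print(networking_mode("driver: ena"))
```

The same check can be run across a fleet to find instances that are still on the paravirtual driver.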
Applying our new knowledge
Test setup
• m4.16xlarge instances—Jack and Jill
• Amazon Linux 2017.09 (Kernel 4.9.51-10.52.amzn1)
• Web Server: Nginx 1.12.1
• Client: ApacheBench 2.3
• TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
• Transferring incompressible data (random bits)
• Origin data stored in tmpfs (RAM based; no server disk I/O)
• Data discarded once retrieved (no client disk I/O)
Application 1
HTTPS with intermediate network loss
[Diagram: 0.5% loss on the path between Jack and Jill]
Test setup
• 1 test server instance, 1 test client instance
• 80 ms RTT
• 80 parallel clients retrieving a 100 MB object
$ ab -n 1600 -c 80 https://server/100m
• Simulated packet loss
# tc qdisc add dev eth0 root netem loss 0.5%
Goal: Minimize throughput impact with 0.5% loss
Results: application 1
Measured times per configuration (two bars per configuration in the original charts; the bar labels are not recoverable):
Configuration                  Bar 1     Bar 2
Cubic (defaults), no loss      23.2 s    37.6 s
Cubic (defaults), 0.5% loss    42.8 s    52.3 s
Illinois, 0.5% loss            20.7 s    41.5 s
BBR, 0.5% loss                 11.1 s    38.3 s
BBR, no loss                   8.8 s     44.7 s
BBR versus Cubic at 0.5% loss: 42.8 s down to 11.1 s, a 74% decrease!
Application 2
Data transfer; low RTT path
Test setup
• 1 test server instance, 1 test client instance
• 1 ms RTT
• 8 parallel clients retrieving a 10 MB object
$ ab -n 100000 -c 8 https://server/10m
• Start at default RTO, then decrease
Goal: Minimize latency at high percentiles with 0.2% loss
Results: application 2
With the default RTO minimum: p50 latency 2 ms, p99.99 latency 200 ms
RTO minimum    p99.99 latency
200 ms         200 ms
50 ms          100 ms
50% decrease!
Application 3
High transaction rate HTTP service
Test setup
• 1 test server instance, 1 test client instance
• 80 ms RTT
• HTTP, not HTTPS
• 1500 MTU
• 200k requests for a 10k object
$ ab -n 200000 -c 200 http://server/10k
Goal: Minimize latency
Results: application 3
Initial congestion window    P50 latency    Avg BW
3 packets                    321 ms         12.550 Mbps
10 packets                   241 ms         16.765 Mbps
16 packets                   161 ms         22.518 Mbps
79% increase in average bandwidth (3 packets to 16 packets)!
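The headline percentage can be checked directly from the table. A short sketch of the arithmetic, using the 3-packet and 16-packet rows as the before and after values:

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


bandwidth_gain = percent_change(12.550, 22.518)  # average bandwidth, Mbps
latency_change = percent_change(321, 161)        # P50 latency, ms

print(f"average bandwidth: {bandwidth_gain:+.0f}%")
print(f"P50 latency:       {latency_change:+.0f}%")
```

The bandwidth gain rounds to the 79% shown on the slide, and the same rows show P50 latency falling by roughly half.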
Takeaways
• The network doesn’t have to be a black box: Linux tools can be used to interrogate and understand it
• Simple tweaks to settings can dramatically increase performance: test, measure, change
• Understand what your application needs from the network, and tune accordingly
Thank You
Remember to complete your evaluations!

 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Plus de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

AWS TCP Performance Guide

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Network Performance: Making Every Packet Count. Mike Furr, Principal Engineer, EC2. November 29, 2017. NET401.
  • 2. What to expect from this session: Tuning TCP on Linux TCP Performance Application
  • 3. [no text on slide]
  • 4. TCP
  • 5. TCP
        • Transmission Control Protocol
        • Underlies SSH, HTTP, *SQL, SMTP
        • Stream delivery, flow control
  • 6. TCP [diagram: Jack, Jill]
  • 7. [diagram: Jack, Jill]
  • 8. Limiting in-flight data [diagram: Jack and Jill, each labeled with a receive window and a congestion window]
  • 9. Bandwidth delay product [diagram: Jack, Jill; 2 ms round-trip time]
  • 10. Bandwidth delay product [diagram: Jack, Jill; 100 ms round-trip time]
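
The bandwidth delay product on slides 9 and 10 is the amount of data that must be in flight to keep a path full: bandwidth multiplied by round-trip time. The sketch below works it out for the two RTTs shown. The slides do not state a link speed, so the 10 Gbps figure and the function name are assumptions made only for illustration.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bps * rtt_seconds / 8


# Assumed link speed for illustration only; the slides do not state one.
ASSUMED_BANDWIDTH_BPS = 10e9  # 10 Gbps

for rtt_ms in (2, 100):  # the two round-trip times shown on the slides
    bdp = bandwidth_delay_product_bytes(ASSUMED_BANDWIDTH_BPS, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> BDP {bdp / 1e6:.1f} MB")
```

Under that assumption the 100 ms path needs 50 times as much in-flight data as the 2 ms path, which is why the receive and congestion windows matter far more on long paths.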
  • 11. Receive window: receiver controlled, signaled to sender
  • 12. Congestion window [diagram: Jack and Jill, each labeled with a receive window and a congestion window]
  • 13. Congestion window
        • Sender controlled
        • Window is managed by the congestion control algorithm
        • Inputs vary by algorithm
  • 14. Initial congestion window
        $ ip route list
        default via 10.16.16.1 dev eth0
        10.16.16.0/24 dev eth0 proto kernel scope link
        169.254.169.254 dev eth0 scope link
        1448 + 1448 + 1448 = 4344 bytes
  • 15. Initial congestion window
        # ip route change 10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
        $ ip route list
        default via 10.16.16.1 dev eth0
        10.16.16.0/24 dev eth0 proto kernel scope link initcwnd 16
        169.254.169.254 dev eth0 scope link
        1448 + 1448 + 1448 + 1448 [+ 12 more] = 23168 bytes
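
The byte counts on slides 14 and 15 are the initial congestion window in segments multiplied by the 1448-byte segment payload. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the segment counts (3 and 16) and payload size come from the slides.

```python
MSS_BYTES = 1448  # per-segment payload shown on the slides


def initial_window_bytes(initcwnd_segments: int, mss_bytes: int = MSS_BYTES) -> int:
    """Payload a sender may transmit before the first ACK arrives."""
    return initcwnd_segments * mss_bytes


for segments in (3, 16):
    print(f"initcwnd {segments:>2} -> {initial_window_bytes(segments)} bytes")
```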
  • 16. Impact of loss on TCP throughput [chart: loss rate axis from 0% to 10%; other axis from 0 to 100]
  • 17. Loss is visible as TCP retransmissions
        $ netstat -s | grep retransmit
        58496 segments retransmitted
        52788 fast retransmits
        135 forward retransmits
        3659 retransmits in slow start
        392 SACK retransmits failed
  • 18. Socket level diagnostic (highlighted: TCP state)
        $ ss -ite
        State Recv-Q Send-Q Local Address:Port Peer Address:Port
        ESTAB 0 3829960 10.16.16.18:https 10.16.16.75:52008
          timer:(on,012ms,0) uid:498 ino:7116021 sk:0001c286 <->
          ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40 mss:1448 cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138 retrans:0/11737 rcv_space:26847
  • 19. Socket level diagnostic (same output; highlighted: bytes queued for transmission, the Send-Q column)
  • 20. Socket level diagnostic (same output; highlighted: congestion control algorithm, cubic)
  • 21. Socket level diagnostic (same output; highlighted: retransmission timeout, rto:204)
  • 22. Socket level diagnostic (same output; highlighted: congestion window, cwnd:138)
  • 23. Socket level diagnostic (same output; highlighted: retransmissions, retrans:0/11737)
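
The fields highlighted on slides 18 to 23 can also be pulled out programmatically. The sketch below parses the info line copied from the slide and checks that one congestion window per round trip (cwnd × mss ÷ rtt) reproduces the `send 1123.4Mbps` figure that `ss` reports. The parser is a hypothetical helper written for this one sample line, not a general `ss` parser.

```python
import re

# The per-socket info line from the `ss -ite` output on the slides.
SS_INFO = (
    "ts sack cubic wscale:7,7 rto:204 rtt:1.423/0.14 ato:40 mss:1448 "
    "cwnd:138 ssthresh:80 send 1123.4Mbps unacked:138 retrans:0/11737 "
    "rcv_space:26847"
)


def parse_ss_info(line: str) -> dict:
    """Collect the key:value tokens of an ss info line into a dict of strings."""
    return dict(re.findall(r"(\w+):(\S+)", line))


fields = parse_ss_info(SS_INFO)
mss_bytes = int(fields["mss"])
cwnd_segments = int(fields["cwnd"])
rtt_ms = float(fields["rtt"].split("/")[0])  # "rtt:<average>/<mean deviation>" in ms

# One congestion window of data per round trip, expressed in Mbps.
send_mbps = cwnd_segments * mss_bytes * 8 / (rtt_ms / 1000) / 1e6
print(f"cwnd={cwnd_segments} mss={mss_bytes} rtt={rtt_ms} ms -> {send_mbps:.1f} Mbps")
```

The agreement with the reported send rate shows how directly the congestion window and RTT bound a connection's throughput.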
  • 24. Monitoring retransmissions in real time: observable using Linux kernel tracing
        # tcpretrans
        TIME PID LADDR:LPORT -- RADDR:RPORT STATE
        03:31:07 106588 10.16.16.18:443 R> 10.16.16.75:52291 ESTABLISHED
        https://github.com/brendangregg/perf-tools/
  • 25. Congestion control algorithm [diagram: Jack, Jill]
  • 26. Congestion control algorithms in Linux
        • New Reno: pre-2.6.8
        • BIC: 2.6.8–2.6.18
        • CUBIC: 2.6.19+
        • Pluggable architecture
        • Other algorithms often available: BBR, Vegas, Illinois, Westwood, Highspeed, Scalable
  • 27. Tuning congestion control algorithm
        $ sysctl net.ipv4.tcp_available_congestion_control
        net.ipv4.tcp_available_congestion_control = cubic reno
        $ find /lib/modules -name 'tcp_*'
        […]
        # modprobe tcp_illinois
        $ sysctl net.ipv4.tcp_available_congestion_control
        net.ipv4.tcp_available_congestion_control = cubic reno illinois
  • 28. Tuning congestion control algorithm
        # sysctl net.ipv4.tcp_congestion_control=illinois
        net.ipv4.tcp_congestion_control = illinois
        # echo "net.ipv4.tcp_congestion_control = illinois" > /etc/sysctl.d/01-tcp.conf
        [Restart network processes]
  • 29. TCP-BBR
        • Available since Linux 4.9
        • Uses pacing and active probing to estimate bandwidth and RTT
        • Starting in 4.13, fq no longer required
        # modprobe sch_fq
        # modprobe tcp_bbr
        # sysctl net.core.default_qdisc=fq
        # sysctl net.ipv4.tcp_congestion_control=bbr
  • 30. Retransmission timer
        • Input to when the congestion control algorithm considers a packet lost
        • Too low: spurious retransmission; congestion control can over-react and be slow to re-open the congestion window
        • Too high: increased latency while the algorithm determines a packet is lost and retransmits
  • 31. Tuning retransmission timer minimum
        • Default minimum: 200 ms
        # ip route list
        default via 10.16.16.1 dev eth0
        10.16.16.0/24 dev eth0 proto kernel scope link    (route to other instances in our subnet, same AZ)
        169.254.169.254 dev eth0 scope link
  • 32. Tuning retransmission timer minimum
        # ip route list
        default via 10.16.16.1 dev eth0
        10.16.16.0/24 dev eth0 proto kernel scope link
        169.254.169.254 dev eth0 scope link
        # ip route change 10.16.16.0/24 dev eth0 proto kernel scope link rto_min 50ms
        # ip route list
        default via 10.16.16.1 dev eth0
        10.16.16.0/24 dev eth0 proto kernel scope link rto_min lock 50ms
        169.254.169.254 dev eth0 scope link
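
To see why the minimum dominates on a low-RTT path, here is a rough sketch using a simplified RFC 6298-style rule: smoothed RTT plus four times the RTT variation, floored at the configured minimum. This is an assumption for illustration, not the Linux kernel's exact computation (the `rto:204` in the earlier `ss` output shows the kernel's result differs in detail). The function name is hypothetical, and the RTT figures are taken from `rtt:1.423/0.14` on the socket diagnostic slides.

```python
def retransmission_timeout_ms(srtt_ms: float, rttvar_ms: float, rto_min_ms: float) -> float:
    """Simplified RFC 6298-style timeout: SRTT + 4 * RTTVAR, floored at rto_min."""
    return max(rto_min_ms, srtt_ms + 4 * rttvar_ms)


# rtt:1.423/0.14 from the ss output on the slides (average RTT / mean deviation, in ms).
SRTT_MS, RTTVAR_MS = 1.423, 0.14

for rto_min_ms in (200, 50):  # the default minimum and the tuned value from the slides
    rto = retransmission_timeout_ms(SRTT_MS, RTTVAR_MS, rto_min_ms)
    print(f"rto_min {rto_min_ms:>3} ms -> RTO {rto:.0f} ms")
```

With a measured RTT of about 1.4 ms, the computed term is roughly 2 ms, so in this model the timeout is set almost entirely by `rto_min`.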
  • 33. Queueing along the network path [diagram: Jack, Jill]
  • 34. Queueing along the network path
        • Intermediate routers along a path have interface buffers
        • High load leads to more packets in buffer
        • Latency increases due to queue time
        • Can trigger retransmission timeouts
  • 35. Active queue management
        $ tc qdisc list
        qdisc mq 0: dev eth0 root
        qdisc pfifo_fast 0: dev eth0 parent :1 bands 3 […]
        qdisc pfifo_fast 0: dev eth0 parent :2 bands 3 […]
        # tc qdisc add dev eth0 root fq_codel
        qdisc fq_codel 8006: dev eth0 root refcnt 9 limit 10240p flows 1024 quantum 9015 target 5.0ms interval 100.0ms ecn
        www.bufferbloat.net/projects/codel/wiki
  • 36. Maximum transmission unit
        • 1448 B payload: 3.47% overhead
        • 8949 B payload: 0.58% overhead
        • Improvement seen among instances in your VPC
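
The overhead percentages on slide 36 can be reproduced by assuming 52 bytes of headers per packet (20 B IPv4, 20 B TCP, 12 B TCP timestamp option). That header breakdown is an assumption, not something the slide states, but it yields packet sizes of 1500 and 9001 bytes, which match the MTUs that appear in the surrounding slides.

```python
HEADER_BYTES = 52  # assumption: 20 B IPv4 + 20 B TCP + 12 B TCP timestamp option


def header_overhead_percent(payload_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Share of each IP packet spent on headers rather than payload."""
    return 100 * header_bytes / (payload_bytes + header_bytes)


for payload in (1448, 8949):  # payload sizes from the slide
    mtu = payload + HEADER_BYTES
    print(f"MTU {mtu}: payload {payload} B, overhead {header_overhead_percent(payload):.2f}%")
```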
  • 37. Tuning maximum transmission unit
        # ip link list
        2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP mode DEFAULT group default qlen 1000
            link/ether 06:f1:b7:e1:3b:e7
        # ip route list
        default via 10.16.16.1 dev eth0
        10.16.16.0/24 dev eth0 proto kernel scope link
        169.254.169.254 dev eth0 scope link
  • 38. Tuning maximum transmission unit
        # ip route change default via 10.16.16.1 dev eth0 mtu 1500
        # ip route list
        default via 10.16.16.1 dev eth0 mtu 1500
        10.16.16.0/24 dev eth0 proto kernel scope link
        169.254.169.254 dev eth0 scope link
  • 39. Amazon EC2 enhanced networking
  • 40. Amazon EC2 enhanced networking [diagram labels, per host: Xen-PV, Virtualization Layer, HW NIC]
  • 41. Amazon EC2 enhanced networking [diagram labels, per host: VF, Virtualization Layer, HW NIC, Intel 82599; 10 Gbps]
  • 42. Amazon EC2 Elastic Network Adapter [diagram labels, per host: VF, Virtualization Layer, ENA; 20 Gbps, 25 Gbps]
  • 43. Verifying ENA is enabled
        • PV-XEN
            $ ethtool -i eth0
            driver: vif
        • Enhanced Networking: C3, C4, D2, I2, R3, M4 (not m4.16xlarge)
            $ ethtool -i eth0
            driver: ixgbevf
        • Elastic Network Adapter: F1, G3, I3, P2, P3, R4, X1, m4.16xlarge
            $ ethtool -i eth0
            driver: ena
        https://github.com/amzn/amzn-drivers
  • 44. Applying our new knowledge
  • 45. Test setup
        • m4.16xlarge instances: Jack and Jill
        • Amazon Linux 2017.09 (Kernel 4.9.51-10.52.amzn1)
        • Web server: Nginx 1.12.1
        • Client: ApacheBench 2.3
        • TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
        • Transferring incompressible data (random bits)
        • Origin data stored in tmpfs (RAM based; no server disk I/O)
        • Data discarded once retrieved (no client disk I/O)
  • 46. Application 1: HTTPS with intermediate network loss [diagram: Jack, Jill; 0.5% loss]
  • 47. Test setup
        • 1 test server instance, 1 test client instance
        • 80 ms RTT
        • 80 parallel clients retrieving a 100 MB object
            $ ab -n 1600 -c 80 https://server/100m
        • Simulated packet loss
            # tc qdisc add dev eth0 root netem loss 0.5%
        Goal: Minimize throughput impact with 0.5% loss
  • 48. Results, application 1 [chart: Defaults vs. Defaults w/0.5% loss; values shown: 23.2 s, 42.8 s, 37.6 s, 52.3 s]
  • 49. Results, application 1 [chart: Cubic w/0.5% loss vs. Illinois w/0.5% loss; values shown: 20.7 s, 42.8 s, 52.3 s, 41.5 s]
  • 50. Results, application 1 [chart: Cubic w/0.5% loss vs. BBR w/0.5% loss; values shown: 42.8 s, 11.1 s, 52.3 s, 38.3 s] 74% decrease!
  • 51. Results, application 1 [chart: BBR no loss vs. BBR w/0.5% loss; values shown: 44.7 s, 8.8 s, 11.1 s, 38.3 s]
  • 52. Results, application 1 [chart: BBR no loss vs. Cubic no loss; values shown: 44.7 s, 8.8 s, 11.1 s, 38.3 s, 23.2 s, 37.6 s]
  • 53. Application 2: Data transfer; low RTT path [diagram: Jack, Jill]
  • 54. Test setup
        • 1 test server instance, 1 test client instance
        • 1 ms RTT
        • 8 parallel clients retrieving a 10 MB object
            $ ab -n 100000 -c 8 https://server/10m
        • Start at default RTO, then decrease
        Goal: Minimize latency at high percentiles with 0.2% loss
  • 55. Results, application 2 [chart: p50 2 ms; p99.99 200 ms]
  • 56. Results, application 2
        • RTO:200 p99.99 latency: 200 ms
        • RTO:50 p99.99 latency: 100 ms
        50% decrease!
  • 57. Application 3: High transaction rate HTTP service [diagram: Jack, Jill]
  • 58. Test setup
        • 1 test server instance, 1 test client instance
        • 80 ms RTT
        • HTTP, not HTTPS
        • 1500 MTU
        • 200k requests for a 10k object
            $ ab -n 200000 -c 200 http://server/10k
        Goal: Minimize latency
  • 59. Results, application 3
        Test                                     P50 latency   Avg BW
        Initial congestion window, 3 packets     321 ms        12.550 Mbps
        Initial congestion window, 10 packets    241 ms        16.765 Mbps
        Initial congestion window, 16 packets    161 ms        22.518 Mbps
        79% increase!
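
The headline percentages on the results slides follow from the reported figures. The sketch below recomputes them. For application 1, the slides do not name the metric, and the 42.8 s and 11.1 s values are assumed to be the pair behind the "74% decrease" callout because they reproduce it. The labels and function name are hypothetical.

```python
def percent_change(before: float, after: float) -> float:
    """Signed change from `before` to `after`, as a percentage of `before`."""
    return 100 * (after - before) / before


# (label, before, after) triples taken from the results slides.
HEADLINES = [
    ("App 1 result, Cubic -> BBR at 0.5% loss (s)", 42.8, 11.1),
    ("App 2 p99.99 latency, RTO 200 -> 50 (ms)", 200, 100),
    ("App 3 avg bandwidth, initcwnd 3 -> 16 (Mbps)", 12.550, 22.518),
    ("App 3 p50 latency, initcwnd 3 -> 16 (ms)", 321, 161),
]

for label, before, after in HEADLINES:
    print(f"{label}: {percent_change(before, after):+.0f}%")
```

The last row is not called out on the slides: raising the initial congestion window from 3 to 16 packets also roughly halves the median latency.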
  • 60. Takeaways
  • 61. Takeaways
        • The network doesn’t have to be a black box: Linux tools can be used to interrogate and understand it
        • Simple tweaks to settings can dramatically increase performance: test, measure, change
        • Understand what your application needs from the network, and tune accordingly
  • 62. Thank You
  • 63. Thank you!