Introduction to Linux Kernel
   TCP/IP protocol stack
                雕梁
      Core System Server Platform Group
      diaoliang@taobao.com
   simohayha.bobo@gmail.com
     http://www.pagefault.info
             2011/01/15
Agenda
Introduction

Networking code in the Linux kernel tree

L2 (Link Layer)

L3 (Network Layer)

L4 (Transport Layer)

Config and benchmark tools

Resources
Introduction

   Source
      http://git.kernel.org/
      net-next-2.6 and net-2.6
   Developer
      Alan Cox, David Miller, Eric Dumazet, Patrick McHardy,
      etc.
   Traffic directions
      input, forward, and output
   Layer
      L2(Link Layer)/L3(Network Layer)/L4(Transport Layer)
   Device interface
      PCI/PCI-E
Networking code in the Linux kernel tree

   Net-Kernel source tree (figure)
Big picture (figure)
Link layer
  Frame type
     802.3/802.2/802.2-SNAP/Ethernet
  Input
     Driver
        NAPI
             Poll + Interrupt
     Soft interrupt
        GRO
             feed packets to the network stack
        RPS/RFS
             steer packets across CPUs on SMP systems
     Protocol handler
        eth_type_trans determines the protocol
        packet_type list
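
The protocol-handler step above can be illustrated with a small userspace sketch. It is not kernel code: it only mimics the idea that the EtherType field selects which registered handlers receive the frame. The `Dispatcher` class and its method names are hypothetical; the constants match the well-known EtherType values. 802.3 frames carrying a length field (LLC/SNAP) are deliberately not handled here.

```python
import struct

ETH_HLEN = 14              # destination MAC (6) + source MAC (6) + type (2)
ETH_P_802_3_MIN = 0x0600   # smaller values are 802.3 length fields
ETH_P_IP = 0x0800
ETH_P_ARP = 0x0806


def eth_type(frame):
    """Return the EtherType of an Ethernet II frame (bytes 12-13, big-endian)."""
    if len(frame) < ETH_HLEN:
        raise ValueError("frame shorter than an Ethernet header")
    (proto,) = struct.unpack_from("!H", frame, 12)
    if proto < ETH_P_802_3_MIN:
        raise ValueError("802.3 length field; LLC/SNAP parsing is not sketched")
    return proto


class Dispatcher:
    """Hypothetical stand-in for the list of registered protocol handlers."""

    def __init__(self):
        self.handlers = {}

    def register(self, proto, func):
        # Several handlers may listen on the same protocol.
        self.handlers.setdefault(proto, []).append(func)

    def deliver(self, frame):
        # Strip the link-layer header and hand the payload to each handler.
        proto = eth_type(frame)
        payload = frame[ETH_HLEN:]
        return [handler(payload) for handler in self.handlers.get(proto, [])]


if __name__ == "__main__":
    d = Dispatcher()
    d.register(ETH_P_IP, lambda payload: ("ip", len(payload)))
    frame = bytes(12) + struct.pack("!H", ETH_P_IP) + b"abc"
    print(d.deliver(frame))
```

A frame whose type has no registered handler simply yields an empty result list in this sketch.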
Link layer
  Output
      Traffic Control
      Soft interrupt
          Transmit SKB
              Scatter/Gather DMA
          Free skb
          XPS
              multiqueue
              avoid cache line bouncing
              improve locality
  Bridge
      Virtual device; must be bound to one or more real devices
      Spanning Tree Protocol
Link Layer big map (figure)
Network Layer(IP)
 Input
    Protocol handler
        net_protocol array
    defragment
        Hashtable
            Fragments of each IP packet being defragmented are saved
            in a list and stay in kernel memory until the packet is
            completely processed
 Output
    fragment
        MTU
        Scatter/Gather IO
        UDP
    neighboring
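
The relation between fragmentation and the MTU in the Output list above can be made concrete with a short calculation. This is a sketch under stated assumptions, not the kernel's implementation: the function name `fragment` is hypothetical, and the IP header is assumed to be 20 bytes with no options. It relies on two protocol facts: the fragment offset is expressed in 8-byte units, and every fragment except the last must carry a multiple of 8 bytes of data.

```python
def fragment(payload_len, mtu, header_len=20):
    """Split an IP payload into (offset_in_8_byte_units, data_len, more_fragments)."""
    # Largest data size per fragment, rounded down to a multiple of 8.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    if payload_len <= mtu - header_len:
        # Fits in a single packet: no fragmentation needed.
        return [(0, payload_len, False)]
    frags = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len
        frags.append((offset // 8, size, more))
        offset += size
    return frags


if __name__ == "__main__":
    for frag in fragment(4000, 1500):
        print(frag)
```

For a 4000-byte payload and a 1500-byte MTU this yields three fragments of 1480, 1480, and 1040 bytes, at offsets 0, 185, and 370.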
Network Layer(IP)
 Forward
    process IP options
    ignore defragmentation
         Router Alert option
 Route
    Forwarding Information Base(routing table)
    cache
 Netfilter
    HOOK point
         NF_IP_LOCAL_OUT, NF_IP_LOCAL_IN, etc.
 Management
    Long-living IP peer information
          AVL tree
    IP statistics
          per-CPU data: ipstats_mib
          /proc/net/snmp
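
The IP statistics exported through /proc/net/snmp can be read from userspace. The file consists of line pairs: a header line with counter names, then a line with the values, both prefixed by the protocol name. A minimal parsing sketch follows; `parse_snmp` is a hypothetical helper name, and the function takes the file contents as a string so that it can be tried on any text.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: value}}."""
    stats = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: names first, values second.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError("header/value lines out of step")
        stats[proto] = dict(zip(names.split(), map(int, numbers.split())))
    return stats


if __name__ == "__main__":
    # Only works on a Linux system that exposes this file.
    with open("/proc/net/snmp") as f:
        ip = parse_snmp(f.read()).get("Ip", {})
    print(ip.get("InReceives"), ip.get("ReasmReqds"))
```

Counters such as `ReasmReqds` and `FragCreates` in the `Ip` group relate to the defragment and fragment steps listed earlier.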
Network Layer big map (figure)
Transport Layer (TCP)
  Init
    bind callback (sock_create)
    Three-way handshake
        accept queue
        SYN table
        create new socket fd and change state
  Manage socket
    inet_ehash_bucket
        TCP_ESTABLISHED <= sk->sk_state < TCP_CLOSE
    inet_bind_hashbucket
        local binding port info
    listening_hash
        socket in TCP_LISTEN state
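
The idea behind the established-connection hash can be sketched in a few lines: a fully specified connection is identified by its 4-tuple, which is hashed to pick a bucket, and the bucket's chain is then searched for an exact match. The `EstablishedTable` class below is a hypothetical illustration; Python's built-in `hash` stands in for the kernel's hash function, and the table size is an arbitrary power of two.

```python
class EstablishedTable:
    """Toy hash table keyed by (saddr, sport, daddr, dport)."""

    def __init__(self, size=16):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.mask = size - 1
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # Masking with size - 1 selects a bucket index.
        return self.buckets[hash(key) & self.mask]

    def insert(self, saddr, sport, daddr, dport, sock):
        key = (saddr, sport, daddr, dport)
        self._bucket(key).append((key, sock))

    def lookup(self, saddr, sport, daddr, dport):
        key = (saddr, sport, daddr, dport)
        # Walk the chain; different tuples may share a bucket.
        for stored_key, sock in self._bucket(key):
            if stored_key == key:
                return sock
        return None


if __name__ == "__main__":
    table = EstablishedTable()
    table.insert("10.0.0.1", 80, "10.0.0.2", 40000, "sock-A")
    print(table.lookup("10.0.0.1", 80, "10.0.0.2", 40000))
    print(table.lookup("10.0.0.1", 80, "10.0.0.2", 40001))
```

Listening sockets have no remote address or port yet, so they cannot be found this way; that is why a separate listening hash is kept.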
Transport Layer (TCP)
 Output
    TCP push
    Congestion control
        state transition
        congestion window
        packet count
 Input
    fast path and slow path
    Interrupt context / process context
    sk_backlog/receive_queue/prequeue
 TCP state transition
    Kernel control
 Timer
    Retransmit/keep-alive/time-wait, etc.
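
The congestion control items in the Output list above (congestion window, packet count) can be illustrated with textbook Reno-style window growth, counted in packets. This is a simplified sketch, not the kernel's algorithm: the function `on_ack` and its parameter names are hypothetical, loss handling is omitted, and each call is assumed to represent one ACK for one packet.

```python
def on_ack(cwnd, ssthresh, cwnd_cnt):
    """Return the updated (cwnd, cwnd_cnt) after one ACK."""
    if cwnd < ssthresh:
        # Slow start: one extra packet per ACK, so cwnd doubles per RTT.
        return cwnd + 1, cwnd_cnt
    # Congestion avoidance: count ACKs, grow by one packet per full window.
    cwnd_cnt += 1
    if cwnd_cnt >= cwnd:
        return cwnd + 1, 0
    return cwnd, cwnd_cnt


if __name__ == "__main__":
    cwnd, cnt = 2, 0
    for ack in range(1, 11):
        cwnd, cnt = on_ack(cwnd, ssthresh=4, cwnd_cnt=cnt)
        print(ack, cwnd, cnt)
```

Below `ssthresh` the window grows with every ACK; above it, the ACK counter has to reach the current window size before the window grows by one packet.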
TCP big map (figure)
Config and Benchmark Tools

 Ethtool
    offload features
 Benchmark and test tools
    Netperf/pktgen
    Mpstat/tcpstat

 Proc FileSystem
    /proc/net
    /proc/sys/net
        ipv4
        core
 Sys FileSystem
    /sys/class/net/ethx
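
The tunables under /proc/sys/net map directly onto dotted sysctl names: each dot becomes a path separator. A minimal sketch of that mapping follows; the helper names `sysctl_path` and `read_sysctl` are hypothetical, and the sketch ignores the corner case of names that themselves contain dots (for example some interface names).

```python
def sysctl_path(name):
    """Map a dotted sysctl name to its file under /proc/sys."""
    if not name or name.startswith(".") or name.endswith("."):
        raise ValueError("malformed sysctl name")
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the whitespace-separated fields of a sysctl value (Linux only)."""
    with open(sysctl_path(name)) as f:
        return f.read().split()


if __name__ == "__main__":
    print(sysctl_path("net.ipv4.tcp_rmem"))
    print(read_sysctl("net.core.somaxconn"))
```

Some entries, such as `net.ipv4.tcp_rmem`, hold several fields (minimum, default, and maximum), which is why the reader returns a list.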
Resources

 http://kernelnewbies.org

 http://kernel.org

 http://www.kernelplanet.org

 https://lkml.org

 http://vger.kernel.org/vger-lists.html

 http://www.pagefault.info/?tag=kernel
