Andriy Berestovskyy
2014
(c) Andriy Berestovskyy
networking hour
[Title slide word cloud: TCP, UDP, NAT, IPsec, IPv4, IPv6, internet protocols, AH, ESP, authentication, authorization, accounting, encapsulation, security, BGP, OSPF, ICMP, ACL, SNAT, tunnel, PPPoE, GRE, ARP, discovery, NDP, OSI, broadcast, multicast, IGMP, PIM, MAC, DHCP, DNS, fragmentation]
semihalf
berestovskyy
Network Programming
Data Plane Development Kit
Network Programming Series
● Berkeley Sockets
● > Data Plane Development Kit (DPDK)
● Open Data Plane (ODP)
Let’s Make a 10Gbit/s Switch!
1. Receive frame, check Ethernet FCS
2. Add/update source MAC in MAC table
3. If multicast bit is set:
a. forward to all ports except the source
4. If destination is in MAC table:
a. forward to the specific port
5. Else, forward to all ports except the source
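The forwarding decision in steps 2 to 5 can be sketched in plain C. This is only an illustration: the table size, the `PORT_FLOOD` marker and all function names are hypothetical, the lookup is a linear scan, and FCS checking (step 1) is left to the NIC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_TABLE_SIZE 64   /* hypothetical capacity for the sketch */
#define PORT_FLOOD     (-1) /* hypothetical marker: all ports except the source */

struct mac_entry {
    uint8_t mac[6];
    int     port;
    int     used;
};

static struct mac_entry mac_table[MAC_TABLE_SIZE];

/* Find the table entry for a MAC address, or NULL if it is unknown. */
static struct mac_entry *mac_lookup(const uint8_t mac[6])
{
    for (int i = 0; i < MAC_TABLE_SIZE; i++)
        if (mac_table[i].used && memcmp(mac_table[i].mac, mac, 6) == 0)
            return &mac_table[i];
    return NULL;
}

/* Step 2: add or update the source MAC. */
static void mac_learn(const uint8_t mac[6], int port)
{
    assert(port >= 0);
    struct mac_entry *e = mac_lookup(mac);
    for (int i = 0; e == NULL && i < MAC_TABLE_SIZE; i++)
        if (!mac_table[i].used)
            e = &mac_table[i];
    if (e == NULL)
        return; /* table full: this sketch simply skips learning */
    memcpy(e->mac, mac, 6);
    e->port = port;
    e->used = 1;
}

/* Steps 2-5: returns the output port, or PORT_FLOOD. */
static int switch_forward(const uint8_t dst[6], const uint8_t src[6], int in_port)
{
    mac_learn(src, in_port);
    if (dst[0] & 0x01)                 /* step 3: multicast/broadcast bit */
        return PORT_FLOOD;
    struct mac_entry *e = mac_lookup(dst);
    return e ? e->port : PORT_FLOOD;   /* steps 4 and 5 */
}
```

An unknown destination is flooded first; once the destination host has sent a frame itself, later frames go to its port only.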
Let’s Make a Simple Hub Instead of a Switch!
1. Receive frame, check Ethernet FCS
2. Forward to the other port
(Steps 2-4 of the switch, i.e. MAC learning, multicast check and MAC lookup, are dropped.)
But still, 10 Gbit/s!
Recap: Ethernet Frame Format
Source: https://en.wikipedia.org/wiki/Ethernet_frame
Fields on the wire, in order:
● Preamble, 7 octets
● Start of Frame, 1 octet
● Destination MAC, 6 octets (header)
● Source MAC, 6 octets (header)
● Optional 802.1q (VLAN) Tag, 4 octets (header)
● Ethertype, 2 octets (header)
● Payload, 46-1500 octets
● Frame Check Sequence, 4 octets (trailer)
● Interframe gap, 12 octets
Ethernet frame (header + payload + trailer): 64 to 1522 octets
With preamble and Start of Frame: 72 to 1530 octets
Performance Challenges
Minimum Ethernet frame size on the wire:
min size on the wire = preamble + start of frame + min frame + interframe gap
min size on the wire = 7 + 1 + 64 + 12 = 84 octets (84 * 8 = 672 bits)
Maximum number of frames on 1 Gbps link:
packets per second = 1 Gbps / 672 bits = 1 488 095 pps
Maximum number of frames on 10 Gbps link:
packets per second = 10 Gbps / 672 bits = 14.88 Mpps
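The same arithmetic as a small C helper, so other frame sizes or link rates can be tried. The function and constant names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Octets every frame occupies on the wire besides the frame itself. */
enum { PREAMBLE = 7, START_OF_FRAME = 1, INTERFRAME_GAP = 12 };

/* Frames per second at a given link rate and frame size (FCS included). */
static uint64_t max_pps(uint64_t link_bps, unsigned frame_octets)
{
    assert(frame_octets >= 64);
    uint64_t wire_bits =
        8ull * (PREAMBLE + START_OF_FRAME + frame_octets + INTERFRAME_GAP);
    return link_bps / wire_bits;
}
```

For example, `max_pps(10000000000ull, 64)` gives 14 880 952 packets per second, the 14.88 Mpps figure above.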
Ethernet vs CPU
Skylake Intel® Xeon® Processor E3-1280 v5 at 3.7 GHz
CPU budget per packet = CPU frequency / packets per second
CPU budget per packet = 3.7 GHz / 14.88 Mpps = 249 cycles
249 CPU cycles per packet
Is it a lot?
Recap: Xeon Memory Hierarchy
Xeon Package
● Per core: registers
● Per core: 32 KB L1 / ~4 cycles
● Per core: 256 KB L2 / ~10 cycles
● Shared by all cores: 36 MB LLC / ~30 cycles
● Up to 1.5 TB DDR4 / ~200 cycles / 60 GB/s
Source: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
Ethernet vs Memory
249 cycles per packet is:
249 cycles / 4 cycles = 62 L1 cache reads
249 cycles / 10 cycles = 24 L2 cache reads
249 cycles / 30 cycles = 8 L3 cache reads
249 cycles / 200 cycles = 1 DDR4 read
:(
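The budget figures above follow from two divisions. A minimal C sketch, with hypothetical function names and the latencies taken from the memory hierarchy recap:

```c
#include <assert.h>

/* CPU cycles available per packet: clock rate divided by packet rate. */
static double cycles_per_packet(double cpu_hz, double pps)
{
    assert(pps > 0);
    return cpu_hz / pps;
}

/* How many sequential accesses of a given latency fit into the budget. */
static unsigned accesses_in_budget(unsigned budget_cycles, unsigned latency_cycles)
{
    assert(latency_cycles > 0);
    return budget_cycles / latency_cycles;
}
```

`accesses_in_budget(249, 200)` returns 1: a single DDR4 read consumes nearly the whole per-packet budget.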
What is Wrong With Kernel?
1. Interrupts
2. Context switches
3. User-kernel space data copying
4. Kernel development is not easy
Solution?
Data Plane Development Kit: a set of user space dataplane libraries and NIC drivers for fast packet processing
Source: Wikipedia
Dataplane?
Dataplane: the part of the architecture that decides what to do with packets arriving on an inbound interface
Source: Wikipedia
What is DPDK
1. Set of user space libraries
2. Set of user space drivers with direct NIC access
3. Support for Network Functions Virtualization
4. Open source, BSD Licensed
How?
What DPDK is Not
1. Not a TCP/IP stack
2. Not a Berkeley socket API
3. Not a ready to use solution,
i.e. neither a router nor a switch
Fitting Into 249 Cycles: Polling
Poll for new packets, do not wait for an interrupt:
1. Interrupts are out of the budget
2. Avoid context switches
void lcore_loop()
{
    while (1) {
        ...
    }
}
Pros/cons?
Fitting Into 249 Cycles: Bursts
Process a few packets at a time (a burst), not one by one:
1. Amortize slow memory reads over the burst
2. Increase cache hit ratio: do the same work for a few packets at once
void lcore_loop()
{
    struct rte_mbuf *burst[32];
    uint16_t nb_rx;

    while (1) {
        /* Receive up to 32 packets from port 0, queue 0 */
        nb_rx = rte_eth_rx_burst(0, 0, burst, 32);
        ...
        /* Send them out on port 1, queue 0 */
        rte_eth_tx_burst(1, 0, burst, nb_rx);
    }
}
Pros/cons?
Fitting Into 249 Cycles: Hugepages
Virtual-to-physical address translation uses the Translation Lookaside Buffer (TLB):
1. Haswell first-level DTLB for 4K pages: 64 entries, 4-way (~4 cycles)
2. Haswell second-level DTLB: 1024 entries, 8-way (~10 cycles)
Use 2MB or 1GB hugepages to reduce TLB misses:
GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1G hugepagesz=1G hugepages=4"
# mount -t hugetlbfs nodev /mnt/huge
More on system requirements: http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html
Fitting Into 249 Cycles: Multicore
Use several CPU cores to process packets:
1. Use Receive Side Scaling
2. Use flow affinity
int main(int argc, char **argv)
{
    ...
    /* Run lcore_loop() on every lcore */
    rte_eal_mp_remote_launch(lcore_loop, ...);
    ...
}
port -> CPU 1: lcore_loop() -> port
port -> CPU 2: lcore_loop() -> port
port -> CPU 3: lcore_loop() -> port
Pros/cons?
Why DPDK?
1. High performance
○ efficient memory-management
(zero-copy, hugepages, user space)
○ efficient packet handling
(DIR-24-8 implementation, cuckoo hash)
○ efficient CPU management
(lockless poll-mode drivers, run-to-completion, NUMA awareness)
2. Simple
○ user space application
○ many examples
3. De-facto standard for dataplanes in Linux
Dataplane Application Models
Run-to-Completion Model
port -> lcore 1: RX, process, TX -> port
port -> lcore 2: RX, process, TX -> port
port -> lcore 3: RX, process, TX -> port
Pipeline Model
port -> lcore 1: RX -> lcore 2: process -> lcore 3: TX -> port
Cons/pros?
lcore?
Lcore: a logical execution unit of the processor, sometimes called a hardware thread (usually a pthread bound to a CPU core)
Source: Wikipedia
Synchronization Issues
Run-to-Completion Model
port -> lcore 1: RX, process, TX -> port
port -> lcore 2: RX, process, TX -> port
port -> lcore 3: RX, process, TX -> port
How?
Pipeline Model
port -> lcore 1: RX -> lcore 2: process -> lcore 3: TX -> port
How?
Run-to-Completion Synchronization
Run-to-Completion Model
port q1 -> lcore 1: RX, process, TX -> port q1
port q2 -> lcore 2: RX, process, TX -> port q2
port q3 -> lcore 3: RX, process, TX -> port q3
q1-q3 are hardware queues: each lcore has its own queue on each port
Cons/pros?
RSS* and Flow Affinity
port q1 -> lcore 1: RX, process, TX -> port q1
port q2 -> lcore 2: RX, process, TX -> port q2
port q3 -> lcore 3: RX, process, TX -> port q3
Why?
* Receive Side Scaling
Pipeline Model Synchronization
Pipeline Model
port q1 -> lcore 1: RX -> software queue -> lcore 2: process -> software queue -> lcore 3: TX -> port q1
Cons/pros?
DPDK Overview
User space:
● Apps running on DPDK lcores (lcore 1, lcore 2, lcore 3)
● DPDK: Classification and Scheduling; Hashing and Forwarding; CPU and Memory Management; Poll Mode Drivers
● DPDK drivers: PCAP Driver, KNI Driver, Virtio, TUN/TAP, Shared Memory
● QEMU (VM) attached via Virtio
Kernel (Linux/FreeBSD):
● igb_uio (~700 lines) or vfio-pci or uio_pci_generic
● rte_kni (vEth0)
● ixgbe (eth0), tap0, log
Hardware:
● NICs
So, Let’s Make a Simple Hub!
void lcore_loop()
{
    struct rte_mbuf *burst[32];
    uint16_t id = rte_lcore_id();
    uint16_t nb_rx;

    while (1) {
        /* Port 0 -> port 1 */
        nb_rx = rte_eth_rx_burst(0, id, burst, 32);
        rte_eth_tx_burst(1, id, burst, nb_rx);
        /* Port 1 -> port 0 */
        nb_rx = rte_eth_rx_burst(1, id, burst, 32);
        rte_eth_tx_burst(0, id, burst, nb_rx);
    }
}
App: while (1) { rx(...); tx(...); } runs between two PMDs; the NICs are reached via mmap()
DPDK Libraries Overview
rte_eal: Environment Abstraction Layer (hugepages, CPU, PCI, logs, debugs ...)
rte_malloc: hugepage-based heap
rte_ring: lockless queue, multi/single consumer/producer
rte_mempool: fixed-sized objects; uses rings to keep free objects
rte_mbuf: memory buffer; uses mempools as storage
rte_cmdline: command line interface
rte_ether: generic PMD API
rte_pmd_*: Poll Mode Drivers
rte_hash: hash library
rte_lpm: longest prefix match (DIR-24-8)
rte_net: IP-related (ARP, IP, TCP, UDP...)
rte_kni: Kernel NIC interface
rte_meter + rte_sched: QoS classifier and scheduler
rte_timer: timers management
Why?
DPDK Application Initialization
1. Init Environment Abstraction Layer (EAL):
rte_eal_init(argc, argv);
2. Allocate mempools:
rte_pktmbuf_pool_create(name, n, cache_sz, …);
3. Configure ports:
rte_eth_dev_configure(port_id, nb_rx_q, nb_tx_q, …);
4. Configure queues:
rte_eth_*_queue_setup(port_id, queue_id, nb_desc, …);
5. Launch workers (lcores):
rte_eal_mp_remote_launch(lcore_loop, …);
Arguments?
DPDK Command Line Arguments
1. Number of workers:
-c COREMASK, e.g. -c 0xf
better: -l CORELIST, e.g. -l 0-3
best: --lcores COREMAP, e.g. --lcores 0-3@0
2. Number of memory channels (optional):
-n NUM, e.g. -n 4
3. Allocate memory using hugepages (optional):
-m MB, e.g. -m 512
--socket-mem SOCKET_MB,SOCKET_MB,..., e.g. --socket-mem 384,128
4. Add virtual devices (optional):
--vdev <driver><id>[,key=val, ...], e.g. --vdev net_pcap0,iface=eth0
More command-line options: http://dpdk.org/doc/guides/testpmd_app_ug/run_app.html
DPDK Main Loop
1. Receive burst of packets:
nb_rx = rte_eth_rx_burst(port_id, queue_id, burst, BURST_SZ);
2. Process packets (app logic)
3. Send burst of packets:
nb_tx = rte_eth_tx_burst(port_id, queue_id, burst, nb_rx);
4. Free unsent buffers:
for (; nb_tx < nb_rx; nb_tx++)
    rte_pktmbuf_free(burst[nb_tx]);
DPDK CPU Management
Master lcore:
int main(int argc, char **argv) {
    rte_eal_init(argc, argv);              /* lcores 1 and 2 wait */
    /* Allocate mempools, configure ports... */
    rte_eal_mp_remote_launch(m_loop, ...); /* lcores 1 and 2 run m_loop() */
    rte_eal_mp_wait_lcore();               /* lcores 1 and 2 wait again once m_loop() returns */
}
void signal_handler(int signum)
{
    if (SIGINT == signum) stop = 1;
}
lcore 1 and lcore 2 (each):
void m_loop()
{
    while (!stop) {
        rx_burst();
        ...
        tx_burst();
    }
}
DPDK Memory Management
phys. memory: hugepages
memory zones: ring zone | mempool zone | heap zone | free
    rte_memzone_reserve(name, len, socket, size)
rte_mempool: free and allocated mbufs, private data
    rte_mempool_create(name, n, elt_size, ...)
rte_pktmbuf: priv | headroom | data | tailroom, plus a next pointer
    rte_pktmbuf_alloc(mempool)
Memory Management Libraries
rte_eal: hugepages management
    rte_memzone_reserve(name, len, socket)
rte_malloc: hugepage-based heap
    rte_malloc(type, size, align)
    rte_free(ptr)
rte_ring: lockless queue
    rte_ring_create(name, count, socket, flags)
    rte_ring_dequeue_bulk(r, table, n)
rte_mempool: fixed-sized objects
    rte_mempool_create(name, n, elt_size, ...)
    rte_mempool_get_bulk(pool, table, n)
rte_mbuf: memory buffers
    rte_pktmbuf_alloc(mempool)
*_bulk(N): N objects or nothing
*_burst(N): 0-N objects
DPDK Lockless Queue
1. Lockless
2. Fixed size queue of pointers
3. Bulk/burst enqueue/dequeue operations
rte_ring: ptr1 ptr2 ptr3 free free free
consumer: cons.head, cons.tail (dequeues ptr1, ptr2, ptr3)
producer: prod.head, prod.tail (enqueues ptr1, ptr2, ptr3)
Lockless?
Why head/tail?
Non-blocking Algorithms
Blocking — sequential access, other threads are blocked:
mutex, semaphore
Non-blocking — sequential access, other threads busy wait:
spinlocks
Lock-free — concurrent access, unpredictable number of retries:
consistency markers
Wait-free — concurrent access, predictable number of steps
Markers?
Enqueue Lock-Free Algorithm
rte_ring: ptr1 ptr2 free free free free
Ring markers: cons.head, cons.tail, prod.head, prod.tail
lcore 1 and lcore 2 each keep their own local head and next
Step 1
1. head = prod.head
2. next = head + N
Step 2
1. CAS(prod.head, head, next)
2. if (failed) goto Step 1
Predictable?
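Steps 1 and 2 can be tried out with C11 atomics. The sketch below is not the rte_ring source: the struct layout, the ring size and the function name are assumptions, and the copy and prod.tail update after the CAS are added only to make the example complete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* hypothetical; must be a power of two */
#define RING_MASK (RING_SIZE - 1)

struct ring {
    _Atomic uint32_t prod_head;  /* next slot a producer may reserve */
    _Atomic uint32_t prod_tail;  /* slots before this are visible to consumers */
    _Atomic uint32_t cons_tail;  /* slots before this have been consumed */
    void *slots[RING_SIZE];
};

/* Enqueue n pointers, all or nothing. Returns n on success, 0 if no room. */
static unsigned ring_enqueue_bulk(struct ring *r, void *const *objs, uint32_t n)
{
    uint32_t head, next;

    /* Steps 1 and 2: reserve [head, next) by moving prod_head with CAS. */
    head = atomic_load(&r->prod_head);
    do {
        uint32_t free_slots = RING_SIZE - (head - atomic_load(&r->cons_tail));
        if (n > free_slots)
            return 0;
        next = head + n;
        /* On failure, head is reloaded with the current prod_head value. */
    } while (!atomic_compare_exchange_weak(&r->prod_head, &head, next));

    /* Fill the reserved slots; no other producer owns them. */
    for (uint32_t i = 0; i < n; i++)
        r->slots[(head + i) & RING_MASK] = objs[i];

    /* Publish in order: wait until earlier producers have published theirs. */
    while (atomic_load(&r->prod_tail) != head)
        ;
    atomic_store(&r->prod_tail, next);
    return n;
}
```

The CAS loop has no fixed bound on retries under contention, which is why the algorithm is lock-free rather than wait-free.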
DPDK rte_ring Library
1. Uses memory zones to store pointers
2. rte_ring_dequeue_* to consume pointers
3. rte_ring_enqueue_* to produce pointers
rte_eal: hugepages management
    rte_memzone_reserve(name, len, socket)
rte_ring: lockless queue
    rte_ring_create(name, count, socket, flags)
    rte_ring_dequeue_bulk(r, table, n)
*_bulk(N): N objects or nothing
*_burst(N): 0-N objects
DPDK Memory Pools
1. Allocator of fixed-sized objects
2. Uses ring to store free objects
3. High-performance
memory zones: ring zone | mempool zone | heap zone | free
rte_mempool: free and allocated mbufs, private data
    rte_mempool_create(name, n, elt_size, ...)
lcore 1: RX ring, TX ring
DPDK rte_mempool Library
1. Uses ring as a queue of free objects
2. rte_mempool_get_* to allocate objects
3. rte_mempool_put_* to free objects
4. Optional per-lcore cache (Why?)
rte_eal: hugepages management
    rte_memzone_reserve(name, len, socket)
rte_ring: lockless queue
    rte_ring_create(name, count, socket, flags)
    rte_ring_dequeue_bulk(r, table, n)
rte_mempool: fixed-sized objects
    rte_mempool_create(name, n, elt_size, ...)
    rte_mempool_get_bulk(pool, table, n)
*_bulk(N): N objects or nothing
*_burst(N): 0-N objects
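The idea of a fixed-size object allocator backed by a container of free pointers can be shown without DPDK. In this sketch a plain array used as a stack stands in for the ring, there is no per-lcore cache and no thread safety, and all names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_OBJS 4      /* hypothetical number of objects */
#define OBJ_SIZE  64     /* hypothetical fixed object size in bytes */

struct pool {
    unsigned char storage[POOL_OBJS][OBJ_SIZE]; /* the objects themselves */
    void *free_objs[POOL_OBJS];                 /* pointers to free objects */
    unsigned nb_free;
};

/* Initially every object is free. */
static void pool_init(struct pool *p)
{
    for (unsigned i = 0; i < POOL_OBJS; i++)
        p->free_objs[i] = p->storage[i];
    p->nb_free = POOL_OBJS;
}

/* Bulk get: n objects or nothing. Returns 0 on success, -1 if too few free. */
static int pool_get_bulk(struct pool *p, void **table, unsigned n)
{
    if (n > p->nb_free)
        return -1;
    for (unsigned i = 0; i < n; i++)
        table[i] = p->free_objs[--p->nb_free];
    return 0;
}

/* Bulk put: return n objects to the free list. */
static void pool_put_bulk(struct pool *p, void *const *table, unsigned n)
{
    assert(p->nb_free + n <= POOL_OBJS);
    for (unsigned i = 0; i < n; i++)
        p->free_objs[p->nb_free++] = table[i];
}
```

Allocation and release only move pointers, so their cost does not depend on the object size.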
DPDK Memory Buffer
1. Fixed-size buffers
2. Chained buffers support (Why?)
3. Indirect buffers support (Why?)
rte_mempool: free and allocated mbufs, private data
    rte_pktmbuf_alloc(mempool)
rte_pktmbuf: priv | headroom | data | tailroom
next: pointer to the next mbuf, or NULL
Chained Memory Buffers
mbuf mbuf mbuf mbuf
next next next NULL
1. For jumbo frames
2. To append/prepend more than head/tailroom
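A chain is a singly linked list of segments. The sketch below uses a stripped-down, hypothetical `struct seg` rather than the real `struct rte_mbuf`, just to show how a packet larger than one buffer spans several segments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down buffer segment; not the real struct rte_mbuf. */
struct seg {
    struct seg *next;     /* next segment or NULL */
    uint16_t    data_len; /* bytes of packet data in this segment */
};

/* Walk the chain and add up the packet length. */
static uint32_t chain_len(const struct seg *s)
{
    uint32_t len = 0;
    for (; s != NULL; s = s->next)
        len += s->data_len;
    return len;
}

/* Append a segment at the end of the chain. */
static void chain_append(struct seg *head, struct seg *tail)
{
    assert(head != NULL && tail != NULL);
    while (head->next != NULL)
        head = head->next;
    head->next = tail;
}
```

A 9000-byte jumbo frame, for instance, fits into four 2048-byte segments plus one segment of 808 bytes.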
Indirect Memory Buffers
1. For fast packet “cloning”
2. For broadcasts/multicasts
Direct rte_pktmbuf: priv | headroom | data | tailroom; refcnt = 2
Indirect rte_pktmbuf: buf_addr points to the direct buffer's data; flags |= ATTACHED
DPDK rte_hash Library
1. Array of buckets
2. Fixed number of entries per bucket
3. No data, key management only (Why?)
4. 4-byte key signatures (Why?)
rte_hash: buckets of signature slots (sig or free)
int. key array: key slots (key or free)
The index of a key is used to locate its data
Data?
Cuckoo hashing?
Cuckoo hashing: a scheme for resolving hash collisions with worst-case constant lookup time
Source: Wikipedia
Collision?
Cuckoo Hashing Algorithm
1. No collision: store A at its primary hash location
2. Collision: B's primary location is taken, so use B's alt. location (double addressing)
3a. Both hashes collide for C: push the occupying hash to its own alt. location (cuckoo)
3b. Add C's hash to the vacant space
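A toy version of the insert and lookup, to make the displacement step concrete. It is not the rte_hash implementation: there is one entry per bucket, keys are stored directly instead of signatures, and the two hash functions, the table size and the kick limit are deliberately simple, hypothetical choices.

```c
#include <assert.h>
#include <stdint.h>

#define NB_BUCKETS 8u   /* hypothetical table size, power of two */
#define MAX_KICKS  16   /* hypothetical bound on displacements */
#define EMPTY      0u   /* key 0 is reserved as "free" in this sketch */

static uint32_t table[NB_BUCKETS];

/* Two hypothetical hash functions: primary and alt. location. */
static uint32_t h1(uint32_t key) { return key & (NB_BUCKETS - 1); }
static uint32_t h2(uint32_t key) { return (key >> 3) & (NB_BUCKETS - 1); }

/* Lookup probes at most two locations: constant worst-case time. */
static int cuckoo_lookup(uint32_t key)
{
    return table[h1(key)] == key || table[h2(key)] == key;
}

/* Insert; returns 0 on success, -1 if the displacement chain is too long. */
static int cuckoo_insert(uint32_t key)
{
    assert(key != EMPTY);
    if (cuckoo_lookup(key))
        return 0;
    uint32_t pos = h1(key);
    if (table[pos] != EMPTY && table[h2(key)] == EMPTY)
        pos = h2(key);                  /* primary taken: use alt. location */
    for (int i = 0; i < MAX_KICKS; i++) {
        if (table[pos] == EMPTY) {      /* vacant space: done */
            table[pos] = key;
            return 0;
        }
        /* Both taken: take the slot and push the old key elsewhere. */
        uint32_t victim = table[pos];
        table[pos] = key;
        key = victim;
        /* The victim moves to whichever of its two locations it was not in. */
        pos = (h1(key) == pos) ? h2(key) : h1(key);
    }
    return -1; /* the key left in hand is dropped; a real table would rehash */
}
```

Inserting 1 and then 9 (both with primary location 1) pushes key 1 to its alternative location 0, yet both keys are still found with at most two probes.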
Why Cuckoo Hashing?
● Fast
● Constant lookup time
● High load factor (~90%)
Cuckoo Hash Structures
Buckets (number of buckets x bucket entries). Each bucket (B) holds:
● S: key signatures []
● A: alt. signatures []
● Ki: key indexes []
Key slots (number of slots). Each slot holds:
● K: key
● D: data
Ring of free slots
DPDK Longest Prefix Match:
Why?
Recap: IP Routing
LAN2 (host IP: 2.2.2.2) - R1 - WAN - R2 - LAN1 (host IP: 1.1.1.1)
1. Host 2.2.2.2: by default, send to R1
2. R1: all 1.*.*.*, send to R2
3. R2: all 1.*.*.*, send locally
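Recap: IP Flexible Subnetting
Address 100.3.2.5 = subnet part + host part; the boundary depends on where the packet is:
● Service Provider (AS100), subnet 100.0.0.0: subnet 100., host 3.2.5
● Company 3, subnet 100.3.0.0: subnet 100.3., host 2.5
● Lab, subnet 100.3.2.0 (next to Office, 100.3.1.0): subnet 100.3.2., host 5
Path of a packet from AS200 to 100.3.2.5:
1. Route to: AS100
2. Route to: AS100, Company 3
3. Route to: AS100, Company 3, Lab
4. Deliver to: AS100, Company 3, Lab, Host 5
How?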
Recap: Router Logic
1. Receive IP packet:
○ Check Ethernet FCS
○ Remove Ethernet header
2. Decrease TTL
3. Find the best route in routing table:
○ Most specific route is the best
4. If found, send to next-hop router:
○ Destination MAC = MAC address of the next-hop gateway
5. Else, drop the packet
Network: Service Provider (subnet 100.0.0.0) > Company 3 (subnet 100.3.0.0) > Lab (100.3.2.0)
Lab Router Routing Table:
100.3.2.* -> dev eth1 (Lab), directly
*.*.*.* -> dev eth0 (Company 3), via 100.3.0.1 (Company 3 Router)
Recap: IPv4 Subnet Mask
IPv4 Address (32-bit number):
0110 0100 0000 0011 0000 0010 0000 0101
100. 3. 2. 5 (dotted decimal notation)
&
IPv4 Subnet Mask:
1111 1111 1111 1111 1111 1111 0000 0000
255. 255. 255. 0 (dotted decimal notation)
=
IPv4 Subnet:
0110 0100 0000 0011 0000 0010 0000 0000
100. 3. 2. 0 (dotted decimal notation)
Recap: IPv4 Subnet Mask Length
IPv4 Subnet Mask:
1111 1111 1111 1111 1111 1111 0000 0000
255. 255. 255. 0 (dotted decimal notation)
Subnet Mask Length = 24
Dotted Decimal Notation:
IPv4 Address: 100.3.2.5
Subnet Mask: 255.255.255.0
==
CIDR* Notation:
IPv4 Prefix: 100.3.2.5/24
* Classless Inter-Domain Routing
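The mask and prefix-length recaps translate into two one-line bit operations. A minimal sketch with hypothetical helper names, using host byte order for addresses:

```c
#include <assert.h>
#include <stdint.h>

/* Pack four dotted-decimal octets into a host-order 32-bit address. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* CIDR prefix length -> subnet mask, e.g. 24 -> 255.255.255.0. */
static uint32_t prefix_to_mask(unsigned len)
{
    assert(len <= 32);
    return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* The subnet is the address with the host bits masked off. */
static uint32_t subnet_of(uint32_t addr, unsigned len)
{
    return addr & prefix_to_mask(len);
}
```

`subnet_of(ipv4(100, 3, 2, 5), 24)` equals `ipv4(100, 3, 2, 0)`, matching the bitwise AND shown in the subnet mask recap.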
Longest Prefix Match (LPM)
Example routing table:
0.0.0.0/0 -> R1
10.0.0.0/8 -> R2
10.10.0.0/16 -> R3
Destination address 10.10.0.1 matches all three routes.
Which route to use?
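The answer is the route with the longest matching prefix. A naive linear-scan sketch over the example table makes that concrete; it is not how rte_lpm works internally, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t prefix;   /* host byte order */
    unsigned len;      /* prefix length, 0-32 */
    int      next_hop; /* 1 = R1, 2 = R2, 3 = R3 */
};

/* The example routing table from above. */
static const struct route routes[] = {
    { 0x00000000u,  0, 1 },  /* 0.0.0.0/0    -> R1 */
    { 0x0A000000u,  8, 2 },  /* 10.0.0.0/8   -> R2 */
    { 0x0A0A0000u, 16, 3 },  /* 10.10.0.0/16 -> R3 */
};

static uint32_t mask_of(unsigned len)
{
    assert(len <= 32);
    return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Among matching routes keep the one with the longest prefix.
 * Returns the next hop id, or 0 if nothing matches. */
static int lpm_lookup_naive(uint32_t dst)
{
    const struct route *best = NULL;
    for (size_t i = 0; i < sizeof(routes) / sizeof(routes[0]); i++) {
        const struct route *r = &routes[i];
        if ((dst & mask_of(r->len)) == r->prefix &&
            (best == NULL || r->len > best->len))
            best = r;
    }
    return best ? best->next_hop : 0;
}
```

For 10.10.0.1 (`0x0A0A0001`) all three routes match and the /16 wins, so the packet goes to R3.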
DPDK rte_lpm Library
IPv4:
1. 32-bit keys
2. Fixed maximum number of rules
3. LPM rule: 32-bit key + prefix len + user data (next hop)
4. DIR-24-8 algorithm (1-2 memory reads per match)
IPv6:
1. 128-bit keys
2. Similar algorithm:
24 bit + 13 x 8 bit tables = 1-14 memory reads (typically 5)
How?
DPDK rte_lpm Library: DIR-24-8
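LPM Rules (routes) are expanded into two lookup tables:
tbl24: table of 2^24 words for prefix lens 0-24, indexed by the first 24 bits of the address (0.0.0, ..., 9.255.255, 10.0.0, 10.0.1, ..., 255.255.255)
● 2^24 * 2 bytes = 32 MB table
● each word holds a next hop (R1, R2, R3, ...) or, with the "more specific routes" flag set, a tbl8 index
tbl8: table of N x 256 records for prefix lens 25-32, indexed by the last 8 bits of the address ((10.0.0.) 0, (10.0.0.) 1, ..., (10.0.0.) 255)
● N * 2 * 256 bytes table, where N is the max number of routes with prefix len > 24
Lookup: 1st read from tbl24, optional 2nd read from tbl8
Table of Next Hops: R1, R2, R3, ...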
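A reduced sketch of the two-table lookup, assuming 16-bit entries whose top bit is the "more specific routes" flag. It is not the rte_lpm code: the names, the `NO_ROUTE` value and the tbl8 capacity are hypothetical, and `lpm_add` only works if routes are added in order of increasing prefix length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TBL24_ENTRIES (1u << 24)
#define TBL8_GROUPS   4u       /* hypothetical N: groups for prefix lens > 24 */
#define EXT_FLAG      0x8000u  /* "more specific routes" flag */
#define NO_ROUTE      0x7FFFu  /* hypothetical "no next hop" value */

struct lpm {
    uint16_t *tbl24;                  /* 2^24 x 2 bytes = 32 MB */
    uint16_t  tbl8[TBL8_GROUPS][256];
    unsigned  groups_used;
};

static struct lpm *lpm_create(void)
{
    struct lpm *l = calloc(1, sizeof(*l));
    if (l == NULL)
        return NULL;
    l->tbl24 = malloc(TBL24_ENTRIES * sizeof(uint16_t));
    if (l->tbl24 == NULL) {
        free(l);
        return NULL;
    }
    for (uint32_t i = 0; i < TBL24_ENTRIES; i++)
        l->tbl24[i] = NO_ROUTE;
    return l;
}

static void lpm_free(struct lpm *l)
{
    free(l->tbl24);
    free(l);
}

/* Sketch-only restriction: add routes in order of increasing prefix length,
 * so a longer prefix simply overwrites the entries of a shorter one. */
static int lpm_add(struct lpm *l, uint32_t prefix, unsigned len, uint16_t hop)
{
    assert(len <= 32 && hop < NO_ROUTE);
    uint32_t idx = prefix >> 8;
    if (len <= 24) {                  /* expand into 2^(24-len) tbl24 words */
        uint32_t first = idx >> (24 - len) << (24 - len);
        uint32_t count = 1u << (24 - len);
        for (uint32_t i = 0; i < count; i++)
            l->tbl24[first + i] = hop;
        return 0;
    }
    if (!(l->tbl24[idx] & EXT_FLAG)) { /* allocate a tbl8 group */
        if (l->groups_used == TBL8_GROUPS)
            return -1;
        uint16_t new_group = (uint16_t)l->groups_used++;
        for (unsigned i = 0; i < 256; i++)
            l->tbl8[new_group][i] = l->tbl24[idx]; /* inherit shorter route */
        l->tbl24[idx] = (uint16_t)(EXT_FLAG | new_group);
    }
    uint16_t group = l->tbl24[idx] & 0x7FFFu;
    uint32_t low = prefix & 0xFFu;
    uint32_t first8 = low >> (32 - len) << (32 - len);
    uint32_t count8 = 1u << (32 - len);
    for (uint32_t i = 0; i < count8; i++)
        l->tbl8[group][first8 + i] = hop;
    return 0;
}

/* One memory read for prefix lens 0-24, two reads otherwise. */
static uint16_t lpm_lookup(const struct lpm *l, uint32_t ip)
{
    uint16_t e = l->tbl24[ip >> 8];               /* 1st read */
    if (e & EXT_FLAG)
        e = l->tbl8[e & 0x7FFFu][ip & 0xFFu];     /* optional 2nd read */
    return e;                                     /* next hop or NO_ROUTE */
}
```

The lookup cost is fixed no matter how many routes are installed; the price is the 32 MB tbl24 and the more expensive route updates.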
DPDK Poll Mode Drivers
1. Lock-free, therefore not thread safe
2. Based on user space IO (uio)
3. Limited number of supported NICs
4. Any interface via PCAP (slow)
DPDK Kernel NIC Interface
1. Allows user space applications to access Linux control plane
2. Allows management of DPDK ports using Linux tools
3. Interface with the kernel network stack
User space: DPDK application (lcore 1, lcore 2) on Poll Mode Drivers and KNI Driver; Linux tools such as ping
Kernel (Linux/FreeBSD): igb_uio; rte_kni exposing vEth0
Hardware: NIC
DPDK Thread Safety
1. Thread unsafe: all performance sensitive functions
hash, LPM, PMD...
2. Multi-threaded: performance insensitive
malloc, memzone...
3. Fast and thread safe: rings (lockless queues) and mempools
DPDK Performance Tips
1. Never use libc or the Linux API:
malloc -> rte_malloc
printf -> RTE_LOG
2. Avoid cache misses / false sharing by using per-lcore variables
Example: port statistics
3. Allocate memory on the local NUMA socket
4. Use rings for inter-core communication
5. Use burst mode in PMDs
6. Help the branch predictor: use likely()/unlikely()
7. Prefetch data into cache with rte_prefetchX()
Why?
False sharing?
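Tips 2 and 6 can be illustrated in plain C11. The sketch assumes a 64-byte cache line and GCC/Clang (for `__builtin_expect`); the macros are defined locally here, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 64            /* assumed cache line size */
#define MAX_LCORES 4             /* hypothetical */
#define likely(x)   __builtin_expect(!!(x), 1)   /* GCC/Clang builtin */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* One counter block per lcore, each on its own cache line, so that cores
 * updating their own counters never write to the same line (false sharing). */
struct lcore_stats {
    alignas(CACHE_LINE) uint64_t rx_packets;
    uint64_t dropped;
};

static struct lcore_stats stats[MAX_LCORES];

/* Called by an lcore for each packet; touches only that lcore's block. */
static void count_packet(unsigned lcore, int bad)
{
    assert(lcore < MAX_LCORES);
    if (unlikely(bad))           /* errors are the rare path */
        stats[lcore].dropped++;
    else
        stats[lcore].rx_packets++;
}

/* Totals are summed only when somebody asks for them. */
static uint64_t total_rx(void)
{
    uint64_t sum = 0;
    for (unsigned i = 0; i < MAX_LCORES; i++)
        sum += stats[i].rx_packets;
    return sum;
}
```

Without the alignment, counters of neighbouring lcores could share a cache line, and every increment on one core would invalidate that line on the others.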
DPDK Checklist
1. What is DPDK?
2. Performance challenges?
3. DPDK application command line options?
4. Application models?
5. Flow affinity?
6. Lockless queue?
7. Indirect buffers?
8. Cuckoo hash?
9. Longest Prefix match?
10. KNI?
11. Performance tips?
References
1. DPDK Programmer’s Guide: http://dpdk.org/doc/guides/prog_guide/
2. Alex Kogan, Erez Petrank. Wait-Free Queues With Multiple Enqueuers and Dequeuers, 2011
3. Michael, Scott. Simple, fast and practical non-blocking and blocking concurrent queue algorithms, 1996.

Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 

Dernier (20)

How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 

Network Programming: Data Plane Development Kit (DPDK)

  • 1. Network Programming: Data Plane Development Kit. Andriy Berestovskyy (Андрій Берестовський), 2014. Semihalf networking hour.
  • 2. Network Programming Series: (1) Berkeley Sockets; (2) Data Plane Development Kit (DPDK), this talk; (3) Open Data Plane (ODP).
  • 3. Let's Make a 10 Gbit/s Switch! 1. Receive frame, check Ethernet FCS 2. Add/update source MAC in MAC table 3. If multicast bit is set: forward to all ports but the source 4. If destination is in MAC table: forward to the specific port 5. Else, forward to all ports
  • 4. Let's Make a Simple Hub (Instead of a Switch)! 1. Receive frame, check Ethernet FCS 2. Add/update source MAC in MAC table 3. If multicast bit is set: forward to all ports but the source 4. If destination is in MAC table: forward to the specific port 5. Else, forward to another port. But still, 10 Gbit/s!
  • 5. Recap: Ethernet Frame Format. On the wire, in order: Preamble, 7 octets; Start of Frame, 1 octet; Destination MAC, 6 octets; Source MAC, 6 octets; optional 802.1Q (VLAN) tag, 4 octets; Ethertype, 2 octets; Payload, 46-1500 octets; Frame Check Sequence, 4 octets; Interframe gap, 12 octets. Frame from header to trailer: 64-1522 octets; including preamble and start of frame: 72-1530 octets. Source: https://en.wikipedia.org/wiki/Ethernet_frame
  • 6. Performance Challenges. Minimum Ethernet frame size on the wire: preamble + start of frame + minimum frame + interframe gap = 7 + 1 + 64 + 12 = 84 octets (84 * 8 = 672 bits). Maximum number of frames on a 1 Gbps link: 1 Gbps / 672 bits = 1 488 095 pps. Maximum number of frames on a 10 Gbps link: 10 Gbps / 672 bits = 14.88 Mpps.
  • 7. Ethernet vs CPU. Skylake Intel® Xeon® Processor E3-1280 v5, 3.7 GHz. CPU budget per packet = CPU frequency / packets per second = 3.7 GHz / 14.88 Mpps = 249 cycles. Is 249 CPU cycles per packet a lot?
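The arithmetic on slides 6 and 7 can be reproduced with a few lines of C. This is only a sketch of the two formulas; the function and constant names are made up for illustration and are not part of DPDK.

```c
#include <stdint.h>

/* Octets each frame occupies on the wire beyond the frame itself:
 * 7 preamble + 1 start of frame + 12 interframe gap. */
enum { WIRE_OVERHEAD_OCTETS = 7 + 1 + 12 };

/* Maximum frames per second on a link of link_bps bits per second,
 * for frames of frame_octets octets (header through FCS). */
static uint64_t max_frames_per_second(uint64_t link_bps, uint32_t frame_octets)
{
    uint64_t bits_per_frame = (uint64_t)(frame_octets + WIRE_OVERHEAD_OCTETS) * 8;
    return link_bps / bits_per_frame;
}

/* CPU cycles available per packet, rounded to the nearest cycle. */
static uint64_t cycles_per_packet(uint64_t cpu_hz, uint64_t pps)
{
    return (cpu_hz + pps / 2) / pps;
}
```

For 64-octet frames on a 10 Gbps link, `max_frames_per_second(10000000000ULL, 64)` gives 14 880 952 frames per second, and `cycles_per_packet(3700000000ULL, 14880952)` gives the 249 cycles quoted on the slide. Larger frames relax the budget considerably.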
  • 8. Recap: Xeon Memory Hierarchy. Per core: registers; 32 KB L1, ~4 cycles; 256 KB L2, ~10 cycles. Per package: 36 MB LLC, ~30 cycles. Memory: up to 1.5 TB DDR4, ~200 cycles, 60 GB/s. Source: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
  • 9. Ethernet vs Memory. 249 cycles per packet is: 249 cycles / 4 cycles = 62 L1 cache reads; 249 cycles / 10 cycles = 24 L2 cache reads; 249 cycles / 30 cycles = 8 L3 (LLC) cache reads; 249 cycles / 200 cycles = 1 DDR4 read :(
  • 10. What is Wrong With the Kernel? 1. Interrupts 2. Context switches 3. User-kernel space data copying 4. Kernel development is not easy
  • 12. Data Plane Development Kit: a set of user space dataplane libraries and NIC drivers for fast packet processing (Wikipedia). Dataplane?
  • 13. Dataplane: the part of the architecture that decides what to do with packets arriving on an inbound interface (Wikipedia).
  • 14. What is DPDK. 1. Set of user space libraries 2. Set of user space drivers with direct NIC access 3. Support for Network Functions Virtualization 4. Open source, BSD licensed. How?
  • 15. What DPDK is Not. 1. Not a TCP/IP stack 2. Not a Berkeley socket API 3. Not a ready-to-use solution, i.e. neither a router nor a switch
  • 16. Fitting Into 249 Cycles: Polling. Poll for new packets, do not wait for an interrupt: 1. Interrupts are out of the budget 2. Avoid context switches. void lcore_loop() { while (1) { ... } } Pros/cons?
  • 17. Fitting Into 249 Cycles: Bursts. Process a few packets at a time (a burst), not one by one: 1. Amortize slow memory reads per burst 2. Increase cache hit ratio: do the same operation for a few packets at once. void lcore_loop() { struct rte_mbuf *burst[32]; uint16_t nb_rx; while (1) { nb_rx = rte_eth_rx_burst(0, 0, burst, 32); ... rte_eth_tx_burst(1, 0, burst, nb_rx); } } Pros/cons?
  • 18. Fitting Into 249 Cycles: Hugepages. Virtual-to-physical address translation goes through the Translation Lookaside Buffer (TLB): 1. Haswell first-level DTLB for 4K pages: 64 entries, 4-way (~4 cycles) 2. Haswell second-level TLB: 1024 entries, 8-way (~10 cycles). Use 2 MB or 1 GB hugepages to reduce TLB misses: GRUB_CMDLINE_LINUX_DEFAULT="... default_hugepagesz=1G hugepagesz=1G hugepages=4" # mount -t hugetlbfs nodev /mnt/huge More on system requirements: http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html
  • 19. Fitting Into 249 Cycles: Multicore. Use several CPU cores to process packets: 1. Use Receive Side Scaling 2. Use flow affinity. int main() { ... rte_eal_mp_remote_launch(lcore_loop, ...); ... } Diagram: CPU 1, CPU 2 and CPU 3 each run lcore_loop() between the two ports. Pros/cons?
  • 20. Why DPDK? 1. High performance: efficient memory management (zero-copy, hugepages, user space); efficient packet handling (DIR-24-8 implementation, cuckoo hash); efficient CPU management (lockless poll-mode drivers, run-to-completion, NUMA awareness). 2. Simple: user space application; many examples. 3. De facto standard for dataplanes in Linux.
  • 21. Dataplane Application Models. Pipeline model: port -> lcore 1: RX -> lcore 2: process -> lcore 3: TX -> port. Run-to-completion model: lcore 1, lcore 2 and lcore 3 each do RX, process, TX between the ports. Pros/cons? Lcore?
  • 22. Lcore: a logical execution unit of the processor, sometimes called a hardware thread (usually a pthread bound to a CPU core) (Wikipedia).
  • 23. Synchronization Issues. Pipeline model: lcore 1: RX, lcore 2: process, lcore 3: TX. How? Run-to-completion model: lcore 1, lcore 2 and lcore 3 each do RX, process, TX on the same ports. How?
  • 24. Run-to-Completion Synchronization. Each port exposes hardware queues q1, q2, q3, one per lcore; lcore 1, lcore 2 and lcore 3 each do RX, process, TX on their own queue. Pros/cons?
  • 25. RSS (Receive Side Scaling) and Flow Affinity. The ports spread packets across hardware queues q1, q2, q3, each served by one lcore doing RX, process, TX. Why?
  • 26. Pipeline Model Synchronization. port q1 -> lcore 1: RX -> software queue -> lcore 2: process -> software queue -> lcore 3: TX -> port q1. Pros/cons?
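Flow affinity means that every packet of a flow lands on the same queue, and therefore on the same lcore, so per-flow state needs no locking. The sketch below illustrates the idea in software. It is an assumption-laden stand-in: real NICs compute RSS in hardware (typically a Toeplitz hash plus an indirection table), while this example uses a plain FNV-1a hash, and `struct flow_key`, `flow_hash` and `flow_to_queue` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical flow key: the classic 5-tuple. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Mix one 32-bit word into an FNV-1a hash, byte by byte. */
static uint32_t fnv1a_u32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 16777619u;
    }
    return h;
}

/* Hash the whole 5-tuple (stand-in for the NIC's RSS hash). */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;
    h = fnv1a_u32(h, k->src_ip);
    h = fnv1a_u32(h, k->dst_ip);
    h = fnv1a_u32(h, ((uint32_t)k->src_port << 16) | k->dst_port);
    h = fnv1a_u32(h, k->proto);
    return h;
}

/* Same flow key -> same queue index -> same lcore. */
static uint16_t flow_to_queue(const struct flow_key *k, uint16_t nb_queues)
{
    return (uint16_t)(flow_hash(k) % nb_queues);
}
```

Because the mapping depends only on the key, two packets with identical 5-tuples always reach the same lcore. Note that this simple hash does not map the two directions of a connection to the same queue.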
  • 27. DPDK Overview (diagram). Hardware: NICs. Kernel (Linux/FreeBSD): igb_uio (~700 lines), vfio-pci or uio_pci_generic; ixgbe (eth0); TUN/TAP (tap0); rte_kni (vEth0). User space: the DPDK application on lcore 1, lcore 2, lcore 3, built on Poll Mode Drivers, CPU and memory management, hashing and forwarding, classification and scheduling, and logging; KNI, PCAP and Virtio drivers; QEMU (VM) attached through Virtio and shared memory; regular apps.
  • 28. So, Let's Make a Simple Hub! void lcore_loop() { struct rte_mbuf *burst[32]; uint16_t nb_rx; uint16_t id = rte_lcore_id(); while (1) { nb_rx = rte_eth_rx_burst(0, id, burst, 32); rte_eth_tx_burst(1, id, burst, nb_rx); nb_rx = rte_eth_rx_burst(1, id, burst, 32); rte_eth_tx_burst(0, id, burst, nb_rx); } } Diagram: the app loop, while (1) { rx(...); tx(...); }, and the PMDs live in DPDK user space; the kernel is involved only through mmap().
  • 29. DPDK Libraries Overview. rte_eal: Environment Abstraction Layer (hugepages, CPU, PCI, logs, debugs ...); rte_malloc: hugepage-based heap; rte_ring: lockless queue, multi/single consumer/producer; rte_mempool: fixed-sized objects, uses rings to keep free objects; rte_mbuf: memory buffer, uses mempools as storage; rte_cmdline: command line interface; rte_ether: generic PMD API; rte_hash: hash library; rte_kni: kernel NIC interface; rte_meter + rte_sched: QoS classifier and scheduler; rte_pmd_*: Poll Mode Drivers; rte_timer: timers management; rte_lpm: longest prefix match (DIR-24-8); rte_net: IP-related (ARP, IP, TCP, UDP...). Why?
  • 30. DPDK Application Initialization. 1. Init Environment Abstraction Layer (EAL): rte_eal_init(argc, argv); 2. Allocate mempools: rte_pktmbuf_pool_create(name, n, cache_sz, …); 3. Configure ports: rte_eth_dev_configure(port_id, nb_rx_q, nb_tx_q, …); 4. Configure queues: rte_eth_*_queue_setup(port_id, queue_id, nb_desc, …); 5. Launch workers (lcores): rte_eal_mp_remote_launch(lcore_loop, …); Arguments?
  • 31. DPDK Command Line Arguments. 1. Number of workers: -c COREMASK, e.g. -c 0xf; better: -l CORELIST, e.g. -l 0-3; best: --lcores COREMAP, e.g. --lcores 0-3@0. 2. Number of memory channels (optional): -n NUM, e.g. -n 4. 3. Amount of hugepage memory to allocate (optional): -m MB, e.g. -m 512; or per socket: --socket-mem SOCKET_MB,SOCKET_MB,..., e.g. --socket-mem 384,128. 4. Add virtual devices (optional): --vdev <driver><id>[,key=val, ...], e.g. --vdev net_pcap0,iface=eth0. More command-line options: http://dpdk.org/doc/guides/testpmd_app_ug/run_app.html
  • 32. DPDK Main Loop. 1. Receive burst of packets: nb_rx = rte_eth_rx_burst(port_id, queue_id, burst, BURST_SZ); 2. Process packets (app logic) 3. Send burst of packets: nb_tx = rte_eth_tx_burst(port_id, queue_id, burst, nb_rx); 4. Free unsent buffers: for (; nb_tx < nb_rx; nb_tx++) rte_pktmbuf_free(burst[nb_tx]);
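The `-c 0xf` and `-l 0-3` examples select the same four cores: a core list is just a readable spelling of the bitmask. The helper below shows the correspondence. It is an illustrative sketch, not the EAL's own parser, and `corelist_to_mask` is a hypothetical name; it handles only plain lists such as `0-3` or `0,2,4-5` with core ids up to 63.

```c
#include <stdint.h>
#include <stdlib.h>

/* Convert a core list such as "0-3" or "0,2,4-5" into a bitmask.
 * Returns 0 for a malformed list or a core id above 63. */
static uint64_t corelist_to_mask(const char *list)
{
    uint64_t mask = 0;
    const char *p = list;

    while (*p != '\0') {
        char *end;
        unsigned long first = strtoul(p, &end, 10);
        unsigned long last = first;

        if (end == p)
            return 0;               /* no digits where a core id was expected */
        p = end;
        if (*p == '-') {            /* range "first-last" */
            p++;
            last = strtoul(p, &end, 10);
            if (end == p)
                return 0;
            p = end;
        }
        if (first > last || last > 63)
            return 0;
        for (unsigned long c = first; c <= last; c++)
            mask |= (uint64_t)1 << c;
        if (*p == ',')
            p++;                    /* next list element */
        else if (*p != '\0')
            return 0;               /* unexpected character */
    }
    return mask;
}
```

`corelist_to_mask("0-3")` returns `0xf`, matching the `-c 0xf` example on the slide.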
  • 33. DPDK CPU Management. Master lcore: int main() { rte_eal_init(argc, argv); /* allocate mempools, configure ports... */ rte_eal_mp_remote_launch(m_loop, ...); rte_eal_mp_wait_lcore(); } lcore 1 and lcore 2 wait until launched, then each runs: void m_loop() { while (!stop) { rx_burst(); ... tx_burst(); } } Shutdown: void signal_handler(int signum) { if (SIGINT == signum) stop = 1; }
  • 34. DPDK Memory Management. Physical memory (hugepages) is divided into memory zones (mempool zone, ring zone, heap zone, free) with rte_memzone_reserve(name, len, socket, flags). rte_mempool_create(name, n, elt_size, ...) builds an rte_mempool of mbufs (in use or free, plus a private area) inside a zone. rte_pktmbuf_alloc(mempool) returns an rte_pktmbuf laid out as priv, headroom, data, tailroom, with a next pointer.
  • 35. Memory Management Libraries. rte_eal: hugepages management, rte_memzone_reserve(name, len, socket); rte_malloc: hugepage-based heap, rte_malloc(type, size, align), rte_free(ptr); rte_ring: lockless queue, rte_ring_create(name, count, socket, flags), rte_ring_dequeue_bulk(r, table, n); rte_mempool: fixed-sized objects, rte_mempool_create(name, n, elt_size, ...), rte_mempool_get_bulk(pool, table, n); rte_mbuf: memory buffers, rte_pktmbuf_alloc(mempool). *_bulk(N): N objects or nothing; *_burst(N): 0-N objects.
  • 36. DPDK Lockless Queue. 1. Lockless 2. Fixed-size queue of pointers 3. Bulk/burst enqueue/dequeue operations. Diagram: the rte_ring holds pointers (ptr1, ptr2, ptr3) and free slots; the consumer works between cons.head and cons.tail, the producer between prod.head and prod.tail. Lockless? Why head/tail?
  • 37. Non-blocking Algorithms. Blocking: sequential access, other threads are blocked (mutex, semaphore). Non-blocking: sequential access, other threads busy wait (spinlocks). Lock-free: concurrent access, unpredictable number of retries (consistency markers). Wait-free: concurrent access, predictable number of steps. Markers?
  • 38. Enqueue Lock-Free Algorithm. Step 1: 1. head = prod.head 2. next = head + N. Step 2: 1. CAS(prod.head, head, next) 2. if (failed) goto Step 1. Diagram: lcore 1 and lcore 2 both enqueue into an rte_ring holding ptr1, ptr2 and free slots; each lcore keeps its own local head and next. Predictable?
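The two steps on slide 38 reserve slots; the enqueue then has to copy the pointers and publish them by advancing prod.tail. The C11 sketch below puts these parts together. It follows the head/tail scheme described on the slides but is not DPDK's rte_ring code: `struct ring`, `ring_init` and `ring_mp_enqueue_bulk` are hypothetical names, and the default sequentially consistent atomics are used for simplicity.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1)

struct ring {
    _Atomic uint32_t prod_head, prod_tail;
    _Atomic uint32_t cons_head, cons_tail;
    void *slots[RING_SIZE];
};

static void ring_init(struct ring *r)
{
    atomic_init(&r->prod_head, 0);
    atomic_init(&r->prod_tail, 0);
    atomic_init(&r->cons_head, 0);
    atomic_init(&r->cons_tail, 0);
}

/* Multi-producer enqueue of n pointers or nothing ("bulk" semantics).
 * Returns n on success, 0 if there is not enough free space. */
static unsigned ring_mp_enqueue_bulk(struct ring *r, void *const *objs, unsigned n)
{
    uint32_t head, next;

    do {
        /* Step 1: snapshot the producer head and compute the new head. */
        head = atomic_load(&r->prod_head);
        uint32_t free_slots = RING_SIZE - (head - atomic_load(&r->cons_tail));
        if (n > free_slots)
            return 0;
        next = head + n;
        /* Step 2: claim [head, next) with CAS; retry if another producer won. */
    } while (!atomic_compare_exchange_weak(&r->prod_head, &head, next));

    /* The claimed slots belong to this producer only: copy the pointers. */
    for (unsigned i = 0; i < n; i++)
        r->slots[(head + i) & RING_MASK] = objs[i];

    /* Publish in order: wait until earlier producers have moved the tail. */
    while (atomic_load(&r->prod_tail) != head)
        ;
    atomic_store(&r->prod_tail, next);
    return n;
}
```

The retry loop is why the slide asks "Predictable?": under contention the number of CAS attempts is unbounded, so the algorithm is lock-free rather than wait-free. Keeping separate head and tail indexes lets a producer reserve space first and make the data visible to consumers only after the copy is complete.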
  • 39. DPDK rte_ring Library. 1. Uses memory zones to store pointers 2. rte_ring_dequeue_* to consume pointers 3. rte_ring_enqueue_* to produce pointers. Built on rte_eal (hugepages management, rte_memzone_reserve(name, len, socket)). API: rte_ring_create(name, count, socket, flags), rte_ring_dequeue_bulk(r, table, n). *_bulk(N): N objects or nothing; *_burst(N): 0-N objects.
  • 40. DPDK Memory Pools. 1. Allocator of fixed-sized objects 2. Uses a ring to store free objects 3. High-performance. Diagram: rte_mempool_create(name, n, elt_size, ...) builds the rte_mempool (mbufs, free slots, private area) in a mempool memory zone; lcore 1 uses its mbufs for the RX ring and TX ring.
  • 41. DPDK rte_mempool Library. 1. Uses a ring as a queue of free objects 2. rte_mempool_get_* to allocate objects 3. rte_mempool_put_* to free objects 4. Optional per-lcore cache. Built on rte_eal (rte_memzone_reserve(name, len, socket)) and rte_ring (rte_ring_create(name, count, socket, flags), rte_ring_dequeue_bulk(r, table, n)). API: rte_mempool_create(name, n, elt_size, ...), rte_mempool_get_bulk(pool, table, n). *_bulk(N): N objects or nothing; *_burst(N): 0-N objects. Why?
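The pool idea, fixed-size objects handed out from a container of free pointers with "N objects or nothing" bulk semantics, fits in a few lines. The sketch below is single-threaded and uses a plain array as the free-object container where DPDK uses a ring; `struct pool` and the `pool_*` functions are hypothetical names, not the rte_mempool API.

```c
#include <assert.h>

#define POOL_OBJS 4
#define OBJ_SIZE  64

struct pool {
    unsigned char storage[POOL_OBJS][OBJ_SIZE]; /* the fixed-size objects */
    void *free_objs[POOL_OBJS];   /* stands in for the ring of free objects */
    unsigned nb_free;
};

/* Put every object on the free list. */
static void pool_init(struct pool *p)
{
    for (unsigned i = 0; i < POOL_OBJS; i++)
        p->free_objs[i] = p->storage[i];
    p->nb_free = POOL_OBJS;
}

/* Bulk get: n objects or nothing. Returns 0 on success, -1 otherwise. */
static int pool_get_bulk(struct pool *p, void **table, unsigned n)
{
    if (n > p->nb_free)
        return -1;                /* leave the pool untouched on failure */
    for (unsigned i = 0; i < n; i++)
        table[i] = p->free_objs[--p->nb_free];
    return 0;
}

/* Bulk put: return n objects previously taken from this pool. */
static void pool_put_bulk(struct pool *p, void *const *table, unsigned n)
{
    assert(p->nb_free + n <= POOL_OBJS);
    for (unsigned i = 0; i < n; i++)
        p->free_objs[p->nb_free++] = table[i];
}
```

Allocation and release are just pointer moves, with no searching and no fragmentation, which is what makes a pool cheap enough for the per-packet path. A per-lcore cache serves the same purpose as this local array: most gets and puts never touch the shared ring.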
  • 42. DPDK Memory Buffer. 1. Fixed-size buffers 2. Chained buffers support 3. Indirect buffers support. Diagram: rte_pktmbuf_alloc(mempool) takes a free mbuf from the rte_mempool; the rte_pktmbuf is laid out as priv, headroom, data, tailroom, with a next pointer (next mbuf or NULL). Why? Why?
  • 43. Chained Memory Buffers. mbuf -> next -> mbuf -> next -> mbuf -> next -> mbuf -> NULL. 1. For jumbo frames 2. To append/prepend more than head/tailroom
  • 44. Indirect Memory Buffers. 1. For fast packet "cloning" 2. For broadcasts/multicasts. Diagram: the direct rte_pktmbuf (priv, headroom, data, tailroom) has refcnt 2; the indirect rte_pktmbuf has its buf_addr pointing at the direct buffer's data and flags |= ATTACHED.
  • 45. DPDK rte_hash Library. 1. Array of buckets 2. Fixed number of entries per bucket 3. No data, key management only 4. 4-byte key signatures. Diagram: rte_hash buckets hold signatures (sig) or free entries; an internal key array holds the keys; a lookup yields an index that the application uses to find its own data. Why? Why? Data?
  • 47. Cuckoo hashing: a scheme for resolving hash collisions with worst-case constant lookup time (Wikipedia). Collision?
  • 48. Cuckoo Hashing Algorithm. 1. No collision: add the hash to a vacant space. 2. Primary hash collision: use the alternative location (double addressing). 3. Both hashes collide: push the existing hash to its alternative location (3a), then add the new hash to the vacated space (3b) (cuckoo).
  • 49. Why Cuckoo Hashing? ● Fast ● Constant lookup time ● High load factor (~90%)
  • 50. Cuckoo Hash Structures (diagram). Buckets: number of buckets x bucket entries; each bucket holds key signatures [], alt. signatures [] and key indexes []. Key slots: number of slots, each holding a key and its data. Ring of free slots.
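The three cases on slide 48 can be tried out with a toy table. This is a minimal sketch of cuckoo hashing in general, not of rte_hash: it stores one 32-bit key per bucket, uses two arbitrary multiplicative hash functions chosen for illustration, and reserves `UINT32_MAX` as the "free" marker. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUCKET_BITS 4u
#define NB_BUCKETS  (1u << BUCKET_BITS)
#define MAX_KICKS   32
#define EMPTY       UINT32_MAX      /* reserved key value marking a free bucket */

struct cuckoo { uint32_t keys[NB_BUCKETS]; };

/* Two illustrative hash functions: primary and alternative location. */
static uint32_t hash1(uint32_t k) { return (k * 2654435761u) >> (32 - BUCKET_BITS); }
static uint32_t hash2(uint32_t k) { return ((k ^ 0x9e3779b9u) * 2246822519u) >> (32 - BUCKET_BITS); }

static void cuckoo_init(struct cuckoo *t)
{
    for (unsigned i = 0; i < NB_BUCKETS; i++)
        t->keys[i] = EMPTY;
}

/* Lookup probes at most two buckets: constant worst-case time. */
static bool cuckoo_lookup(const struct cuckoo *t, uint32_t key)
{
    return t->keys[hash1(key)] == key || t->keys[hash2(key)] == key;
}

/* Returns false if no place was found within MAX_KICKS evictions; in that
 * case the last evicted key is lost (a real table would grow or rehash). */
static bool cuckoo_insert(struct cuckoo *t, uint32_t key)
{
    assert(key != EMPTY);
    if (cuckoo_lookup(t, key))
        return true;

    uint32_t pos = hash1(key);
    /* Case 2: primary location taken but alternative free: use the alternative. */
    if (t->keys[pos] != EMPTY && t->keys[hash2(key)] == EMPTY)
        pos = hash2(key);

    for (int kick = 0; kick < MAX_KICKS; kick++) {
        if (t->keys[pos] == EMPTY) {        /* case 1: vacant space */
            t->keys[pos] = key;
            return true;
        }
        /* Case 3: evict the occupant and move it to its other location. */
        uint32_t victim = t->keys[pos];
        t->keys[pos] = key;
        key = victim;
        pos = (hash1(key) == pos) ? hash2(key) : hash1(key);
    }
    return false;
}
```

Insertion may move several keys, but a lookup never inspects more than two buckets, which is the property that matters on the packet path.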
  • 51. DPDK Longest Prefix Match: Why?
  • 52. Recap: IP Routing (diagram). LAN1 (IP: 1.1.1.1) and LAN2 (IP: 2.2.2.2) are connected through routers R1 and R2 and a WAN. Routing rules shown: by default, send to R1; all 1.*.*.*, send to R2; all 1.*.*.*, send locally.
  • 53. Recap: IP Flexible Subnetting. Address 100.3.2.5 is resolved step by step: 100: route to AS100 (service provider, subnet 100.0.0.0); 100.3: route to AS100, Company 3 (subnet 100.3.0.0); 100.3.2: route to AS100, Company 3, Lab (100.3.2.0; the Office is 100.3.1.0); 100.3.2.5: deliver to AS100, Company 3, Lab, Host 5. A neighbouring AS200 is also shown. How?
  • 54. Recap: Router Logic. 1. Receive IP packet: check Ethernet FCS; remove Ethernet header 2. Decrease TTL 3. Find the best route in the routing table: the most specific route is the best 4. If found, send to the next-hop router: destination MAC = MAC address of the next-hop gateway 5. Else, drop the packet. Example: Lab Router (Service Provider subnet 100.0.0.0, Company 3 subnet 100.3.0.0, Lab 100.3.2.0). Routing table: 100.3.2.* -> dev eth1 (Lab), directly; *.*.*.* -> dev eth0 (Company 3), via 100.3.0.1 (Company 3 Router)
  • 55. Recap: IPv4 Subnet Mask. IPv4 address (32-bit number): 0110 0100 0000 0011 0000 0010 0000 0101 = 100.3.2.5 in dotted decimal notation. IPv4 subnet mask: 1111 1111 1111 1111 1111 1111 0000 0000 = 255.255.255.0. Address & mask = IPv4 subnet: 0110 0100 0000 0011 0000 0010 0000 0000 = 100.3.2.0.
  • 56. Recap: IPv4 Subnet Mask Length. Subnet mask 1111 1111 1111 1111 1111 1111 0000 0000 = 255.255.255.0 has length 24. Dotted decimal notation: IPv4 address 100.3.2.5, subnet mask 255.255.255.0. Equivalent CIDR (Classless Inter-Domain Routing) notation: IPv4 prefix 100.3.2.5/24.
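The mask arithmetic on slides 55 and 56 maps directly onto bit operations. The helpers below are a small sketch with made-up names; addresses are kept as host-order 32-bit integers.

```c
#include <stdint.h>

/* Build a host-order IPv4 address from its dotted decimal octets. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Subnet mask for a prefix length of 0..32 (CIDR notation). */
static uint32_t prefix_to_mask(unsigned len)
{
    /* Shifting a 32-bit value by 32 is undefined, so handle /0 separately. */
    return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Subnet that an address belongs to: address AND mask. */
static uint32_t subnet_of(uint32_t addr, unsigned len)
{
    return addr & prefix_to_mask(len);
}
```

With these, `prefix_to_mask(24)` equals `ipv4(255, 255, 255, 0)` and `subnet_of(ipv4(100, 3, 2, 5), 24)` equals `ipv4(100, 3, 2, 0)`, as on the slides.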
  • 57. Longest Prefix Match (LPM). Example routing table: 0.0.0.0/0 -> R1; 10.0.0.0/8 -> R2; 10.10.0.0/16 -> R3. Destination address 10.10.0.1 matches all three routes. Which route to use?
  • 58. DPDK rte_lpm Library. IPv4: 1. 32-bit keys 2. Fixed maximum number of rules 3. LPM rule: 32-bit key + prefix length + user data (next hop) 4. DIR-24-8 algorithm (1-2 memory reads per match). IPv6: 1. 128-bit keys 2. Similar algorithm: 24-bit + 13 x 8-bit tables = 1-14 memory reads (typically 5). How?
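A straightforward way to answer the question on slide 57 is to scan every route and keep the match with the longest prefix. The sketch below does exactly that for the example table. It is a naive reference version for illustration, far too slow for a dataplane, with hypothetical names and next hops encoded as small integers (1 for R1, 2 for R2, 3 for R3).

```c
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t prefix;    /* host-order network address */
    uint8_t  len;       /* prefix length, 0..32 */
    int      next_hop;  /* 1 = R1, 2 = R2, 3 = R3 in the example */
};

/* Example routing table from the slide. */
static const struct route example_routes[] = {
    { 0x00000000u,  0, 1 },   /* 0.0.0.0/0    -> R1 */
    { 0x0a000000u,  8, 2 },   /* 10.0.0.0/8   -> R2 */
    { 0x0a0a0000u, 16, 3 },   /* 10.10.0.0/16 -> R3 */
};

/* Return the next hop of the most specific matching route, or -1 if none. */
static int lpm_lookup_naive(const struct route *tbl, size_t n, uint32_t dst)
{
    int best_hop = -1;
    int best_len = -1;

    for (size_t i = 0; i < n; i++) {
        uint32_t mask = tbl[i].len == 0 ? 0 : ~(uint32_t)0 << (32 - tbl[i].len);
        if ((dst & mask) == tbl[i].prefix && tbl[i].len > best_len) {
            best_len = tbl[i].len;
            best_hop = tbl[i].next_hop;
        }
    }
    return best_hop;
}
```

For destination 10.10.0.1 (`0x0a0a0001`) all three routes match and the /16 wins, so the lookup returns 3 (R3). The cost grows linearly with the number of routes, which motivates a table-driven scheme with a bounded number of memory reads.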
  • 59. DPDK rte_lpm Library: DIR-24-8. LPM rules (routes) are expanded into two tables that point into the table of next hops (R1, R2, R3). tbl24: 2^24 entries for prefix lengths 0-24, indexed by the top 24 bits of the address (0.0.0 ... 9.255.255, 10.0.0, 10.0.1 ... 255.255.255); 2^24 * 2 bytes = 32 MB. tbl8: groups of 256 entries for prefix lengths 25-32, indexed by the last 8 bits ((10.0.0.)0, (10.0.0.)1 ... (10.0.0.)255); N * 2 * 256 bytes, where N is the maximum number of routes with prefix length > 24. A flag in a tbl24 entry signals that more specific routes exist. Lookup: 1st read in tbl24, optional 2nd read in tbl8.
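The two-table lookup can be sketched in well under a hundred lines. The code below illustrates the DIR-24-8 idea only and is not the rte_lpm implementation: the entry layout (bit 15 as the "more specific routes" flag, 15 bits of next hop or tbl8 group index) is an assumption, next hop 0 is reserved for "no route", and routes must be added in order of increasing prefix length so that more specific routes overwrite less specific ones. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TBL24_ENTRIES (1u << 24)
#define TBL8_GROUP    256u
#define EXT_FLAG      0x8000u   /* entry holds a tbl8 group index, not a next hop */

struct lpm {
    uint16_t *tbl24;            /* 2^24 entries x 2 bytes = 32 MB */
    uint16_t *tbl8;             /* max_groups x 256 entries */
    unsigned  max_groups, used_groups;
};

static struct lpm *lpm_create(unsigned max_groups)
{
    if (max_groups == 0 || max_groups >= EXT_FLAG)
        return NULL;
    struct lpm *l = calloc(1, sizeof(*l));
    if (l == NULL)
        return NULL;
    l->tbl24 = calloc(TBL24_ENTRIES, sizeof(uint16_t));
    l->tbl8 = calloc((size_t)max_groups * TBL8_GROUP, sizeof(uint16_t));
    l->max_groups = max_groups;
    if (l->tbl24 == NULL || l->tbl8 == NULL) {
        free(l->tbl24); free(l->tbl8); free(l);
        return NULL;
    }
    return l;
}

static void lpm_free(struct lpm *l)
{
    free(l->tbl24); free(l->tbl8); free(l);
}

/* Add a route. Assumption: calls come in order of increasing prefix length.
 * Returns 0 on success, -1 if no tbl8 group is left. */
static int lpm_add(struct lpm *l, uint32_t prefix, unsigned len, uint16_t next_hop)
{
    assert(len <= 32 && next_hop != 0 && next_hop < EXT_FLAG);
    if (len <= 24) {            /* expand the route over a range of tbl24 */
        uint32_t count = 1u << (24 - len);
        uint32_t first = (prefix >> 8) & ~(count - 1);
        for (uint32_t i = 0; i < count; i++)
            l->tbl24[first + i] = next_hop;
        return 0;
    }
    uint32_t idx = prefix >> 8;
    if (!(l->tbl24[idx] & EXT_FLAG)) {      /* first long route under this /24 */
        if (l->used_groups == l->max_groups)
            return -1;
        uint16_t group = (uint16_t)l->used_groups++;
        for (uint32_t i = 0; i < TBL8_GROUP; i++)   /* inherit shorter route */
            l->tbl8[group * TBL8_GROUP + i] = l->tbl24[idx];
        l->tbl24[idx] = (uint16_t)(EXT_FLAG | group);
    }
    uint32_t base = (l->tbl24[idx] & 0x7fffu) * TBL8_GROUP;
    uint32_t count = 1u << (32 - len);
    uint32_t first = (prefix & 0xffu) & ~(count - 1);
    for (uint32_t i = 0; i < count; i++)
        l->tbl8[base + first + i] = next_hop;
    return 0;
}

/* Lookup: one read in tbl24, plus one in tbl8 if the flag is set. */
static uint16_t lpm_lookup(const struct lpm *l, uint32_t ip)
{
    uint16_t e = l->tbl24[ip >> 8];
    if (e & EXT_FLAG)
        e = l->tbl8[(e & 0x7fffu) * TBL8_GROUP + (ip & 0xffu)];
    return e;                   /* 0 means no route */
}
```

The design trades memory for time: route insertion expands each prefix into many table entries, so that a lookup is at most two dependent memory reads regardless of how many routes are installed.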
  • 60. DPDK Poll Mode Drivers. 1. Lock-free, hence not thread-safe 2. Based on user space I/O (uio) 3. Limited number of supported NICs 4. Any interface via PCAP (slow)
  • 61. DPDK Kernel NIC Interface. 1. Allows user space applications to access the Linux control plane 2. Allows management of DPDK ports using Linux tools 3. Interfaces with the kernel network stack. Diagram: in user space, the DPDK application (lcore 1, lcore 2) uses Poll Mode Drivers and the KNI driver; in the kernel (Linux/FreeBSD), igb_uio serves the NIC and rte_kni exposes vEth0, which tools such as ping can use.
  • 62. DPDK Thread Safety. 1. Thread unsafe: all performance-sensitive functions (hash, LPM, PMD...) 2. Thread safe: performance-insensitive functions (malloc, memzone...) 3. Fast and thread safe: rings (lockless queues) and mempools
  • 63. DPDK Performance Tips. 1. Do not use libc or Linux API calls; use the DPDK equivalents: malloc -> rte_malloc, printf -> RTE_LOG 2. Avoid cache misses and false sharing by using per-lcore variables (example: port statistics) 3. Allocate memory on the local NUMA socket 4. Use rings for inter-core communication 5. Use burst mode in PMDs 6. Help the branch predictor: use likely()/unlikely() 7. Prefetch data into cache with rte_prefetchX(). Why? False sharing?
  • 64. DPDK Checklist. 1. What is DPDK? 2. Performance challenges? 3. DPDK application command line options? 4. Application models? 5. Flow affinity? 6. Lockless queue? 7. Indirect buffers? 8. Cuckoo hash? 9. Longest prefix match? 10. KNI? 11. Performance tips?
  • 65. References. 1. DPDK Programmer's Guide: http://dpdk.org/doc/guides/prog_guide/ 2. Alex Kogan, Erez Petrank. Wait-Free Queues With Multiple Enqueuers and Dequeuers, 2011. 3. Michael, Scott. Simple, fast and practical non-blocking and blocking concurrent queue algorithms, 1996.
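False sharing happens when counters owned by different lcores sit in the same cache line, so every update by one core invalidates the line in the others. Tip 2 can be sketched as per-lcore statistics padded to a full cache line. The example assumes 64-byte cache lines and a GCC or Clang compiler (for `__builtin_expect`); the `likely`/`unlikely` macros here are local stand-ins for the DPDK ones, and `struct lcore_stats`, `count_burst` and `total_dropped` are hypothetical names.

```c
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 64       /* assumption: 64-byte cache lines, typical for x86 */
#define MAX_LCORES 4

/* Local stand-ins for the branch prediction hints (GCC/Clang builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Aligning the first member pads the whole struct to one cache line,
 * so two lcores never write to the same line. */
struct lcore_stats {
    alignas(CACHE_LINE) uint64_t rx_packets;
    uint64_t tx_packets;
    uint64_t dropped;
};

static struct lcore_stats stats[MAX_LCORES];

/* Called by one lcore only for its own slot: no locks, no atomics. */
static void count_burst(unsigned lcore, uint16_t nb_rx, uint16_t nb_tx)
{
    struct lcore_stats *s = &stats[lcore];

    s->rx_packets += nb_rx;
    s->tx_packets += nb_tx;
    if (unlikely(nb_tx < nb_rx))        /* drops are the rare case */
        s->dropped += (uint64_t)(nb_rx - nb_tx);
}

/* Aggregation happens off the fast path, e.g. when printing statistics. */
static uint64_t total_dropped(void)
{
    uint64_t sum = 0;
    for (unsigned i = 0; i < MAX_LCORES; i++)
        sum += stats[i].dropped;
    return sum;
}
```

Each lcore pays for only its own cache line on the fast path; the cost of summing across lcores is moved to the rare reader.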