PF_DIRECT@TMA12
1. Flexible High Performance Traffic Generation on Commodity Multi-core Platforms
Nicola Bonelli, Andrea Di Pietro,
Stefano Giordano, Gregorio Procissi
CNIT and Dip. di Ingegneria dell’Informazione - Università di Pisa
2. Introduction and Motivations
• New network devices are emerging… (probes, NIDSs, shapers)
• Traffic generators available on the market:
• Expensive black-box solutions (e.g. the Spirent AX analyzer)
• Not extensible enough: limited traffic patterns, poor semantics for randomization, etc.
• Solutions based on PCs and professional NICs are cheaper (Endace, Napatech,
Invea-Tech)
• They enable fast packet transmission but usually provide no framework for traffic
generation
• A traffic generator should combine the flexibility of software with the
power of modern hardware
• multi-core architectures equipped with multi-queue NICs are commodity
hardware today
• Is it possible to create traffic-generation software that, running on
top of such a parallel architecture, is able to provide hardware-class
performance?
3. Software for traffic generation
• A number of software solutions for traffic generation exist (trafgen, iperf, rude/crude,
mgen)
• Ostinato and Brute make use of PF_PACKET sockets and are therefore able to
customize traffic at the data-link layer:
• - Packet rates hardly exceed a few million packets per second (no scalability)
• - No explicit support for multi-queue NICs
• - No support for timestamping to control when packets are transmitted
Fast packet transmission…
• Recently, accelerated drivers have also emerged: netmap (Luigi Rizzo)
• memory-maps the DMA descriptors of NICs to user space and can transmit the same
packet, or a small set of packets, at wire speed (14.8 Mpps)
• A single thread generating IP packets with random addresses does not fill the pipe (~6-8 Mpps
each)
• Even when using the very fast Mersenne twister random generator! (~50 CPU cycles per draw)
• Additional investigation is required…
4. PF_DIRECT features
We implemented a brand-new socket, named PF_DIRECT:
• A socket designed for traffic generation (and transmission)
• Works with vanilla drivers (no custom driver required)
• Designed to run on top of commodity parallel hardware
• Support for timestamps in transmission
• Decouples traffic generation from packet transmission
• Packets are generated by a user-space thread and transmitted by
multiple kernel threads
• Simple patterns are generated and transmitted at nearly wire speed
• More complex patterns, most likely, do not require wire speed anyway
5. PF_DIRECT architecture
PF_DIRECT consists of:
• A user-space library written in C++11 that handles memory mapping,
packet dispatching among kernel threads, etc.
• A special memory-mapped byte-oriented SPSC queue
• Amortizes cache-coherence traffic between cores (queue-index invalidations)
• Kernel threads that transmit the packets buffered in the SPSC
queues, each at its given timestamp
• Active wait, or reschedule in case of a long wait…
• The TSCs of different cores are synchronized on modern CPUs (INVARIANT_TSC)
• A ring of pre-allocated socket buffers (skb) that are re-used by the
kernel module and never get deallocated by network drivers
• Users-counter trick (the skb refcount is kept elevated so drivers never free the buffers)
7. Traffic generation with PF_DIRECT
Our experimental traffic generator, built on top of PF_DIRECT, consists of:
• A user-space application in which each thread of execution represents a
source of traffic
• A traffic-source “Engine” (which can concurrently make use of different
traffic models)
• One user-space thread per core, running a deadline scheduler (~20 ns
context switch)
• A user-defined traffic model (micro-thread) is in charge of:
• Creating the packet to be transmitted
• Scheduling the timestamp for the packet transmission
• Sending the packet through the PF_DIRECT socket (buffering it in the SPSC queue)
• XML composition blocks that allow a given source to be instantiated and
bound to a core and to a given hardware queue
19. Conclusions
• PF_DIRECT, a Linux socket that leverages the potential
of multi-core architectures and multi-queue NICs
• PF_DIRECT decouples the task of packet generation
from that of transmission
• A single thread is able to generate non-trivial traffic close
to the wire rate (~13 Mpps)
• Multiple kernel threads transmit packets through multiple
queues
• Transmission timestamps are supported (in TSC)
• An experimental traffic generator is built on top of PF_DIRECT
20. Future work
• Release the PF_DIRECT source code
• Additional performance improvements in PF_DIRECT
• Performance: identify a small set of changes, common to
different drivers, that could define a “PF_DIRECT-aware
driver”
• Implement a stable version of the traffic generator with
complex traffic models