On Multi-gigabit Packet Capturing with Multi-core Commodity Hardware

N. Bonelli, A. D. Pietro, S. Giordano, G. Procissi
Nowadays commodity hardware is offering an ever increasing degree of parallelism (CPUs with more and more cores, NICs with parallel queues). [...] Such an engine is based on a novel lockless queue and allows parallel packet capturing to let the user-space application arbitrarily define its degree of parallelism. Therefore, both legacy applications and natively parallel ones can benefit from such a capturing engine. In addition, PFQ outperforms its competitors both in terms of captured packets and…
Commodity Packet Capture Engines: Tutorial, Cookbook and Applicability
This tutorial explains how the arrival of commodity packet engines has revolutionized the development of traffic processing tasks and lays out the foundation of this new paradigm, i.e., the knowledge required to capture packets at multi-Gb/s rates on commodity hardware.
Towards high-performance packet processing on commodity multi-cores: current issues and future directions
A novel Self-Described Buffer (SDB) management technology is introduced to eliminate the overheads of allocating and deallocating the packet buffers offloaded to FPGA, and future work on packet processing optimization on multi-core CPUs is discussed.
On memory allocation for high-speed packet analysis applications
A multi-layer slice memory allocator is presented, specifically designed to take advantage of spatial and temporal locality in high-speed packet processing applications; it clearly outperforms existing memory allocators in common networking use-cases.
Network Traffic Processing With PFQ
The results show that the flexibility and the backward compatibility provided by PFQ do not impact its processing performance, which reaches line rate both in pure speed tests and in real practical monitoring use cases on 10+ Gb/s links.
Comparison of frameworks for high-performance packet IO
This paper surveys various frameworks for high-performance packet IO, introduces a model to estimate and assess the performance of these packet processing frameworks, quantifies the effects of caching, and looks at the tradeoff between throughput and latency.
Batch to the Future: Analyzing Timestamp Accuracy of High-Performance Packet I/O Engines
Experimental results show that a simple algorithm to distribute inter-batch time among the packets composing a batch, together with a driver modification that polls NIC buffers while avoiding batch processing, allows capturing accurately timestamped traffic for monitoring purposes at multi-10Gb/s rates.
Comparison of Memory Mapping Techniques for High-Speed Packet Processing
Network stacks currently implemented in operating systems can no longer cope with the high packet rates offered by 10 GBit Ethernet. Thus, frameworks were developed claiming to offer a faster…
A purely functional approach to packet processing
This paper explores a new direction in packet processing by pushing forward functional programming principles in the definition of a “software defined networking” paradigm. It introduces PFQ-Lang, an extensible functional language which can be used to process, analyze and forward packets captured on modern multi-queue NICs.
Wire-speed statistical classification of network traffic on commodity hardware
A software-based traffic classification engine running on commodity multi-core hardware, able to process in real time aggregates of up to 14.2 Mpps over a single 10 Gbps interface, a significant advance over the current state of the art in terms of achieved classification rates.
Development and evaluation of a low-cost scalable architecture for network traffic capture and storage for 10Gbps networks
Experimental results, using both synthetic and real traffic, show that the proposals allow capturing accurately timestamped traffic for monitoring purposes at multi-10Gbps rates.


Packet capturing on parallel architectures
The potential of parallelism when coupled with existing packet capturing technologies is explored and it is shown how, by accurately tuning configurations, a huge performance gain can be obtained.
High speed network traffic analysis with commodity multi-core systems
This work describes the design and implementation of a novel multi-core aware packet capture kernel module that enables monitoring applications to scale with the number of cores and demonstrates that it can achieve high packet capture performance on modern commodity hardware.
Forwarding path architectures for multicore software routers
This work investigates a set of input/output processing architectures, as well as resource allocation strategies for forwarding paths in multi-core systems, and uncovers the gains and possible implications by either running different components concurrently or replicating the same components across different cores.
Building a single-box 100 Gbps software router
This paper maps out expected hurdles and projected speed-ups to reach 100 Gbps in packet routing on a single commodity PC, and proposes reducing per-packet processing overhead with software-level optimizations and buying extra computing power with GPUs.
PacketShader: Massively Parallel Packet Processing with GPUs to Accelerate Software Routers
This work offloads packet processing to graphics processing units (GPUs) and confirms that GPU acceleration for core packet processing functions with enough parallelism can significantly boost the performance of software routers.
Flexible High Performance Traffic Generation on Commodity Multi-core Platforms
The aim of this work is to design a traffic generator which can both achieve good performance and provide a flexible framework for supporting arbitrary traffic models, and the key factor that enables this system to meet both requirements is parallelism.
PacketShader: a GPU-accelerated software router
The evaluation results show that the GPU brings significantly higher throughput than the CPU-only implementation, confirming the effectiveness of GPUs for computation and memory-intensive operations in packet processing.
RouteBricks: exploiting parallelism to scale software routers
This work proposes a software router architecture that parallelizes router functionality both across multiple servers and across multiple cores within a single server, and demonstrates a 35Gbps parallel router prototype.
nCap: wire-speed packet capture and transmission
  • L. Deri
  • Computer Science
  • Workshop on End-to-End Monitoring Techniques and Services, 2005.
  • 2005
A new approach to wire-speed packet capture and transmission, named nCap, is described; it is based on commercial network adapters rather than on custom network adapters and software.
The click modular router
On conventional PC hardware, the Click IP router achieves a maximum loss-free forwarding rate of 333,000 64-byte packets per second, demonstrating that Click's modular and flexible architecture is compatible with good performance.