Using HPC instructions to accelerate

By Ray Kinsella | May 24 2022,

TSC member Ray Kinsella has been blogging recently on VPP. His latest post describes how VPP is accelerated using SIMD instructions. Single instruction, multiple data (SIMD) instructions are commonly used to improve software efficiency by performing an operation on multiple buffers (data) in parallel. This can improve ‘instructions per cycle’ (IPC), a common metric that describes the efficiency of software. SIMD instructions are most commonly used to accelerate High Performance Computing (HPC) workloads; VPP, however, repurposes them to accelerate networking workloads.
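To make the "one operation, several data items" idea concrete, here is a minimal sketch in Python. Python has no SIMD intrinsics, so this only emulates the idea by packing four 8-bit lanes into one 32-bit integer (a trick often called "SIMD within a register"). The function names are hypothetical and are not taken from VPP, which uses real CPU vector instructions.

```python
LOW7 = 0x7F7F7F7F   # low seven bits of every 8-bit lane
HIGH1 = 0x80808080  # top bit of every 8-bit lane


def pack_lanes(values):
    """Pack four 0..255 values into one 32-bit word, lane 0 in the low byte."""
    assert len(values) == 4 and all(0 <= v <= 255 for v in values)
    return sum(v << (8 * i) for i, v in enumerate(values))


def unpack_lanes(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def add_u8x4(a, b):
    """Add four 8-bit lanes at once; each lane wraps modulo 256.

    The low seven bits of each lane are added first, so a carry can never
    spill into the neighbouring lane. The top bit of each lane is then
    fixed up with XOR.
    """
    low = (a & LOW7) + (b & LOW7)
    return low ^ ((a ^ b) & HIGH1)


if __name__ == "__main__":
    x = pack_lanes([1, 2, 3, 4])
    y = pack_lanes([10, 20, 30, 40])
    print(unpack_lanes(add_u8x4(x, y)))
```

A hardware SIMD instruction does the same kind of lane-wise work in a single instruction on much wider registers, which is where the IPC gain comes from.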

Vector visibility: Why DPI is a must for vector packet processing

By Tobias Roeder | May 17 2022,

The rise of virtualized and cloud-native networks is revolutionizing today’s networking spaces. Rigid architectures are giving way to network instances that can be spun up in seconds and put to task in no time…

A Terabit Secure Network Data-Plane

By The Linux Foundation | Apr 6 2021,

The Fast Data Project (“Fido”), relentlessly focused on data IO speed and efficiency supporting the creation of high performance, flexible, and scalable software defined infrastructures, today announced support for terabit rates of IPsec, as well as a billion packets per second of IPv4 routing at scale. Architectural improvements in 3rd Gen Intel Xeon Scalable processors, including a PCIe bandwidth increase and an overall decrease in cycles-per-packet due to CPU micro-architecture improvements, combined with software, deliver significant price-performance gains for both cloud- and appliance-based software router and secure networking solutions. The project offers the software defined infrastructure developer community a landing site with multiple projects fostering innovations in software-based packet processing towards the creation of high-throughput, low-latency, and resource-efficient I/O services suitable to many architectures (x86, ARM, and PowerPC) and deployment environments (bare-metal, VM, …

CuVPP: Filter-based Longest Prefix Matching in Software Data Planes

By Minseok Kwon, Krishna Prasad Neupane, John Marshall, M. Mustafa Rafique | Sep 15 2020,

IEEE Cluster 2020 Kobe – The paper titled “CuVPP: Filter-based Longest Prefix Matching in Software Data Planes” won a “Best Papers” award at IEEE Cluster 2020 on September 15th. Programmability in network switches (or data planes) has become increasingly important with increasing network virtualization in the Internet infrastructure and large-scale data centers. A critical challenge in data plane programmability is to maintain high-speed packet processing performance as link speeds keep increasing to hundreds of Gbps or even Tbps. Another challenge is the rapidly growing routing table size, e.g., more than 500,000 entries. We implement CuVPP as part of the real software switch VPP and provide a comprehensive evaluation against popular alternative approaches, using realistic data sets for network prefixes and traffic. The video presentation can be found here: CuVPP Video. The paper can be found here: CuVPP Paper.
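For readers unfamiliar with the general idea of filter-based longest prefix matching, the sketch below may help. It is not the CuVPP algorithm. It is a simplified, IPv4-only illustration that places a small Bloom filter in front of one hash table per prefix length, so that most tables that cannot match are skipped without a probe. All class names, method names, and filter sizes are assumptions made for this example.

```python
import hashlib
import ipaddress


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = 0  # a Python int used as a bit array

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.array |= 1 << pos

    def might_contain(self, key):
        return all((self.array >> pos) & 1 for pos in self._positions(key))


class FilteredLpmTable:
    """IPv4 longest prefix match: one filter and one hash table per length."""

    def __init__(self):
        self.filters = {}  # prefix length -> BloomFilter
        self.tables = {}   # prefix length -> {masked address: next hop}

    def add(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        key = int(net.network_address)
        self.filters.setdefault(net.prefixlen, BloomFilter()).add(key)
        self.tables.setdefault(net.prefixlen, {})[key] = next_hop

    def lookup(self, address):
        addr = int(ipaddress.ip_address(address))
        # Try the longest prefixes first so the first hit is the best match.
        for length in sorted(self.tables, reverse=True):
            mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
            key = addr & mask
            if not self.filters[length].might_contain(key):
                continue  # definitely absent: skip the table probe
            hit = self.tables[length].get(key)
            if hit is not None:  # the table resolves filter false positives
                return hit
        return None


if __name__ == "__main__":
    fib = FilteredLpmTable()
    fib.add("10.0.0.0/8", "A")
    fib.add("10.1.0.0/16", "B")
    print(fib.lookup("10.1.2.3"), fib.lookup("10.9.9.9"))
```

The filter is cheap to query and compact enough to stay in cache, which is the usual motivation for checking it before touching the larger per-length tables.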

Fast Data Project’s Vector Packet Processor (VPP) Release 20.05

By Linux Foundation | Jul 23 2020,

SAN JOSE – The Fast Data Project (“Fido”), an open source project within The Linux Foundation’s LF Networking (LFN), announced the availability of Vector Packet Processor (VPP) software release 20.05. VPP continues to be relentlessly focused on performance while continuing to add features, all without sacrificing packet throughput. In this article we highlight some remarkable performance numbers, point to some of the features added in 20.05, and then point to some articles that have been published in the past 5 months.

Myth-busting DPDK in 2020

By Linux Foundation | Jul 20 2020,

Revealed: the past, present, and future of the most popular data plane development kit in the world.

Create a 40G Encrypted Container Network with Calico/VPP on Commodity Hardware

By Aloys Augustin, Emran Chandry, Mohsin Kazmi, Nathan Skrzypczak, Jerome Tollet | May 26 2020,

When we started integrating VPP in Kubernetes with Calico as a management plane, the goal was to bring the performance of VPP, together with the flexibility of userspace networking, to containers. With its unrivaled IPsec performance, encryption was clearly an area where VPP would be able to help. Without further ado, here is the encrypted throughput we achieved between two pods on a 40G network. To learn more about VPP/Calico, click below.

Kernel bypass networking with VPP

By Andree Toonk | Apr 5 2020,

In this blog, we will compare the result with the results of my last blog, in which we looked at how much a vanilla Linux kernel could do in terms of forwarding (routing) packets. We observed that on Linux, to achieve 14Mpps, we needed roughly 16 and 26 cores for a unidirectional and a bidirectional test, respectively. In this article, we’ll look at what we need to accomplish this with VPP. To continue reading about kernel bypass networking with VPP, please click below.

Building fast QUIC sockets in VPP

By Aloys Augustin | Mar 30 2020,

As most of you may already know, QUIC is a new transport protocol that began as a Google experiment for HTTP/2, which is now being standardized at the IETF. It will also be the default transport protocol for HTTP/3. As a result, it is likely to be very widely deployed in the next few years. Given the growing popularity of QUIC and its expected widespread deployment, it was essential to provide an implementation of QUIC in the Vector Packet Processor (VPP), both to measure the performance that we could reach with a full userspace QUIC stack, and as an enabler for more innovation around the QUIC protocol. To build fast QUIC sockets with VPP, please click below.

Release 20.01 Improves Multicore IPSec

By Linux Foundation Networking | Mar 23 2020,

The rise in worker mobility and increasingly complex multi-cloud architectures is escalating organizations’ reliance on encryption. This puts computational strain on VPN products, especially as they evolve, for example, from 1 to 10 to 40 Gbps or more. Traditional router/VPN appliances buckle under the load, forcing the quest for higher performance solutions that won’t break the bank. High-performance IPSec is an application where VPP clearly shines, especially when compared to traditional solutions underpinned by kernel-based, single-packet-at-a-time processing approaches. In fact, one vendor who has productized VPP reports observing the following performance numbers (based on AES-GCM-128 encrypted IMIX traffic being processed by a stock Intel® Xeon® Gold 6130 CPU @ 2.10GHz):

3.07 MPPS (8.86 Gbps) (QAT assist)
2.13 MPPS (6.14 Gbps) (no QAT assist)

That was on a single core. For more on how release 20.01 improves multicore IPSec, click below.
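As a rough sanity check on the two figures quoted above, the packet rate and bit rate can be combined to estimate the average packet size they imply. The sketch below assumes that MPPS means 10^6 packets per second and Gbps means 10^9 bits per second. Whether the quoted bit rate includes framing or encryption overhead is not stated, so the result is only indicative. The function name is made up for this example.

```python
def average_packet_bytes(mpps, gbps):
    """Average bytes per packet implied by a packet rate and a bit rate."""
    packets_per_second = mpps * 1e6
    bits_per_second = gbps * 1e9
    return bits_per_second / packets_per_second / 8


if __name__ == "__main__":
    for label, mpps, gbps in [
        ("QAT assist", 3.07, 8.86),
        ("no QAT assist", 2.13, 6.14),
    ]:
        size = average_packet_bytes(mpps, gbps)
        print(f"{label}: about {size:.0f} bytes per packet")
    # Ratio of the two packet rates, i.e. the gain attributed to QAT assist.
    print(f"speedup with QAT assist: {3.07 / 2.13:.2f}x")
```

Both rows work out to roughly the same average packet size (a little over 360 bytes), which is what one would expect if the same IMIX traffic mix was used for both measurements.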

Introducing Universal Deep Packet Inspection (UDPI), a new Project

By Ni, Hongjun | Feb 28 2020,

The Universal Deep Packet Inspection (UDPI) project is a reference framework for building a high performance solution for Deep Packet Inspection, integrated with the general purpose VPP stack. It leverages an industry regex matching library to provide a rich set of features, which can be used in IPS/IDS, Web Firewall and similar applications. It can also be integrated into 5G, Edge, and Cloud Networking for application based services. The initial code contributions are from Intel and Travelping. So far, 17 organizations have joined and there are 20 committers, including Intel, ZTE, China Telecom, HuachenTel, Inspur, Yxlink, Sunyainfo, Tencent, China Unicom, Huawei, QingCloud, Netgate, Alibaba, 360, Trend Micro, Nokia, HAOHAN Data.

Fast Data Project’s Vector Packet Processor (VPP) Release 20.01

By Linux Foundation | Jan 31 2020,

SAN JOSE – The Fast Data Project (“Fido”), an open source project within The Linux Foundation’s LF Networking (LFN), announced the availability of Vector Packet Processor (VPP) software release 20.01. With release 20.01, VPP includes multiple queue/core support with all its drivers, including Linux TAPv2. End-to-end Generic Segmentation Offload (GSO) is also now supported. The VPP host stack supports GSO for TCP and, at the driver level, VPP supports GSO across vmxnet3 on ESXi, Linux tap devices, and vhost-user devices for virtualization. This significantly improves VPP interaction and performance with Linux and with container solutions like Kubernetes. The same can also be said of the VPP interface with Virtual Machines, whether it be with vhost (QEMU) or vmxnet3 (VMware). For an example of how using multiple queues/cores improves packet throughput, let’s examine these impressive performance numbers from the Continuous System Integration and Test (CSIT) tests. VPP performance is continuously being …