1 Vector Packet Processing (VPP)

2 Hardware Acceleration

3 Programmability

4 Integration with Other Systems


1 Vector Packet Processing (VPP)

At the heart of fd.io is Vector Packet Processing (VPP) technology.

In development since 2002, VPP is production code currently running in shipping products. It runs in user space on multiple architectures including x86, ARM, and Power architectures on both x86 servers and embedded devices.  The design of VPP is hardware, kernel, and deployment (bare metal, VM, container) agnostic.  It runs completely in userspace.

VPP helps FD.io push extreme limits of performance and scale. Independent testing shows that, at scale, VPP-powered FD.io is two orders of magnitude faster than currently available technologies.

VPP reads the largest available vector of packets from the network IO layer.

Packet Vector

VPP then processes the vector of packets through a Packet Processing graph.

Packet Processing

Rather than processing the first packet through the whole graph, and then the second packet through the whole graph, VPP instead processes the entire vector of packets through a graph node before moving on to the next graph node.

Because the first packet in the vector warms up the instruction cache, the remaining packets tend to be processed at extreme performance.  The fixed costs of processing the vector of packets are amortized across the entire vector.  This lead not only to very high performance, but also statistically reliable performance.  If VPP falls a little behind, the next vector contains more packets, and thus the fixed costs are amortized over a larger number of packets, bringing down the average processing cost per packet, causing the system to catch up.  As a result, throughput and latency are very stable.  If multiple cores are available, the graph scheduler can schedule (vector,graph node) pairs to different cores.

A rough everyday analogy to this would be to consider the problem of a stack of lumber where each piece of lumber needs to be cut, sanded, and have holes drilled in it.  You can only have one tool in your hand at a time (analogous to the instruction cache).  You are going to finish cutting, sanding, and drilling the lumber faster if you first pick up the saw and do all of your cutting, then pick up the sander and do all of your sanding, and then pick up the drill and do all of your drilling.  Picking up each tool in order for each piece of lumber is going to be much slower.

The graph node architecture of VPP also makes for easy extensibility.  You can build an independent binary plugin for VPP from a separate source code base (you need only the headers).  Plugins are loaded from the plugin directory.  A plugin for VPP can rearrange the packet graph and introduce new graph nodes.  This allows new features to be introduce via the plugin, without needing to change the core infrastructure code.

Plugin Nodes

2 Hardware Acceleration

This same graph node architecture also allows FD.io to dynamically take advantage of hardware acceleration when available, allowing vendors to continue to innovate in hardware without breaking the “run anywhere” promise of FD.io’s software. 

Logically, you can think of accelerating hardware as replacing some of the software graph nodes with hardware that performs the same function, only faster.  Because a hardware accelerator vendor can enable this by simply providing a plugin that acts as an input node that hands off to the first graph node that is to perform a function in software, or by providing an output node which is handed off to as soon as the software processing is finished.  In this way, accelerating hardware can be used if present, but functionality continues if it is missing or its resources are exhausted.  This architecture allows maximum space for innovation by hardware accelerators.

Hardware Accelerator

3 Programmability

The VPP Technology also provides a very high performance low level API.  The API works via a shared memory message bus.  The messages passed along the bus are specified in a simple IDL (Interface Definition Language) which is used to create C client libraries and Java client libraries.  Support for generating bindings in additional languages could be added.  These client libraries make it very easy to write external applications that programmatically control VPP.  The shared memory message bus approach is very high performance, for example, allowing an external application to program 750k routes/s.

Local Programming

Remote programmability can be achieved by having as your external app a Management Agent that exposes your SDN protocol of choice.

Remote Programming

FD.io is agnostic as to Management Agent and high level API.  Anyone can bring a Management Agent that meets their needs.  There is a ‘Honeycomb’ agent available at launch which exposes yang models for VPP functionality via netconf and restconf available currently.  A controller which supports netconf/yang, such as OpenDaylight can ‘mount’ the Honeycomb Management Agent.


4 Integration With Other Systems

If the controller supports OpenStack Neutron (as OpenDaylight does) this provides a simple story for Openstack to VPP integration.