The Programmable Data Plane

Abstract

Programmable data plane technologies enable the systematic reconfiguration of the low-level processing steps applied to network packets and are key drivers toward realizing the next generation of network services and applications. This survey presents recent trends and issues in the design and implementation of programmable network devices, focusing on prominent abstractions, architectures, algorithms, and applications proposed, debated, and realized over the past years. We elaborate on the trends that led to the emergence of this technology and highlight the most important pointers from the literature, casting different taxonomies for the field, and identifying avenues for future research.

Similar Papers
  • Research Article
  • Cite count: 2
  • 10.3389/fcomp.2024.1493399
ML-NIC: accelerating machine learning inference using smart network interface cards
  • Jan 6, 2025
  • Frontiers in Computer Science
  • Raghav Kapoor + 2 more

Low-latency inference for machine learning models is increasingly a requirement, as these models are used in mission-critical applications such as autonomous driving, military defense (e.g., target recognition), and network traffic analysis. A widely studied and used technique to meet this requirement is to offload some or all of the inference tasks onto specialized hardware such as graphics processing units. More recently, offloading machine learning inference onto programmable network devices, such as programmable network interface cards or programmable switches, has been gaining interest from both industry and academia, especially due to the latency reduction and computational benefits of performing inference directly on the data plane, where the network packets are processed. Yet current approaches are relatively limited in scope, and there is a need for more general approaches to offloading machine learning models onto programmable network devices. To meet this need, this work introduces a novel framework, called ML-NIC, for deploying trained machine learning models onto programmable network devices' data planes. ML-NIC deploys models directly into the computational cores of the devices to efficiently leverage the inherent parallelism of network devices, thus providing large latency and throughput gains. Our experiments show that ML-NIC reduced inference latency by at least 6× on average and at the 99th percentile and increased throughput by at least 16× with little to no degradation in model effectiveness compared to existing CPU solutions. In addition, ML-NIC can provide tighter guaranteed latency bounds in the presence of other network traffic, with shorter tail latencies. Furthermore, ML-NIC reduces CPU and host server RAM utilization by 6.65% and 320.80 MB, respectively. Finally, ML-NIC can handle machine learning models that are 2.25× larger than those supported by current state-of-the-art network device offloading approaches.

  • Conference Article
  • Cite count: 1
  • 10.1109/latincom50620.2020.9282299
Enabling Partial Offload of Virtualized Network Functions into the Programmable Data Plane
  • Nov 18, 2020
  • Leonardo Da C Marcuzzo + 1 more

One of the key aspects hindering NFV adoption is VNF performance, which is well below that of middleboxes. Offloading parts of a VNF into programmable data planes is an alternative that mitigates this loss of performance. However, offload is currently used only locally on SmartNICs or for parts of an entire SFC, so there are no platforms capable of offloading parts of a single VNF into programmable devices. In this paper we propose an architecture capable of managing the offload of elements of a VNF into programmable network devices. The architecture is composed of a VNF platform with offload support and an offload manager, which communicates with the underlying infrastructure to enable the process. A prototype of the architecture was developed and evaluated, with results showing performance gains even in a completely virtualized test scenario, demonstrating the benefits provided by an offload platform.

  • Single Book
  • Cite count: 49
  • 10.1007/b117975
Digital Design and Implementation with Field Programmable Devices
  • Jan 1, 2005
  • Zainalabedin Navabi

This book is on digital system design for programmable devices, such as FPGAs, CPLDs, and PALs. A designer wanting to design with programmable devices must understand digital system design at the RT (Register Transfer) level, the circuitry and programming of programmable devices, digital design methodologies, and the use of hardware description languages in design; finally, such a designer must be familiar with one or several digital design tools and environments. Books on these topics are many, and they cover individual design topics with very general approaches. The number of books a designer needs to gather the necessary information for a practical knowledge of design with field programmable devices can easily reach five or six, and much of their content is on theoretical concepts that are not directly applicable to RT-level design with programmable devices. The focus of this book is on a practical knowledge of digital system design for programmable devices. The book covers all necessary topics in one volume, and covers each topic in just the depth actually used by an advanced digital designer. In the three parts of the book, we cover digital system design concepts, use of tools, and systematic design of digital systems. In the first chapter, design methodologies, the use of simulation and synthesis tools, and the programming of programmable devices are discussed. Based on this automated design methodology, the next four chapters present the necessary background for logic design, the Verilog language, programmable devices, and computer architectures.

  • Conference Article
  • Cite count: 96
  • 10.1109/hpsr.2018.8850761
A Survey on the Programmable Data Plane: Abstractions, Architectures, and Open Problems
  • Jun 1, 2018
  • Roberto Bifulco + 1 more

Programmable switches allow the processing behavior applied to transmitted packets, including the type, sequence, and semantics of processing operations, to be reconfigured on the fly in a systematic fashion. As such, programmable switches are the key to realizing the next generation of network services and applications, including software-defined networking, 5G, IoT, and massive-scale cloud computing. This paper presents a survey of recent trends and issues in the design and implementation of programmable network devices, focusing on the prominent abstractions and architectures proposed, debated, realized, and deployed during the last 10 years. First we describe the anatomy of a programmable switch, then we highlight the most important pointers from the literature and cast different taxonomies for the field, and finally we sketch open issues and possible future research directions.

  • Conference Article
  • Cite count: 3
  • 10.1109/iscas48785.2022.9937268
An FPGA-based HW/SW Co-Verification Environment for Programmable Network Devices
  • May 28, 2022
  • Mengyue Su + 4 more

Bugs in network devices translate to financial losses for the service providers and degrade the quality of experience for the users. Simulation tools cannot guarantee complete fault coverage as bugs can manifest at any time in live hardware. To mitigate these issues, we propose a novel hardware/software (HW/SW) co-verification tool that targets programmable data plane network devices. The system integrates cycle-accurate software simulation with a hardware implementation. For the software simulation, open-source tools such as CocoTB and GHDL were used. The Design Under Test (DUT) and our test interfaces are embedded in programmable hardware. Data from the software can be inserted into and then extracted from the input/output (I/O) ports of the DUT in real time. To achieve this functionality, the hardware design uses data insertion and extraction blocks, which also support assertions. For the hardware implementation, the reported experiments were conducted on a NetFPGA-SUME platform. When a packet flows through the NetFPGA and triggers an assertion, the data present in the DUT at that time can be captured and sent back to the simulator for further analysis and replay. Each of our design blocks consumes less than 1% of the available resources on the FPGA.

  • Conference Article
  • Cite count: 8
  • 10.1109/lanman.1999.939963
Dynamic classification in silicon-based forwarding engine environments
  • Nov 21, 1999
  • R Jaeger + 4 more

Current network devices enable connectivity between end systems with support for routing with a defined set of protocol software bundled with the hardware. These devices do not support user customization or the introduction of new software applications. Programmable network devices allow for the dynamic downloading of customized programs into network devices allowing for the introduction of new protocols and network services. The Oplet Runtime Environment (ORE) is a programmable network architecture built on a Gigabit Ethernet L3 Routing Switch to support downloadable services. Complementing the ORE, we introduce the JFWD API, a uniform, platform-independent portal through which application programmers control the forwarding engines of heterogeneous network nodes (e.g., switches and routers). Using the JFWD API, an ORE service has been implemented to classify and dynamically adjust packet handling on silicon-based network devices.

  • Research Article
  • Cite count: 28
  • 10.3390/s21155199
Achieving Low Latency Communications in Smart Industrial Networks with Programmable Data Planes
  • Jul 31, 2021
  • Sensors (Basel, Switzerland)
  • Asier Atutxa + 4 more

Industrial networks are introducing Internet of Things (IoT) technologies into their manufacturing processes in order to enhance existing methods and obtain smarter, greener, and more effective processes. Global predictions forecast a massive spread of IoT technology in industrial sectors in the near future. However, these innovations face several challenges, such as achieving short response times for time-critical applications. Concepts like in-network computing or edge computing can provide adequate communication quality for these industrial environments, and data plane programming has proved to be a useful mechanism for their implementation. Specifically, the P4 language is used to define the behavior of programmable switches and network elements. This paper presents a solution for industrial IoT (IIoT) network communications that reduces response times using in-network computing through data plane programming and P4. Our solution processes Message Queuing Telemetry Transport (MQTT) packets sent by a sensor in the data plane and generates an alarm when the measured value exceeds a threshold. The implementation has been tested in an experimental facility, using a Netronome SmartNIC as a P4 programmable network device. Response times are reduced by 74% while processing, and the delay introduced by the P4 network processing is insignificant.
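
The threshold check described in this abstract can be illustrated with a minimal sketch. This is a plain Python model of the idea, not the paper's P4 program: the `SensorReading` type, the `THRESHOLDS` table, the topic name, and the `process` function are all hypothetical names introduced here, and the sketch assumes the measured value has already been parsed out of the MQTT payload.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SensorReading:
    """Hypothetical parsed view of one MQTT PUBLISH packet."""
    topic: str   # MQTT topic the sensor publishes to
    value: int   # measured value, assumed already extracted from the payload


# Hypothetical lookup table (topic -> threshold), standing in for a
# match-action table that a control plane would populate.
THRESHOLDS = {"plant/line1/temperature": 80}


def process(reading: SensorReading) -> Optional[str]:
    """Return an alarm message if the reading exceeds its threshold, else None."""
    limit = THRESHOLDS.get(reading.topic)
    if limit is None:
        # Table miss: no threshold configured, so no alarm is generated.
        return None
    if reading.value > limit:
        return f"ALARM {reading.topic} value={reading.value} limit={limit}"
    return None
```

In this model a reading of 95 on the configured topic produces an alarm string, while a reading of exactly 80 or any reading on an unconfigured topic produces nothing.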

  • Research Article
  • Cite count: 28
  • 10.1109/tnsm.2022.3212913
MAP4: A Pragmatic Framework for In-Network Machine Learning Traffic Classification
  • Dec 1, 2022
  • IEEE Transactions on Network and Service Management
  • Bruno Missi Xavier + 3 more

Self-driving networks guided by machine-learning (ML) algorithms are the driving force for building networks of the future. ML is effective at making inferences about data that is too complex or too unpredictable for humans. Network softwarization enabled by a deep programmability approach opens up new opportunities to deploy ML in the programmable data plane. In this paper, we introduce MAP4, a framework that explores the feasibility of mapping ML models onto programmable network devices. To achieve this, we rely on the P4 language to deploy a pre-trained model into a programmable switch, utilizing the ML model to accurately classify flows at line rate. Our approach demonstrates that ML models working as classifiers can better fit the data by using the new levels of network programmability offered by the P4 language. The results show that most flows are properly classified after only a few packets. In some use cases, 97% of traffic can be correctly classified with two packets of the flow, and all classes are properly labeled with a maximum of four packets.

  • Research Article
  • Cite count: 21
  • 10.1109/tnsm.2020.3040011
Toward In-Network Event Detection and Filtering for Publish/Subscribe Communication Using Programmable Data Planes
  • Dec 7, 2020
  • IEEE Transactions on Network and Service Management
  • Jonathan Vestin + 3 more

Industrial Internet of Things (I-IoT) applications require large amounts of sensor data to be processed under tight delay and jitter constraints. In such applications, flexible event detection and fast reaction to critical events are important building blocks. Traditional approaches either use proprietary networks and dedicated hardware or transmit sensor data to processing elements in the cloud or at the network edge, using distributed stream processing frameworks. For scalability, a large number of servers is needed, and processing on commodity CPUs typically involves high and unpredictable latency. In this article, we explore how programmable data planes can be used to detect events flexibly and trigger customized and programmable actions directly from the switch program or the programmable network interface card (SmartNIC). We present FastReact-PS, an event-based publish/subscribe I-IoT processing framework written in the P4 language, which can be flexibly customized from the control plane. Together with stateful processing, FastReact-PS supports windowed time series analysis as well as complex event detection and processing based on Boolean logic directly in the data plane of newly emerging programmable networking devices. The logic can be adjusted dynamically from the control plane without the need for recompilation. We implement FastReact-PS in P4 and evaluate it on both a SmartNIC and a DPDK-based software switch running in user space. Our evaluation shows that latency is reduced by one order of magnitude compared to end-host-based approaches, with significantly lower jitter, while scaling to up to 11 million events per second.

  • Research Article
  • 10.5075/epfl-thesis-4171
Self-replication of complex digital circuits in programmable logic devices
  • Jan 1, 2008
  • Infoscience (Ecole Polytechnique Fédérale de Lausanne)
  • Joël S Rossier

  • Addendum
  • Cite count: 1
  • 10.1016/j.matpr.2021.01.921
WITHDRAWN: Design and execution of programmable logic device using quantum dot cellular automata
  • Mar 1, 2021
  • Materials Today: Proceedings
  • G Mahendran + 5 more

  • Conference Article
  • Cite count: 72
  • 10.1109/icccn.2017.8038396
HyperV: A High Performance Hypervisor for Virtualization of the Programmable Data Plane
  • Jul 1, 2017
  • Cheng Zhang + 4 more

P4 is a domain-specific language designed to define the behavior of a programmable data plane. It facilitates offloading hardware-suitable Network Functions (NFs) to a data plane. Consequently, NFs can benefit fully from the high performance of hardware devices, while more CPU power can be reserved for user applications. However, since the programmable data plane provides an NF with an exclusive network context, different NFs cannot operate on the same data plane simultaneously. Besides, it is hardly possible to dynamically reconfigure programmable network devices without interrupting the operation of a data plane. Therefore, we propose HyperV, a high-performance hypervisor for virtualization of a P4-specific data plane, to provide both non-exclusive and uninterrupted operation. We implemented HyperV on a P4-BMv2 target and on a DPDK target. We then evaluated BMv2-target HyperV by comparing it with Hyper4, a recently proposed hypervisor, and evaluated DPDK-target HyperV by comparing it with PISCES and Open vSwitch. Results show that BMv2-target HyperV outperforms Hyper4 by 2.5x on average while reducing resource usage by 4x. DPDK-target HyperV performs comparably to Open vSwitch and PISCES, with a worst-case throughput penalty of less than 7%, while providing a powerful virtualization capability that neither of them offers.

  • Research Article
  • Cite count: 12
  • 10.1109/mwc.003.2200060
Functional Split of In-Network Deep Learning for 6G: A Feasibility Study
  • Oct 1, 2022
  • IEEE Wireless Communications
  • Jia He + 4 more

In existing mobile network systems, the data plane (DP) is mainly considered a pipeline of network elements that forward user data traffic end to end. With the rapid maturing of programmable network devices, however, mobile network infrastructure is evolving into a programmable computing platform. Such a programmable DP can therefore provide in-network computing capability for many application services. In this paper, we aim to enhance the data plane with in-network deep learning (DL) capability. However, in-network intelligence can be a significant load for network devices. We therefore apply the paradigm of the functional split, decomposing the deep neural network (DNN) into sub-elements placed across the data plane to make machine learning inference jobs more efficient. As a proof of concept, we take a Blind Source Separation (BSS) problem as an example to exhibit the benefits of such an approach. We implement the proposed enhancement in a full-stack emulator and provide a quantitative evaluation with professional datasets. As an initial trial, our study provides insightful guidelines for the design of future mobile network systems employing in-network intelligence (e.g., 6G).

  • Research Article
  • Cite count: 5
  • 10.1016/j.snb.2021.130538
An automated low-cost modular hardware and software platform for versatile programmable microfluidic device testing and development
  • Aug 8, 2021
  • Sensors and Actuators B: Chemical
  • Giorgio Gianini Morbioli + 3 more

  • Research Article
  • Cite count: 1
  • 10.7717/peerj-cs.2382
A multi-queue-based ECN marking strategy for multi-class QoS guarantee in programmable networks.
  • Oct 31, 2024
  • PeerJ Computer Science
  • Yazhi Liu + 3 more

Network applications are currently experiencing explosive growth, and different types of applications place increasingly varied demands on network quality of service. However, existing Explicit Congestion Notification (ECN) marking methods do not take into account the diversified Quality of Service (QoS) requirements of network applications. This article introduces a multi-queue ECN marking strategy targeting multiple QoS guarantees. The strategy utilizes virtual queues and dynamic weighted round-robin scheduling to achieve traffic partitioning in a programmable data plane. It constructs a multi-queue, multi-class QoS queuing model based on the QoS requirements of different traffic and on network conditions. The model is solved in real time to obtain the ECN marking thresholds and round-robin weights for the different queues, in order to meet the dynamic QoS requirements of different network applications. We implemented this strategy in Mininet and BMv2 and compared it with DCQCN, P4QCN, and TCN. The experimental results indicate that the strategy performs well in terms of queue length, RTT, and throughput while also ensuring fairness among traffic flows. The proposed approach is superior to DCQCN and P4QCN in terms of performance fluctuation and rapid feedback, and it exhibits notable advantages over TCN.
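
The combination of per-queue ECN marking thresholds and weighted round-robin scheduling mentioned in this abstract can be pictured with a small sketch. It is an illustrative Python model under simplifying assumptions, not the article's BMv2 implementation: thresholds and weights are fixed constants here rather than being derived from a queuing model in real time, and the `MarkedQueue` class and `weighted_round_robin` function are hypothetical names.

```python
from collections import deque


class MarkedQueue:
    """One traffic class with its own ECN threshold and scheduling weight."""

    def __init__(self, ecn_threshold: int, weight: int):
        self.packets = deque()
        self.ecn_threshold = ecn_threshold  # queue depth at which marking starts
        self.weight = weight                # packets served per scheduling round

    def enqueue(self, packet: str) -> bool:
        """Enqueue a packet; mark it if the depth on arrival has reached the threshold."""
        marked = len(self.packets) >= self.ecn_threshold
        self.packets.append((packet, marked))
        return marked


def weighted_round_robin(queues):
    """Run one scheduling round, serving up to `weight` packets from each queue."""
    sent = []
    for queue in queues:
        for _ in range(queue.weight):
            if not queue.packets:
                break
            sent.append(queue.packets.popleft())
    return sent
```

With a high-priority queue (threshold 2, weight 2) and a low-priority queue (threshold 1, weight 1), the third packet arriving at the first queue and the second packet arriving at the second queue are marked, and each round serves two packets from the first queue for every one from the second.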
