In-network Computing with SmartNICs for HPC Applications
Supercomputing Conference Tutorial
St. Louis, Missouri
November 16, 2025
Overview
Data Processing Units (DPUs) are programmable processors designed to offload and accelerate infrastructure workloads and data processing. This tutorial introduces the NVIDIA BlueField-3 DPU and examines its programming models, including the DOCA SDK, P4, and DPDK. It also demonstrates high-performance computing (HPC) workloads that can be offloaded to the DPU.
Audience
This tutorial is intended for HPC users, application developers, researchers and developers of programming models and communication libraries, as well as tool developers who are interested in leveraging next-generation SmartNICs for HPC.
Tutorial Goals
By participating in this comprehensive tutorial, attendees will gain:
- An understanding of asynchronous programmable engines, such as SmartNICs, and their evolution in HPC architectures, including an overview of current efforts by major vendors such as NVIDIA, Intel, and AMD.
- Familiarity with programming models for SmartNICs, including vendor-supported frameworks such as P4 and DOCA, OpenMP offloading, and communication offloading with MPI.
- Practical knowledge of leveraging SmartNICs for in-line packet processing, communication offload optimizations, storage optimizations, and algorithmic changes in applications.
- Real-world application experiences and mini-app case studies that leverage SmartNICs and DPUs.
- Hands-on experience with exercises covering a variety of application examples, including tutorials on P4 and DOCA features, blocking and nonblocking MPI collective offload operations, OpenMP offload for DPUs, and using accelerators like Data Path Accelerators (DPAs).
Prerequisites
Attendees need Internet connectivity and a web browser to access the online virtual platform. Each attendee will be provided with an account on USC’s NETLAB system: https://netlab.cec.sc.edu/
Agenda
Sunday, November 16
| Time (GMT) | Topic | Presenter |
|---|---|---|
| 8:30 - 8:40 | Introduction and Attendee Survey; Logistics | Jeffrey Young (Gatech), Elie Kfoury (USC), Jorge Crichigno (USC), Richard Graham (NVIDIA), Oscar Hernandez (ORNL), Antonio Peña (BSC), Mariam Kiran (ORNL), Aaron Jezghani (Gatech) |
| 8:40 - 9:00 | Communication Offloading and SmartNIC Overview | Jeffrey Young, Elie Kfoury |
| 9:00 - 9:30 | SmartNIC Use Cases | Richard Graham, Jeffrey Young, Oscar Hernandez, Mariam Kiran |
| 9:30 - 10:00 | Supporting Infrastructure SW - DOCA and P4, HPC Programming Approaches | Richard Graham, Elie Kfoury, Jorge Crichigno, Antonio Peña |
| 10:00 - 10:30 | BREAK | |
| 10:30 - 11:15 | Supporting Infrastructure SW - DOCA and P4, HPC Programming Approaches | Elie Kfoury, Jorge Crichigno |
| 11:15 - 11:45 | Hands-on with DOCA and P4, DPA example | Jeffrey Young, Elie Kfoury, Jorge Crichigno, Richard Graham, Oscar Hernandez, Antonio Peña, Mariam Kiran, Aaron Jezghani |
| 11:45 - 12:00 | Tutorial Survey / Wrap-up | |
| Slides [ppt, pdf] | | |