DPUs vs. SmartNICs: What storage admins need to know


Cookie-cutter environments are no longer adequate for many organizations, especially those that process large amounts of data. SmartNICs are right for some companies, but others might need hyperscale DPUs.

Network interface cards have evolved to meet increasing network demands, and there are now several types. The basic or foundational NIC is what most network and server administrators are familiar with. It commonly ships with the server and supports 1 Gbps, 10 Gbps, 25 Gbps and, sometimes, even 50 Gbps. For most standard applications -- virtualized or not -- these NICs are the least expensive option and more than adequate. They are colloquially called "dumb" NICs because they passively hand off network packet processing to the server CPU. However, as networking speeds have increased, so has the burden of packet processing on the server CPU.

The increasing popularity of even higher speeds -- such as 100 Gbps, 200 Gbps and even 400 Gbps networks -- has changed the ecosystem, especially for storage networking. These much higher speeds often place processing loads exceeding 30% on server CPUs. Every CPU cycle spent on packet processing is a cycle unavailable for application processing, which led to offload NICs.
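To put that 30% figure in perspective, the rough Python arithmetic below estimates how many cores plain kernel packet processing can consume at 100 Gbps. The frame size, per-packet cycle cost and clock speed are illustrative assumptions, not measured values.

```python
# Back-of-envelope estimate of how many CPU cores kernel packet processing
# can consume at a high line rate. All constants below are assumptions
# chosen for illustration, not benchmarked figures.

LINE_RATE_GBPS = 100      # assumed link speed
FRAME_BYTES = 1500        # assumed average frame size (standard MTU)
CYCLES_PER_PACKET = 1500  # assumed kernel TCP/IP cost per packet
CORE_GHZ = 3.0            # assumed CPU core clock

packets_per_sec = LINE_RATE_GBPS * 1e9 / (FRAME_BYTES * 8)
cycles_per_sec = packets_per_sec * CYCLES_PER_PACKET
cores_consumed = cycles_per_sec / (CORE_GHZ * 1e9)

print(f"{packets_per_sec / 1e6:.1f} Mpps -> ~{cores_consumed:.1f} cores at {CORE_GHZ} GHz")
```

Under these assumptions, a single 100 Gbps link can tie up roughly four 3 GHz cores on packet handling alone -- cores that would otherwise run application workloads.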

Offload NICs have come to market several times in the past, but this time is different. Historically, as CPUs gained transistors following Moore's law, offload NICs became an expensive and unnecessary option. However, Moore's law has slowed to a crawl over the past few years, and that slowdown has made offload NICs viable again. They typically offload network traffic functions, such as packet processing in the TCP/IP stack, which returns server CPU cycles to the applications.
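The dividing line between "dumb" and offloaded behavior is visible even on an ordinary Linux server. The minimal sketch below -- assuming a Linux host with the standard ethtool utility installed and a placeholder interface name of eth0 -- lists which TCP/IP offload features a NIC currently exposes.

```python
# Minimal sketch: inspect which TCP/IP offload features a Linux NIC exposes,
# using the standard `ethtool -k` command. Assumes a Linux host with ethtool
# installed; "eth0" is a placeholder interface name.
import subprocess

def offload_features(interface: str = "eth0") -> dict:
    """Return a mapping of offload feature name -> state, per `ethtool -k`."""
    output = subprocess.run(
        ["ethtool", "-k", interface], capture_output=True, text=True, check=True
    ).stdout
    features = {}
    for line in output.splitlines()[1:]:  # skip the "Features for eth0:" header
        if ":" in line:
            name, state = line.split(":", 1)
            features[name.strip()] = state.strip()
    return features

if __name__ == "__main__":
    for name, state in offload_features().items():
        if name in ("tcp-segmentation-offload", "generic-receive-offload", "rx-checksumming"):
            print(f"{name}: {state}")
```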

Offload NICs are useful when standard NICs cause servers to bog down. A bogged-down server runs fewer VMs or containers effectively, which means organizations need an offloading solution -- or more physical servers. It's easy to justify the higher price of offload NICs when they reduce the number of physical servers required.

Many organizations determined that offload NICs were not enough. This led to SmartNICs, which do more than just offload the TCP/IP stack. They're somewhat more flexible than offload NICs, and they have a more programmable pipeline. SmartNICs offload more network processing from the server CPU. In fact, they have their own CPU, memory and OS. What they offload varies by vendor, but SmartNICs can offload tasks such as network compression and decompression, encryption and decryption, and even security.

SmartNICs usually cost more than offload NICs. But when servers bog down again under compression, decompression, encryption and decryption workloads, a SmartNIC becomes an obvious choice over adding more servers.

Still, as higher networking speeds have proliferated across IT ecosystems -- especially in storage networking -- IT admins have been looking for more. The next turn of the screw has been the data processing unit (DPU).

DPUs are a major evolution of the SmartNIC. The DPU includes the offload capabilities, flexible programmable pipeline, processing and CPU of SmartNICs. But the DPU itself is the network infrastructure endpoint, not the server it resides in. DPUs include custom silicon and, in some cases, field-programmable gate arrays or application-specific integrated circuits. A DPU can support much more than a SmartNIC, including networking based on P4 programmable pipelines, stateful Layer 4 firewalls, Layer 2/Layer 3 networking, Layer 4 load balancing, storage routing, storage analytics and VPNs. DPU functionality varies by vendor. Some of the major players in the market in 2022 are Fungible, AMD Pensando and Marvell.
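The match-action table is the core abstraction behind P4-style programmable pipelines. The Python sketch below is purely conceptual -- the header fields, table entries and port number are illustrative, not a real P4 program -- but it shows the kind of per-packet decision a DPU firewall or forwarding stage makes in hardware.

```python
# Conceptual sketch of the match-action table abstraction that P4-style
# programmable pipelines (and DPU firewall/forwarding stages) build on.
# Fields, entries and actions are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class PacketHeader:
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"

# Each table entry: (match fields, action). An empty match dict is a wildcard.
FIREWALL_TABLE = [
    ({"protocol": "tcp", "dst_port": 22}, "drop"),       # block inbound SSH
    ({"protocol": "tcp", "dst_port": 4420}, "forward"),  # allow NVMe/TCP storage traffic
    ({}, "drop"),                                        # default action
]

def apply_pipeline(pkt: PacketHeader) -> str:
    """Return the first matching action, as a hardware pipeline stage would."""
    for match, action in FIREWALL_TABLE:
        if all(getattr(pkt, field) == value for field, value in match.items()):
            return action
    return "drop"

print(apply_pipeline(PacketHeader("10.0.0.5", "10.0.0.9", 4420, "tcp")))  # forward
print(apply_pipeline(PacketHeader("10.0.0.5", "10.0.0.9", 22, "tcp")))    # drop
```

In a DPU, tables like this are evaluated in silicon at line rate, so the server CPU never sees the packets that the pipeline drops or routes.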

DPUs help support high-performance storage. They reduce shared storage networking issues and provide storage latencies equivalent to embedded NVMe storage media within servers. That's a significant accomplishment, but it still might not be enough for high-performance storage networking. The issue this time is the switched network.

Traditional switched networks cannot support massive scale or hyperscale -- a scale that didn't exist until the advent of hyperscalers, such as Meta, and public cloud service providers, such as AWS, Azure, Google Cloud Platform, Oracle, IBM and Alibaba. This shortcoming of switches has started to rear its ugly head as organizations have discovered the intrinsic value of their data, which they analyze and mine with analytics databases, machine learning and AI.

The amount of data being analyzed ranges from petabytes to exabytes -- volumes that would have been unheard of just a few years ago. In these data analytics processes, latency matters a lot. The leaf-spine architecture of switches ultimately adds too much latency at hyperscale, especially tail latency, and unpredictable latencies are even worse. This brought about the development of hyperscale DPUs.
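A short, illustrative calculation shows why tail latency dominates at hyperscale: a query fanned out to many storage nodes is only as fast as its slowest response, so rare per-node outliers end up hitting nearly every request. The fan-out sizes below are assumptions chosen for illustration.

```python
# Illustrative sketch of tail-latency amplification under fan-out.
# Assumes independent per-node latencies; fan-out values are examples.

def chance_of_hitting_tail(fan_out: int, tail_quantile: float = 0.99) -> float:
    """Probability that at least one of `fan_out` parallel responses
    lands beyond the per-node tail quantile."""
    return 1.0 - tail_quantile ** fan_out

for n in (1, 10, 100, 1000):
    print(f"fan-out {n:>4}: {chance_of_hitting_tail(n):.1%} of requests see a p99 outlier")
```

At a fan-out of 100, roughly 63% of requests encounter at least one p99-level response under these assumptions, which is why shaving the tail matters more than improving the median.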

DPUs are ideal for storage networking. They cost more than the other NICs, but they do a lot more and can make storage systems more efficient with lower latencies. That can reduce the number of storage controllers required for a given high-performance application.

Hyperscale DPUs are extremely programmable, and they eliminate east-west switching. They do not eliminate north-south switching. They're generally deployed in a torus mesh and, potentially, a dragonfly or even a slim fly topology. Rockport Networks is one hyperscale DPU vendor.
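To illustrate what eliminating east-west switching looks like, the conceptual sketch below computes the directly cabled neighbors of a node in a small 3D torus -- the wraparound links mean traffic hops node to node instead of traversing a leaf-spine switch tier. The 4x4x4 dimensions are an arbitrary example, not any vendor's design.

```python
# Conceptual sketch of direct east-west connectivity in a 3D torus: each
# node links to its six nearest neighbors with wraparound, so traffic hops
# node-to-node rather than through a switch tier. Dimensions are illustrative.

DIMS = (4, 4, 4)  # assumed torus dimensions

def torus_neighbors(node, dims=DIMS):
    """Return the directly cabled neighbors of `node` in a wraparound mesh."""
    neighbors = []
    for axis in range(len(dims)):
        for step in (-1, 1):
            coords = list(node)
            coords[axis] = (coords[axis] + step) % dims[axis]
            neighbors.append(tuple(coords))
    return neighbors

print(torus_neighbors((0, 0, 0)))
# [(3, 0, 0), (1, 0, 0), (0, 3, 0), (0, 1, 0), (0, 0, 3), (0, 0, 1)]
```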

Hyperscale DPUs do everything that DPUs do and more. They reduce tail latencies and make latencies more predictable, yet they do not necessarily cost more than DPUs. And they eliminate a lot of cost by significantly reducing the switch infrastructure.
