For AI and HPC, Data Center Liquid Cooling Is Now

April 5, 2024
No longer on the horizon, liquid cooling technology moves front and center thanks to NVIDIA and massive data center power demands.

Here at Data Center Frontier, we’ve been talking about the need for, and the issues with, liquid cooling and support for high rack densities for quite a while.

While no one has really argued that either was unnecessary, much of the discussion has revolved around whether both were strictly niche needs, and still a ways down the pike.

Both of those are reasonable points. Much of what goes in at large-scale data centers is relatively stable. What worked last year will continue to work next year, and major changes aren’t going to happen outside of normal hardware and facility replacement cycles.

That being said, the rapidly growing niche market surrounding AI is making people rethink their development plans.

In many cases the ability to support AI and HPC operations in the data center is no longer an add-on, but an integral part of data center planning. And being able to support liquid cooling solutions means getting a clear understanding of what’s happening right now.

The View from NVIDIA GTC

At NVIDIA’s GTC conference in March 2024, one of the hardware announcements was the company’s latest DGX AI supercomputer, built around the liquid-cooled NVIDIA GB200 NVL72 rack-scale system. Each NVL72 rack contains 18 compute trays, for a total of 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs, connected by fifth-generation NVIDIA NVLink switches.
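The per-rack totals follow directly from the NVL72’s tray layout. As a quick sanity check, assuming the publicly described configuration of 18 compute trays, each holding two GB200 Superchips, with each Superchip pairing one Grace CPU with two Blackwell GPUs:

```python
# Sanity-check the GB200 NVL72 per-rack totals from its publicly
# described layout: 18 compute trays, each with two GB200 Superchips,
# and each Superchip pairing 1 Grace CPU with 2 Blackwell GPUs.
trays_per_rack = 18
superchips_per_tray = 2
cpus_per_superchip = 1
gpus_per_superchip = 2

superchips = trays_per_rack * superchips_per_tray     # 36 Superchips
grace_cpus = superchips * cpus_per_superchip          # 36 CPUs
blackwell_gpus = superchips * gpus_per_superchip      # 72 GPUs (hence "NVL72")

print(grace_cpus, blackwell_gpus)  # 36 72
```

The "72" in the product name refers to the GPU count per NVLink domain.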

While NVIDIA didn’t announce official power consumption figures, industry estimates place the requirement at roughly 120 kW per rack, several times what a typical air-cooled facility is designed to deliver, and it is entirely possible those numbers are conservative.
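To put per-rack figures like these in perspective, a quick back-of-the-envelope calculation shows the coolant flow a direct-to-chip water loop would need. The 100 kW load, the 10 °C coolant temperature rise, and the assumption that all rack power ends up as heat in the loop are illustrative assumptions for the sketch, not NVIDIA or vendor figures:

```python
# Back-of-the-envelope coolant flow for a direct-to-chip water loop.
# Illustrative assumptions (not vendor figures): the entire rack load
# becomes heat in the loop, water coolant, 10 C inlet-to-outlet rise.
rack_load_w = 100_000        # illustrative 100 kW rack
cp_water = 4186              # J/(kg*K), specific heat of water
rho_water = 998              # kg/m^3, density of water
delta_t = 10                 # K, coolant temperature rise across the rack

# Heat balance: Q = m_dot * cp * delta_T  ->  m_dot = Q / (cp * delta_T)
mass_flow = rack_load_w / (cp_water * delta_t)          # kg/s
volume_flow_lpm = mass_flow / rho_water * 1000 * 60     # liters per minute

print(f"{mass_flow:.2f} kg/s, about {volume_flow_lpm:.0f} L/min")
```

Roughly 2.4 kg/s, or on the order of 145 liters per minute, through a single rack; doubling the load or halving the allowed temperature rise doubles the required flow, which is why facility-level plumbing becomes a first-class design concern at these densities.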

If you are a hyperscaler or a large colocation provider, you’re already providing high density racks and liquid cooling for some portion of your data center. But what about everybody else?

Wholesale changes to your physical infrastructure are expensive and unlikely in the short term. Many vendors are aware of that and are offering ways to support your own AI infrastructure, from a couple of rack units ("U") in an existing rack to full support at scale. Many of these vendors announced products to fit just these needs at GTC.

Is It All About the Racks?

In its blog, NVIDIA has identified many of the vendors planning to introduce hardware to fit these needs. And quite a few of the system vendors were demonstrating their complete rack hardware on the show floor.

NVIDIA had announced prior to the event that its partners would showcase more than 500 servers featuring the NVIDIA GH200 Grace Hopper Superchip, spread across 18 racks in the MGX pavilion at the event.

 

Supermicro demonstrating its rack solution at NVIDIA GTC.

So Many Options Will Give Customers Their Favorite Thing: Choice

But there is an even broader selection of companies making their mark specifically in the liquid cooling market, showcasing their capabilities to take on the highest-density and most power-intensive applications. Some examples of the different technologies showcased included:

  • At the chip level, ZutaCore made a significant impression at GTC with its direct-to-chip, waterless, two-phase liquid cooling system designed for AI and HPC workloads. Partnering with a broad selection of vendors, from Dell to Intel to Rittal, to bring its cooling technology to those companies’ HPC and AI solutions, ZutaCore could be a standard bearer for how direct-to-chip cooling solutions will impact the industry.
  • Quanta Cloud Technology was there with the latest iteration of its QCT CoolRack, a rack-level direct-to-chip cooling solution. The company announced that one of its intelligent liquid cooling rack systems can support 16 of its liquid-cooled server systems, each with two GH200 Superchips.
  • Wiwynn, which also announced rack-level AI solutions supporting the latest Superchips and high-density computing, drew attention to its purpose-built liquid cooling management system, the UMS 100 (Universal Management System): a modular, open design that works with liquid cooling environments ranging from racks to immersion systems, with a focus on real-time monitoring and cooling energy optimization.
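What a cooling management layer optimizes can be made concrete with a simple efficiency metric: cooling power spent per watt of IT load, a partial-PUE-style ratio. The function and telemetry samples below are a generic illustration of the idea, not Wiwynn’s API or algorithm:

```python
# Generic illustration of a metric a liquid cooling management system
# might monitor in real time: cooling overhead, i.e. cooling power
# (pumps, CDUs, fans) as a fraction of IT power. The telemetry
# samples are made up for the example.
def cooling_overhead(it_power_w: float, cooling_power_w: float) -> float:
    """Return cooling power as a fraction of IT power."""
    return cooling_power_w / it_power_w

# Hypothetical samples: (IT load in W, pump + CDU + fan power in W)
samples = [(95_000, 4_200), (102_000, 4_600), (88_000, 3_900)]

ratios = [cooling_overhead(it, cool) for it, cool in samples]
avg = sum(ratios) / len(ratios)
print(f"average cooling overhead: {avg:.1%}")
```

Driving this ratio down, while keeping chip temperatures inside their operating envelope, is the real-time optimization problem such management systems are built to solve.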

As this small selection of three vendors indicates, research and development into advanced liquid cooling systems continues across many technology areas.

The use cases range from designing liquid cooling into new data centers from day one, to retrofitting existing data centers, to deploying localized AI server implementations almost anywhere.

Bottom line: The effort to make liquid-cooled solutions practical across the entire market is happening quickly, and is no longer a stumbling block for putting an AI solution where your business needs it.

 


About the Author

David Chernicoff

David Chernicoff is an experienced technologist and editorial content creator who sees the connections between technology and business, figures out how to get the most from both, and can explain the needs of business to IT and of IT to business.
