Immersion GPU System Provides AI Horsepower for Frontera

The new Frontera supercomputer at the University of Texas combines several approaches to liquid cooling, including GPUs immersed in dielectric coolant and x86 servers using direct water cooling to the processor.
Sept. 5, 2018

What might the rise of artificial intelligence look like in the data center? If one new system is any indication, it could look like GPUs immersed in dielectric coolant, supporting water-cooled x86 servers.

That’s the vision put forward by the creators of Frontera, a new $60 million supercomputer to be built at the Texas Advanced Computing Center (TACC) in Austin. It is expected to be the most powerful supercomputer at any U.S. university, and continue the TACC’s history of deploying new systems ranking among the top 10 on the Top500 list of the world’s leading supercomputers.

The vast majority of data centers continue to cool IT equipment using air, while liquid cooling has been used primarily in high-performance computing (HPC). With the growing use of artificial intelligence, more companies are facing data-crunching challenges that resemble those seen by the HPC sector, which could make liquid cooling relevant for a larger pool of data center operators.

The design for Frontera reflects the leading edge of HPC efficiency. Frontera is Spanish for “frontier,” and the new supercomputer will help advance the frontiers of liquid cooling, with a hybrid system that will combine Dell EMC servers with x86 Intel processors and water-cooling systems from CoolIT, and a smaller system using NVIDIA GPUs (graphics processing units) immersed in a tank of liquid coolant from GRC (previously Green Revolution Cooling). Data Direct Networks will contribute the primary storage system, and Mellanox will provide the high-performance interconnect for Frontera.

Applying Immersion Benefits to GPUs

Anticipated early projects for Frontera include analyses of particle collisions from the Large Hadron Collider, global climate modeling, and improved hurricane forecasting and “multi-messenger” astronomy research using gravitational waves and electromagnetic radiation.

“Many of the frontiers of research today can be advanced only by computing, and Frontera will be an important tool to solve grand challenges that will improve our nation’s health, well-being, competitiveness and security,” said Dan Stanzione, TACC executive director.

A GRC immersion cooling container in action. (Source: GRC)

TACC has been a leader in the use of immersion cooling, which sinks servers in liquid to cool the components, and began working with Austin-based neighbor GRC in 2009. In 2017 this collaboration was expanded to immersion cooling for NVIDIA GPUs, test-driving a system created by server vendor Supermicro. Using immersion cooling with GPUs is a fairly recent phenomenon, but may attract interest as more companies adopt GPUs for AI and other parallel processing challenges.

“The cost savings that immersion cooling enables (on the hardware side) are extremely impressive,” TACC’s Stanzione said of the 2017 project. “Being early adopters of GRC’s immersion cooling system we have seen the technology mature rapidly over the years. And with the growing power and computing needs of AI and machine learning applications, especially with hotter and hotter GPUs, cooling is even more important for reliability.”

AI Data Crunching Boosts Density

New hardware for AI workloads is packing more computing power into each piece of equipment, boosting the power density – the amount of electricity used by servers and storage in a rack or cabinet – and the accompanying heat. The trend is challenging traditional practices in data center cooling, and prompting data center operators to adopt new strategies and designs.

The alternative to air cooling is to bring liquids into the server chassis to cool chips and components. Some vendors integrate water cooling into the rear door of a rack or cabinet. Liquid cooling can also be done by immersing servers in tanks of coolant, or through enclosed systems featuring pipes and plates that bring cooling inside the chassis and directly to the processor.

An example of what the CoolIT direct liquid cooling system for Frontera will look like. (Source: CoolIT)

The latter approach will be used by the CPU-powered component of Frontera, which will feature a CoolIT DLC system adapted for the Dell EMC servers. CoolIT recently shared an image on its social channels showing what a prototype of the Frontera system will look like.

“The new Frontera system represents the next phase in the long-term relationship between TACC and Dell EMC, focused on applying the latest technical innovation to truly enable human potential,” said Thierry Pellegrino, vice president of Dell EMC High Performance Computing. “The substantial power and scale of this new system will help researchers from Austin and across the U.S. harness the power of technology to spawn new discoveries and advancements in science and technology for years to come.”

“Accelerating scientific discovery lies at the foundation of the TACC’s mission, and enabling technologies to advance these discoveries and innovations is a key focus for Intel,” said Patricia Damkroger, Vice President in Intel’s Data Center Group and General Manager, Extreme Computing Group. “We are proud that the close partnership we have built with TACC will continue with TACC’s selection of next-generation Intel Xeon Scalable processors as the compute engine for their flagship Frontera system.”

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.