Immersion GPU System Provides AI Horsepower for Frontera

Sept. 5, 2018
The new Frontera supercomputer at the University of Texas combines several approaches to liquid cooling, including GPUs immersed in dielectric coolant and x86 servers that use direct water cooling to the processor.

What might the artificial intelligence revolution look like in the data center? If one new system is any indication, it could look like GPUs immersed in dielectric coolant, supporting water-cooled x86 servers.

That’s the vision put forward by the creators of Frontera, a new $60 million supercomputer to be built at the Texas Advanced Computing Center (TACC) in Austin. It is expected to be the most powerful supercomputer at any U.S. university, and to continue TACC’s history of deploying new systems that rank among the top 10 on the Top500 list of the world’s leading supercomputers.

The vast majority of data centers continue to cool IT equipment using air, while liquid cooling has been used primarily in high-performance computing (HPC). With the growing use of artificial intelligence, more companies are facing data-crunching challenges that resemble those seen by the HPC sector, which could make liquid cooling relevant for a larger pool of data center operators.

The design for Frontera reflects the leading edge of HPC efficiency. Frontera is Spanish for “frontier,” and the new supercomputer will help advance the frontiers of liquid cooling with a hybrid design. The main system will combine Dell EMC servers with x86 Intel processors and water-cooling systems from CoolIT, while a smaller system will use NVIDIA GPUs (graphics processing units) immersed in a tank of liquid coolant from GRC (previously Green Revolution Cooling). DataDirect Networks will contribute the primary storage system, and Mellanox will provide the high-performance interconnect for Frontera.

Applying Immersion Benefits to GPUs

Anticipated early projects for Frontera include analyses of particle collisions from the Large Hadron Collider, global climate modeling, improved hurricane forecasting, and “multi-messenger” astronomy research using gravitational waves and electromagnetic radiation.

“Many of the frontiers of research today can be advanced only by computing, and Frontera will be an important tool to solve grand challenges that will improve our nation’s health, well-being, competitiveness and security,” said Dan Stanzione, TACC executive director.

A GRC immersion cooling container in action. (Source: GRC)

TACC has been a leader in the use of immersion cooling, which sinks servers in liquid to cool the components, and began working with Austin-based neighbor GRC in 2009. In 2017 this collaboration was expanded to immersion cooling for NVIDIA GPUs, test-driving a system created by server vendor Supermicro. Using immersion cooling with GPUs is a fairly recent phenomenon, but may attract interest as more companies adopt GPUs for AI and other parallel processing challenges.

“The cost savings that immersion cooling enables (on the hardware side) are extremely impressive,” TACC’s Stanzione said of the 2017 project. “Being early adopters of GRC’s immersion cooling system we have seen the technology mature rapidly over the years. And with the growing power and computing needs of AI and machine learning applications, especially with hotter and hotter GPUs, cooling is even more important for reliability.”

AI Data Crunching Boosts Density

New hardware for AI workloads is packing more computing power into each piece of equipment, boosting the power density – the amount of electricity used by servers and storage in a rack or cabinet – and the accompanying heat. The trend is challenging traditional practices in data center cooling, and prompting data center operators to adopt new strategies and designs.

The alternative to air cooling is to bring liquid closer to the chips and components. Some vendors integrate water cooling into the rear door of a rack or cabinet. Others immerse servers in tanks of coolant, or use enclosed systems of pipes and plates that bring cooling inside the chassis and directly to the processor.

An example of what the CoolIT direct liquid cooling system for Frontera will look like. (Source: CoolIT)

The latter approach will be used by the CPU-powered component of Frontera, which will feature a CoolIT DLC (direct liquid cooling) system adapted for the Dell EMC servers. CoolIT recently shared an image on its social channels showing what a prototype of the Frontera system will look like.

“The new Frontera system represents the next phase in the long-term relationship between TACC and Dell EMC, focused on applying the latest technical innovation to truly enable human potential,” said Thierry Pellegrino, vice president of Dell EMC High Performance Computing. “The substantial power and scale of this new system will help researchers from Austin and across the U.S. harness the power of technology to spawn new discoveries and advancements in science and technology for years to come.”

“Accelerating scientific discovery lies at the foundation of the TACC’s mission, and enabling technologies to advance these discoveries and innovations is a key focus for Intel,” said Patricia Damkroger, Vice President in Intel’s Data Center Group and General Manager, Extreme Computing Group. “We are proud that the close partnership we have built with TACC will continue with TACC’s selection of next-generation Intel Xeon Scalable processors as the compute engine for their flagship Frontera system.”

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
