OCP 2024 Spotlight: Meta Debuts 140 kW Liquid-Cooled AI Rack; Google Eyes Robotics to Muscle Hyperscaler GPUs
The Open Compute Project's blockbuster 2024 OCP Global Summit (Oct. 15-17) drew more than 7,000 attendees eager to learn about the latest liquid cooling developments in the wake of the data center AI tsunami of roughly the past two years.
Fittingly enough, as one of the event's central highlights, Meta Engineering shared its soon-to-be-released Catalina rack design for high-density AI computing.
Catalina is built to support the latest NVIDIA GB200 Grace Blackwell Superchip to ensure capacity for the growing demands of modern AI infrastructure.
Meta notes that the growing power demands of GPUs mean open rack solutions must support higher power capacity. The Catalina platform answers this with the Orv3, a high-power rack (HPR) capable of supporting up to 140 kW.
As unveiled to the OCP technical community, Meta billed Catalina as the company's newest high-powered rack for AI workloads. It is based on the NVIDIA Blackwell platform's full rack-scale solution, with a design focused on modularity and flexibility.
The full platform is liquid-cooled and consists of a power shelf that supports a compute tray, switch tray, the Orv3 HPR, the associated Wedge 400 fabric switch, a management switch, battery backup unit, and a rack management controller.
Meta said it aims for Catalina’s modular, open design to empower others to customize the rack to meet their specific AI workloads, while leveraging both existing and emerging industry standards.
"Scaling AI at this speed requires open hardware solutions," the Meta Engineering team wrote on its blog. "Developing new architectures, network fabrics, and system designs is the most efficient and impactful when we can build it on principles of openness. By investing in open hardware, we unlock AI’s full potential and propel ongoing innovation in the field."
Hyperscaler Robots to Muscle Heavier GPU Racks
This year's OCP spotlight announcements also pointed to a growing focus on robotics among the hyperscalers as their networks grow.
Google revealed a specific interest in using robotics to manage GPU racks, which are much heavier than many traditional cloud racks, creating new safety concerns for data center staff.
During his keynote at the 2024 OCP Global Summit, Google VP/Engineering Fellow Partha Ranganathan shared video footage of large robotic units moving through a Google data center, managing hard drives and equipment.
The video is emblematic of the broader OCP data center robotics and automation initiatives, whereby Google is working with Microsoft and Meta to "explore common operational pain points among hyperscalers and highlight specific use-cases for physical automation solutions in the near, mid, and long-term," Ranganathan said.
"Robotics can be very profoundly transformative in how you think about data center operations scaling much, much more, while also having safety and reliability," said Ranganathan, who cited opportunities for robots in moving data center racks and material, as well as monitoring, repair and servicing, and media management.
Matt Vincent
A B2B technology journalist and editor with more than two decades of experience, Matt Vincent is Editor in Chief of Data Center Frontier.