Selecting the Right Coolant Distribution Unit for Your AI Data Center
Selecting the right coolant distribution unit (CDU) is essential for achieving peak performance, efficiency, and reliability in your data center, especially with today’s AI racks producing record-high heat loads. To make an informed decision, evaluate four areas: your facility, the CDU’s performance capabilities, its features (redundancy, monitoring, and serviceability), and the manufacturer’s location and practices. This article outlines how to navigate these considerations and select the best CDU for your needs.
Facility Considerations
Choosing a CDU starts with a thorough understanding of your facility's design and constraints. Consider the following:
- Facility Water Availability: The availability of facility water drives the choice between the two main categories of row-based CDUs: liquid-to-liquid and liquid-to-air. Where facility water is limited, liquid-to-air CDUs are preferable because they are easier to deploy. If facility water is already plumbed into the area where the CDUs will be installed (or can feasibly be brought in), liquid-to-liquid CDUs are preferable due to their increased capacity.
- Physical Deployment: The size and weight of the CDU must align with your building’s constraints, such as elevator capacity, floor loading limits, and available space. Proper assessment ensures smooth installation and operation, avoiding costly modifications or downtime. CDUs are designed to be installed among the server racks, within the data center hot/cold aisle, or outside the data center halls in the facility or mechanical room.
- Secondary Fluid Network (SFN) Buildout: Careful planning of the connection between the row-based CDUs and the racks is crucial because it affects the overall system pressure drop, serviceability, and scalability of the system.
CDU Performance
Two main performance points to consider are secondary flow rate and overall heat load capability.
- Secondary Flow Rate: The CDU’s flow rate, driven by pump performance, directly affects how efficiently heat is removed from the chip. As Thermal Design Powers (TDPs) rise, so does the demand for higher flow rates. A typical target a CDU must meet to cool AI systems is 1.5 liters per minute (LPM) per kW of heat dissipated. System pressure drops are trending upward along with flow rates, so ensuring adequate pump performance is crucial.
- Heat Load Capability: The performance of heat exchangers or coils determines the CDU's ability to reject heat from the Direct Liquid Cooling (DLC) system back to the facility. Insufficient capacity or efficiency at the heat exchanger or coil drives up the temperature of the fluid going to the IT equipment and reduces the heat load that can be handled. An important heat load metric to consider is approach temperature, which is the temperature difference between the technical fluid cooling the server components and the supply fluid provided by the facility to cool the CDU. The lower the approach temperature at a given heat load, the better the cooling efficiency of a CDU.
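The two metrics above lend themselves to a quick back-of-the-envelope check. The Python sketch below is illustrative only: the function names, the example rack count and rack power, and the example temperatures are assumptions made for this example, not values from any CDU datasheet. It applies the 1.5 LPM per kW target, estimates the coolant temperature rise from a simple energy balance (assuming water-like fluid properties, which will differ for glycol mixtures), and computes approach temperature as defined above.

```python
LPM_PER_KW_TARGET = 1.5  # typical flow target cited in the text for AI systems


def required_secondary_flow_lpm(heat_load_kw, lpm_per_kw=LPM_PER_KW_TARGET):
    """Secondary flow rate (LPM) needed to meet a per-kW flow target."""
    if heat_load_kw < 0:
        raise ValueError("heat load must be non-negative")
    return heat_load_kw * lpm_per_kw


def coolant_temp_rise_c(heat_load_kw, flow_lpm,
                        density_kg_per_l=1.0, cp_j_per_kg_k=4186.0):
    """Fluid temperature rise from Q = m_dot * cp * dT.

    Defaults assume plain water (an assumption); real technical fluids
    such as glycol mixtures have different density and specific heat.
    """
    if flow_lpm <= 0:
        raise ValueError("flow rate must be positive")
    mass_flow_kg_s = flow_lpm / 60.0 * density_kg_per_l
    return heat_load_kw * 1000.0 / (mass_flow_kg_s * cp_j_per_kg_k)


def approach_temperature_c(secondary_supply_c, facility_supply_c):
    """Approach temperature: technical fluid supply minus facility supply."""
    return secondary_supply_c - facility_supply_c


if __name__ == "__main__":
    # Hypothetical row: 8 racks at 120 kW each.
    total_kw = 8 * 120.0
    flow = required_secondary_flow_lpm(total_kw)
    rise = coolant_temp_rise_c(total_kw, flow)
    print(f"Heat load: {total_kw:.0f} kW, required flow: {flow:.0f} LPM")
    print(f"Estimated fluid temperature rise: {rise:.1f} C")
    # Hypothetical temperatures: 32 C to the IT, 28 C facility supply.
    print(f"Approach temperature: {approach_temperature_c(32.0, 28.0):.1f} C")
```

Under the water assumption, 1.5 LPM per kW corresponds to a fluid temperature rise of roughly 9.6 °C across the IT load, which is one way to see why falling below the flow target raises return temperatures. When comparing CDUs, the approach temperature should be evaluated at the same heat load and flow conditions.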
Redundancy, Monitoring, and Serviceability
When a DLC system goes down, there is the potential to lose all supported systems within a matter of minutes, so it is important to consider the following features of a CDU solution:
- Redundancy: Prevent downtime by ensuring your CDUs have redundancy both within each unit’s critical components (for example, pumps, filters, power supplies, and sensors) and among multiple CDUs working together. An integrated ultracapacitor and automatic transfer switch (ATS) also help ensure uninterrupted operation.
- Monitoring and Interfaces: Advanced monitoring systems alert operators to potential issues in real time. Interfaces like Modbus, SNMP, SFTP, SSH, and SMTP provide comprehensive oversight and control.
- Serviceability: Simple design and easy access to critical components ensure swift maintenance and minimal disruption, maintaining high system uptime and efficiency. Being able to service a CDU within the data center aisle (from the front or the back), without removing side panels or pulling the CDU out completely, increases serviceability.
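Redundancy among multiple CDUs is often planned as "N+1": enough units to carry the full heat load, plus one spare. The sketch below is a simplified illustration of that sizing arithmetic, not a vendor sizing method. The function names and the example figures (a 2,400 kW load and 1,000 kW per CDU) are hypothetical, and it assumes the load can be shared evenly across the CDUs on a common secondary loop.

```python
import math


def cdus_required(total_heat_load_kw, cdu_capacity_kw, spare_units=1):
    """Number of CDUs for N + spare redundancy (N+1 by default)."""
    if cdu_capacity_kw <= 0:
        raise ValueError("CDU capacity must be positive")
    if total_heat_load_kw < 0:
        raise ValueError("heat load must be non-negative")
    base_units = math.ceil(total_heat_load_kw / cdu_capacity_kw)
    return base_units + spare_units


def survives_single_failure(total_heat_load_kw, cdu_capacity_kw, installed_units):
    """True if the remaining CDUs can carry the load after one unit fails."""
    return (installed_units - 1) * cdu_capacity_kw >= total_heat_load_kw


if __name__ == "__main__":
    # Hypothetical example: 2,400 kW of IT load, 1,000 kW per CDU.
    load_kw, capacity_kw = 2400.0, 1000.0
    units = cdus_required(load_kw, capacity_kw)
    print(f"CDUs needed for N+1: {units}")
    print(f"Survives one failure: {survives_single_failure(load_kw, capacity_kw, units)}")
```

In practice, the usable capacity per CDU depends on facility water temperature, flow rate, and approach temperature, so the capacity figure should come from the manufacturer's performance data at your operating conditions.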
Manufacturing and Sourcing
Consider a CDU manufacturer’s sourcing practices as well as its location:
- Manufacturing Location: When CDUs are manufactured in Canada, lead times are significantly more stable. Additionally, check how closely the manufacturer’s headquarters and manufacturing facility are able to collaborate. Proximity between the two improves collaboration and communication, which in turn improves product quality and reliability.
- Integrated Engineering Approach: Purchasing from a company deeply involved in engineering DLC systems ensures comprehensive design and manufacturing processes for CDUs, enhancing overall system integration and performance.
Choose the Best CDU for Your Data Center
Selecting the appropriate CDU for your AI data center involves carefully considering several factors, including facility design, performance metrics, redundancy, serviceability and manufacturing practices. By understanding these essential aspects, you can make an informed decision that optimizes cooling performance and reliability in your data center.
Ian Reynolds
Ian Reynolds is a Senior Project Engineer with CoolIT Systems, focusing on solution design and deployment of liquid cooling solutions for high-performance computing (HPC), AI data centers and enterprise environments globally.
CoolIT Systems specializes in scalable liquid cooling solutions for the world’s most demanding computing environments, partnering with global processor and server design leaders to develop the most efficient and reliable liquid cooling solutions for their leading-edge products. Enable peak AI performance and reliability today with CoolIT’s high-density liquid-to-liquid CDU, the CHx1000.