Pushing the Boundaries of Air Cooling in High Density Environments

As power densities in data center cabinets grow, so do the challenges of keeping IT equipment cool. While liquid cooling is often described as the only solution for the future, others are working to extend the capabilities of air cooling. This launches our article series on high density IT cooling.

Data centers have seen an ongoing increase in average power density, especially over the past decade. Moreover, the power density per IT cabinet has seen an even greater increase as IT equipment (ITE) power density continues to rise to levels that seemed extreme only a few years ago but are now the new normal. Major ITE manufacturers now offer standard “commodity” 1U servers with 800 to 1,800 watt redundant power supplies, which can require more than 1,000 watts of cooling.

This trend is being driven by processor and IT equipment manufacturers, which in turn are being held back because most mainstream data centers are hard pressed or unable to support more than 10-15 kW per rack. Processor power levels are escalating from 100 to 250 watts, and chip manufacturers have product roadmaps for CPUs and GPUs that are expected to exceed 500 watts per processor in the next few years.

This trend has left many wondering whether air-cooled ITE has reached its power limits and whether liquid cooling is really the only long-term solution. Currently, some HPC IT equipment is available in liquid-cooled or air-cooled versions; however, the bulk of standard off-the-shelf ITE is still air cooled, and the majority of new mainstream data centers continue to be designed for air-cooled ITE.

In reality, there are multiple issues that influence how high the power density in air-cooled IT equipment in standard cabinets can be pushed. This has led to renewed efforts by ITE manufacturers and data center operators to push the boundaries of air cooling.

The basic principles of data center cooling are well known: remove the heat load generated by IT equipment and transfer it via one or more physical media (air, liquids, or solids) and one or more forms of thermo-mechanical systems to reject it out of the facility. However, as the power densities of IT processors and other computing-related components have increased, this process has become more difficult to accomplish effectively and efficiently. For purposes of this report, we will discuss air and liquid cooling, with a primary focus on air-cooled IT equipment.

Power Density – Watts per What?

One of the more commonly discussed questions is what the power density limit is for air-cooled ITE. Data center power density is typically expressed as average watts per square foot (SF). This is calculated as the critical power available to the IT load (watts or kilowatts) divided by the cooled area occupied by the ITE (whitespace). However, this average does not really reflect the conditions in the cooled space occupied by the air-cooled IT equipment.

In the example below, we will examine a hypothetical 10,000 square foot whitespace area of a perimeter cooled raised floor data center. The critical load rating is 1,000 kW and 200 IT Cabinets at 5, 7.5 and kW average power.
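The averaging described above can be sketched in a few lines of Python. This is only an illustration using the 1,000 kW, 10,000 SF, and 200 cabinet figures from the hypothetical example; the function names are assumptions made for this sketch and do not come from the report.

```python
def watts_per_sq_ft(critical_load_kw, whitespace_sq_ft):
    """Average power density over the whitespace area, in watts per square foot."""
    return critical_load_kw * 1000.0 / whitespace_sq_ft


def avg_kw_per_cabinet(critical_load_kw, cabinet_count):
    """Average power per cabinet if the critical load is spread evenly."""
    return critical_load_kw / cabinet_count


# Figures from the hypothetical data center in the text.
critical_load_kw = 1000.0
whitespace_sq_ft = 10000.0
cabinet_count = 200

area_density = watts_per_sq_ft(critical_load_kw, whitespace_sq_ft)
cabinet_avg = avg_kw_per_cabinet(critical_load_kw, cabinet_count)

print(f"Average density by area: {area_density:.0f} W/SF")
print(f"Average power per cabinet: {cabinet_avg:.1f} kW")
```

Both results are averages. Neither says how the load is actually distributed among individual cabinets, which is the limitation the text points out.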

Table 1 – Average power density by area

Air vs Liquid Cooling

The two greatest sources of heat in IT equipment are the processors (CPU, GPU, TPU, etc.) and memory. As mentioned, processor power levels are escalating from 100 to 250 watts, and chip manufacturers have product roadmaps for CPUs and GPUs that are expected to exceed 500 watts per processor in the next few years (see ASHRAE Processor roadmap). Processors represent 50% or more of the ITE heat load. While the power level of working memory chips (DIMM, etc.) is relatively low in comparison, the total amount of memory has increased significantly and continues to rise, and memory can represent 20% or more of the heat load.

As mentioned, liquid cooling is primarily being used to address the higher power processors. While there are many variations of liquid cooling technologies and methodologies, they generally fall into three categories:

Liquid Cooling for Air-Cooled ITE
Liquid Cooled ITE
Immersion Liquid Cooling

Liquid Cooling for Air-Cooled ITE

The ITE is standard, unmodified equipment. The process starts at the processor chip, where the heat is typically transferred to an attached air-cooled heat sink. All the heat is then transferred to room-based cooling units or to close-coupled cooling: in-row cooling, a Rear Door Heat Exchanger (RDHX), or other types such as overhead aisle-based cooling units.

Liquid Cooled ITE

The process starts with the internal components: heat from the processor chip and memory is typically transferred to an attached liquid-cooled heat sink (cold plate), an approach also sometimes referred to as direct liquid cooling. Typically, a portion (50-75%) of the heat is transferred to a liquid loop; the remainder goes to room-based cooling units or to close-coupled cooling: in-row, RDHX, or other similar systems.
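To illustrate what the 50-75% split means for the air side, here is a minimal sketch that divides a server's heat load between the liquid loop and the remaining air cooling. The 500 watt server is an assumed example value, and the function name is hypothetical.

```python
def split_heat_load(total_watts, liquid_fraction):
    """Return (watts to the liquid loop, watts left for air cooling)."""
    if not 0.0 <= liquid_fraction <= 1.0:
        raise ValueError("liquid_fraction must be between 0 and 1")
    liquid_watts = total_watts * liquid_fraction
    return liquid_watts, total_watts - liquid_watts


server_watts = 500.0  # assumed example server, not a figure for any specific product

# The 50% and 75% bounds are the range quoted in the text.
for fraction in (0.50, 0.75):
    liquid_watts, air_watts = split_heat_load(server_watts, fraction)
    print(f"{fraction:.0%} to liquid: {liquid_watts:.0f} W liquid, {air_watts:.0f} W air")
```

Even at the upper end of the range, a quarter of the heat in this sketch still has to be removed by air, so liquid-cooled ITE does not eliminate the need for room-based or close-coupled cooling.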

Immersion Liquid Cooling

The entire ITE is submerged in a dielectric fluid, which engulfs the chassis and all the components. Virtually all the heat is effectively transferred to the fluid.

These are generalized summaries of the three categories. Each has advantages, limitations, and trade-offs. However, the more granular aspects of Liquid Cooled ITE and Immersion Liquid Cooling raise the complexity (due to the various methodologies, fluid types, and operating temperatures), which puts further discussion of these factors beyond the scope of this article series.

Crossing Thermal Boundaries – Chip to Atmosphere

To better understand the issues, it is important to examine the thermal boundaries and follow the path by which heat is transferred from the chip to the external heat rejection systems. Starting at the processor chip, there is typically about 1 to 1.5 square inches at the top of the case that needs to transfer 100, 150, or even 200 watts of heat to an attached heat sink (integrated heat spreader) via conduction (effectively 7,000 to 14,000 watts/SF). From there, multiple fans draw intake air from the front of the IT equipment case and direct it through the equipment chassis to the fins of the heat sink. The air also cools the other IT components and is then exhausted out of the rear of the ITE case. A typical modern multiprocessor 1U server drawing 500 watts would require 80-100 CFM of airflow to cool it.
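The relationship between server power and airflow can be sketched with the common sensible-heat rule of thumb for air, roughly CFM ≈ 3.16 × watts / ΔT (°F). That formula and its constants are an assumption brought in for illustration, not something stated in the report, and they hold for standard sea-level air density. The sketch shows the air temperature rise implied by the 500 watt, 80-100 CFM figures above.

```python
BTU_PER_HR_PER_WATT = 3.412   # unit conversion: 1 watt = 3.412 BTU/hr
SENSIBLE_HEAT_FACTOR = 1.08   # BTU/hr per CFM per deg F, assumes sea-level air


def air_temp_rise_f(watts, cfm):
    """Air temperature rise (deg F) across equipment for a given heat load and airflow."""
    return watts * BTU_PER_HR_PER_WATT / (SENSIBLE_HEAT_FACTOR * cfm)


def required_cfm(watts, temp_rise_f):
    """Airflow (CFM) needed to remove a heat load at a given temperature rise."""
    return watts * BTU_PER_HR_PER_WATT / (SENSIBLE_HEAT_FACTOR * temp_rise_f)


server_watts = 500.0
for cfm in (80.0, 100.0):
    rise = air_temp_rise_f(server_watts, cfm)
    print(f"{server_watts:.0f} W at {cfm:.0f} CFM -> about {rise:.1f} deg F rise")
```

Under these assumptions, the quoted airflow range corresponds to an intake-to-exhaust rise of roughly 16 to 20 °F. At higher altitude the air is less dense, so more airflow would be needed for the same load.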

However, the case has only a very limited area (2-4 square inches) through which air can enter and exit. This results in relatively high airflow velocities and pressures in the IT cabinet. When 40 1U servers are stacked in a typical 42U cabinet, this creates another set of challenges (see more about this in the Rack Hygiene section in the final installment of our special report article series).
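A rough sense of the velocities involved comes from dividing airflow by opening area. The sketch below is idealized: it assumes uniform flow through the full 2-4 square inch opening and simply multiplies the per-server figures by 40 servers. The function names are hypothetical.

```python
SQ_IN_PER_SQ_FT = 144.0


def face_velocity_fpm(cfm, opening_sq_in):
    """Average air velocity (feet per minute) through an opening of the given area."""
    return cfm * SQ_IN_PER_SQ_FT / opening_sq_in


def cabinet_airflow_cfm(servers, cfm_per_server):
    """Total airflow for a cabinet of identical servers."""
    return servers * cfm_per_server


# Best case: lowest airflow through the largest opening.
low_velocity = face_velocity_fpm(80.0, 4.0)
# Worst case: highest airflow through the smallest opening.
high_velocity = face_velocity_fpm(100.0, 2.0)
print(f"Velocity range: {low_velocity:.0f} to {high_velocity:.0f} ft/min")

servers = 40
low_total = cabinet_airflow_cfm(servers, 80.0)
high_total = cabinet_airflow_cfm(servers, 100.0)
print(f"Cabinet airflow: {low_total:.0f} to {high_total:.0f} CFM")
print(f"Cabinet load at 500 W per server: {servers * 500.0 / 1000.0:.0f} kW")
```

Real openings are partly obstructed by grilles and components, so actual velocities would differ. The point of the sketch is only that small openings and high airflow per server add up quickly at the cabinet level.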

Download the entire paper, “High Density IT Cooling – Breaking the Thermal Boundaries,” courtesy of TechnoGuard, to learn more. In our next article, we’ll take a deeper look at the physics of airflow.