Rethinking Resilience: Preparing Data Centers for the Next AI Wave

May 24, 2024
Tim Hysell, Co-founder and CEO of ZincFive, explains why the need for a resilient power infrastructure has never been more pressing for data center operators.

Recent high-profile outages have underscored the critical importance of data center resilience. The outage at Twitter (now X) during Rihanna's Super Bowl Halftime Show and Microsoft's eight-hour outage affecting Teams, Outlook, and M365 caused widespread disruptions for millions of users. Even more alarming was the outage experienced by Australian telecommunications provider Optus, which delayed transport, disrupted banking, and cut hospital phone lines for 12 hours, affecting over 10 million users (nearly 40% of the population) and 400,000 businesses.

The consequences of data center outages are further emphasized by Uptime Intelligence's 2024 data center outage report. The report reveals that 55% of operators experienced an outage in the past three years, with more than half of respondents reporting that their most recent significant outage cost over $100,000, and 16% stating that these outages cost more than $1 million.

As AI pushes data center energy consumption to unprecedented levels, the need for a resilient power infrastructure has never been more pressing. The International Energy Agency (IEA) projects that data center electricity usage will double by 2026, while training newer AI models consumes 50 times more electricity than previous generations. As more sectors integrate AI into their operations, keeping the facilities that power these services resilient becomes both more important and more difficult.

Faced with these challenges, data center operators must take decisive action to enhance their resilience and adjust to the demanding requirements of AI. By addressing the most common causes of outages, specifically power issues and human error, the latter including a lack of regular, comprehensive uninterruptible power supply (UPS) testing, data center operators can ensure the reliability and stability of their facilities in an increasingly complex and demanding technological landscape.

Resolving these issues requires examining their causes first. Power issues consistently emerge as the most common cause of serious outages, according to Uptime Intelligence's March 2024 report. A staggering 42% of respondents pointed to UPS failure as the leading cause of power-related outages, while 30% of incidents involved issues with the transfer switch to a generator, and 20% were attributed to generator failure itself.

Human error also plays a role in nearly 40% of data center outages, highlighting the critical importance of proper training and adherence to established procedures. Among those who reported an outage caused by human error, 48% cited failure of staff to follow procedures, while 45% pointed to incorrect procedures.

These findings underscore the urgent need for data center operators to prioritize both the modernization and upkeep of their power infrastructure and their staff's ongoing training and education. Implementing comprehensive staff training and process reviews presents a significant opportunity to reduce human error-related outages.

To limit the risk of power-related outages, data center operators should regularly maintain their backup power systems and test them rigorously under real-world conditions. They can also adopt more advanced and reliable UPS battery technologies, such as nickel-zinc. Unlike lead-acid and lithium-ion backup batteries, nickel-zinc batteries continue to discharge and carry the load even when a cell in the battery string becomes weak or depleted. This allows the battery string to continue operating and turns what would otherwise be an emergency into a simple note for replacement at the next planned maintenance cycle, with no added maintenance costs or operational impact.
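The difference in string behavior described above can be pictured with a toy model. The following is a minimal sketch, not a description of any vendor's battery design: it assumes the outcome depends only on whether a depleted cell stays conductive or goes open-circuit, and the cell count, per-cell voltage, and all names (`Cell`, `string_voltage`, `make_string`) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    voltage: float            # illustrative per-cell voltage under load, in volts
    conductive: bool = True   # False models a depleted cell that goes open-circuit


def string_voltage(cells: List[Cell]) -> float:
    """Voltage a series string can present to the load.

    In a series string the same current flows through every cell, so a
    single open-circuit cell interrupts the whole string.
    """
    if not all(cell.conductive for cell in cells):
        return 0.0
    return sum(cell.voltage for cell in cells)


def make_string(n_cells: int, healthy_v: float, weak_v: float,
                weak_stays_conductive: bool) -> List[Cell]:
    """Build a string of healthy cells plus one weak cell (hypothetical helper)."""
    cells = [Cell(healthy_v) for _ in range(n_cells - 1)]
    cells.append(Cell(weak_v, conductive=weak_stays_conductive))
    return cells


# Assumed example: 10 cells, 1.6 V per healthy cell, one fully depleted cell.
carries_load = string_voltage(make_string(10, 1.6, 0.0, True))
drops_load = string_voltage(make_string(10, 1.6, 0.0, False))

print(f"weak cell stays conductive:  {carries_load:.1f} V")
print(f"weak cell goes open-circuit: {drops_load:.1f} V")
```

In this simplified picture, a string whose weak cell stays conductive loses only that cell's contribution and keeps supplying the load, while a string whose weak cell opens the circuit supplies nothing. Real battery strings involve internal resistance, current limits, and monitoring electronics that this sketch leaves out.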

Nickel-zinc batteries offer several additional benefits that increase data centers’ reliability and efficiency. Unlike lithium-ion batteries, they are incapable of thermal runaway at the cell level and can operate reliably at higher temperatures, which can also lead to lower cooling costs. They also boast a greater power density than their counterparts, delivering the same amount of power in a significantly smaller footprint. This allows operators to save valuable space for income-generating equipment like servers and racks, while still providing ample backup power for intensive AI applications. Nickel-zinc batteries are also more sustainable than lead-acid and lithium-ion batteries and can serve as a convenient drop-in replacement for lead-acid batteries, making for a seamless transition to this more advanced technology.

As data center energy consumption continues to soar due to the increasing demands of AI and other power-intensive applications, ensuring a resilient power infrastructure has become more critical than ever. By embracing comprehensive staff training, rigorous testing and maintenance practices, and safe, reliable, and sustainable battery technologies like nickel-zinc, data center operators can significantly bolster the resilience of their facilities and ensure they are well-equipped to handle the ever-increasing demands of the digital landscape. Taking proactive steps to address the root causes of outages and implementing innovative solutions helps facility operators ensure that customers can rely on them, no matter what the future holds.

About the Author

Tim Hysell

Tim Hysell is the co-founder and CEO of ZincFive and has over three decades of entrepreneurial success in founding, owning, and directing profitable business operations in renewable energy, banking, manufacturing, and medical devices. ZincFive is a leader in innovation and delivery of nickel-zinc batteries and power solutions. Contact ZincFive to learn how nickel-zinc chemistry can provide high power density and performance for mission critical applications.
