Sponsored

The Top Causes of Data Center Downtime

In this week’s Voices of the Industry, Josh Moody, FORTRUST’s SVP of Sales and Marketing, explores the top four causes for data center downtime.

Voices of the Industry

April 11, 2016


At the end of the day, the measuring stick for a data center is the number of unplanned downtime events and the track record of continuous uptime it has delivered. (Photo: FORTRUST)


Josh Moody, SVP Sales FORTRUST

With so much riding on the support data centers provide, downtime is a serious – and often newsworthy – occurrence. Just last year, several service providers made headlines due to data center outages, showing that even the biggest names in technology aren’t immune to downtime.

According to InformationWeek, Google’s Cloud suffered an outage in early 2015 due to software issues. Later that year, Apple customers experienced a seven-hour outage that impacted 11 platforms, including iCloud Mail, Find My iPhone and iCloud Drive.

Early 2016 saw a power outage in a Verizon data center, which effectively downed JetBlue’s digital infrastructure. While the airline’s service level agreement (SLA) with Verizon noted guaranteed failover, JetBlue’s website and digital airport systems remained unavailable for around three hours.

These episodes only scratch the surface of the damage an unscheduled outage can cause. Here at FORTRUST, we pride ourselves on ensuring availability. In 2015, we passed the 14-year mark for continuous critical systems uptime at our Denver data center.

With the impact of data center downtime becoming more significant, it raises the question: what are the top causes of downtime in the data center? Let’s take a look:

What Causes Data Center Downtime?

FORTRUST Senior Vice President and General Manager Robert McClary noted in the eBook, “A Data Center Operations Guide for Maximum Reliability,” that downtime can considerably affect a provider’s reputation.

“At the end of the day, the measuring stick for a data center is the number of unplanned downtime events and the track record of continuous uptime it has delivered,” McClary wrote.

Overall, McClary noted four main causes of unplanned data center downtime. These include:

1. Human error and poor infrastructure capacity management
2. Poor maintenance and lifecycle strategy
3. Substandard data center site selection and risk mitigation
4. All other causes

A 2013 study from the Ponemon Institute supports this. Overall, 83 percent of survey respondents were able to pinpoint the cause of an unplanned outage. Of these instances, 55 percent were due to UPS equipment failure, 48 percent were the result of human error and 46 percent were caused by exceeding UPS capacity. These figures total more than 100 percent, which suggests that a single outage could be attributed to more than one cause.

Other causes of downtime can include utility outages, natural disasters, cyber attacks and a range of other problems that could interrupt regular data center processes.

Combating Top Causes of Downtime

Thankfully, there are factors clients can look for in providers that signal a commitment to uptime.

First and foremost, service providers must seek to combat the top cause of downtime: human error. One of the most effective ways to do this is with documented operational processes that lay out procedures across the entire facility. These can include procedures for standard operations, maintenance and contingency, as well as overall guidelines and methods. Each documented procedure should define the process taking place and include detailed information about how staff members should carry out those activities. Templates can also be helpful to ensure uniformity and consistency.

However, it’s not enough to simply create a library of procedures. Data center managers must ensure that staff members are disciplined and follow each process as it is defined in the documentation. Procedures must be adequately disseminated among employees, and there should be multiple opportunities for review and validation of each activity.

Training is also a critical part of preventing human error. Each staff member should have an in-depth understanding of all data center processes. In this way, should one staff member be absent, another employee is prepared to take over his or her role. It’s paramount that facility workers have the right skill set, and continual training is one of the best ways to establish knowledgeable, capable staff.

The Importance of DCIM

An effective data center infrastructure management (DCIM) strategy is also critical in preventing unplanned outages. FORTRUST has implemented a range of DCIM procedures, which include capacity monitoring and management, threshold alerts and alarms, real-time power usage monitoring and predictive maintenance.
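To make the idea of threshold alerts and alarms concrete, here is a minimal Python sketch. It is not FORTRUST's implementation or any DCIM product's API: the `PowerReading` type, the `classify` function, the sample readings and the 80 and 90 percent thresholds are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PowerReading:
    """One real-time sample from a hypothetical UPS or power feed."""
    source: str
    load_kw: float
    capacity_kw: float


def classify(reading: PowerReading, alert_at: float = 0.80, alarm_at: float = 0.90) -> str:
    """Return "ok", "alert" or "alarm" based on utilization.

    The 80%/90% thresholds are illustrative assumptions, not industry rules.
    """
    if reading.capacity_kw <= 0:
        raise ValueError("capacity_kw must be positive")
    utilization = reading.load_kw / reading.capacity_kw
    if utilization >= alarm_at:
        return "alarm"
    if utilization >= alert_at:
        return "alert"
    return "ok"


# Made-up sample data for three hypothetical UPS units.
readings = [
    PowerReading("ups-a", 310.0, 500.0),
    PowerReading("ups-b", 420.0, 500.0),
    PowerReading("ups-c", 470.0, 500.0),
]
statuses = {r.source: classify(r) for r in readings}
```

In practice the thresholds would come from the facility's own capacity plan, and an alert would feed a documented response procedure rather than a dictionary.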

“DCIM can play a key role in infrastructure capacity management by providing real-time information across the critical systems infrastructure,” McClary wrote. “It can assist in the creation of templates used to provision capacity to the end-user without infringing on redundancies or having unbalanced loads.”
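As a rough illustration of provisioning capacity "without infringing on redundancies or having unbalanced loads," the sketch below checks a new load request against a pair of redundant power sides. It assumes a 2N (A/B) arrangement in which either side must be able to carry the full load if the other fails. The source does not describe FORTRUST's power topology, so that assumption, the `can_provision` function and the 10 percent imbalance limit are all hypothetical.

```python
def can_provision(side_a_kw: float, side_b_kw: float, new_load_kw: float,
                  side_capacity_kw: float, max_imbalance: float = 0.10) -> tuple[bool, str]:
    """Check a dual-corded load request against a hypothetical 2N (A/B) pair.

    The new load is assumed to split evenly across both sides in normal
    operation and to land entirely on one side if the other fails.
    """
    new_a = side_a_kw + new_load_kw / 2
    new_b = side_b_kw + new_load_kw / 2
    # Redundancy: the surviving side must be able to carry the combined load.
    if new_a + new_b > side_capacity_kw:
        return False, "would infringe on redundancy"
    # Balance: the gap between sides, as a share of capacity, must stay small.
    if abs(new_a - new_b) / side_capacity_kw > max_imbalance:
        return False, "would leave loads unbalanced"
    return True, "ok"


# Made-up figures: two 500 kW sides with different starting loads.
fits = can_provision(200.0, 210.0, 60.0, 500.0)
too_big = can_provision(200.0, 210.0, 120.0, 500.0)
lopsided = can_provision(100.0, 200.0, 40.0, 500.0)
```

Here `fits` passes both checks, `too_big` is rejected because the combined 530 kW would exceed what one side can carry alone, and `lopsided` is rejected because the two sides already differ by 20 percent of capacity.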


It’s also important that clients seek details about the provider’s maintenance and lifecycle strategies. An effective, comprehensive program here should include equipment and system inspections, predictive and preventative maintenance, as well as testing and corrective maintenance.

Finally, it’s critical that a client’s provider is not only aware of, but actively seeks to address, the top causes of unplanned outages. While it’s nearly impossible to predict the level of reliability within a provider’s data center, DCIM as well as robust maintenance and lifecycle strategies can play a significant role in ensuring uptime.

“The larger factors of high-availability and uptime are specific to people, process, operations, maintenance, lifecycle, risk mitigation strategies and the operation mindset that facilitates it,” McClary wrote.

Josh Moody is SVP of Sales and Marketing for FORTRUST. Josh joined FORTRUST in 2008. As Senior Vice President of Sales and Marketing, he is responsible for developing, managing and executing the overall sales strategy for FORTRUST. Josh brings more than 18 years of experience in the information technology field, including developing and executing strategic planning, management, customer relations, and vendor/partner relationships. Colorado Business Magazine identified Josh as one of the top 25 Most Powerful Sales Reps in Colorado in 2010. To find out more about FORTRUST’s track record for critical systems uptime and how its staff prevents unplanned outages, contact the company for a tour of its Denver data center.

About the Author

Voices of the Industry

Our Voice of the Industry feature showcases guest articles on thought leadership from sponsors of Data Center Frontier. For more information, see our Voices of the Industry description and guidelines.
