Class: New Data Center Metric Targets Probability and Risk

By Rich Miller - June 23, 2015

Steve Fairfax of MTechnology speaks at the recent 7×24 Exchange Spring conference in Orlando, where he proposed a new metric to reflect the probability of a failure. (Photo: Rich Miller)

ORLANDO, Fla. – You know the Tier of your data center, and you probably know what your availability has been. But what’s the probability that your data center will experience a failure within the next 12 months? Do you know?

Steve Fairfax believes you should. At the 7×24 Exchange Spring conference earlier this month, Fairfax outlined his proposal for a new data center metric to provide a simple way to understand the risk of future failure. The Class metric would express the probability, in percentage terms, that a facility will fail in one year of operations.

“Critical facilities are for the risk averse,” said Fairfax, the President of MTechnology. “A data center is a giant insurance policy.

“Yet this risk-averse industry has no metric for risk,” he added. “Executives use metrics to help make sense of complex decisions. I think there’s a real need (for a new metric). Class lets us talk about risk and probability without using the word ‘failure.’”

Fairfax specializes in risk analysis and has spent decades studying failures in complex mission-critical systems. At MTechnology, he adapted safety engineering best practices from the nuclear power industry, known as probabilistic risk assessment (PRA), and applied them to data center risk modeling. PRA uses computer calculations to analyze what could go wrong, how a chain of failures could combine to endanger a system, and how design decisions can minimize the impact of these scenarios.
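
For readers who want the mechanics, the arithmetic at the heart of a PRA model is straightforward. Here is a minimal sketch, using invented probabilities rather than anything from MTechnology's models: a fault tree combines component failure probabilities through AND gates (redundant paths, where every input must fail) and OR gates (series dependencies, where any single failure takes the system down), assuming independent failures.

```python
# Toy fault-tree arithmetic with hypothetical, independent annual
# failure probabilities (not figures from MTechnology or the 7x24 talk).

def and_gate(*probs):
    """Redundant inputs: the subsystem fails only if ALL inputs fail."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def or_gate(*probs):
    """Series inputs: the subsystem fails if ANY input fails."""
    survive = 1.0
    for p in probs:
        survive *= 1.0 - p
    return 1.0 - survive

P_UTILITY, P_GENERATOR = 0.10, 0.02   # assumed annual probabilities
P_UPS, P_COOLING = 0.01, 0.03

p_power = and_gate(P_UTILITY, P_GENERATOR)       # redundant power sources
p_facility = or_gate(p_power, P_UPS, P_COOLING)  # series dependencies
print(f"Annual failure probability: {p_facility:.2%}")  # ~4.16%
```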

The Class proposal seeks to distill the benefits of probability analysis into a simple measure of future risk.
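
For a facility modeled with a constant failure rate, the one-year figure a Class-style metric would report follows from the standard exponential reliability model, P(fail) = 1 − e^(−λt). A minimal sketch with an assumed mean time between failures, not a number from Fairfax's talk:

```python
import math

mtbf_years = 20.0         # hypothetical mean time between failures
rate = 1.0 / mtbf_years   # assumed constant failure rate (per year)

# Probability of at least one failure in the next 12 months.
p_one_year = 1.0 - math.exp(-rate * 1.0)
print(f"Class-style annual failure probability: {p_one_year:.1%}")  # ~4.9%
```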

Why do we need a new metric? Fairfax argues that the Uptime Institute’s Tier System is primarily a measure of redundancy (the duplication of critical components), while IEEE Standard 3006.7 focuses on reliability (an indication of how long a system will operate before it fails).

Fairfax is a fan of IEEE 3006.7, but says it doesn’t translate well to the business world. “It’s very technical and detailed,” he said. “It’s written by engineers and for engineers. Class is for the rest of us. It’s a way to talk about risk. This metric should be easy to understand and use. It should be intuitive.”

Defining the probability of failure helps customers understand the consequences of design choices and align their business accordingly. A company running mobile gaming apps will have a different risk profile than a financial services data center. Fairfax asserts that probability is a better tool for making decisions than redundancy or availability.

“Let’s start a conversation about what is the right amount of risk,” he said. “Not everyone needs the ultimate data center and the ultimate reliability.”

Not ‘One Metric to Rule Them All’

After Fairfax outlined his proposal in a 45-minute morning keynote, the 7×24 Exchange convened an afternoon panel of industry thought leaders to discuss the state of metrics in the data center industry, and how the Class metric might fit. The panelists emphasized that Class could be useful as a supplement to existing metrics, rather than a replacement. There is no “one metric to rule them all,” but rather a range of metrics that provide different views of performance.

In particular, panelists said the Class proposal was not an effort to replace the Tier System, developed by the late Ken Brill at the Uptime Institute. The Tier System classifies four levels of data center reliability, defined by the equipment and design of the facility, and has become central to discussions of how to plan and design enterprise data centers.

“Ken Brill was trying to bring order to chaos,” said Fairfax. “Ken proposed a classification system that helped us to try and make sense of this. The Tier system has put data centers into four big buckets, and did the industry a great service. But it doesn’t really measure risk. I don’t think it’s a competitive thing.”

Steve Fairfax of MTechnology, Peter Gross of Bloom Energy and David Schirmacher of Digital Realty Trust discuss metrics at the 7×24 Exchange spring conference. (Photo by Rich Miller)

Peter Gross of Bloom Energy, a leading voice in data center design, seconded that sentiment.

“No single document has done more to improve our industry than the tier system,” said Gross. “But it’s not uncommon for a Tier II facility to be more reliable than Tier III. That’s crazy.”

“We don’t have a lot of metrics in this industry,” said Gross. “Both PUE (Power Usage Effectiveness) and the Tier System have contributed significantly to improving reliability and efficiency. But in a way, in a sophisticated industry like ours, having only two metrics is pathetic. Metrics are complicated. PUE and Tiers have succeeded because they’re simple. People have a difficult time understanding the concept of probability.”

The Problem With Availability

What about availability? Fairfax says availability is a misleading metric, because in practice it measures a data center’s ability to deliver power to customers, rather than the actual operation of customer equipment. The exalted “five nines” availability – uptime of 99.999%, equating to just five minutes of downtime per year – rarely means what it implies, he says.
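
The arithmetic behind the label is simple enough to verify: 0.001 percent of a 525,600-minute year works out to about 5.26 minutes.

```python
availability = 0.99999               # "five nines"
minutes_per_year = 365 * 24 * 60     # 525,600 minutes
downtime_minutes = (1 - availability) * minutes_per_year
print(f"{downtime_minutes:.2f} minutes of downtime per year")  # 5.26
```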

“There is no such thing as a five-minute system outage,” said Fairfax. “If there was an award for abuse of statistics, this would win it.”

Gross noted that even a brief power loss can lead to hardware failures and database recovery challenges that translate into lengthy customer outages.

“There is no correlation between the availability of M&E (mechanical and electrical equipment) and the availability of a data center,” he said. “People don’t care about the availability of their electrical systems; they care about the availability of their computer systems. What’s the availability of my compute, my network, and my storage? You can have a one second power loss, and it takes 24 hours to get the business back online.”

Availability also doesn’t reflect the number of outages, as noted in a blog by Schneider Electric. “Consider two data centers that are both considered 99.999% available,” Schneider’s Susan Hartman wrote. “In one year, Data Center A loses power once, for 5 minutes. Data Center B loses power 10 times, but for only 30 seconds each time. … The data center that loses power 10 times will have a far greater MTR (Mean Time to Recover) than the one that lost power only once – and hence probably far from a 99.999% availability rating.”
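
Schneider's example is easy to work through: both facilities log the same five minutes of annual downtime, and thus the same availability, yet one fails ten times as often. A quick sketch using those numbers:

```python
MINUTES_PER_YEAR = 365 * 24 * 60

# (name, outages per year, minutes per outage) from the Schneider example
for name, outages, mins_each in [("Data Center A", 1, 5.0),
                                 ("Data Center B", 10, 0.5)]:
    downtime = outages * mins_each
    availability = 1 - downtime / MINUTES_PER_YEAR
    mtbf_hours = (MINUTES_PER_YEAR - downtime) / outages / 60
    print(f"{name}: availability {availability:.5%}, "
          f"{outages} outage(s)/year, MTBF ~{mtbf_hours:,.0f} hours")
```

Identical availability, ten times the failure frequency; that difference is invisible to the availability number alone.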

“The idea is to diminish the pursuit of ‘five nines’ availability,” said Fairfax. “Availability cannot measure risk. We allow our facilities to increase risk to improve some other metric.”

8 Million Ways to Fail

Fairfax says that probability analysis provides a more complete picture of the interaction between components of the data center building, power distribution and cooling system.

Probability and Class can be calculated using a number of methods:

  • Fault trees that diagram how a failure in one component can impact other elements of the system.
  • Reliability Block Diagrams, which create a model of the system flow to assess dependencies.
  • “Monte Carlo” simulations using software tools (including MATLAB and Excel add-ins) to model the range of possible outcomes of failures in complex systems; a minimal sketch follows this list.
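
As a flavor of the third approach, here is a toy Monte Carlo simulation of a simplified power chain, with invented probabilities rather than MTechnology's figures:

```python
import random

P_UTILITY, P_GENERATOR, P_UPS = 0.10, 0.02, 0.01  # assumed annual values
TRIALS = 1_000_000

random.seed(42)
failures = 0
for _ in range(TRIALS):
    utility_down = random.random() < P_UTILITY
    generator_down = random.random() < P_GENERATOR
    ups_down = random.random() < P_UPS
    # The load drops if both power sources fail, or if the UPS fails.
    if (utility_down and generator_down) or ups_down:
        failures += 1

print(f"Estimated annual failure probability: {failures / TRIALS:.2%}")
# Converges on the analytic value: 1 - (1 - 0.10*0.02)*(1 - 0.01) ~ 1.19%
```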

MTechnology’s work on fault tree analysis has identified up to 8 million failure modes. Most of these have a very small probability of occurring, but they add up, Fairfax said, adding that just four or five components are involved in 85 percent of failures.
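
That millions of individually negligible modes still "add up" is basic probability: the chance that none of them fires is a product of many near-unity terms, and that product decays quickly. With invented numbers:

```python
n_modes = 1_000_000   # hypothetical count of independent failure modes
p_each = 1e-7         # hypothetical annual probability of each mode

p_none = (1 - p_each) ** n_modes
print(f"P(at least one mode fires) = {1 - p_none:.1%}")  # ~9.5%
```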

What kind of insights can be gained from this approach? Predictive modeling of design decisions can prevent data center builders from investing large amounts of money on equipment that isn’t likely to improve their uptime.

“Our analysis suggests that adding one more generator has vastly more benefit than a second utility line,” said Fairfax. “There’s no benefit to a second utility feed. Spend that money on a second generator instead. There’s some real money to be saved.”
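
One common explanation for that kind of result, illustrated below with entirely hypothetical numbers rather than Fairfax's analysis, is common-cause failure: two utility feeds tend to share grid-wide events, while a generator fails independently of the grid.

```python
P_FEED = 0.10        # hypothetical: one feed's own annual failure probability
P_GRID_EVENT = 0.05  # hypothetical: event taking down BOTH feeds at once
P_GENERATOR = 0.02   # hypothetical: generator failure probability

# Two feeds: power is lost if both fail on their own OR a grid event hits.
p_two_feeds = 1 - (1 - P_FEED ** 2) * (1 - P_GRID_EVENT)

# One feed plus a generator assumed independent of the grid.
p_one_feed = 1 - (1 - P_FEED) * (1 - P_GRID_EVENT)
p_feed_plus_gen = p_one_feed * P_GENERATOR

print(f"Two utility feeds:   {p_two_feeds:.2%}")      # ~5.95%
print(f"Feed plus generator: {p_feed_plus_gen:.2%}")  # ~0.29%
```

In this toy model, the shared grid event puts a floor under the dual-feed design that the independent generator avoids.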

So what’s next for the proposal? In his keynote, Fairfax proposed that the 7×24 Exchange could develop Class as a critical facility metric. Bob Cassiliano, Chairman and CEO of the 7×24 Exchange, invited conference attendees to provide feedback on the proposal, which will guide the group’s decision on whether to pursue establishing Class as a new metric. The 7×24 Exchange is the leading knowledge exchange for professionals who design, build, operate and maintain enterprise mission-critical infrastructure.

About Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
