Equipment Longevity & Performance: Why the Bathtub Curve Is Inaccurate

Aug. 12, 2022
Chad Peters, Director of Infrastructure Solutions for Service Express, revisits the Bathtub Curve theory and explains how to track equipment reliability and performance for data-driven buying decisions.
The power of computing and data storage is only as good as product reliability.

With billions of individuals accessing data daily for personal and professional use, confidence in its accessibility and availability is paramount to everyday life. But how can OEMs and suppliers ensure their products are reliable, not just at the point of sale but for years down the line?

We often hear that the Bathtub Curve is the tried-and-true method for determining product reliability, but this common model often overlooks the complexity of longevity and reliability. Read on as we dispel the myth of a method once thought ironclad and break down the importance of continued data collection throughout the product lifecycle.

Revisiting the Bathtub Curve

In our popular 2020 article, “The Bathtub Curve and Data Center Equipment Reliability,” Jake Blough, Chief Technology Officer for Service Express, dispels several myths about new and aging hardware. The Bathtub Curve theory suggests that equipment failure rates are high when a product is released, then decrease over the next two to three years. According to the theory, failure rates rise again as the product approaches its End of Life (EOL) date.

Although the Bathtub Curve (pictured below) accurately reflects the failure behavior of many products, we’ve found it does not apply equally to all equipment.
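For readers who want to see the shape the theory describes, a bathtub-style failure rate is often sketched as the sum of three Weibull hazard terms: one decreasing (early failures), one constant (random failures), and one increasing (wear-out). The snippet below is a minimal illustration only. The shape and scale values are assumptions chosen to produce the classic curve and do not come from Service Express data.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate of a Weibull distribution at age t (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three hazard terms; all parameters are illustrative assumptions."""
    early = weibull_hazard(t, shape=0.5, scale=20.0)    # shape < 1: rate falls with age
    constant = weibull_hazard(t, shape=1.0, scale=25.0)  # shape = 1: rate is flat
    wear_out = weibull_hazard(t, shape=5.0, scale=8.0)   # shape > 1: rate climbs with age
    return early + constant + wear_out


ages = [0.25, 1, 2, 3, 4, 5, 6, 7, 8]  # equipment age in years
rates = [bathtub_hazard(a) for a in ages]

for age, rate in zip(ages, rates):
    print(f"age {age:>4} yr: hazard {rate:.3f}")
```

With these assumed parameters the rate is highest at the start and end of the range and lowest in the middle. Equipment that does not follow the curve would correspond to a weak or absent wear-out term.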

Defining critical and non-critical failures

To better understand the context of the Bathtub Curve, it’s essential to define the common types of failures seen in data center hardware: critical and non-critical.

A critical failure occurs when something like a CPU or system board fails. Critical server failures result in the loss of access to applications or data, a significant issue that impacts overall business productivity.

A non-critical failure occurs when a component like a disk drive or power supply fails. Modern data center equipment has built-in redundancies for these components, so there is no data loss.

Contrary to what the Bathtub Curve portrays, the sample data above shows that failure rates don’t drastically increase as equipment ages. By combining reliability and longevity data, we can better understand how equipment performs over time.

Testing the myth

At Service Express, we’ve collected over 15 years of equipment data from over half a million devices to better understand equipment longevity and reliability.

Our previous article only studied equipment longevity and how it stacks up against the Bathtub Curve. In the past two years, we’ve implemented real-time reliability studies that allow for previously unseen granularity. But what’s the difference between longevity and reliability?

Longevity reporting details a product’s expected failure rate over time. This can be useful for CapEx budget planning purposes. However, beyond equipment longevity, reliability plays a critical role. Reliability studies examine how equipment has performed over time. This practice is useful in identifying outliers within a product model number or family.
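The distinction can be illustrated with a small sketch. The same service records are grouped two ways: by equipment age (a longevity view) and by model (a reliability view that surfaces outliers). The records, model names, and the 1.5x outlier threshold are all hypothetical and stand in for whatever data and rules a real study would use.

```python
from collections import defaultdict

# Hypothetical device-year records: (model, age in years, failures that year)
records = [
    ("model-x", 1, 0), ("model-x", 2, 1), ("model-x", 3, 0),
    ("model-y", 1, 1), ("model-y", 2, 2), ("model-y", 3, 1),
    ("model-z", 1, 0), ("model-z", 2, 0), ("model-z", 3, 1),
]


def rate_by(records, key_index):
    """Failures per device-year, grouped by the field at key_index."""
    failures = defaultdict(int)
    device_years = defaultdict(int)
    for rec in records:
        key = rec[key_index]
        failures[key] += rec[2]
        device_years[key] += 1
    return {k: failures[k] / device_years[k] for k in device_years}


longevity = rate_by(records, key_index=1)    # grouped by age
reliability = rate_by(records, key_index=0)  # grouped by model
fleet_rate = sum(rec[2] for rec in records) / len(records)

# Assumed rule: flag models whose rate exceeds 1.5x the fleet-wide rate.
outliers = [m for m, rate in reliability.items() if rate > 1.5 * fleet_rate]

print("by age:", longevity)
print("by model:", reliability)
print("outliers:", outliers)
```

In this made-up sample, the age view shows how the failure rate changes year over year, while the model view singles out one model that fails far more often than the rest of the fleet.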

Below is an example of an equipment reliability report. The graph draws on data from over 500,000 pieces of equipment under agreement for over 6,000 customers and compares performance using an average of 55,000 annual service calls.

In this example (pictured above), we use the following definitions when describing the quadrants.

  • Upper right – Customer model is performing better than the product family and perhaps better than similar models
  • Upper left – Customer model is worse than the overall product family but is better than similar models for other customers
  • Lower left – Model is underperforming against the product family, and the customer’s experience is worse than others with similar models
  • Lower right – Customer experience with the model is worse than others with similar models but better than the product family

The data shows that Product A is in the lower right quadrant. This position signifies that, based on equipment model failures, Product A is typically a well-performing product compared to other products within its product family. Still, this customer is having a worse experience than most other customers with similar models.
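The quadrant definitions above can be expressed as a simple two-axis classification. The sketch below assumes each axis is a score where a positive value means fewer failures than the baseline: the horizontal axis compares the model with its product family, and the vertical axis compares the customer’s experience with that of other customers running similar models. The function name, the failure rates, and the handling of ties are hypothetical.

```python
def quadrant(vs_family, vs_peers):
    """Return the quadrant for two scores where positive means better.

    vs_family: model compared with its product family (horizontal axis)
    vs_peers:  customer compared with others on similar models (vertical axis)
    Ties (a score of exactly 0) are treated as "better"; that is an assumption.
    """
    horizontal = "right" if vs_family >= 0 else "left"
    vertical = "upper" if vs_peers >= 0 else "lower"
    return f"{vertical} {horizontal}"


# Hypothetical failure rates (failures per device-year)
family_rate = 0.20    # all models in the product family
model_rate = 0.12     # this model across all customers
customer_rate = 0.18  # this model at one customer

vs_family = family_rate - model_rate   # positive: model beats its family
vs_peers = model_rate - customer_rate  # negative: customer fares worse than peers

print(quadrant(vs_family, vs_peers))
```

These sample numbers land in the lower right quadrant, the same position described for Product A: a model that outperforms its family, at a customer whose experience is worse than that of others with similar models.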

By using historical equipment reliability and performance data, customers can better understand which equipment meets their performance needs and which does not. This practice has helped customers make data-driven decisions in the data center to optimize current environments or make necessary upgrades to maintain productivity.

The impact of leveraging data-driven equipment insights

The Bathtub Curve is a useful tool when speaking about equipment performance in general. However, it comes with many misconceptions that can lead suppliers to repurchase parts or switch models before it is necessary.

By combining longevity studies with real-time reliability data, we can better understand which equipment will continue to perform over time and which hardware requires our attention. Once those high performers (or outliers) are identified, customers can better plan or extend hardware refreshes and plan budgets and resources more efficiently.

Chad Peters is Director of Infrastructure Solutions for Service Express, a global data center solutions provider that helps IT teams control costs, optimize infrastructure strategies and automate support.

About the Author

Voices of the Industry

Our Voice of the Industry feature showcases guest articles on thought leadership from sponsors of Data Center Frontier. For more information, see our Voices of the Industry description and guidelines.
