Data Center Health Management

Nov. 22, 2016
In this week’s Voices of the Industry, Jeff Klaus, General Manager of Intel Data Center Solutions, discusses an approach to data center health management.


JEFF KLAUS, Intel

As winter approaches, I’m sure you’ve already heard more sneezes and coughs circulating throughout the office. Just as regular checkups are important to your health and well-being, the same can be said for your data center’s health. Preventative measures are critical to avoiding outages and downtime.

Just how critical is Data Center Health Management?

Consider the Delta Airlines data center outage that occurred this past August, which grounded more than 2,000 flights over three days and cost the company $150 million. Or the data center outage that Southwest Airlines experienced, which also lasted three days and is estimated to have caused at least $177 million in lost passenger revenue.

That, my friends, is nothing to sneeze at.

In view of the financial prognosis, how can companies afford not to employ a preventative health management approach in their data centers, catching issues before they spiral out of control and are deemed untreatable?

Yes, the numbers cited above are extraordinarily high and point out that major airlines have a lot more at stake when designing and managing critical infrastructure than most other data center operators. But the risks involving outages do not discriminate. All data center facilities across every industry sector run similar risks when left unprotected by a sound health management approach. According to a study by the Ponemon Institute, the average cost of a single data center outage today is about $730,000. Of the 60-plus data center operators surveyed for the study, the costliest outage reported caused the data center operator to lose approximately $2.4 million.

To be certain, today’s data center operators are faced with significant, long-term challenges and daily uncertainties.

Among these: How can they know when a server’s components fail? Is it necessary to manually check the LEDs? How far in advance can a data center manager anticipate fan failures in the facility? Moreover, with thousands of heterogeneous servers in the typical data center, operators need a tool to access and control those servers to maintain full availability.

Add to that the expense of hardware KVMs, the need to receive failure reports, and the need to know without question when a service call to a remote data center is necessary, and maintaining data center health can become a Sisyphean task.

Providing a remote control for your data center, Intel Virtual Gateway is a cross-platform, virtual keyboard-video-mouse used for maintaining the health of data center hardware. Given its firmware-based capability that is embedded directly into the server, Intel Virtual Gateway eliminates the need for complicated and expensive KVM infrastructure.

Health management in the data center has four main pillars: monitoring, analytics, diagnostics and remediation. Let’s take a closer look at the capabilities specific to each of these requirements (all of which are supported by Intel Virtual Gateway).

Monitoring

  • Identifies the root cause of failures, with health details down to the component level
  • Creates a failed-device report with severity and failure details
  • Uses hardware failure trending to better predict when components will need to be replaced
  • Provides failure rate and MTTR analysis per server model, component, etc.
  • Provides prediction of future server failures
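
As a rough illustration of the failure rate and MTTR analysis listed above, the following minimal sketch groups repair records by server model. It is not Intel Virtual Gateway code: the `FailureRecord` layout, the function names, and the sample data are all hypothetical, and MTTR is read here as mean time to repair.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FailureRecord:
    """One repaired hardware failure (hypothetical record layout)."""
    server_model: str
    component: str
    hours_to_repair: float


def mttr_by_model(records):
    """Return mean time to repair, in hours, keyed by server model."""
    totals = defaultdict(lambda: [0.0, 0])  # model -> [sum of hours, count]
    for rec in records:
        totals[rec.server_model][0] += rec.hours_to_repair
        totals[rec.server_model][1] += 1
    return {model: hours / count for model, (hours, count) in totals.items()}


def failures_per_server(records, fleet_size):
    """Return failures per installed server, keyed by model.

    fleet_size maps model -> number of installed servers (assumed known).
    """
    counts = defaultdict(int)
    for rec in records:
        counts[rec.server_model] += 1
    return {model: counts[model] / n for model, n in fleet_size.items() if n > 0}


# Illustrative data only; not taken from any real fleet.
records = [
    FailureRecord("model-a", "fan", 2.0),
    FailureRecord("model-a", "disk", 4.0),
    FailureRecord("model-b", "psu", 3.0),
]
print(mttr_by_model(records))
print(failures_per_server(records, {"model-a": 100, "model-b": 50}))
```

Grouping by `component` instead of `server_model` would give the per-component view in the same way.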

Diagnostics

  • Performs server diagnostics and troubleshooting
  • Checks BIOS settings and BIOS configuration
  • Analyzes server logs
  • Makes or verifies configuration changes
  • Uses both out-of-band (OOB: KVM) and in-band (IB: SSH, RDP, VNC) access

Remediation

  • Can remotely power servers on and off
  • Provides the ability to create groups of servers and then assign power tasks to them
  • Can stagger power-on to avoid overloading racks
  • Can schedule and automate an individual or group power task
  • Provides vMedia for remote OS provisioning and installation
  • Links server failures to IT workload and/or workflow management systems
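
To make the idea of staggered power-on concrete, here is a minimal planning sketch. It only computes a schedule and does not call any real power-control API. The function name, its parameters, and the rack and server names are assumptions made for illustration, not part of Intel Virtual Gateway.

```python
def stagger_power_on(servers_by_rack, max_per_rack_per_wave, wave_delay_s):
    """Plan power-on waves so that no rack starts more than
    max_per_rack_per_wave servers at the same time.

    Returns a list of (start_offset_seconds, [server names]) tuples.
    """
    if max_per_rack_per_wave < 1:
        raise ValueError("max_per_rack_per_wave must be at least 1")
    waves = []
    wave_index = 0
    while True:
        batch = []
        start = wave_index * max_per_rack_per_wave
        # Take the next slice of servers from every rack for this wave.
        for rack in sorted(servers_by_rack):
            batch.extend(servers_by_rack[rack][start:start + max_per_rack_per_wave])
        if not batch:
            break  # every rack has been fully scheduled
        waves.append((wave_index * wave_delay_s, batch))
        wave_index += 1
    return waves


# Hypothetical group of servers spread across two racks.
plan = stagger_power_on(
    {"rack-1": ["s1", "s2", "s3"], "rack-2": ["s4"]},
    max_per_rack_per_wave=2,
    wave_delay_s=30,
)
for offset, names in plan:
    print(f"t+{offset}s: power on {names}")
```

Limiting each wave per rack, rather than across the whole group, keeps the inrush load on any single rack bounded while still letting different racks start in parallel.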

Through ongoing monitoring, analytics, diagnostics and remediation, data center operators can employ a health management approach to addressing the risk of costly downtime and outages. Think of Intel Virtual Gateway as “an apple a day” for the health and well-being of your facility.

Submitted by Jeff Klaus, GM of Intel Data Center Solutions.
