Data Center Maintenance: What Should Be Included?
In this week’s Voices of the Industry, Robert McClary, Chief Operating Officer at FORTRUST, discusses data center maintenance and lifecycle strategies and what they should include.
According to the most recent market studies, businesses are continuing to leverage colocation services as critical parts of the corporate infrastructure. Sandler Research predicted that the global colocation market would expand at a compound annual growth rate of more than 12 percent through 2020, driven by shrinking enterprise IT budgets along with rising needs for critical application support and accessibility.
As companies increasingly rely on colocation providers, it becomes even more important for these data centers to be properly maintained. Even a single minute of downtime can cost an organization thousands of dollars and its reputation in the industry, making it absolutely imperative that service providers do everything in their power to ensure round-the-clock uptime.
This is where robust data center maintenance and lifecycle strategies come into play and become a pivotal part of facility processes. But what, exactly, should customers ask colocation providers about maintenance, and what advantages can these maintenance strategies bring to the table?
What makes maintenance and lifecycle strategies so important?
In the current IT landscape, downtime isn’t just costly in terms of dollars and cents – it can also do untold damage to a brand’s reputation. This is particularly true if the colocation facility is supporting client-facing resources that are imperative to customer service.
Robert McClary, FORTRUST Chief Operating Officer, pointed out that poor maintenance and lifecycle strategies are the second most likely cause of unplanned data center downtime, behind only human error and poor capacity management. Even the most optimally designed data centers cannot make up for a lack of proper system maintenance and upkeep.
Different types of maintenance
When it comes to maintenance strategies, there are a few different types that colocation customers should look for. Gaining details about these processes is paramount, as it will show the provider’s dedication to uptime within the facility.
McClary noted that a comprehensive strategy here should include:
- Regular and thorough inspections: Data center staff should continually inspect systems and equipment to ensure they are in proper working order. This includes daily inspections of generators, water temperature, fuel levels, plenum pressures, the operating parameters of electrical and mechanical distribution systems, and other system parameters and configurations.
- Continuous testing: Facility employees should also test specific systems to ensure that they are operating within the correct parameters. Processes here can encompass infrared testing, load testing and fail-over testing.
- Predictive maintenance: This is a critical part of the data center’s strategy. Predictive maintenance leverages measurements and other data analysis to recognize any changes, trends or irregularities that could point to a potential failure. In this way, staff members can address these issues before they lead to an outage.
- Preventive maintenance: McClary explained that preventive maintenance is meant to “keep a piece of equipment or component operating at its optimum level or an action that prolongs its lifecycle.” This type of maintenance can include filter or oil changes, as well as cleaning heat exchangers and electrical systems.
- Corrective maintenance: Finally, staff members should leverage corrective maintenance processes when it comes time for a system or component to be repaired or replaced. Fixing a leak or replacing a bearing or valve would fall under corrective maintenance.
With strategies including predictive and preventive maintenance in place, the potential for system failure is considerably reduced. These processes enable facility workers to pinpoint and address issues before they cause an unplanned or even a planned outage.
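To make the predictive maintenance idea concrete, here is a minimal sketch of how a measurement log might be checked for irregularities. It is an illustration only, not FORTRUST's method: the function name `flag_drift`, the bearing temperature readings and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def flag_drift(readings, baseline_size=5, threshold=3.0):
    """Return the indexes of readings that deviate from the baseline.

    The first `baseline_size` readings define normal behavior. A later
    reading is flagged when it sits more than `threshold` sample standard
    deviations away from the baseline mean.
    """
    baseline = readings[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)

    flagged = []
    for i, value in enumerate(readings[baseline_size:], start=baseline_size):
        if sigma == 0:
            # A perfectly flat baseline: any change at all is an irregularity.
            deviates = value != mu
        else:
            deviates = abs(value - mu) / sigma > threshold
        if deviates:
            flagged.append(i)
    return flagged


# Hypothetical daily bearing temperatures in degrees Celsius (made-up data).
bearing_temps_c = [61.0, 60.5, 61.2, 60.8, 61.1, 61.0, 61.4, 63.9, 65.2]
print(flag_drift(bearing_temps_c))  # prints [7, 8]
```

In this made-up log, the last two readings climb well outside the normal range, which is the kind of trend staff would want to investigate before the bearing fails. A real program would draw on many more measurements and more careful analysis than a single threshold.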
“Do not be a break-fix organization that waits for failure before it takes action,” McClary recommended. “I am of the belief that it is not impossible or difficult to predict issues in equipment before failure. In fact, I believe that if you have a strong maintenance and lifecycle strategy, unpredicted failure becomes at the very least, a random event.”
Parts of the lifecycle strategy
It’s also critical to ensure that facility managers have a lifecycle strategy in place. McClary explained that this includes both a preventive and predictive maintenance program in conjunction with other best practices to boost the equipment lifecycle. Activities to look for here include strategies for:
- Replacing before failure: Many systems and components are meant to be replaced at certain intervals after their usefulness has expired. Not replacing this equipment increases the chances of failure and unplanned downtime.
- Rotation: Similarly, some components need to be rotated according to a specific schedule to ensure performance and balance.
- Replacement: Finally, customers should ensure that facility staff members have a strategy that dictates the proper times to replace equipment. This procedure ensures that critical systems aren’t interrupted in the process.
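A replace-before-failure schedule like the one described above can be sketched as a simple check of component age against its recommended service life. This is a hypothetical illustration: the `Component` class, the inventory entries, the service life figures and the 80 percent margin are assumptions made for the example, not manufacturer data or FORTRUST practice.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Component:
    name: str
    installed: date
    service_life_days: int  # recommended replacement interval (assumed value)


def due_for_replacement(components, today, margin=0.8):
    """Return names of components that have used at least `margin`
    of their recommended service life, so they can be replaced early."""
    due = []
    for component in components:
        age_days = (today - component.installed).days
        if age_days >= component.service_life_days * margin:
            due.append(component.name)
    return due


# Hypothetical inventory; names, dates and intervals are made up.
inventory = [
    Component("UPS battery string A", date(2020, 1, 1), 1460),
    Component("CRAH air filter 3", date(2023, 11, 1), 90),
    Component("Generator coolant hose", date(2023, 1, 1), 1825),
]
print(due_for_replacement(inventory, today=date(2024, 1, 1)))
# prints ['UPS battery string A']
```

Setting the margin below 1.0 schedules the replacement before the recommended interval runs out, which leaves time to plan the work so that critical systems are not interrupted.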
Additional best practices
Colocation customers should also ensure that their service providers are following other maintenance and lifecycle best practices.
This includes being aware of and incorporating equipment manufacturer recommendations into their overall processes. In many cases, facility staff shouldn’t just be following these recommendations, but exceeding them to ensure that equipment performs at optimal levels and that its lifecycle can be prolonged.
Customers should also look to ensure that their provider prioritizes preventive and predictive maintenance over corrective maintenance.
“Understand the cost of corrective maintenance is much greater over the long term,” McClary wrote. “Ask any classic car buff, and they will tell you the same thing! Regular preventive maintenance will save you money over the long haul.”
In addition, it’s best to ensure that critical processes including maintenance and lifecycle procedures are handled in-house, and that these activities are not outsourced to a third party. Facility managers should be incredibly selective about which processes are carried out by external vendors. As a rule of thumb, less than 20 percent of these overall procedures should be outsourced.
“A skilled operations team with ownership of the maintenance and lifecycle strategies is core to a data center’s critical systems infrastructure’s ability to continuously provide high-availability service delivery and uptime over a long amount of time,” McClary noted. “Maintenance and lifecycle strategy must be a routine. Attention to detail and ownership is contagious if the tone is set and emphasized at every level in the organization!”
FORTRUST has delivered 100 percent continuous critical systems uptime for more than 15 years. To find out more about FORTRUST’s specific maintenance and lifecycle strategies, contact them for a tour of their Denver Data Center today.
Robert D. McClary is Chief Operating Officer at FORTRUST, where he is responsible for the overall supervision of business operations, high-profile construction and strategic technical direction. Robert developed and implemented the process controls and procedures that support the continuous uptime and reliability that FORTRUST Denver has delivered since 2001. He is considered one of the leading experts on Management and Operations in the data center industry and was selected as a finalist by AFCOM for Data Center Manager of the Year. To give back to the data center community and promote youth technology education, Robert serves on the Board of Directors as President for the Rocky Mountain Chapter of the 7×24 Exchange and chairs the Board of Directors for KidsTek.