PALO ALTO, Calif. – Behind every cloud there’s a data center. Sometimes lots of them. As the cloud begins to deliver on its promise of massive scalability for viral apps like Pokemon Go, there are cloud builders working overtime to deliver the physical infrastructure to serve this traffic.
Joe Kava and Christian Belady have been on the front lines of this effort, building the data centers that drive the growth of massive cloud empires at Google and Microsoft. As cloud computing enters a new phase of growth, the veteran cloud builders shared their insights at last week’s Infrastructure Masons Summit in Palo Alto. Their message: Keeping pace with cloud growth requires creativity, and the ability to routinely expand your growth horizon.
“Whatever size you plan to scale to, it’s too small,” said Belady, the General Manager for Cloud Infrastructure Strategy and Architecture for Microsoft. “I think it’s always about thinking 10X growth and beyond that. We’re setting the foundation of where things are going to be.”
Over the past 18 months, the accelerating shift to the cloud has redefined the scale of data centers, with several providers building new server farms in excess of 1 million square feet. Companies like Google, Microsoft, Amazon, Facebook and Oracle are deploying unprecedented amounts of data center space. Google recently bought more than 1,200 acres of land near Reno for a future server farm.
This growth is driving innovation in data center design and construction for the cloud builders, and has created an enormous opportunity for data center developers specializing in agile construction, who are bringing new capacity to market at record speed. This emphasis on speed and scale in data center construction is likely to continue for the foreseeable future, as cloud builders seek to stay ahead of demand.
“Whatever you think today is going to be wrong,” said Kava, the Vice President of Data Centers at Google. “The amount of compute power is beyond what we could have imagined a few years ago.”
The Challenge of Capacity Planning
Kava and Belady shared their insights in a question-and-answer session moderated by Infrastructure Masons founder Dean Nelson, who previously headed the data center team at eBay and is now in charge of global infrastructure for Uber. The summit brought together dozens of executives responsible for building and operating data centers, who may be able to apply the lessons learned by Google and Microsoft in scaling their own infrastructure.
Infrastructure Masons was founded last year by Nelson to chart a course for the fast-growing cloud economy. The group’s members have built more than $100 billion worth of data centers, and include leaders of the infrastructure teams at Facebook, Microsoft, eBay, Switch and Google.
Capacity planning – figuring out how much data center space to build, and when – is one of the most difficult challenges in IT. Data center construction is extremely expensive, costing between $5 million and $9 million for every megawatt of capacity. As a result, building too much data center space can become an expensive mistake.
“Our first data center was 27 megawatts and we were thinking ‘did we build too much?’” said Belady. “We didn’t know whether we’d ever fill it. Now our leases are bigger than that.”
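To put that worry in perspective, the per-megawatt range quoted above can be applied to a facility the size of Microsoft’s first data center. The Python sketch below is back-of-the-envelope arithmetic only: the function name is hypothetical, and it assumes the $5 million to $9 million range applies evenly to every megawatt, which real projects may not match.

```python
def build_cost_range(megawatts, low_per_mw=5_000_000, high_per_mw=9_000_000):
    """Return the (low, high) construction cost in dollars for a given capacity.

    The default per-megawatt figures are the range cited in the article;
    applying them uniformly is an assumption made for illustration.
    """
    return megawatts * low_per_mw, megawatts * high_per_mw


low, high = build_cost_range(27)
print(f"27 MW: ${low / 1e6:.0f} million to ${high / 1e6:.0f} million")
# prints "27 MW: $135 million to $243 million"
```

On those assumptions, an unfilled 27-megawatt building would represent well over $100 million in idle capital.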
Unexpected Stress Test for a Global Cloud
Estimating future demand is an inexact science under the best of circumstances. Companies like Google and Microsoft have developed sophisticated models to understand past demand trends. Even so, predicting future demand can be especially challenging for cloud providers.
“When you build data centers for your own products, you’re able to stay ahead of the demand,” said Kava. “You have historic data you can work with to make projections. But we’re no longer just building for our own products. With the public cloud, you’re building for everyone else’s products.”
Cloud customers often have little insight into how much capacity their applications will require, Kava said. The most vivid example of this was Pokemon Go, the augmented reality game from Niantic that became a cultural phenomenon, driving half a billion downloads. Pokemon Go’s popularity provided an unexpected stress test for the Google Cloud Platform, which hosted the app.
“When they gave us their day one estimates, they were only off by a factor of 40X,” said Kava. “They were wildly off. Most companies’ ability to do capacity planning is limited. This is what everyone’s realizing. Ten or 20 years ago, it was implausible to come up with a business that could quickly gather 100 million users. Today, app developers can do that almost overnight.”
“It’s an example of why cloud is important,” said Belady. “For providers, there’s a barrier to entry because of scale. But for customers, scale gives the promise of infinite capacity.”
Should You Build or Buy?
Google and Microsoft have pursued different strategies to meet their demand for data center space. Google has built a sophisticated in-house data center construction operation, which has gone global as the company’s footprint has expanded into South America and Asia.
Microsoft also builds its own data centers, and has extensive in-house design capabilities. But as the growth of its Azure cloud has accelerated, Microsoft has begun leasing enormous amounts of server space from third-party developers like CyrusOne and DuPont Fabros Technology. In effect, Azure needs space faster than Microsoft can build it, so the company has increasingly relied on partners that specialize in high-speed construction.
“A lot of our builds are now leases,” said Belady. “It’s really helped us scale rapidly. We can no longer just do it on our own. You have to use the whole ecosystem. We use the data center providers as another part of our team.”
“I would never have imagined that leases would be the size they are today,” he said.
Microsoft leased an estimated 125 megawatts of third-party data center space in 2016, accounting for the top six deals in the wholesale data center industry. These deals have prompted leading data center REITs (real estate investment trusts) to develop new strategies for delivering facilities quickly, testing the ability of the data center supply chain to keep pace.
Google decided to start building its own data centers in 2006, when the company’s requirements for server storage grew beyond what colocation companies could provide.
“When we first started building, it was impossible to find a lease for 10 megawatts, so we did our own,” said Kava. “When we were first starting out building our own data centers, we were sole-sourced to one company doing our design and construction management. That got us what we needed. At some point, you need to scale beyond the U.S. We have 11 projects on four continents now. We had to distribute resources, and we built our own team.”
Moving Faster in Construction
Unpredictable demand of this kind places pressure on cloud providers to have extra capacity available to support viral breakouts like Pokemon Go. Capacity planning is only going to become harder with the simultaneous emergence of a host of data-intensive technologies, including virtual reality, Big Data analytics, the Internet of Things and connected cars.
“Let’s say I want to have a 10 percent buffer,” said Kava, who oversees a global data center platform that used 5.7 terawatt-hours of electricity in 2015. “For me, 10 percent is a lot of buffer.”
The bottom line is that data centers must be built more quickly, and at larger scale. Google and Microsoft are continuously refining their data center construction strategies. Kava said Google’s data centers tend to evolve along two tracks, each of which presents challenges.
“If you have 100 megawatts (at a data center) on day one, and you expect it to grow substantially from there, the most cost-effective way to do that is to deploy in large chunks,” said Kava, who said Google approaches these facilities by expanding in 20 megawatt phases – an amount of capacity that’s larger than many entire data centers.
As cloud providers expand into new geographic markets, they see a different profile of demand growth.
“If you’re smaller on day one and have a long ramp, large modules might remain empty for a while,” said Kava. “You have to have solutions that cover both ends of the spectrum. A new cloud region is a smaller number of megawatts. It will be a few years before you get to 10 megawatts, so I wouldn’t build in large chunks. That’s where the leasing ecosystem really makes sense.
“You have to have solutions that cover both ends of the spectrum,” he said.
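The trade-off Kava describes can be sketched with a few lines of Python. This is an illustration under stated assumptions, not Google’s planning model: the demand ramp and the 5-megawatt increment are hypothetical, the function name is invented for the example, and only the 20-megawatt phase size comes from Kava’s remarks.

```python
import math


def idle_capacity(demand_by_year, module_mw):
    """For each year, deploy enough whole modules to cover demand
    and return the megawatts that would sit unused."""
    idle = []
    for demand in demand_by_year:
        modules = math.ceil(demand / module_mw)
        idle.append(modules * module_mw - demand)
    return idle


# Hypothetical slow ramp for a new cloud region, in megawatts per year.
slow_ramp = [2, 4, 6, 10]

print(idle_capacity(slow_ramp, module_mw=20))  # 20 MW phases: [18, 16, 14, 10]
print(idle_capacity(slow_ramp, module_mw=5))   # 5 MW increments: [3, 1, 4, 0]
```

With this made-up ramp, a single 20-megawatt phase stays at least half empty for all four years, while smaller increments, such as leased space, track demand far more closely.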
Building the Clouds to Come
That’s a challenge Nelson will likely face in his new post as head of Uber Compute. The ride-sharing and logistics company has been leasing wholesale data center space over the past year to scale up its global data platform – a task that could grow to a whole new level of scale with Uber’s planned shift to autonomous vehicles.
“Uber is going to have to scale more rapidly than Joe and I ever did,” said Belady. “Dean has insight into this. He has to leverage the whole ecosystem.”
Nelson has first-hand experience with both building and buying space. At eBay, he built an innovative greenfield data center near Salt Lake City that integrated Bloom Energy Servers into its power chain. eBay is also a major tenant for Switch at its SUPERNAP campus in Las Vegas and Citadel campus in Reno.
Nelson says the emergence of new technologies will test the industry’s ability to understand demand.
“We are in a new space,” said Nelson. “I’m living it. Things are changing. There’s this wave coming that is different than what we’ve done before.”
Masons Focus on Education
The summit was the latest in a series of meetings in which Infrastructure Masons is developing its agenda, including plans to roll out the Data Center Performance Index, a new user-friendly metric for reliability and operations. In just a year, the group has grown to 1,446 members in 46 countries.
One of the group’s priorities is making students aware of the data center industry as a career option. The Masons’ Education Committee is working to develop local chapters for universities, creating a channel to introduce students to what it’s like to work in a data center – including the opportunity to make a big difference and a good living. A key goal is to give more students the opportunity to tour a data center and see one of the facilities in action. This element may require a change in mindset, as data centers typically focus on limiting access for security and privacy reasons. One option is to provide training and resources to data center professionals, equipping them to be ambassadors for the industry.
“We need to get Masons out in the community and doing partnerships,” said Winston Saunders, an Intel veteran who is heading the Education Committee. “It’s increasing the visibility of our profession.”
There was also discussion of using scholarships to develop promising talent, as well as strategies to raise awareness of the data center industry among high school teachers, guidance counselors and students.