Data Center Frontier

Charting the future of data centers and cloud computing.

How Machine Learning is Changing the Face of the Data Center

By Rich Miller - May 24, 2016

Google's Tensor Processing Unit, a custom ASIC designed to crunch data for machine learning workloads. (Photo: Google)

Machine learning and artificial intelligence have arrived in the data center, changing the face of the hyperscale server farm as racks begin to fill with ASICs, GPUs, FPGAs and supercomputers.

These technologies provide more computing horsepower to train machine learning systems, a process that involves enormous amounts of data-crunching. The end goal is to create smarter applications and improve the services you already use every day.

“Artificial intelligence is now powering things like your Facebook Newsfeed,” said Jay Parikh, Global Head of Engineering and Infrastructure for Facebook. “It is helping us serve better ads. It is also helping make the site safer for people that use Facebook on a daily basis.”

“Machine Learning is transforming how developers build intelligent applications that benefit customers and consumers, and we’re excited to see the possibilities come to life,” said Norm Jouppi, Distinguished Hardware Engineer at Google.

Much of the computing power to create these services will be delivered from the cloud. As a result, cloud builders are adopting hardware acceleration techniques that have been common in high performance computing (HPC) and are making their way into the hyperscale computing ecosystem.

The race to leverage machine learning is led by the industry’s marquee names, including Google, Facebook and IBM. As usual, the battlefield runs through the data center, with implications for the major cloud platforms and chipmakers like Intel and NVIDIA.

Google Unveils TPU Hardware

Neural networks are computing systems loosely modeled on the learning process of the human brain, and training them to solve new challenges requires lots of computing horsepower. That’s why the leading players in the field have moved beyond traditional CPU-driven servers and are now building systems that accelerate the work. In some cases, they’re creating their own chips.

Last week Google revealed the Tensor Processing Unit (TPU), a custom ASIC tailored for TensorFlow, an open source software library for machine learning that was developed by Google. An ASIC (application-specific integrated circuit) is a chip that is custom-designed to perform a specific task. Recent examples of ASICs include the custom chips used in bitcoin mining. Google has used its TPUs to squeeze more operations per second into the silicon.

A custom rack in a Google data center packed with Tensor Processing Unit hardware for machine learning. (Photo: Google)

“We’ve been running TPUs inside our data centers for more than a year, and have found them to deliver an order of magnitude better-optimized performance per watt for machine learning,” writes Norm Jouppi, Distinguished Hardware Engineer, on the Google blog. “This is roughly equivalent to fast-forwarding technology about seven years into the future (three generations of Moore’s Law).”
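The arithmetic behind Jouppi's comparison can be checked quickly. The sketch below assumes that each Moore's Law "generation" doubles performance per watt, which is a simplification introduced here rather than a figure from Google; the function names are illustrative.

```python
def gain_after_generations(generations: int, factor_per_generation: float = 2.0) -> float:
    """Cumulative improvement after N generations, assuming a fixed factor per generation."""
    return factor_per_generation ** generations


def years_per_generation(total_years: float, generations: int) -> float:
    """Average length of one generation implied by the quote."""
    return total_years / generations


if __name__ == "__main__":
    gain = gain_after_generations(3)       # three generations, per the quote
    cadence = years_per_generation(7, 3)   # seven years, per the quote
    print(f"Three doublings give a {gain:.0f}x gain")
    print(f"Implied cadence: {cadence:.1f} years per generation")
```

Under that doubling assumption, three generations yield an 8x gain, close to the "order of magnitude" in the quote, at roughly 2.3 years per generation.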

A board with a TPU fits into a hard disk drive slot in a data center rack. Google used its TPU infrastructure to power AlphaGo, the software program that defeated world Go champion Lee Sedol in a match. Go is a complex board game in which human players maintained an edge on computers, which long ago had overtaken the abilities of humans in games like chess or “Jeopardy.” The complexities of Go presented a challenge to artificial intelligence technology, but the extra power supplied by TPUs helped Google’s program solve more difficult computational challenges and defeat Sedol.

“Our goal is to lead the industry on machine learning and make that innovation available to our customers,” writes Jouppi. “Building TPUs into our infrastructure stack will allow us to bring the power of Google to developers across software like TensorFlow and Cloud Machine Learning with advanced acceleration capabilities.”

Big Sur GPUs power Facebook’s AI Infrastructure

Facebook’s AI Lab is using GPUs to bring more horsepower to bear on data-crunching for its artificial intelligence (AI) and machine learning platform.

“We’ve been investing a lot in our artificial intelligence technology,” said Parikh.

Facebook’s Big Sur is Open Rack-compatible hardware designed for AI computing at a large scale. (Image: Facebook)

The Big Sur system leverages NVIDIA’s Tesla Accelerated Computing Platform, with eight high-performance GPUs of up to 300 watts each and the flexibility to configure between multiple PCIe connections. Facebook has optimized these new servers for thermal and power efficiency, allowing the company to operate them in its data centers alongside standard CPU-powered servers.
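To put those GPU figures in data center terms, here is a rough power-budget sketch. The GPU count and per-GPU wattage come from the description above; the 12 kW rack budget is a hypothetical value chosen for illustration, and the estimate ignores CPUs, memory, fans and power-conversion losses.

```python
GPUS_PER_SERVER = 8        # from the article
MAX_WATTS_PER_GPU = 300    # from the article ("up to 300 watts each")


def max_gpu_power_watts(gpus: int = GPUS_PER_SERVER,
                        watts_per_gpu: int = MAX_WATTS_PER_GPU) -> int:
    """Upper bound on GPU power draw for one server."""
    return gpus * watts_per_gpu


def servers_per_rack(rack_budget_watts: int, server_watts: int) -> int:
    """Whole servers that fit within a rack power budget."""
    return rack_budget_watts // server_watts


if __name__ == "__main__":
    per_server = max_gpu_power_watts()
    hypothetical_rack_budget = 12_000  # assumption, not a Facebook figure
    fit = servers_per_rack(hypothetical_rack_budget, per_server)
    print(f"GPU power per server: up to {per_server} W")
    print(f"Servers per {hypothetical_rack_budget} W rack (GPU power only): {fit}")
```

At up to 2,400 watts for the GPUs alone, each server draws far more than a typical CPU-only machine, which is why thermal and power design matters when the two share a data hall.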

The gains in performance and latency provided by Big Sur help Facebook process more data, dramatically shortening the time needed to train its neural networks.

“It is a significant improvement in performance,” said Parikh. “We’ve deployed thousands of these machines in a matter of months. It gives us the ability to drive this technology into more product use cases within the company.”

Intel Focuses on FPGAs

On the hardware front, NVIDIA is perhaps the leading beneficiary of the new focus on machine learning, which has boosted sales of its GPU technology to hyperscale players. But it’s not the only chipmaker targeting the machine learning market.

Intel recently began sampling a new module that combines its traditional CPUs with field programmable gate arrays (FPGAs), semiconductors that can be reprogrammed to perform specialized computing tasks. FPGAs are similar to ASICs in that they allow users to tailor compute power to specific workloads or applications, but unlike ASICs, whose function is fixed when the chip is made, FPGAs can be reprogrammed for new tasks.

Intel sees FPGAs as the key to designing a new generation of products to address emerging customer workloads in the data center sector. In 2015 it paid $16 billion to acquire Altera, a leading maker of FPGAs and other programmable logic devices (PLDs) used to automate industrial infrastructure.

“We think FPGAs are very strategic,” said Raejeanne Skillern, GM of the Cloud Service Provider Business at Intel. “We’re doing a lot of development with OEMs and customers, and continuing to implement (FPGAs) into our roadmap.”

A particular focus for Intel is the “Super 7” group of cloud service providers that are driving hyperscale infrastructure innovation, which includes Amazon, Facebook, Google and Microsoft, along with Chinese hyperscale companies Alibaba, Baidu and Tencent. Intel projects that by 2020, more than 30 percent of cloud service provider nodes will be accelerated by FPGAs.

Beyond Jeopardy: IBM Takes Watson to Market

IBM is pursuing a different path with its push into artificial intelligence, which Big Blue refers to as “cognitive computing.” IBM is targeting enterprise users, and leading with Watson.

The IBM Watson supercomputer became the poster child for artificial intelligence in 2011, defeating two human champions in a game of Jeopardy.

As the race to bring AI and machine learning to the mass market accelerates, IBM is seeking to keep Watson relevant with commercial offerings that show how AI can be used in the enterprise and public sector.

Watson consists of a collection of algorithms and software running on IBM’s Power 750 line of servers, and learns from data instead of being explicitly programmed to carry out instructions. IBM says Watson is the ideal tool to help companies make sense of Big Data.

“We are seeing massive, massive growth in the amount of data, and most of that is unstructured,” said Steven Abrams, Distinguished Engineer at IBM’s Thomas Watson Research Center. “Until now, it’s been hard to get our arms around that data and what we can do with it.”

At the recent DataCenterDynamics Enterprise event in New York, Abrams outlined how customers can use Watson to build applications. IBM is accustomed to “large, transformative engagements” with enterprise clients, but is now offering a subscription model in which clients can use Watson via cloud APIs (application programming interfaces).

At DataCenterDynamics Enterprise, a group of startups and analytics firms described how they were using IBM Watson’s “cognitive computing” capabilities to create applications. (Photo: Rich Miller)

“We have to make Watson available to the type of company that may not normally be able to do business with IBM, and give them access to Watson technology,” said Abrams.
“We’ve gotten to the point where the technology is much closer to the self-service model. We’re really focusing on developers. We’re focused on helping people go from 0 to 60 in much less time.”

On the DCD panel, several customers discussed how they are using Watson to build apps. Some examples:

  • PurpleForge trains Watson to provide quick answers to engineering and support questions. Watson absorbs details from textbooks and manuals and then builds that knowledge into a database that can be queried with natural language questions. “It lets you do things you never thought were possible,” said Brian Hurley, President and CEO of PurpleForge.
  • Data security startup SparkCognition uses Watson to collate vulnerability databases and lists, and uses the data to provide real-time remediation advice to security professionals. Director of Business Development Stuart Gillen says using Watson shortens the time-to-solution for potential threats.
  • Equals 3 Media allows digital marketers to use big data analytics to refine ad serving, matching ads to users and their interests. Equals 3 uses Watson’s “personality insights” service, which scans social media networks, mining them for signals that can help advertisers target their messaging. For example, the Watson personality profiles might be used to market a car in different ways to different users: touting performance and speed to an adventure-oriented single person, and safety features to a suburban mom.

The personality profiling is one aspect of machine learning that may test users’ comfort levels, one panelist noted, raising the specter of a “giant Watson in the sky that knows everything.”

Cloud Delivery Model

Whether it’s Watson or competing services, it’s clear that the cloud will be the primary delivery method for consumer-facing services that tap machine learning. Google, Microsoft and Amazon Web Services all now offer fully managed cloud services that let customers analyze data and build applications or services.

As a result, the hardware required to support machine learning will live primarily in hyperscale data centers, which are already highly customized for extreme efficiency and high density workloads. These services are relatively new, so it’s not yet clear whether cloud economics will favor keeping these services in third-party clouds, or whether it will eventually make sense for end users to shift these workloads to company-operated data centers. For cloud services, it typically takes significant scale before the economics shift in favor of a company-operated facility.

But for the data center community, the benefits of machine learning aren’t measured only by the volume of hardware. Google is using machine learning and artificial intelligence to wring even more efficiency out of its mighty data centers. Joe Kava, Vice President for Data Center Operations at Google, said the use of neural networks will allow Google to reach new frontiers in efficiency in its server farms, moving beyond what its engineers can see and analyze.

“Our data centers are very large and complex,” said Kava. “The sheer number of interactions and operating parameters makes it really impossible for us mere mortals to understand how to optimize a data center in real time. However, it really is pretty trivial for computers to crunch through all those scenarios and find the optimal settings.

“Over the past couple of years, we’ve developed these algorithms and we’ve trained them with billions of data points from all of our data centers all over the world,” said Kava. “Now we use this machine learning to help our teams visualize the data, so the operations teams can know how to set up the electrical and mechanical plants for the optimal settings on any given day.”

In early usage, the neural network has been able to predict Google’s Power Usage Effectiveness with 99.6 percent accuracy. Its recommendations have led to efficiency gains that appear small, but can lead to major cost savings when applied across a data center housing tens of thousands of servers.
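Power Usage Effectiveness (PUE) is the ratio of total facility power to the power delivered to IT equipment, so a value closer to 1.0 means less overhead for cooling and power distribution. The sketch below shows why a seemingly small PUE improvement adds up at scale. Every input (a 10 MW IT load, a PUE change from 1.12 to 1.10, and $0.07 per kWh) is a hypothetical value chosen for illustration, not a figure from Google.

```python
HOURS_PER_YEAR = 8760


def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power divided by IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def annual_savings_usd(it_load_kw: float, pue_before: float,
                       pue_after: float, usd_per_kwh: float) -> float:
    """Yearly cost avoided when PUE drops while the IT load stays constant."""
    saved_kw = it_load_kw * (pue_before - pue_after)
    return saved_kw * HOURS_PER_YEAR * usd_per_kwh


if __name__ == "__main__":
    # All inputs below are illustrative assumptions.
    before = pue(total_facility_kw=11_200, it_equipment_kw=10_000)
    savings = annual_savings_usd(10_000, before, 1.10, usd_per_kwh=0.07)
    print(f"PUE before: {before:.2f}")
    print(f"Estimated annual savings: ${savings:,.0f}")
```

With these assumed inputs, a 0.02 reduction in PUE avoids about 200 kW of continuous overhead, or roughly $120,000 a year for a single facility.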

Tagged With: Artificial Intelligence, Big Data, Facebook, Google, IBM

About Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.

Copyright Data Center Frontier LLC © 2022