AI Hardware is Disrupting the Data Center. Here’s Who to Watch

As powerful new chips target AI workloads, here’s a look at the players to watch, including specialized hardware from startups, incumbents and hyperscaler operators.

Rich Miller

Oct. 5, 2020

Groq’s Tensor Streaming Processor, shown here on a PCIe board that is currently being tested by customers. (Image: PRNewsfoto/Groq)

The market for data center hardware is undergoing major disruption. NVIDIA’s plan to acquire ARM is the latest in a series of shifting tides, with new processors and servers providing many options for users yearning for more power to tame artificial intelligence (AI) workloads. That includes new offerings from resurgent players, as well as a herd of startups offering specialized chips for AI computing.

The rise of specialized computing has the potential to bring change in the data center industry, which must adapt to new form factors and higher rack densities. But for all the excitement in the chip and server sector, the racks and rows of most data halls continue to be populated by Intel chips, particularly in the enterprise sector. Is this about to change? And if so, when?

Those were the big questions on the table at the AI Hardware Summit, a virtual event from Kisaco Research that offered updates and insights from leading lights in high-performance hardware.

The bottom line: There will be powerful new chips arriving from several startups in coming months. It’s the arrival of what analyst Karl Freund of Moor Insights calls a “Cambrian Explosion” of new chips optimized for AI data-crunching.

“We’re going to see a lot of new competitors later this year,” said Freund. “I know I said that last year. Now we’re starting to see chips come out, and it should be pretty exciting. These chips take longer to develop than anyone wants. But there’s no way faster chips can come close to keeping up with the growth in models.”

The Rise of Domain-Specific Architectures

The development of AI algorithms is accelerating, said Freund, with new models incorporating billions of data points to make recommendations and decisions. As models incorporate more data, they also require more computing horsepower, which in turn is driving an AI hardware arms race. The competition to leverage AI is led by the industry’s marquee names – including Google, Facebook, Amazon and Microsoft – who are seeking to add intelligence to a wide range of services and applications.

The future of AI hardware will be specialized, according to David Patterson, a Distinguished Engineer at Google.

“Moore’s law is slowing down,” said Patterson, one of the keynote speakers at the summit. “There’s people that want to argue about it, but it’s a fact that you can measure. As a result, general-purpose microprocessor performance is improving slowly.

“The domain specific architecture is the future of computing,” said Patterson, a pioneer in reduced instruction set computing (RISC) and vice-chair of the RISC-V initiative. “You can tailor your work to that domain and ignore other domains.”

That realization led to big changes in how Google operates its massive global computing platform. In 2015 Patterson led Google’s development of the Tensor Processing Unit (TPU), a specialized chip architecture that dramatically boosted Google’s processing power.

The TPU is a custom ASIC tailored for TensorFlow, an open source software library for machine learning that was developed by Google. An ASIC (application-specific integrated circuit) is a chip designed to perform a specific task rather than general-purpose computing. Adopting a domain-specific approach allowed Google to drop general-purpose features to save on space and energy in the processor.

A four-rack “pod” of Google Tensor Processing Units (TPUs) and supporting hardware inside a Google data center. (Photo: Google)

Most importantly, it allowed Google to deploy massive processing power in a smaller footprint, shifting to liquid cooling to boost its rack density. Without the TPU architecture, Google would have needed to build an additional 20 to 40 data centers to gain equivalent compute horsepower.

Google’s embrace of specialized hardware reset the competitive landscape, highlighting an opportunity for a huge market beyond the x86 architecture.

“I think of TPU one as the Helen of Troy of chips,” said Patterson. “It launched 1,000 chips.”

Who Are the Hot New Players?

Google’s in-house technology sets a high bar for other major tech players seeking an edge using AI to build new services and improve existing ones.

Among existing chipmakers, recent beneficiaries include NVIDIA (which holds its GPU Technology Conference this week) and AMD, as well as ARM specialist Ampere. Intel also expresses confidence in its opportunities in AI, particularly as more compute is applied to inference (decision-making) in addition to training algorithms. Last December Intel acquired AI startup Habana Labs.

Meanwhile, a host of startups are developing AI hardware. Freund said the players to watch as their technology enters production include Cerebras Systems, SambaNova, Groq, Graphcore, Tenstorrent and Habana Labs.

Freund also sees big impacts from other hyperscale operators developing inference chips. He cited Alibaba’s Hanguang 800 inference chip, which can process 80,000 images per second, as well as the AWS Inferentia initiative.

The AI Hardware Summit continues on Oct. 6-7.

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
