NVIDIA Moves to Accelerate Growth of GPU-Powered AI Clouds

NVIDIA is looking to accelerate sales of its AI hardware to hyperscale computing providers through tighter partnerships with original design manufacturers (ODMs), who have become key players in the cloud hardware ecosystem.

Rich Miller

May 30, 2017

A prototype of the HGX-1 machine learning system on display at the Microsoft booth at the Open Compute Summit. The HGX-1 is a collaboration between Microsoft, NVIDIA and Ingrasys/Foxconn and optimized for cloud AI services. This version uses eight Tesla P100 for PCIe GPUs. (Photo: Rich Miller)

NVIDIA is looking to accelerate sales of its artificial intelligence (AI) hardware to hyperscale computing providers through tighter partnerships with original design manufacturers (ODMs), who have become key players in the cloud technology ecosystem. The company said it will work more closely with partners like Foxconn, Quanta/QCT, Inventec, and Wistron in building custom machine learning hardware using NVIDIA’s GPU (graphics processing unit) technology.

ODMs create new products and specifications based on the product designs provided by the client. NVIDIA engineers will now work closely with ODMs to help minimize the amount of time from design win to production deployments.

The NVIDIA HGX Partner Program will provide each ODM with early access to NVIDIA’s HGX reference architecture, GPU computing technologies and design guidelines. HGX is the data center design used to power the Microsoft Project Olympus initiative, Facebook’s Big Basin server and NVIDIA DGX-1 AI supercomputers.

Surging Demand for AI Hardware

“Accelerated computing is evolving rapidly — in just one year we tripled the deep learning performance in our Tesla GPUs — and this is having a significant impact on the way systems are designed,” said Ian Buck, general manager of Accelerated Computing at NVIDIA. “Through our HGX partner program, device makers can ensure they’re offering the latest AI technologies to the growing community of cloud computing providers.”

The race to leverage AI is led by the industry’s marquee names – including Google, Facebook, Amazon and Microsoft – who are seeking to add intelligence to a wide range of services and applications. These companies have developed in-house designs for many elements of their infrastructure, turning to ODMs like Quanta/QCT and Wistron to turn their design ideas into hardware for their data centers.

The HGX design is configurable, with the ability to combine GPUs and CPUs in a number of ways to customize hardware for high performance computing (HPC), deep learning training or deep learning inferencing. Using HGX as a starter recipe, ODM partners can work with NVIDIA to more quickly design and bring to market a wide range of qualified systems for hyperscale data centers.

NVIDIA is already working with some of the major ODMs. It partnered with Foxconn’s Ingrasys Technology unit and Microsoft to develop the HGX-1 AI accelerator for GPU-powered deep learning in the cloud. But ODMs said the new partnership structure will enable them to be more agile in creating new GPU products.

“Through this new partner program with NVIDIA, we will be able to more quickly serve the growing demands of our customers, many of whom manage some of the largest data centers in the world,” said Taiyu Chou, general manager of Foxconn/Hon Hai Precision Ind Co., Ltd., and president of Ingrasys Technology. “Early access to NVIDIA GPU technologies and design guidelines will help us more rapidly introduce innovative products for our customers’ growing AI computing needs.”

“As a long-time collaborator with NVIDIA, we look forward to deepening our relationship so that we can meet the increasing computing needs of our hyperscale data center customers,” said Donald Hwang, chief technology officer and president of the Enterprise Business Group at Wistron. “Our customers are hungry for more GPU computing power to handle a variety of AI workloads, and through this new partnership we will be able to deliver new solutions faster.”

“Tapping into NVIDIA’s AI computing expertise will allow us to immediately bring to market game-changing solutions to meet the ever-growing demands in the AI era,” said Mike Yang, SVP of Quanta Computer Inc. and President of QCT.

Arms Race in Hyperscale AI Hardware

NVIDIA’s graphics processing technology has been one of the biggest beneficiaries of the rise of specialized computing, gaining traction with workloads in supercomputing, AI and connected cars. NVIDIA has been investing heavily in innovation in AI, which it sees as a pervasive technology trend that will bring its GPU technology into every area of the economy and society.

In artificial intelligence (AI), computers are assembled into neural networks that emulate the learning process of the human brain to solve new challenges. It’s a process that requires lots of computing horsepower, which is why the leading players in the field have moved beyond traditional CPU-driven servers.

The NVIDIA DGX-1 is packed with 8 Tesla P100 GPUs, providing 170 teraflops of computing power in a 3U form factor. (Photo: Rich Miller)

NVIDIA says the world’s 10 largest hyperscale businesses are all using its GPU accelerators in their data centers. It is introducing the partnership program at a key moment in the AI hardware arms race, as these huge customers survey an evolving landscape of new hardware. NVIDIA has just introduced new Volta-based GPUs offering three times the performance of its prior architecture, and is hoping ODMs can feed the market demand with new products based on the latest NVIDIA technology available.

The standard HGX design architecture includes eight NVIDIA Tesla GPUs in the SXM2 form factor and connected in a cube mesh using NVIDIA NVLink high-speed interconnects and optimized PCIe topologies. With a modular design, HGX enclosures are suited for deployment in existing data center racks across the globe, using hyperscale CPU nodes as needed. Both NVIDIA Tesla P100 and V100 GPU accelerators are compatible with HGX. This allows for immediate upgrades of all HGX-based products once V100 GPUs become available later this year.
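As a rough illustration of what an eight-GPU cube mesh implies, the sketch below models one possible wiring in Python. NVIDIA's announcement does not spell out the link layout, so the code assumes the arrangement commonly described for eight-GPU P100 systems: two fully connected groups of four GPUs, with each GPU also linked to its counterpart in the other group. The function names (`build_cube_mesh`, `hops`) are hypothetical and exist only for this example.

```python
from collections import deque
from itertools import combinations

NUM_GPUS = 8  # eight Tesla GPUs per HGX enclosure, per the article


def build_cube_mesh(num_gpus=NUM_GPUS):
    """Return an adjacency map for an ASSUMED hybrid cube-mesh layout."""
    half = num_gpus // 2
    links = {gpu: set() for gpu in range(num_gpus)}
    # Assumption: each group of four GPUs is fully connected.
    for start in (0, half):
        for a, b in combinations(range(start, start + half), 2):
            links[a].add(b)
            links[b].add(a)
    # Assumption: each GPU also links to its counterpart in the other group.
    for gpu in range(half):
        links[gpu].add(gpu + half)
        links[gpu + half].add(gpu)
    return links


def hops(links, src, dst):
    """Breadth-first search for the fewest link hops between two GPUs."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return seen[node]
        for peer in links[node]:
            if peer not in seen:
                seen[peer] = seen[node] + 1
                queue.append(peer)
    raise ValueError("GPUs are not connected")


mesh = build_cube_mesh()
links_per_gpu = {gpu: len(peers) for gpu, peers in mesh.items()}
worst_case_hops = max(hops(mesh, a, b) for a, b in combinations(mesh, 2))
print(links_per_gpu)
print("worst-case hops between any two GPUs:", worst_case_hops)
```

Under these assumptions every GPU uses four links, and any GPU can reach any other in at most two hops, which is one reason this kind of layout suits workloads that exchange data between GPUs frequently.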

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
