Intel Engineers Multi-Pronged Approach to Gaining Data Center AI Market Share

April 12, 2024
Three years into the renewed stewardship of Intel CEO Pat Gelsinger, new GPUs, AI accelerators, CPUs, and open standards now lead the way in a competitive AI infrastructure ramp-up.

It was bad enough for Intel when AMD started making huge inroads into its server CPU market with recent generations of EPYC processors -- but as the market shifted toward massive investment in AI, Intel was hard-pressed to offer any competition for Nvidia.

But five years after acquiring Israeli start-up Habana Labs and its Gaudi AI chip family, Intel thinks it has a shot at competing in the AI chip marketplace with the Intel Gaudi 3 AI accelerator.

Inference Matters

It is true that the AI marketplace is Nvidia first and then everybody else, and Nvidia's very aggressive release schedule for advanced AI products keeps it that way. But it is important to remember that the vast majority of AI-related tasks don't require the absolute pinnacle of current performance.

While training large language models is the jewel in the AI crown, the majority of enterprise AI use is inference: applying what those models have already learned, most often visible to consumers as the recommendation engines they encounter on websites.
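
To make the distinction concrete, the short sketch below shows what inference looks like in practice: a model someone else has already trained is loaded and asked for predictions, with no gradients and no weight updates. The library and model named here are illustrative open-source choices, not tied to any particular accelerator.

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Illustrative open checkpoint; any pretrained model is used the same way.
    MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
    model.eval()  # inference mode: weights are frozen, nothing is being trained

    inputs = tokenizer("The new accelerator looks like a good fit for our workload.",
                       return_tensors="pt")
    with torch.no_grad():  # a forward pass only, no gradient bookkeeping
        logits = model(**inputs).logits
    print(model.config.id2label[logits.argmax(dim=-1).item()])  # e.g. "POSITIVE"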

And the industry is realizing this, as clearly illustrated by Cisco's partnership with Nvidia to deploy capable Tensor Core-based systems within its blade servers and Ethernet networks.

Competing in the Market

In its Gaudi 3 announcement, Intel claims that the Gaudi 3, crafted using TSMC's advanced 5nm technology, delivers 50% better inference performance and 40% better power efficiency than the previous top-of-the-line Nvidia H100 AI GPU.

What this means is that the large market of buyers who don't need the absolute cutting edge will finally see serious competition for their business.

And while pricing has not been announced, Intel claims the Gaudi 3 will be a fraction of the cost of comparable Nvidia AI accelerators.

Gaudi 3 vs. Nvidia

Intel likes to point out that the previous-generation Gaudi 2 is the only benchmarked alternative to the Nvidia H100 for generative AI performance, with published results to back the claim.

And the performance claims for Gaudi 3 indicate a significant generational increase, with 4x the AI compute in BF16, 1.5x the memory bandwidth, and 2x the networking bandwidth of Gaudi 2 driving the gains.

Additional details on the Gaudi 3 accelerator are available in Intel's product documentation.

Performance comparisons against the shipping Nvidia H100 using common reference models make the following claims:

  • 50% faster time-to-train on average across the Llama 2 7B and 13B parameter models and the GPT-3 175B parameter model.
  • Inference throughput projected to outperform the H100 by 50% on average, with inference power efficiency 40% better on average, across the Llama 7B and 70B parameter models and the Falcon 180B parameter model.

CUDA vs. Open Standards and Ethernet Implications

Perhaps more importantly, with the new release Intel is taking on Nvidia's developer dominance with CUDA by focusing on open-standard, community-based software, reducing reliance on CUDA-specific development skills.
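
The practical appeal is that mainstream frameworks such as PyTorch address an accelerator through a device abstraction rather than vendor-specific kernel code. The sketch below assumes Intel's Gaudi PyTorch bridge (the habana_frameworks package, which exposes an "hpu" device) is installed; treat the package and device names as assumptions to verify against Intel's current Gaudi software documentation.

    import torch

    def pick_device() -> torch.device:
        """Prefer a Gaudi HPU if its PyTorch bridge is present, then CUDA, then CPU."""
        try:
            import habana_frameworks.torch.core  # noqa: F401  Gaudi bridge (assumed package name)
            return torch.device("hpu")
        except ImportError:
            pass
        if torch.cuda.is_available():
            return torch.device("cuda")
        return torch.device("cpu")

    device = pick_device()
    model = torch.nn.Linear(1024, 1024).to(device)  # the model code itself is unchanged
    x = torch.randn(8, 1024, device=device)
    print(device, model(x).shape)                   # forward pass on whichever device was found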

It is also making use of standard Ethernet networking, allowing systems to scale from a single node to mega-clusters of thousands of nodes.

This means supporting inference and model training at a scale that can adjust to customer demand.

Will “Open” Really Make the Difference?

To help drive the concept of open-platform AI development, Intel announced that it has joined with Anyscale, Articul8, DataStax, Domino, Hugging Face, KX Systems, MariaDB, MinIO, Qdrant, Red Hat, Redis, SAP, VMware, Yellowbrick, and Zilliz with the intention of creating an open AI platform.

To get this effort going, Intel announced that they “will release reference implementations for GenAI pipelines on secure Intel Xeon and Gaudi-based solutions, publish a technical conceptual framework, and continue to add infrastructure capacity in the Intel Tiber Developer Cloud for ecosystem development and validation of RAG and future pipelines.”
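
For readers unfamiliar with the acronym, RAG (retrieval-augmented generation) pairs a retrieval step over an organization's own documents with a generative model that answers from the retrieved context. The sketch below is not Intel's reference implementation, only a minimal illustration of the pattern; the embedding and generation models named are arbitrary open checkpoints chosen for brevity.

    from sentence_transformers import SentenceTransformer, util
    from transformers import pipeline

    documents = [
        "Gaudi 3 ships in air-cooled and water-cooled variants.",
        "The accelerator scales out over standard Ethernet networking.",
    ]

    # Retrieval step: embed the documents and the query, pick the closest match.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vectors = embedder.encode(documents, convert_to_tensor=True)
    query = "How does Gaudi 3 scale across nodes?"
    query_vector = embedder.encode(query, convert_to_tensor=True)
    best = util.cos_sim(query_vector, doc_vectors).argmax().item()

    # Generation step: prepend the retrieved context to the prompt.
    generator = pipeline("text-generation", model="gpt2")  # stand-in generator
    prompt = f"Context: {documents[best]}\nQuestion: {query}\nAnswer:"
    print(generator(prompt, max_new_tokens=40)[0]["generated_text"])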

The company is also working through the Ultra Ethernet Consortium to develop optimized Ethernet networking for an AI fabric.

Everyone Is Becoming An AI Company

Across many technical disciplines, according to Intel CEO Pat Gelsinger, every player is becoming an AI company.

Be it manufacturing components for AI-enabled infrastructure, building the AI accelerators themselves, or using AI to deliver value to customers, everyone seems to be involved. And Gelsinger points out that “Intel is bringing AI everywhere across the enterprise, from the PC to the data center to the edge."

He continues, "Our latest Gaudi, Xeon and Core Ultra platforms are delivering a cohesive set of flexible solutions tailored to meet the changing needs of our customers and partners and capitalize on the immense opportunities ahead.”

Select hardware partners, including Dell Technologies, Hewlett Packard Enterprise, Lenovo, and Supermicro, are sampling Gaudi 3 this quarter, with the air-cooled version expected to ship to customers in Q3 2024 and the water-cooled version in Q4 2024.

About the Author

David Chernicoff

David Chernicoff is an experienced technologist and editorial content creator who sees the connections between technology and business, works out how to get the most from both, and explains the needs of business to IT and of IT to business.
