AMD Outlines its AI Roadmap, Including New GPUs

June 14, 2023

AMD is positioning itself as a provider of a full range of AI hardware, with everything from optimizations for its EPYC CPUs to dedicated data center GPUs and everything in between.

David Chernicoff

The AMD Infinity Architecture Platform, which features 8 AMD Instinct MI300X GPUs.

SAN FRANCISCO - From next-generation server CPUs to AI accelerators and data center GPUs aimed squarely at Nvidia, there was something to catch any technologist's eye at the AMD Data Center and AI Technology Premiere this week.

That included the introduction of new versions of AMD's EPYC Genoa CPUs, along with a new GPU and accelerator optimized for artificial intelligence (AI) workloads.

AMD CEO Dr. Lisa Su cited company data indicating that the 2023 market for AI acceleration could hit $30 billion, growing at a 50% compound annual growth rate (CAGR) over the next four years to a $150 billion market. She made AMD’s position very clear: “We are incredibly excited to see AI everywhere in our portfolio,” adding that the company is “really focused on making it easy for our customers and partners to deploy.”
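The growth projection can be sanity-checked with a quick compound-growth calculation. This is a minimal sketch using only the figures quoted above; the function name `project_market` is hypothetical and chosen for illustration.

```python
def project_market(start_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting market size forward at a fixed annual growth rate."""
    return start_billion * (1.0 + annual_growth) ** years


# Figures quoted in the article: $30 billion in 2023, 50% CAGR, four years.
projected = project_market(30.0, 0.50, 4)
print(f"Projected market after four years: ${projected:.1f} billion")
```

Compounding $30 billion at 50% for four years lands at roughly $152 billion, which is consistent with the $150 billion figure cited.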

AMD's earlier announcement that the Ryzen 7040 mobile processors would run AI inference under Windows raised a few eyebrows, but the full range of AI announcements now includes direct competition for the Nvidia A100 and H100 GPUs in the form of the Instinct line.

Two products discussed at the event demonstrate AMD’s commitment to the generative AI market:

The MI300A is an APU accelerator for AI and HPC workloads that comes with 128 GB of HBM3 memory and 24 Zen 4 CPU cores, and is currently sampling to customers. An APU, or accelerated processing unit, houses a CPU and GPU in a single package, allowing for efficient resource sharing between the two components.
The MI300X is a GPU designed to handle large language models (LLMs), the technology behind many of the recent generative AI advances. The MI300X comes with up to 192 GB of HBM3 memory and is available in multiple packages, from a single GPU to an 8-GPU package known as the Infinity Architecture Platform. That packaging mirrors some of the Nvidia H100 product releases and will presumably be aimed at the same markets and customers.

The CDNA 3 architecture MI300X GPU has 153 billion transistors, 896 GBps of Infinity Fabric bandwidth, and 5.2 TBps of memory bandwidth. AMD claims that this will be the fastest GPU for generative AI.
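Some rough arithmetic helps put the memory figures in perspective. The sketch below is a back-of-the-envelope estimate, not an AMD figure: it assumes 16-bit weights (2 bytes per parameter), decimal gigabytes, and ignores activations, caches, and any other runtime overhead. The function names are hypothetical.

```python
GB = 1e9  # assumption: capacities are quoted in decimal gigabytes


def max_params_billion(memory_gb: float, bytes_per_param: float = 2.0) -> float:
    """Upper bound on model size, in billions of parameters, if only the weights
    had to fit in memory (assumes 2 bytes per parameter by default)."""
    return memory_gb * GB / bytes_per_param / 1e9


def full_sweep_ms(memory_gb: float, bandwidth_tb_per_s: float) -> float:
    """Milliseconds needed to read the whole memory once at peak bandwidth."""
    return memory_gb * GB / (bandwidth_tb_per_s * 1e12) * 1e3


single_gpu = max_params_billion(192)      # one MI300X with 192 GB
platform = max_params_billion(8 * 192)    # eight-GPU platform
sweep = full_sweep_ms(192, 5.2)           # 192 GB at 5.2 TBps

print(f"Single GPU, weights only: up to ~{single_gpu:.0f}B parameters")
print(f"Eight-GPU platform, weights only: up to ~{platform:.0f}B parameters")
print(f"One full pass over 192 GB at peak bandwidth: ~{sweep:.1f} ms")
```

Under these assumptions a single 192 GB GPU could hold the weights of a model with up to about 96 billion parameters, and the eight-GPU platform about 768 billion. Real deployments need headroom beyond the weights, so practical limits would be lower.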

Because the top-of-the-line MI300X was shown as a preview rather than formally announced as a product, AMD was unable to discuss power and cooling requirements for the data center products.

AMD was bullish on its efforts to educate developers on its ROCm software stack, and feels that it has a good mix of programs aimed at bringing developers into the AMD AI world. The company is also working closely with the Hugging Face AI community on supporting its more than 15,000 models, which are also supported by other AI players.

Less exciting but important to their respective markets are the AI-enabled Alveo Media Accelerators and the embedded Versal chips for sensor and edge markets. These technologies, along with their siblings, will enable AI processing at almost all points of the process workflow, making good on AMD’s vision of an AI-enabled infrastructure.

New CPUs and SmartNICs

The festivities started with the introduction of two new versions of AMD's top-tier server CPU, the 4th-generation AMD EPYC Genoa processor. Genoa was originally introduced in November 2022; the two new versions are Bergamo and Genoa-X. The first should be of significant interest to large-scale data centers. Unlike the original Genoa, Bergamo, which is optimized for cloud-native operations, uses Zen 4c CPU cores that are 35% smaller than the original Zen 4 cores, allowing the processor to support 128 cores compared to Genoa’s 96.

Whereas the Genoa cores are optimized for maximum performance per core, Bergamo is optimized for maximum performance per watt. The result is a more efficient processor that supports a larger number of virtual machines, providing the greatest vCPU density and best energy efficiency of this generation of AMD server CPUs.
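The density gain can be illustrated with simple arithmetic. This is a minimal sketch under stated assumptions: two hardware threads per core (simultaneous multithreading) and one vCPU per hardware thread with no oversubscription. The core counts come from the article; the function name `vcpus_per_socket` is hypothetical.

```python
def vcpus_per_socket(cores: int, threads_per_core: int = 2) -> int:
    """vCPUs per socket, assuming one vCPU per hardware thread
    and two threads per core (both are assumptions for illustration)."""
    return cores * threads_per_core


genoa = vcpus_per_socket(96)      # original Genoa: 96 cores
bergamo = vcpus_per_socket(128)   # Bergamo: 128 cores
increase = bergamo / genoa - 1.0  # relative gain in vCPUs per socket

print(f"Genoa:   {genoa} vCPUs per socket")
print(f"Bergamo: {bergamo} vCPUs per socket")
print(f"Increase: {increase:.0%}")
```

Under these assumptions, Bergamo offers about a third more vCPUs per socket than the original Genoa, before accounting for any difference in per-core performance.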

The second new version of the EPYC Genoa is the Genoa-X, which comes with the latest in AMD 3D V-Cache technology, high-performance Zen 4 cores, and up to 1.3 GB of L3 cache, making it especially suited for HPC applications. Depending on the application, AMD is claiming close to three times the performance of the 60-core fourth-generation Intel Xeon 8490H CPU.

All versions of the 4th-generation EPYC, the Genoa, Genoa-X, and Bergamo, are currently available. A fourth version, code-named Siena, which is planned to be optimized for telco and edge computing, will be available sometime in the second half of this year; however, no details on the processor were made available.

Another piece of AMD's plan to accelerate the data center is the introduction of a new generation of silicon for DPUs and SmartNICs. The AMD Pensando SmartNIC is in use by hyperscalers such as Microsoft Azure, and the silicon is used in devices such as top-of-rack switches from Aruba. Like other DPU technologies, these devices allow workloads to be offloaded from CPUs, in addition to providing the expected support for huge numbers of connections.

AMD is clearly coming hard at Nvidia’s 80% share of the AI market, and we are looking forward to the actual release of its next-generation AI hardware to see how the market responds.

About the Author

David Chernicoff

David Chernicoff is an experienced technologist and editorial content creator with the ability to see the connections between technology and business while figuring out how to get the most from both and to explain the needs of business to IT and IT to business.