Key Technology Suppliers Race to Adapt AI Infrastructure for Data Centers at Global Scale
As the growth of AI technologies has accelerated, we’re seeing tremendous changes in the configuration of data center infrastructure. Organizations are scaling their capacity to accommodate the burgeoning computational demands of AI-based apps and services.
With the advent of sophisticated machine learning models, data centers have to ensure that their infrastructure can handle high-performance computing (HPC), compute-intensive AI tasks, and the storage and processing demands intrinsic to these technologies.
As such, recent announcements from Singtel, Hitachi, Intel, Oracle and many others focus on how these companies are future-proofing their data centers with the advanced resources and architectures needed to enable AI development and deployment.
September saw significant announcements from major players in the industry, both in the US and worldwide.
Skyrocketing Demand for AI Accelerators and GPUs
According to a Dell’Oro Group report published in early September 2024, demand for upgraded AI infrastructure drove spending on server and storage components, including accelerators and GPUs, to $54 billion in the second quarter of the year.
According to Baron Fung, Senior Research Director at Dell'Oro Group:
"Server and storage system component revenues surged 127 percent year-over-year in 2Q 2024, reaching a new peak. This growth was propelled by accelerators, such as GPUs and custom accelerators, HBMs, and Ethernet adapters for generative AI applications."
Fung added that memory and storage drive components for non-AI workloads also contributed to this growth, as prices rose from last year's lows.
He also noted that, in anticipation of increased server and storage demand later this year, OEMs and cloud service providers replenished their inventories.
The Role of Sustainability in AI-Ready Data Centers
Perhaps the biggest risk of expanding AI infrastructure at scale centers on sustainability. The energy demand created by AI workloads, particularly the intensive training of large-scale models, has put increased pressure on companies to reconcile performance with green IT.
For instance, as we will see below, Singtel and Hitachi have partnered to develop more energy-efficient data centers that capitalize on green energy and incorporate smart cooling technologies. Meanwhile, Intel, Oracle and Cerebras are similarly optimizing their hardware to deliver systems that are performant yet energy-efficient.
Intel’s newest AI accelerator, Gaudi 3, was designed with performance per watt as a priority, while the liquid-cooled GPUs in Oracle’s superclusters help reduce the thermal footprint of AI workloads. Such developments are part of a larger infrastructural shift in the data center industry, where efficiency and low-carbon design are primary concerns.
AI will continue to drive the trend toward more sustainable data centers – not just through more environmentally sound power sources and cooling systems, but also by ensuring that the same level of performance can be achieved with as few resources as possible.
For this reason, the emergence of new AI technologies should only make greener, cleaner and more energy-efficient infrastructure ever more important.
Oracle Cloud Infrastructure: Driving AI Supercomputing
Oracle’s ongoing development of AI infrastructure has led to the construction of the company’s zettascale cloud computing clusters. Powered by the NVIDIA Blackwell AI accelerator platform, Oracle Cloud Infrastructure (OCI) now offers one of the world’s largest AI supercomputers in the cloud.
With up to 131,072 NVIDIA Blackwell GPUs and peak performance of up to 2.4 zettaflops, Oracle’s architecture supports some of the most demanding cloud-based AI workloads, and is available via a flexible distributed cloud model that lets customers choose where to deploy their cloud and AI services.
Oracle’s OCI Superclusters provide ultra-low latency networking and HPC storage to support high-throughput AI applications. These resources enable customers to access petascale and exascale computing power, which would otherwise be accessible only to big supercomputing centers.
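For customers, tapping into that capacity starts with ordinary cloud tooling. As a minimal sketch (assuming a standard ~/.oci/config credential file, with the "GPU" name filter as our own illustrative shortcut rather than Oracle's prescribed workflow), here is how one might use the OCI Python SDK to enumerate the GPU-bearing compute shapes available in a tenancy:

```python
# Minimal sketch: list GPU-bearing compute shapes with the OCI Python SDK.
# Assumes a configured ~/.oci/config file; the "GPU" substring filter is
# illustrative, not an official way to identify AI-capable shapes.
import oci

config = oci.config.from_file()  # reads the default ~/.oci/config profile
compute = oci.core.ComputeClient(config)

# list_shapes returns the compute shapes available in a compartment
shapes = compute.list_shapes(compartment_id=config["tenancy"]).data
gpu_shapes = sorted({s.shape for s in shapes if "GPU" in s.shape})

for name in gpu_shapes:
    print(name)
```

From there, provisioning is a matter of requesting the appropriate shape and capacity through the same APIs or the OCI console.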
For instance, WideLabs is the first to harness Oracle’s AI infrastructure, using OCI to train one of the largest Portuguese language models. Zoom is leveraging OCI’s sovereignty capabilities to power its AI personal assistant, Zoom AI Companion. These use cases indicate that OCI can support scalable, secure and regionally compliant AI application deployments, which are essential for any enterprise working with sensitive data.
Intel's Xeon 6 and Gaudi 3: Powering AI Infrastructure
Intel has also entered the AI infrastructure space with its Xeon 6 processors and its Gaudi 3 AI accelerators, which are purpose-built to offer the best performance per watt and total cost of ownership for AI deployment. The high demand for scalable AI infrastructure with reduced cost, especially for hyperscale cloud environments, has prompted Intel to develop these new processors and accelerators.
The Xeon 6 processor increases core count, adds AI acceleration and doubles memory bandwidth over the previous generation of Xeon. The chip targets general-purpose workloads in edge, cloud and data center environments, whereas Intel’s Gaudi 3 AI accelerator is designed for deep learning workloads.
Gaudi 3 pairs 64 tensor processor cores with eight matrix multiplication engines for fast neural network inference. The accelerator offers dedicated features for training and inference, high memory bandwidth, and compatibility with popular frameworks such as PyTorch.
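To give a sense of what that PyTorch compatibility looks like in practice, here is a minimal sketch of running a forward pass on a Gaudi device via Intel's habana_frameworks PyTorch bridge; the toy model and tensor shapes are our own placeholders, not an Intel-published example:

```python
# Minimal sketch: a PyTorch forward pass on an Intel Gaudi accelerator.
# The model and tensor shapes are illustrative placeholders.
import torch
import habana_frameworks.torch.core as htcore  # Gaudi's PyTorch bridge

device = torch.device("hpu")  # Gaudi accelerators are exposed as "hpu"

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
).to(device)

with torch.no_grad():
    x = torch.randn(8, 1024, device=device)
    y = model(x)
    htcore.mark_step()  # flush the lazily accumulated graph to the device

print(y.shape)
```

The point of the bridge is that existing PyTorch code mostly carries over: the main change is targeting the "hpu" device rather than "cuda".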
Meanwhile, Intel’s open ecosystem strategy empowers OEMs such as Dell Technologies and Supermicro to optimize systems for AI workloads.
Intel has also grown its Tiber AI Cloud portfolio to support hyperscalers and cloud operators as they address new challenges of AI delivery, such as access and scalability, as well as cost and complexity.
These efforts include giving customers access to the Intel Tiber Developer Cloud to evaluate AI technologies, and providing them with Intel Gaudi 3 clusters for scaling to large AI workloads.
Similarly, Intel’s co-engineering efforts have helped companies take generative AI systems from prototypes to production-ready systems, enabling users to overcome issues around real-time monitoring, error handling, and scaling.
By leveraging Intel’s partnerships, enterprises can deploy production-ready retrieval-augmented generation (RAG) solutions that enable the adoption of AI applications at scale.
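To make the RAG pattern concrete, here is a deliberately minimal retrieve-then-generate sketch. The keyword-overlap retriever and build_prompt helper are our own illustrative stand-ins, not Intel's reference implementation; a production system would use vector embeddings and a hosted LLM endpoint:

```python
# Minimal RAG sketch: retrieve relevant context, then build a grounded
# prompt for a language model. Retrieval here is naive keyword overlap;
# real systems use vector embeddings and a dedicated retriever.
DOCS = [
    "Gaudi 3 pairs 64 tensor cores with eight matrix multiplication engines.",
    "OCI Superclusters scale to 131,072 NVIDIA Blackwell GPUs.",
    "Liquid cooling reduces the thermal footprint of dense AI racks.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by how many query terms they share."""
    terms = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(terms & set(d.lower().split())))
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Ground the model's answer in the retrieved context."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# The assembled prompt would then be sent to an LLM inference endpoint.
print(build_prompt("How many GPUs can an OCI Supercluster scale to?"))
```

Grounding generation in retrieved context is what lets these systems answer from enterprise data without retraining the underlying model.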
Cerebras Systems: Revolutionizing AI Inference
Cerebras Systems offers a 16-bit inference service that it claims is far more powerful and energy-efficient than inference done with GPUs.
AI inference (running trained models on new data) is one of the fastest-growing subfields of AI computation, and this work represents the kind of innovation that could be as revolutionary for AI access as broadband internet was for access to information.
The CS-3 system, Cerebras’ only product, is built around an unprecedented AI processor called the Wafer Scale Engine-3 (WSE-3). The system delivers industry-leading per-user performance and throughput for the large, complex, real-time AI applications that developers need to build but that struggle on smaller configurations.
Cerebras’ inference service is also available in three flavors – Free, Developer and Enterprise – to suit different use cases, from developers to start-ups to large enterprises.
The Cerebras proposition to start-ups and enterprises alike is that AI inference can and should be both rapid and inexpensive, something that has not been possible before. The company has released an API for its inference service, making it possible to migrate existing AI applications easily and start seeing performance benefits.
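As a sketch of what such a migration might look like: Cerebras documents an OpenAI-compatible chat-completions endpoint, so repointing an existing application is often a matter of swapping the base URL and model name. The endpoint URL, model name and key below are assumptions for illustration, not details from the announcement:

```python
# Minimal sketch: calling an OpenAI-compatible inference endpoint such as
# the one Cerebras exposes. Base URL, model name and key are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",                 # placeholder credential
)

response = client.chat.completions.create(
    model="llama3.1-8b",  # illustrative model name
    messages=[{"role": "user", "content": "Summarize wafer-scale inference."}],
)

print(response.choices[0].message.content)
```

Because the interface mirrors the de facto industry standard, migration typically touches configuration rather than application code.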
AI Infrastructure and Data Center Growth in the Asia-Pacific Region
The Asia-Pacific region is poised for growth in its data center market as a ravenous appetite for AI and cloud services drives demand for computing capacity. Japan, for example, is being eyed as a growth area for data center development, with a projected compound annual growth rate of 9.8 percent through 2028, according to the Japan Finance Center’s annual data center survey.
Singtel’s Banyan Park II data center was a response to the nation’s need for future-ready AI infrastructure, and included custom-engineered components that can withstand Japan’s earthquakes. The facility highlights the partnership between Singtel and Hitachi, and is in line with Singtel’s strategy to expand data center capacity across the region.
By combining GPU Cloud and AI orchestration platforms such as Paragon with sustainable data center infrastructure, the Asia-Pacific region is set to play an important role in an AI future. Singtel and Hitachi’s collaboration in this space addresses both the technological and environmental barriers to AI-driven data center expansion, driving digital economic growth in the region.
Strategic Partnership in AI-Ready Data Centers
Singtel and Hitachi have signed a Memorandum of Understanding (MOU) to support AI adoption in the Asia-Pacific with a specific emphasis on Japan, where the partners will co-innovate on next-generation data centers and GPU Cloud solutions.
The collaboration could also be expanded across the wider Asia-Pacific region. Under the MOU, Singtel’s Digital InfraCo business unit will work with Japanese conglomerate Hitachi to identify and define a robust framework for a digital transformation initiative with AI at its core.
The project will focus on the development of next-generation data centers with Hitachi’s green power systems, cooling technologies, storage infrastructure and other end-to-end data center solutions.
The partners will combine Singtel’s data center and connectivity expertise with Hitachi’s capabilities in integrating end-to-end data center solutions to accelerate the adoption of AI.
Built on Singtel’s Paragon, an orchestration solution that brings together 5G, edge computing and cloud, and Hitachi’s AI applications for its manufacturing and enterprise customers, the collaboration seeks to overcome some of the barriers to AI implementation, including considerations around scalability, efficiency and sustainability.
The initiative is part of a larger effort to equip data centers to handle the resource-intensive demands of AI and other cloud computing services.
Among the fastest-growing data center markets in Asia-Pacific is Japan, where Singtel’s Digital InfraCo launched a division called Nxera to build green, hyper-connected data centers that can accommodate AI applications. The new unit will roll out a GPU-as-a-Service (GPUaaS) offering later this year to facilitate enterprise adoption of AI.
Hitachi’s push for planetary stewardship has also shaped its thinking around the partnership. The energy demands of generative AI are growing fast, so fast that new forms of clean power will be required to keep the lights on.
Through its joint venture with Singtel, Hitachi hopes to marry its green energy and smart data management expertise with Singtel’s data center operations, in order to establish intelligent, energy-efficient data centers that rise to the twin challenges of AI: the need for more power, and the need for more efficient computing.
And as we talked about last month, in our coverage of the Blackstone acquisition of APAC data center leader AirTrunk, data center operators are looking to that region as a major growth target in the near future.
Meeting the Challenges of AI Deployment
Bringing AI to scale, however, means answering some fundamental questions around infrastructure readiness, performance optimization, energy efficiency, and regulatory compliance.
As is clear from the last month’s announcements alone, companies worldwide are investing in next-generation data center solutions and high-performance hardware, and co-engineering with leading AI companies to ensure that AI systems are optimized for enterprise usability. Be it partnerships between Singtel and Hitachi or NVIDIA and Oracle, or new technologies such as Gaudi or Cerebras, businesses are setting out to smooth the process of enterprise deployment.
This refinement allows for the rapid acquisition of AI-ready data center infrastructure and streamlined access to powerful computational resources, whether for internal data center operations or through cloud services that hand off the operational tasks of running HPC and AI computing to companies with that specific expertise.
In the accompanying video, Supermicro’s VP of Technology Enablement, Ray Pang, is joined by Justin Hotard, Executive Vice President and General Manager of Intel’s Data Center and AI Group, to discuss how Supermicro X14 systems and Intel Xeon 6 processors are optimized to accelerate cloud-native and scale-out workloads with maximum performance and efficiency.