Data center firms keen to diversify AI GPU sources

Monica Chen, Hsinchu; Eifeh Strom, DIGITIMES Asia


Data center manufacturers are keen on diversifying their AI GPU suppliers. Advanced Micro Devices (AMD) intends to reap the most benefits with its Instinct MI300 series, offering an alternative to Nvidia's H100 GPU, according to industry sources.

AMD is set to hold the AMD Data Center and AI Technology Premiere on June 13, at which CEO Lisa Su will detail the company's AI platform deployment and outlook.

The most anticipated product is AMD's Instinct MI300 data center accelerated processing unit (APU), which can take on Nvidia's H100 GPU. Industry players expect Su will announce a fourth-quarter mass production timetable.

The MI300 uses TSMC's 5nm process and is designed for energy-efficient AI training and high-performance computing (HPC) workloads. It uses a 3D chiplet design, combining CDNA 3 GPU, Zen 4 CPU, and HBM chiplets.

With the MI300 positioned as a competitor to the H100, AMD is looking to keep Nvidia from dominating the AI GPU market in the same way Intel dominated the server platform market.

With the help of TSMC, AMD has significantly improved the performance, yield, and time-to-market of its server platform since 2019. Amid Intel's manufacturing delays and shortages, supply chains have turned to AMD. Market acceptance of AMD's server platform continues to grow, with its market share going from zero to nearly 20%.

Intel has announced it is working on its AI offerings. However, supply chain players believe it will be difficult for Intel to challenge AMD and Nvidia.

Nvidia's A100 and H100 GPUs have become the focus of the global AI GPU market, but industry observers say the company will face challenges on its road to dominance.

According to supply chain players, Nvidia leads the AI GPU field in terms of software, hardware, ecosystem strength, price, specifications, and supply. However, upstream and downstream suppliers are wary of seeing one company dominate the market, which would leave little room for price or supply negotiations.

Many major companies are already working with AMD, which pointed out that its EPYC server processors and Instinct MI300 APUs are already being used in supercomputers. AMD also offers competitive pricing.

Sources noted that all of Nvidia's A100 and H100 GPUs and AMD's MI300 APUs are produced by TSMC. The supply of Nvidia's A100 and H100 GPUs will remain short through the end of this year, mainly due to insufficient CoWoS capacity. TSMC is accelerating its capacity expansion.

TSMC estimates its capacity will double in 2024, and the foundry stands to be the biggest beneficiary of the competition between Nvidia and AMD. The supply chain is also expected to usher in another wave of growth.

To comply with US restrictions on chip supply to China, Nvidia and AMD offer downgraded versions of their AI GPUs for Chinese customers, such as Baidu, Tencent, and Alibaba.