Exclusive: AMD, UALink, UEC leaders on AI infrastructure ahead of OCP Summit

Levi Li, DIGITIMES Asia, Taipei

Credit: AFP

As generative AI pushes data center infrastructure to new limits, pressure is mounting on thermal management, interconnect design, and ecosystem interoperability. Ahead of the 2025 OCP APAC Summit, executives from AMD, the Ultra Ethernet Consortium (UEC), and the UALink Consortium spoke with DIGITIMES Asia about the engineering realities and strategic trade-offs shaping the future of AI infrastructure.

Thermal design meets rack-scale complexity

AMD Corporate VP Robert Hormuth addressed the mounting physical and thermal constraints facing dense AI clusters. "AMD is acutely aware of the growing thermal and physical design challenges," he said, pointing to a holistic, data center–wide strategy that integrates liquid cooling, airflow refinement, and rack-scale optimization.

AMD is validating liquid cooling solutions for its CPUs, GPUs, and high-speed interconnects, particularly within the Instinct product line, while partnering with OEMs and hyperscalers to ensure integration with direct-to-chip systems. The company is also optimizing chassis and rack airflow to preserve performance at high thermal loads.

"AMD treats cooling and physical design as full data center challenges, not just chip- or board-level problems," Hormuth said, adding that the company offers reference architectures and deployment blueprints to guide partners navigating high-density AI thermal constraints.

AMD CVP, Robert Hormuth. Credit: Micron YouTube

UALink vs. NVLink: competing visions for AI scale-up

Kurtis Bowman, Chair of the UALink Consortium, positioned UALink as a fundamental rethinking of AI interconnect standards. "UALink was designed from the ground up as a truly open standard," he said, with broad industry participation and vendor-agnostic architecture at its core.

According to Bowman, Nvidia's NVLink Fusion acknowledges the market shift toward faster interconnects, but "remains a proprietary solution with limited openness in practice." He noted its closed governance and continued tie-ins to Nvidia's hardware stack as limiting factors for system designers.

By enabling mix-and-match interoperability, UALink supports flexible AI system design and avoids vendor lock-in. "This openness fosters faster innovation, greater flexibility in system design, and typically results in more competitive pricing," Bowman said. For organizations prioritizing multi-vendor choice and ecosystem sustainability, UALink offers long-term strategic benefits.

UALink Consortium Chair, Kurtis Bowman. Credit: Dutch IT Channel

Scale-up meets scale-out: UALink and UEC converge

Dr. J Metz, Chair of the UEC, described Ultra Ethernet as the "high-capacity highway system" for scaling out across distributed AI clusters. While built on Ethernet, the standard incorporates enhancements in congestion control and packet routing — key for distributed training and model serving at hyperscale.

UALink handles scale-up within individual AI pods, creating low-latency, memory-semantic links across accelerators. "UALink focuses on scale-up within an AI computing 'pod,'" Bowman said, enabling pods to function like a single massive GPU for demanding LLM training and inference workloads.

Combined, UALink and UEC form "a comprehensive, flexible and open networking foundation" for next-generation AI infrastructure, according to Bowman.

UEC Chair, J Metz. Credit: J Metz LinkedIn

Scaling challenges remain, but openness leads the way

While UALink 1.0 supports up to 1,024 accelerators per pod, Bowman clarified that system design falls outside the specification's scope, even as it remains "a critical element in the success of the ecosystem," underscoring the integration complexity that awaits system builders.

In a fragmented market shaped by geopolitical risk and vendor consolidation, governance models — like those behind UALink — may prove just as consequential as raw throughput. With vendor-neutral frameworks gaining traction, the future of AI infrastructure may hinge as much on openness as on performance.

Meet the contributors

J Metz

Chair of the Ultra Ethernet Consortium (UEC) Steering Committee and Technical Director of Systems Design at AMD, J Metz has held engineering roles at Apple, QLogic, Cisco, and Rockport Networks. A veteran of standards development, he has served on the boards of NVM Express, the Fibre Channel Industry Association, and the Storage Networking Industry Association.

Robert Hormuth

As Corporate Vice President of Architecture and Strategy at AMD's Data Center Solutions Group, Robert Hormuth leads system-level planning across AMD's server, GPU, and embedded product lines. He focuses on aligning emerging workloads with long-term architectural strategy through close collaboration with customers and engineering teams.

Kurtis Bowman

Kurtis Bowman is Director of Architecture and Strategy at AMD and Chair of the UALink Consortium. With over 30 years in system architecture, he has held leadership roles at Compaq, Dell, and Panasas, and co-founded major initiatives like CXL, Gen-Z, and UALink. He currently co-chairs the CXL Marketing Workgroup and drives industry-wide collaboration on open computing standards.

Article edited by Jerry Chen