Shenzhen has brought online what project materials describe as China's first 14,000P, 10,000-card AI computing cluster built around a fully domestic technology stack, marking a new stage in the country's push to reduce reliance on foreign hardware and software in large-scale model training.
Activated on March 26, 2026, the project combines an 11,000P second phase with an earlier 3,000P deployment, bringing total capacity to 14,000P. More significant than its scale is its positioning as China's first end-to-end domestic AI infrastructure stack, spanning chips, servers, networking, storage, software, and scheduling.
This matters because competition in AI infrastructure has shifted beyond individual chip performance. At the 10,000-card scale, the challenge is no longer compute alone, but whether systems can be fed, cooled, connected, and orchestrated reliably over sustained training cycles. Shenzhen's project, therefore, tests whether China can build a usable, industrial-scale AI training base — rather than a nominally large but operationally fragmented cluster.
Why China's AI cluster matters
China's AI infrastructure push has long faced a structural gap: even where domestic chips were deployed, key layers — such as interconnects, software environments, and supporting components — often remained reliant on foreign technologies. Shenzhen's cluster is positioned as a break from that dependency.
The system is built around Huawei's Ascend 910C accelerators and an "Ascend + CANN" ecosystem. It is designed as a fully self-controlled computing base for large model training and inference, supported by localized data, integrated operations, and centralized scheduling. Officials also position it within a broader national computing network strategy rather than as a stand-alone municipal project.
Early demand supports that positioning. Phase one was fully allocated, while nearly 50 companies, universities, and research institutes signed agreements for second-phase capacity. With an overall utilization rate of 92%, the constraint is shifting from demand to whether reliable compute can be delivered at scale.
Beyond scale: coordination defines performance
The cluster's significance lies less in card count than in how those resources are organized.
The system deploys roughly 14,000 Ascend 910C accelerators in a supernode architecture, rather than a conventional stack of smaller AI servers. In traditional designs, scaling card count increases inter-node communication overhead, raising latency, fragmenting utilization, and reducing training efficiency.
The architecture groups compute into dense supernodes linked by a domestic high-speed interconnect and coordinated through a unified scheduling layer. By localizing communication within nodes and reducing cross-cluster traffic, the design aims to improve efficiency at scale while addressing three core constraints: communication bottlenecks, engineering complexity, and fault management.
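The locality argument can be sketched with a toy model of a hierarchical all-reduce, in which each supernode reduces gradients over its fast internal fabric while supernode "leaders" exchange only a fraction of the data across the cluster. The bandwidths, gradient size, and group sizes below are illustrative placeholders, not disclosed specifications of the Shenzhen system.

```python
# Toy model of communication locality in a supernode design.
# Assumptions (not disclosed figures): a hierarchical all-reduce does a
# reduce-scatter plus all-gather inside each supernode over the fast
# intra-node fabric, while supernode "leaders" exchange only a 1/s shard
# of the gradient over slower cross-cluster links.

def step_comm_time(grad_gb: float, supernode: int,
                   intra_gbps: float, inter_gbps: float) -> float:
    """Rough per-step communication time in seconds."""
    intra = 2 * grad_gb / intra_gbps            # reduce-scatter + all-gather
    inter = (grad_gb / supernode) / inter_gbps  # cross-node shard exchange
    return intra + inter

# Same 10 GB gradient: small groups push most traffic over the slow
# cross-cluster links, large supernodes keep it on the local fabric.
small = step_comm_time(grad_gb=10, supernode=8,   intra_gbps=400, inter_gbps=50)
large = step_comm_time(grad_gb=10, supernode=384, intra_gbps=400, inter_gbps=50)
print(f"supernode=8:   {small:.4f} s/step")
print(f"supernode=384: {large:.4f} s/step")
```

In this simplified model, growing the supernode shrinks only the cross-cluster term, which is exactly the bottleneck the article says the design targets; real all-reduce schedules add further terms, but the direction of the effect is the same.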
This systems-level approach is reflected in reported performance metrics. The initial 3,000P deployment recorded an average daily failure rate of 0.3‰, while training linearity on the Pangu-718B model reached 93.12%. A reported PUE of 1.08 also highlights a parallel focus on energy efficiency.
How the domestic stack is put together
While some technical details read more like engineering briefings than independently verified disclosures, the broader picture is clear: Shenzhen is positioning the project as a rare full-stack domestic AI infrastructure system spanning all major layers — not just accelerators.
At the core is Huawei's Ascend 910C accelerator. Surrounding it is a domestically built stack: high-density servers designed for supernode deployment, a proprietary high-speed interconnect marketed as Ascend Fabric (often referred to as Xinghe AI Fabric in its latest iterations), distributed storage combining local NVMe and parallel systems, and a software layer built on Ascend Intelligent Computing Platform with a distributed scheduling engine for resource orchestration and fault isolation.
Crucially, the project is not built around a single supplier. It draws on a broader domestic ecosystem spanning CPUs, memory, power systems, and infrastructure vendors, reinforcing China's push to localize not just chips, but the entire AI supply chain.
China-made by layers

| Layer | Main brand/supplier | Product/system | Role in the cluster |
| --- | --- | --- | --- |
| AI accelerator | Huawei | Ascend 910C | Core training accelerator for the 10,000-card cluster |
| CPU | Phytium (FeiTeng) / Hygon | FT-3000 / Hygon 7390 | General-purpose compute and system control |
| Server/motherboard | Huawei and domestic ODMs | Ascend server boards | High-density server integration for supernode deployment |
| Network | Huawei and domestic vendors | Ascend Fabric, 400G/800G RDMA switches | High-speed interconnect for large-scale distributed training |
| Storage media | YMTC / CXMT | NVMe SSD / memory components | Local cache and storage support for model training |
| Power systems | Huawei / Hangzhou Zhongheng Electric | Integrated power modules | Power delivery for dense AI infrastructure |
| Cooling/cabinet | Shenzhen-based local suppliers | Custom liquid-cooled racks | Thermal management and high-density enclosure design |
| Software platform | Huawei | Ascend Intelligent Computing Platform | Cluster management, resource control and monitoring |
| Scheduling layer | Proprietary domestic software | Distributed scheduling engine | Job scheduling, topology-aware allocation and fault handling |
Source: Nstipsp.com, compiled by DIGITIMES, April 2026
Performance meets economics
Large AI clusters often fail to convert scale into usable compute. As systems grow, communication overhead increases, faults accumulate, and power costs escalate. Shenzhen's project is designed to address these constraints directly.
Project metrics focus on three areas: linearity, reliability, and energy efficiency. The reported training linearity of 93.12% for the Pangu-718B model suggests performance scales with cluster size, while a 0.3‰ daily failure rate addresses concerns over long-duration training stability. A reported PUE of 1.08 points to aggressive optimization in cooling and power systems, including full liquid cooling and integrated energy management.
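Taken at face value, the headline figures translate into concrete operational numbers. The sketch below does the back-of-envelope arithmetic, with one stated assumption: it reads the 0.3‰ figure as a per-card daily failure probability, which the article does not specify.

```python
# Back-of-envelope arithmetic on the reported metrics (figures from the
# article; the per-card reading of the failure rate is an assumption).

CARDS = 14_000
LINEARITY = 0.9312           # Pangu-718B training linearity
DAILY_FAILURE_RATE = 0.3e-3  # 0.3 per mille, read as per card per day
PUE = 1.08

effective_cards = CARDS * LINEARITY                # compute that actually scales
expected_failures_per_day = CARDS * DAILY_FAILURE_RATE
overhead_share = 1 - 1 / PUE                       # facility power not reaching IT load

print(f"effective cards at 93.12% linearity: {effective_cards:.0f}")
print(f"expected card failures per day:      {expected_failures_per_day:.1f}")
print(f"non-IT power share at PUE 1.08:      {overhead_share:.1%}")
```

Under these assumptions, roughly 13,000 of the 14,000 cards contribute scalable compute, operators would see on the order of four card failures a day to isolate and recover from, and only about 7% of facility power goes to cooling and other overhead, which is what makes the sustained-training pitch credible if the numbers hold up.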
These metrics underpin the project's commercial positioning. Beyond sovereignty, Shenzhen is pitching the cluster as cost-efficient, reliable AI infrastructure capable of attracting model developers, robotics firms, research institutions, and enterprise users.
A test for China's AI infrastructure
The implications extend beyond Shenzhen. China's AI race is no longer defined by chip design alone, but by whether it can build a complete training environment — spanning compute, networking, storage, orchestration, and operations — that developers will adopt at scale.
If validated in real-world deployment, the cluster could serve as a reference point. It would signal a shift from component substitution to systems-level engineering across China's AI stack, with implications for large models, AI for Science, robotics, and autonomous driving — all of which depend on sustained, high-availability compute.
The next phase will test execution. Operators plan to expand capacity, integrate resources into a unified platform, and support both training and inference workloads. If Shenzhen can sustain utilization, improve software compatibility, and scale without compromising reliability, the project could signal a broader shift: turning AI compute from a constraint into a controllable industrial capability.
Article edited by Jerry Chen