China Eyes Breakthroughs in AI Chip Power as Global Arms Race Accelerates

By xinyue, Jul 17, 2025, 12:01 a.m. ET

For China, the next chapter in AI competitiveness won’t be won purely through scale, but through a strategic shift in chip architecture, cross-disciplinary co-optimization, and ecosystem-wide innovation.

AsianFin -- As a new wave of artificial intelligence sweeps across industries, the global race for computing power is entering a decisive phase—one that could redefine technology leadership for decades.

With Nvidia’s market capitalization surpassing $4 trillion in early July—eclipsing Apple and Microsoft—and CEO Jensen Huang overtaking Warren Buffett in personal wealth, the AI boom is reshaping capital markets and geopolitical tech priorities.

While Nvidia’s success epitomizes the dominance of GPU-based AI compute architecture, China is scrambling to close a widening gap. Amid intensifying restrictions on chip supply and advanced manufacturing, China’s AI sector is battling both supply-side constraints and a demand surge driven by large model developers like DeepSeek.

According to industry forecasts, China’s AI chip market is projected to exceed $180 billion by 2030, while the broader AI-related economy could top $1.4 trillion. But with domestic production still trailing and foreign dependency high, Beijing is aggressively pursuing indigenous innovation—not only in silicon but also in architecture and design methodologies.

At the China Integrated Circuit Design Innovation Conference (ICDIA) in early July, Tsinghua University professor and Qingwei Intelligence co-founder Yin Shouyi presented a stark assessment: chip innovation must move beyond Moore’s Law. As transistor miniaturization nears physical limits, the next frontier lies in System-Technology Co-Optimization (STCO)—a methodology focused on optimizing performance, power, area, and cost (PPAC) across the entire design-manufacture stack.

STCO integrates chip architecture, manufacturing processes, and packaging into a deeply collaborative design cycle. This strategy could help Chinese chipmakers bypass bottlenecks imposed by global foundry restrictions and build AI-specific chips tailored for massive-scale, spatiotemporal workloads.

“China needs to move fast and think holistically,” said Yin, outlining four key focus areas: architecture exploration, component design, rapid simulation, and process co-optimization. “It’s not just about speed—it’s about evolving the entire chip design ecosystem to meet the demands of next-gen AI.”

AI compute demand in China is expected to grow rapidly. According to Frost & Sullivan, AI acceleration chip revenue will rise from $19.6 billion in 2024 to $183.4 billion by 2029, representing a compound annual growth rate (CAGR) of 53.7%. Over the same period, China’s total compute capacity is projected to increase more than fivefold, from 617 EFLOPS to over 3,440 EFLOPS.
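
For readers who want to check the growth arithmetic, here is a minimal sketch that recomputes the implied annual growth rate and the compute multiple from the endpoints quoted above. It assumes five compounding years (2024 to 2029); the function name `cagr` is illustrative only and does not come from the forecast. On those assumptions the revenue endpoints imply roughly 56% a year, a few points above the quoted 53.7%. That gap may reflect a different base year or a currency conversion in the underlying forecast, which is a guess rather than something the source states. The compute figures work out to about a 5.6-fold increase.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, returned as a fraction (0.5 == 50%)."""
    return (end / start) ** (1 / years) - 1


# Endpoints as quoted in the article (USD billions and EFLOPS).
# Assumption: growth compounds over the five years from 2024 to 2029.
revenue_cagr = cagr(19.6, 183.4, 2029 - 2024)
compute_multiple = 3440 / 617

print(f"Implied revenue CAGR: {revenue_cagr:.1%}")
print(f"Compute capacity multiple: {compute_multiple:.2f}x")
```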

However, the country remains heavily reliant on GPUs, primarily from Nvidia, which currently account for nearly 70% of all AI chips in use. Alternatives are emerging, though.

One such alternative is the Reconfigurable Processing Unit (RPU), a chip architecture based on distributed dataflow computing. Unlike traditional, instruction-driven GPUs, RPUs dynamically allocate compute resources, enabling higher throughput and energy efficiency tailored for AI inference and training.

Companies like SambaNova and Groq are leading the global charge. Groq claims its chips offer 10× the inference speed of Nvidia’s H100 at one-tenth the cost, while Tesla has adopted a similar distributed architecture in its Dojo supercomputer. Google’s newly launched TPU v7 “Ironwood” further intensifies the competition: Google says it delivers a 3,600× performance gain over its first publicly available Cloud TPU.

At the center of China’s RPU development is Qingwei Intelligence, a spinout from Tsinghua’s Reconfigurable Computing Lab. The firm has launched the TX8 series and the latest TX81 RPU module, capable of delivering 512 TFLOPS (FP16). Its REX1032 server, designed for trillion-parameter models, reaches 4 PFLOPS per node and supports direct multi-card interconnects, eliminating costly switch hardware.

Qingwei’s chips are now deployed at intelligent computing centers across several Chinese provinces, serving leading domestic models like DeepSeek R1 and V3. The goal: build a self-reliant infrastructure that rivals U.S. giants without leaning on foreign supply chains.

Whether it’s GPUs, RPUs, or TPUs, one reality is clear: AI’s future depends on scalable, efficient compute infrastructure. As Nvidia CEO Jensen Huang has stated, AI is becoming as foundational as electricity or the internet.

With next-generation workloads—like Agentic AI and Physical AI—demanding unprecedented levels of compute, data centers are quickly evolving into AI factories, the core computational units of the digital future.

For China, catching up in the AI race will require more than scale—it will demand strategic shifts in chip architecture, cross-disciplinary innovation, and full-stack ecosystem development.
