AI chips have emerged as a decisive factor in the escalating technological rivalry between China and the United States.
Over the past two weeks alone, Nvidia—now valued at nearly $4.5 trillion—announced plans to invest up to $100 billion in OpenAI over the next decade. As part of the partnership, OpenAI will purchase and deploy between 4 million and 5 million Nvidia GPUs. Meanwhile, on October 7, AMD revealed a four-year, multibillion-dollar chip supply agreement with OpenAI, under which OpenAI received warrants that could convert into a roughly 10% equity stake in AMD. Oracle has also entered into a strategic compute partnership with OpenAI reportedly worth some $300 billion.
Following the AMD-OpenAI announcement, AMD shares surged—recording their biggest single-day gain in nearly ten years. The deal positions AMD, long regarded as the “runner-up” in the data-center AI chip market, to mount its first direct challenge to Nvidia’s dominance, while enabling OpenAI to establish what some analysts have called a trillion-dollar “circular transaction.”
In early October, DeepSeek launched its latest AI model, DeepSeek-V3.2-Exp. Shortly afterward, domestic chipmakers including Cambricon and Huawei Ascend announced full compatibility with the model. Huawei also unveiled a mass-production roadmap for its Ascend series: the Ascend 950PR, featuring Huawei’s self-developed HBM, is scheduled for release in the first quarter of 2026, and the Ascend 970 is planned for the fourth quarter of 2028.
Cambricon has seen an impressive surge in market value, with its stock rising 124% between July and September. Its market capitalization now stands at 521 billion yuan, at one point surpassing Japan’s largest chip equipment maker, Tokyo Electron, making Cambricon one of the highest-valued semiconductor design firms in China’s A-share market.
Nvidia CEO Jensen Huang recently commented that China is “only a few nanoseconds behind the US” in chip technology and highlighted the country’s strong potential in research, development, and manufacturing. He also urged the US government to allow American tech companies to compete in markets such as China to “enhance America’s influence.”
U.S. export restrictions, far from halting China’s progress, have inadvertently accelerated domestic AI chip innovation. The H20 chip, for example, received only a lukewarm response in China, even though Nvidia products accounted for over a third of the country’s AI chip sales in 2024. Huang and his team remain deeply engaged in the ongoing US-China AI chip competition.
The U.S. continues to tighten export controls on AI chips bound for China. The proposed “GAIN AI” Act would require Nvidia to prioritize supplying advanced AI chips to American companies before exporting them abroad, potentially costing the company its share of China’s roughly $50 billion AI computing market.
Meanwhile, competition is heating up globally. Companies like AMD, Google, Microsoft, and Broadcom are racing to develop more cost-effective AI computing chips. At the same time, domestic players such as Huawei, Cambricon, and Moore Threads are increasingly securing deployment orders. Major Chinese internet companies—Alibaba, Tencent, Baidu, and ByteDance—are also stepping up investment in chip R&D and design, aiming to strengthen self-sufficiency and control over the AI supply chain.
According to Epoch AI, OpenAI spent $7 billion on computing power over the past year, with $5 billion allocated to AI model training. Morgan Stanley projects that global investment in AI infrastructure could reach $3 trillion (around 21 trillion yuan) over the next three years. Deloitte predicts that global semiconductor sales will reach a record $697 billion in 2025, potentially surpassing $1 trillion by 2030 as AI, 5G, and other technologies continue to expand.
Morningstar analyst Brian Colello cautioned, “If an AI bubble forms and eventually bursts, Nvidia’s large investment in OpenAI could be an early warning sign.” When asked about Chinese chipmakers’ progress, an Nvidia spokesperson simply stated, “There is no doubt that competition has arrived.”
Wang Bo, CEO of Tsing Micro Intelligence—a Tsinghua University-affiliated AI chip company—pointed out that reconfigurable AI chips provide a development path for China that doesn’t rely on Nvidia GPUs. He emphasized that to gain market share, domestic AI chip products need to offer at least five times the cost-performance ratio of competitors. “In this industry, dominant players like Nvidia or Intel set the standard,” he said. “Following their path blindly will only lead to being crushed.”
The DeepSeek Boom Accelerates Domestic AI Chip Advancement
Since October 2022, the United States has implemented multiple rounds of export controls targeting China’s semiconductor industry, aiming to prevent the country from manufacturing advanced AI chips or using American chips to train state-of-the-art AI models.
In December 2024, during the final year of the Biden administration, the U.S. tightened restrictions further, limiting exports of HBM (high-bandwidth memory) essential for advanced AI chips and lowering the threshold for computing power density—directly challenging China’s ability to develop large-scale AI models. In response, Chinese internet and cloud companies that previously relied heavily on Nvidia chips began exploring the deployment of domestic AI chips.
The DeepSeek boom in 2025 has further accelerated this shift. When DeepSeek released version V3.1 in August, the announcement noted that “UE8M0 FP8 is designed for the next generation of soon-to-be-released domestic chips,” drawing widespread market attention and even contributing to a drop in Nvidia’s stock price.
DeepSeek’s training costs are currently far lower than those of leading U.S. AI models. A paper published in Nature on September 18, with Liang Wenfeng as the corresponding author, revealed that training the DeepSeek-R1 model cost just $294,000. Even factoring in the roughly $6 million foundational model cost, this remains far below the expenditures required for comparable models at OpenAI or Google.
In July, Nvidia CEO Jensen Huang described DeepSeek-R1 as a revolutionary, open-source inference model. He emphasized its flexibility, noting that Chinese AI models can be efficiently adapted to diverse applications, allowing companies to build products or even entire businesses on these platforms.
Four years ago, Nvidia held 95% of the AI chip market in China; today, its share has dropped to just 50%. Huang warned, “If we don’t actively compete in China and instead allow domestic platforms and ecosystems to flourish independently, Chinese AI technology and leadership will inevitably spread globally as the technology itself proliferates.”
As of late 2024, Nvidia controlled over 90% of the global AI accelerator chip market. Its data center revenue surged to $41.1 billion in the second quarter of fiscal 2026 (ended July 2025), a 56% year-on-year increase that solidified the segment’s position as the company’s largest revenue driver.
In August 2025, Huang highlighted strong demand for Nvidia’s Blackwell Ultra chips, noting their central role in the intensifying AI race. He said annual capital expenditure on data center infrastructure is projected to hit $600 billion, with Nvidia aiming to capture a $3–4 trillion opportunity in AI infrastructure through its Blackwell and Rubin architectures. He added that as AI continues to develop, it will become a key driver of global GDP growth.
However, China's AI chip sector is beginning to challenge Nvidia's dominance. As U.S. restrictions on chip exports to China intensify, domestic companies like Alibaba, Tencent, and ByteDance’s Volcano Engine are stockpiling Nvidia GPUs while also exploring homegrown alternatives.
Nvidia has not escaped unscathed, however: following U.S. export restrictions, the company took a $4.5 billion charge for unsold H20 inventory in the first quarter of fiscal 2026, and H20 sales fell by a further $4 billion in the second quarter, reflecting the growing shift toward domestic alternatives.
Meanwhile, China’s AI chip market is grappling with a supply shortage, but domestic chipmakers are rising to the challenge. Companies like Alibaba, Cambricon, Iluvatar CoreX, Moore Threads, and Biren Technology are positioning themselves as core players in the AI chip race.
A recent CCTV report highlighted the China Unicom Sanjiangyuan Green Power Intelligent Computing Center project, where Alibaba’s chip unit Pingtouge (T-Head) introduced a new PPU chip designed for AI data centers. The chip reportedly outperforms Nvidia’s A800 and matches the H20 on key specifications while consuming less energy.
Cambricon has become a key player, with ByteDance as its largest client, and has pre-sold over 200,000 chips. Alibaba and Baidu have also begun mass production of their self-developed chips, while Tencent is gradually deploying previously stockpiled chips and purchasing products from Enflame.
Huawei, too, has joined the fray. In September 2025, the company unveiled its most powerful AI chip to date, with plans to release the Ascend 950PR in Q1 2026, Ascend 950DT in Q4 2026, Ascend 960 in Q4 2027, and Ascend 970 by Q4 2028—directly challenging Nvidia’s dominance.
According to estimates from semiconductor industry insiders, Huawei’s Ascend AI chip shipments reached 300,000–400,000 units in 2024 and are expected to approach 1 million units in 2025. Cambricon’s shipments, meanwhile, could reach 80,000 units this year and may double again in 2026.
Despite these advances, Huawei faces challenges due to U.S. sanctions. As Vice Chairman Xu Zhijun noted, the company cannot rely on TSMC for manufacturing, resulting in a performance gap between its chips and Nvidia’s. However, Huawei has made significant strides in supernode interconnection technology, allowing it to deliver some of the world’s most powerful computing performance.
Huawei founder Ren Zhengfei also commented that while Ascend chips are “one generation behind” their American counterparts, technologies like stacking and clustering allow them to deliver cutting-edge performance.
Zhang Jianzhong, CEO of Moore Threads, pointed out that the current challenges in GPU chip manufacturing stem from three key areas: international bans on high-end chips, restrictions on HBM memory, and limitations on advanced process nodes. Given the explosive demand for generative AI and AI agents, Zhang estimated that over the next five years, demand for AI computing power will increase by a factor of 100. Yet, China is still facing a shortfall of 3 million GPU cards in production capacity. In the short to medium term, the domestic market is likely to experience a shortage of intelligent computing resources.
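For scale, Zhang’s projection of a 100-fold increase over five years implies annual growth of roughly 2.5 times, as a quick calculation shows (assuming steady year-over-year compounding, an illustrative simplification):

```python
# Annualizing Zhang's projection: 100x growth in AI compute demand over 5 years.
# Assumes steady year-over-year compounding (an illustrative simplification).
factor, years = 100, 5
annual = factor ** (1 / years)
print(f"Implied growth: {annual:.2f}x per year (~{(annual - 1) * 100:.0f}% annually)")
# Implied growth: 2.51x per year (~151% annually)
```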
On September 26, Moore Threads became the fastest AI chip company to be approved for listing on China’s STAR Market. The company plans to raise 8 billion RMB in its IPO, which will be invested in the R&D of next-gen AI chips, graphics chips, and AI SoCs.
In the first half of 2025, Moore Threads’ revenue reached 702 million RMB—surpassing the total revenue of the previous three years combined—with a compound annual growth rate of over 208%. The company’s gross margin improved significantly, climbing from -70.08% in 2022 to 70.71% in 2024. As of June 2025, Moore Threads had secured over 2 billion RMB in client orders, with expectations of turning profitable as early as 2027.
With AI adoption accelerating, the demand for computing power is growing exponentially. By June 2025, the average daily token consumption for AI tasks in China had reached 30 trillion, up 300-fold from just a year and a half ago.
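A back-of-the-envelope check on that figure (assuming smooth month-over-month compounding over the stated 18 months, purely for illustration):

```python
# Rough check: a 300-fold rise in daily token consumption over ~18 months.
# Assumes smooth month-over-month compounding (an illustrative simplification).
growth, months = 300, 18
monthly = growth ** (1 / months)
print(f"Implied growth: {monthly:.2f}x per month (~{(monthly - 1) * 100:.0f}% monthly)")
# Implied growth: 1.37x per month (~37% monthly)
```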
According to Jensen Huang, AI is a fast-evolving, high-tech industry, and the U.S. risks losing its global AI advantage if it doesn’t engage in free trade with China.
The latest data from IDC shows that China’s AI-accelerated server market reached $16 billion in the first half of 2025, more than double the figure for the same period in 2024. By 2029, the market is expected to exceed $140 billion. Demand for non-GPU cards such as NPUs and CPUs is growing rapidly and now accounts for 30% of the market, while domestic AI chips continue to increase their penetration, reaching approximately 35% of the total market.
The Computing Bottleneck: The Urgent Need for New Architectures, Storage, and Communication in Server Chips
For data center servers, the three most essential elements are computing power (chip performance), communication (supernodes, NVLink), and storage (HBM, DDR, etc.).
As the semiconductor industry enters the post-Moore’s Law era, improving AI chip performance is no longer as simple as shrinking transistors. Advancing AI chips now means improving PPA—performance, power consumption, and area—through integrated solutions that combine architectural design, process choices, and software optimization. The goal is hardware-software synergy that maximizes computing and data-transfer efficiency, balancing all three dimensions of PPA while meeting the rising demands of AI workloads.
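To make the PPA trade-off concrete, here is a minimal sketch; the figure of merit (throughput per watt per mm²) and both design points are hypothetical numbers chosen only to illustrate how an architectural redesign can beat a process shrink on overall efficiency:

```python
# Illustrative PPA comparison. The design points below are hypothetical,
# not figures for any real chip; the point is the trade-off, not the values.
designs = {
    "process shrink only": {"tops": 400, "watts": 450, "mm2": 800},
    "new architecture":    {"tops": 380, "watts": 300, "mm2": 650},
}

for name, d in designs.items():
    # Figure of merit: throughput per watt per unit area (TOPS / W / mm^2).
    fom = d["tops"] / (d["watts"] * d["mm2"])
    print(f"{name:20s} {d['tops']} TOPS  {d['watts']} W  {d['mm2']} mm^2  FoM={fom:.5f}")
# The redesign wins on FoM (0.00195 vs 0.00111) despite lower raw TOPS.
```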
However, with Moore’s Law slowing down, moving to more advanced nodes like 7nm, 4nm, or 3nm has not yielded the anticipated leaps in AI chip performance. In fact, the costs of such chips have skyrocketed.
Handel Jones, CEO of International Business Strategies (IBS), noted that the cost to design a 28nm chip is $40 million, but for a 7nm chip, the price jumps to $217 million, and it climbs to $590 million for a 3nm chip. Public sources indicate that the total cost for designing and developing a 3nm chip may approach $1 billion, mainly due to wafer foundry costs, R&D investments, equipment procurement (such as EUV lithography machines), and yield challenges.
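Taking the IBS figures at face value, the escalation is steep (a quick calculation on the quoted numbers):

```python
# Design-cost escalation across process nodes, using the IBS figures above.
costs_musd = {"28nm": 40, "7nm": 217, "3nm": 590}
base = costs_musd["28nm"]
for node, cost in costs_musd.items():
    print(f"{node}: ${cost}M design cost ({cost / base:.1f}x the 28nm figure)")
# 28nm: 1.0x, 7nm: 5.4x, 3nm: 14.8x the 28nm design cost
```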
For instance, Qualcomm’s latest Snapdragon 8s, built on a 4nm process, offers only a 31% improvement in general CPU performance over the previous generation. Similarly, Intel’s Core Ultra 7 165H, built with chiplet technology, delivers just an 8% increase in performance per watt over its predecessor, while TSMC’s N2 process offers only a 10%–15% performance gain over the prior node.
Clearly, cutting-edge processes alone are unlikely to deliver the major performance gains or cost-effectiveness that AI chips demand. At the 2025 GTC conference, Huang focused on the surging demand for large-model tokens, arguing that the real competition now lies at the system level, in platforms like the Blackwell B200, rather than in raw chip performance.
A semiconductor industry insider likewise suggested that China should not chase ultra-advanced nodes beyond 12nm, where further improvements yield diminishing returns. With domestic manufacturing capacity still constrained, solving the computing power problem through other means has become the critical issue.
Thus, the bottleneck in computing has arrived. Server AI chips urgently need new architectures, storage solutions, and communication network approaches to enhance their capabilities.
Professor Wei Shaojun of Tsinghua University, who chairs the Integrated Circuit Design Branch of the China Semiconductor Industry Association, has argued that with external restrictions limiting China’s access to advanced process nodes, the domestic chip industry must focus on design innovations that do not depend on them, including new chip architectures and microsystem integration. He warned that sticking with existing architectures would leave China a perpetual follower in the global AI race, and that to reduce reliance on Nvidia’s technology, Asian countries including China should avoid imitating U.S. chip architectures and instead explore their own paths for algorithms and infrastructure.
Yin Shouyi, Dean of Tsinghua University’s School of Integrated Circuits, added that breakthroughs in computing architecture are critical to enhancing the performance of AI chips. Reconfigurable computing architectures—where hardware and software collaborate dynamically—could lead to performance approaching that of application-specific integrated circuits (ASICs) while also addressing challenges like the “memory wall” that limits domestic chip performance.
“Innovative architectures can break through traditional design barriers and address computing power challenges,” Yin noted. “But to succeed, we need strong ecosystem support. FlagOS, from the Beijing Academy of Artificial Intelligence, serves as the backbone for China’s architecture innovation, facilitating collaboration between software and hardware to overcome the bottlenecks facing China’s computing power.”
In addition to new architectures, improving storage and communication networks is crucial.
For storage, the demand for memory chips such as HBM and DDR is growing exponentially. A single GPU node can require hundreds of gigabytes, or even terabytes, of memory. Data from Micron shows that AI servers require eight times as much DRAM as standard servers and three times as much NAND flash; some AI servers may need up to 2TB of memory, far exceeding the requirements of traditional servers.
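Applying Micron’s multipliers to a typical configuration shows how quickly the numbers add up; the standard-server baseline here is an assumption for illustration, not a Micron figure:

```python
# Sizing an AI server's memory from Micron's quoted multipliers.
# The baselines below are assumed values for a conventional server.
baseline_dram_gb = 256   # assumed DRAM in a standard 2-socket server
baseline_nand_tb = 8     # assumed NAND flash in a standard server

ai_dram_gb = baseline_dram_gb * 8   # "eight times as much DRAM"
ai_nand_tb = baseline_nand_tb * 3   # "three times as much NAND flash"
print(f"AI server: ~{ai_dram_gb // 1024} TB DRAM, ~{ai_nand_tb} TB NAND")
# AI server: ~2 TB DRAM, ~24 TB NAND -- consistent with the 2TB figure above
```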
This surge in demand has driven up the proportion of storage chip costs in AI infrastructure. Recently, OpenAI’s “Stargate” project struck a deal with Samsung and SK Hynix to purchase 900,000 DRAM wafers per month, representing nearly 40% of global DRAM production.
Currently, a single HBM chip costs over $5,000, roughly 20 times the price of DDR5 memory, and carries margins of 50%–60%, far surpassing traditional DRAM’s typical margin of around 30%.
As the AI boom continues, the shift from CPU-centric to GPU/NPU-centric architectures is increasing demand for storage chips. High-bandwidth memory (HBM) has become widely adopted, with its share of the DRAM market nearing 30%. The anticipated launch of HBM4 in 2026 will further drive demand for specialized memory solutions.
In terms of communication, Nvidia’s NVLink, InfiniBand, and Ethernet technologies are key to connecting GPUs within servers and across racks. Huawei, however, has introduced its Ascend CloudMatrix 384 supernode, which links 12 compute cabinets and four bus cabinets through high-speed interconnects. Huawei says the system delivers 1.7 times the performance of Nvidia’s NVL72, with 269 TB/s of network bandwidth (107% more than the NVL72) and 1,229 TB/s of memory bandwidth (113% higher).
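Working backwards from those percentages yields the implied NVL72 baselines (a simple inversion of the quoted ratios; the derived figures are estimates, not Nvidia specifications):

```python
# Back out the implied NVL72 baselines from the CloudMatrix 384 figures above.
# "107% more" means 2.07x and "113% higher" means 2.13x; results are estimates.
cm384_network_tbs = 269
cm384_memory_tbs = 1229

print(f"Implied NVL72 network bandwidth: ~{cm384_network_tbs / 2.07:.0f} TB/s")  # ~130
print(f"Implied NVL72 memory bandwidth:  ~{cm384_memory_tbs / 2.13:.0f} TB/s")   # ~577
```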
Looking ahead, Huawei plans to expand this setup into the Atlas 900 SuperCluster, a supernode cluster featuring tens of thousands of cards, supporting even larger-scale AI models.
Other AI chip companies are also exploring innovative communication technologies, including Co-Packaged Optics (CPO), chiplets, optical communication networks, and DPUs, aiming to accelerate AI performance through faster interconnects.
Lin Yonghua, Deputy Director and Chief Engineer of the Beijing Academy of Artificial Intelligence (BAAI), emphasized the need for sustained investment in innovation to drive lower energy consumption, better cost-performance, and new computing architectures as the route to unlocking the commercial value of AI hardware. BAAI recently launched the “Zhongzhi FlagOS v1.5” system in collaboration with global ecosystem partners, with companies such as Tsing Micro (Qingwei Intelligence), Cambricon, Moore Threads, Kunlunxin, Huawei Ascend, and Hygon forming the core domestic FlagOS Excellence Adaptation Units.
Despite these advancements, China’s AI chip ecosystem remains incomplete, with serious production capacity shortages. According to the General Administration of Customs, China imported 549.2 billion integrated circuits in 2024, a 14.6% increase over the previous year, with a total value of $385 billion, equivalent to 62% of global chip output and exceeding the country’s spending on crude oil imports.
Du Yunlong, an AI infrastructure analyst at IDC China, noted that while China’s AI-accelerated server market is growing, gaps remain in high-end computing performance and ecosystem development.
Future competition will shift from single-chip performance to system energy efficiency, ecosystem collaboration, and green computing cost control. The focus must be on avoiding redundant construction while enhancing global competitiveness through technological partnerships and standards optimization.