
Nvidia Unveils ‘Vera Rubin’ Platform as Next Generation of AI Computing at CES 2026

By xinyue, Jan 05, 2026, 9:49 p.m. ET

Compared to the current Blackwell architecture, Rubin offers 3.5 times faster training speed and reduces inference costs by a factor of 10.

Image source: NVIDIA livestream screenshot

Nvidia on Monday introduced its next-generation artificial intelligence computing platform, called Vera Rubin, at the Consumer Electronics Show (CES) 2026, marking what the company describes as a major step forward in performance, efficiency and scalability for training and running large AI models.

The new platform is designed to support emerging AI workloads such as autonomous “agentic” systems, advanced reasoning models and mixture-of-experts (MoE) architectures, which dynamically route each query to a small subset of specialized expert sub-networks within a larger model.
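
The routing idea behind MoE can be sketched in a few lines. The example below is a generic, illustrative top-k router, not Nvidia’s or any particular model’s implementation; the expert count, dimensions and random weights are arbitrary.

```python
# Minimal illustration of mixture-of-experts (MoE) routing, the general
# technique referenced above. Generic sketch only; not Nvidia's design.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8      # specialized sub-networks ("experts")
TOP_K = 2            # each token is routed to only this many experts
DIM = 16             # toy hidden dimension

# Toy experts: each is just a random linear layer here.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
# Router: a linear layer that scores how well each expert suits a token.
router = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token to its top-k experts and mix their outputs."""
    logits = token @ router                      # one score per expert
    top = np.argsort(logits)[-TOP_K:]            # indices of the best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                     # softmax over the chosen experts
    # Only the selected experts run, which is why MoE models can grow very
    # large while keeping per-token compute roughly constant.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

output = moe_forward(rng.standard_normal(DIM))
print(output.shape)  # (16,)
```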

At the heart of the platform is the Vera Rubin superchip, which integrates one Vera central processing unit (CPU) with two Rubin graphics processing units (GPUs) in a single package. The Vera CPU and Rubin GPU are two of the six new chips that together make up Nvidia’s broader Rubin architecture.

“Rubin arrives at a moment when the demand for AI computing for both training and inference is accelerating at an unprecedented pace,” Nvidia chief executive Jensen Huang said during the company’s keynote presentation. “With deep co-design across compute, networking and storage, this platform is built for the next frontier of AI.”

A Platform Built for Large-Scale AI

Beyond the main processor, the Rubin platform includes a suite of new networking and infrastructure components: the NVLink 6 Switch for high-speed GPU interconnects, the ConnectX-9 SuperNIC for network acceleration, the BlueField-4 data processing unit (DPU) for offloading infrastructure workloads, and the Spectrum-6 Ethernet Switch for large-scale data center networking.

These components can be assembled into Nvidia’s new NVL72 server system, which integrates 72 GPUs into a single rack-scale unit. Multiple NVL72 systems can then be combined into larger clusters known as DGX SuperPODs, which are used by hyperscale cloud providers and AI developers to train frontier models.

Customers for these systems include major cloud and technology firms such as Microsoft, Google, Amazon and Meta, all of which are investing heavily in AI infrastructure.

Nvidia also introduced a new storage architecture called Inference Context Memory Storage, designed to manage the massive volumes of data generated by trillion-parameter and multi-step reasoning models and to allow that data to be shared efficiently across large AI systems.

Efficiency Gains Over Previous Systems

Nvidia said the Rubin platform delivers significant efficiency improvements over its previous Grace Blackwell generation.

According to the company, Rubin can reduce the number of GPUs required to train certain mixture-of-experts models by up to four times, allowing companies either to cut costs or redeploy hardware to other workloads. Nvidia also claims that the platform can reduce the cost of AI inference — the process of generating outputs from trained models — by up to ten times per token.

Inference costs have become a growing concern for AI developers, as large language and multimodal models consume vast amounts of computing power and electricity when processing text, images and video. Lower token costs could significantly improve the total cost of ownership for enterprises deploying AI at scale.
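
As a rough illustration of why per-token cost matters for total cost of ownership, the short sketch below applies the claimed “up to ten times” reduction to an assumed workload; the token volume and dollar rates are hypothetical and not figures from Nvidia.

```python
# Back-of-envelope illustration of a 10x lower per-token inference cost.
# All dollar figures and volumes below are assumed, not vendor data.
TOKENS_PER_DAY = 5_000_000_000                     # assumed daily token volume
COST_PER_MILLION_OLD = 2.00                        # assumed $/1M tokens, prior generation
COST_PER_MILLION_NEW = COST_PER_MILLION_OLD / 10   # the claimed "up to 10x" reduction

old_daily = TOKENS_PER_DAY / 1e6 * COST_PER_MILLION_OLD
new_daily = TOKENS_PER_DAY / 1e6 * COST_PER_MILLION_NEW
print(f"old: ${old_daily:,.0f}/day  new: ${new_daily:,.0f}/day  "
      f"saving: ${(old_daily - new_daily) * 365:,.0f}/year")
# old: $10,000/day  new: $1,000/day  saving: $3,285,000/year
```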

Nvidia said the platform is already being sampled by partners and is now in full production.

Market Position and Competition

Nvidia’s dominance in AI chips has propelled it to the top ranks of global technology companies by market capitalization, although its valuation has fluctuated in recent months amid investor concerns about the pace and sustainability of AI spending.

The company also faces rising competition. Advanced Micro Devices (AMD) is developing its own rack-scale AI systems, while major cloud providers such as Google and Amazon are expanding the use of in-house chips for some workloads, including those supporting AI start-up Anthropic.

Google is also in talks with other technology firms about broader adoption of its custom processors in third-party data centers, according to people familiar with the matter.

Even so, analysts say Nvidia retains a substantial lead in AI hardware, software integration and developer ecosystem. Its strategy of delivering a new generation of AI platforms on an annual cadence could make it difficult for rivals to close the gap in the near term.

“With Rubin, Nvidia is not just selling faster chips — it is selling a tightly integrated AI computing stack,” said one industry analyst. “That makes it much harder for competitors to match the full system performance and ecosystem support that Nvidia now offers.”

As AI applications move beyond experimentation into large-scale deployment across industries, Nvidia is betting that demand for powerful, efficient and flexible AI infrastructure will continue to rise — and that Rubin will become the backbone of that next phase.
