NEWS  /  Brief News

DeepSeek Unveils New "MODEL1" Architecture, Boosting AI Inference Capabilities

Jan 21, 2026, 1:37 a.m. ET

DeepSeek has quietly revealed a new AI model architecture, MODEL1, which could offer a more efficient alternative to its current models, according to new findings from the company’s updated GitHub repository. The disclosure arrives just as DeepSeek celebrates the first anniversary of its R1 model.

The update, posted on Wednesday, featured FlashMLA, DeepSeek's proprietary optimization tool designed to accelerate large-scale model inference. The update included more than 100 code files, and an analysis of them found 31 references to MODEL1, marking the first public mention of the architecture.

FlashMLA is based on MLA (multi-head latent attention), a technique that reduces memory usage and improves GPU utilization in DeepSeek's models. This is particularly crucial as companies push for more efficient AI models that can handle complex tasks without overburdening hardware.
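The memory saving behind multi-head latent attention can be illustrated with a minimal sketch: rather than caching full per-head keys and values for every token, the model caches one small shared latent vector per token and up-projects it back into keys and values at attention time. The dimensions and weight names below are hypothetical, chosen only to show the idea; they are not drawn from FlashMLA's actual code.

```python
import numpy as np

# Illustrative sketch of the idea behind multi-head latent attention (MLA):
# cache a compressed per-token latent instead of full keys and values.
# All sizes here are made up for illustration.

rng = np.random.default_rng(0)

d_model = 512    # hidden size of the model
n_heads = 8      # number of attention heads
d_head = 64      # per-head dimension
d_latent = 128   # size of the compressed latent actually cached

# Down-projection (applied once per token) and per-head up-projections.
W_down = rng.normal(size=(d_model, d_latent)) / np.sqrt(d_model)
W_up_k = rng.normal(size=(d_latent, n_heads * d_head)) / np.sqrt(d_latent)
W_up_v = rng.normal(size=(d_latent, n_heads * d_head)) / np.sqrt(d_latent)

seq_len = 16
h = rng.normal(size=(seq_len, d_model))  # token hidden states

# Only this latent is kept in the KV cache: seq_len x d_latent floats.
latent_cache = h @ W_down

# At attention time, keys and values are reconstructed from the latent.
k = (latent_cache @ W_up_k).reshape(seq_len, n_heads, d_head)
v = (latent_cache @ W_up_v).reshape(seq_len, n_heads, d_head)

full_cache_floats = seq_len * 2 * n_heads * d_head  # standard KV cache
mla_cache_floats = seq_len * d_latent               # latent-only cache
print(f"standard KV cache: {full_cache_floats} floats")
print(f"latent cache:      {mla_cache_floats} floats "
      f"({full_cache_floats / mla_cache_floats:.0f}x smaller)")
```

With these example sizes the latent cache is 8x smaller than a standard key-value cache, which is the kind of trade-off that lets inference serve longer sequences on the same GPU memory budget.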

MODEL1 is one of two key architectures supported by FlashMLA, alongside DeepSeek-V3.2. According to industry experts, MODEL1 is positioned as a low-memory inference model, ideal for edge devices or cost-sensitive applications.

Additionally, speculation suggests MODEL1 is optimized for long-sequence tasks, such as document analysis or code interpretation, with a focus on sequences of 16,000 tokens or more.
