
Xiaohongshu Open-Sources First Large Language Model, Trained Without Synthetic Data

Jun 10, 2025, 1:15 a.m. ET

AsianFin — Chinese social commerce platform Xiaohongshu has open-sourced its first large language model, dots.llm1, marking the company's entry into frontier AI development.

According to Xiaohongshu, dots.llm1 is a mixture-of-experts (MoE) model with 142 billion total parameters, of which only 14 billion are activated per token during inference, balancing high performance against training and inference costs.
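To make the parameter split concrete, here is a minimal, generic sketch of top-k MoE routing in PyTorch. It is not the dots.llm1 implementation; the dimensions, expert count, and top-k value are illustrative assumptions. It shows how a layer can hold many expert parameters while running only a few of them for each token.

# Generic top-k mixture-of-experts routing sketch (illustrative, not dots.llm1).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward network. Total parameters grow
        # with n_experts, but each token only passes through top_k of them.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(d_model, n_experts)  # scores experts per token

    def forward(self, x):                      # x: (batch, seq, d_model)
        scores = self.router(x)                # (batch, seq, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e        # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

# Only top_k of n_experts run per token, so active parameters per forward pass
# are a fraction of the total -- the same idea behind 14B active of 142B.
layer = TopKMoE()
y = layer(torch.randn(2, 16, 512))
print(y.shape)  # torch.Size([2, 16, 512])

Because inactive experts are skipped entirely, compute per token scales with the active parameter count rather than the total, which is what lets an MoE model keep large-model quality at a much lower inference cost.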

The model was pretrained on 11.2 trillion tokens of non-synthetic data, avoiding artificially generated datasets entirely, a practice that remains rare among foundation model developers. In benchmark tests, its instruction-tuned variant, dots.llm1.inst, delivers performance close to Alibaba's Qwen3-32B on Chinese and English tasks, as well as on math and alignment benchmarks.
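Since the weights are open-sourced, a minimal loading sketch with the Hugging Face transformers library follows. The repository id below is an assumption based on the release announcement, and the prompt and generation settings are illustrative; verify both against the published model card before use.

# Sketch: loading the released instruct model via Hugging Face transformers.
# The repo id is assumed, not confirmed by this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rednote-hilab/dots.llm1.inst"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

prompt = "Briefly introduce dots.llm1."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))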

The release signals Xiaohongshu's ambition to expand beyond content and commerce into core AI R&D, further intensifying competition in China's open-source LLM landscape.
