
Language is the Core Axis; Multimodality is Not the Main Battlefield, Says CEO of Baichuan Intelligence

By xinyue, Jan 14, 2026, 2:19 a.m. ET

Baichuan Intelligence has officially open-sourced its next-generation medical large language model, Baichuan-M3. The model features significant breakthroughs in advanced reasoning, reduced hallucinations, and end-to-end consultation capabilities.

Baichuan Intelligence Founder and CEO Wang Xiaochuan, image source: Baichuan Intelligence

Language is the core axis of AI and multimodality is not the main battlefield, Wang Xiaochuan, founder and CEO of Baichuan Intelligence, said at a media briefing following the open-source release of the company’s next-generation medical large language model, Baichuan-M3.

The model, unveiled on Tuesday, is designed for end-to-end medical consultation and reasoning, and Baichuan Intelligence said it sets a new standard for medical AI. On the widely used HealthBench evaluation, M3 scored 65.1 overall and 44.4 on HealthBench Hard, which tests complex decision-making. According to Baichuan, these results surpass OpenAI’s GPT-5.2, marking the first time a medical model has comprehensively overtaken the latest version of OpenAI’s flagship system.

Baichuan-M3 is also distinguished by a remarkably low hallucination rate of 3.5%, achieved through reinforcement learning that integrates factual consistency as a core training objective. “We trained the model to know what it knows and to acknowledge what it doesn’t,” Wang said. “This is baked into the model’s decision-making, ensuring accuracy and reliability in serious medical contexts.”

Image source: Baichuan Intelligence

Unlike conventional large models that respond to prompts based on supplied context, Baichuan-M3 actively engages patients as a human physician would. It can probe for missing information, extract key medical history and risk factors, and reason through the case to arrive at well-supported conclusions. According to Baichuan, evaluations indicate that M3’s consultation capabilities surpass the average level of human doctors.

Baichuan developed a proprietary SCAN-bench evaluation framework, inspired by the Objective Structured Clinical Examination (OSCE) used in medical education. SCAN-bench measures a model’s ability to gather patient history, recommend auxiliary examinations, and provide accurate diagnoses in a dynamic, multi-round simulation of real-world consultations. Wang described SCAN as a “gold standard” approach to systematically assess end-to-end clinical reasoning.

Wang emphasized that Baichuan’s focus is on intelligence and reasoning, rather than sheer data accumulation. “Many medical models rely heavily on hospital data but don’t know what they are doing,” he said. “We focus on algorithms and evaluation systems to define a model’s capabilities.”

He contrasted this with common assumptions in the AI industry that multimodal models—capable of integrating images, text, and other data—represent the primary frontier of AI development. “Intelligence is about symbols,” Wang said. “Language, mathematics, and programming are formal symbolic systems. Multimodality is just a branch of the tree. Reasoning and decision-making should remain central, and language is the axis that drives it.”

While Baichuan plans to develop smaller multimodal models for tasks like medical imaging, Wang stressed that the core of medical AI remains symbolic reasoning through language. “Image models can interpret scans, but reasoning about decisions, treatment options, and patient outcomes is fundamentally a language-based task,” he said.

A key breakthrough of Baichuan-M3 is its “Serious Inquiry Paradigm” and adherence to the SCAN Principle—Safety Stratification, Clarity Matters, Association & Inquiry, and Normative Protocol. These frameworks allow the model to simulate human-like clinical reasoning, systematically eliciting missing information and guiding patients through structured medical questioning.

“The biggest challenge for medical AI is incomplete patient descriptions,” Wang said. “The model must proactively ask questions to gather enough information to support decision-making. Simply instructing a model to ‘act as an experienced doctor’ elicits performative behavior, not intrinsic reasoning. M3 solves that problem.”

This capability is deployed in Baichuan’s application, Baixiaoying, which is accessible to both healthcare professionals and patients. Doctors can simulate consultation scenarios to train or validate diagnostic reasoning, while patients receive explanations of diagnoses, treatment plans, and prognoses in an understandable, actionable form.

Low Hallucination Rates and Factual Consistency

In medical AI, hallucinations—false or misleading outputs—can be dangerous. Baichuan-M3 addresses this through reinforcement learning that emphasizes medical factual consistency. “In serious medical contexts, hallucinations are not a minor issue; they can compromise patient safety,” Wang explained.

By embedding the principle of “admitting what you know and acknowledging what you don’t” into training, M3 achieves stable, reliable outputs without relying on external retrieval systems, according to Baichuan. The company calls the approach a paradigm shift in medical AI and reports a hallucination rate of 3.5%, compared with 8–10% for competing models such as GPT-5.2.

Wang argued that the most significant potential for medical AI lies outside hospitals. “Hospitals are for surgeries, procedures, and acute care,” he said. “The future of medical AI is to empower patients directly, through decision support, home-based health monitoring, and health companionship.”

He criticized approaches that overemphasize hospital datasets and internal hospital applications. “Intelligence is the key issue, not the data,” he said. “Even without massive hospital data, models can evolve through rigorous evaluation, reinforcement learning, and algorithmic innovation.”

Baichuan’s commercialization strategy targets patients directly rather than doctors. M3 is designed to bridge the knowledge gap, translate medical language into patient-understandable information, and facilitate informed decision-making. Wang emphasized that regulatory constraints prevent AI from issuing prescriptions, but within those boundaries, the model can provide substantial value.

“For serious scenarios, patients should make the final decisions,” Wang said. “Doctors provide guidance, but AI helps patients understand their options, analyze trade-offs, and choose the approach that aligns with their needs.”

This model of patient empowerment also opens avenues for partnerships with pharmaceutical companies and medical device manufacturers, who may subsidize AI-driven services for patient populations.

Image source: Baichuan Intelligence

Focus Areas: Pediatrics, Chronic Disease, and Oncology

Baichuan’s initial deployment focuses on pediatrics, chronic diseases, and oncology, covering both younger and older patient populations. The company is also exploring applications in sleep health and drug efficacy enhancement, using AI as a companion to improve patient outcomes.

“AI can raise the effective rate of an existing treatment without requiring new drugs or long clinical trials,” Wang said. “For example, algorithms can improve drug effectiveness from 70% to 75%—essentially a new drug, achieved digitally.”
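
As a rough illustration of the arithmetic in Wang’s example, the sketch below computes the absolute and relative gain implied by moving from 70% to 75% effectiveness. The function name `effectiveness_gain` is hypothetical, and the inputs are the illustrative figures from the quote, not measured results.

```python
def effectiveness_gain(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_points = after - before
    relative_percent = (after - before) / before * 100
    return absolute_points, relative_percent


# Illustrative figures taken from the quote above (70% -> 75%).
absolute_points, relative_percent = effectiveness_gain(70.0, 75.0)
print(f"{absolute_points:.1f} percentage points, {relative_percent:.1f}% relative gain")
```

On these numbers, a 5-point absolute improvement corresponds to a relative gain of roughly 7%.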

Baichuan plans to expand M3 internationally. Wang highlighted disparities in rural healthcare in China, where only 26% of village doctors have completed standardized training, as an area where AI could have a transformative impact.

He also emphasized robust privacy and security protocols, assuring that patient data is used solely for personal services and is protected with industry-standard encryption. “If we ever leaked data, the company would collapse,” Wang said.

Baichuan plans to release smaller multimodal models for imaging tasks, while continuing to prioritize symbolic reasoning as the core axis. Wang reiterated that intelligence, not data volume or multimodality, drives meaningful improvements in medical AI.

“Our model roadmap moves from technical reasoning to meeting clinical needs,” he said. “We focus on medical reasoning, evidence extraction, information gathering, and end-to-end consultation. After M3, all four areas have been addressed.”

Wang stressed that Baichuan-M3 bridges gaps that many doctors cannot, including explaining complex diagnoses in ways patients can understand. “Effective communication between patients and doctors requires articulation skills,” he said. “AI models excel at analyzing information and translating it into clear guidance for patients, enabling informed decision-making.”

Baichuan presents its product positioning as globally unique: doctors receive evidence-based information, while patients gain guidance that empowers them to participate actively in their healthcare. Wang described this as a pluralistic, actionable, and decision-enabling approach, in contrast to existing systems that primarily serve professionals.

Experts note that Baichuan-M3 represents a potential shift in how medical AI is deployed, moving from hospital-centric applications to consumer-focused healthcare. By integrating advanced reasoning, low hallucination rates, and patient-centered consultation, the model could expand access to high-quality medical guidance and improve decision-making outside traditional clinical settings.

With Baichuan-M3, Wang is emphasizing that AI in healthcare should enhance patient agency, not merely supplement hospital workflows. By prioritizing language-based reasoning, factual consistency, and end-to-end consultation, Baichuan is positioning its technology to become a cornerstone of out-of-hospital medical services, potentially reshaping the way patients engage with healthcare and empowering informed decision-making on a broad scale.

“Life itself is fascinating,” Wang said. “Our mission is to harness AI to understand it better, helping patients make informed decisions and ultimately live healthier lives.”

As global competition in medical AI intensifies, Baichuan-M3’s release signals a new stage in the evolution of patient-centered, intelligent healthcare, where symbolic reasoning remains the decisive frontier, and multimodality is a supporting branch rather than the main battlefield.
