AI Challenges Medical Orthodoxy as Patients and Doctors Turn to Algorithms First

By xinyue | Jul 25, 2025, 2:09 a.m. ET

Large language models are making their way into the front lines of clinical medicine—the era of doctors and AI “co-diagnosing” has already arrived.

Image generated by AI

AsianFin -- Artificial intelligence is rapidly encroaching on one of the last bastions of human authority: the doctor’s office.

Increasingly, patients aren’t just showing up for appointments with questions—they’re arriving with diagnoses, suggested treatments, and guideline citations, thanks to conversations they’ve already had with ChatGPT, DeepSeek, or other medical large language models. For physicians, it signals the end of an era in which their expertise went largely unchallenged.

“No matter whether doctors like it or not, we have to face this reality,” said Li Houmin, director of dermatology at Peking University People’s Hospital. “Patients now routinely consult AI before they ever come to see us.”

From major Chinese cities to patients abroad seeking treatment back home, the shift is clear: AI-powered tools are becoming the first point of contact for millions. Armed with model-generated insights, patients are pressing their doctors with deeper, more informed questions—sometimes referencing medical standards from multiple countries.

The acceleration of specialized medical models is giving this trend fresh momentum. In May, OpenAI unveiled HealthBench, a new benchmark developed with input from 262 physicians in 60 countries. The dataset contains 5,000 real-world health conversations, scored using custom physician-created metrics. On five of seven scoring dimensions, GPT-4.1 outperformed the average doctor.

Microsoft followed in July with MAI-DxO, a diagnostic system tested on 304 complex cases from the New England Journal of Medicine. It posted an accuracy rate of 85.5%, more than four times the roughly 20% achieved by practicing physicians on the same cases, while also cutting costs and saving time.

China is making its own push into the field. On July 23, Quark announced that its Quark Health large model had passed China’s chief physician written assessment across 12 core medical specialties. That “chief physician-level AI” has since been integrated into Quark’s AI-powered search platform, offering patients sophisticated analysis through deep health search queries.

Built atop Alibaba’s Qwen foundation model, the Quark system has been tailored for vertical medical scenarios. Xu Jian, head of algorithms at Quark Health, said a core breakthrough is the platform’s “slow thinking” mechanism—chain-of-thought reasoning paired with multistage clinical deduction modeling. The system’s backend categorizes data as either “verifiable” (e.g., diagnostic outcomes) or “non-verifiable” (e.g., lifestyle advice), applying dual reward structures to separately assess logical reasoning and final accuracy.
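Quark has not published its reward implementation, but the verifiable/non-verifiable split described above can be sketched roughly as follows. This is a minimal illustration only; `Sample`, `accuracy_reward`, and the judge function are hypothetical stand-ins, not Quark's actual API:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Sample:
    output: str                 # model's answer
    verifiable: bool            # True for checkable outputs (e.g., a diagnosis)
    ground_truth: Optional[str] = None  # present only for verifiable samples

def accuracy_reward(sample: Sample) -> float:
    # Exact-match stand-in for "final accuracy" on verifiable data.
    return 1.0 if sample.output == sample.ground_truth else 0.0

def reasoning_reward(sample: Sample, judge: Callable[[str], float]) -> float:
    # A rubric or judge model scores "logical reasoning" quality for
    # non-verifiable outputs such as lifestyle advice.
    return judge(sample.output)

def dual_reward(sample: Sample, judge: Callable[[str], float]) -> float:
    # Route each sample to the reward appropriate for its data category.
    if sample.verifiable:
        return accuracy_reward(sample)
    return reasoning_reward(sample, judge)
```

In a training loop, the two reward streams would then feed separate optimization signals, so that diagnostic correctness and advisory reasoning are assessed independently rather than blended into one score.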

Quark says its model achieves a diagnostic accuracy rate of 90.78% for common outpatient cases, matching the precision of clinical medical records written by physicians.

Even as AI rises, its role remains controversial. “AI models or guidelines offer generalized answers,” Li Houmin said. “Individual diagnosis still requires professional medical judgment. And a layperson can’t always interpret what AI says in the way a trained physician can.”

Still, she sees AI’s expanding role as complementary, not competitive. “For example, patients can understand their skin issues may stem from poor sleep, anxiety, or other lifestyle factors. That’s helpful context—but professional input remains essential.”

Doctors, too, are increasingly turning to AI for help. Quark reports that over 2 million medical students—more than half the total in China—now use its health platform monthly. Peak usage aligns with exam periods and weekday study cycles, as students look up terminology, textbook knowledge, and clinical case studies. The company plans to extend services to junior doctors, enabling AI support for clinical decisions, treatment protocols, and research assistance.

Mental health is also emerging as a frontier for AI. Wang Huiling, a professor and director at Wuhan University’s Mental Health Center, notes that depression often hides behind socially acceptable masks. One patient of hers, suffering from severe depression, trained himself to smile convincingly just to meet societal expectations, even fooling experienced clinicians.

“In that case, AI could have made a difference,” Wang said. “By analyzing micro-expressions, pupil movement, or vocal tone, it might have picked up on signals humans missed.”

Companion-style AI tools—some already deployed in clinical psychotherapy—offer early warning signals and alleviate burdens on scarce psychiatric resources. Wang’s colleagues are now mining student interactions with an “AI Treehole” app to identify users at risk of self-harm.

Still, she cautions that the quality of AI matters. “Some platforms may carry harmful or negative emotional reinforcement. It’s critical to ensure that AI therapy is built on responsible databases and reliable algorithms.”

As AI creeps further into younger demographics—via educational tools, AI toys, or digital therapy—long-term cognitive effects remain uncertain. “This is an urgent research topic,” Wang said, comparing the dilemma to early childhood writing instruction. “We used to think early writing was beneficial, but it turns out premature development can hinder brain maturation. AI exposure might be similar—we just don’t know yet.”

What’s clear is that the medical profession, long resistant to outside input, is now being reshaped from both ends: by patients arriving with AI advice and by doctors leaning on AI tools themselves.

For now, medicine remains a collaborative space between humans and machines. But with diagnostic models outperforming seasoned professionals and new tools promising to close emotional gaps, the balance is shifting quickly—and permanently.