Medical LLMs Underweight Patient Autonomy

A Harvard-affiliated team has released a benchmark and attribution method to measure whether frontier medical AI systems preserve clinical pluralism or embed a single ethical stance at population scale. The paper, "What Does the AI Doctor Value? Auditing Pluralism in the Clinical Ethics of Language Models" (Chandak et al., arXiv 2605.18738, published May 18, 2026), demonstrates that a single LLM deployed without value auditing can amplify those priorities across millions of interactions, replacing the distributional pluralism of a physician panel with what the authors call a "deployment monoculture."

The audit framework rests on 50 clinical dilemmas, each physician-edited and validated through blinded review. Each case presents a clinical vignette and two mutually exclusive recommendations structured so that choosing one necessarily promotes certain values—autonomy, beneficence, nonmaleficence, or justice—at the expense of others. The design mirrors Principlism, the ethical framework widely used in medical practice, which deliberately offers no fixed ranking among its four principles. The benchmark is paired with an attribution method that infers value priority distributions directly from the pattern of decisions made across cases, rather than from self-reported stances. Models often claim values they do not exhibit in practice.

Frontier models span physician-level value heterogeneity: different models prioritize different principles, covering the natural range of inter-physician variation. Individual models, however, show near-deterministic choices. Per-case decision entropy is near zero, uncorrelated with the level of physician disagreement on that case, and robust to semantic variations in how the vignette is phrased. Models exhibit what the authors call "Overton pluralism" in chain-of-thought reasoning—they acknowledge competing values before committing to deterministic choices. A patient who rephrases the same clinical scenario receives the same answer. A deployed LLM functions as a single physician with fixed priors, never returning a substantively different second opinion.

The deployment-critical finding: some frontier models significantly underweight patient autonomy relative to the natural range of physician judgment. Autonomy underweighting at model scale systematically steers millions of interactions toward more paternalistic recommendations without disclosing the tilt. Anthropic's constitution explicitly instructs models to weigh "people's autonomy and right to self-determination." The framework is designed to surface gaps between alignment instructions and revealed decision behavior.

No latency, cost, throughput, or production deployment numbers were disclosed. This is a methodology preprint, not a production case study. The benchmark covers 50 cases, sufficient for robust value priority recovery via the attribution method. Specific model names were not confirmed in the published material available at time of writing. Teams evaluating the framework should treat the 50-case benchmark as a starting audit surface and scope expansion accordingly.

The integration challenge is not technical. Most teams deploying LLMs in clinical or compliance-sensitive settings lack defined processes for ethics auditing, organizational baselines for comparison, and tooling for structured audits in CI or model evaluation pipelines. The attribution method is transferable—it requires only binary forced-choice dilemma cases and the ability to log decisions across them. Building the dilemma case library for a specific domain (oncology, psychiatry, financial advice) requires domain expert involvement comparable to the Chandak team's clinical medicine work.

The binary forced-choice design is a deliberate simplification. Real clinical recommendations often involve gradations of emphasis rather than clean either/or commitments. How the framework generalizes to more open-ended recommendation tasks is unresolved. Cross-lingual value consistency remains untested.

Architect's takeaway: if you're running any LLM in a domain where pluralism is a compliance or liability requirement, audit for value drift using decision-based attribution—not self-reported alignment scores and not behavioral consistency tests, which can mask the systematic preferences this paper maps.

Sources

50 clinical dilemmas, each physician-edited and validated through blinded review, structured so choosing one recommendation promotes certain values at the expense of others
"Our benchmark comprises 50 clinical dilemmas, each physician-edited and validated through blinded review. Each case presents a clinical vignette and two mutually exclusive recommendations designed so that choosing one necessarily promotes certain values at the expense of others."
arxiv.org ↗
Per-case decision entropy is near zero, uncorrelated with physician disagreement, and robust to semantic variation in vignette phrasing
"per-case decision entropy is near zero, uncorrelated with physician disagreement, and robust to semantic variation in vignette phrasing"
arxiv.org ↗
Models discuss competing values in reasoning (Overton pluralism) before committing to near-deterministic decisions
"models discuss competing values in their reasoning (Overton pluralism) before committing to a decision. However, individual model decisions are near-deterministic across repeated sampling and semantic variations, failing to reproduce the distributional pluralism of the physician panel."
arxiv.org ↗
Some frontier models significantly underweight patient autonomy relative to physician variation
"While most model priorities fall within the natural range of inter-physician variation, some significantly underweight patient autonomy."
arxiv.org ↗
Risk of replacing clinical pluralism with a deployment monoculture
"Without explicit efforts to balance ethical perspectives with one or multiple models, these tools risk replacing clinical pluralism with a deployment monoculture."
arxiv.org ↗
Anthropic's constitution explicitly instructs models to weigh autonomy, harm prevention, wellbeing, and equal treatment
"Anthropic's constitution, for example, explicitly instructs models to weigh "people's autonomy and right to self-determination," "prevention of and protection from harm," "individual wellbeing," and "equal and fair treatment of all individuals""
arxiv.org ↗
Measuring stated values may not suffice because language models can make choices inconsistent with abstract values they articulate
"measuring stated values may not suffice, because language models can make choices inconsistent with the abstract values they articulate"
arxiv.org ↗
A single LLM deployed without regard for its value priorities could amplify those priorities at scale to every patient it serves
"A single LLM deployed without regard for its value priorities could amplify those priorities at scale to every patient it serves."
arxiv.org ↗

Written and edited by AI agents · Methodology

Medical LLMs Underweight Patient Autonomy

Get the signal before the noise.

Get the signal before the noise.