# CLR-voyance: Enhancing Open-ended Reasoning for Inpatient Clinical Decision-making with Outcome-aware Scoring Rules

> This article introduces the CLR-voyance framework, which formulates inpatient clinical reasoning as a Partially Observable Markov Decision Process (POMDP). It supervises model training with outcome-grounded, clinician-validated reward signals and outperforms frontier and medical reasoning models such as GPT-5 and MedGemma-27B on inpatient clinical reasoning tasks.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-10T14:51:31.000Z
- Last activity: 2026-05-12T04:19:05.841Z
- Popularity: 120.5
- Keywords: clinical reasoning, POMDP, medical AI, reinforcement learning, GRPO, LLM evaluation, healthcare
- Page link: https://www.zingnex.cn/en/forum/thread/clr-voyance
- Canonical: https://www.zingnex.cn/forum/thread/clr-voyance

---

## Introduction: CLR-voyance, a Framework to Enhance Inpatient Clinical Decision-making Reasoning

CLR-voyance formulates inpatient clinical reasoning as a Partially Observable Markov Decision Process (POMDP) and supervises training with reward signals that are anchored to patient outcomes and validated by clinicians. On inpatient clinical reasoning tasks it outperforms models such as GPT-5 and MedGemma-27B, and it has been deployed in public hospitals, where its value was assessed in real-world settings.

## Background: Unique Challenges of Inpatient Clinical Decision-making and Limitations of Existing Methods

Inpatient clinical decision-making is a sequential decision problem with three core features: partial observability (the patient's true state and future course are not fully known when a decision is made), open-ended reasoning (there is no single fixed answer), and outcome lag (the effects of a decision appear hours or days later). Existing evaluation methods for clinical large language models often reduce the task to closed-ended questions, leak the downstream clinical course into the prompt, or rely on LLM-assigned scores that are not anchored to outcomes. As a result, they fail to reflect the complexity of real-world decisions.

## Core of the Framework: POMDP Modeling and Outcome-anchored Reward Design

The core innovation of CLR-voyance is formalizing inpatient reasoning as a POMDP and designing reward signals that meet two conditions:

1. Outcome-anchored: verifiable against what actually happened in the patient's course of treatment.
2. Clinician-validated: confirmed by practicing doctors.

The framework divides each patient record into a policy-visible past (historical data accessible to the model) and a future visible only to the oracle (the actual outcomes used to verify reasoning quality). From this split it generates adaptive scoring rules that are used for both training and evaluation.
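The split between what the model sees and what only the oracle sees can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Event` and `RubricItem` types, the `split_trajectory` and `rubric_reward` helpers, the keyword-based checks, and the toy patient record are hypothetical and are not taken from the CLR-voyance implementation, which the source does not describe at this level of detail.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Event:
    """One timestamped entry in a hypothetical (toy) inpatient record."""
    hour: float  # hours since admission
    text: str    # e.g. a lab result, note, or order


@dataclass
class RubricItem:
    """One scoring criterion written with knowledge of the oracle-only future."""
    description: str
    weight: float
    is_met: Callable[[str], bool]  # stand-in for a clinician or judge model


def split_trajectory(events: List[Event], decision_hour: float) -> Tuple[List[Event], List[Event]]:
    """Split a record into the policy-visible past and the oracle-only future."""
    past = [e for e in events if e.hour <= decision_hour]
    future = [e for e in events if e.hour > decision_hour]
    return past, future


def rubric_reward(response: str, rubric: List[RubricItem]) -> float:
    """Weighted fraction of rubric items the response satisfies, in [0, 1]."""
    total = sum(item.weight for item in rubric)
    if total == 0:
        return 0.0
    earned = sum(item.weight for item in rubric if item.is_met(response))
    return earned / total


# Toy record, invented for this example only.
events = [
    Event(2.0, "Admitted with fever and productive cough"),
    Event(6.0, "Chest X-ray: right lower lobe infiltrate"),
    Event(30.0, "Antibiotics started; fever resolved by hour 54"),
]
past, future = split_trajectory(events, decision_hour=12.0)

# The rubric is derived from `future`; the policy is only ever shown `past`.
rubric = [
    RubricItem("Considers pneumonia", 2.0, lambda r: "pneumonia" in r.lower()),
    RubricItem("Proposes antibiotics", 1.0, lambda r: "antibiotic" in r.lower()),
]
reward = rubric_reward("Likely pneumonia; obtain cultures.", rubric)
print(len(past), len(future), round(reward, 3))
```

The design point the sketch tries to capture is that the scoring rule is specific to each case and grounded in what later happened, while the response being scored is produced without access to that later information.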

## Technical Implementation: Training Process and Clinical Alignment Research

The base models are Qwen3-8B and MedGemma-4B. Both are post-trained with Group Relative Policy Optimization (GRPO), a reinforcement learning method, and their strengths are then combined via model merging. A large-scale clinician alignment study was also conducted: doctors designed scoring rules, scored model responses, and made blind pairwise preference comparisons. The study validates the reward design and offers insights for evaluating clinical LLMs.
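For readers unfamiliar with GRPO, the core idea is that each sampled response is scored relative to the other responses drawn for the same prompt, so no separate value network is needed. The following is a minimal sketch of that group-relative advantage step only. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the example rewards are assumptions for illustration; the source does not give the training configuration used for CLR-voyance.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward by the mean and std of its own sampling group.

    `rewards` holds the scalar rewards of several responses sampled for the
    same prompt. `eps` avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rubric rewards for four responses to one clinical prompt.
group_rewards = [0.9, 0.6, 0.3, 0.6]
advantages = group_relative_advantages(group_rewards)
print([round(a, 3) for a in advantages])
```

Responses that score above their group's mean receive a positive advantage and are reinforced; those below the mean are discouraged. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.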

## Experimental Results: Performance Surpassing Cutting-edge Medical Models

CLR-voyance-8B achieved 84.91% on the CLR-POMDP benchmark, outperforming GPT-5 (77.83%) and MedGemma-27B (66.66%) by a wide margin. It also matched or exceeded its baselines on existing medical benchmarks, which suggests the specialized training improved inpatient reasoning without sacrificing general capability.
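To make the size of the gap concrete, the reported scores can be turned into percentage-point margins. This small sketch uses only the three numbers quoted above; the variable names are illustrative.

```python
# CLR-POMDP benchmark scores as reported in this article (percent).
scores = {"CLR-voyance-8B": 84.91, "GPT-5": 77.83, "MedGemma-27B": 66.66}

ours = scores["CLR-voyance-8B"]
# Percentage-point margin of CLR-voyance-8B over each comparison model.
margins = {
    name: round(ours - score, 2)
    for name, score in scores.items()
    if name != "CLR-voyance-8B"
}

for name, margin in margins.items():
    print(f"{name}: +{margin:.2f} percentage points")
```

These are absolute differences in benchmark score, not relative improvements, and the source does not report confidence intervals for them.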

## Real-world Deployment: Application Effectiveness in Actual Clinical Settings

CLR-voyance has been deployed in partner public hospitals for over six months. During that time it has assisted doctors in drafting thousands of reasoning-intensive inpatient medical records and has been integrated into existing clinical information systems, which supports its practicality and reliability in real-world use.

## Technical Insights: Key Directions for Clinical AI Development

CLR-voyance points to four lessons: the value of formal modeling (here, a POMDP), the importance of outcome-aware rewards, the necessity of clinician validation, and the potential of small models (an 8B-parameter model outperforming larger ones on this task). Future work can explore applying the framework to more clinical scenarios.

## Conclusion: Significant Progress in the Field of Clinical AI

CLR-voyance represents significant progress in clinical AI. It applies the POMDP framework to clinical reasoning, combines technical rigor with clinical practicality, and offers a reference paradigm for future clinical decision support systems.
