Zing Forum

JTS Framework: Bridging the Detection-to-Abstention Gap in Reasoning Models Under Insufficient Information

Large reasoning models often detect that a problem is incomplete when given insufficient information, yet they continue reasoning and produce unsupported answers. The Judge-Then-Solve (JTS) framework proposed in this paper uses trajectory-level reasoning control to train models to commit to an answerability judgment before generating a solution, which makes abstention more reliable.

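Tags: reasoning models, insufficient information, abstention mechanism, detection-to-abstention gap, reinforcement learning, medical AI, reasoning control, Judge-Then-Solve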
Published 2026-05-28 10:19 | Recent activity 2026-05-28 10:23 | Estimated read 8 min

Section 01

[Introduction] JTS Framework: Bridging the Detection-to-Abstention Gap in Reasoning Models

Original Author/Maintainer: arXiv authors
Source Platform: arXiv
Original Title: Bridging the Detection-to-Abstention Gap in Reasoning Models under Insufficient Information
Original Link: http://arxiv.org/abs/2605.28070v1
Release Time: 2026-05-28

Large reasoning models tend to "detect but not act" when information is insufficient: they can identify that information is missing, yet they press on with reasoning and give unsupported answers. The paper calls this phenomenon the Detection-to-Abstention Gap. The Judge-Then-Solve (JTS) framework it proposes uses trajectory-level reasoning control to train models to judge answerability before generating a solution, which makes abstention more reliable and supports safe deployment in high-risk scenarios such as medical AI.

Section 02

Research Background: The Detection-to-Abstention Gap Problem in Reasoning Models

Large reasoning models excel at complex problems, but when a query lacks sufficient information they exhibit a subtle flaw: they detect the missing information but do not abstain. The research team formalizes this as the Detection-to-Abstention Gap. It is particularly dangerous in high-risk fields such as medical AI: a diagnostic AI that recognizes the medical records are insufficient but still issues a diagnosis could cause catastrophic consequences.

Section 03

Analysis of Limitations of Existing Methods

Traditional methods treat abstention as an answer style (outputting "I don't know" or similar) and suffer from three major problems:

  1. Passive response: The model chooses to abstain only at the final stage and cannot actively control the reasoning process;
  2. Wasted reasoning: Even when the model is aware that information is insufficient, it still completes the full reasoning process, wasting compute;
  3. Risk accumulation: As reasoning continues, the model makes assumptions to fill in the missing premises, amplifying the risk of errors.

Section 04

JTS Framework: Core Mechanism of Judge-Then-Solve

JTS is a trajectory-level reasoning control framework built on the principle of "Judge-Then-Solve":

  • Judge Phase: Before generating a solution, the model must explicitly judge whether the problem contains enough information to answer; if not, it terminates reasoning immediately;
  • Solve Phase: The model generates a solution only after the judgment passes.
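
The two-phase control flow described above can be sketched in a few lines of Python. This is only an illustration: in JTS both phases happen inside a single model trajectory, whereas the sketch uses two separate callables, and the names `Trajectory`, `judge_then_solve`, `toy_judge`, and `toy_solve` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Trajectory:
    answerable: bool        # the judge-phase commitment
    answer: Optional[str]   # None when the model abstains
    solve_called: bool      # whether the solve phase ran at all


def judge_then_solve(
    problem: str,
    judge: Callable[[str], bool],
    solve: Callable[[str], str],
) -> Trajectory:
    """Commit to an answerability verdict first; solve only if it passes."""
    if not judge(problem):
        # Judged unanswerable: stop here, no solution is generated.
        return Trajectory(answerable=False, answer=None, solve_called=False)
    return Trajectory(answerable=True, answer=solve(problem), solve_called=True)


# Toy stand-ins (assumptions for illustration, not a real model):
def toy_judge(problem: str) -> bool:
    # Treat any problem mentioning "unknown" as missing a premise.
    return "unknown" not in problem


def toy_solve(problem: str) -> str:
    return "solved: " + problem
```

Calling `judge_then_solve("x + unknown = 5", toy_judge, toy_solve)` returns a trajectory with `answer=None` and `solve_called=False`, which is the early termination the Judge Phase is meant to enforce.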

Training strategies include:

  • Supervised warm-up: Supervised learning first familiarizes the model with making answerability judgments;
  • Missing-premise reinforcement learning: The model is then trained to abstain actively using a consistency reward (the action must agree with the judgment) and a length-shaping reward (unanswerable trajectories should terminate as early as possible).
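
To make the two reward terms concrete, here is a minimal sketch of how they might be combined. The summary does not give the paper's actual reward formulas, so the +1/-1 values, the `budget` and `weight` parameters, and all function names below are assumptions chosen for illustration.

```python
def consistency_reward(judged_answerable: bool, abstained: bool) -> float:
    """+1 when the action matches the judgment, -1 otherwise (assumed values)."""
    # Consistent cases: judged answerable and answered, or
    # judged unanswerable and abstained.
    consistent = judged_answerable != abstained
    return 1.0 if consistent else -1.0


def length_shaping_reward(
    num_tokens: int,
    is_answerable: bool,
    budget: int = 512,     # assumed token budget
    weight: float = 0.5,   # assumed penalty weight
) -> float:
    """Penalize trajectory length only on unanswerable problems."""
    if is_answerable:
        return 0.0
    return -weight * min(num_tokens, budget) / budget


def total_reward(
    judged_answerable: bool,
    abstained: bool,
    num_tokens: int,
    is_answerable: bool,
) -> float:
    """Sum of the two terms (an assumed, unweighted combination)."""
    return consistency_reward(judged_answerable, abstained) + length_shaping_reward(
        num_tokens, is_answerable
    )
```

Under these assumptions, a model that judges a problem unanswerable and abstains after 128 tokens earns more than one that reaches the same verdict after 512 tokens, which is the pressure toward early termination that the length-shaping term is described as providing.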

Section 05

Experimental Results: Dual Improvement in Abstention Reliability and Efficiency

Experiments on both dense and mixture-of-experts (MoE) models show:

  1. More reliable abstention: The Abstention@Detection (A@D) metric approaches saturation, meaning that when the model detects insufficient information it almost always follows through by abstaining;
  2. More efficient reasoning: Terminating unanswerable trajectories early cuts unnecessary computation;
  3. Better reasoning behavior: The model spends less effort on unproductive reflection when questions are difficult but answerable, so its reasoning becomes more direct.
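
For readers who want a concrete handle on A@D, the sketch below computes it under one plausible reading of the name: among the cases where the model detected insufficient information, the fraction in which it actually abstained. The summary does not state the paper's exact definition, so this formula and the function name `abstention_at_detection` are assumptions.

```python
from typing import Iterable, Tuple


def abstention_at_detection(records: Iterable[Tuple[bool, bool]]) -> float:
    """Fraction of detected-insufficient cases in which the model abstained.

    Each record is (detected_insufficient, abstained). Assumed definition:
    A@D = #(detected and abstained) / #(detected).
    """
    abstained_flags = [abstained for detected, abstained in records if detected]
    if not abstained_flags:
        # No detections at all: the ratio is undefined.
        return float("nan")
    return sum(abstained_flags) / len(abstained_flags)


# Hypothetical evaluation log: three detections, two followed by abstention.
example_records = [(True, True), (True, False), (False, False), (True, True)]
example_score = abstention_at_detection(example_records)
```

With the hypothetical log above the score is 2/3; a value near 1.0 corresponds to the "nearly saturated" behavior reported for JTS.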

Section 06

Technical Significance and Potential Application Scenarios

Technical Significance:

  • Improved safety: Models can explicitly abstain in high-risk scenarios, reducing the risk of wrong decisions;
  • Saved computing resources: Early termination of invalid reasoning makes it suitable for large-scale deployment;
  • Enhanced interpretability: The explicit judgment mechanism makes the decision process more transparent.

Potential Application Scenarios:

  • Medical diagnosis assistance: Ask for the missing medical record information instead of giving an uncertain diagnosis;
  • Legal consultation: Guide users to provide the missing background information;
  • Scientific research assistance: Identify missing data and suggest supplementary experiments;
  • Financial risk control: Decline to produce a risk assessment when information is insufficient.

Section 07

Limitations and Future Research Directions

Open limitations of JTS and directions for future research:

  1. Improve judgment accuracy: Avoid misjudging answerable questions as unanswerable;
  2. Multilingual expansion: Verify effectiveness in non-English scenarios;
  3. Integration with other safety mechanisms: Explore synergy with Constitutional AI and RLHF;
  4. Dynamic threshold adjustment: Dynamically adjust the answerability judgment threshold according to the scenario.

Section 08

Conclusion: Core Contributions of the JTS Framework

The JTS framework bridges the detection-to-abstention gap in reasoning models by redefining abstention as a control decision rather than an answer style. The reported experiments indicate that it makes abstention markedly more reliable, reduces wasted computation, and improves reasoning behavior. This provides technical support for safe deployment in high-risk scenarios and points toward more reliable and controllable AI systems.