# Reasoning Trace Collapse: How Fine-tuning Quietly Undermines Explicit Reasoning Models

> This paper identifies Reasoning Trace Collapse in explicit reasoning models during downstream fine-tuning: models still produce correct answers but lose their structured intermediate reasoning. It proposes a structural evaluation framework to detect the issue and a loss masking strategy to mitigate it.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-20T12:58:01.000Z
- Last activity: 2026-05-21T03:56:07.287Z
- Popularity: 132.0
- Keywords: explicit reasoning, model fine-tuning, chain-of-thought, interpretability, evaluation framework, AI safety
- Page link: https://www.zingnex.cn/en/forum/thread/llm-arxiv-2605-21127v1
- Canonical: https://www.zingnex.cn/forum/thread/llm-arxiv-2605-21127v1

---

## [Introduction] Reasoning Trace Collapse: A Hidden Crisis in Fine-tuning Explicit Reasoning Models

This paper identifies **Reasoning Trace Collapse**, a failure mode that appears when explicit reasoning models (e.g., DeepSeek-R1, OpenAI o1) are fine-tuned on downstream tasks: the model still produces correct answers but no longer produces a structured intermediate reasoning process. Because the answers stay correct, the collapse is easy to miss, yet it undermines the model's interpretability and reliability. The study proposes a **structural evaluation framework** to detect the problem and a **loss masking strategy** to mitigate it, offering practical guidance for fine-tuning and deploying explicit reasoning models.

## Background: The Rise of Explicit Reasoning Models and Fine-tuning Challenges

In recent years, explicit reasoning models have excelled at complex tasks by generating detailed intermediate reasoning (e.g., chain-of-thought). This brings three main advantages: interpretability, reliability, and the ability to handle complex tasks. Downstream fine-tuning data, however, often consists only of instruction-response pairs and **lacks intermediate reasoning traces**, which poses a key challenge when adapting these models.

## Phenomenon: Definition and Harms of Reasoning Trace Collapse

The study identifies **Reasoning Trace Collapse**: after an explicit reasoning model is fine-tuned on data without reasoning traces, it can still output correct answers but stops producing structurally valid explicit reasoning traces, degenerating from explicit to implicit reasoning. The harms are fourfold: correct answers mask the problem, interpretability is lost, reliability decreases, and errors become hard to locate and correct.

## Method: Structural Evaluation Framework—An Evaluation System Separating Answers and Reasoning

To study the collapse quantitatively, the team developed a **structural evaluation framework** that sorts each reasoning trace into one of four categories: valid reasoning (present and logically coherent), empty reasoning (present but without meaningful content), missing reasoning (the answer is output directly), and truncated reasoning (the trace stops midway). The framework also introduces **reasoning-conditional performance**, which measures task performance only over outputs whose reasoning is valid, revealing the model's true explicit reasoning ability.
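The four categories above can be sketched as a small structural classifier. This is only an illustration under stated assumptions, not the paper's implementation: it assumes reasoning is wrapped in `<think>...</think>` delimiters, and it uses a hypothetical word-count threshold (`MIN_WORDS`) as a crude stand-in for "logically coherent", which a purely structural check cannot verify.

```python
from enum import Enum


class TraceState(Enum):
    VALID = "valid"
    EMPTY = "empty"
    MISSING = "missing"
    TRUNCATED = "truncated"


# Assumption: the model wraps its reasoning in these delimiters.
OPEN_TAG, CLOSE_TAG = "<think>", "</think>"
# Assumption: fewer words than this counts as "empty" reasoning.
MIN_WORDS = 5


def classify_trace(output: str) -> TraceState:
    """Assign a model output to one of the four structural categories."""
    start = output.find(OPEN_TAG)
    if start == -1:
        # No reasoning block at all: the answer was output directly.
        return TraceState.MISSING

    body_start = start + len(OPEN_TAG)
    end = output.find(CLOSE_TAG, body_start)
    if end == -1:
        # The block was opened but never closed: generation stopped midway.
        return TraceState.TRUNCATED

    body = output[body_start:end]
    if len(body.split()) < MIN_WORDS:
        # The block exists but holds no meaningful content.
        return TraceState.EMPTY
    return TraceState.VALID
```

A real evaluation would likely need a stronger coherence check for the valid category, for example a judge model or task-specific rules; the threshold here only separates obviously empty blocks from non-empty ones.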

## Experimental Evidence: Collapse Speed and Evaluation Bias

Experiments on four open-source reasoning models yield two findings: 1. Standard supervised fine-tuning (SFT) reduces the proportion of valid reasoning within a short period of training; 2. Answer-only metrics seriously mask the problem: conditional performance remains high while the valid reasoning rate drops sharply, so researchers may judge fine-tuning a success even though the core capability has been impaired.
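To make the masking effect concrete, the sketch below reports answer accuracy, the valid reasoning rate, and reasoning-conditional accuracy side by side. The `Record` type, the `summarize` helper, and the two toy datasets are all hypothetical: the numbers are invented purely for illustration and are not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    trace_state: str  # one of "valid", "empty", "missing", "truncated"
    correct: bool     # whether the final answer matched the reference


def summarize(records: List[Record]) -> Dict[str, Optional[float]]:
    """Report answer-only and reasoning-aware metrics for one evaluation run."""
    if not records:
        raise ValueError("need at least one record")
    valid = [r for r in records if r.trace_state == "valid"]
    return {
        # Answer-only metric: ignores whether any reasoning was produced.
        "answer_accuracy": sum(r.correct for r in records) / len(records),
        # Share of outputs whose reasoning trace is structurally valid.
        "valid_reasoning_rate": len(valid) / len(records),
        # Accuracy restricted to outputs with valid reasoning (None if there are none).
        "reasoning_conditional_accuracy": (
            sum(r.correct for r in valid) / len(valid) if valid else None
        ),
    }


# Invented toy data, NOT from the paper: ten outputs before and after fine-tuning.
before_sft = [Record("valid", True)] * 8 + [Record("valid", False), Record("missing", False)]
after_sft = (
    [Record("valid", True)] * 2
    + [Record("missing", True)] * 6
    + [Record("missing", False)] * 2
)

for name, records in (("before", before_sft), ("after", after_sft)):
    print(name, summarize(records))
```

In this toy setup answer accuracy is identical in both runs, while the valid reasoning rate falls from 0.9 to 0.2. An evaluation that tracks only answer accuracy would not register the change.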

## Mitigation Strategy: Loss Masking—A Protection Method Without Additional Reasoning Traces

A **loss masking strategy** is proposed to mitigate the collapse: when computing the training loss, the reasoning-trace portion of each sequence is either fully masked (excluded from the loss) or partially masked (given a reduced weight). The method requires no teacher-generated reasoning traces; changing only the loss computation significantly reduces the collapse while preserving both task performance and explicit reasoning ability.
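The idea can be sketched as a weighted token-level loss. This is a minimal illustration in plain Python, not the paper's training code. It assumes that per-token log-probabilities are already available and that a boolean mask marks which target tokens belong to the reasoning span (for example, a placeholder reasoning block inserted by a chat template). The function names and the choice to normalize by the sum of weights are assumptions.

```python
from typing import List, Sequence


def token_weights(is_reasoning: Sequence[bool], reasoning_weight: float = 0.0) -> List[float]:
    """Per-token loss weights: 1.0 for ordinary tokens, `reasoning_weight` for reasoning tokens.

    reasoning_weight = 0.0 corresponds to full masking,
    0.0 < reasoning_weight < 1.0 to partial masking,
    reasoning_weight = 1.0 to ordinary SFT with no masking.
    """
    if not 0.0 <= reasoning_weight <= 1.0:
        raise ValueError("reasoning_weight must lie in [0, 1]")
    return [reasoning_weight if flag else 1.0 for flag in is_reasoning]


def masked_nll(
    token_log_probs: Sequence[float],
    is_reasoning: Sequence[bool],
    reasoning_weight: float = 0.0,
) -> float:
    """Weighted mean negative log-likelihood over the target tokens."""
    if len(token_log_probs) != len(is_reasoning):
        raise ValueError("token_log_probs and is_reasoning must have the same length")
    weights = token_weights(is_reasoning, reasoning_weight)
    total_weight = sum(weights)
    if total_weight == 0.0:
        # Every token is masked out, so there is nothing to learn from this sequence.
        return 0.0
    # Assumed design choice: normalize by total weight so the loss scale
    # stays comparable across different masking settings.
    weighted_sum = sum(w * lp for w, lp in zip(weights, token_log_probs))
    return -weighted_sum / total_weight
```

With PyTorch, full masking is commonly achieved by setting the labels of the masked positions to `-100`, the default `ignore_index` of `torch.nn.CrossEntropyLoss`. Partial masking needs per-token weights applied to an unreduced loss, as in the sketch above.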

## Practical Recommendations and Research Insights

**Practical Recommendations**: 1. Include reasoning reliability metrics in evaluations (proportion of valid reasoning, conditional performance, etc.); 2. When fine-tuning on data without reasoning traces, use loss masking and monitor reasoning quality; 3. Consider supplementing the data with reasoning traces (generated by teacher models, manually annotated, etc.); 4. Continuously monitor reasoning behavior in production environments.

**Research Insights**: Performance does not equal ability; a single metric can easily mask behavioral changes; fine-tuning calls for caution, since standard SFT may degrade existing abilities. Protecting explicit reasoning ability is key to building trustworthy AI.
