# Thinking as Compression: A New Paradigm of Reasoning Models as Context Compressors

> This article introduces "Thinking as Compression" (TaC), a paradigm that uses a reasoning model's own thinking process to compress long contexts, with no dedicated compression module. At 4x and 8x compression ratios, the enhanced variant TaC-C outperforms the strongest baseline by 17.4% and 23.4% in F1 score, respectively.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-27T16:36:01.000Z
- Last activity: 2026-05-28T03:47:37.235Z
- Popularity: 139.8
- Keywords: context compression, reasoning models, long context, large language models, TaC, information compression, thinking traces, reinforcement learning
- Page link: https://www.zingnex.cn/en/forum/thread/llm-arxiv-2605-28713v1
- Canonical: https://www.zingnex.cn/forum/thread/llm-arxiv-2605-28713v1

---

## Introduction: Thinking as Compression (TaC) — A New Context Compression Paradigm for Reasoning Models

This article proposes **Thinking as Compression (TaC)**, a paradigm in which the reasoning model's own thinking process compresses long contexts, so no dedicated compression module is needed. At 4x and 8x compression ratios, the enhanced version, TaC-C, outperforms the strongest baseline by 17.4% and 23.4% in F1 score, respectively, offering an efficient, low-cost approach to long-context processing.

## Background: Bottlenecks in Long Context Processing and Limitations of Traditional Compression

As LLM context windows expand to millions of tokens, inference becomes expensive: the compute for standard self-attention grows quadratically with sequence length, and latency and memory usage rise along with it. Traditional compression methods rely on dedicated modules or task-specific training, which adds architecture and training costs and makes it hard to balance compression ratio against information retention.
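To make the quadratic scaling concrete, here is a small back-of-the-envelope sketch. It assumes cost is proportional to the square of the sequence length, which holds for standard full self-attention but not for sparse or linear-attention variants; the function name is illustrative and does not come from the article.

```python
def relative_attention_cost(num_tokens: int, compression_ratio: float) -> float:
    """Fraction of the original pairwise-attention cost left after compression.

    Assumption: cost is proportional to (sequence length) ** 2.
    """
    if num_tokens <= 0 or compression_ratio < 1:
        raise ValueError("need num_tokens > 0 and compression_ratio >= 1")
    compressed_tokens = num_tokens / compression_ratio
    return (compressed_tokens ** 2) / (num_tokens ** 2)


if __name__ == "__main__":
    for ratio in (4, 8):
        fraction = relative_attention_cost(1_000_000, ratio)
        print(f"{ratio}x compression leaves {fraction:.4%} of the attention cost")
```

Under this assumption, a 4x shorter context needs 1/16 of the pairwise attention work and an 8x shorter context needs 1/64, which is why even moderate compression ratios matter.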

## Core Insight: Reasoning as Compression and the Basic TaC Paradigm

**Core Viewpoint**: A reasoning model's thinking process is, in essence, information compression: it extracts the key elements of the input and establishes the logical connections between them.

**Basic TaC Process**:
1. Feed the long context into the reasoning model
2. Prompt it to generate thinking traces (e.g., a step-by-step analysis)
3. Use the thinking traces as the compressed context
4. Run inference for downstream tasks on the compressed context

Even without additional modules or training, this basic version already outperforms most dedicated compression methods.
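The four steps above can be sketched as a two-call pipeline. This is a minimal illustration, not the authors' implementation: the prompt wording, the `generate` callable (a stand-in for any reasoning-model call), and the choice to compress without seeing the question are all assumptions, and whitespace word counts stand in for real tokenizer counts.

```python
from typing import Callable, Tuple

# Hypothetical prompt wording; the article does not give the actual prompts.
THINK_PROMPT = (
    "Read the context below and think step by step about the key facts "
    "and how they relate.\n\nContext:\n{context}\n\nThinking:"
)
ANSWER_PROMPT = "Notes:\n{trace}\n\nQuestion: {question}\nAnswer:"


def tac_answer(
    context: str, question: str, generate: Callable[[str], str]
) -> Tuple[str, float]:
    """Compress `context` into a thinking trace, then answer from the trace only."""
    # Steps 1-2: feed the long context in and ask for a thinking trace.
    trace = generate(THINK_PROMPT.format(context=context))
    # Steps 3-4: the trace replaces the context for the downstream task.
    answer = generate(ANSWER_PROMPT.format(trace=trace, question=question))
    # Achieved compression ratio, measured in whitespace-separated words.
    ratio = len(context.split()) / max(len(trace.split()), 1)
    return answer, ratio


def fake_generate(prompt: str) -> str:
    """Stub model so the sketch runs offline; replace with a real model call."""
    if prompt.startswith("Read the context"):
        return "Alice founded Acme in 1999."
    return "1999"


if __name__ == "__main__":
    long_context = "Alice founded Acme in 1999. " + "Unrelated filler text. " * 50
    answer, ratio = tac_answer(long_context, "When was Acme founded?", fake_generate)
    print(answer, round(ratio, 1))
```

Note that nothing in this sketch limits the length of the trace, so the achieved ratio depends entirely on what the model chooses to write.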

## Enhanced Paradigm: Constrained Optimization Framework of TaC-C

Basic TaC has two weaknesses: the length of the thinking trace is hard to keep within a budget, and the model tends to produce superficial thinking. **TaC-C (Constrained)** is proposed to address both:
- Design a reward function that encourages compact, information-rich thinking
- Optimize the thinking-generation policy through reinforcement learning
- Balance a controllable compression ratio against high information density
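The article does not give the actual reward used by TaC-C. As a purely hypothetical sketch of what "compact and information-rich" could look like as a scalar reward, the function below subtracts a budget-overshoot penalty from a downstream task score. The names `token_budget` and `tac_c_reward`, the linear penalty, and `penalty_weight` are all assumptions made for illustration.

```python
def token_budget(context_tokens: int, compression_ratio: float) -> int:
    """Trace length allowed for a target compression ratio (e.g. 4 or 8)."""
    if context_tokens <= 0 or compression_ratio < 1:
        raise ValueError("need context_tokens > 0 and compression_ratio >= 1")
    return max(1, int(context_tokens / compression_ratio))


def tac_c_reward(
    task_score: float,
    trace_tokens: int,
    budget_tokens: int,
    penalty_weight: float = 1.0,
) -> float:
    """Hypothetical reward: answer quality minus a penalty for exceeding the budget.

    task_score     -- downstream quality in [0, 1], e.g. answer F1 (assumed)
    trace_tokens   -- length of the generated thinking trace
    budget_tokens  -- allowed length, e.g. from token_budget()
    penalty_weight -- assumed hyperparameter trading quality against length
    """
    if budget_tokens <= 0:
        raise ValueError("budget_tokens must be positive")
    # Fraction by which the trace exceeds its budget; zero when within budget.
    overshoot = max(0, trace_tokens - budget_tokens) / budget_tokens
    return task_score - penalty_weight * overshoot


if __name__ == "__main__":
    budget = token_budget(context_tokens=8000, compression_ratio=8)
    print(budget)
    print(tac_c_reward(0.8, trace_tokens=900, budget_tokens=budget))
    print(tac_c_reward(0.8, trace_tokens=1500, budget_tokens=budget))
```

In this sketch a trace that stays within budget is rewarded only for answer quality, so shallow traces that lose information score poorly, while a trace that overshoots pays a penalty proportional to the excess.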

## Experimental Validation: Significant Performance Advantages of TaC and TaC-C

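Results on four long-context QA benchmarks, reported as the improvement of TaC-C over the strongest baseline:

| Compression Ratio | F1 Improvement | Exact Match (EM) Improvement |
|-------------------|----------------|------------------------------|
| 4x                | +17.4%         | +15.7%                       |
| 8x                | +23.4%         | +21.7%                       |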
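For readers unfamiliar with the two metrics in the table, the sketch below shows the usual way token-level F1 and exact match (EM) are computed for QA answers. It is a simplified version that only lowercases and splits on whitespace; the benchmarks' exact answer normalization is not stated in the article, so treat the details as assumptions.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the answers match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Paris", " paris "))
    print(token_f1("the cat sat", "the cat"))
```

EM gives credit only for an exact answer, while F1 gives partial credit for overlapping tokens, which is why the two columns differ.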

**Key Findings**:
1. Basic TaC is competitive without any dedicated training
2. TaC-C costs far less to train than end-to-end compression models
3. Thinking traces are structured, so they preserve key logical relationships

The advantage also widens as the compression ratio increases: the F1 improvement rises from +17.4% at 4x to +23.4% at 8x.

## Technical Significance and Application Prospects

**Theoretical Implications**: Understanding a text is itself an effective form of information encoding, which suggests that comprehension and compression are closely linked in intelligent systems.

**Practical Value**:
- Plug-and-play: basic TaC requires no changes to the model architecture and no additional training
- Cost-effectiveness: it avoids the extra computation of a dedicated compression module
- Interpretability: thinking traces are readable text, so they are easier to verify than black-box vectors

**Potential Expansions**: Future work could explore the impact of different reasoning strategies, task-adapted prompts, and joint optimization of compression and reasoning.

## Limitations and Future Research Directions

**Limitations**:
1. Results depend on the quality of the base reasoning model; weak models tend to produce low-quality compressions
2. Some tasks (e.g., precise numerical calculation) are hard to compress through natural-language thinking
3. Generating thinking traces adds computational overhead of its own, so the savings must be weighed against that cost
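The trade-off in the last limitation can be illustrated with a deliberately crude cost model. The sketch below assumes cost is linear in the number of tokens processed, that the trace is generated once and then reused across queries, and that a generated token costs `decode_cost_factor` times as much as a token that is only read. None of these assumptions or numbers come from the article.

```python
import math


def break_even_queries(
    context_tokens: int, trace_tokens: int, decode_cost_factor: float = 1.0
) -> int:
    """Queries needed before one-off compression pays for itself.

    Assumed model: the compression pass reads the full context and generates
    the trace once; afterwards each query reads the trace instead of the context.
    """
    if not 0 < trace_tokens < context_tokens:
        raise ValueError("need 0 < trace_tokens < context_tokens")
    one_off_cost = context_tokens + decode_cost_factor * trace_tokens
    saving_per_query = context_tokens - trace_tokens
    return math.ceil(one_off_cost / saving_per_query)


if __name__ == "__main__":
    # 8x compression of an 8000-token context, generation as cheap as reading.
    print(break_even_queries(8000, 1000))
    # 4x compression, generated tokens assumed three times as costly.
    print(break_even_queries(8000, 2000, decode_cost_factor=3.0))
```

Under this model, compression is most attractive when the same context is queried several times; for a single query, the overhead of producing the trace can outweigh the saving.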

**Conclusion**: TaC re-examines how an LLM's capabilities relate to one another, treating reasoning itself as a means of compression. It suggests that future model designs can draw more on capabilities the model already has rather than stacking on dedicated modules.