Zing Forum

DRIFT: A Dual-Model Framework for Long-Context Reasoning Based on Implicit Fact Tokens

DRIFT decouples reading and reasoning, preventing the reasoning model from directly processing raw long-context inputs. Instead, it provides a knowledge representation specifically designed for reasoning, achieving excellent performance and significant context compression across multiple long-context benchmarks.

Keywords: long-context reasoning, context compression, dual-model framework, implicit fact tokens, efficient inference, large language models, reading-reasoning decoupling
Published 2026-04-20 20:04 · Recent activity 2026-04-20 20:21 · Estimated read: 7 min

Section 01

Core Introduction to the DRIFT Framework

This article introduces DRIFT (Dual-Model Framework for Long-Context Reasoning Based on Implicit Fact Tokens), whose core idea is to decouple reading from reasoning. It provides the reasoning model with a compact knowledge representation via implicit fact tokens, achieving excellent performance and significant context compression on long-context benchmarks. Keywords: long-context reasoning, context compression, dual-model framework, implicit fact tokens, etc.


Section 02

Challenges of Long-Context Reasoning and Traditional Solutions

Challenges of Long-Context Reasoning

When processing long contexts, large language models face quadratic growth in computational and memory cost, and they struggle to locate key information, which degrades reasoning quality. Traditional solutions:

  • Retrieval-Augmented Generation (RAG): Prone to losing global context
  • Context Compression: May lose important details
  • Chunk Processing: Breaks context coherence
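
The quadratic growth mentioned above can be illustrated with a quick back-of-the-envelope calculation. The numbers below cover only the n × n attention score matrix (fp16, one head, one layer) and are illustrative, not figures from the paper:

```python
# Self-attention materializes an n x n score matrix, so memory for the
# scores alone grows quadratically with sequence length n.

def attention_score_bytes(n_tokens: int, bytes_per_elem: int = 2) -> int:
    """Memory for one n x n attention score matrix (fp16 by default)."""
    return n_tokens * n_tokens * bytes_per_elem

for n in (1_000, 10_000, 100_000):
    gib = attention_score_bytes(n) / 2**30
    print(f"{n:>7} tokens -> {gib:.3f} GiB per head per layer")
```

A 10x longer context costs 100x the score-matrix memory, which is why naive full-context processing becomes prohibitive.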

Section 03

Core Methods and Dual-Model Architecture of DRIFT

Core Idea and Architecture of DRIFT

DRIFT proposes a reading-reasoning decoupling paradigm, preventing the reasoning model from directly processing raw long contexts and providing a knowledge representation specifically designed for reasoning.

Dual-Model Architecture

  1. Reading Model: Processes the raw long context and extracts key facts, encoding them into implicit fact tokens (compact while retaining core information).
  2. Reasoning Model: Processes only the compressed implicit fact tokens, focusing on logical reasoning without searching through the lengthy context.
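
The two stages above can be sketched as a minimal pipeline. The real DRIFT model interfaces are not shown in this article, so the function names, shapes, the number of fact tokens, and the mean-pooling compressor below are all placeholder assumptions standing in for the learned encoder:

```python
# Hypothetical sketch of a read-then-reason pipeline: a reader compresses
# a long token sequence into a small fixed set of "fact tokens", and a
# reasoner attends only over that compact representation.
import numpy as np

def reading_model(context_tokens: np.ndarray, n_fact_tokens: int = 64) -> np.ndarray:
    """Stand-in reader: compress (n, d) context embeddings into
    (n_fact_tokens, d) fact tokens via chunked mean-pooling
    (a placeholder for DRIFT's learned encoder)."""
    chunks = np.array_split(context_tokens, n_fact_tokens)
    return np.stack([c.mean(axis=0) for c in chunks])

def reasoning_model(fact_tokens: np.ndarray, question_emb: np.ndarray) -> np.ndarray:
    """Stand-in reasoner: softmax-attend over the compact fact tokens only."""
    scores = fact_tokens @ question_emb
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ fact_tokens  # answer representation

rng = np.random.default_rng(0)
d = 128
long_context = rng.standard_normal((20_000, d))  # a 20k-token document
question = rng.standard_normal(d)

facts = reading_model(long_context)      # (64, d): roughly 300x fewer tokens
answer = reasoning_model(facts, question)
print(facts.shape, answer.shape)         # -> (64, 128) (128,)
```

The key point the sketch captures is the interface: the reasoner never sees the 20,000-token context, only the 64 fact tokens.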

Implicit Fact Token Design

Not a simple summary, but a knowledge representation optimized for downstream reasoning: captures key facts and relationships, removes redundancy, and maintains logical structure.

Reading-Reasoning Collaboration

The two models collaborate via implicit fact tokens: the reading model understands semantics, the reasoning model infers based on compressed representations, with optimized division of labor.


Section 04

Technical Advantages of DRIFT

Technical Advantages of DRIFT

Efficiency Improvement

Context compression reduces the number of tokens the reasoning model must process, lowering computational complexity and memory usage, speeding up inference, and cutting cost.
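
To see why compressing the reasoner's input helps, a small illustrative calculation (the compression ratio here is an assumption for illustration, not a figure reported for DRIFT):

```python
# If the reader compresses n context tokens by a factor r, the reasoner's
# quadratic attention cost drops by roughly r^2 relative to full context.

def relative_attention_cost(n_tokens: int, ratio: int) -> float:
    """Reasoner's attention cost on compressed input, relative to
    full-context attention (illustrative quadratic model)."""
    compressed = n_tokens // ratio
    return compressed**2 / n_tokens**2

print(relative_attention_cost(32_000, 16))  # -> 0.00390625, i.e. 1/256
```

So even a modest 16x token reduction shrinks the quadratic term by two orders of magnitude, on top of the memory saved by the shorter sequence itself.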

Performance Advantages

Outperforms full-context reasoning and existing compression methods on multiple long-context benchmarks, showing that implicit fact tokens effectively retain the key information needed for reasoning.

Interpretability

Separates the responsibilities of the reading and reasoning models; users can inspect implicit fact tokens to understand the basis for reasoning.


Section 05

Application Scenarios of DRIFT

Applicable Scenarios of DRIFT

  1. Document Q&A: Handles Q&A tasks for long documents like legal contracts and research papers.
  2. Multi-turn Dialogue: Efficiently uses context in scenarios with large amounts of dialogue history.
  3. Code Understanding: Analyzes large codebases to support code generation and defect detection.
  4. Knowledge Base Query: Retrieves and infers relevant information from large-scale knowledge bases.

Section 06

Project Progress and Resources of DRIFT

Project Progress and Resources

Phased Release Strategy

  • Phase 1: Core model architecture, reasoning scripts, processed training datasets, and data synthesis pipeline.
  • Phase 2: Pre-trained model weights of different scales.
  • Phase 3: Complete training scripts, distributed configurations, hyperparameters.

Released Resources


Section 07

Summary and Academic Contributions of DRIFT

Summary and Academic Contributions

DRIFT addresses the trade-off between efficiency and effectiveness in long-context reasoning through its dual-model architecture and implicit fact token mechanism, performing excellently on benchmarks and offering a reference for the design of long-text processing models. The paper has been published on arXiv (arXiv:2602.10021) by researchers from institutions including Fudan University and the Shanghai Artificial Intelligence Laboratory, providing new ideas for long-context reasoning.