# Activation Replay: A New Method to Enhance Multimodal Large Model Reasoning Capabilities Without Training

> The team from the National University of Singapore proposed Activation Replay, a technique that manipulates visual tokens at test time to replay low-entropy activations from the base model into the RLVR-trained model. Without any additional policy-optimization training, it achieves significant gains on mathematical reasoning, visual-agent, and video-reasoning tasks.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-07T08:41:39.000Z
- Last activity: 2026-05-07T08:47:32.570Z
- Popularity: 139.9
- Keywords: multimodal large models, reasoning capability, Activation Replay, RLVR, CVPR 2026, training-free method, activation replay
- Page link: https://www.zingnex.cn/en/forum/thread/activation-replay
- Canonical: https://www.zingnex.cn/forum/thread/activation-replay
- Markdown source: floors_fallback

---

## Introduction: A Training-Free Route to Stronger Multimodal Reasoning

A team from the National University of Singapore has proposed Activation Replay, a test-time technique that manipulates visual tokens to replay low-entropy activations from the base model into its RLVR-trained counterpart. The method yields significant improvements on mathematical reasoning, visual-agent, and video-reasoning tasks without any additional policy-optimization training, opening a new path for enhancing the reasoning capabilities of large multimodal models.

## Research Background: Exploration of RLVR Mechanisms and Key Findings on Low-Entropy Activations

In recent years, Reinforcement Learning with Verifiable Rewards (RLVR) has proven effective at enhancing the reasoning capabilities of large multimodal models (LMMs), yet its internal mechanism remains unclear. Using logit-lens analysis, the NUS team found that RLVR shifts the distribution of low-entropy activations while leaving high-entropy activations relatively stable. This shift in low-entropy activations correlates closely with the improvement in reasoning capability, which pointed the way for the method's design.
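The logit-lens analysis mentioned above can be sketched as follows: project an intermediate hidden state through the model's unembedding matrix and measure the entropy of the induced token distribution. A minimal NumPy sketch, with all names and dimensions illustrative rather than taken from the paper's code:

```python
import numpy as np

def logit_lens_entropy(hidden_state, unembedding, eps=1e-12):
    """Entropy (in nats) of the distribution obtained by projecting an
    intermediate hidden state through the unembedding matrix."""
    logits = hidden_state @ unembedding.T        # (vocab_size,)
    logits = logits - logits.max()               # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-(probs * np.log(probs + eps)).sum())

# Toy check: a state aligned with one vocabulary row is low-entropy,
# while a near-zero state induces a near-uniform (high-entropy) distribution.
rng = np.random.default_rng(0)
U = rng.normal(size=(1000, 64))      # unembedding: vocab_size x hidden_dim
peaked = 5.0 * U[42]
diffuse = 0.01 * rng.normal(size=64)
low, high = logit_lens_entropy(peaked, U), logit_lens_entropy(diffuse, U)
assert low < high
```

In this view, "low-entropy activations" are positions where the intermediate representation already commits strongly to a small set of tokens, which is where the paper reports RLVR concentrating its changes.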

## Method Details: Principles and Technical Implementation of Activation Replay

Activation Replay is a training-free method: at test time it manipulates visual tokens to replay low-entropy activations from the base model (i.e., the model before RLVR training) into the RLVR-trained model. The procedure extracts low-entropy activations from the base model and injects them during inference; no policy optimization is required. Ablation experiments verify that replaying low-entropy activations outperforms replaying high-entropy ones, and that intervening on the input tokens is a simpler and more effective mechanism.
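The summary does not spell out the exact injection mechanism, so the following is only a schematic sketch of the described pipeline: run the base model on the visual tokens, flag positions whose logit-lens entropy falls below a threshold, and substitute the base model's activations at those positions into the RLVR model's stream. All function names and the threshold `tau` are illustrative assumptions, not the released implementation:

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(probs, eps=1e-12):
    return -(probs * np.log(probs + eps)).sum(axis=-1)

def activation_replay(visual_tokens, base_forward, rlvr_forward,
                      unembedding, tau):
    """Schematic: at visual-token positions where the base model's
    activation is low-entropy (logit-lens entropy < tau), replay the
    base activation into the RLVR model's stream; keep the RLVR
    activation elsewhere. No gradient updates are involved."""
    base_acts = base_forward(visual_tokens)            # (n_tokens, hidden)
    rlvr_acts = rlvr_forward(visual_tokens)
    ent = entropy(softmax(base_acts @ unembedding.T))  # per-token entropy
    mask = ent < tau                                   # low-entropy positions
    mixed = np.where(mask[:, None], base_acts, rlvr_acts)
    return mixed, mask

# Toy demo with stand-in "models" (identity vs. shifted activations).
rng = np.random.default_rng(1)
U = rng.normal(size=(100, 16))
tokens = rng.normal(size=(8, 16))
mixed, mask = activation_replay(tokens, lambda x: x, lambda x: x + 0.5,
                                U, tau=4.0)
```

The design point worth noting is that the intervention is purely test-time: both forward passes use frozen weights, so the method composes with any already-trained RLVR checkpoint.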

## Experimental Evidence: Performance of Activation Replay in Multi-Task Scenarios

Activation Replay shows significant effects across multiple tasks:

1. Mathematical reasoning: improves accuracy on complex problem solving;
2. O3-like visual agents: improves decision quality in complex environments;
3. Video reasoning: strengthens the capture of temporal logic and causal relationships;
4. Metrics: Pass@K increases significantly, alleviating the narrowed reasoning coverage caused by RLVR.
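On the Pass@K point: the summary does not specify the evaluation harness, but Pass@K is conventionally computed with the unbiased estimator from the code-generation evaluation literature (Chen et al., 2021), which estimates the probability that at least one of `k` samples drawn from `n` generated answers (of which `c` are correct) is correct:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n-c, k) / C(n, k), the probability that a
    random size-k subset of n samples contains at least one of the
    c correct ones."""
    if n - c < k:          # fewer incorrect samples than k -> always pass
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 3 correct answers out of 10 samples:
print(round(pass_at_k(10, 3, 1), 4))   # 0.3
print(round(pass_at_k(10, 3, 5), 4))   # 0.9167
```

A Pass@K gain at larger K is the signal behind the "narrowed reasoning coverage" claim: RLVR alone tends to concentrate probability mass on fewer solution paths, and replaying base-model activations appears to recover some of that diversity.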

## Conclusions and Advantages: Value and Application Prospects of Activation Replay

The advantages of Activation Replay include: it is training-free (reducing deployment cost), general (applicable across models and tasks), plug-and-play (easy to integrate into existing pipelines), and interpretable (grounded in an understanding of activation patterns). Beyond improving performance, the method offers a new lens for understanding the reasoning mechanisms of LMMs and is well positioned to play a role in multimodal AI applications.

## Open Source Resources: Code Release and Community Impact

The research team has open-sourced the Activation Replay code on GitHub (latentcraft/replay), providing a valuable resource for further research in academia and industry and marking notable progress in test-time reasoning optimization for multimodal models.
