# Adam's Law: Text Frequency Law Reveals LLMs Prefer "Common Expressions", Rewriting Inputs Can Improve Performance

> The study proposes the Text Frequency Law (TFL), which holds that LLMs perform better when inputs use high-frequency text expressions. A three-step framework of input rewriting, frequency distillation, and curriculum training was validated as effective on tasks such as mathematical reasoning and translation.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-02T15:39:25.000Z
- Last activity: 2026-04-03T01:23:23.163Z
- Popularity: 152.3
- Keywords: text frequency, Adam's Law, TFL, LLM optimization, prompt engineering, curriculum learning, input rewriting, frequency distillation, TFPD
- Page link: https://www.zingnex.cn/en/forum/thread/adam-s-law-llm
- Canonical: https://www.zingnex.cn/forum/thread/adam-s-law-llm
- Markdown source: floors_fallback

---

## Introduction: Key Points of Adam's Law and Text Frequency Law (TFL)

The study proposes the **Text Frequency Law (TFL)**: LLMs perform better on text written in high-frequency expressions. On this basis it builds a three-step optimization framework of **input rewriting, frequency distillation, and curriculum training**, which was validated as effective on mathematical reasoning, machine translation, commonsense reasoning, and agent tool calling. The finding opens a new direction for LLM optimization.

## Background: The Neglected Factor of Text Frequency

LLM research usually focuses on factors such as architecture and data scale, while **text frequency** has long been neglected. Psychological research shows that human readers process high-frequency words faster (the familiarity effect). The Adam's Law study asks whether LLMs show a similar pattern, proposes TFL as the answer, and builds an optimization framework around it.

## Core Findings: TFL's Assertions and Frequency Estimation

TFL's central claim is that prompts and fine-tuning data for LLMs should favor high-frequency expressions: the model encounters such expressions more often during pre-training and therefore understands them more fully. Because the training data of most LLMs is closed-source, the team estimated text frequency from **online resources** instead and had to address the statistical challenges this raises.
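A minimal sketch of what sentence-level frequency estimation from an external corpus could look like. The scoring rule used here (mean log relative word frequency with add-one smoothing) and the function names `build_word_counts` and `sentence_log_frequency` are assumptions for illustration; the study's actual estimator is not described in this summary, and the three-line corpus only stands in for real online resources.

```python
import math
import re
from collections import Counter

def build_word_counts(corpus_lines):
    """Count lowercase word tokens in a reference corpus (a stand-in for web text)."""
    counts = Counter()
    for line in corpus_lines:
        counts.update(re.findall(r"[a-z']+", line.lower()))
    return counts

def sentence_log_frequency(sentence, counts):
    """Mean log relative frequency of the words, with add-one smoothing (assumed rule)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return float("-inf")
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words
    log_probs = [math.log((counts[t] + 1) / (total + vocab)) for t in tokens]
    return sum(log_probs) / len(log_probs)

# Toy corpus, purely illustrative.
corpus = [
    "how many apples are left",
    "how many books are on the table",
    "the cat sat on the mat",
]
counts = build_word_counts(corpus)
common = sentence_log_frequency("how many apples are left", counts)
rare = sentence_log_frequency("what quantity of apples remains", counts)
print(common > rare)
```

Averaging over tokens keeps long and short sentences comparable; a summed log probability would penalize length rather than rarity.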

## Three-Step Optimization Framework: From Theory to Practice

The framework built on TFL has three steps:
1. **Input Rewriting**: Convert the input into a semantically equivalent expression with higher frequency (no model modification needed);
2. **Frequency Distillation (TFD)**: Use LLM continuations to generate a corpus that calibrates the frequency estimates;
3. **Curriculum Training for Frequency Tuning (CTFT)**: Fine-tune on data ordered from low to high frequency, following the idea of curriculum learning.
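The first two steps can be sketched roughly as follows. Everything here is an assumption made for illustration: the candidate rewrites are taken as given (in practice a paraphraser would produce them and a separate check would confirm semantic equivalence), `blend` interpolates web frequencies with frequencies from model-generated text as one simple way to "calibrate", and `select_rewrite` keeps the highest-scoring candidate. None of these names or rules come from the study.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def relative_freqs(lines):
    """Relative word frequencies over a list of text lines."""
    counts = Counter(tok for line in lines for tok in tokenize(line))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def blend(web, generated, alpha=0.5):
    """Step 2 stand-in: interpolate web and model-generated frequencies (assumed rule)."""
    words = set(web) | set(generated)
    return {w: (1 - alpha) * web.get(w, 0.0) + alpha * generated.get(w, 0.0) for w in words}

def score(sentence, freqs, floor=1e-6):
    """Mean log frequency of a sentence; unseen words get a small floor value."""
    tokens = tokenize(sentence)
    if not tokens:
        return float("-inf")
    return sum(math.log(max(freqs.get(t, 0.0), floor)) for t in tokens) / len(tokens)

def select_rewrite(candidates, freqs):
    """Step 1: keep the candidate with the highest estimated frequency."""
    return max(candidates, key=lambda c: score(c, freqs))

# Toy data: web text plus text assumed to come from LLM continuations.
web_lines = ["how much does it cost", "what is the price of the ticket"]
generated_lines = ["how much is the ticket", "how much does the ticket cost"]
freqs = blend(relative_freqs(web_lines), relative_freqs(generated_lines))

candidates = [
    "how much does the ticket cost",
    "what sum is levied for the ticket",
]
best = select_rewrite(candidates, freqs)
print(best)
```

The selection step only ranks candidates by frequency, so it is only as safe as the equivalence check that produced them.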

## Experimental Validation: Multi-Task Test Results

The team constructed the **TFPD dataset** and validated the framework on four tasks:
- Mathematical reasoning: High-frequency rewriting improves problem understanding;
- Translation: High-frequency expressions in the target language make the output more idiomatic;
- Commonsense reasoning: Rewriting reduces ambiguity and improves accuracy;
- Tool calling: Clearer instructions make calls more reliable.

According to the study, all four tasks improved significantly.

## Technical Details: Key Implementation Considerations

Four points matter when implementing the framework:
- Frequency estimation granularity: Estimate at the sentence level, which balances semantic completeness against data sparsity;
- Rewriting quality: The rewrite must stay semantically equivalent to the original, checked with a similarity model or manual review;
- Online resources: Choose corpora whose distribution is close to the model's training data (e.g., subsets of Common Crawl);
- Curriculum strategy: Set the frequency thresholds and the number of training phases deliberately.
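As a rough illustration of the curriculum strategy, the sketch below splits scored examples into phases ordered from low to high frequency. The function name `build_curriculum`, the threshold values, and the frequency scores attached to each example are all made up for this example; real scores would come from whatever frequency estimator is in use.

```python
def build_curriculum(examples, thresholds):
    """Split (text, frequency_score) pairs into phases ordered from low to high frequency.

    An example lands in phase k, where k is the number of thresholds
    that its score meets or exceeds.
    """
    phases = [[] for _ in range(len(thresholds) + 1)]
    # Sorting first keeps examples inside each phase in ascending frequency too.
    for text, freq in sorted(examples, key=lambda pair: pair[1]):
        index = sum(freq >= t for t in thresholds)
        phases[index].append(text)
    return phases

# Illustrative scores (e.g. mean log frequency); not measured values.
examples = [
    ("Compute the aggregate of the integers", -9.1),
    ("Add the numbers", -3.2),
    ("Find the total of the values", -5.4),
    ("What is the sum", -2.8),
]
phases = build_curriculum(examples, thresholds=[-6.0, -3.0])
for number, phase in enumerate(phases, start=1):
    print(number, phase)
```

Fine-tuning would then iterate over `phases` in order. Fixed thresholds are easy to reason about, while quantile-based cut points would give more evenly sized phases.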

## Application Insights: Practical Suggestions for LLM Optimization

The study suggests three practical uses:
- Prompt engineering: Treat expression frequency as an additional dimension and prefer common expressions to improve performance;
- Data preprocessing: Filter or reorder fine-tuning data by frequency, organizing it curriculum-style;
- Input optimization automation: Deploy a rewriter as a general preprocessing module, for example in front of a chatbot.
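A minimal sketch of a rewriter used as a preprocessing module, assuming the simplest possible rewriter: a lookup table that swaps less common phrases for common ones. `PhraseRewriter`, `with_rewriter`, the substitution table, and the echo model are all hypothetical; a production rewriter would more likely be a paraphrasing model with a semantic-equivalence check.

```python
import re

class PhraseRewriter:
    """Replace less common phrases with more common equivalents from a lookup table."""

    def __init__(self, substitutions):
        # Longer phrases first so that overlapping entries resolve predictably.
        ordered = sorted(substitutions.items(), key=lambda kv: -len(kv[0]))
        self._rules = [
            (re.compile(r"\b" + re.escape(rare) + r"\b", re.IGNORECASE), common)
            for rare, common in ordered
        ]

    def __call__(self, prompt):
        for pattern, common in self._rules:
            # A callable replacement avoids backslash interpretation in `common`.
            prompt = pattern.sub(lambda _match: common, prompt)
        return prompt

def with_rewriter(model_fn, rewriter):
    """Wrap a model callable so every prompt is rewritten before it is sent."""
    def wrapped(prompt):
        return model_fn(rewriter(prompt))
    return wrapped

# Hypothetical substitution table and a stand-in model that echoes its input.
rewriter = PhraseRewriter({"in the event that": "if", "utilize": "use"})

def echo_model(prompt):
    return f"MODEL SAW: {prompt}"

chat = with_rewriter(echo_model, rewriter)
result = chat("Utilize the cache in the event that the server is slow.")
print(result)
```

Because the wrapper only needs a callable, the same rewriter can sit in front of any model client without changing the model itself. Note that this toy version does not preserve capitalization.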

## Limitations and Future Directions

Open questions remain in four areas:
- Trade-off between frequency and quality: High-frequency expressions may lose the precision of specialist terminology;
- Cross-language applicability: Effectiveness in non-English languages still needs to be verified;
- Impact of model scale: It is unclear whether ultra-large models are still subject to frequency effects;
- Technical combination: Possible synergies with CoT, few-shot learning, and similar techniques are unexplored.
