Zing Forum

Reading

Adam's Law: Text Frequency Law Reveals LLMs Prefer "Common Expressions", Rewriting Inputs Can Improve Performance

The study proposes the Text Frequency Law (TFL), finding that LLMs are more sensitive to high-frequency text expressions. A three-step framework of input rewriting, frequency distillation, and curriculum training proved effective on tasks such as mathematical reasoning and translation.

Tags: Text Frequency · Adam's Law · TFL · LLM Optimization · Prompt Engineering · Curriculum Learning · Input Rewriting · Frequency Distillation · TFPD
Published 2026-04-02 23:39 · Recent activity 2026-04-03 09:23 · Estimated read: 5 min

Section 01

Introduction: Key Points of Adam's Law and Text Frequency Law (TFL)

The study proposes the Text Frequency Law (TFL), revealing that LLMs are more sensitive to high-frequency text expressions. It builds a three-step optimization framework of input rewriting, frequency distillation, and curriculum training, which proved effective on tasks such as mathematical reasoning, machine translation, commonsense reasoning, and agent tool calling. The finding opens a new direction for LLM optimization.


Section 02

Background: The Neglected Factor of Text Frequency

LLM research typically focuses on factors such as architecture and data scale, while text frequency has long been overlooked. Psychology shows that humans read high-frequency words faster (the familiarity effect). The Adam's Law study asks whether LLMs exhibit a similar pattern, proposing TFL and building an optimization framework around it.


Section 03

Core Findings: TFL's Assertions and Frequency Estimation

The core claim of TFL: prompts and fine-tuning data for LLMs should favor high-frequency expressions, since expressions encountered more often during pre-training are understood more fully by the model. Because most training data is closed-source, the team estimated text frequency from online resources and addressed the resulting statistical challenges.
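The paper's estimator is not public, but the idea of scoring how "common" an expression is can be sketched with a smoothed unigram model over a reference corpus. Everything here (the scoring function, the toy corpus) is a hypothetical stand-in, not the authors' method:

```python
from collections import Counter
import math

def sentence_frequency_score(sentence, corpus_counts, total):
    """Proxy for how common a sentence is: mean log-probability of its
    tokens under an add-one-smoothed unigram model of a reference corpus."""
    tokens = sentence.lower().split()
    if not tokens:
        return float("-inf")
    vocab = len(corpus_counts)
    logps = [math.log((corpus_counts.get(t, 0) + 1) / (total + vocab))
             for t in tokens]
    return sum(logps) / len(logps)

# Toy reference corpus standing in for web-scale counts.
corpus = "the cat sat on the mat the dog sat on the rug".split()
counts = Counter(corpus)
total = len(corpus)

common = sentence_frequency_score("the cat sat", counts, total)
rare = sentence_frequency_score("feline perched thereupon", counts, total)
assert common > rare  # the everyday phrasing scores higher
```

In practice the reference counts would come from a large web corpus (the study mentions online resources), and the granularity could be n-grams or sentences rather than unigrams.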


Section 04

Three-Step Optimization Framework: From Theory to Practice

Three-step framework based on TFL:

  1. Input Rewriting: Convert input into semantically equivalent high-frequency expressions (no model modification needed);
  2. Frequency Distillation (TFD): Use LLM continuation to generate corpus for calibrating frequency estimation;
  3. Curriculum Training for Frequency Tuning (CTFT): Fine-tune from low to high frequency (drawing on curriculum learning).
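Step 1 can be pictured as a selection problem: among candidate paraphrases assumed to be semantically equivalent, keep the one a frequency estimator scores highest. The toy estimator and counts below are hypothetical stand-ins for the paper's components:

```python
from collections import Counter

def estimate_frequency(text, counts, total):
    """Toy unigram stand-in for the study's frequency estimator:
    average per-token corpus frequency of the text."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(counts.get(t, 0) for t in tokens) / (total * len(tokens))

def rewrite_input(paraphrases, counts, total):
    """Input Rewriting sketch: pick the highest-frequency paraphrase."""
    return max(paraphrases, key=lambda p: estimate_frequency(p, counts, total))

# Toy reference counts standing in for web-scale statistics.
counts = Counter("what is the sum of the numbers what is the total".split())
total = sum(counts.values())

candidates = [
    "compute the summation of said integers",
    "what is the sum of the numbers",
]
best = rewrite_input(candidates, counts, total)
# The plainer, more common phrasing wins.
assert best == "what is the sum of the numbers"
```

Note that this requires no change to the model itself, which is what makes input rewriting the lightest-weight step of the three.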

Section 05

Experimental Validation: Multi-Task Test Results

The team constructed the TFPD dataset and validated the framework on four tasks:

  • Mathematical reasoning: High-frequency rewriting improves problem understanding;
  • Machine translation: High-frequency expressions in the target language make output more natural;
  • Commonsense reasoning: Reduced ambiguity improves accuracy;
  • Tool calling: Clearer instructions enhance reliability.

All four tasks showed significant improvements.

Section 06

Technical Details: Key Implementation Considerations

Implementation considerations:

  • Frequency estimation granularity: Sentence level (balancing semantics and sparsity);
  • Rewriting quality: Maintain semantic equivalence (using similarity models or manual review);
  • Online resources: Choose corpora close to the distribution of training data (e.g., subsets of Common Crawl);
  • Curriculum strategy: Reasonably set frequency thresholds and training phases.
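The second consideration, semantic equivalence of rewrites, can be screened automatically before any manual review. A minimal sketch, using bag-of-words cosine similarity as a cheap stand-in for the similarity models the text mentions (the threshold value is an assumption):

```python
import math
from collections import Counter

def cosine_bow(a, b):
    """Bag-of-words cosine similarity between two sentences; a cheap
    stand-in for an embedding-based similarity model."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def accept_rewrite(original, rewrite, threshold=0.5):
    """Keep a rewrite only if it stays close to the original meaning."""
    return cosine_bow(original, rewrite) >= threshold

assert accept_rewrite("the cat sat on the mat", "the cat sat on a mat")
assert not accept_rewrite("the cat sat on the mat", "stock prices fell sharply")
```

A real pipeline would replace `cosine_bow` with a learned similarity model, since bag-of-words overlap misses paraphrases that share meaning but few words.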

Section 07

Application Insights: Practical Suggestions for LLM Optimization

Research insights:

  • Prompt engineering: Add the dimension of expression frequency, use common expressions to improve performance;
  • Data preprocessing: Filter/rearrange fine-tuning data by frequency (curriculum-style organization);
  • Automated input optimization: Deploy a rewriter as a general preprocessing module (applicable to scenarios such as chatbots).
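The data-preprocessing suggestion, ordering fine-tuning data by frequency, is mechanical once each example has a frequency score. A sketch assuming such scores exist (the scores and thresholds below are illustrative), following the CTFT direction of training from low to high frequency:

```python
def curriculum_order(examples, freq_score):
    """Sort fine-tuning examples from low to high estimated frequency,
    per the CTFT-style curriculum described in the framework."""
    return sorted(examples, key=freq_score)

def phase_buckets(examples, freq_score, thresholds):
    """Split examples into training phases by ascending frequency
    thresholds; one extra bucket holds everything above the last."""
    phases = [[] for _ in range(len(thresholds) + 1)]
    for ex in examples:
        i = sum(freq_score(ex) > t for t in thresholds)
        phases[i].append(ex)
    return phases

# Illustrative per-example frequency scores.
scores = {"rare phrasing": 0.1, "common phrasing": 0.9, "middling": 0.5}
ordered = curriculum_order(list(scores), scores.get)
phases = phase_buckets(list(scores), scores.get, thresholds=[0.3, 0.7])
assert ordered == ["rare phrasing", "middling", "common phrasing"]
assert phases == [["rare phrasing"], ["middling"], ["common phrasing"]]
```

Choosing the thresholds and the number of phases is exactly the "curriculum strategy" consideration flagged in the technical details above.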

Section 08

Limitations and Future Directions

Limitations and future directions:

  • Trade-off between frequency and quality: High-frequency expressions may sacrifice precision in specialized domains;
  • Cross-language applicability: Need to verify effectiveness in non-English languages;
  • Impact of model scale: Whether ultra-large models are still affected by frequency effects;
  • Technical combination: Synergistic effects with CoT, few-shot learning, etc.