Section 01
【Main Floor/Introduction】Overview: Performance Degradation of Top-Tier Reasoning Models Under Filled Context Windows
This study runs controlled experiments on top-tier reasoning models from four major vendors: Anthropic, OpenAI, Google, and DeepSeek. It finds that when the context window is filled with adjacent but irrelevant information, every model degrades in accuracy, even at its maximum thinking setting. The experiments focus on financial-analysis tasks: a five-arm controlled design characterizes each model's drift behavior, and the results are discussed in terms of their implications for AI applications such as RAG systems.
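To make the setup concrete, here is a minimal sketch of what a five-arm context-fill experiment could look like. Everything in it is an assumption for illustration: the arm names, the fill ratios, the filler text, and the placeholder question are hypothetical, not the study's actual design.

```python
# Hypothetical sketch of a five-arm context-fill experiment.
# Arm names, fill ratios, and the placeholder task are assumptions,
# not taken from the study itself.

QUESTION = "What was the company's Q3 operating margin?"  # placeholder task
FILLER = "Unrelated but plausible financial commentary. "  # assumed filler


def build_prompt(question: str, filler: str, fill_ratio: float,
                 window_chars: int = 8000) -> str:
    """Pad the prompt with irrelevant filler up to fill_ratio of the window."""
    pad_len = max(0, int(window_chars * fill_ratio) - len(question))
    # Repeat the filler until it covers pad_len characters, then truncate.
    padding = (filler * (pad_len // len(filler) + 1))[:pad_len]
    return padding + "\n\n" + question


# Five hypothetical arms: a clean baseline plus four increasing fill levels.
ARMS = {
    "baseline": 0.0,
    "fill_25": 0.25,
    "fill_50": 0.50,
    "fill_75": 0.75,
    "fill_95": 0.95,
}

prompts = {name: build_prompt(QUESTION, FILLER, ratio)
           for name, ratio in ARMS.items()}
```

Each arm's prompt would then be sent to every model under test, and accuracy compared across arms to measure drift as the window fills.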