Section 01
[Introduction] How Evolutionary Coding Agents Actually Search: Key Findings from EvoTrace and EvoReplay
Using the EvoTrace dataset and the EvoReplay method, this article presents the first systematic analysis of the evolutionary code generation process and reports three key findings: (1) most performance gains come from small tuning edits, such as adjusting constants, rather than from new algorithmic structures; (2) approximately 30% of code lines reintroduce content that was deleted earlier; and (3) some high-scoring solutions overfit to the evaluator. These findings challenge the validity of traditional benchmark evaluation and call for a shift toward process-oriented, diagnostic assessment.
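To make the second finding concrete, here is a minimal sketch of how a reintroduction rate could be measured over one lineage of program versions. It is not the EvoTrace or EvoReplay implementation: the function name `reintroduction_rate`, the input format (a list of source strings in evolution order), and the choice to match lines by exact text after stripping whitespace are all assumptions made for illustration.

```python
def reintroduction_rate(versions):
    """Fraction of added lines that match a line deleted at an earlier step.

    Hypothetical helper: `versions` is a list of source strings ordered
    from oldest to newest. Lines are compared as stripped, exact text and
    treated as a set per version, so duplicate lines count once.
    """
    deleted_before = set()  # lines removed at some earlier step
    added_total = 0         # lines added across all steps
    reintroduced = 0        # added lines that were deleted earlier
    prev = set()

    for i, src in enumerate(versions):
        cur = {ln.strip() for ln in src.splitlines() if ln.strip()}
        if i > 0:
            added = cur - prev
            added_total += len(added)
            reintroduced += len(added & deleted_before)
            deleted_before |= prev - cur  # record this step's deletions
        prev = cur

    return reintroduced / added_total if added_total else 0.0


# Toy lineage: "b = 2" is deleted in step 1 and comes back in step 2.
lineage = [
    "a = 1\nb = 2",
    "a = 1\nc = 3",
    "a = 1\nb = 2\nc = 3",
]
rate = reintroduction_rate(lineage)
```

In the toy lineage, two lines are added overall and one of them restores a previously deleted line, so `rate` is 0.5. A real analysis would likely need a more tolerant notion of line identity (for example, ignoring renamed variables), which this sketch deliberately leaves out.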