Section 01
[Introduction] Key Findings of a Cross-Format Transfer Study on Error-Awareness Detection in Large Language Models
This article addresses one question: can large language models recognize their own errors? Researchers built a low-cost error-awareness detector based on probability distributions, but cross-format transfer tests showed that the detector does not truly understand errors; instead, it overfits to surface features of the dataset. This finding matters both for assessing the reliability of LLMs and for AI safety.