Section 01
Introduction to the APORIA Framework: Focus on Rigorous Evaluation of LLM Metacognitive Capabilities
APORIA is a rigorous evaluation framework for the metacognitive capabilities of large language models (LLMs). At its core is a dynamic five-round interaction protocol designed to isolate confounding factors while assessing metacognitive abilities such as self-reflection and confidence calibration. The framework addresses a gap in current LLM evaluation, which largely neglects the metacognitive dimension, and it aims to support improvements in model reliability and safety.
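To make "confidence calibration" concrete, the sketch below computes expected calibration error (ECE), a common general-purpose calibration measure. It is an illustration only: this section does not say which calibration metric APORIA uses, and the function name `expected_calibration_error`, its parameters, and the equal-width binning are assumptions made for the example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Expected calibration error using equal-width confidence bins.

    Hypothetical helper for illustration; not part of APORIA itself.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Per bin: [sum of confidences, number of correct answers, count].
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += conf
        bins[idx][1] += int(ok)
        bins[idx][2] += 1

    total = len(confidences)
    ece = 0.0
    for conf_sum, n_correct, count in bins:
        if count == 0:
            continue  # Empty bins contribute nothing.
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (count / total) * abs(avg_conf - accuracy)
    return ece
```

A model that reports 0.9 confidence on four answers but gets only three right would show a gap between stated confidence (0.9) and observed accuracy (0.75). A score of zero means stated confidence matches observed accuracy in every populated bin.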