Section 01
MetaCog-Bench: A Benchmark for Evaluating & Enhancing LLM Metacognition
MetaCog-Bench is an open-source benchmark framework designed to assess and enhance the metacognitive abilities of large language models (LLMs). It targets three core mechanisms: intention attribution, self-monitoring, and intentionality anchoring. Together, these address LLMs' lack of self-reflection and cognitive regulation, capacities central to human intelligence. The framework marks a shift from performance-focused evaluation toward evaluating cognitive reliability, supporting the development of more dependable AI systems.
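To make the evaluation structure concrete, below is a minimal sketch of how per-task scores for the three mechanisms might be collected and aggregated into one score per mechanism. This is an illustrative assumption, not MetaCog-Bench's actual API; the names `Mechanism`, `TaskResult`, and `aggregate_scores` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    """Hypothetical labels for the three metacognitive mechanisms."""
    INTENTION_ATTRIBUTION = "intention_attribution"
    SELF_MONITORING = "self_monitoring"
    INTENTIONALITY_ANCHORING = "intentionality_anchoring"


@dataclass
class TaskResult:
    """One evaluated task: the mechanism it probes and a score in [0, 1]."""
    mechanism: Mechanism
    score: float


def aggregate_scores(results: list[TaskResult]) -> dict[Mechanism, float]:
    """Average per-task scores into a single score per mechanism."""
    buckets: dict[Mechanism, list[float]] = {m: [] for m in Mechanism}
    for result in results:
        buckets[result.mechanism].append(result.score)
    # Report only mechanisms that actually had tasks evaluated.
    return {m: sum(s) / len(s) for m, s in buckets.items() if s}


if __name__ == "__main__":
    demo = [
        TaskResult(Mechanism.SELF_MONITORING, 0.80),
        TaskResult(Mechanism.SELF_MONITORING, 0.60),
        TaskResult(Mechanism.INTENTION_ATTRIBUTION, 0.70),
    ]
    for mechanism, mean in aggregate_scores(demo).items():
        print(f"{mechanism.value}: {mean:.2f}")
```

Keeping the three mechanism scores separate, rather than collapsing them into a single number, means a regression in, say, self-monitoring cannot be masked by gains elsewhere.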