Section 01
Introduction: Single-Sample Unsupervised Calibration Gives Large Reasoning Models "Self-Awareness"
This paper proposes a confidence calibration method for reasoning LLMs that requires neither labeled data nor repeated sampling at inference time. It trains a lightweight confidence predictor through offline self-consistency distillation, which significantly improves how reliably the model's confidence reflects whether its answer is correct. Existing calibration techniques either depend on labeled data or add inference overhead; by avoiding both, the method supports deployment in high-risk scenarios.
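To make the idea of offline self-consistency distillation concrete, here is a minimal sketch in Python. It is not the paper's implementation. It assumes that, offline, several answers are sampled per question and that the fraction agreeing with the majority answer serves as an unlabeled soft target; a small logistic model is then fit to those targets so that a single forward pass can produce a confidence at inference time. The function names, the one-dimensional feature vectors, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def self_consistency_target(answers):
    """Fraction of sampled answers that agree with the majority answer.

    Assumption: this agreement rate is used as the soft training target.
    """
    counts = Counter(answers)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(answers)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_predictor(features, targets, lr=0.5, epochs=2000):
    """Fit a logistic model to soft targets by gradient descent on cross-entropy."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, t in zip(features, targets):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - t  # gradient of cross-entropy w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_confidence(w, b, x):
    """Single-pass confidence estimate; no repeated sampling needed here."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Offline stage: hypothetical sampled answers for three unlabeled questions.
offline_samples = [
    ["42", "42", "42", "42"],
    ["7", "8", "7", "9"],
    ["a", "b", "c", "d"],
]
targets = [self_consistency_target(s) for s in offline_samples]

# Hypothetical per-question features (e.g. a scalar derived from the model).
features = [[1.0], [0.4], [0.1]]
w, b = train_predictor(features, targets)

high = predict_confidence(w, b, [1.0])
low = predict_confidence(w, b, [0.1])
```

The expensive sampling happens only while building `targets`; once `w` and `b` are fit, `predict_confidence` needs just the features from one generation, which is the property the summary attributes to the method.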