Section 01
Introduction / Main Post: One-shot Unsupervised Calibration: Teaching Reasoning Large Models 'Self-Awareness'
This paper proposes a confidence calibration method for reasoning LLMs that requires neither labeled data nor repeated sampling. It trains a lightweight confidence predictor through offline self-consistency distillation, and the authors report that this significantly improves model reliability.
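To make the idea of offline self-consistency distillation concrete, here is a minimal sketch, not the paper's implementation. It assumes that answers are sampled several times offline, that the agreement rate with the model's single answer serves as a label-free soft target, and that the lightweight predictor is a plain logistic regression over some per-response feature vector. The function names, the feature choice, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def self_consistency_target(sampled_answers, answer):
    """Fraction of offline samples that agree with `answer` (soft target in [0, 1])."""
    counts = Counter(sampled_answers)
    return counts[answer] / len(sampled_answers)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_confidence_predictor(features, targets, lr=0.5, epochs=2000):
    """Fit logistic regression to soft targets by gradient descent on cross-entropy."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, t in zip(features, targets):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - t  # gradient of cross-entropy with respect to the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_confidence(w, b, x):
    """Single forward pass: no extra sampling is needed at prediction time."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy offline data (hypothetical): one scalar feature per question,
# plus repeated samples used only to build the training targets.
offline = [
    ([1.0], "42", ["42", "42", "42", "41"]),
    ([-1.0], "7", ["7", "9", "3", "8"]),
]
features = [x for x, _, _ in offline]
targets = [self_consistency_target(samples, ans) for _, ans, samples in offline]
w, b = train_confidence_predictor(features, targets)
print(predict_confidence(w, b, [1.0]), predict_confidence(w, b, [-1.0]))
```

In this sketch the repeated samples are only used to build training targets, so once the predictor is fitted, a confidence estimate costs one forward pass over the features of a single response. A real system would presumably use richer features (for example hidden states or token log-probabilities), which is also an assumption here.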