Zing Forum

One-shot Unsupervised Calibration: Teaching Reasoning Large Models 'Self-Awareness'

This paper proposes a confidence calibration method for reasoning LLMs that requires no labeled data or repeated sampling. By training a lightweight confidence predictor via offline self-consistency distillation, it significantly improves model reliability.

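Confidence calibration, unsupervised learning, self-consistency, reasoning models, single-sample inference, distributional robustness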
Published 2026-04-21 21:25 | Recent activity 2026-04-22 10:20 | Estimated read 1 min

Section 01

Introduction / Main Post: One-shot Unsupervised Calibration: Teaching Reasoning Large Models 'Self-Awareness'

The paper proposes a confidence calibration method for reasoning LLMs that needs neither labeled data nor repeated sampling at inference time. Self-consistency signals are collected offline and distilled into a lightweight confidence predictor, so a single generation is enough to produce a confidence estimate, which the authors report significantly improves model reliability.
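
To make the idea of offline self-consistency distillation concrete, here is a minimal sketch. It is not the paper's implementation: the single feature (mean token log-probability of one generation), the logistic-regression predictor, and all function names are assumptions chosen for illustration. The agreement rate among answers sampled offline serves as a label-free soft target, and the small predictor learns to reproduce it from one generation.

```python
import math
from collections import Counter


def self_consistency_target(sampled_answers, scored_answer):
    """Fraction of offline samples that agree with the answer being scored."""
    counts = Counter(sampled_answers)
    return counts[scored_answer] / len(sampled_answers)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_predictor(features, targets, lr=0.5, epochs=2000):
    """Fit logistic regression to soft targets with batch gradient descent."""
    dim = len(features[0])
    w, b, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(features, targets):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - t  # gradient of cross-entropy w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_confidence(w, b, x):
    """Confidence for a single generation; no resampling is needed here."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Offline phase (toy data): several samples per question, no gold labels.
offline = [
    {"samples": ["4", "4", "4", "5"], "first": "4", "feature": [-0.1]},
    {"samples": ["7", "9", "8", "7", "3"], "first": "7", "feature": [-2.0]},
]
targets = [self_consistency_target(q["samples"], q["first"]) for q in offline]
w, b = train_predictor([q["feature"] for q in offline], targets)

# Inference phase: one generation, one feature vector, one confidence score.
confidence = predict_confidence(w, b, [-0.3])
```

The costly multi-sample step happens only once, offline. At inference the predictor sees a single generation, which is what "one-shot" refers to in the title.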