Section 01
[Introduction] Core Summary of Confidence Calibration Research for Quantized Large Language Models
This article walks through the uncertainty-aware-inference project and systematically analyzes how Post-Training Quantization (PTQ) affects the confidence calibration of large language models (LLMs) at different scales. The central finding is that quantization degrades calibration quality, and the degradation is more pronounced at lower precision, for larger models, and on generation tasks. The article also verifies that knowledge distillation can recover part of the lost calibration performance, and closes with practical guidance on quantization strategy selection, post-hoc calibration techniques, and monitoring and evaluation.
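For context, "calibration quality" in this line of work is commonly measured with the expected calibration error (ECE): predictions are grouped into confidence bins, and ECE is the population-weighted gap between each bin's average confidence and its empirical accuracy. The sketch below is a minimal NumPy illustration of that metric, not code from the uncertainty-aware-inference project; the function name, bin count, and toy data are illustrative assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Minimal ECE sketch (hypothetical helper, not project code).

    confidences: per-prediction confidence scores in [0, 1]
    correct:     boolean array, True where the prediction was right
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Assign each prediction to one confidence bin.
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        acc = correct[mask].mean()       # empirical accuracy in the bin
        conf = confidences[mask].mean()  # mean stated confidence in the bin
        ece += mask.mean() * abs(acc - conf)  # weight gap by bin population
    return ece

# Toy usage: an overconfident model states ~0.9 confidence but is right 60% of the time.
conf = np.array([0.95, 0.90, 0.92, 0.88, 0.97])
hit = np.array([True, False, True, False, True])
print(f"ECE = {expected_calibration_error(conf, hit):.3f}")
```

A higher ECE means the model's stated confidences drift further from its actual accuracy, which is the effect the article reports quantization amplifies.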