Confidence Estimation and Calibration
The system records the quantized model's output distributions on the validation set, compares predicted confidence against observed accuracy, and fits a calibration function that maps raw confidence to a reliable estimate. Because quantization bit-width affects this relationship, each bit-width uses its own set of calibration parameters.
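The text does not say which form the calibration function takes. As one possible illustration, the sketch below uses histogram binning: raw confidences are grouped into equal-width bins, and each bin is mapped to the accuracy observed in it on the validation set. One calibrator is fitted per bit-width. The function names, the bin count, and the validation records are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def fit_histogram_calibrator(confidences, correct, n_bins=10):
    """Return the empirical accuracy of each confidence bin (assumed method)."""
    hits = [0] * n_bins
    counts = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        counts[idx] += 1
        hits[idx] += int(ok)
    # A bin with no validation samples falls back to its midpoint.
    return [
        hits[i] / counts[i] if counts[i] else (i + 0.5) / n_bins
        for i in range(n_bins)
    ]


def calibrate(bin_accuracy, conf):
    """Map a raw confidence to the accuracy observed in its bin."""
    n_bins = len(bin_accuracy)
    idx = min(int(conf * n_bins), n_bins - 1)
    return bin_accuracy[idx]


# Hypothetical validation records: (bit_width, raw_confidence, was_correct).
validation = [
    (4, 0.95, False), (4, 0.92, True), (4, 0.55, True), (4, 0.52, False),
    (8, 0.95, True), (8, 0.91, True), (8, 0.58, True), (8, 0.51, False),
]

# Group the records by bit-width, then fit one calibrator per group.
by_bits = defaultdict(lambda: ([], []))
for bits, conf, ok in validation:
    by_bits[bits][0].append(conf)
    by_bits[bits][1].append(ok)

calibrators = {
    bits: fit_histogram_calibrator(confs, oks)
    for bits, (confs, oks) in by_bits.items()
}

for bits in sorted(calibrators):
    print(bits, calibrate(calibrators[bits], 0.93))
```

In this toy data the same raw confidence of 0.93 is mapped to a lower calibrated value for the 4-bit model than for the 8-bit one, which is the kind of bit-width dependence the separate parameter sets are meant to capture.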
Dynamic Stopping Strategy
At each inference step, the adaptive stopping module evaluates the confidence of the current output and terminates inference once the confidence exceeds a preset threshold or the maximum number of steps is reached. The threshold can be tuned per scenario: a conservative (higher) threshold for high-reliability tasks, and a more relaxed one where real-time requirements dominate.
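The stopping rule can be sketched as a simple loop. This is a minimal illustration, not the system's actual module: `run_with_early_stop`, `toy_step`, and the threshold values in `THRESHOLDS` are hypothetical, and `toy_step` merely stands in for one real inference step that returns an output together with its confidence.

```python
def run_with_early_stop(step_fn, threshold, max_steps):
    """Run step_fn until confidence exceeds threshold or max_steps is reached.

    step_fn(step) must return (output, confidence).
    Returns (output, confidence, steps_used).
    """
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    output, confidence = None, 0.0
    for step in range(1, max_steps + 1):
        output, confidence = step_fn(step)
        if confidence > threshold:
            break  # confident enough: stop early
    return output, confidence, step


# Hypothetical per-scenario thresholds (values are illustrative only).
THRESHOLDS = {"high_reliability": 0.95, "real_time": 0.70}


def toy_step(step):
    # Stand-in for one inference step: confidence rises by 0.2 per step.
    return f"draft-{step}", min(1.0, 0.2 * step)


for scenario, threshold in THRESHOLDS.items():
    output, confidence, steps = run_with_early_stop(toy_step, threshold, max_steps=10)
    print(scenario, output, steps)
```

With the toy step function, the relaxed real-time threshold stops one step earlier than the conservative one, trading some confidence for latency. When the threshold is never exceeded, the loop returns the output of the final permitted step.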