Zing Forum

Research on Reasoning Consistency in Knowledge Distillation: Are Compressed Models 'Thinking' Correctly?

This article deeply analyzes an empirical study on reasoning consistency in knowledge distillation. Through GradCAM saliency maps, CKA representation alignment, and calibration analysis, it reveals key findings about the decoupling of accuracy and reasoning consistency during model compression.

Knowledge Distillation · Reasoning Consistency · Model Compression · GradCAM · CKA · Model Calibration · Shortcut Learning · Edge Deployment · Trustworthiness · Temperature Parameter
Published 2026-05-01 21:47 · Recent activity 2026-05-01 22:22 · Estimated read: 7 min

Section 01

Introduction: Reasoning Consistency in Knowledge Distillation—Are Compressed Models 'Thinking' Correctly?

This article focuses on a core but overlooked question in knowledge distillation: when a compressed student model and its teacher give the same answer, do they rely on the same reasoning logic? Through three lenses (GradCAM saliency map comparison, CKA representation alignment, and calibration analysis), it reveals a key decoupling between accuracy and reasoning consistency, offering a new perspective for model evaluation in edge AI deployment.


Section 02

Background: Evaluation Blind Spots in Knowledge Distillation and Reasoning Consistency Issues

Knowledge distillation compresses large models into small ones for edge deployment, but traditional evaluation looks only at accuracy. This overlooks a fundamental issue: a student model may achieve high accuracy through 'shortcut learning' (e.g., relying on background textures instead of object shapes). Such right-answer-wrong-reasoning behavior poses unpredictable risks in real-world scenarios.


Section 03

Research Methods: Multi-dimensional Consistency Measurement Framework

The study designs an integrated evaluation system:

  1. GradCAM Saliency Map Comparison: compute Spearman rank correlation (similarity of the heatmap distributions) and Top-20% pixel IoU (overlap of the focus areas), restricted to the 9,196 images that both teacher and student predict correctly;
  2. CKA Representation Alignment: Compare the similarity of representations in intermediate layers (layer1 to pre_fc) of teacher and student models;
  3. Calibration Analysis: Use Expected Calibration Error (ECE) to measure the consistency between model confidence and accuracy.
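The three measurements can be sketched in plain numpy. This is a minimal sketch, not the paper's code: it assumes Pearson-on-ranks for Spearman (ties broken arbitrarily), the standard linear variant of CKA, and 15 equal-width bins for ECE; the study's exact implementation choices may differ.

```python
import numpy as np

def spearman_rho(a, b):
    """Spearman rank correlation between two flattened saliency maps
    (Pearson correlation of the ranks; ties are broken arbitrarily)."""
    ra = np.argsort(np.argsort(a.ravel())).astype(float)
    rb = np.argsort(np.argsort(b.ravel())).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / (np.linalg.norm(ra) * np.linalg.norm(rb)))

def top_k_iou(a, b, frac=0.20):
    """IoU of the top-`frac` most salient pixels in each map."""
    k = max(1, int(frac * a.size))
    ta = set(np.argsort(a.ravel())[-k:])
    tb = set(np.argsort(b.ravel())[-k:])
    return len(ta & tb) / len(ta | tb)

def linear_cka(X, Y):
    """Linear CKA between two (n_samples, n_features) activation matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(X.T @ Y, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

def ece(confidences, correct, n_bins=15):
    """Expected Calibration Error: confidence-weighted gap between
    mean confidence and accuracy over equal-width confidence bins."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = len(confidences)
    err = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            err += mask.sum() / total * gap
    return err
```

A sanity check on these definitions: identical saliency maps give ρ = 1 and IoU = 1, identical activations give CKA = 1, and a model whose confidence matches its accuracy gives ECE = 0.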

Section 04

Key Findings: Decoupling Patterns Between Accuracy and Reasoning Consistency

  1. Distillation improves accuracy but impairs calibration: Student accuracy increases from 91.71% to 92.93%, but ECE worsens from 0.0325 to 0.0442;
  2. Reasoning consistency is well below perfect agreement: even when both teacher and student are correct, the average Spearman ρ is 0.6976 and the average IoU is 0.4426, with extreme cases where ρ drops to -0.5679;
  3. Typical shortcut learning categories: Car and ship categories have the highest accuracy (96.3%/95.2%) but the lowest consistency (ρ=0.538/0.644);
  4. Compression has a greater impact on consistency: From medium model (24x compression) to tiny model (248x compression), accuracy drops by 8.1 percentage points, while consistency drops by 18.7 percentage points;
  5. Temperature parameter acts as a regulator:

     Temperature  Accuracy  ECE     Spearman ρ  IoU
     T=2          92.40%    0.0429  0.672       0.422
     T=4          92.93%    0.0442  0.698       0.443
     T=8          92.93%    0.0454  0.701       0.445

     T=2 gives the best calibration (lowest ECE), while T=8 gives the best reasoning consistency.
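The mechanism behind the temperature effect is visible in the softened targets themselves. Assuming the study uses the standard distillation formulation (soft targets softmax(logits/T), with the KL term scaled by T²), higher T flattens the teacher's distribution and exposes more inter-class structure for the student to match; the example logits below are made up for illustration.

```python
import numpy as np

def soften(logits, T):
    """Teacher soft targets at temperature T: softmax(logits / T)."""
    z = logits / T
    z = z - z.max()  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Made-up teacher logits for a 4-class toy problem.
logits = np.array([6.0, 2.0, 1.0, 0.5])
for T in (1, 2, 4, 8):
    print(f"T={T}: {np.round(soften(logits, T), 3)}")
```

As T grows, probability mass shifts from the top class onto the "dark knowledge" of the remaining classes, which is consistent with higher T nudging the student toward the teacher's relational structure (better ρ and IoU) at a small cost in calibration.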

Section 05

Experimental Design: Teacher and Student Model Configuration Details

  • Teacher Model: ResNet-50 fine-tuned on CIFAR-10 to 97.31% accuracy, with conv1 replaced by a 3×3 kernel and the maxpool removed to preserve 32×32 resolution;
  • Student Models:
    • Tiny model: 95k parameters, 248x compression (3 convolutional blocks, final block with 128 channels);
    • Small model: 242k parameters, 97x compression (4 convolutional blocks, final block with 128 channels);
    • Medium model: 982k parameters, 24x compression (5 convolutional blocks, last two blocks with 256 channels).
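A quick back-of-envelope check ties the student sizes to the reported ratios. The teacher parameter count used here (~23.56M for the CIFAR-adapted ResNet-50) is an assumption inferred from the 248x figure, not stated in the text.

```python
# Assumed teacher size; a stock ResNet-50 has ~25.6M parameters, and the
# 3x3 conv1 plus a 10-class fc head bring it to roughly 23.5M.
teacher_params = 23.56e6

students = {"tiny": 95e3, "small": 242e3, "medium": 982e3}
for name, p in students.items():
    print(f"{name}: ~{teacher_params / p:.0f}x compression")
```

The computed ratios (~248x, ~97x, ~24x) match the ones reported above.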

Section 06

Engineering Implications: Multi-dimensional Trade-offs and Deployment Strategies

  1. Evaluation Paradigm Shift: Key scenarios need to evaluate both accuracy and reasoning consistency simultaneously;
  2. Temperature Parameter Selection: Low temperature (e.g., T=2) prioritizes calibration, while high temperature (e.g., T=8) prioritizes reasoning consistency;
  3. Shortcut Learning Detection: Monitor categories with high accuracy but low consistency;
  4. Compression Trade-off: Over-compression leads to unpredictable reasoning inconsistencies, so balance size, accuracy, and consistency.

Section 07

Conclusion: Correct Answer ≠ Correct Reasoning—Emphasize Process Evaluation

Core conclusion of the study: A correct answer does not equal correct reasoning. For edge AI deployment, we need to go beyond accuracy metrics and focus on the consistency of the model's reasoning process. In critical fields like medical imaging and autonomous driving, the importance of reasoning consistency may exceed that of pure prediction accuracy.