Section 01
OPD: Re-examining On-Policy Distillation for LLMs - Phenomena, Mechanisms & Practice Guide
This post summarizes a systematic study of On-Policy Distillation (OPD) by Tsinghua University's NLP Lab. The research reveals limitations of traditional off-policy knowledge distillation and provides a complete practice methodology for OPD, covering core phenomena, underlying mechanisms, experimental validation, and actionable guidelines for model compression and deployment.
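To ground the terminology before diving in: in off-policy distillation the student imitates text produced by the teacher (or drawn from a fixed dataset), whereas in OPD the student samples its own outputs and the teacher supervises those samples token by token. Below is a minimal sketch of one OPD update under common assumptions (a Hugging Face-style student/teacher pair and a token-level reverse-KL objective on student rollouts); the checkpoint names `student-model`/`teacher-model` are placeholders, and this is an illustrative formulation, not necessarily the paper's exact recipe.

```python
# Minimal OPD step sketch: student samples, teacher supervises the sample.
# Checkpoint names and hyperparameters are illustrative placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("student-model")            # hypothetical
student = AutoModelForCausalLM.from_pretrained("student-model").to(device)
teacher = AutoModelForCausalLM.from_pretrained("teacher-model").to(device).eval()

def opd_step(prompt: str, optimizer: torch.optim.Optimizer,
             max_new_tokens: int = 64) -> float:
    """One OPD update: sample a continuation from the student, then train
    the student to match the teacher's distribution on that sample."""
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    prompt_len = inputs.input_ids.shape[1]

    # 1) On-policy rollout: generated by the *student*, not taken from a dataset.
    with torch.no_grad():
        rollout = student.generate(
            **inputs, do_sample=True, max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )

    # 2) Score the student's own tokens with both models
    #    (logits at position i predict the token at position i + 1).
    student_logits = student(rollout).logits[:, prompt_len - 1 : -1]
    with torch.no_grad():
        teacher_logits = teacher(rollout).logits[:, prompt_len - 1 : -1]

    # 3) Token-level reverse KL(student || teacher) on student-generated text;
    #    off-policy KD would instead compute a loss on teacher/dataset text.
    s_logp = F.log_softmax(student_logits, dim=-1)
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    loss = (s_logp.exp() * (s_logp - t_logp)).sum(-1).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The key contrast to keep in mind while reading: the supervision signal lands on sequences from the student's own distribution, which is what distinguishes OPD from the off-policy setup the study critiques.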