Section 01
[Introduction] Cross-Language Code Clone Detection via DeepSeek-R1 Distillation: Small Models Can Also Have Large Model Reasoning Capabilities
This study uses knowledge distillation to transfer the strong reasoning capabilities of DeepSeek-R1 to small open-source models such as Phi3 and Qwen-Coder. Combined with LoRA fine-tuning and response stabilization techniques, this significantly improves the reliability and prediction performance of the small models on cross-language code clone detection for language pairs such as Python-Java and Rust-Java.
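To make the pipeline above more concrete, here is a minimal sketch of two of its pieces: packing a teacher model's reasoning and verdict into a fine-tuning record, and reducing a student model's free-form response to a stable yes/no label. The prompt wording, the `Answer: yes/no` convention, and the names `build_distillation_record` and `parse_verdict` are all assumptions made for illustration; the summary does not specify the study's actual prompt format or stabilization method.

```python
import re
from typing import Optional

# Hypothetical prompt layout; the study's real prompt is not given in the text.
PROMPT_TEMPLATE = (
    "Do the following two code snippets implement the same functionality?\n"
    "Think step by step, then end with 'Answer: yes' or 'Answer: no'.\n\n"
    "### Snippet A ({lang_a})\n{code_a}\n\n"
    "### Snippet B ({lang_b})\n{code_b}\n"
)


def build_distillation_record(code_a: str, lang_a: str,
                              code_b: str, lang_b: str,
                              teacher_reasoning: str,
                              teacher_label: bool) -> dict:
    """Pack one teacher response into a prompt/completion pair for fine-tuning."""
    prompt = PROMPT_TEMPLATE.format(
        lang_a=lang_a, code_a=code_a, lang_b=lang_b, code_b=code_b
    )
    verdict = "yes" if teacher_label else "no"
    # The completion keeps the reasoning trace and ends with a fixed verdict line,
    # so the student learns both the reasoning and a parseable output format.
    completion = f"{teacher_reasoning.strip()}\nAnswer: {verdict}"
    return {"prompt": prompt, "completion": completion}


# Assumed output convention: the verdict appears as "Answer: yes" or "Answer: no".
_ANSWER_RE = re.compile(r"answer\s*:\s*(yes|no)\b", re.IGNORECASE)


def parse_verdict(response: str) -> Optional[bool]:
    """Return the last explicit verdict in a response, or None if there is none."""
    matches = _ANSWER_RE.findall(response)
    if not matches:
        return None  # the caller can retry or count this as an invalid response
    return matches[-1].lower() == "yes"
```

Taking the last verdict rather than the first is a design choice in this sketch: a reasoning model may revise its answer partway through, and the final statement is the one it settles on. Returning `None` for unparseable output keeps invalid responses separate from negative predictions, which matters when reliability is measured alongside accuracy.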