Section 01
[Introduction] LARFT: Bridging the Gap Between Length Cognition and Generation Behavior in Large Language Models
LARFT (Length-Aware Reinforcement Fine-Tuning) addresses the "cognition-behavior gap" that large language models exhibit on length-control tasks: a model can often restate a length constraint correctly yet fail to respect it during generation. By folding length awareness into reinforcement fine-tuning, the method trains models to both understand and execute length-constrained instructions, achieving an average improvement of 20.92 points on length-control tasks while leaving general capabilities essentially unchanged.
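The section does not spell out how length awareness enters the reward, but a length-aware reward term is the core ingredient of this kind of reinforcement fine-tuning. The Python sketch below is purely illustrative: the function name `length_reward`, the tolerance band, and the linear decay are assumptions for exposition, not LARFT's actual reward formulation.

```python
# Illustrative sketch of a length-aware reward for reinforcement fine-tuning.
# All names and parameters here are hypothetical; LARFT's actual reward design
# may differ.

def length_reward(generated_tokens: int, target_tokens: int, tolerance: int = 10) -> float:
    """Reward in [0, 1] that peaks when output length matches the instructed
    target and decays linearly as the deviation grows."""
    deviation = abs(generated_tokens - target_tokens)
    if deviation <= tolerance:
        # Within the tolerance band, the constraint counts as satisfied.
        return 1.0
    # Beyond the band, penalize deviation relative to the target length,
    # floored at zero so the reward never goes negative.
    return max(0.0, 1.0 - (deviation - tolerance) / target_tokens)

# Example: a 512-token target answered with a 540-token completion.
print(length_reward(540, 512))  # ~0.965 under these assumed parameters
```

In a full training loop, a reward like this would typically be combined with a task-quality reward, so the model is optimized to follow the length constraint without sacrificing answer quality.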