Section 01
[Introduction] CAPO Method: Learning Personalized Explanatory Behavior Using Human Labeling Variability
This paper proposes Cross-Annotator Preference Optimization (CAPO), a method that aims to enable large language models (LLMs) to learn and reproduce the label-explanation behavior of specific annotators. The study's core finding is that Human Labeling Variability (HLV) can serve as a stable signal that helps models capture each annotator's personalized reasoning preferences.