Section 01
[Introduction] Age Bias in Large Reasoning Models: The Bidirectional Impact of Chain-of-Thought
This study examines age bias in large reasoning models. Using the XSTest benchmark framework, it compares standard output with chain-of-thought (CoT) output to assess how CoT affects the age bias that models exhibit. Key findings include the double-edged effect of CoT (it can both suppress and amplify bias), asymmetric bias toward different age groups, and consistency between automatic and manual evaluations. Together, these findings provide empirical evidence for improving the fairness of reasoning models.