Section 01
TARS: Bridging the Reasoning Gap of Speech Large Models with Reinforcement Learning (Introduction)
Speech Large Language Models (Speech LLMs) perform far worse than their text-based counterparts on complex reasoning tasks, a shortfall referred to as the 'modality reasoning gap'. TARS (Trajectory Alignment for Reasoning in Speech), proposed by the Amphion team at ACL 2026, addresses this gap with reinforcement learning that combines asymmetric reward design and trajectory alignment, achieving the best performance among 7B-scale models on benchmarks such as MMSU and OBQA.