Section 01
ReflectMT: An Efficient Machine Translation Method with Internalized Reflection Capability (Introduction)
ReflectMT internalizes the 'translate-reflect-optimize' capability into the model via two-stage reinforcement learning, enabling it to generate high-quality translations directly at inference time without emitting explicit reasoning. On the WMT24 benchmark, its translation quality surpasses DeepSeek-R1 (COMET score 88.7 vs. 86.5) while reducing token consumption by 94.33%, resolving the quality-efficiency trade-off faced by existing Large Reasoning Model (LRM) translation methods.
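To make the contrast concrete, the following is a minimal toy sketch of the two inference modes described above: an explicit translate-reflect-optimize loop that emits intermediate tokens, versus an internalized single pass that produces the final output directly. All function names and the string-based "translation" logic are illustrative stand-ins, not the actual ReflectMT implementation; token counts are likewise a toy proxy.

```python
# Hypothetical sketch: explicit reflection loop vs. internalized one-pass
# inference. Names and logic are illustrative, not the real system.

def draft_translate(src: str) -> str:
    """Step 1 of the explicit loop: produce an initial draft."""
    return f"draft({src})"

def reflect(draft: str) -> str:
    """Step 2: critique the draft (costs extra emitted tokens)."""
    return f"critique({draft})"

def optimize(draft: str, critique: str) -> str:
    """Step 3: revise the draft using the critique."""
    return f"final({draft})"

def explicit_loop(src: str) -> tuple[str, int]:
    """Explicit translate-reflect-optimize: three generation passes,
    all of whose outputs count toward token consumption."""
    d = draft_translate(src)
    c = reflect(d)
    out = optimize(d, c)
    tokens = len(d.split()) + len(c.split()) + len(out.split())
    return out, tokens

def internalized(src: str) -> tuple[str, int]:
    """Internalized capability: one direct pass, no emitted reflection,
    so only the final output contributes tokens."""
    out = f"final(draft({src}))"
    return out, len(out.split())
```

In this toy setup both modes yield the same final string, but the explicit loop pays for the draft and critique tokens as well, which is the efficiency gap the paper's 94.33% reduction figure refers to.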