Section 01
[Introduction] Core Findings of the Evaluation of Contextual Translation Capabilities of Large Language Models
This study systematically evaluated the contextual translation capabilities of large language models using constructed synchronous context-free grammars (SCFGs). It found that model performance drops significantly as grammar size and sentence length increase, and that models perform worse on language pairs that differ substantially in morphology and writing system. It also identified typical error patterns, including lexical recall errors, hallucinated output, and untranslated residue, which provide useful reference points for low-resource language translation and model improvement.
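
To make the SCFG setup concrete, the following is a minimal sketch of how a synchronous grammar can emit aligned source and target sentences from a single derivation. It is not the grammar used in the study: the rule table `RULES`, the invented target words (`miu`, `wof`, `vid`, `kur`, `ka`), and the helpers `derive` and `sample_pair` are all hypothetical and chosen only for illustration. The sketch also assumes each nonterminal appears at most once per rule side, so that linked symbols can be matched by name.

```python
import random

# Hypothetical toy SCFG. Each rule pairs a source right-hand side with a
# target right-hand side. Keys of RULES are nonterminals; any other symbol
# is a terminal word. Assumption: a nonterminal appears at most once per
# rule side, so source and target occurrences are linked by name.
RULES = {
    "S": [(["NP", "VP"], ["NP", "VP"])],
    "VP": [(["V", "NP"], ["NP", "V"])],    # verb-object source, object-verb target
    "NP": [(["the", "N"], ["N", "ka"])],   # article before noun vs. particle after noun
    "N": [(["cat"], ["miu"]), (["dog"], ["wof"])],
    "V": [(["sees"], ["vid"]), (["chases"], ["kur"])],
}


def derive(symbol, rng):
    """Expand `symbol` once and return (source_tokens, target_tokens)."""
    src_rhs, tgt_rhs = rng.choice(RULES[symbol])
    # Expand every nonterminal exactly once, then reuse that expansion on
    # both sides so the two outputs stay synchronized.
    children = {s: derive(s, rng) for s in src_rhs if s in RULES}
    src, tgt = [], []
    for s in src_rhs:
        src.extend(children[s][0] if s in RULES else [s])
    for s in tgt_rhs:
        tgt.extend(children[s][1] if s in RULES else [s])
    return src, tgt


def sample_pair(seed):
    """Sample one parallel sentence pair, deterministically for a given seed."""
    rng = random.Random(seed)
    src, tgt = derive("S", rng)
    return " ".join(src), " ".join(tgt)


if __name__ == "__main__":
    for seed in range(3):
        print(sample_pair(seed))
```

In a sketch like this, adding rules or allowing recursive rules would raise grammar size and sentence length, which are the two factors the study reports as degrading model performance.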