Section 01
[Introduction] Core Summary of the Comparative Study on Analogical Reasoning Capabilities of Transformer Models
This study systematically evaluates the analogical reasoning capabilities of five mainstream Transformer models (BERT, RoBERTa, DistilBERT, GPT-2, and T5) and examines how architectural design choices, such as bidirectional versus unidirectional attention and differing pre-training objectives, affect a model's ability to understand structured relationships. The results provide empirical evidence for model selection, architectural improvement, and the understanding of machine cognitive mechanisms. Key findings include the superior performance of bidirectional encoder models and the effectiveness of training-strategy optimization and knowledge distillation.