Section 01
Triangular Multi-Agent Evaluation Framework: A New Paradigm for Mutual Supervision of Large Models (Introduction)
As large language models advance rapidly, traditional single-model evaluation methods suffer from strong subjectivity and incomplete coverage. The triangular multi-agent evaluation framework uses a three-party game mechanism among a Worker, a Leader, and an Auditor to automatically assess a model's reasoning quality, factual accuracy, and execution reliability, offering a new way to address these limitations.