Section 01
LLM Agreement Bias Benchmark: A Benchmark Framework for Detecting Agreement Bias and Answer Instability in Large Models
This article introduces the LLM Agreement Bias Benchmark—an open-source benchmark framework for detecting 'Agreement Bias' and answer instability in large language models (LLMs). Through multi-turn dialogue tests, this framework quantifies the model's tendency to cater to user opinions and the phenomenon of contradictory answers, providing key indicators for evaluating model reliability and consistency, and helping developers and researchers improve model flaws.