Section 01
EconBench: Evaluating the Economic Rationality of Large Language Models Using Behavioral Economics Experiments
EconBench is a benchmark for testing the economic preferences and rational decision-making of large language models (LLMs). It uses classic behavioral economics experiments to evaluate how a model decides under risk, over time, and in social interactions. Existing AI benchmarks lack a systematic evaluation of economic decision-making, and EconBench is designed to fill that gap. Its results help characterize an LLM's decision logic and "economic personality", which is relevant to model selection, safety assessment, model improvement, and AI alignment research.