Section 01
Introduction to Pluralistic Leaderboards: A New Paradigm for LLM Evaluation Tailored to Heterogeneous User Preferences
This article introduces Pluralistic Leaderboards, an LLM evaluation mechanism that borrows the concept of local stability from social choice theory. It targets a limitation of traditional leaderboards: a single global ranking cannot reflect heterogeneous user preferences. The core idea is to acknowledge that user preferences are diverse and to require the top-k model set to satisfy local stability, so that the set stays representative and fair across different user groups. The intended result is a fairer and more stable evaluation method.
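To make the local-stability idea concrete, here is a minimal sketch. It assumes one common formalization from the committee-selection literature, which may differ from the exact definition the article uses: a top-k set is locally stable if no model outside the set is preferred to every selected model by at least a quota of users. The function names, the toy rankings, and the quota n/k are all illustrative assumptions, not taken from the article.

```python
# Hypothetical sketch: checks local stability of a top-k set under the
# ASSUMED definition "no outside model is preferred to every selected
# model by at least `quota` users". Not the article's implementation.

def blocking_models(rankings, top_k, quota):
    """Return outside models that at least `quota` users rank above all of top_k."""
    selected = set(top_k)
    outside = sorted(set(rankings[0]) - selected)
    blockers = []
    for candidate in outside:
        supporters = 0
        for ranking in rankings:
            # Position of each model in this user's ranking (0 = most preferred).
            pos = {model: i for i, model in enumerate(ranking)}
            if all(pos[candidate] < pos[s] for s in selected):
                supporters += 1
        if supporters >= quota:
            blockers.append(candidate)
    return blockers


def is_locally_stable(rankings, top_k, quota):
    """True if no outside model gathers enough support to block top_k."""
    return not blocking_models(rankings, top_k, quota)


# Toy data (assumed): six users in two equally sized preference groups.
rankings = (
    [["A", "B", "C", "D"]] * 3   # group 1 prefers A, then B
    + [["C", "D", "A", "B"]] * 3 # group 2 prefers C, then D
)
k = 2
quota = len(rankings) // k  # assumed quota n/k = 3

print(blocking_models(rankings, ["A", "B"], quota))    # group 2 is unrepresented
print(is_locally_stable(rankings, ["A", "C"], quota))  # one model per group
```

In this toy case, a top-2 set drawn only from group 1's favorites leaves group 2 with models it unanimously prefers, while a set with one favorite from each group has no such blocking model. That is the sense in which local stability protects distinct user groups.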