Section 01
LLMEval-Logic: A New Benchmark for Chinese Logical Reasoning Evaluation Released
This article introduces LLMEval-Logic, a Chinese logical reasoning benchmark grounded in real-world scenarios and built through expert review, Z3 solver validation, and an adversarial reinforcement process. It comprises a basic set and a hard set. Experiments reveal that current cutting-edge large language models still show a significant gap in complex logical reasoning, and the benchmark offers a new standard for evaluating the logical reasoning capabilities of Chinese LLMs.
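To give a rough sense of what solver validation of a benchmark item can mean, the sketch below checks whether a labeled answer is logically entailed by an item's premises. It is only an illustration under stated assumptions: it uses brute-force truth-table enumeration in plain Python rather than Z3, and the `entails` helper and the rain/cancelled item are hypothetical, not taken from LLMEval-Logic.

```python
from itertools import product


def entails(variables, premises, conclusion):
    """Return True if every truth assignment that satisfies all premises
    also satisfies the conclusion (checked by exhaustive enumeration)."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # counterexample: premises hold, conclusion fails
    return True


# Hypothetical item: "If it rains, the match is cancelled. It rains."
variables = ["rain", "cancelled"]
premises = [
    lambda e: (not e["rain"]) or e["cancelled"],  # rain -> cancelled
    lambda e: e["rain"],                          # it rains
]

# A labeled answer passes validation only if the premises entail it.
answer_ok = entails(variables, premises, lambda e: e["cancelled"])
answer_bad = entails(variables, premises, lambda e: not e["cancelled"])
```

Enumeration is exponential in the number of variables, which is one reason a pipeline would hand such checks to a dedicated solver like Z3; the entailment question being asked stays the same.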