Section 01
BeyondBench: A Guide to the Contamination-Resistant Reasoning Benchmark for Language Models (Accepted at ICLR 2026)
BeyondBench is a research work accepted at ICLR 2026 that addresses data contamination in language model evaluation, that is, the risk that benchmark items already appear in a model's training data. It builds a contamination-resistant evaluation methodology from three components: dynamic test generation, multi-dimensional reasoning assessment, and difficulty adaptation. The goal is to measure a model's actual reasoning ability rather than its ability to recall memorized test items.
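To make the idea of dynamic test generation concrete, here is a minimal sketch in Python. It is not BeyondBench's actual code or API: the names `Problem`, `generate_sum_problem`, and `is_correct`, the choice of an integer-sum task, and the way `difficulty` scales the problem are all assumptions made for illustration. It shows only the general pattern: each problem is created fresh from a seed, its answer is computed algorithmically, and a difficulty parameter controls its size.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """A generated test item with its algorithmically computed answer."""
    prompt: str
    answer: int


def generate_sum_problem(seed: int, difficulty: int) -> Problem:
    """Build a fresh integer-sum problem (hypothetical task, for illustration).

    A new seed gives a new instance, so a fixed list of items never has to be
    published. Higher difficulty means more numbers with larger magnitudes.
    """
    rng = random.Random(seed)      # local RNG: reproducible for a given seed
    count = 2 + difficulty         # assumed scaling rule
    upper = 10 ** difficulty       # assumed scaling rule
    numbers = [rng.randint(-upper, upper) for _ in range(count)]
    prompt = f"Compute the sum of the following integers: {numbers}"
    return Problem(prompt=prompt, answer=sum(numbers))


def is_correct(problem: Problem, model_output: str) -> bool:
    """Score a model's raw text output against the computed ground truth."""
    try:
        return int(model_output.strip()) == problem.answer
    except ValueError:
        return False  # non-numeric output counts as wrong


if __name__ == "__main__":
    problem = generate_sum_problem(seed=2026, difficulty=3)
    print(problem.prompt)
    print(is_correct(problem, str(problem.answer)))
```

Because the ground truth comes from `sum(numbers)` rather than a stored answer key, any number of instances can be generated and checked without fixed test items circulating where they could leak into training data.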