Section 01
Introduction: Reasoning Model Shortcut Detection and the Hidden Flaw of 'Correct Answers with Wrong Reasoning'
EleutherAI and the MIT CSAIL Kellis Lab jointly launched the Reasoning Model Shortcut Detection evaluation benchmark. Its goal is to reveal whether open-source reasoning models rely on surface pattern matching (cognitive shortcuts) rather than genuine semantic understanding, with a core focus on one hidden flaw: a model that reaches the correct answer through wrong reasoning. The benchmark tests three scenarios (temporal reasoning, conditional logic, and probabilistic cognitive bias) under three prompt conditions: Clean (unbiased prompts), Subtly Hinted (prompts with slight guiding information), and Misleadingly Hinted (prompts with misleading information that invites shortcuts).
The original author and maintainer of the project is jiwonha321-a11y. Source platform: GitHub. Original link: https://github.com/jiwonha321-a11y/Reasoning-model-shortcut-detect. Release date: 2026-05-30.
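To make the three-scenario, three-condition design in this introduction concrete, the following is a minimal sketch of how such an evaluation grid and the 'correct answer with wrong reasoning' flag could be represented. Every name here (`SCENARIOS`, `CONDITIONS`, `Trial`, `evaluation_grid`, `hidden_flaw_rate`) is hypothetical and chosen for illustration; none is taken from the repository. The sketch also assumes that each trial already carries a separate judgment of whether the reasoning was valid, for example from a human reviewer.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical labels for the scenarios and prompt conditions named in the text.
SCENARIOS = ("temporal_reasoning", "conditional_logic", "probabilistic_bias")
CONDITIONS = ("clean", "subtly_hinted", "misleadingly_hinted")


@dataclass(frozen=True)
class Trial:
    """One model response in one scenario under one prompt condition."""
    scenario: str
    condition: str
    answer_correct: bool   # did the final answer match the reference?
    reasoning_valid: bool  # assumed external judgment of the reasoning trace


def evaluation_grid():
    """Return every (scenario, condition) pair: 3 x 3 = 9 cells."""
    return list(product(SCENARIOS, CONDITIONS))


def is_hidden_flaw(trial):
    """A correct answer reached through invalid reasoning."""
    return trial.answer_correct and not trial.reasoning_valid


def hidden_flaw_rate(trials):
    """Share of correct answers whose reasoning was judged invalid."""
    correct = [t for t in trials if t.answer_correct]
    if not correct:
        return 0.0
    return sum(is_hidden_flaw(t) for t in correct) / len(correct)


if __name__ == "__main__":
    # Illustrative, made-up trials; not results from the benchmark.
    sample = [
        Trial("temporal_reasoning", "clean", True, True),
        Trial("temporal_reasoning", "misleadingly_hinted", True, False),
        Trial("conditional_logic", "subtly_hinted", False, False),
    ]
    print(len(evaluation_grid()), hidden_flaw_rate(sample))
```

Measuring the rate over correct answers only, rather than over all trials, keeps the metric focused on the case the benchmark targets: answers that accuracy alone would count as successes.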