Section 01
[Introduction] Evaluation Study on the Ability of Large Reasoning Models to Identify False Presuppositions
This study systematically evaluates how well Large Reasoning Models (LRMs) handle queries that contain false presuppositions, that is, questions that take an untrue premise for granted. Compared with non-reasoning models, LRMs achieve 2-11% higher accuracy, yet 26-42% of false presuppositions still go unchallenged, and the models are sensitive to how strongly a presupposition is expressed. These findings have important implications for AI system design and for how users interact with such systems.