Section 01
【Introduction】The Truth Behind Large Models' 'Self-Doubt' Phenomenon: Interaction Between Prompt Framing and Answer Format
An experimental study on Qwen2.5-Math found that when math problems known to be solvable are framed as 'unsolved' or 'open questions', the model's accuracy drops from 60% to 45%, a decline of 15 percentage points. Further controlled experiments, however, indicate that this drop is better explained by an interaction between prompt framing and answer format than by a genuine degradation of the model's reasoning ability. This study examines how model confidence affects mathematical reasoning performance and discusses the implications.
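To make the distinction between "reasoning got worse" and "the answer was presented differently" concrete, the following is a minimal sketch, not the study's actual evaluation code. The framing templates, the two answer extractors, and the toy responses are all hypothetical and were invented for illustration. The sketch shows how a grader that only accepts `\boxed{...}` answers can report lower accuracy under the 'unsolved' framing even when the final answers are correct.

```python
import re

# Hypothetical framing templates; the study's real prompts are not reproduced here.
def build_prompt(framing: str, problem: str) -> str:
    if framing == "unsolved":
        return "The following is an open, unsolved problem.\n" + problem
    return "Solve the following problem.\n" + problem

def extract_strict(text: str):
    """Accept an answer only if it appears inside \\boxed{...}."""
    match = re.search(r"\\boxed\{([^{}]*)\}", text)
    return match.group(1).strip() if match else None

def extract_lenient(text: str):
    """Prefer a boxed answer; otherwise fall back to the last number in the text."""
    boxed = extract_strict(text)
    if boxed is not None:
        return boxed
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    return numbers[-1] if numbers else None

def accuracy(responses, gold, extractor) -> float:
    """Fraction of responses whose extracted answer equals the gold answer."""
    correct = sum(extractor(r) == g for r, g in zip(responses, gold))
    return correct / len(gold)

# Toy data (invented): same final answers, different presentation.
gold = ["4", "9"]
neutral_responses = ["So the result is \\boxed{4}", "\\boxed{9}"]
unsolved_responses = ["This may be open, but it seems the answer is 4.", "\\boxed{9}"]

strict_neutral = accuracy(neutral_responses, gold, extract_strict)
strict_unsolved = accuracy(unsolved_responses, gold, extract_strict)
lenient_unsolved = accuracy(unsolved_responses, gold, extract_lenient)
```

In this toy setup the strict grader scores the 'unsolved' responses lower than the neutral ones, while the lenient grader scores them the same, which is the kind of format-driven gap that a controlled experiment would need to rule out before attributing the drop to reasoning ability.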