Section 01
Introduction: TwNV Framework Breaks the Spatial Intelligence Bottleneck of Multimodal Models
The TwNV framework addresses view dependency in spatial reasoning by letting the reasoning model proactively request the synthesis of novel-view images. It improves accuracy by 1.3 to 3.9 percentage points across four spatial subtasks, offering a new approach to spatial intelligence in multimodal models.
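The request-and-synthesize idea can be illustrated with a minimal sketch. This is not the TwNV implementation: `ViewRequest`, `Answer`, `reason_with_novel_views`, the yaw-angle request format, the request budget, and the two stub functions are all hypothetical names and choices made only for illustration. Views are stood in for by strings rather than images.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class ViewRequest:
    # Hypothetical action: ask for a view rotated by this yaw angle (degrees).
    yaw_deg: float


@dataclass
class Answer:
    text: str


Action = Union[ViewRequest, Answer]


def reason_with_novel_views(
    question: str,
    initial_view: str,
    reasoner: Callable[[str, List[str]], Action],
    synthesizer: Callable[[str, float], str],
    max_requests: int = 3,
) -> Answer:
    """Let the reasoner either answer or request another synthesized view."""
    views = [initial_view]
    for _ in range(max_requests):
        action = reasoner(question, views)
        if isinstance(action, Answer):
            return action
        # The reasoner asked for a new viewpoint: synthesize it and continue.
        views.append(synthesizer(initial_view, action.yaw_deg))
    # Request budget exhausted: give the reasoner one last chance to answer.
    final = reasoner(question, views)
    return final if isinstance(final, Answer) else Answer("undetermined")


def toy_reasoner(question: str, views: List[str]) -> Action:
    # Stub policy (assumption): request one side view, then answer.
    if len(views) < 2:
        return ViewRequest(yaw_deg=90.0)
    return Answer(f"answered using {len(views)} views")


def toy_synthesizer(source_view: str, yaw_deg: float) -> str:
    # Stub (assumption): a real system would render an image here.
    return f"{source_view}@yaw{yaw_deg:g}"


result = reason_with_novel_views(
    "Is the cup left of the lamp?", "front.png", toy_reasoner, toy_synthesizer
)
print(result.text)  # answered using 2 views
```

The key design point in this sketch is that the reasoner, not a fixed pipeline, decides when an extra viewpoint is needed. The request budget is an assumed safeguard against endless requests.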