Section 01
[Introduction] Machine Mental Imagery: Resolving Representational Ambiguity in Dialogue with Visual Scaffolding
The research team proposes an active visual scaffolding framework that incrementally converts dialogue state into a persistent visual history, addressing the problem of "representational ambiguity" in situated dialogue. Evaluated on the IndiRef benchmark, hybrid multimodal representations significantly outperform text-only methods, suggesting a new way for dialogue systems to maintain precise common ground.
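
To make the idea of a "persistent visual history" concrete, here is a toy sketch in Python. It is not the authors' implementation: the summary does not describe how frames are rendered or queried, so the class names (`Snapshot`, `VisualScaffold`), the methods (`update`, `left_of`), and the use of a simple coordinate grid in place of a rendered image are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One frame of the history: object name -> (x, y) grid position."""
    turn: int
    layout: dict


@dataclass
class VisualScaffold:
    """Hypothetical scaffold that keeps every frame, not just the latest one."""
    layout: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def update(self, name, position):
        """Apply one dialogue-state change and append a new frame."""
        self.layout[name] = position
        # Copy the layout so earlier frames are not mutated by later turns.
        frame = Snapshot(turn=len(self.history) + 1, layout=dict(self.layout))
        self.history.append(frame)
        return frame

    def left_of(self, anchor):
        """Resolve 'the one just left of <anchor>' against the current layout."""
        ax, ay = self.layout[anchor]
        candidates = [
            (ax - x, name)
            for name, (x, y) in self.layout.items()
            if y == ay and x < ax
        ]
        # The nearest object in the same row wins; None if nothing qualifies.
        return min(candidates)[1] if candidates else None


scaffold = VisualScaffold()
scaffold.update("mug", (0, 0))   # turn 1: "There is a mug on the shelf."
scaffold.update("lamp", (2, 0))  # turn 2: "A lamp stands to its right."
scaffold.update("book", (1, 0))  # turn 3: "I put a book between them."
print(scaffold.left_of("lamp"))  # prints "book"
```

In this sketch the referent of "the one left of the lamp" changes between turn 2 and turn 3, and the spatial layout settles the question without re-reading the transcript. The earlier frames remain available in `scaffold.history`, which is the "persistent" part of the idea.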