Section 01
3D-VCD: A Breakthrough Method to Eliminate Hallucinations in 3D Embodied Agents Without Retraining
This post introduces 3D-VCD (3D Visual Contrastive Decoding), the first framework for eliminating hallucinations in 3D embodied agent reasoning. Hallucinations here are descriptions or decisions that are inconsistent with the real 3D environment, a critical problem in applications such as robot navigation and autonomous driving. The core advantage of 3D-VCD is that it works at inference time, without retraining the base model, and it significantly improves grounded reasoning performance on benchmarks such as 3D-POPE and HEAL.
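To make the "inference time, no retraining" idea concrete, here is a minimal sketch of a contrastive decoding step. It assumes the generic visual contrastive decoding rule known from 2D vision-language work: the model is run once on the real scene and once on a distorted copy, and tokens that stay likely even when the scene is distorted are penalized. The exact rule used by 3D-VCD is not given in this summary, so the function name `contrastive_next_token`, the parameters `alpha` and `beta`, and the toy logits are all illustrative assumptions rather than the authors' implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def contrastive_next_token(logits_clean, logits_distorted, alpha=1.0, beta=0.1):
    """Pick the next token index by contrasting two forward passes.

    logits_clean:     next-token logits given the real 3D scene
    logits_distorted: next-token logits given a distorted scene
    alpha:            contrast strength (assumed name; 0 means plain greedy)
    beta:             plausibility cutoff relative to the top clean token
    """
    lp_clean = log_softmax(logits_clean)
    lp_dist = log_softmax(logits_distorted)

    # Keep only tokens the clean pass itself finds plausible, so the
    # contrast cannot promote tokens that were unlikely to begin with.
    cutoff = max(lp_clean) + math.log(beta)

    best_index, best_score = None, -math.inf
    for i, (c, d) in enumerate(zip(lp_clean, lp_dist)):
        if c < cutoff:
            continue
        # Reward evidence from the real scene, penalize what the model
        # would have said anyway without reliable scene evidence.
        score = (1.0 + alpha) * c - alpha * d
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Toy vocabulary: 0 = "chair", 1 = "table", 2 = "sofa" (made-up numbers).
clean = [2.0, 2.2, 0.1]       # model slightly prefers "table"
distorted = [0.5, 2.5, 0.1]   # "table" stays likely even without the scene

greedy_choice = contrastive_next_token(clean, distorted, alpha=0.0)
contrastive_choice = contrastive_next_token(clean, distorted, alpha=1.0)
```

In this toy case, "table" looks like a prior-driven guess because it remains likely under the distorted scene, so the contrast shifts the choice to "chair", which depends on the real scene. Only the decoding step changes; the base model's weights are untouched, which is what makes this family of methods retraining-free.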