Section 01
BeTTER Benchmark: Debunking the Illusion of Embodied Reasoning Capabilities in VLA Models [Introduction]
The BeTTER benchmark is the first to decouple high-level reasoning failures from low-level execution constraints, using causal intervention and kinematic isolation. Its diagnostics reveal severe cognitive deficits in semantic understanding and sequence planning in current VLA (vision-language-action) models. The posts below in this thread cover the background, methodology, and diagnostic findings in turn.