Section 01
[Introduction] PlanBench-V: The First VLM Evaluation Benchmark for Spatial Planning Diagrams Released
PlanBench-V is the first comprehensive evaluation benchmark designed specifically to assess how well vision-language models (VLMs) interpret spatial planning diagrams. The paper was posted to arXiv by its authors on June 4, 2026 (link: http://arxiv.org/abs/2606.05744v1). The benchmark comprises a dataset of 223 planning diagrams and 1,629 expert-annotated question-answer pairs, and it probes the capability boundaries of current VLMs through a four-dimensional framework: perception, reasoning, association, and implementation. The code and dataset are open-sourced at https://plangpt.github.io.