# V2V-GoT: A Vehicle-to-Vehicle Collaborative Autonomous Driving Framework Based on Multimodal Large Language Models and Graph-of-Thoughts

> V2V-GoT is the first Graph-of-Thoughts reasoning framework designed specifically for vehicle-to-vehicle (V2V) collaborative autonomous driving. It uses a multimodal large language model to fuse perception information from multiple vehicles, enabling occlusion-aware perception and planning-aware prediction, and it outperforms baseline methods on collaborative perception, prediction, and planning tasks.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-01T20:08:21.000Z
- Last activity: 2026-04-01T20:17:25.763Z
- Popularity: 143.8
- Keywords: autonomous driving, vehicle-to-vehicle collaboration, V2V communication, multimodal large language models, Graph-of-Thoughts, occlusion perception, trajectory prediction, LLaVA, ICRA2026
- Page link: https://www.zingnex.cn/en/forum/thread/v2v-got
- Canonical: https://www.zingnex.cn/forum/thread/v2v-got
- Markdown source: floors_fallback

---

## Background and Challenges

A core bottleneck in autonomous driving is the physical limit of single-vehicle perception: objects hidden behind other vehicles or structures cannot be sensed, and such occlusions create safety hazards. Vehicle-to-vehicle (V2V) communication extends each vehicle's field of view, but traditional methods rely on simple feature fusion, which makes it hard to exploit semantic relationships across multi-source information or to perform complex reasoning.
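To make the occlusion problem concrete, here is a minimal 2D line-of-sight sketch. It is purely illustrative and not part of V2V-GoT: the circular obstacle model, the function name `is_occluded`, and all coordinates are assumptions chosen for the example.

```python
import math

def is_occluded(observer, target, obstacle_center, obstacle_radius):
    """Return True if the straight line of sight from observer to target
    passes through a circular obstacle (a simplified stand-in for a truck)."""
    ox, oy = observer
    tx, ty = target
    cx, cy = obstacle_center
    dx, dy = tx - ox, ty - oy
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return False  # observer and target coincide
    # Project the obstacle center onto the sight line and clamp to the segment.
    t = ((cx - ox) * dx + (cy - oy) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    nearest = (ox + t * dx, oy + t * dy)
    return math.dist(nearest, obstacle_center) < obstacle_radius

# Hypothetical scene: a truck sits between the ego vehicle and a pedestrian,
# while a second connected vehicle views the pedestrian from another angle.
ego, helper = (0.0, 0.0), (10.0, 10.0)
truck_center, truck_radius = (10.0, 0.0), 2.0
pedestrian = (20.0, 0.0)

ego_blocked = is_occluded(ego, pedestrian, truck_center, truck_radius)
helper_blocked = is_occluded(helper, pedestrian, truck_center, truck_radius)
print(ego_blocked, helper_blocked)
```

In this toy scene the ego vehicle's view of the pedestrian is blocked while the second vehicle's view is clear, which is the kind of situation where shared perception helps.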

## Core Methods and Innovations

V2V-GoT introduces structured Graph-of-Thoughts reasoning, in which the driving task is decomposed into interlinked question-answering (QA) nodes. It contributes two key innovations:

1. Occlusion-aware perception: identify occluded areas and infer occluded targets using information from other vehicles.
2. Planning-aware prediction: predict the behavior of other traffic participants by taking the ego vehicle's candidate trajectories into account.
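One way to picture reasoning over linked QA nodes is as a directed acyclic graph that is answered in dependency order, with each node seeing the answers of its parents. The sketch below is an assumption-laden illustration, not the authors' implementation: the node names, the edges in `QA_GRAPH`, and the `stub_answer` function (which stands in for a model call) are all hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List

# Hypothetical QA graph: each node maps to the nodes whose answers it needs.
# Names and edges are illustrative only, not the paper's actual graph.
QA_GRAPH: Dict[str, List[str]] = {
    "occlusion_perception": [],
    "collaborative_perception": ["occlusion_perception"],
    "planning_aware_prediction": ["collaborative_perception"],
    "planning": ["collaborative_perception", "planning_aware_prediction"],
}

def run_graph(
    graph: Dict[str, List[str]],
    answer_fn: Callable[[str, Dict[str, str]], str],
) -> Dict[str, str]:
    """Answer every node after its parents, passing parent answers as context."""
    answers: Dict[str, str] = {}
    for node in TopologicalSorter(graph).static_order():
        context = {parent: answers[parent] for parent in graph[node]}
        answers[node] = answer_fn(node, context)
    return answers

def stub_answer(node: str, context: Dict[str, str]) -> str:
    """Stand-in for a model call: records which parent answers were visible."""
    return f"{node}<-[{', '.join(sorted(context))}]"

answers = run_graph(QA_GRAPH, stub_answer)
for name, answer in answers.items():
    print(name, "=>", answer)
```

Because every answer is stored per node, the chain of intermediate answers that led to a final planning output can be inspected afterwards.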

## Dataset and Model Training

The authors constructed the V2V-GoT-QA dataset on top of V2V4Real; it contains multi-vehicle perception features and QA sequences. The model was obtained by fine-tuning LLaVA 1.5 with LoRA for 10 epochs to adapt it to the V2V collaborative driving domain.
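For readers unfamiliar with LoRA, its standard formulation is summarized below. This is the general technique, not a description of the specific configuration used for V2V-GoT; the rank $r$ and scaling factor $\alpha$ used in this work are not stated here. A frozen pretrained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ is adapted through a trainable low-rank update:

$$
W = W_0 + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Only $A$ and $B$ are trained, so each adapted matrix has $r(d + k)$ trainable parameters instead of $dk$, which is what makes fine-tuning a large multimodal model on a domain-specific dataset practical.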

## Experimental Result Analysis

V2V-GoT outperforms the baselines on collaborative perception, prediction, and planning tasks, with significant advantages in occlusion scenarios. The Graph-of-Thoughts structure also provides interpretability: reasoning paths can be traced, and domain knowledge such as traffic rules can be integrated easily.

## Open Source Support and Future Outlook

The project is open source: the GitHub repository includes the code and related resources, and the datasets are hosted on Hugging Face. Future work could extend the framework to vehicle-to-infrastructure (V2I) and multimodal sensor fusion scenarios. More broadly, the work serves as an example of combining large-model reasoning with physical perception.
