Section 01
[Introduction] Knowledge Graph-Enhanced Vision-Language Models Improve Physical Reasoning Capabilities
This project, VLM-Reasoning-Model-using-Knowledge-Graph, was published on GitHub by tirth1263 (link: https://github.com/tirth1263/VLM-Reasoning-Model-using-Knowledge-Graph, release date: 2026-05-23). Its core idea is to strengthen the physical-world reasoning of vision-language models (VLMs) by combining knowledge graphs (KGs) with explicit physical rules. Compared with fine-tuning methods, this zero-shot enhancement strategy is lighter-weight and more interpretable, and it reportedly improves results on the ScienceQA physics validation set.
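To make the idea concrete, here is a minimal sketch of how knowledge-graph facts could be injected into a zero-shot prompt. It is not the repository's implementation: the `Triple` type, the toy `PHYSICS_KG` contents, the keyword-matching retrieval, and the prompt wording are all assumptions made for illustration, and the call to an actual VLM is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge-graph edge: subject --relation--> obj (hypothetical schema)."""
    subject: str
    relation: str
    obj: str


# Hypothetical toy knowledge graph; these facts are illustrative only.
PHYSICS_KG = [
    Triple("magnet", "attracts", "iron"),
    Triple("friction", "opposes", "motion"),
    Triple("gravity", "pulls objects toward", "earth"),
]


def retrieve_facts(question, kg):
    """Return triples whose subject or object appears as a word in the question."""
    words = set(question.lower().replace("?", " ").split())
    return [t for t in kg if t.subject in words or t.obj in words]


def build_prompt(question, kg):
    """Prepend retrieved facts to the question so the model can cite them."""
    facts = retrieve_facts(question, kg)
    lines = ["Relevant physical facts:"]
    lines += [f"- {t.subject} {t.relation} {t.obj}" for t in facts] or ["- (none found)"]
    lines += ["", f"Question: {question}", "Answer step by step using the facts above."]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Will the magnet pull the iron nail?", PHYSICS_KG))
```

Because the facts are plain text in the prompt, the model weights stay untouched (zero-shot), and the retrieved triples can be inspected to see which rules an answer was grounded in, which is where the interpretability benefit would come from under these assumptions.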