Section 01
Core Introduction to the Awesome Multimodal GUI Agents Project
This article introduces Awesome Multimodal GUI Agents, a GitHub resource list maintained by DeLunnLi (original link: https://github.com/DeLunnLi/Awesome-Multimodal-GUI-Agents, updated on 2026-05-31). The list systematically compiles papers, datasets, benchmarks, models, and open-source projects on multimodal GUI agents, covering four areas: web, mobile, desktop, and computer-use agents. Because it gathers all four platforms in one place, the list helps researchers identify techniques shared across platforms, and it serves both as an entry point for newcomers to the field and as a way to track current research trends.