Section 01
Introduction: AQuaUI—A Retraining-Free Visual Token Compression Scheme for GUI Agents
AQuaUI compresses the visual tokens of GUI agents at inference time, with no retraining required. It uses an adaptive quadtree to identify visually homogeneous screen regions and merge them, which reduces the visual token count by 29.52% while retaining 99.06% of performance. This addresses the computational overhead that GUI agents incur when processing high-resolution screenshots.
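To make the quadtree idea concrete, here is a minimal sketch of adaptive quadtree partitioning over a grid of image patches. It is an illustration under stated assumptions, not AQuaUI's actual implementation: the summary above does not specify the homogeneity criterion or how merged regions become tokens, so this sketch assumes each patch is summarized by a single number and that a region counts as homogeneous when the spread of its values is within a tolerance. The names `quadtree_regions`, `is_homogeneous`, and `tol` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]
Region = Tuple[int, int, int, int]  # (row, col, height, width), in patch units


def is_homogeneous(grid: Grid, r: int, c: int, h: int, w: int, tol: float) -> bool:
    # Assumed criterion: the value range inside the region is at most `tol`.
    values = [grid[i][j] for i in range(r, r + h) for j in range(c, c + w)]
    return max(values) - min(values) <= tol


def quadtree_regions(grid: Grid, tol: float = 0.0) -> List[Region]:
    """Split the grid recursively; each returned region stands for one merged token."""
    regions: List[Region] = []

    def visit(r: int, c: int, h: int, w: int) -> None:
        if h == 0 or w == 0:
            return  # empty quadrant produced by splitting an odd-sized region
        if (h == 1 and w == 1) or is_homogeneous(grid, r, c, h, w, tol):
            regions.append((r, c, h, w))  # keep the whole region as one leaf
            return
        h2, w2 = (h + 1) // 2, (w + 1) // 2
        visit(r, c, h2, w2)                    # top-left
        visit(r, c + w2, h2, w - w2)           # top-right
        visit(r + h2, c, h - h2, w2)           # bottom-left
        visit(r + h2, c + w2, h - h2, w - w2)  # bottom-right

    visit(0, 0, len(grid), len(grid[0]) if grid else 0)
    return regions


if __name__ == "__main__":
    # Toy 4x4 patch grid: only the top-left quadrant has visual detail.
    grid = [
        [1, 2, 0, 0],
        [3, 4, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    regions = quadtree_regions(grid)
    print(len(regions))  # 7 regions instead of 16 patches
```

In this toy grid the three uniform quadrants each collapse to a single region, while the detailed quadrant stays at full patch resolution. That is the adaptive behavior the paragraph describes: compression is spent on flat areas of the screenshot, not on areas with content.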