Section 01
PointVG-R: A Reinforcement Learning Framework for Visual Pointing Reasoning
PointVG-R is a training framework for multimodal large models that targets visual pointing reasoning. It uses reinforcement learning with the PPO/GRPO algorithms to jointly optimize hand detection, pointing ray prediction, and target object localization, achieving significant improvements on visual grounding tasks.
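To make the idea of joint optimization concrete, here is a minimal sketch of how the three sub-tasks could be folded into one scalar reward and then turned into GRPO-style group-relative advantages. It is an illustration under stated assumptions, not PointVG-R's actual reward design: the function names, the 2D ray representation, the weights (0.2, 0.3, 0.5), and the use of the population standard deviation are all hypothetical. Only the normalization step, which subtracts the group mean and divides by the group standard deviation, follows the general GRPO recipe.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ray_alignment(pred: Tuple[float, float], true: Tuple[float, float]) -> float:
    """Cosine similarity of two 2D ray directions, clipped to [0, 1]."""
    dot = pred[0] * true[0] + pred[1] * true[1]
    norm = math.hypot(*pred) * math.hypot(*true)
    return max(0.0, dot / norm) if norm > 0 else 0.0


def composite_reward(
    hand_iou: float,
    ray_score: float,
    target_iou: float,
    weights: Tuple[float, float, float] = (0.2, 0.3, 0.5),  # hypothetical weights
) -> float:
    """Weighted sum of the three sub-task scores (assumed reward shape)."""
    w_hand, w_ray, w_target = weights
    return w_hand * hand_iou + w_ray * ray_score + w_target * target_iou


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style advantages: normalize each reward within its sampled group."""
    n = len(rewards)
    mean = sum(rewards) / n
    # Population standard deviation; an assumption, implementations differ.
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    return [(r - mean) / (std + eps) for r in rewards]
```

In use, each response sampled for the same image and prompt would be scored with `composite_reward`, and the resulting list passed to `group_relative_advantages`. Responses that score above the group mean receive positive advantages, and a group with identical rewards yields all zeros, so it contributes no learning signal.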