Section 01
Introduction: Challenges in Emotion Understanding for Embodied Robots Watching Movies and the ESE Solution
This article addresses movie emotion understanding from the egocentric perspective of an embodied companion robot. Its core finding is that existing models trained on movie shots suffer a sharp performance drop in real-world viewing scenarios, whereas the EgoScreen-Emotion (ESE) benchmark dataset proposed by the research team can significantly improve model robustness. The study emphasizes that domain-specific data and long-context multimodal reasoning are important for achieving emotional empathy between humans and robots.