Section 01
MemDreamer: A Groundbreaking Solution for Long Video Understanding
MemDreamer is an approach to long video understanding that decouples perception from reasoning. It combines a hierarchical graph memory architecture with an agent-based retrieval mechanism, which recasts long video understanding as an exploration process carried out by an agent. The approach achieves state-of-the-art (SOTA) performance while using only 2% of the context, addressing the token explosion and attention dilution that arise when long videos are processed.
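To make the idea of exploring a hierarchical memory concrete, here is a minimal sketch. It is not MemDreamer's implementation: the names MemoryNode, overlap, and explore are hypothetical, the memory is simplified to a tree of text summaries rather than a full graph, and a word-overlap score stands in for whatever reasoning the real agent performs. It only illustrates why a top-down walk reads a small part of the stored content.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    """Toy memory node (hypothetical): a text summary of one time span."""
    summary: str
    start_s: float
    end_s: float
    children: list[MemoryNode] = field(default_factory=list)


def overlap(query: str, text: str) -> int:
    """Assumed relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def explore(root: MemoryNode, query: str) -> tuple[MemoryNode, list[str]]:
    """Greedy top-down walk: at each level, read only the child summaries
    and descend into the best-scoring child until a leaf is reached."""
    node, read = root, [root.summary]
    while node.children:
        read.extend(child.summary for child in node.children)
        node = max(node.children, key=lambda c: overlap(query, c.summary))
    return node, read


# Made-up example memory for a one-hour video (times in seconds).
video = MemoryNode("full video: cooking show", 0, 3600, [
    MemoryNode("chef prepares the dough", 0, 1800, [
        MemoryNode("chef mixes flour and water", 0, 900),
        MemoryNode("chef kneads the dough", 900, 1800),
    ]),
    MemoryNode("chef bakes and serves the bread", 1800, 3600, [
        MemoryNode("bread goes into the oven", 1800, 2700),
        MemoryNode("chef serves the bread to guests", 2700, 3600),
    ]),
])

leaf, read = explore(video, "when does the bread go into the oven")
print(leaf.summary, leaf.start_s, leaf.end_s)
print(f"summaries read: {len(read)}")
```

In this toy, the walk touches only the summaries along one root-to-leaf path plus their siblings, so the amount of text read grows with the depth of the hierarchy rather than with the length of the video. That is the intuition behind treating retrieval as exploration instead of feeding every frame into the context.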