Section 01
PAI-Bench 2: A New Paradigm for Evaluating Video Generation Models' Physical World Understanding
PAI-Bench 2 is the first comprehensive benchmark for evaluating how well video generation models understand the physical world. It shifts the evaluation paradigm from surface-level visual quality to physical correctness. Its HybridJudge architecture combines an analytical validator (PhysicsJudge) with a multi-LLM ensemble judge to assess, across five tracks, whether generated videos conform to real-world physical laws.
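To make the hybrid idea concrete, here is a minimal sketch of how an analytical score and several LLM judge scores might be merged into one verdict. The section does not say how HybridJudge combines its two components, so everything here is an assumption for illustration: the function name hybrid_judge, the Verdict type, the [0, 1] score scale, and the weighted-average rule are hypothetical and are not PAI-Bench 2's actual implementation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional, Sequence


@dataclass
class Verdict:
    score: float  # assumed scale: 0.0 (physically implausible) to 1.0 (plausible)
    source: str   # which path produced the score


def hybrid_judge(analytic_score: Optional[float],
                 llm_scores: Sequence[float],
                 analytic_weight: float = 0.5) -> Verdict:
    """Merge an analytical check with an ensemble of LLM judge scores.

    Hypothetical combination rule (weighted average); the real
    HybridJudge may combine its components differently.
    """
    if analytic_score is None and not llm_scores:
        raise ValueError("need at least one score")
    if not llm_scores:
        # Only the analytical validator produced a result.
        return Verdict(analytic_score, "analytic")
    ensemble = mean(llm_scores)  # simple average over the LLM judges
    if analytic_score is None:
        # The analytical validator does not apply to this video.
        return Verdict(ensemble, "ensemble")
    combined = analytic_weight * analytic_score + (1 - analytic_weight) * ensemble
    return Verdict(combined, "hybrid")
```

Calling hybrid_judge(1.0, [0.5, 0.5]) would average the two LLM scores and blend the result with the analytical score using equal weights. Passing None as the analytical score falls back to the ensemble alone, which is one plausible way to handle cases that no closed-form physics check covers.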