Section 01
MM-CreativityBench: A New Benchmark for Testing AI's Creative Physical Intelligence
This post introduces MM-CreativityBench, a benchmark for evaluating AI's creative physical intelligence: the ability to find non-obvious but physically feasible uses for objects in a scene. The study's core finding is that current multimodal models fail at such tasks not because of poor generation ability but because they lack sustained, visually grounded exploration. The benchmark and its findings point to key directions for improving AI's ability to solve real-world creative problems.