Section 01
[Introduction] ByteDance Open-Sources BAGEL: Setting a New Standard for Unified Multimodal Models
ByteDance's Seed team recently open-sourced BAGEL (Bagel AI Generated Everything Lab), a unified multimodal foundation model with 7 billion active parameters (14 billion total). The team reports that it outperforms open-source vision-language models such as Qwen2.5-VL and InternVL-2.5 on standard multimodal understanding benchmarks and delivers text-to-image quality competitive with SD3. It also shows "world modeling" capabilities, including free-form visual manipulation, multi-view synthesis, and world navigation, which open up new possibilities for multimodal AI applications.