Section 01
Lumina-DiMOO: A New Paradigm for Multimodal Large Models with Unified Discrete Diffusion Architecture (Introduction)
Lumina-DiMOO, open-sourced by the Alpha-VLLM team, is a multimodal large model built on a fully discrete diffusion architecture. It is designed to unify generation and understanding across modalities such as text and images within a single model. It ranks among the leading open-source unified multimodal models on multiple authoritative benchmarks. The weights are released on HuggingFace, together with the complete inference and training code and a technical report.