Section 01
Panoramic Map of Multimodal Models: Introduction to the Evolution of Architectures from MLLM to NMM
Based on the Awesome Multimodal Modeling resource list maintained by OpenEnvision, this article systematically organizes the development of multimodal AI. It covers four evolutionary stages: traditional multimodal models, multimodal large language models (MLLM), unified multimodal models (UMM), and native multimodal models (NMM). The last three are treated as the field's three core paradigms. The article gives researchers a classification system and an architecture comparison that clarify the field's technical evolution path.