Section 01
Introduction: BGE-SigLIP, an Embedding Model Unifying Multimodal and Cross-Lingual Representations
The BGE-SigLIP project combines the SigLIP-2 visual encoder with the BGE-M3 text encoder to build a unified vector space. This shared space supports cross-lingual image-text retrieval in more than 100 languages, which opens up new options for RAG applications and cross-lingual image search. The project is maintained by Aeluin-Technologies and was released on GitHub on May 26, 2026 (link: https://github.com/Aeluin-Technologies/BGE-SigLIP).
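To make the idea of a unified vector space concrete, the sketch below shows how retrieval could work once images and text in any language are mapped into the same space: the query embedding is compared with each image embedding by cosine similarity and the best match is returned. It does not use the BGE-SigLIP code or API. The function names (`l2_normalize`, `cosine_top_k`), the file names, and the three-dimensional toy vectors are all hypothetical stand-ins for real model outputs.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_top_k(query, candidates, k=1):
    """Return the k candidates most similar to the query as (name, score) pairs."""
    q = l2_normalize(query)
    scored = []
    for name, vec in candidates.items():
        c = l2_normalize(vec)
        score = sum(a * b for a, b in zip(q, c))
        scored.append((score, name))
    scored.sort(reverse=True)  # highest similarity first
    return [(name, score) for score, name in scored[:k]]


# Hypothetical image embeddings (toy values, not produced by any real encoder).
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "car.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a non-English text query such as "eine Katze".
query_embedding = [0.8, 0.2, 0.1]

results = cosine_top_k(query_embedding, image_embeddings, k=1)
print(results[0][0])  # name of the best-matching image
```

In this toy setup the query vector lies closest to the `cat.jpg` vector, so that image ranks first. In a real system, the vectors would come from the image and text encoders, and the language of the query would not matter as long as both encoders project into the same space.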