Section 01
[Introduction] MM-Fundus-CLIP: Research on a Multimodal Foundation Model for Fundus Images Integrating Large Language Models and CLIP
MM-Fundus-CLIP explores building a foundation model for fundus images by combining the CLIP contrastive learning architecture with large language models, with the goal of unified representation learning and cross-modal understanding of ophthalmic multimodal data. The project is maintained by myeongkyunkang and was published on GitHub (https://github.com/myeongkyunkang/mmfundusclip) in June 2026.
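
For readers unfamiliar with CLIP-style training, the contrastive objective can be sketched as follows. This is a minimal, generic illustration of a symmetric image-text contrastive loss written in NumPy, not the project's actual code: the function names, the embedding shapes, and the fixed temperature of 0.07 are all assumptions made for this example, and the repository may implement the objective differently.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_emb, text_emb: arrays of shape (batch, dim); row i of each is
    assumed to describe the same fundus image / text pair (hypothetical setup).
    temperature: assumed fixed here; in practice it is often learned.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)

    # Pairwise cosine similarities, scaled by the temperature.
    logits = img @ txt.T / temperature

    # The matching pair for row i sits on the diagonal.
    idx = np.arange(logits.shape[0])
    loss_image_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_text_to_image = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_image_to_text + loss_text_to_image)
```

The loss is low when each image embedding is most similar to its own text embedding and high when the pairing is scrambled, which is what pulls the two modalities into a shared representation space.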