# Fundus-R1: A Knowledge-Aware Multimodal Large Model for Fundus Image Analysis Trained on Public Data

> This article introduces the Fundus-R1 model, the first multimodal large model for fundus image analysis trained exclusively on public datasets. Using RAG to generate knowledge-aware reasoning chains and RLVR enhanced by process rewards, it outperforms general-purpose models on multiple benchmarks.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-09T14:55:22.000Z
- Last activity: 2026-04-10T02:18:10.523Z
- Popularity: 130.6
- Keywords: Fundus-R1, fundus image analysis, multimodal large model, RAG, reinforcement learning, medical AI, public-data training, knowledge-aware reasoning
- Page link: https://www.zingnex.cn/en/forum/thread/fundus-r1
- Canonical: https://www.zingnex.cn/forum/thread/fundus-r1
- Markdown source: floors_fallback

---

## [Introduction] Fundus-R1: The First Knowledge-Aware Multimodal Large Model for Fundus Images Trained on Public Data

Existing fundus MLLMs (multimodal large language models) depend on in-house data that other research groups cannot access. Fundus-R1 removes that barrier by training only on public datasets, which opens a new path toward democratizing medical AI.

## [Background] Importance of Fundus Diagnosis and Data Barriers of Existing Methods

Fundus imaging is a core tool for screening ophthalmic disease, but a shortage of trained ophthalmologists keeps screening coverage low. Existing high-performance fundus MLLMs rely on in-house datasets, which hinders reproducibility. Public data is a harder starting point: 94% of it carries only image-level labels, and the lack of fine-grained annotations limits model training.

## [Methodology] Two Key Technical Innovations of Fundus-R1

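1. **RAG-driven reasoning chain**: Using retrieval-augmented generation (RAG), the pipeline extracts visual features from the image, retrieves matching entries from a medical knowledge base, and constructs a reasoning chain that leads from the features to the diagnosis. The chain provides both an interpretable basis for the diagnosis and a supervision signal for training.
2. **Process-reward-enhanced RLVR**: Reinforcement learning with verifiable rewards (RLVR) is extended with a process reward that evaluates the logical coherence and knowledge correctness of the reasoning chain, which incentivizes rigorous and reliable diagnostic reports.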
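
The two ideas above can be illustrated with a toy sketch. It is not the authors' implementation: the knowledge base entries, the exact-match lookup standing in for retrieval, the `ReasoningChain` type, and the `process_weight` parameter are all hypothetical choices made for illustration. The sketch builds a reasoning chain by keeping only retrieved knowledge that links an observed finding to the image-level label, then scores a chain with a verifiable outcome reward plus a simple process reward.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical toy knowledge base: finding -> (diagnosis, supporting statement).
# Illustrative placeholders only, not the knowledge base used by Fundus-R1.
KNOWLEDGE_BASE = {
    "microaneurysms": (
        "diabetic retinopathy",
        "Microaneurysms are an early sign of diabetic retinopathy.",
    ),
    "enlarged cup-to-disc ratio": (
        "glaucoma",
        "An enlarged cup-to-disc ratio suggests glaucomatous damage.",
    ),
    "drusen": (
        "age-related macular degeneration",
        "Drusen are associated with age-related macular degeneration.",
    ),
}


@dataclass
class ReasoningChain:
    findings: list[str]   # visual findings cited by the chain
    evidence: list[str]   # knowledge statements supporting the findings
    diagnosis: str        # final conclusion of the chain


def retrieve(findings: list[str]) -> list[tuple[str, str, str]]:
    """Exact-match lookup, used here as a stand-in for real retrieval."""
    return [(f, *KNOWLEDGE_BASE[f]) for f in findings if f in KNOWLEDGE_BASE]


def build_chain(findings: list[str], label: str) -> ReasoningChain:
    """Keep only retrieved knowledge that links a finding to the image-level label."""
    hits = [(f, stmt) for f, dx, stmt in retrieve(findings) if dx == label]
    return ReasoningChain([f for f, _ in hits], [s for _, s in hits], label)


def outcome_reward(predicted: str, label: str) -> float:
    """Verifiable reward: 1.0 when the final diagnosis matches the label."""
    return 1.0 if predicted == label else 0.0


def process_reward(chain: ReasoningChain) -> float:
    """Fraction of cited findings whose knowledge entry agrees with the conclusion."""
    if not chain.findings:
        return 0.0
    supported = sum(
        1
        for f in chain.findings
        if KNOWLEDGE_BASE.get(f, (None, None))[0] == chain.diagnosis
    )
    return supported / len(chain.findings)


def total_reward(chain: ReasoningChain, label: str, process_weight: float = 0.5) -> float:
    """Outcome reward plus a weighted process reward (the weighting is an assumption)."""
    return outcome_reward(chain.diagnosis, label) + process_weight * process_reward(chain)


if __name__ == "__main__":
    chain = build_chain(["microaneurysms", "drusen"], "diabetic retinopathy")
    print(chain)
    print(total_reward(chain, "diabetic retinopathy"))
```

In this sketch a chain that reaches the correct label while citing an unrelated finding scores lower than one whose every cited finding supports the conclusion, which is the kind of behavior a process reward is meant to encourage.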

## [Evidence] Experimental Validation and Ablation Study Results

Fundus-R1 significantly outperforms baselines such as Qwen2.5-VL on three benchmarks: FunBench, Omni-Fundus, and GMAI-Fundus. Ablation studies show that combining RAG with process rewards yields the best results, and that even a small knowledge base improves performance. The model's advantages show up in classification accuracy, the soundness of its reasoning, and generalization.

## [Conclusion] Significance and Impact of Fundus-R1

Fundus-R1 challenges the assumption that high performance requires proprietary data, and it provides an open-source, reproducible baseline that can accelerate progress in ophthalmic AI. It also advances the democratization of medical AI: more institutions can take part in research and development, which ultimately benefits a wider range of patients.

## [Future Directions] Limitations and Follow-up Research Plans

Limitations: public data lacks diversity, and the generated reasoning chains still fall short of expert-level reasoning. Future directions: expand the knowledge base to cover rare diseases, refine reasoning chains through human-machine collaboration, and extend the approach to other imaging modalities such as OCT (optical coherence tomography) and UWF (ultra-widefield) imaging.
