Section 01
Arabic_IC Project Introduction: Research on Multi-Model Arabic Image Caption Generation
The Arabic_IC project aims to fill the gap in image caption generation for low-resource languages such as Arabic. It systematically evaluates mainstream large generative models, including Google Gemini, Gemma, and Llama, on this task. Using the Flickr dataset, the project explores how far modern vision-language models can go in generating high-quality, semantically rich, and linguistically coherent Arabic captions. More broadly, it focuses on advancing AI technology for low-resource languages and on making that technology fairly accessible worldwide.