Section 01
Fine-Tuning a Multimodal Large Model for OCR in Practice: A Guide to the Combined LoRA+GRPO+ICL Optimization Scheme
This undergraduate graduation project demonstrates how to fine-tune the multimodal large language model Qwen3VL with LoRA (Low-Rank Adaptation) and GRPO (Group Relative Policy Optimization), and how to apply ICL (In-Context Learning) at inference time to improve performance on OCR tasks. Using the CTW and CASIA datasets, the project provides a complete optimization scheme for multimodal OCR models: LoRA and GRPO act during training, ICL acts during inference, and together the three techniques cover the full path from training to inference.
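To make the GRPO part of the scheme more concrete, the following is a minimal sketch of its central idea: several candidate transcriptions are sampled for the same image, each is scored, and each score is standardized against the other candidates in the same group. The reward used here (1 minus the normalized edit distance to the reference text) is an assumption chosen for illustration and is not necessarily the reward used in this project. The function names `edit_distance`, `ocr_reward`, and `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are likewise hypothetical choices.

```python
from statistics import mean, pstdev


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ocr_reward(prediction: str, reference: str) -> float:
    """Hypothetical OCR reward in [0, 1]: 1 minus normalized edit distance."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 1.0  # both strings empty: treat as a perfect match
    return 1.0 - edit_distance(prediction, reference) / longest


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of sampled outputs.

    Each advantage is (reward - group mean) / (group std + eps), so an
    output is judged relative to its siblings and no separate value
    model is needed. The population std is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three candidate transcriptions sampled for one image (made-up strings).
reference = "北京大学"
candidates = ["北京大学", "北京大字", "北京"]
rewards = [ocr_reward(c, reference) for c in candidates]
advantages = group_relative_advantages(rewards)
```

In this toy group the exact transcription receives a positive advantage, the truncated one a negative advantage, and the one with a single wrong character sits at the group mean. In an actual training run these advantages would weight the policy-gradient update applied to the LoRA adapter parameters; that step depends on the training framework and is omitted here.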