Section 01
Introduction to the RefDiff Framework: Fine-Grained Industrial Anomaly Detection Based on Multimodal Large Language Models
RefDiff is an open-source, reference-conditioned difference framework that draws on the LLaVA architecture. It applies multimodal large language models to industrial anomaly detection, with the goal of more precise fine-grained defect recognition. Its core idea is to combine a multimodal model with difference learning and to supply reference images as conditions, which is intended to improve both detection accuracy and interpretability.
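The idea of conditioning detection on a reference image can be illustrated with a minimal sketch. This is not RefDiff's implementation: the framework is described as working with a LLaVA-style multimodal model, whereas the code below uses raw pixel patches as a stand-in for learned features and a mean absolute difference as a stand-in for difference learning. The names `patch_features` and `reference_difference_map` and the `patch` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def patch_features(image: np.ndarray, patch: int = 8) -> np.ndarray:
    """Split an HxW grayscale image into non-overlapping patches.

    Returns an array of shape (H // patch, W // patch, patch * patch).
    Raw pixels stand in for the learned visual features a real model would use.
    """
    h, w = image.shape
    gh, gw = h // patch, w // patch
    # Drop any border rows/columns that do not fill a whole patch.
    cropped = image[: gh * patch, : gw * patch].astype(np.float64)
    grid = cropped.reshape(gh, patch, gw, patch).transpose(0, 2, 1, 3)
    return grid.reshape(gh, gw, patch * patch)


def reference_difference_map(
    reference: np.ndarray, query: np.ndarray, patch: int = 8
) -> np.ndarray:
    """Score each query patch by how much it deviates from the reference patch.

    Assumption: the two images are spatially aligned and have the same shape.
    """
    if reference.shape != query.shape:
        raise ValueError("reference and query must have the same shape")
    ref = patch_features(reference, patch)
    qry = patch_features(query, patch)
    # Mean absolute difference per patch: 0 means identical to the reference.
    return np.abs(qry - ref).mean(axis=-1)


if __name__ == "__main__":
    reference = np.full((32, 32), 0.5)
    query = reference.copy()
    query[8:16, 16:24] = 1.0  # synthetic "defect" covering exactly one patch
    scores = reference_difference_map(reference, query)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    print(f"most anomalous patch: row={row}, col={col}")
```

The sketch only shows why a reference helps: the score is defined relative to what the reference looks like at the same location, so the highlighted region also indicates where the query deviates. How RefDiff itself encodes the reference and the difference is not specified in this section.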