Section 01
Introduction / Main Post: ProjLens Reveals Backdoor Attack Mechanisms in the Projection Layers of Multimodal Large Models
Multimodal Large Language Models (MLLMs) have achieved remarkable success in cross-modal understanding and generation, but security vulnerabilities pose a serious threat to their deployment. ProjLens is an interpretability framework designed to reveal how backdoor attacks operate in MLLMs. The study finds that even routine downstream task alignment that fine-tunes only the projection layers can introduce backdoor injection vulnerabilities, and that the resulting backdoors are activated through mechanisms that differ from those observed in text-only LLMs.