Section 01
[Introduction] Research on Repeated Jailbreak Attacks on Multimodal Large Language Models: Security Risks of Vision-Language Alignment
This study explores repeated jailbreak attacks against Multimodal Large Language Models (MLLMs). It combines adversarial images with text prompts to probe and bypass the safety alignment mechanisms of models such as MiniGPT-4 and mPLUG-Owl2. The research reveals new security challenges introduced by the visual modality and offers a reference for improving AI safety alignment. The original project is hosted on GitHub (maintained by shrrynsh, released on May 27, 2026).