Section 01
AttentionJailbreak: Key Findings on LVLM Security Vulnerability via Attention Hijacking (ACL 2026)
This post summarizes the ACL 2026 study "AttentionJailbreak", which reveals a fundamental security flaw in Large Vision-Language Models (LVLMs). Rather than overriding safety alignment, the attack manipulates the model's attention mechanism, achieving an attack success rate of up to 94.4%. Key details are broken out in the sections below.
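To make the idea of "attention hijacking" concrete before diving into the details, here is a minimal, hypothetical sketch. It is not the paper's actual method: it simply shows, in PyTorch, how an additive bias on attention logits can redirect attention mass onto attacker-chosen token positions (e.g., adversarial image tokens). The function name, `bias_strength` knob, and `target_positions` argument are all illustrative assumptions.

```python
# Hypothetical sketch of attention hijacking: biasing attention logits so the
# model attends to attacker-chosen key positions. Illustrative only; this is
# not the mechanism described in the paper.
import torch
import torch.nn.functional as F

def hijacked_attention(q, k, v, target_positions, bias_strength=5.0):
    """Scaled dot-product attention with an additive bias that pushes
    attention mass onto `target_positions` (an assumed attack knob)."""
    d = q.size(-1)
    logits = q @ k.transpose(-2, -1) / d ** 0.5    # (batch, q_len, k_len)
    bias = torch.zeros_like(logits)
    bias[..., target_positions] = bias_strength    # amplify chosen key positions
    weights = F.softmax(logits + bias, dim=-1)     # hijacked attention map
    return weights @ v, weights

# Toy usage: 1 batch, 8 query tokens attending over 16 key/value tokens.
q = torch.randn(1, 8, 32)
k = torch.randn(1, 16, 32)
v = torch.randn(1, 16, 32)
out, w = hijacked_attention(q, k, v, target_positions=[3, 4])
print(w[0].sum(dim=0)[[3, 4]])  # positions 3 and 4 now absorb most attention
```

The point of the sketch is only that steering where the model looks is a different lever from rewriting what it is aligned to refuse, which is the distinction the paper's threat model rests on.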