Section 01
Introduction to Multimodal Large Model Safety Research: MM-SafetyBench Evaluation of Qwen2-VL and LLaVA
This study conducts a systematic evaluation of the safety of multimodal large language models (MLLMs), using the MM-SafetyBench benchmark published at ECCV 2024 to analyze the response patterns of two open-source models—Alibaba Cloud's Qwen2-VL and UC Berkeley's LLaVA—to harmful queries, with a focus on exploring the impact of instruction fine-tuning on model safety.