Zing Forum


MPCI-Bench: A New Benchmark for Evaluating the Contextual Integrity of VLM Agents

MPCI-Bench is a multimodal benchmark for evaluating the contextual integrity of Visual Language Model agents. It uses a pairwise contrast method to test whether models can judge the appropriate transmission of sensitive information across different contexts.

Tags: MPCI-Bench, Contextual Integrity, Visual Language Models, Privacy Evaluation, Multimodal Benchmark, Agent Evaluation, VLM, Privacy Protection
Published 2026/05/06 18:14 · Last activity 2026/05/06 18:21 · Estimated reading time: 8 minutes
Section 01

MPCI-Bench: A New Benchmark for Evaluating Contextual Integrity of VLM Agents

MPCI-Bench is a multimodal benchmark designed to assess the contextual integrity of Visual Language Model (VLM) agents. It uses a pairwise contrast method to test models' ability to judge appropriate information transmission across different contexts, addressing the limitations of traditional binary privacy assessment approaches. This benchmark is based on Helen Nissenbaum's Contextual Integrity theory and covers multiple evaluation tasks to comprehensively measure privacy perception in VLMs.

Section 02

Background and Motivation for MPCI-Bench

With the rapid development of large language models and multimodal agents, privacy protection has become an increasingly prominent concern. Traditional privacy assessment methods treat privacy as a binary attribute (information is either private or not), ignoring its highly contextual nature: the same information may be appropriate to share in one context but not in another. Helen Nissenbaum's 2004 Contextual Integrity theory provides a finer-grained framework, holding that the appropriateness of an information flow depends on whether it complies with the norms of the context in which the data was originally shared. MPCI-Bench is designed around this theory to evaluate the privacy awareness of VLM agents.

Section 03

Core Design of MPCI-Bench

Unlike existing privacy benchmarks, MPCI-Bench adopts a pairwise contrast design. Rather than asking whether a piece of information is private, it challenges models to distinguish appropriate from inappropriate transmission of the same sensitive data in highly similar contexts. This design controls for confounding variables, allowing precise measurement of a model's understanding of contextual norms rather than its general language or common-sense reasoning ability.

Section 04

Dataset Composition of MPCI-Bench

MPCI-Bench contains 2,052 carefully designed test cases (1,026 matched positive/negative pairs). Each case includes:

  1. Context Seed: Abstract parameters like sender, subject, receiver, data type, transmission method, principles, and application domain, covering various daily privacy scenarios.
  2. Concrete Narrative: Natural language story derived from seed parameters (e.g., medical info sharing between patients, doctors, or insurers).
  3. Agent Trace: User instructions, tool lists, ReAct-style tool calls, and target final action type, simulating real agent decision processes.
  4. Image Metadata: Image paths from VISPR dataset and sensitivity labels, introducing visual privacy assessment.
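The four components above could be modeled roughly as follows. This is an illustrative sketch only: all class and field names are assumptions for exposition, not the benchmark's actual serialization schema.

```python
from dataclasses import dataclass

# Hypothetical schema for one MPCI-Bench test case; names are illustrative.
@dataclass
class ContextSeed:
    sender: str
    subject: str
    receiver: str
    data_type: str
    transmission_method: str
    principle: str
    domain: str

@dataclass
class TestCase:
    seed: ContextSeed
    narrative: str                 # natural-language story derived from the seed
    agent_trace: list[str]         # ReAct-style tool calls plus the final action
    image_path: str                # image drawn from the VISPR dataset
    sensitivity_labels: list[str]  # VISPR sensitivity labels for that image
    appropriate: bool              # gold label: is this information flow appropriate?

# A matched pair shares the same sensitive data but differs in context.
positive = TestCase(
    seed=ContextSeed("patient", "patient", "doctor", "medical record",
                     "secure portal", "care provision", "healthcare"),
    narrative="A patient uploads recent lab results to their doctor's portal.",
    agent_trace=["search_records('lab results')", "send(receiver='doctor')"],
    image_path="vispr/0001.jpg",
    sensitivity_labels=["medical_history"],
    appropriate=True,
)
negative = TestCase(
    seed=ContextSeed("patient", "patient", "insurer", "medical record",
                     "email", "marketing", "healthcare"),
    narrative="The same lab results are forwarded to an insurance marketer.",
    agent_trace=["search_records('lab results')", "send(receiver='insurer')"],
    image_path="vispr/0001.jpg",
    sensitivity_labels=["medical_history"],
    appropriate=False,
)
```

Note how the matched pair holds the data type and image fixed and varies only the contextual parameters, mirroring the confound-control design described above.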

Section 05

Multidimensional Evaluation Tasks in MPCI-Bench

MPCI-Bench includes five complementary tasks:

  1. CI Probing: Judge if information transmission in a context is appropriate (Yes/No), evaluated by overall accuracy and layered performance.
  2. Sensitive Grounding: Identify sensitive regions in images (list VISPR labels), evaluated by case accuracy and label recall.
  3. Sensitive Sharing: Binary classification of whether sharing an image in a context involves sensitive info (cross-modal privacy test).
  4. Final-Action Generation: Generate executable final actions for agents, evaluated via structured output comparison.
  5. Leakage Judging: Analyze if agent traces have info leaks (score text/image leakage and usefulness), simulating post-audit scenarios.
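For the CI Probing task, one natural way to score the matched pairs (a sketch of the idea, not necessarily the benchmark's published metric) is to count a pair as correct only when the model accepts the appropriate flow and rejects the inappropriate one, which penalizes models that answer the same way regardless of context:

```python
def pairwise_accuracy(pairs, predict):
    """Score matched (appropriate, inappropriate) context pairs.

    A pair counts as correct only when the model judges the appropriate
    variant 'yes' AND the inappropriate variant 'no'.
    """
    correct = sum(
        1 for pos, neg in pairs
        if predict(pos) is True and predict(neg) is False
    )
    return correct / len(pairs)

# Toy contexts and a toy model that keys on the receiver in the text.
toy_pairs = [
    ("share medical record with doctor", "share medical record with insurer"),
    ("send grades to the student", "send grades to an advertiser"),
]
toy_predict = lambda ctx: ("doctor" in ctx) or ("student" in ctx)
print(pairwise_accuracy(toy_pairs, toy_predict))  # 1.0
```

A model that always answers "yes" scores 0.0 under this metric even though its per-case accuracy would be 50%, which is exactly the confound the pairwise design is meant to expose.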

Section 06

Technical Implementation and Usage of MPCI-Bench

MPCI-Bench uses Python 3.10+ and supports installation via uv/pip. For API-based models (e.g., GPT-4o, GPT-5), configure Azure OpenAI or compatible endpoints; local models can integrate via vLLM. Key commands:

  • Validate dataset: python -m mpci_bench.validate
  • Run action generation evaluation: python evaluate.py action --model gpt-5.4 --output eval/action/gpt-5.4.csv
  • Run leakage evaluation: python evaluate.py leakage --action-path eval/action/gpt-5.4.csv --judge gpt-5.4 --output eval/leakage/gpt-5.4.json

A stable data-loading interface (mpci_bench.data) is provided to avoid compatibility issues caused by hard-coded fields.
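The two evaluate.py stages are chained through the action CSV: the action stage writes one final action per case, and the leakage stage reads that file back. A minimal sketch of the handoff, with column names assumed purely for illustration (the benchmark's actual schema may differ):

```python
import csv
import io
import json

# Hypothetical action CSV as the action stage might emit it; the
# "case_id" / "final_action" columns are illustrative assumptions.
action_csv = io.StringIO()
writer = csv.DictWriter(action_csv, fieldnames=["case_id", "final_action"])
writer.writeheader()
writer.writerow({
    "case_id": "pair_0001_pos",
    "final_action": json.dumps({"tool": "send_email", "args": {"to": "doctor"}}),
})

# The leakage stage would read each row and score the final action
# for text/image leakage and usefulness in the case's context.
action_csv.seek(0)
for row in csv.DictReader(action_csv):
    action = json.loads(row["final_action"])
    print(row["case_id"], action["tool"])
```

Serializing the final action as structured JSON rather than free text is what makes the "structured output comparison" of the Final-Action Generation task possible.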

Section 07

Limitations and Ethical Considerations of MPCI-Bench

MPCI-Bench has several limitations:

  1. It focuses on contextual-integrity behavior, not general privacy knowledge or de-anonymization, so it cannot replace a full privacy assessment.
  2. It inherits biases from the VISPR/Flickr Creative Commons image data and from the synthetic narratives and traces.
  3. Its simulated agent traces may not reflect the complexity of real production deployments.

Ethically, the benchmark contains no real private data (emails, Slack messages, etc.) and should be used only for evaluation and auditing, not for training privacy-violating systems.

Section 08

Practical Significance and Future Outlook of MPCI-Bench

MPCI-Bench provides a critical tool for evaluating the privacy behavior of VLM agents. As agents become more prevalent, understanding contextual integrity is key to trustworthy AI. The benchmark helps identify models' privacy blind spots and can guide the design of next-generation context-aware agents. Future work could extend it to more modalities (audio/video), complex multi-round interactions, and cross-cultural studies of contextual norms, which would be especially valuable for sensitive domains such as healthcare, finance, and education.