Section 01
[Introduction] SafetyALFRED: An Introduction to the Safety-Aware Planning Evaluation Framework for Multimodal Large Language Models
SafetyALFRED is an open-source evaluation framework, accepted to Findings of ACL 2026 and developed by the SLED Lab at the University of Michigan. It extends the classic ALFRED benchmark to address a gap left by traditional benchmarks, which ignore safety constraints, and provides a standardized platform for evaluating the planning capabilities of multimodal large language models (MLLMs) in safety-sensitive scenarios.