Zing Forum

Safety Risks of Embodied Intelligence: Imbalance Between Planning Capability and Safety Awareness of Large Language Models

The DESPITE benchmark reveals that large language models (LLMs) exhibit a mismatch between planning capability and safety awareness in robot planning tasks; even models with near-100% planning accuracy still have a 28.3% probability of generating dangerous plans.

Embodied intelligence, robot safety, large language models, planning systems, safety evaluation, reasoning models
Published 2026-04-21 00:18 | Recent activity 2026-04-21 11:50 | Estimated read 5 min

Section 01

[Introduction] Safety Risks of Embodied Intelligence: Imbalance Between Planning Capability and Safety Awareness of LLMs

Results from the DESPITE benchmark show that large language models (LLMs) have a significant imbalance between planning capability and safety awareness in robot planning tasks. Even models with near-100% planning accuracy still have a 28.3% probability of generating dangerous plans. This finding is an important warning for the safe deployment of embodied intelligence.

Section 02

Background: The Safety Paradox of Embodied Intelligence

LLM-driven planning systems are now used in physical settings such as household service robots, industrial robots, and autonomous driving. A common assumption is that strong planning capability naturally brings safety with it. The research shows instead that planning capability and safety awareness are largely independent dimensions: a model can excel at planning while ignoring potential dangers.

Section 03

Methodology: The DESPITE Benchmark Framework

The research team developed the DESPITE benchmark, which contains 12,279 tasks covering two major hazard categories: physical hazards (collision, fall, electric shock, etc.) and normative hazards (violations of ethics or law). Its verification mechanism is fully deterministic, so test results are objective and reproducible and do not depend on subjective judgment.
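To make "deterministic verification" concrete, here is a minimal sketch of how a plan could be checked by replaying it against a symbolic state. The task format, the action table, and the `verify` function are all hypothetical and invented for illustration; the article does not describe how DESPITE itself encodes tasks or hazards.

```python
from dataclasses import dataclass


# Hypothetical task format, invented for this illustration only.
@dataclass(frozen=True)
class Task:
    initial: frozenset   # facts true at the start
    goal: frozenset      # facts that must hold at the end
    hazards: frozenset   # facts that must never become true


# Hypothetical action model: name -> (facts added, facts deleted).
ACTIONS = {
    "pick_up_knife": (frozenset({"holding_knife"}), frozenset()),
    "hand_knife_blade_first": (
        frozenset({"human_has_knife", "blade_toward_human"}),
        frozenset({"holding_knife"}),
    ),
    "hand_knife_handle_first": (
        frozenset({"human_has_knife"}),
        frozenset({"holding_knife"}),
    ),
}


def verify(task, plan):
    """Replay the plan and return (goal_reached, safe).

    The same task and plan always give the same verdict, because the
    check is a pure state replay with no judge model or human rater.
    """
    state = set(task.initial)
    safe = True
    for name in plan:
        added, deleted = ACTIONS[name]
        state = (state - deleted) | added
        if state & task.hazards:  # a hazardous fact became true
            safe = False
    return task.goal <= state, safe


task = Task(
    initial=frozenset(),
    goal=frozenset({"human_has_knife"}),
    hazards=frozenset({"blade_toward_human"}),
)
print(verify(task, ["pick_up_knife", "hand_knife_blade_first"]))
print(verify(task, ["pick_up_knife", "hand_knife_handle_first"]))
```

In this toy example both plans reach the goal, but only the second one avoids the hazard. That is the kind of gap between planning success and safety that the benchmark is designed to measure.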

Section 04

Key Evidence: Decoupling of Capability and Safety, and Advantages of Reasoning Models

  1. Scale effect of planning capability: For open-source models, as parameter count increases from 3 billion to 671 billion, planning accuracy rises from 0.4% to 99.3%, but safety awareness increases only modestly, from 38% to 57%;
  2. Dangerous plan generation rate: Even the best-performing model still has a 28.3% probability of generating dangerous plans;
  3. Multiplicative relationship hypothesis: Probability of safe task completion = planning accuracy × safety awareness;
  4. Advantage of reasoning models: Proprietary reasoning models reach safety awareness of 71%-81%, while open-source reasoning models show no such advantage.
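
The multiplicative hypothesis in item 3 can be worked through with the open-source endpoints from item 1. This is only an arithmetic sketch: the function name is made up, and it assumes the reported endpoint figures (0.4% with 38%, 99.3% with 57%) describe the same smallest and largest models.

```python
def safe_completion_rate(planning_accuracy, safety_awareness):
    """Multiplicative hypothesis from the article:
    P(safe completion) = planning accuracy x safety awareness.
    Both inputs are fractions in [0, 1].
    """
    for value in (planning_accuracy, safety_awareness):
        if not 0.0 <= value <= 1.0:
            raise ValueError("rates must be fractions in [0, 1]")
    return planning_accuracy * safety_awareness


# Endpoint figures quoted in the list above (assumed to pair up per model).
small = safe_completion_rate(0.004, 0.38)  # 3B parameters
large = safe_completion_rate(0.993, 0.57)  # 671B parameters

print(f"smallest open-source model: {small:.2%}")
print(f"largest open-source model:  {large:.2%}")
```

Under this hypothesis the largest model completes roughly 57% of tasks safely despite near-perfect planning. Once planning accuracy approaches 100%, safety awareness becomes the limiting factor.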

Section 05

Conclusions and Implications: Core Challenges for Safe Deployment

  1. Deployments must build multi-layered safety barriers (explicit safety checks, human supervision, physical constraints, etc.) rather than relying solely on the model's inherent capabilities;
  2. Training paradigms need to make safety awareness a core objective instead of treating it as an afterthought;
  3. Evaluation criteria should cover dimensions such as safety and robustness, not only task completion rate.
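
The layered barriers in item 1 could be wired together along the following lines. This is a sketch under stated assumptions, not a design from the article: the plan format (action name plus commanded speed), the deny-list, the operator callback, and the speed limit are all hypothetical placeholders.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, float]             # (action name, commanded speed in m/s)
Plan = List[Step]
Layer = Callable[[Plan], Tuple[bool, str]]


def rule_check(plan: Plan) -> Tuple[bool, str]:
    # Layer 1: explicit safety check against a hypothetical deny-list.
    banned = {"disable_emergency_stop", "bypass_interlock"}
    bad = [name for name, _ in plan if name in banned]
    if bad:
        return False, f"banned actions: {bad}"
    return True, "rules ok"


def make_human_review(approve: Callable[[Plan], bool]) -> Layer:
    # Layer 2: human supervision, modeled as an operator callback.
    def check(plan: Plan) -> Tuple[bool, str]:
        if approve(plan):
            return True, "approved by operator"
        return False, "rejected by operator"
    return check


def make_speed_limit(max_speed: float) -> Layer:
    # Layer 3: a stand-in for a physical constraint enforced outside the model.
    def check(plan: Plan) -> Tuple[bool, str]:
        for name, speed in plan:
            if speed > max_speed:
                return False, f"{name} exceeds {max_speed} m/s"
        return True, "within speed limit"
    return check


def gate(plan: Plan, layers: List[Layer]) -> Tuple[bool, str]:
    """Run every layer in order and stop at the first one that refuses."""
    for layer in layers:
        ok, reason = layer(plan)
        if not ok:
            return False, reason
    return True, "all layers passed"


layers = [rule_check, make_human_review(lambda plan: True), make_speed_limit(0.5)]
print(gate([("move_to_table", 0.3)], layers))
print(gate([("move_to_table", 2.0)], layers))
```

The point of the structure is that a plan executes only if every independent layer accepts it, so a lapse in the model's own safety awareness does not go straight to the actuators.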

Section 06

Future Research Directions

  1. Investigate the mechanisms behind safety awareness and whether it can be transferred to other models;
  2. Develop safety enhancement technologies (post-training alignment, safety prompt engineering, etc.);
  3. Extend the evaluation framework to scenarios such as execution monitoring, anomaly handling, and human-machine collaboration.

Section 07

Conclusion: Safety is the Precondition for the Development of Embodied Intelligence

LLMs have broad application prospects in embodied intelligence, but the imbalance between planning capability and safety awareness is a systemic issue. Addressing it requires joint effort from academia, industry, and regulatory agencies so that embodied intelligence can benefit human society safely.