Summary

A 2024 empirical study showing that LLM/VLM-controlled robotic systems are fragile under simple input perturbations: with instruction rephrasing and visual noise, task success drops by 22.2% and 14.6% in two representative systems. The paper argues that these systems need adversarial robustness hardening before any safety-critical deployment.

Key Points

  • 22.2% / 14.6% success rate drops from simple perturbations — no sophisticated adversarial attack required
  • Dual vulnerability: both language modality (instruction rephrasing) and visual modality (image noise/shift) cause failure
  • Error cascade: LLM/VLM reasoning errors propagate directly to robot actuation without intermediate verification
  • White-box and black-box attacks are both effective: exploiting the system does not require access to model internals
  • Framing: safety = reliable task execution (not constraint satisfaction) — bridges robotics reliability and ML robustness research
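The success-rate drops listed above can be measured with a simple harness. The following is a minimal sketch under stated assumptions, not the paper's evaluation code: `toy_policy`, the episode format, and the noise level are hypothetical stand-ins, and the drop is computed as an absolute difference in percentage points, which is an assumption about how the reported figures are defined.

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]              # toy grayscale image, values in [0, 1]
Episode = Tuple[str, Image]            # (instruction, observation)
Policy = Callable[[str, Image], bool]  # True if the task succeeded


def add_gaussian_noise(image: Image, sigma: float, rng: random.Random) -> Image:
    """Return a copy of the image with per-pixel Gaussian noise, clipped to [0, 1]."""
    return [[min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in image]


def success_rate(policy: Policy, episodes: Sequence[Episode]) -> float:
    """Fraction of episodes in which the policy reports success."""
    return sum(policy(instr, img) for instr, img in episodes) / len(episodes)


def success_drop(policy: Policy, clean: Sequence[Episode],
                 perturbed: Sequence[Episode]) -> float:
    """Absolute drop in success rate, in percentage points (assumed definition)."""
    return 100.0 * (success_rate(policy, clean) - success_rate(policy, perturbed))


def toy_policy(instruction: str, image: Image) -> bool:
    """Hypothetical brittle policy: needs the exact verb and a bright image."""
    pixels = [px for row in image for px in row]
    return instruction.startswith("pick up") and sum(pixels) / len(pixels) > 0.5


bright = [[0.9] * 4 for _ in range(4)]
clean = [("pick up the red block", bright)] * 4
# Language perturbation: half of the instructions are paraphrased.
rephrased = clean[:2] + [("grab the red block", bright)] * 2
# Visual perturbation: Gaussian noise on every observation.
rng = random.Random(0)
noisy = [(instr, add_gaussian_noise(img, 0.1, rng)) for instr, img in clean]

rephrase_drop = success_drop(toy_policy, clean, rephrased)
noise_drop = success_drop(toy_policy, clean, noisy)
print(f"rephrasing: {rephrase_drop:.1f} points, noise: {noise_drop:.1f} points")
```

In a real evaluation, `toy_policy` would be replaced by a full rollout of the robot system, and both perturbation types would be run over the same set of clean episodes so the two drops are comparable.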

Insights

A drop of up to 22.2% in task success under perturbations as simple as rephrasing is alarming because natural language is inherently paraphrastic: real users will inevitably phrase instructions differently from training prompts. This is not a contrived adversarial threat but a realistic deployment scenario.

The paper implicitly identifies a fundamental problem with “LLM/VLM as safety filter” architectures (such as Semantic-Metric Bayesian Risk Fields): if the VLM component is itself vulnerable to input perturbations, using it as a safety oracle inherits that fragility. Any pipeline that puts a VLM on the safety-critical path needs to address this.
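One way to probe this inherited fragility is to check whether a safety oracle's verdict survives paraphrasing. The sketch below is illustrative only and is not taken from the paper: `toy_oracle` is a hypothetical stand-in for a VLM-based safety check, and the unanimous-agreement gate is just one conservative design choice among many.

```python
from typing import Callable, Iterable

SafetyOracle = Callable[[str], bool]  # True means "judged safe"


def verdict_is_stable(oracle: SafetyOracle, instruction: str,
                      paraphrases: Iterable[str]) -> bool:
    """True if every paraphrase receives the same verdict as the original."""
    base = oracle(instruction)
    return all(oracle(p) == base for p in paraphrases)


def gated_verdict(oracle: SafetyOracle, instruction: str,
                  paraphrases: Iterable[str]) -> bool:
    """Conservative gate: safe only if the original and all paraphrases are judged safe."""
    return oracle(instruction) and all(oracle(p) for p in paraphrases)


def toy_oracle(instruction: str) -> bool:
    """Hypothetical brittle oracle: flags only the literal word 'knife'."""
    return "knife" not in instruction.lower()


# The oracle flags the original but misses the paraphrase, so its verdict is unstable.
unstable = not verdict_is_stable(toy_oracle, "hand me the knife", ["pass me the blade"])
# The gate still refuses, because one of the variants was judged unsafe.
refused = not gated_verdict(toy_oracle, "hand me the knife", ["pass me the blade"])
allowed = gated_verdict(toy_oracle, "hand me the cup", ["pass me the mug"])
```

A gate like this only reduces exposure to the paraphrases that were actually tested; it does not make the underlying oracle robust.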

Connections