This article was generated by AI analysis.
Created: 2026-04-02 Source: https://arxiv.org/abs/2602.15063
Summary
This PRISMA 2020 systematic review synthesizes 86 peer-reviewed articles (2019–2025) on human-robot interaction (HRI) driven by large language models (LLMs), and proposes the Sense-Interaction-Alignment (SIA) framework to replace the classical Sense-Plan-Act model. The review maps a nine-dimension taxonomy across the corpus, identifies eight design challenges, and finds that the field is largely exploratory, with fragmented evaluation practices and almost no longitudinal studies.
Prerequisites
- Human-Robot Interaction fundamentals — the review situates LLMs within existing HRI design dimensions; familiarity with social robotics, teleoperation, and autonomous vehicles helps.
- Transformer-based LLMs including VLMs — the paper covers multimodal LLMs; understanding how vision-language models differ from text-only LLMs is needed to follow the sensing dimension.
- Systematic review methodology (PRISMA) — the paper’s credibility rests on its protocol; understanding inclusion/exclusion criteria and inter-rater reliability metrics (Cohen’s κ) matters for evaluating the taxonomy.
Core Idea
The SIA framework reframes what LLMs contribute to HRI. Rather than replacing Sense-Plan-Act mechanically, LLMs transform each layer: Sense becomes multimodal context grounding (not just rule-based perception); Interaction becomes generative and agentic (not scripted responses); Alignment adds a continuous adaptation loop that prior HRI systems lacked entirely. The review’s argument is that the field is building tools faster than it is building theory — the 9-dimension taxonomy and design challenges are a bid to impose structure on a fragmented literature.
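
To make the three layers concrete, here is a toy sketch of one Sense-Interaction-Alignment cycle. Every name in it (`Context`, `AlignmentState`, `sense`, `interact`, `align`, the feedback strings) is hypothetical and chosen for illustration. The review does not prescribe an implementation, and the `interact` function is a fixed template standing in for an LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Sense layer output: grounded context from several input channels."""
    speech: str
    scene: str


@dataclass
class AlignmentState:
    """Alignment layer state: user feedback that shapes later responses."""
    verbosity: int = 2  # how many detail clauses a response may contain
    history: list = field(default_factory=list)


def sense(speech: str, scene: str) -> Context:
    # Assumption: normalising text stands in for real multimodal grounding.
    return Context(speech=speech.strip().lower(), scene=scene)


def interact(ctx: Context, state: AlignmentState) -> str:
    # Assumption: a fixed template stands in for a generative LLM response.
    clauses = [
        f"I see {ctx.scene}",
        f"you said '{ctx.speech}'",
        "I can help with that",
    ]
    return ". ".join(clauses[: state.verbosity]) + "."


def align(state: AlignmentState, feedback: str) -> AlignmentState:
    # Continuous adaptation: each piece of feedback adjusts future behaviour.
    state.history.append(feedback)
    if feedback == "too long":
        state.verbosity = max(1, state.verbosity - 1)
    elif feedback == "more detail":
        state.verbosity = min(3, state.verbosity + 1)
    return state


state = AlignmentState()
ctx = sense("  Pass me the CUP ", "a cup on the table")
first = interact(ctx, state)       # two clauses
state = align(state, "too long")   # user asks for shorter replies
second = interact(ctx, state)      # one clause
print(first)
print(second)
```

The point of the sketch is only the control flow: the alignment step feeds back into the next interaction, which is the loop the review argues was missing from Sense-Plan-Act pipelines.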
Results
| Taxonomy Dimension | Inter-Rater Reliability (κ) |
|---|---|
| Application Domains | 0.904 |
| Morphology | 0.894 |
| Evaluation Metrics | 0.832 |
| Autonomy Levels | 0.846 |
| Modality & Interaction Channels | 0.768 |
| Generative & Agentic Interaction | 0.796 |
| Contextual Perception & Understanding | 0.720 |
| Methodology | 0.684 |
| Iterative Optimization & Alignment | 0.684 |
- 86 papers reviewed, concentrated in 2024–2025.
- Most common publication venues: HRI (31.4%), CHI (12.8%), RO-MAN (10.5%), IJSR (8.1%).
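
The κ values in the table above measure agreement between coders after correcting for chance: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected from each coder's label frequencies. The following is a minimal sketch of that calculation for two coders. The labels and the ten coded items are made up for illustration and are not data from the review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for ten papers on one taxonomy dimension.
coder_1 = ["social", "social", "service", "service", "social",
           "industrial", "service", "social", "industrial", "social"]
coder_2 = ["social", "social", "service", "social", "social",
           "industrial", "service", "social", "industrial", "service"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))  # 0.677
```

In this toy case the coders agree on 8 of 10 items (p_o = 0.8) and chance agreement is 0.38, giving κ ≈ 0.677, close to the lowest values in the table (0.684). Raw agreement of 80% therefore overstates reliability once chance is accounted for.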
Limitations
- Author-stated: the search window ends in early 2025, so recent work in this fast-moving field may be missed. Only English-language papers were included.
- Unstated: the review synthesizes abstracts and methods sections — depth of engagement with each paper’s contribution varies. The SIA framework is a reframing of prior work, not an empirically validated model.
Reproducibility
- Code: no code artifact; the method is a PRISMA review protocol.
- Datasets: public literature corpus (86 papers, with the sources searched documented).
- Compute: N/A.
Insights
The most useful contribution is the identification of critical gaps: almost no longitudinal studies, no standardized evaluation metrics, and user modeling that rarely goes beyond generic personas. These gaps are directly actionable research directions. The finding that LLM-HRI papers are heavily concentrated in 2024–2025 (reflecting the GPT-4 wave) but remain methodologically fragmented suggests the field needs consolidation work — which this review contributes to. The SIA framework’s alignment layer is the most novel: prior HRI theory had no analog for continuous behavioral/emotional/ethical repair.
Connections
Raw Excerpt
LLMs are reshaping foundational HRI capabilities around contextual sensing, socially-grounded interaction generation, and continuous human alignment. Current research remains largely exploratory, with varied experimental designs, methodologies, and evaluation approaches.