Context

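Preparation before attending the MIT CSAIL annual meeting: compiling five forward-looking questions to ask researchers, along with the latest research directions and representative papers of four key researchers (Julie Shah, Andreea Bobu, Wojciech Matusik, Antonio Torralba).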

Key Insights

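Five forward-looking questions (reusable as a template for similar occasions)

  • Future of LfD (learning from demonstration) data collection: teleoperation vs. passive video vs. synthetic data; which path will the field converge on?
  • Scalability of representation alignment: how can preference noise be handled without extensive per-user calibration?
  • Foundation model scaling and dexterous manipulation: can scale alone solve the problem, or are new modalities such as tactile sensing needed?
  • Next sim-to-real breakthrough: differentiable simulation, neural simulation, or bypassing simulation altogether?
  • Dynamic autonomy allocation in high-stakes settings: how can a robot decide, in a data-driven way, when to act autonomously and when to request confirmation?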

Full wording of each question: one-liner and long form

Q1 — LfD Data Collection

  • One-liner: What data collection approach do you think will dominate for dexterous manipulation — teleoperation, passive video, or something else entirely?
  • Long: Both teleoperation and passive video have fundamental tradeoffs — high-quality actions vs. scalable but action-free data. Do you see one approach dominating in the next five years, or is there a third path emerging such as synthetic data from simulation or world model rollouts?

Q2 — Representation Alignment Scalability

  • One-liner: How do we build robot alignment systems that handle diverse users without requiring extensive per-user calibration?
  • Long: Reward and preference learning assume consistent human supervision, but preferences are context-dependent and noisy. When a robot needs to serve users with different backgrounds and abilities, how do we design alignment mechanisms that capture individual variation without large per-user calibration efforts?

Q3 — Foundation Models and Dexterous Manipulation

  • One-liner: Is the gap between VLA models and real dexterous manipulation a scaling problem, or does it need fundamentally new modalities like tactile sensing?
  • Long: VLA models like RT-2 show promising zero-shot generalization, but there’s still a significant gap in fine-grained manipulation. Is this fundamentally a scaling problem — more data, bigger models — or does it require a different approach, such as integrating tactile feedback as a missing modality?

Q4 — Sim-to-Real Breakthrough

  • One-liner: What’s the most promising path to closing the sim-to-real gap in the next few years?
  • Long: Differentiable simulation has made progress on contact mechanics, but deformable objects and complex friction remain hard. What do you think is the most likely near-term breakthrough — higher-fidelity physics, neural simulation, or collecting enough real-world data to bypass simulation altogether?

Q5 — Dynamic Autonomy Allocation

  • One-liner: How should a robot decide when to act autonomously versus ask for human help in high-stakes situations?
  • Long: In domains like surgery or aerospace, the allocation of decision authority between humans and robots is critical. How should a robot dynamically calibrate when to act autonomously versus seek human confirmation, based on task uncertainty, situational pressure, and human cognitive load? Is this tractable as a data-driven problem in the near term?

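Researchers' research focus (2025-2026)

  • Julie Shah (head of the AeroAstro department): LLMs for HRI, inference-time policy steering, REALM (real-time assistance estimation)
  • Andreea Bobu (CLEAR Lab): representation alignment; FERL (feature-expansive reward learning); core thesis is that getting the features wrong is a more fundamental failure than getting the reward function wrong
  • Wojciech Matusik: DiffTactile (differentiable tactile simulation, ICLR 2024); tactile transfer with smart gloves (Nature Communications 2024)
  • Antonio Torralba: ConceptGraphs (open-vocabulary 3D scene graphs); evaluation of LLM visual capabilities (CVPR 2024); 2026 ACM Fellow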

Connections