PointWorld & 3D World Models for Robotic Manipulation

Research Focus

A literature survey centered on NVIDIA’s PointWorld (arXiv:2601.03782) and the surrounding work on 3D point cloud representations, world models, and point tracking for robotic manipulation. It covers foundational 3D deep learning, learned world models for control, point tracking methods, and recent 3D vision-language-action (VLA) approaches.

DOI List

10.48550/arXiv.1803.10122
10.48550/arXiv.1612.00593
10.48550/arXiv.1706.02413
10.48550/arXiv.2209.05451
10.48550/arXiv.2301.04104
10.48550/arXiv.2306.08637
10.48550/arXiv.2306.17817
10.48550/arXiv.2307.07635
10.48550/arXiv.2308.16891
10.48550/arXiv.2310.06114
10.48550/arXiv.2310.16828
10.48550/arXiv.2403.03954
10.48550/arXiv.2403.09631
10.48550/arXiv.2406.10721
10.48550/arXiv.2501.15830
10.48550/arXiv.2601.03782
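
Every entry above is an arXiv-issued DOI, so the arXiv identifier is whatever follows the `10.48550/arXiv.` prefix. The following is a minimal sketch, assuming only that prefix convention, of turning entries from the list into abstract-page URLs. The helper name `arxiv_url` is hypothetical and not part of any library.

```python
ARXIV_DOI_PREFIX = "10.48550/arXiv."

def arxiv_url(doi: str) -> str:
    """Map an arXiv-issued DOI to its abstract-page URL (hypothetical helper)."""
    if not doi.startswith(ARXIV_DOI_PREFIX):
        raise ValueError(f"not an arXiv DOI: {doi!r}")
    # The arXiv identifier is the DOI with the registrant prefix stripped.
    return "https://arxiv.org/abs/" + doi[len(ARXIV_DOI_PREFIX):]

# Two entries copied from the DOI list above.
dois = ["10.48550/arXiv.1612.00593", "10.48550/arXiv.2601.03782"]
urls = [arxiv_url(d) for d in dois]

for u in urls:
    print(u)
```

The same DOIs can also be resolved through `https://doi.org/` followed by the DOI string.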

Paper Nodes

Synthesis Matrix

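| Paper | Year | Contribution type | Core representation | Cross-embodiment | Primary metric |
|---|---|---|---|---|---|
| World Models | 2018 | Theoretical foundation | Latent RNN | | Simulated env score |
| PointNet | 2017 | Theoretical foundation | Point set (permutation-invariant) | | ModelNet40 accuracy |
| PointNet++ | 2017 | Architecture | Hierarchical point set | | Classification/segmentation |
| PerAct | 2023 | Manipulation benchmark | 3D voxel | No | RLBench success rate |
| DreamerV3 | 2024 | World model | Latent (image-based) | No | 150+ task success |
| TAPIR | 2023 | Point tracking | Per-frame 2D+temporal | | TAP-Vid benchmark |
| Act3D | 2023 | Manipulation policy | 3D feature field | No | RLBench success rate |
| CoTracker | 2024 | Point tracking | Joint 2D tracking | | Point tracking benchmarks |
| GNFactor | 2023 | Manipulation policy | Neural feature field (3D voxel) | No | RLBench success rate |
| UniSim | 2024 | World model | Video-based simulator | Partial | Zero-shot policy transfer |
| TD-MPC2 | 2024 | World model | Latent (image-based) | No | 104 continuous control tasks |
| 3D Diffusion Policy | 2024 | Manipulation policy | 3D point cloud | No | Manipulation success rate |
| 3D-VLA | 2024 | VLA + world model | 3D scene + generative | Partial | Embodied reasoning |
| RoboPoint | 2024 | Spatial reasoning | VLM + spatial keypoints | No | Affordance prediction (+21.8%) |
| SpatialVLA | 2025 | VLA | Ego3D encoding | Yes | Cross-robot task success |
| PointWorld | 2026 | 3D world model | 3D point flow | Yes | Zero-shot real-world tasks |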
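
The "permutation-invariant" label in the PointNet row refers to applying one shared function to every point and then aggregating with a symmetric operation such as a channel-wise max, so the result does not depend on the order in which the points are listed. The following is a toy sketch of that idea only, not PointNet itself: it assumes a single ReLU layer with random, untrained weights, and the function names `point_feature` and `global_feature` are hypothetical.

```python
import random

def point_feature(p, weights):
    """Shared per-point map: one ReLU(linear) layer applied to a single (x, y, z)."""
    return [max(0.0, sum(w * c for w, c in zip(row, p))) for row in weights]

def global_feature(points, weights):
    """Symmetric aggregation: channel-wise max over the features of all points."""
    feats = [point_feature(p, weights) for p in points]
    return [max(channel) for channel in zip(*feats)]

rng = random.Random(0)
# Assumed toy sizes: 8 output channels, 32 points in the cloud.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
cloud = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(32)]

# Reordering the points must leave the global feature unchanged.
shuffled = cloud[:]
rng.shuffle(shuffled)
same = global_feature(cloud, weights) == global_feature(shuffled, weights)
print("order-independent:", same)
```

Max pooling is one of several symmetric choices; sum or mean would also give an order-independent feature, at the cost of different sensitivity to point density.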