TL;DR A novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence is presented, and a proof-of-concept diffusion model which generates trajectories from static images, enabling plausible animations is demonstrated.