ViPE: Video Pose Engine for 3D Geometric Perception

“Accurate 3D geometric perception is an important prerequisite for a wide range of spatial AI systems. However, acquiring consistent and precise 3D annotations from in-the-wild videos remains a key challenge. In this work, we introduce ViPE, a fast and versatile video processing engine designed to bridge this gap. ViPE efficiently estimates camera intrinsics, camera motion, and dense, near-metric depth maps from unconstrained raw videos. It is robust to diverse scenarios, including dynamic selfie videos, cinematic shots, or dashcams, and supports various camera models such as pinhole, wide-angle, and 360° panoramas…”
August 31, 2025