Enhancing 6DoF Pose and Focal Length Estimation from Uncontrolled RGB Images for Robotics Vision
ICRA 2024, YOKOHAMA, JAPAN - 3DVRM

Video


Abstract

Accurate 6DoF pose estimation is an important topic in robotics applications, from interactive systems to autonomous navigation and manipulation in augmented reality environments. Previous studies that rely on single RGB images captured in uncontrolled environments often struggle to accurately estimate both the camera’s internal focal length and the object’s external pose parameters, primarily due to the inherent ambiguity between the perspective projection parameters of the pinhole camera model. To address this challenge, our study presents a two-stage approach that decouples the two projection-related parameters using a render-and-compare strategy. In the first stage, we fix the z-axis translation (tz) to an arbitrary value and estimate the focal length and the remaining pose parameters, which yields accurate results under the fixed-depth assumption. In the second stage, we predict all parameters, enhancing the method’s adaptability and accuracy by keeping the focal length in scale with the object depth. This approach substantially reduces projection scale ambiguity, yielding improvements over existing methods. Both quantitative and qualitative results demonstrate the validity of the presented approach, showcasing its applicability to diverse robotics applications where accurate pose estimation is critical, yet camera metadata is either unreliable or unavailable.
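The projection scale ambiguity mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's render-and-compare pipeline. It only shows, under a pinhole model with the principal point at the image origin, that scaling the focal length and tz together changes the projection far less than scaling the focal length alone. The cube object, the helper name `project`, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np

def project(points, f, t):
    """Pinhole projection of Nx3 object points translated by t.

    f is the focal length in pixels; the principal point is assumed
    to sit at the image origin (an illustrative simplification).
    """
    cam = points + t                      # object frame -> camera frame (no rotation)
    return f * cam[:, :2] / cam[:, 2:3]   # perspective division by depth

# Hypothetical object: the 8 corners of a cube with side length 0.2
pts = 0.1 * np.array(
    [[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
    dtype=float,
)

f, tz = 800.0, 5.0  # assumed focal length (pixels) and object depth
base = project(pts, f, np.array([0.0, 0.0, tz]))

# Double both focal length and depth: the image barely changes.
scaled = project(pts, 2 * f, np.array([0.0, 0.0, 2 * tz]))

# Double only the focal length: the image changes substantially.
only_f = project(pts, 2 * f, np.array([0.0, 0.0, tz]))

err_scaled = np.abs(base - scaled).max()
err_only_f = np.abs(base - only_f).max()
print(f"max pixel shift, f and tz scaled together: {err_scaled:.3f}")
print(f"max pixel shift, only f scaled:            {err_only_f:.3f}")
```

Because the residual between `base` and `scaled` comes only from the object's own depth extent, a single image constrains the ratio of focal length to depth much more strongly than either value alone, which is consistent with fixing tz in a first stage before predicting all parameters.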

Citation


