Visual Inertial Odometry and Active Dense Reconstruction for Mobile Robots


Forster, Christian. Visual Inertial Odometry and Active Dense Reconstruction for Mobile Robots. 2016, University of Zurich, Faculty of Economics.

Abstract

Using cameras for localization and mapping with mobile robots is appealing as these sensors are small, inexpensive, and ubiquitous. However, since every camera image provides hundreds of thousands of measurements, inferring structure and motion from this wealth of data in real time on computationally constrained robotic systems is a great challenge. Furthermore, robustness becomes an important factor when applying computer vision algorithms to mobile robots that move in uncontrolled environments. In this case, nuisances such as occlusions, illumination changes, or low-textured surfaces make it harder to track visual cues, which is fundamental to camera-based localization and mapping.
The first contribution of this thesis is an efficient, robust, and accurate visual odometry algorithm that computes the motion of a single camera solely from its stream of images. To this end, the use of direct methods that operate directly on pixel-level intensities is investigated. The advantage of direct methods is that pixel correspondence between images is given directly by the geometry of the problem and can be refined by using the local intensity gradients. However, joint refinement of structure and motion by pixel-wise minimization of intensity differences becomes intractable as the map grows. Therefore, a novel semi-direct approach is proposed that establishes feature correspondence using direct methods and subsequently relies on proven feature-based methods for refinement. We further show how inertial measurements can seamlessly be integrated in the optimization of structure and motion. The second contribution of this thesis is thus a preintegration theory that allows many inertial measurements between two frames to be summarized into single relative motion constraints. We formally discuss the generative measurement model as well as the nature of the rotation noise and derive the expression for the maximum a posteriori state estimator. Experimental results confirm that our modeling efforts lead to accurate state estimation in real time, outperforming state-of-the-art approaches.
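As an illustration of the preintegration idea, the following minimal sketch accumulates a sequence of gyroscope and accelerometer samples between two frames into one relative rotation, velocity, and position increment. It is a simplified illustration and not the implementation from the thesis: it assumes bias-free, noise-free measurements and a constant sampling interval, and the names `skew`, `so3_exp`, and `preintegrate` are hypothetical.

```python
import numpy as np

def skew(w):
    """Return the 3x3 skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    """Rodrigues' formula: map a rotation vector to a rotation matrix."""
    theta = np.linalg.norm(phi)
    K = skew(phi)
    if theta < 1e-8:
        # First-order approximation for very small angles.
        return np.eye(3) + K
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def preintegrate(gyro, accel, dt):
    """Summarize IMU samples between two frames (assumption: no bias, no noise).

    gyro, accel: sequences of 3-vectors measured in the body frame.
    dt: constant sampling interval in seconds.
    Returns the relative rotation dR, velocity dv, and position dp,
    all expressed in the body frame of the first sample.
    """
    dR = np.eye(3)
    dv = np.zeros(3)
    dp = np.zeros(3)
    for w, a in zip(gyro, accel):
        a_rot = dR @ np.asarray(a, dtype=float)
        dp = dp + dv * dt + 0.5 * a_rot * dt**2
        dv = dv + a_rot * dt
        dR = dR @ so3_exp(np.asarray(w, dtype=float) * dt)
    return dR, dv, dp

# Usage: 100 samples of a constant yaw rate and zero acceleration.
gyro = [np.array([0.0, 0.0, np.pi / 2])] * 100
accel = [np.zeros(3)] * 100
dR, dv, dp = preintegrate(gyro, accel, dt=0.01)
```

The three increments depend only on the inertial measurements, not on the absolute pose or velocity at the first frame, so in this sketch they would not need to be recomputed when an optimizer changes the state estimate at that frame.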
Tracking salient features in the image results in sparse point clouds; however, for robotic tasks such as path planning, manipulation, or obstacle avoidance, a denser surface representation is needed. Previous work on dense reconstruction from images aims at providing high-fidelity reconstructions. However, for robotic applications, the accuracy of the reconstruction should be governed by the interaction task. Furthermore, it is crucial to have a measure of uncertainty in the reconstruction, which aids motion planning and fusion with complementary sensors. This motivates the third contribution of this thesis, which is an efficient algorithm for probabilistic dense depth estimation from a single camera. To this end, we combine a multi-view and per-pixel-based recursive Bayesian depth estimation scheme with a fast smoothing method that takes into account the estimated depth uncertainty.
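The recursive, per-pixel nature of such a depth filter can be sketched as follows. This is a simplified illustration under a stated assumption: each pixel's depth is modeled as a single Gaussian and every new measurement is Gaussian as well, which may differ from the measurement model used in the thesis. The smoothing step is not shown, and the function names `fuse_depth` and `estimate_depth_map` are hypothetical.

```python
import numpy as np

def fuse_depth(mu, var, z, z_var):
    """One recursive Bayesian update per pixel (assumption: Gaussian model).

    mu, var: current depth mean and variance arrays.
    z, z_var: new depth measurement and its variance, same shape.
    """
    gain = var / (var + z_var)
    new_mu = mu + gain * (z - mu)
    new_var = (1.0 - gain) * var
    return new_mu, new_var

def estimate_depth_map(measurements, prior_mu, prior_var):
    """Fuse a sequence of (depth, variance) image pairs into one estimate."""
    shape = np.asarray(measurements[0][0]).shape
    mu = np.full(shape, prior_mu, dtype=float)
    var = np.full(shape, prior_var, dtype=float)
    for z, z_var in measurements:
        mu, var = fuse_depth(mu, var, np.asarray(z, dtype=float),
                             np.asarray(z_var, dtype=float))
    return mu, var

# Usage: a 2x2 depth map updated with one measurement per pixel.
z = np.full((2, 2), 4.0)
z_var = np.ones((2, 2))
mu, var = estimate_depth_map([(z, z_var)], prior_mu=2.0, prior_var=1.0)
```

In this sketch the variance shrinks with every fused measurement, so each pixel carries the kind of uncertainty measure that downstream steps such as smoothing or motion planning can take into account.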
While most computer vision approaches fuse depth maps in a cost volume, care has to be taken in terms of scalability and memory consumption for robotic applications. Therefore, building upon the proposed dense depth estimation, the next contribution of this thesis is a robot-centric elevation mapping system that suits a flying robot with a down-looking camera and can be applied on board Micro Aerial Vehicles (MAVs) for fully autonomous landing-spot detection and landing.
We further demonstrate the usefulness of dense depth maps for localization of an MAV with respect to a ground robot. To this end, we address the problem of registering the maps computed by two robots from distant vantage points, using different sensing modalities: a dense 3D reconstruction from the MAV is aligned with the map computed from the depth sensor on the ground robot.
The most exciting opportunity of computer vision for mobile robotics is that robots can exert control over the data acquisition process. This motivated the investigation of the following problem: given the image of a scene, what is the trajectory that an MAV-mounted camera should follow to perform optimal dense depth estimation? The last contribution of this thesis addresses this question and introduces a method to compute the measurement uncertainty and, thus, the expected information gain, on the basis of the scene structure and appearance. As a result, the MAV chooses motion trajectories that avoid perceptual ambiguities inferred from the texture in the scene.

Additional indexing

Item Type: Dissertation
Referees: Scaramuzza Davide
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Language: English
Date: 2016
Deposited On: 22 Dec 2016 07:49
Last Modified: 28 Apr 2017 05:58
Other Identification Number: merlin-id:14351
