Voxel Map for Visual SLAM


Muglikar, Manasi; Zhang, Zichao; Scaramuzza, Davide (2020). Voxel Map for Visual SLAM. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 1 July 2020 - 1 October 2020. IEEE, 4181-4187.

Abstract

In modern visual SLAM systems, it is standard practice to retrieve potential candidate map points from overlapping keyframes for further feature matching or direct tracking. In this work, we argue that keyframes are not the optimal choice for this task, due to several inherent limitations, such as weak geometric reasoning and poor scalability. We propose a voxel-map representation to efficiently retrieve map points for visual SLAM. In particular, we organize the map points in a regular voxel grid. Visible points from a camera pose are queried by sampling the camera frustum in a raycasting manner, which can be done in constant time using an efficient voxel hashing method. Compared with keyframes, the points retrieved using our method are geometrically guaranteed to fall in the camera field-of-view, and occluded points can be identified and removed to a certain extent. This method also naturally scales up to large scenes and complicated multi-camera configurations. Experimental results show that our voxel map representation is as efficient as a keyframe map with 5 keyframes and provides significantly higher localization accuracy (average 46% improvement in RMSE) on the EuRoC dataset. The proposed voxel-map representation is a general approach to a fundamental functionality in visual SLAM and is widely applicable.
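The retrieval idea described in the abstract (map points stored in a hashed voxel grid, visible points found by sampling rays through the camera frustum) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `VoxelMap`, the helper `frustum_rays`, the fixed half-voxel sampling step, the camera looking along +z with no rotation, and the rule of stopping each ray at the first occupied voxel are all assumptions made for illustration.

```python
import math


class VoxelMap:
    """Hypothetical voxel map: a hash table from integer voxel index to map points."""

    def __init__(self, voxel_size):
        self.voxel_size = voxel_size
        self.voxels = {}  # (i, j, k) -> list of 3D points stored in that voxel

    def key(self, p):
        # Integer voxel index of a 3D position; the dict lookup is O(1) on average.
        s = self.voxel_size
        return (math.floor(p[0] / s), math.floor(p[1] / s), math.floor(p[2] / s))

    def insert(self, p):
        self.voxels.setdefault(self.key(p), []).append(tuple(p))

    def query(self, origin, directions, max_range):
        """Return points in the first occupied voxel hit by each ray.

        Assumption: stopping at the first occupied voxel is a crude stand-in
        for occlusion handling; points farther along the same ray are skipped.
        """
        step = 0.5 * self.voxel_size  # assumed sampling step along each ray
        n_steps = int(max_range / step)
        visible, seen = [], set()
        for d in directions:
            norm = math.sqrt(sum(c * c for c in d))
            u = [c / norm for c in d]
            for n in range(1, n_steps + 1):
                t = n * step
                k = self.key([o + t * c for o, c in zip(origin, u)])
                if k in self.voxels:
                    if k not in seen:  # avoid returning a voxel twice
                        seen.add(k)
                        visible.extend(self.voxels[k])
                    break  # ray is blocked here
        return visible


def frustum_rays(h_fov, v_fov, nx, ny):
    """Ray directions covering the frustum of a camera looking along +z.

    Assumption: the camera frame coincides with the world frame (no rotation).
    """
    rays = []
    for i in range(nx):
        ax = -h_fov / 2 + i * h_fov / (nx - 1)
        for j in range(ny):
            ay = -v_fov / 2 + j * v_fov / (ny - 1)
            rays.append((math.tan(ax), math.tan(ay), 1.0))
    return rays


voxel_map = VoxelMap(voxel_size=0.5)
voxel_map.insert((0.3, 0.3, 2.2))  # in front of the camera
voxel_map.insert((0.3, 0.3, 4.2))  # behind the first point on the same ray
voxel_map.insert((5.0, 0.3, 1.0))  # far outside the field of view

rays = frustum_rays(math.radians(60), math.radians(60), 5, 5)
visible = voxel_map.query(origin=(0.25, 0.25, 0.0), directions=rays, max_range=10.0)
print(visible)
```

In this toy scene only the nearest point on the central ray is returned: the point behind it is skipped because the ray stops at the first occupied voxel, and the point outside the frustum is never sampled. The cost of a query depends on the number of rays and samples per ray, not on the number of points or keyframes in the map.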

Statistics

Citations

11 citations in Web of Science®
14 citations in Scopus®

Downloads

83 downloads since deposited on 17 Dec 2020
22 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Software
  Physical Sciences > Control and Systems Engineering
  Physical Sciences > Artificial Intelligence
  Physical Sciences > Electrical and Electronic Engineering
Scope: Discipline-based scholarship (basic research)
Language: English
Event End Date: 1 October 2020
Deposited On: 17 Dec 2020 08:57
Last Modified: 06 Mar 2024 14:33
Publisher: IEEE
ISBN: 978-1-7281-7395-5
OA Status: Green
Publisher DOI: https://doi.org/10.1109/icra40945.2020.9197357
Related URLs: https://ieeexplore.ieee.org/document/9197357
Other Identification Number: merlin-id:20309
Content: Accepted Version