Abstract
The problem of inferring distances from a visual sensor to objects in a scene - referred to as depth estimation - can be solved in various ways. Among these, stereo vision is a method in which two sensors observe the same scene from different viewpoints. To recover the three-dimensional coordinates of a point, its two projections - one in each view - can be used for triangulation. However, the pair of corresponding points in the two views has to be found first. This task is known as stereo-matching and is usually computationally expensive. Traditionally, it is performed by describing a point in the first view with information from its surroundings, e.g. in a feature vector, and then searching for a point described in a similar way in the other view. In this work, we propose a simple idea that alleviates the stereo-matching problem using an active component: a mirror-galvanometer-driven laser. The laser beam is deflected by actuating two mirrors, creating a sequence of "light spots" in the scene at which contrast changes rapidly. We capture these contrast changes with two Dynamic Vision Sensors (DVS). The high temporal resolution of these sensors allows the laser-induced events to be detected and matched in time using lightweight computation. This method enables event-based depth estimation at high speed, with low computational cost, and without exact sensor synchronization.
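To illustrate the core idea of matching laser-induced events purely by their timestamps, here is a minimal conceptual sketch. It is not the authors' implementation; the function name, the time threshold, and the example data are assumptions chosen only for illustration.

```python
# Conceptual sketch: pair events from two event cameras by timestamp proximity.
# Because the laser spot triggers near-simultaneous events in both sensors,
# temporal nearest-neighbour matching replaces descriptor-based stereo-matching.
import numpy as np

def match_events_by_time(t_left, t_right, max_dt=100e-6):
    """Pair each left-sensor event with the temporally closest right-sensor
    event, keeping pairs whose time difference is below max_dt (seconds)."""
    t_right = np.sort(t_right)
    idx = np.searchsorted(t_right, t_left)            # candidate neighbours
    idx = np.clip(idx, 1, len(t_right) - 1)
    prev_closer = (t_left - t_right[idx - 1]) < (t_right[idx] - t_left)
    nearest = np.where(prev_closer, idx - 1, idx)      # pick the closer side
    dt = np.abs(t_left - t_right[nearest])
    keep = dt < max_dt                                 # reject unrelated events
    return np.flatnonzero(keep), nearest[keep]

# Example: three laser-induced event bursts seen by both sensors, plus one
# spurious right-sensor event that finds no partner within the time window.
t_l = np.array([0.001000, 0.002010, 0.003020])
t_r = np.array([0.001005, 0.002000, 0.003015, 0.007000])
left_idx, right_idx = match_events_by_time(t_l, t_r)
print(list(zip(left_idx, right_idx)))  # [(0, 0), (1, 1), (2, 2)]
```

Each matched pair gives two image coordinates of the same laser spot, which can then be triangulated into a 3D point using the calibrated sensor geometry.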