
Real-time gesture interface based on event-driven processing from stereo silicon retinas


Lee, J-H; Delbruck, T; Pfeiffer, M; Park, P K J; Shin, C-W; Ryu, H; Kang, B C (2014). Real-time gesture interface based on event-driven processing from stereo silicon retinas. IEEE Transactions on Neural Networks and Learning Systems, 25(12):2250-2263.

Abstract

We propose a real-time hand gesture interface that combines a stereo pair of biologically inspired event-based dynamic vision sensor (DVS) silicon retinas with neuromorphic event-driven postprocessing. Compared with conventional vision or 3-D sensors, the use of DVSs, which output asynchronous and sparse events in response to motion, eliminates the need to extract movements from sequences of video frames and allows significantly faster and more energy-efficient processing. In addition, the rate of input events depends on the observed movements and thus provides an additional cue for solving the gesture spotting problem, i.e., finding the onsets and offsets of gestures. We propose a postprocessing framework based on spiking neural networks that processes the events received from the DVSs in real time and provides an architecture for future implementation in neuromorphic hardware devices. The motion trajectories of moving hands are detected by spatiotemporally correlating the stereoscopically verged asynchronous events from the DVSs using leaky integrate-and-fire (LIF) neurons. Adaptive thresholds of the LIF neurons achieve the segmentation of trajectories, which are then translated into discrete and finite feature vectors. The feature vectors are classified with hidden Markov models, using a separate Gaussian mixture model for spotting irrelevant transition gestures. The disparity information from stereo vision is used to adapt the LIF neuron parameters so that recognition is invariant to the distance of the user from the sensor, and also helps to filter out movements in the background of the user. Furthermore, exploiting the high dynamic range of DVSs allows gesture recognition over a 60-dB range of scene illuminance. The system achieves recognition rates well over 90% under a variety of conditions with static and dynamic backgrounds and with naïve users.
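
The event-driven LIF mechanism at the heart of the pipeline can be illustrated with a short sketch. The Python snippet below is a minimal, hypothetical illustration of how asynchronous DVS events might drive a leaky integrate-and-fire neuron whose membrane potential decays between events; the class name, parameter values, and the fixed threshold are assumptions for illustration only, not the authors' implementation (in the paper the thresholds adapt, e.g., with the user's distance derived from stereo disparity).

    import math

    class LIFNeuron:
        """Minimal event-driven leaky integrate-and-fire neuron.

        A hedged sketch of the spatiotemporal event correlation described
        in the abstract; all parameters are illustrative, not from the paper.
        """

        def __init__(self, tau=0.02, threshold=0.3, weight=0.1):
            self.tau = tau              # membrane time constant in seconds (assumed)
            self.threshold = threshold  # firing threshold; fixed here, adaptive in the paper
            self.weight = weight        # contribution of a single DVS event (assumed)
            self.potential = 0.0        # membrane potential
            self.last_t = 0.0           # timestamp of the previous event (s)

        def on_event(self, t):
            """Integrate one DVS event at time t (seconds); return True on a spike."""
            # Exponential leak since the last event: the update is purely
            # event-driven, so no per-frame computation is required.
            self.potential *= math.exp(-(t - self.last_t) / self.tau)
            self.last_t = t
            self.potential += self.weight
            if self.potential >= self.threshold:
                self.potential = 0.0    # reset after firing
                return True
            return False

    # A rapid burst of events (correlated motion) drives the neuron over
    # threshold, while an isolated event 47 ms later does not.
    neuron = LIFNeuron()
    for t in [0.000, 0.001, 0.002, 0.003, 0.050]:
        if neuron.on_event(t):
            print(f"spike at t={t:.3f} s")

Because the potential leaks between events, only spatiotemporally dense event clusters, such as those produced by a moving hand, push the neuron over threshold. This is the intuition behind using LIF neurons both for detecting motion trajectories and for spotting gesture onsets and offsets from the event rate.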

Statistics

Citations

9 citations in Web of Science®
10 citations in Scopus®

Additional indexing

Item Type: Journal Article, not refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Neuroinformatics
Dewey Decimal Classification: 570 Life sciences; biology
Language: English
Date: 2014
Deposited On: 25 Feb 2015 10:28
Last Modified: 05 Apr 2016 19:00
Publisher: Institute of Electrical and Electronics Engineers
Number of Pages: 14
ISSN: 2162-237X
Publisher DOI: https://doi.org/10.1109/TNNLS.2014.2308551

Download

Full text not available from this repository.
