
Bridging the Gap Between Events and Frames Through Unsupervised Domain Adaptation


Messikommer, Nico; Gehrig, Daniel; Gehrig, Mathias; Scaramuzza, Davide (2022). Bridging the Gap Between Events and Frames Through Unsupervised Domain Adaptation. IEEE Robotics and Automation Letters, 7(2):3515-3522.

Abstract

Reliable perception during fast motion maneuvers or in high dynamic range environments is crucial for robotic systems. Since event cameras are robust to these challenging conditions, they have great potential to increase the reliability of robot vision. However, event-based vision has been held back by the shortage of labeled datasets due to the novelty of event cameras. To overcome this drawback, we propose a task transfer method to train models directly with labeled images and unlabeled event data. Compared to previous approaches, (i) our method transfers from single images to events instead of high frame rate videos, and (ii) does not rely on paired sensor data. To achieve this, we leverage the generative event model to split event features into content and motion features. This split enables efficient matching between latent spaces for events and images, which is crucial for successful task transfer. Thus, our approach unlocks the vast amount of existing image datasets for the training of event-based neural networks. Our task transfer method consistently outperforms methods targeting Unsupervised Domain Adaptation for object detection by 0.26 mAP (increase by 93%) and classification by 2.7% accuracy.
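The content/motion split described above can be sketched in miniature: an event feature vector is partitioned into a content part and a motion part, and only the content part is aligned with the image latent space. Everything below (the linear toy encoders, the 50/50 partition, the mean-discrepancy alignment term) is an illustrative assumption, not the paper's actual architecture or loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    # toy encoder: linear projection + ReLU (stand-in for a real network)
    return np.maximum(W @ x, 0.0)

# hypothetical dimensions, chosen for illustration only
D_IN, D_LAT = 16, 8
W_img = rng.standard_normal((D_LAT, D_IN))
W_evt = rng.standard_normal((D_LAT, D_IN))

img_feat = encode(rng.standard_normal(D_IN), W_img)
evt_feat = encode(rng.standard_normal(D_IN), W_evt)

# split the event latent into content and motion features;
# the equal 50/50 partition here is an assumption for the sketch
evt_content, evt_motion = np.split(evt_feat, 2)
img_content = img_feat[: D_LAT // 2]

# align only the content features across domains; a simple squared
# mean discrepancy stands in for the paper's latent-matching objective
align_loss = float(np.mean((img_content - evt_content) ** 2))
```

In a real training loop this alignment term would be minimized jointly with the task loss on labeled images, while the motion features are left free to capture event-specific dynamics.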


Statistics

Citations: 12 in Web of Science®; 15 in Scopus®
Downloads: 109 since deposited on 17 Feb 2022; 29 in the last 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Control and Systems Engineering
  Physical Sciences > Biomedical Engineering
  Physical Sciences > Human-Computer Interaction
  Physical Sciences > Mechanical Engineering
  Physical Sciences > Computer Vision and Pattern Recognition
  Physical Sciences > Computer Science Applications
  Physical Sciences > Control and Optimization
  Physical Sciences > Artificial Intelligence
Scope: Discipline-based scholarship (basic research)
Language: English
Date: 2022
Deposited On: 17 Feb 2022 09:57
Last Modified: 27 May 2024 01:54
Publisher: Institute of Electrical and Electronics Engineers
ISSN: 2377-3766
OA Status: Green
Publisher DOI: https://doi.org/10.1109/LRA.2022.3145053
Other Identification Number: merlin-id:22182
Content: Accepted Version