ZORA (Zurich Open Repository and Archive)

Balanced Deep CCA for Bird Vocalization Detection

Kumar, Sumit; Anshuman, B; Rüttimann, Linus; Hahnloser, Richard H R; Arora, Vipul (2023). Balanced Deep CCA for Bird Vocalization Detection. In: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4 June 2023 - 10 June 2023, Institute of Electrical and Electronics Engineers.

Abstract

Event detection improves when events are captured by two different modalities rather than just one. But training detection systems on multiple modalities is challenging, in particular when unlabeled data is abundant but labeled data is limited. We develop a novel self-supervised learning technique for multimodal data that learns (hidden) correlations between simultaneously recorded microphone (sound) signals and accelerometer (body vibration) signals. The key objective of this work is to learn useful embeddings associated with high performance in downstream event detection tasks when labeled data is scarce and the audio events of interest (songbird vocalizations) are sparse. We base our approach on deep canonical correlation analysis (DCCA), which suffers from event sparseness. We overcome the sparseness of positive labels by first learning a data sampling model from the labeled data and then applying DCCA to the output it produces. This method, which we term balanced DCCA (b-DCCA), improves the performance of the unsupervised embeddings on the downstream supervised audio detection task compared to classical DCCA. Because data labels are frequently imbalanced, our method might be of broad utility in low-resource scenarios.
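
The effect of event sparseness on a correlation objective can be illustrated with a small toy sketch. This is not the authors' implementation: it uses no neural networks, the recording is synthetic, and the names `make_toy_recording`, `balanced_indices`, and `scores` are hypothetical. It relies on the fact that for two one-dimensional views the canonical correlation equals the absolute Pearson correlation, and it stands in for the learned sampling model with a 0/1 score per frame.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    For two one-dimensional views, its absolute value is the canonical correlation.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def balanced_indices(scores, n_samples, rng, threshold=0.5):
    """Draw frame indices so predicted events and predicted background are equally likely."""
    pos = [i for i, s in enumerate(scores) if s >= threshold]
    neg = [i for i, s in enumerate(scores) if s < threshold]
    if not pos or not neg:
        # Nothing to balance: fall back to uniform sampling.
        return [rng.randrange(len(scores)) for _ in range(n_samples)]
    return [rng.choice(pos if rng.random() < 0.5 else neg) for _ in range(n_samples)]


def make_toy_recording(n_frames, event_rate, rng):
    """Synthetic stand-in for paired microphone/accelerometer frames with sparse shared events."""
    events = [rng.random() < event_rate for _ in range(n_frames)]
    mic = [float(e) + rng.gauss(0.0, 0.5) for e in events]
    acc = [float(e) + rng.gauss(0.0, 0.5) for e in events]
    return events, mic, acc


rng = random.Random(0)
events, mic, acc = make_toy_recording(5000, 0.05, rng)

# Assumption: the sampling model's output is replaced here by the true event flag.
scores = [1.0 if e else 0.0 for e in events]
idx = balanced_indices(scores, 5000, rng)

corr_uniform = pearson(mic, acc)
corr_balanced = pearson([mic[i] for i in idx], [acc[i] for i in idx])
print(f"uniform: {corr_uniform:.2f}, balanced: {corr_balanced:.2f}")
```

In this toy setting, most uniformly sampled frames contain only independent noise in both channels, so the shared event signal contributes little to the measured correlation. Resampling so that roughly half the frames contain events makes the shared signal dominate, which is the intuition behind balancing the input before applying a correlation-based objective.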

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Neuroinformatics
Dewey Decimal Classification: 570 Life sciences; biology
Scopus Subject Areas: Physical Sciences > Software; Physical Sciences > Signal Processing; Physical Sciences > Electrical and Electronic Engineering
Language: English
Event End Date: 10 June 2023
Deposited On: 31 Jan 2024 12:08
Last Modified: 21 Jan 2025 13:15
Publisher: Institute of Electrical and Electronics Engineers
Series Name: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
ISSN: 1520-6149
OA Status: Green
Publisher DOI: https://doi.org/10.1109/icassp49357.2023.10094650
Full Text: PDF, Submitted Version, English
