ZORA (Zurich Open Repository and Archive)

Monaural Source Separation Using a Random Forest Classifier

Riday, Cosimo; Bhargava, Saurabh; Hahnloser, Richard H R; Liu, Shih-Chii (2016). Monaural Source Separation Using a Random Forest Classifier. In: Interspeech 2016, San Francisco, CA, USA, 8 September 2016 - 12 September 2016. Proceedings of Interspeech 2016, 3344-3348.

Abstract

We address the problem of separating two audio sources from a single-channel mixture recording. We present a novel method, the Multi Layered Random Forest (MLRF), that learns a binary mask for both sources. Random Forest (RF) classifiers are trained for each frequency band of a source spectrogram. A specialized set of linear transformations is applied to a local time-frequency (T-F) neighborhood of the mixture to capture relevant local statistics. We also present a sampling method that efficiently samples T-F training bins in each frequency band. For the RF classifiers that estimate the Ideal Binary Mask (IBM), we draw equal numbers of training samples in which each of the two sources is dominant (has more power). An estimated IBM in a given layer is used to train an RF classifier in the next higher layer of the MLRF hierarchy. On average, MLRF performs better than deep Recurrent Neural Networks (RNNs) and Non-Negative Sparse Coding (NNSC) in signal-to-noise ratio (SNR) of the reconstructed audio, overall T-F bin classification accuracy, and PESQ and STOI scores. Additionally, we demonstrate the ability of the MLRF to correctly reconstruct T-F bins of the target even when the target has lower power in that frequency band.
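The two concepts the abstract relies on, the Ideal Binary Mask and the balanced per-band sampling of dominant bins, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ideal_binary_mask` and `balanced_bins`, the toy power values, and the rule that ties go to the second source are all assumptions made for illustration, and the linear transformations, the RF classifiers, and the layering are not shown.

```python
import random


def ideal_binary_mask(power_a, power_b):
    """Return mask[f][t] = 1 where source A has more power than source B, else 0.

    Inputs are power spectrograms stored as nested lists indexed [frequency][time].
    Ties are assigned to source B here; that choice is an assumption.
    """
    return [
        [1 if a > b else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(power_a, power_b)
    ]


def balanced_bins(mask_row, rng):
    """Draw equal numbers of A-dominant and B-dominant time indices from one band."""
    a_idx = [t for t, m in enumerate(mask_row) if m == 1]
    b_idx = [t for t, m in enumerate(mask_row) if m == 0]
    n = min(len(a_idx), len(b_idx))  # size of the smaller class in this band
    return rng.sample(a_idx, n), rng.sample(b_idx, n)


# Toy power spectrograms: 2 frequency bands x 4 time frames (made-up values).
power_a = [[4.0, 1.0, 9.0, 0.5],
           [0.2, 3.0, 0.1, 0.4]]
power_b = [[1.0, 2.0, 3.0, 0.6],
           [0.5, 1.0, 0.3, 0.1]]

mask = ideal_binary_mask(power_a, power_b)

# Sample training bins band by band, as the abstract describes at a high level.
rng = random.Random(0)
samples = [balanced_bins(row, rng) for row in mask]
```

In a full pipeline, each sampled bin would be paired with features from its local T-F neighborhood in the mixture and passed to that band's classifier; those steps depend on details given only in the paper.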

Additional indexing

Item Type: Conference or Workshop Item (Speech), refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Neuroinformatics
Dewey Decimal Classification: 570 Life sciences; biology
Scopus Subject Areas: Social Sciences & Humanities > Language and Linguistics; Physical Sciences > Human-Computer Interaction; Physical Sciences > Signal Processing; Physical Sciences > Software; Physical Sciences > Modeling and Simulation
Language: English
Event End Date: 12 September 2016
Deposited On: 27 Jan 2017 08:24
Last Modified: 26 Jan 2022 11:49
Publisher: Proceedings of Interspeech 2016
Series Name: Proceedings of Interspeech 2016
Number of Pages: 5
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Publisher DOI: https://doi.org/10.21437/Interspeech.2016-252
Official URL: http://www.isca-speech.org/archive/Interspeech_2016/abstracts/0252.html

Statistics

Citations

1 citation in Web of Science®
1 citation in Scopus®

Downloads

137 downloads since deposited on 27 Jan 2017
20 downloads in the past 12 months