Semi-supervised autoencoders for speech emotion recognition


Deng, Jun; Xu, Xinzhou; Zhang, Zixing; Frühholz, Sascha; Schuller, Bjorn (2017). Semi-supervised autoencoders for speech emotion recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, (99):1.

Abstract

Despite the widespread use of supervised learning methods for speech emotion recognition, they are severely restricted by the lack of sufficient labelled speech data for training. Considering the wide availability of unlabelled speech data, this paper proposes semi-supervised autoencoders to improve speech emotion recognition. The aim is to benefit from combining labelled and unlabelled data. The proposed model extends a popular unsupervised autoencoder by carefully adjoining a supervised learning objective. We extensively evaluate the proposed model on the INTERSPEECH 2009 Emotion Challenge database and four other public databases in different scenarios. Experimental results demonstrate that the proposed model achieves state-of-the-art performance with a very small amount of labelled data on the challenge task and other tasks, and significantly outperforms alternative methods.

Statistics

Citations

5 citations in Web of Science®
7 citations in Scopus®

Item Type: Journal Article, original work
Communities & Collections:
  • 04 Faculty of Medicine > Neuroscience Center Zurich
  • 04 Faculty of Medicine > Center for Integrative Human Physiology
  • 06 Faculty of Arts > Institute of Psychology
Dewey Decimal Classification: 150 Psychology
Date: October 2017
Deposited On: 27 Nov 2017 09:52
Last Modified: 04 Nov 2018 06:43
ISSN: 2329-9290
OA Status: Closed
Publisher DOI: https://doi.org/10.1109/TASLP.2017.2759338
Project Information:
  • Funder: FP7
    Grant ID: 338164
    Project Title: IHEARU - Intelligent systems' Holistic Evolving Analysis of Real-life Universal speaker characteristics
  • Funder: SNSF
    Grant ID: PP00P1_157409
    Project Title: Challenging The Human Auditory System at The Limits of Hearing
