ZORA (Zurich Open Repository and Archive)

Semi-supervised autoencoders for speech emotion recognition

Deng, Jun; Xu, Xinzhou; Zhang, Zixing; Frühholz, Sascha; Schuller, Bjorn (2017). Semi-supervised autoencoders for speech emotion recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, (99):1.

Abstract

Despite the widespread use of supervised learning methods for speech emotion recognition, they are severely restricted by the lack of a sufficient amount of labelled speech data for training. Given the wide availability of unlabelled speech data, this paper proposes semi-supervised autoencoders to improve speech emotion recognition. The aim is to benefit from combining labelled and unlabelled data. The proposed model extends a popular unsupervised autoencoder by carefully adjoining a supervised learning objective. We extensively evaluate the proposed model on the INTERSPEECH 2009 Emotion Challenge database and four other public databases in different scenarios. Experimental results demonstrate that the proposed model achieves state-of-the-art performance with a very small amount of labelled data on the challenge task and other tasks, and significantly outperforms alternative methods.
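The idea of adjoining a supervised objective to an unsupervised autoencoder can be illustrated with a minimal sketch. This is not the authors' formulation, which this record does not give. It assumes a single sigmoid hidden layer, a linear decoder, a softmax classifier on the hidden code, and a hypothetical weight `alpha` that balances the two terms. All names, layer sizes, and the random data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(n_in, n_hidden, n_classes, seed=0):
    """Small random weights for encoder, decoder, and classifier (assumed layout)."""
    rng = np.random.default_rng(seed)
    return (
        rng.normal(0.0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),        # encoder
        rng.normal(0.0, 0.1, (n_hidden, n_in)), np.zeros(n_in),            # decoder
        rng.normal(0.0, 0.1, (n_hidden, n_classes)), np.zeros(n_classes),  # classifier
    )

def semi_supervised_loss(x_lab, y_lab, x_unl, params, alpha=1.0):
    """Reconstruction error on all samples plus alpha * cross-entropy on labelled ones."""
    W_enc, b_enc, W_dec, b_dec, W_cls, b_cls = params
    x_all = np.vstack([x_lab, x_unl])       # labelled and unlabelled share the autoencoder
    h_all = sigmoid(x_all @ W_enc + b_enc)  # hidden code
    x_rec = h_all @ W_dec + b_dec           # linear reconstruction
    recon = np.mean(np.sum((x_all - x_rec) ** 2, axis=1))
    h_lab = h_all[: len(x_lab)]             # only labelled rows reach the classifier
    probs = softmax(h_lab @ W_cls + b_cls)
    ce = -np.mean(np.log(probs[np.arange(len(y_lab)), y_lab] + 1e-12))
    return recon + alpha * ce, recon, ce

# Illustrative data: 8 labelled and 32 unlabelled feature vectors, 2 classes.
rng = np.random.default_rng(1)
x_lab = rng.normal(size=(8, 20))
y_lab = rng.integers(0, 2, size=8)
x_unl = rng.normal(size=(32, 20))
params = init_params(n_in=20, n_hidden=10, n_classes=2)
total, recon, ce = semi_supervised_loss(x_lab, y_lab, x_unl, params, alpha=0.5)
print(f"total={total:.3f} recon={recon:.3f} ce={ce:.3f}")
```

In this sketch the unlabelled samples influence only the reconstruction term, while the labelled samples contribute to both terms. Setting `alpha=0` reduces it to a plain autoencoder objective.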

Additional indexing

Item Type: Journal Article, original work
Communities & Collections:
  • 04 Faculty of Medicine > Neuroscience Center Zurich
  • 04 Faculty of Medicine > Zurich Center for Integrative Human Physiology (ZIHP)
  • 06 Faculty of Arts > Institute of Psychology
Dewey Decimal Classification: 150 Psychology
Scopus Subject Areas:
  • Physical Sciences > Computer Science (miscellaneous)
  • Physical Sciences > Acoustics and Ultrasonics
  • Physical Sciences > Computational Mathematics
  • Physical Sciences > Electrical and Electronic Engineering
Language: English
Date: October 2017
Deposited On: 27 Nov 2017 09:52
Last Modified: 20 Nov 2024 04:32
ISSN: 2329-9290
OA Status: Closed
Publisher DOI: https://doi.org/10.1109/TASLP.2017.2759338
Project Information:
  • Funder: FP7; Grant ID: 338164; Project Title: IHEARU - Intelligent systems' Holistic Evolving Analysis of Real-life Universal speaker characteristics
  • Funder: SNSF; Grant ID: PP00P1_157409; Project Title: Challenging The Human Auditory System at The Limits of Hearing
Full text not available from this repository.

Citations

101 citations in Web of Science®
125 citations in Scopus®