Header

UZH-Logo

Maintenance Infos

Influences of fundamental oscillation on speaker identification in vocalic utterances by humans and computers


Dellwo, Volker; Kathiresan, Thayabaran; Pellegrino, Elisa; He, Lei; Schwab, Sandra; Maurer, Dieter (2018). Influences of fundamental oscillation on speaker identification in vocalic utterances by humans and computers. In: Interspeech 2018, Hyderabad, India, 2 September 2018 - 6 September 2018. ISCA, 3795-3799.

Abstract

We tested the influence of fundamental oscillation (fo) on human and machine speaker recognition performance in vocalic test utterances. In experiment I, we trained a Gaussian-Mixture model on 15 speakers (80 multi-word utterances each) and tested it with sustained vowel utterances (/a:/, /i:/ and /u:/) under six fo conditions, three changing (fall, rise, fall-rise) and three steady-state (high, mid, low). Results revealed better performance for the steady-state compared to the changing conditions and within the steady-state condition, performance was poorest for high fo. In experiment II, we tested 9 human listeners on a subset of 4 speakers from experiment I. They went through two training tasks (training 1: multi-word utterances; training 2: words). In the test, they recognized speakers based on the same vocalic utterances as in experiment I (for these 4 speakers). Results showed that performance was about equally high for the changing and steady-state vowels, however, in the steady-state condition performance was best for high fo vowels. The experiments suggest that (a) fo has an influence on the strength of speaker specific characteristics in vowels and (b) humans - compared to machines - pay attention to different acoustic information in vocalic utterances for speaker recognition.

Abstract

We tested the influence of fundamental oscillation (fo) on human and machine speaker recognition performance in vocalic test utterances. In experiment I, we trained a Gaussian-Mixture model on 15 speakers (80 multi-word utterances each) and tested it with sustained vowel utterances (/a:/, /i:/ and /u:/) under six fo conditions, three changing (fall, rise, fall-rise) and three steady-state (high, mid, low). Results revealed better performance for the steady-state compared to the changing conditions and within the steady-state condition, performance was poorest for high fo. In experiment II, we tested 9 human listeners on a subset of 4 speakers from experiment I. They went through two training tasks (training 1: multi-word utterances; training 2: words). In the test, they recognized speakers based on the same vocalic utterances as in experiment I (for these 4 speakers). Results showed that performance was about equally high for the changing and steady-state vowels, however, in the steady-state condition performance was best for high fo vowels. The experiments suggest that (a) fo has an influence on the strength of speaker specific characteristics in vowels and (b) humans - compared to machines - pay attention to different acoustic information in vocalic utterances for speaker recognition.

Statistics

Citations

Dimensions.ai Metrics
1 citation in Web of Science®
1 citation in Scopus®
Google Scholar™

Altmetrics

Downloads

103 downloads since deposited on 01 Oct 2018
14 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Conference or Workshop Item (Paper), refereed, original work
Communities & Collections:06 Faculty of Arts > Institute of Computational Linguistics
06 Faculty of Arts > Zurich Center for Linguistics
Dewey Decimal Classification:000 Computer science, knowledge & systems
410 Linguistics
Scopus Subject Areas:Social Sciences & Humanities > Language and Linguistics
Physical Sciences > Human-Computer Interaction
Physical Sciences > Signal Processing
Physical Sciences > Software
Physical Sciences > Modeling and Simulation
Language:English
Event End Date:6 September 2018
Deposited On:01 Oct 2018 13:52
Last Modified:14 Aug 2022 07:48
Publisher:ISCA
OA Status:Green
Free access at:Official URL. An embargo period may apply.
Publisher DOI:https://doi.org/10.21437/Interspeech.2018-2331
Official URL:https://www.isca-speech.org/archive/Interspeech_2018
  • Content: Published Version