The influence of speech rate on Fujisaki model parameters


Mixdorff, Hansjörg; Leemann, Adrian; Dellwo, Volker (2014). The influence of speech rate on Fujisaki model parameters. EURASIP Journal on Audio, Speech, and Music Processing, 2014(33):online.

Abstract

The current paper examines influences of speech rate on Fujisaki model parameters based on read speech from the BonnTempo-Corpus containing productions by 12 native speakers of German at five different intended tempo levels (very slow, slow, normal, fast, fastest possible). The normal condition was produced at an average rate of 6.34 syllables/s or 100%, the very slow version at 67%, and the fastest version at 161% of the normal rate. We extracted F0 contours and subjected them to decomposition using the Fujisaki model. We ordered all the data with respect to their actual speech rates. First, we assessed how prosodic realizations vary with speech rate and examined phrase command magnitudes, the number of phrase commands, the base frequency, accent command amplitudes, and the timing of accent commands with respect to the underlying syllables and their nuclear vowels. Second, we analyzed between-sentence variability within and between speakers and investigated whether and how the prosodic structure is preserved at different speech rates. For very slow speech, we found for some of the speakers that the original phrase structure had disintegrated into something like a list of isolated words separated by pauses. Very fast speech became chains of uniform syllables at very high pitch and with almost flat intonation. With respect to the F0 range reflected by the amplitude of accent commands, we found strong interspeaker differences. While four of the subjects exhibited a significant reduction at higher speech rates, the others did not. As speed increases, it appears that F0 gestures commence earlier in the syllable, that is, the onset time of accent commands is located closer to the syllable/vowel onset than at lower speed.
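
For readers unfamiliar with the parameters named in the abstract, the following is a minimal sketch of the standard Fujisaki model in its synthesis direction: a base frequency Fb, phrase commands (onset T0, magnitude Ap), and accent commands (onset T1, offset T2, amplitude Aa) are superimposed in the log-F0 domain. It is not the authors' extraction procedure, which works in the opposite direction by estimating these parameters from measured F0 contours. The function names and the default constants (alpha = 2.0/s, beta = 20.0/s, gamma = 0.9) are assumptions chosen for illustration and are not taken from this record.

```python
import math


def phrase_response(t, alpha=2.0):
    """Impulse response of the phrase control mechanism, Gp(t)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism, Ga(t), capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrase_cmds, accent_cmds,
                alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    phrase_cmds: list of (T0, Ap) pairs      (hypothetical input layout)
    accent_cmds: list of (T1, T2, Aa) tuples (hypothetical input layout)
    """
    log_f0 = math.log(fb)
    # Each phrase command adds a slowly decaying component.
    for t0, ap in phrase_cmds:
        log_f0 += ap * phrase_response(t - t0, alpha)
    # Each accent command adds a smoothed rectangular pulse from T1 to T2.
    for t1, t2, aa in accent_cmds:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


# Illustrative values only: one phrase command and one accent command,
# sampled every 10 ms over 2 s.
contour = [fujisaki_f0(i * 0.01, 100.0, [(0.0, 0.5)], [(0.3, 0.6, 0.4)])
           for i in range(200)]
```

In these terms, the abstract's observation that accent commands start closer to the syllable or vowel onset at higher speech rates corresponds to a smaller difference between T1 and the onset time of the associated syllable or vowel.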

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > Department of Comparative Linguistics
Dewey Decimal Classification: 490 Other languages; 890 Other literatures; 410 Linguistics
Language: English
Date: 13 August 2014
Deposited On: 03 Jan 2015 19:20
Last Modified: 08 Dec 2017 09:24
Publisher: SpringerOpen
ISSN: 1687-4714
Funders: Swiss National Science Foundation
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.1186/s13636-014-0033-6

Download

Content: Published Version
Filetype: PDF
Size: 1MB
Licence: Creative Commons: Attribution 2.0 Generic (CC BY 2.0)