ZORA (Zurich Open Repository and Archive)

Acoustic compression in Zoom audio does not compromise voice recognition performance

Perepelytsia, Valeriia; Dellwo, Volker (2023). Acoustic compression in Zoom audio does not compromise voice recognition performance. Scientific Reports, 13(1):18742.

Abstract

Human voice recognition over telephone channels typically yields lower accuracy than recognition from higher-quality audio recorded in a studio environment. Here, we investigated the extent to which audio in video conferencing, which is subject to various lossy compression mechanisms, affects human voice recognition performance. Voice recognition performance was tested in an old–new recognition task under three audio conditions (telephone, Zoom, studio) across all matched combinations (familiarization and test with the same audio condition) and mismatched combinations (familiarization and test with different audio conditions). Participants were familiarized with female voices presented in either studio-quality (N = 22), Zoom-quality (N = 21), or telephone-quality (N = 20) stimuli. Subsequently, all listeners performed an identical voice recognition test containing a balanced stimulus set from all three conditions. Results revealed that voice recognition performance (dʹ) in Zoom audio was not significantly different from that in studio audio, and that listeners performed significantly better in both Zoom and studio audio than in telephone audio. This suggests that the signal processing of the speech codec used by Zoom preserves information that is as relevant for voice recognition as the information in studio audio. Interestingly, listeners familiarized with voices via Zoom audio showed a trend towards better recognition performance in the test (p = 0.056) compared to listeners familiarized with studio audio. We discuss future directions in which a possible advantage of Zoom audio for voice recognition might be related to some of the speech coding mechanisms used by Zoom.
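The abstract reports recognition performance as the sensitivity index dʹ from an old–new task. As a rough illustration of what that measure captures, the sketch below computes dʹ as z(hit rate) minus z(false-alarm rate) from response counts. The function name `d_prime`, its parameters, and the log-linear correction for extreme rates are assumptions made for this example; the record does not state how the authors computed dʹ.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' for an old-new recognition task.

    Illustrative sketch only. The log-linear correction (add 0.5 to each
    count and 1 to each total) is an assumption of this example; it keeps
    the rates away from 0 and 1, where the z-transform is undefined.
    """
    # "Old" test items: proportion correctly recognized as old.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    # "New" test items: proportion wrongly judged as old.
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    # z-transform: inverse CDF of the standard normal distribution.
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical counts for one listener, not data from the study.
    print(round(d_prime(18, 2, 4, 16), 2))
```

A dʹ of zero means the listener cannot separate familiarized voices from new ones, and larger values mean better discrimination regardless of any overall tendency to answer "old".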

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections:
  • 06 Faculty of Arts > Institute of Computational Linguistics
  • 06 Faculty of Arts > Zurich Center for Linguistics
  • Special Collections > NCCR Evolving Language
  • Special Collections > Centers of Competence > Center for the Interdisciplinary Study of Language Evolution
  • Special Collections > Centers of Competence > Competence Centre Language and Medicine Zurich
  • 08 Research Priority Programs > Language and Space
  • 06 Faculty of Arts > Linguistic Research Infrastructure (LiRI)
Dewey Decimal Classification:
  • 000 Computer science, knowledge & systems
  • 410 Linguistics
Scopus Subject Areas: Health Sciences > Multidisciplinary
Uncontrolled Keywords: Multidisciplinary
Language: English
Date: 31 October 2023
Deposited On: 11 Jan 2024 13:36
Last Modified: 30 Aug 2024 01:39
Publisher: Nature Publishing Group
ISSN: 2045-2322
OA Status: Gold
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.1038/s41598-023-45971-x
PubMed ID: 37907749
Project Information:
  • Funder: SNSF
  • Grant ID: 185399
  • Project Title: The dynamics of indexical information in speech and its role in speech communication and speaker recognition
Full text (PDF):
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)
