ZORA (Zurich Open Repository and Archive)

Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation

Zhang, Biao; Haddow, Barry; Sennrich, Rico (2023). Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation. In: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, Dubrovnik, Croatia, 2 May 2023 - 6 May 2023. Association for Computational Linguistics, 2264-2276.

Abstract

For end-to-end speech translation, regularizing the encoder with the Connectionist Temporal Classification (CTC) objective, using the source transcript or target translation as labels, can greatly improve quality. However, CTC requires an extra prediction layer over the vocabulary space, which adds non-negligible model parameters and computational overhead even though this layer is not needed at inference. In this paper, we re-examine whether CTC regularization needs genuine vocabulary labels and explore strategies to reduce the CTC label space, targeting improved efficiency without quality degradation. We propose coarse labeling for CTC (CoLaCTC), which merges vocabulary labels via simple heuristic rules such as truncation, division or modulo (MOD) operations. Despite its simplicity, our experiments on 4 source and 8 target languages show that CoLaCTC, particularly with MOD, can compress the label space aggressively to 256 labels and even further, gaining training efficiency (1.18× to 1.77× speedup, depending on the original vocabulary size) while still delivering comparable or better performance than the CTC baseline. We also show that CoLaCTC generalizes to CTC regularization regardless of whether the transcript or the translation is used for labeling.
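The abstract names three label-merging rules but this record does not define them. The sketch below shows one plausible reading of each, mapping subword ids from a vocabulary of size `vocab_size` into `num_labels` coarse classes. The function names, the bucket-size choice for division, and the clamp used for truncation are assumptions for illustration and may differ from the paper's definitions; handling of the CTC blank symbol is also left out.

```python
import math
from typing import List


def coarse_mod(token_ids: List[int], num_labels: int) -> List[int]:
    """MOD rule (assumed form): coarse label = token id modulo num_labels."""
    return [t % num_labels for t in token_ids]


def coarse_div(token_ids: List[int], vocab_size: int, num_labels: int) -> List[int]:
    """Division rule (assumed form): consecutive ids share one coarse label."""
    bucket = math.ceil(vocab_size / num_labels)  # ids per coarse label
    return [t // bucket for t in token_ids]


def coarse_trunc(token_ids: List[int], num_labels: int) -> List[int]:
    """Truncation rule (assumed form): ids beyond the cutoff collapse into
    the last coarse label; smaller ids are kept unchanged."""
    return [min(t, num_labels - 1) for t in token_ids]


# Hypothetical subword ids from a 10,000-entry vocabulary.
ids = [3, 260, 9999, 515]
mod_labels = coarse_mod(ids, 256)            # [3, 4, 15, 3]
div_labels = coarse_div(ids, 10_000, 256)    # [0, 6, 249, 12]
trunc_labels = coarse_trunc(ids, 256)        # [3, 255, 255, 255]
```

Under any of these rules the CTC prediction layer only needs `num_labels` outputs instead of `vocab_size`, which is where the parameter and compute savings described in the abstract would come from. Note that distinct tokens can share a coarse label (ids 3 and 515 both map to 3 under MOD), which is acceptable here because the CTC output is used only as a training-time regularizer.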

Additional indexing

Item Type: Conference or Workshop Item (Paper), original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics; 06 Faculty of Arts > Zurich Center for Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Scopus Subject Areas: Physical Sciences > Computational Theory and Mathematics; Physical Sciences > Software; Social Sciences & Humanities > Linguistics and Language
Language: English
Event End Date: 6 May 2023
Deposited On: 28 Jul 2023 10:52
Last Modified: 20 Jun 2024 09:55
Publisher: Association for Computational Linguistics
OA Status: Hybrid
Publisher DOI: https://doi.org/10.18653/v1/2023.eacl-main.166
Full Text: Published Version (English)
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)
