ZORA (Zurich Open Repository and Archive)

Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT

Voita, Elena; Sennrich, Rico; Titov, Ivan (2021). Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, 7-11 November 2021. ACL Anthology, 8478-8491.

Abstract

Unlike traditional statistical MT, which decomposes the translation task into distinct, separately learned components, neural machine translation uses a single neural network to model the entire translation process. Although neural machine translation is the de facto standard, it is still not clear how NMT models acquire different competences over the course of training, and how this mirrors the different component models of traditional SMT. In this work, we look at the competences related to three core SMT components and find that during training, NMT first focuses on learning target-side language modeling, then improves translation quality, approaching word-by-word translation, and finally learns more complicated reordering patterns. We show that this behavior holds for several models and language pairs. Additionally, we explain how such an understanding of the training process can be useful in practice and, as an example, show how it can be used to improve vanilla non-autoregressive neural machine translation by guiding teacher model selection.
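The analysis described above compares an NMT model's predictions against classical SMT components at successive training checkpoints. As a minimal sketch of that idea, the Python snippet below computes a token-level KL divergence between an NMT decoder's next-token distribution and a fixed target-side language model; low values early in training would reflect the LM-like phase the abstract describes. This is an illustration, not the authors' code: the function name, tensor shapes, and the choice of KL(NMT || LM) as the probe are assumptions, and random scores stand in for real model outputs.

```python
# Hypothetical sketch: measure how closely an NMT model's next-token
# predictions match a target-side language model. Names and shapes are
# placeholders, not the paper's implementation.
import torch
import torch.nn.functional as F

def lm_kl(nmt_logits: torch.Tensor, lm_logits: torch.Tensor) -> torch.Tensor:
    """Mean KL(NMT || LM) over all target positions.

    Both tensors have shape (batch, seq_len, vocab_size) and hold
    unnormalized next-token scores for the same target sentences.
    """
    nmt_logp = F.log_softmax(nmt_logits, dim=-1)
    lm_logp = F.log_softmax(lm_logits, dim=-1)
    # F.kl_div(input, target) computes KL(target || input); with
    # log_target=True both arguments are log-probabilities.
    kl = F.kl_div(lm_logp, nmt_logp, log_target=True, reduction="none")
    # Sum over the vocabulary, then average over batch and positions.
    return kl.sum(dim=-1).mean()

# Toy usage with random scores standing in for real model outputs:
torch.manual_seed(0)
nmt_scores = torch.randn(2, 5, 100)  # e.g., decoder outputs at one checkpoint
lm_scores = torch.randn(2, 5, 100)   # e.g., a fixed target-side LM
print(f"KL(NMT || LM) = {lm_kl(nmt_scores, lm_scores).item():.3f}")
```

Evaluated over a series of checkpoints, such a quantity would trace how the NMT model's predictions first track the language model and later diverge from it as lexical translation and reordering competences develop.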

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 11 November 2021
Deposited On: 08 Nov 2021 16:05
Last Modified: 28 Apr 2022 07:15
Publisher: ACL Anthology
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: https://aclanthology.org/2021.emnlp-main.667
Project Information:
  • Funder: H2020; Grant ID: 825299; Project Title: Global Under-Resourced MEdia Translation
  • Funder: H2020; Grant ID: 678254; Project Title: Induction of Broad-Coverage Semantic Parsers
  • Funder: SNSF; Grant ID: PP00P1_176727; Project Title: Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding
Download PDF: 'Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT'
  • Content: Published Version
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)


Statistics

Citations

2 citations in Web of Science®
10 citations in Scopus®

Downloads

17 downloads since deposited on 08 Nov 2021
5 downloads in the past 12 months
