Context-Aware Monolingual Repair for Neural Machine Translation


Voita, Elena; Sennrich, Rico; Titov, Ivan (2019). Context-Aware Monolingual Repair for Neural Machine Translation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, 3 November 2019 - 7 November 2019, 876-885.

Abstract

Modern sentence-level NMT systems often produce plausible translations of isolated sentences. However, when put in context, these translations may end up being inconsistent with each other. We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. DocRepair performs automatic post-editing on a sequence of sentence-level translations, refining translations of sentences in context of each other. For training, the DocRepair model requires only monolingual document-level data in the target language. It is trained as a monolingual sequence-to-sequence model that maps inconsistent groups of sentences into consistent ones. The consistent groups come from the original training data; the inconsistent groups are obtained by sampling round-trip translations for each isolated sentence. We show that this approach successfully imitates inconsistencies we aim to fix: using contrastive evaluation, we show large improvements in the translation of several contextual phenomena in an English-Russian translation task, as well as improvements in the BLEU score. We also conduct a human evaluation and show a strong preference of the annotators to corrected translations over the baseline ones. Moreover, we analyze which discourse phenomena are hard to capture using monolingual data only.
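The data construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two translators are toy stand-ins (assumptions) for real target-to-source and source-to-target NMT models, and the names `round_trip`, `make_training_pair`, and `TOY_VARIANTS` are hypothetical. The only point it shows is that each sentence is round-tripped in isolation, so the resulting group can be inconsistent, while the original group serves as the consistent target.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A "translator" here is any function that samples one translation of a sentence.
Translator = Callable[[str, random.Random], str]

# Hypothetical lexical variants, used only by the toy translator below.
TOY_VARIANTS = {
    "you": ["you", "thou"],
    "big": ["big", "large"],
    "said": ["said", "told"],
}


def toy_to_source(sentence: str, rng: random.Random) -> str:
    """Stand-in for a target-to-source model: returns the sentence unchanged."""
    return sentence


def toy_to_target(sentence: str, rng: random.Random) -> str:
    """Stand-in for a source-to-target model: samples one variant per word."""
    words = sentence.split()
    return " ".join(rng.choice(TOY_VARIANTS.get(w, [w])) for w in words)


def round_trip(sentence: str, to_source: Translator, to_target: Translator,
               rng: random.Random) -> str:
    """Translate one isolated sentence to the source language and back."""
    return to_target(to_source(sentence, rng), rng)


def make_training_pair(group: Sequence[str], to_source: Translator,
                       to_target: Translator,
                       rng: random.Random) -> Tuple[List[str], List[str]]:
    """Return (inconsistent input group, consistent target group)."""
    # Each sentence is processed independently, without access to its context.
    inconsistent = [round_trip(s, to_source, to_target, rng) for s in group]
    return inconsistent, list(group)


group = ["you said it was big", "you said so"]
noisy, clean = make_training_pair(group, toy_to_source, toy_to_target,
                                  random.Random(0))
print(noisy)  # model input: sentences round-tripped in isolation
print(clean)  # model target: the original, consistent group
```

Because the sampling is done per sentence, the same word may be rendered differently in neighbouring sentences of the input group. A sequence-to-sequence model trained on such pairs would then learn to map the input group back to the original one.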

Additional indexing

Item Type: Conference or Workshop Item (Paper), not_refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 7 November 2019
Deposited On: 05 Nov 2019 14:41
Last Modified: 25 Nov 2019 12:25
Publisher: Association for Computational Linguistics
OA Status: Green
Official URL: https://www.aclweb.org/anthology/D19-1081.pdf
Related URLs: https://www.aclweb.org/anthology/D19-1081
Project Information:
  • Funder: H2020; Grant ID: 825460; Project Title: European Live Translator
  • Funder: Royal Society; Grant ID: NAF\R1\180122; Project Title: Towards Discourse-level Neural Machine Translation

Download

Content: Published Version
Language: English
Filetype: PDF
Size: 638kB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)