Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias


Vamvas, Jannis; Sennrich, Rico (2021). Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias. In: 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online and in Punta Cana, Dominican Republic, 7 November 2021 - 11 November 2021, GitHub.

Abstract

Lexical disambiguation is a major challenge for machine translation systems, especially if some senses of a word are trained less often than others. Identifying patterns of overgeneralization requires evaluation methods that are both reliable and scalable. We propose contrastive conditioning as a reference-free black-box method for detecting disambiguation errors. Specifically, we score the quality of a translation by conditioning on variants of the source that provide contrastive disambiguation cues. After validating our method, we apply it in a case study to perform a targeted evaluation of sequence-level knowledge distillation. By probing word sense disambiguation and translation of gendered occupation names, we show that distillation-trained models tend to overgeneralize more than other models with a comparable BLEU score. Contrastive conditioning thus highlights a side effect of distillation that is not fully captured by standard evaluation metrics. Code and data to reproduce our findings are publicly available.
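The abstract describes the core idea of contrastive conditioning: a fixed translation is scored while conditioning on source variants that carry different disambiguation cues, and an error is flagged when an incorrect-cue variant explains the translation at least as well as the correct-cue variant. Below is a minimal sketch of that decision rule, not the authors' implementation; the helper names and the `score_translation` function (assumed to return a log-probability of the translation given a source, e.g. from any sequence-to-sequence evaluator model) are hypothetical.

```python
# Minimal sketch of contrastive conditioning (hypothetical helper names).
# A translation is judged against contrastive source variants that each
# contain a different disambiguation cue.

from typing import Callable, Sequence


def detect_disambiguation_error(
    translation: str,
    correct_variant: str,
    contrastive_variants: Sequence[str],
    score_translation: Callable[[str, str], float],
) -> bool:
    """Return True if the translation looks like a disambiguation error.

    `score_translation(source, translation)` is assumed to return the
    log-probability of the translation conditioned on the source
    (evaluator model not specified here).
    """
    correct_score = score_translation(correct_variant, translation)
    contrastive_scores = [
        score_translation(variant, translation) for variant in contrastive_variants
    ]
    # Error if some incorrect cue makes the translation at least as likely
    # as the correct cue does.
    return any(score >= correct_score for score in contrastive_scores)


if __name__ == "__main__":
    # Toy usage with a dummy scorer; replace with a real NMT log-prob scorer.
    dummy_scores = {
        ("The fan (device) was loud.", "Der Ventilator war laut."): -1.0,
        ("The fan (supporter) was loud.", "Der Ventilator war laut."): -4.0,
    }
    scorer = lambda src, tgt: dummy_scores[(src, tgt)]
    is_error = detect_disambiguation_error(
        translation="Der Ventilator war laut.",
        correct_variant="The fan (device) was loud.",
        contrastive_variants=["The fan (supporter) was loud."],
        score_translation=scorer,
    )
    print("Disambiguation error detected:", is_error)
```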


Statistics

Downloads

46 downloads since deposited on 01 Sep 2021
46 downloads in the past 12 months
Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 11 November 2021
Deposited On: 01 Sep 2021 13:40
Last Modified: 04 Sep 2021 09:46
Publisher: GitHub
OA Status: Green
Free access at: Related URL. An embargo period may apply.
Related URLs: https://openreview.net/forum?id=RvO9DqoWI9V
https://github.com/ZurichNLP/contrastive-conditioning (Research Data)
Project Information:
  • Funder: SNSF
  • Grant ID: PP00P1_176727
  • Project Title: Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding

Download

Green Open Access

Download PDF: 'Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias'
Content: Accepted Version
Filetype: PDF
Size: 550kB