ZORA (Zurich Open Repository and Archive)

Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias

Vamvas, Jannis; Sennrich, Rico (2021). Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias. In: 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online and in Punta Cana, Dominican Republic, 7 November 2021 - 11 November 2021, ACL Anthology.

Abstract

Lexical disambiguation is a major challenge for machine translation systems, especially if some senses of a word are trained less often than others. Identifying patterns of overgeneralization requires evaluation methods that are both reliable and scalable. We propose contrastive conditioning as a reference-free black-box method for detecting disambiguation errors. Specifically, we score the quality of a translation by conditioning on variants of the source that provide contrastive disambiguation cues. After validating our method, we apply it in a case study to perform a targeted evaluation of sequence-level knowledge distillation. By probing word sense disambiguation and translation of gendered occupation names, we show that distillation-trained models tend to overgeneralize more than other models with a comparable BLEU score. Contrastive conditioning thus highlights a side effect of distillation that is not fully captured by standard evaluation metrics. Code and data to reproduce our findings are publicly available.
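
As a rough illustration of the scoring scheme described in the abstract, the sketch below scores a fixed candidate translation under two contrastive variants of the source. It assumes a Hugging Face seq2seq model as the evaluator; the model name, the `log_prob` helper, and the cue sentences are illustrative stand-ins, not the authors' released implementation (see the linked GitHub repository for that).

```python
# Minimal sketch of contrastive conditioning (assumptions noted above).
# An off-the-shelf en-de model serves as the evaluator; we compare the
# probability of one candidate translation under contrastive sources.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "Helsinki-NLP/opus-mt-en-de"  # assumption: any en-de seq2seq model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
model.eval()

def log_prob(source: str, translation: str) -> float:
    """Hypothetical helper: total log-probability of `translation`
    conditioned on `source`, recovered from the mean per-token loss."""
    enc = tokenizer(source, return_tensors="pt")
    labels = tokenizer(text_target=translation, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(**enc, labels=labels).loss  # mean NLL per target token
    return -loss.item() * labels.size(1)

# System output to evaluate: a gendered German occupation noun.
translation = "Sie ist Ärztin."  # feminine form of "doctor"

# Contrastive source variants supplying the disambiguation cue.
consistent_src = "She is a doctor, and she is a woman."
contrastive_src = "He is a doctor, and he is a man."

# The translation counts as correct if it is more probable under the cue
# matching its reading than under the opposite cue.
if log_prob(consistent_src, translation) > log_prob(contrastive_src, translation):
    print("translation consistent with intended sense")
else:
    print("possible disambiguation error")
```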

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 11 November 2021
Deposited On: 01 Sep 2021 13:40
Last Modified: 27 Apr 2022 07:29
Publisher: ACL Anthology
OA Status: Green
Official URL: https://aclanthology.org/2021.emnlp-main.803/
Related URLs: https://github.com/ZurichNLP/contrastive-conditioning (Research Data)
Project Information:
  • Funder: SNSF
  • Grant ID: PP00P1_176727
  • Project Title: Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding
Download PDF (Accepted Version): 'Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias'.

Statistics

Citations

3 citations in Web of Science®
14 citations in Scopus®

Downloads

189 downloads since deposited on 01 Sep 2021
26 downloads in the last 12 months
