ZORA (Zurich Open Repository and Archive)

Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models

Li, Jiaoda; Ataman, Duygu; Sennrich, Rico (2021). Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, 7 November 2021 - 11 November 2021. ACL Anthology, 8556-8562.

Abstract

Multimodal machine translation (MMT) systems have been shown to outperform their text-only neural machine translation (NMT) counterparts when visual context is available. However, recent studies have also shown that the performance of MMT models is only marginally impacted when the associated image is replaced with an unrelated image or noise, which suggests that the visual context might not be exploited by the model at all. We hypothesize that this might be caused by the nature of the commonly used evaluation benchmark, also known as Multi30K, where the translations of image captions were prepared without actually showing the images to human translators. In this paper, we present a qualitative study that examines the role of datasets in stimulating the leverage of visual modality and we propose methods to highlight the importance of visual signals in the datasets which demonstrate improvements in reliance of models on the source images. Our findings suggest the research on effective MMT architectures is currently impaired by the lack of suitable datasets and careful consideration must be taken in creation of future MMT datasets, for which we also provide useful insights.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 11 November 2021
Deposited On: 08 Nov 2021 15:34
Last Modified: 28 Apr 2022 07:15
Publisher: ACL Anthology
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: https://aclanthology.org/2021.emnlp-main.673
Project Information:
  • Funder: SNSF
  • Grant ID: PP00P1_176727
  • Project Title: Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding
Content: Published Version
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)

Statistics

Citations

8 citations in Web of Science®
21 citations in Scopus®

Downloads

30 downloads since deposited on 08 Nov 2021
4 downloads in the past 12 months