Header

UZH-Logo

Maintenance Infos

Wino-X: Multilingual Winograd Schemas for Commonsense Reasoning and Coreference Resolution


Emelin, Denis; Sennrich, Rico (2021). Wino-X: Multilingual Winograd Schemas for Commonsense Reasoning and Coreference Resolution. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, 7 November 2021 - 11 November 2021. Association for Computational Linguistics, 8517-8532.

Abstract

Winograd schemas are a well-established tool for evaluating coreference resolution (CoR) and commonsense reasoning (CSR) capabilities of computational models. So far, schemas remained largely confined to English, limiting their utility in multilingual settings. This work presents Wino-X, a parallel dataset of German, French, and Russian schemas, aligned with their English counterparts. We use this resource to investigate whether neural machine translation (NMT) models can perform CoR that requires commonsense knowledge and whether multilingual language models (MLLMs) are capable of CSR across multiple languages. Our findings show Wino-X to be exceptionally challenging for NMT systems that are prone to undesirable biases and unable to detect disambiguating information. We quantify biases using established statistical methods and define ways to address both of these issues. We furthermore present evidence of active cross-lingual knowledge transfer in MLLMs, whereby fine-tuning models on English schemas yields CSR improvements in other languages.

Abstract

Winograd schemas are a well-established tool for evaluating coreference resolution (CoR) and commonsense reasoning (CSR) capabilities of computational models. So far, schemas remained largely confined to English, limiting their utility in multilingual settings. This work presents Wino-X, a parallel dataset of German, French, and Russian schemas, aligned with their English counterparts. We use this resource to investigate whether neural machine translation (NMT) models can perform CoR that requires commonsense knowledge and whether multilingual language models (MLLMs) are capable of CSR across multiple languages. Our findings show Wino-X to be exceptionally challenging for NMT systems that are prone to undesirable biases and unable to detect disambiguating information. We quantify biases using established statistical methods and define ways to address both of these issues. We furthermore present evidence of active cross-lingual knowledge transfer in MLLMs, whereby fine-tuning models on English schemas yields CSR improvements in other languages.

Statistics

Downloads

66 downloads since deposited on 08 Nov 2021
66 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Conference or Workshop Item (Paper), refereed, original work
Communities & Collections:06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification:000 Computer science, knowledge & systems
410 Linguistics
Language:English
Event End Date:11 November 2021
Deposited On:08 Nov 2021 15:33
Last Modified:28 Apr 2022 07:15
Publisher:Association for Computational Linguistics
OA Status:Green
Free access at:Official URL. An embargo period may apply.
Official URL:https://aclanthology.org/2021.emnlp-main.670
Project Information:
  • : FunderSNSF
  • : Grant IDPP00P1_176727
  • : Project TitleMulti-Task Learning with Multilingual Resources for Better Natural Language Understanding

Download

Green Open Access

Download PDF  'Wino-X: Multilingual Winograd Schemas for Commonsense Reasoning and Coreference Resolution'.
Preview
Content: Published Version
Filetype: PDF
Size: 606kB
Licence: Creative Commons: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)