ZORA (Zurich Open Repository and Archive)

NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures

Vamvas, Jannis; Sennrich, Rico (2022). NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures. In: Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, 7 December 2022 - 11 December 2022. Cornell University, 198-213.

Abstract

Being able to rank the similarity of short text segments is an interesting bonus feature of neural machine translation. Translation-based similarity measures include direct and pivot translation probability, as well as translation cross-likelihood, which has not been studied so far. We analyze these measures in the common framework of multilingual NMT, releasing the NMTScore library. Compared to baselines such as sentence embeddings, translation-based measures prove competitive in paraphrase identification and are more robust against adversarial or multilingual input, especially if proper normalization is applied. When used for reference-based evaluation of data-to-text generation in 2 tasks and 17 languages, translation-based measures show a relatively high correlation to human judgments.
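The "proper normalization" mentioned in the abstract can be illustrated with a small sketch. It is not the NMTScore library's API and does not run a translation model: the token log-probabilities below are hypothetical numbers standing in for what a multilingual NMT model might assign, and the function names are made up for illustration. The sketch assumes a direct translation probability that is length-normalized, divided by the probability of the text given itself, and then averaged over both directions.

```python
import math


def length_normalized_prob(token_logprobs):
    """Geometric mean of the token probabilities, i.e. exp(mean log-prob)."""
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def normalized_similarity(cross_logprobs, self_logprobs):
    """Divide p(A|B) by p(A|A), so that identical inputs score exactly 1.0."""
    return length_normalized_prob(cross_logprobs) / length_normalized_prob(self_logprobs)


def symmetric_similarity(sim_a_given_b, sim_b_given_a):
    """Average both directions so the measure does not depend on argument order."""
    return 0.5 * (sim_a_given_b + sim_b_given_a)


# Hypothetical token log-probabilities (assumed values, not model output).
logp_a_given_b = [-0.9, -1.2, -0.6]        # scoring A as a "translation" of B
logp_a_given_a = [-0.1, -0.2, -0.3]        # scoring A given itself
logp_b_given_a = [-1.0, -0.8, -1.1, -0.7]  # scoring B as a "translation" of A
logp_b_given_b = [-0.2, -0.1, -0.3, -0.2]  # scoring B given itself

sim_ab = normalized_similarity(logp_a_given_b, logp_a_given_a)
sim_ba = normalized_similarity(logp_b_given_a, logp_b_given_b)
score = symmetric_similarity(sim_ab, sim_ba)
print(f"similarity: {score:.3f}")
```

Dividing by the self-translation probability puts scores for different segments on a comparable scale, since a text that is hard for the model to reproduce would otherwise score low against everything, including its own paraphrases.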

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics; 06 Faculty of Arts > Zurich Center for Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 11 December 2022
Deposited On: 08 Nov 2022 13:11
Last Modified: 21 Jun 2024 03:38
Publisher: Cornell University
Number: 2022
Additional Information: Working Papers
OA Status: Green
Free access at: Publisher DOI. An embargo period may apply.
Official URL: https://perma.cc/4KW7-7BNN
Related URLs: https://github.com/ZurichNLP/nmtscore (Research Data)
Project Information:
  • Funder: SNSF
  • Grant ID: PP00P1_176727
  • Project Title: Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding
Full text (PDF):
  • Content: Published Version
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)


