
In Neural Machine Translation, What Does Transfer Learning Transfer?


Aji, Alham Fikri; Bogoychev, Nikolay; Heafield, Kenneth; Sennrich, Rico (2020). In Neural Machine Translation, What Does Transfer Learning Transfer? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5-10 July 2020, pp. 7701-7710.

Abstract

Transfer learning improves quality for low-resource machine translation, but it is unclear what exactly it transfers. We perform several ablation studies that limit information transfer, then measure the quality impact across three language pairs to gain a black-box understanding of transfer learning. Word embeddings play an important role in transfer learning, particularly if they are properly aligned. Although transfer learning can be performed without embeddings, results are sub-optimal. In contrast, transferring only the embeddings but nothing else yields catastrophic results. We then investigate diagonal alignments with auto-encoders over real languages and randomly generated sequences, finding even randomly generated sequences as parents yield noticeable but smaller gains. Finally, transfer learning can eliminate the need for a warm-up phase when training transformer models in high resource language pairs.
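
Since the abstract turns on selectively copying parameters from a trained parent model into a child model, here is a minimal PyTorch sketch of that ablation setup. This is not the authors' code; the TinyNMT module and the "embed" parameter-name filter are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TinyNMT(nn.Module):
        """Stand-in for a real NMT model (e.g. a Transformer)."""
        def __init__(self, vocab_size=100, dim=16):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)  # word embeddings
            self.encoder = nn.Linear(dim, dim)          # placeholder model body

    def transfer_parameters(parent_state, child_model, transfer_embeddings=True):
        """Copy parent parameters into the child wherever shapes match.

        transfer_embeddings=False keeps the child's random embeddings,
        mirroring the "transfer everything except embeddings" ablation;
        the reverse ablation would copy only the embedding parameters.
        """
        child_state = child_model.state_dict()
        for name, tensor in parent_state.items():
            if not transfer_embeddings and "embed" in name:
                continue  # leave the child's embeddings randomly initialised
            if name in child_state and child_state[name].shape == tensor.shape:
                child_state[name] = tensor.clone()
        child_model.load_state_dict(child_state)

    parent, child = TinyNMT(), TinyNMT()
    transfer_parameters(parent.state_dict(), child, transfer_embeddings=False)

In the paper's terms, embeddings are "properly aligned" when a shared vocabulary item occupies the same row of the embedding matrix in parent and child, so a vocabulary mapping step would precede the copy whenever the two models use different vocabularies.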

Additional indexing

Item Type: Conference or Workshop Item (Paper), original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 10 July 2020
Deposited On: 23 Jun 2020 11:06
Last Modified: 23 Jun 2020 19:30
Publisher: Association for Computational Linguistics
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: https://www.aclweb.org/anthology/2020.acl-main.688
Project Information:
  • Funder: SNSF
  • Grant ID: PP00P1_176727
  • Project Title: Multi-Task Learning with Multilingual Resources for Better Natural Language Understanding

Download

Green Open Access

Download PDF: 'In Neural Machine Translation, What Does Transfer Learning Transfer?'
Content: Published Version
Language: English
Filetype: PDF
Size: 398kB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)