ZORA (Zurich Open Repository and Archive)

Combining statistical machine translation and translation memories with domain adaptation

Läubli, Samuel; Fishel, Mark; Volk, Martin; Weibel, Manuela (2013). Combining statistical machine translation and translation memories with domain adaptation. In: NODALIDA 2013, Nordic Conference of Computational Linguistics, Oslo, Norway, 22 May 2013 - 24 May 2013. Linköpings universitet Electronic Press, 331-341.

Abstract

Since the emergence of translation memory software, translation companies and freelance translators have been accumulating translated text for various languages and domains. This data can be used to train domain-specific machine translation systems for corporate or even personal use. But while the resulting systems usually perform well in translating domain-specific language, their out-of-domain vocabulary coverage is often insufficient due to the limited size of the translation memories. In this paper, we demonstrate that small in-domain translation memories can be successfully complemented with freely available general-domain parallel corpora such that (a) the number of out-of-vocabulary (OOV) words is reduced while (b) the in-domain terminology is preserved. In our experiments, a German–French and a German–Italian statistical machine translation system geared to marketing texts of the automobile industry have been significantly improved using Europarl and OpenSubtitles data, both in terms of automatic evaluation metrics and human judgement.
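The OOV effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how the corpora are combined, so the sketch simply concatenates a small in-domain corpus with a general-domain one and counts test tokens missing from the training vocabulary. The function names and the toy sentences are hypothetical.

```python
def vocabulary(sentences):
    """Collect the set of lowercased, whitespace-separated tokens."""
    return {token for sentence in sentences for token in sentence.lower().split()}


def oov_tokens(train_sentences, test_sentences):
    """Return the test tokens (in order) that never occur in the training data."""
    vocab = vocabulary(train_sentences)
    return [
        token
        for sentence in test_sentences
        for token in sentence.lower().split()
        if token not in vocab
    ]


# Hypothetical toy data standing in for a translation memory (in-domain)
# and a general-domain corpus; only the source side is shown.
in_domain = ["der motor leistet 150 ps", "das fahrwerk ist sportlich"]
general = ["das haus ist gross", "der mann kauft ein auto"]
test = ["der motor ist gross", "ein sportlich auto"]

oov_small = oov_tokens(in_domain, test)
oov_combined = oov_tokens(in_domain + general, test)

print("OOV with in-domain data only:", oov_small)
print("OOV after adding general-domain data:", oov_combined)
```

Plain concatenation only addresses point (a). Preserving in-domain terminology, point (b), would need some form of weighting towards the in-domain data, which this sketch does not attempt.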

Additional indexing

Item Type: Conference or Workshop Item (Paper), not_refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 24 May 2013
Deposited On: 13 Jun 2013 06:26
Last Modified: 30 Jul 2020 09:09
Publisher: Linköpings universitet Electronic Press
Series Name: Linköping Electronic Conference Proceedings
ISSN: 1650-3686
ISBN: 978-91-7519-589-6
Funders: Swiss Federal Commission for Technology and Innovation CTI
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: http://www.ep.liu.se/ecp/085/030/ecp1385030.pdf
Related URLs: http://www.ep.liu.se/ecp_article/index.en.aspx?issue=085;article=030
Full text: Published Version (PDF), English
