Enforcing Consistent Translation of German Compound Coreferences


Mascarell, Laura; Fishel, Mark; Korchagina, Natalia; Volk, Martin (2014). Enforcing Consistent Translation of German Compound Coreferences. In: Konvens, Hildesheim, Germany, 8 October 2014 - 10 October 2014, s.n.

Abstract

Coreferences to a German compound (e.g. Nordwand) can be made using its last constituent (e.g. Wand). Intuitively, both coreferences and the last constituent of the compound should share the same translation. However, since Statistical Machine Translation (SMT) systems translate at sentence level, they both may be translated inconsistently across the document. Several studies focus on document level consistency, but mostly in general terms. This paper presents a method to enforce consistency in this particular case. Using two in-domain phrase-based SMT systems, we analyse the effects of compound coreference translation consistency on translation quality and readability of documents. Experimental results show that our method improves correctness and consistency of those coreferences as well as document readability.

Statistics

Downloads

134 downloads since deposited on 03 Sep 2014
6 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), not refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 10 October 2014
Deposited On: 03 Sep 2014 13:33
Last Modified: 27 Nov 2020 07:21
Publisher: s.n.
OA Status: Green