Parallel Corpora, Terminology Extraction and Machine Translation


Volk, Martin (2018). Parallel Corpora, Terminology Extraction and Machine Translation. In: 16. DTT-Symposion. Terminologie und Text(e), Mannheim, 22 March 2018 - 24 March 2018, 3-14.

Abstract

In this paper we first give an overview of parallel corpus annotation, alignment and retrieval. We present standard annotation methods such as Part-of-Speech tagging, lemmatization and dependency parsing, but we also introduce language-specific methods, e.g. for dealing with split verbs or truncated compounds in German. We argue for careful sentence and word alignment for parallel corpora. And we explain how word alignment is the basis for a wide range of applications from translation variant ranking to terminology extraction. We conclude with a discussion of the latest developments in Machine Translation.

Statistics

Downloads

51 downloads since deposited on 29 Mar 2018
51 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Keynote), not refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 24 March 2018
Deposited On: 29 Mar 2018 13:17
Last Modified: 31 Jul 2018 06:13
Publisher: s.n.
ISBN: 978-9812245-3-5
OA Status: Green

Download

Content: Published Version
Filetype: PDF
Size: 251kB