The Text+Berg corpus: an alpine French-German parallel resource


Göhring, A; Volk, M (2011). The Text+Berg corpus: an alpine French-German parallel resource. In: TALN 2011, Montpellier, 27 June 2011 - 1 July 2011.

Abstract

This article presents a French-German parallel corpus of more than 4 million tokens which we have compiled as part of the digitization of a large multilingual heritage corpus of alpine texts. This corpus is a valuable resource for cultural heritage and cross-linguistic studies as well as for the development of domain-specific machine translation systems. We have turned a small fraction of the parallel corpus into a high-quality parallel treebank with manually checked syntactic annotations and cross-language word and phrase alignments. This alpine treebank is the first freely available French-German parallel treebank. It complements other treebanks with texts in a new domain and genre: mountaineering reports.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 1 July 2011
Deposited On: 16 Jun 2011 10:42
Last Modified: 23 Nov 2017 03:53
