Morphological Disambiguation and Text Normalization for Southern Quechua Varieties


Rios, Annette; Castro Mamani, Richard (2014). Morphological Disambiguation and Text Normalization for Southern Quechua Varieties. In: COLING Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects (VarDial), Dublin, Ireland, 23 August 2014, online.

Abstract

We built a pipeline to normalize Quechua texts through morphological analysis and disambiguation. Word forms are analyzed by a set of cascaded finite-state transducers, which split the words and rewrite the morphemes to a normalized form. However, some of these morphemes, or rather morpheme combinations, are ambiguous, which may affect the normalization. For this reason, we disambiguate the morpheme sequences with conditional random fields. Once we know the individual morphemes of a word, we can generate the normalized word form from the disambiguated morphemes.
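
The three stages named in the abstract (analysis, disambiguation, generation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the tiny lexicon, the tag names, and the function names `analyze`, `disambiguate`, `generate` and `normalize` are hypothetical. A dictionary lookup stands in for the cascaded finite-state transducers, and a hand-set bigram score over tags stands in for the conditional random fields.

```python
from typing import Dict, List, Tuple

Morph = Tuple[str, str]  # (normalized form, tag)

# Toy lexicon (hypothetical): surface form -> possible readings.
ROOTS: Dict[str, List[Morph]] = {"wasi": [("wasi", "NRoot")]}
SUFFIXES: Dict[str, List[Morph]] = {
    "nchis": [("nchik", "1.Pl.Incl.Poss")],  # regional spelling, rewritten
    "nchik": [("nchik", "1.Pl.Incl.Poss")],
    "n": [("n", "3.Sg.Poss"), ("n", "DirE")],  # ambiguous suffix
    "pi": [("pi", "Loc")],
}

# Hand-set penalties on tag bigrams; a stand-in for a trained CRF.
BIGRAM_WEIGHTS: Dict[Tuple[str, str], float] = {("DirE", "Loc"): -1.0}


def analyze(word: str) -> List[List[Morph]]:
    """Return every root + suffix segmentation the toy lexicon allows."""
    results: List[List[Morph]] = []

    def split_suffixes(rest: str, prefix: List[Morph]) -> None:
        if not rest:
            results.append(prefix)
            return
        for surface, readings in SUFFIXES.items():
            if rest.startswith(surface):
                for reading in readings:
                    split_suffixes(rest[len(surface):], prefix + [reading])

    for surface, readings in ROOTS.items():
        if word.startswith(surface):
            for reading in readings:
                split_suffixes(word[len(surface):], [reading])
    return results


def score(analysis: List[Morph]) -> float:
    """Sum the bigram weights over adjacent tags."""
    tags = [tag for _, tag in analysis]
    return sum(BIGRAM_WEIGHTS.get(pair, 0.0) for pair in zip(tags, tags[1:]))


def disambiguate(analyses: List[List[Morph]]) -> List[Morph]:
    """Pick the highest-scoring analysis."""
    return max(analyses, key=score)


def generate(analysis: List[Morph]) -> str:
    """Concatenate the normalized morphemes into a word form."""
    return "".join(form for form, _ in analysis)


def normalize(word: str) -> str:
    """Analyze, disambiguate, generate; unknown words pass through."""
    analyses = analyze(word)
    if not analyses:
        return word
    return generate(disambiguate(analyses))


if __name__ == "__main__":
    for w in ["wasinchis", "wasinpi", "xyz"]:
        print(w, "->", normalize(w))
```

In this sketch the ambiguous `n` yields two analyses for `wasinpi`, and the scoring step prefers the possessive reading because the toy weights penalize an evidential followed by a case suffix. A real system would learn such preferences from annotated data rather than hard-code them.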

Additional indexing

Item Type: Conference or Workshop Item (Paper), not refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 23 August 2014
Deposited On: 29 Jul 2014 10:47
Last Modified: 08 Dec 2017 06:39
Publisher: s.n.
Official URL: http://corporavm.uni-koeln.de/vardial/papers/4/4_Paper.pdf
Related URLs: http://corporavm.uni-koeln.de/vardial/
http://corporavm.uni-koeln.de/vardial/program.html

Download

Content: Published Version
Language: English
Filetype: PDF
Size: 131kB