Machine Learning Disambiguation of Quechua Verb Morphology


Rios, A; Göhring, A (2013). Machine Learning Disambiguation of Quechua Verb Morphology. In: Proceedings of the Second Workshop on Hybrid Approaches to Translation, Sofia, Bulgaria, 8 August 2013. Association for Computational Linguistics, 13-18.

Abstract

We have implemented a rule-based prototype of a Spanish-to-Cuzco Quechua MT system enhanced through the addition of statistical components. The greatest difficulty during the translation process is to generate the correct Quechua verb form in subordinated clauses. The prototype has several rules that decide which verb form should be used in a given context. However, matching the context in order to apply the correct rule depends crucially on the parsing quality of the Spanish input. As the form of the subordinated verb depends heavily on the conjunction in the subordinated Spanish clause and the semantics of the main verb, we extracted this information from two treebanks and trained different classifiers on this data. We tested the best classifier on a set of 4 texts, increasing the correct subordinated verb forms from 80% to 89%.

Downloads

161 downloads since deposited on 22 Aug 2013
5 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 8 August 2013
Deposited On: 22 Aug 2013 06:44
Last Modified: 06 Apr 2022 15:18
Publisher: Association for Computational Linguistics
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: http://www.aclweb.org/anthology/W13-2804
  • Content: Published Version
  • Language: English