CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects


Clematide, Simon; Makarov, Peter (2017). CLUZH at VarDial GDI 2017: Testing a Variety of Machine Learning Tools for the Classification of Swiss German Dialects. In: Fourth Workshop on NLP for Similar Languages, Varieties and Dialects, Valencia, 3 April 2017, 170-177.

Abstract

Our submissions for the GDI 2017 Shared Task are the results of three different types of classifiers: Naive Bayes, Conditional Random Fields (CRF), and Support Vector Machine (SVM). Our CRF-based run achieves a weighted F1 score of 65% (third rank), trailing the best system by 0.9%. Measured by classification accuracy, our ensemble run (Naive Bayes, CRF, SVM) reaches 67% (second rank), 1% below the best system. We also describe our experiments with Recurrent Neural Network (RNN) architectures. Since they performed worse than our non-neural approaches, we did not include them in the submission.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Uncontrolled Keywords: dialect identification, machine learning
Language: English
Event End Date: 3 April 2017
Deposited On: 20 Feb 2018 16:58
Last Modified: 31 Jul 2018 04:40
Publisher: Association for Computational Linguistics
Funders: European Research Council Grant No. 338875
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: http://www.aclweb.org/anthology/W17-1221
