Tagging Complex Non-Verbal German Chunks with Conditional Random Fields


Roth, Luzia; Clematide, Simon (2014). Tagging Complex Non-Verbal German Chunks with Conditional Random Fields. In: Proceedings of the 12th Edition of the KONVENS Conference, Hildesheim, Germany, 8-10 October 2014, 48-57.

Abstract

We report on chunk tagging methods for German that recognize complex non-verbal phrases using structural chunk tags with Conditional Random Fields (CRFs). This state-of-the-art method for sequence classification achieves 93.5% accuracy on newspaper text. For the same task, a classical trigram tagger approach based on Hidden Markov Models reaches a baseline of 88.1%. CRFs allow for a clean and principled integration of linguistic knowledge such as part-of-speech tags, morphological constraints and lemmas. The structural chunk tags encode phrase structures up to a depth of 3 syntactic nodes. They include complex prenominal and postnominal modifiers that occur frequently in German noun phrases.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Uncontrolled Keywords: Chunking, CRF, German
Language: English
Event End Date: 10 October 2014
Deposited On: 14 Oct 2014 15:45
Last Modified: 30 Jul 2020 14:41
ISBN: 978-3-934105-46-1
OA Status: Green
Related URLs: http://www.uni-hildesheim.de/konvens2014 (Organisation)