Sugisaki, Kyoko; Wiedmer, Nicolas; Hausendorf, Heiko (2018). Building a Corpus from Handwritten Picture Postcards: Transcription, Annotation and Part-of-Speech Tagging. In: 11th edition of the Language Resources and Evaluation Conference, Miyazaki, Japan, 7 May 2018 - 12 May 2018, The LREC.
Abstract
In this paper, we present a corpus of over 11,000 holiday picture postcards written in German and Swiss German. We discuss the processes of digitization, transcription, and manual annotation, as well as the development of automatic text segmentation and part-of-speech tagging.