Publication: Word and sentence segmentation in German: Overcoming idiosyncrasies in the use of punctuation in private communication
Date
Citations
Sugisaki, K. (2017, September). Word and sentence segmentation in German: Overcoming idiosyncrasies in the use of punctuation in private communication. 27th International Conference, GSCL 2017, Berlin. https://doi.org/10.1007/978-3-319-73706-5_6
Abstract
In this paper, we present a segmentation system for German texts. We apply conditional random fields (CRF), a statistical sequential model, to a type of text used in private communication. We show that segmenting individual punctuation marks, taking freestanding lines into account, and using unsupervised word representations (i.e., Brown clustering, Word2Vec, and fastText) achieves a label accuracy of 96% on a corpus of postcards used in private communication.
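The ingredients named in the abstract (punctuation split into individual marks, awareness of freestanding lines, and unsupervised word representations) can be pictured as per-token features for a sequence labeler. The following is only a minimal sketch under stated assumptions, not the paper's feature set or implementation: the function names, the feature names, and the `clusters` lookup (a hypothetical mapping from lowercased word to a cluster ID, such as a Brown cluster bit string) are all illustrative.

```python
import re

# Assumption: every punctuation character becomes its own token,
# so "!!!" yields three tokens rather than one.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")
PUNCT_RE = re.compile(r"[^\w\s]")


def tokenize_lines(text):
    """Return (token, ends_line) pairs; ends_line marks the last token of a line."""
    tokens = []
    for line in text.splitlines():
        parts = TOKEN_RE.findall(line)
        for i, tok in enumerate(parts):
            tokens.append((tok, i == len(parts) - 1))
    return tokens


def token_features(tokens, i, clusters=None):
    """Build an illustrative feature dict for token i.

    `clusters` is a hypothetical word -> cluster-ID mapping standing in
    for an unsupervised word representation; unknown words get "UNK".
    """
    tok, ends_line = tokens[i]
    has_next = i + 1 < len(tokens)
    return {
        "lower": tok.lower(),
        "is_punct": PUNCT_RE.fullmatch(tok) is not None,
        "initial_upper": tok[:1].isupper(),
        "ends_line": ends_line,  # cue for freestanding lines
        "next_upper": has_next and tokens[i + 1][0][:1].isupper(),
        "cluster": (clusters or {}).get(tok.lower(), "UNK"),
    }


if __name__ == "__main__":
    sample = "Liebe Oma,\nuns geht es gut!!! Viele Grüße\nAnna"
    toks = tokenize_lines(sample)
    for idx in range(len(toks)):
        print(token_features(toks, idx))
```

Feature dictionaries of this shape, one per token, are what a CRF toolkit would typically take as input alongside per-token boundary labels; the labeling scheme and the training step are left out here because the abstract does not specify them.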
Additional indexing
Creators (Authors)
Event Title
Event Location
Event Start Date
Event End Date
Item Type
Dewey Decimal Classification
Language
Date available
OA Status
Free Access at
Publisher DOI