ArchiMob: ein multidialektales Korpus schweizerdeutscher Spontansprache


Scherrer, Yves; Samardžić, Tanja; Glaser, Elvira (2019). ArchiMob: ein multidialektales Korpus schweizerdeutscher Spontansprache. Linguistik Online, 98(5):425-454.

Abstract

Although Swiss dialects of German are widely used in everyday communication, automatic processing of Swiss German is still a considerable challenge due to the fact that it is mostly a spoken variety and that it is subject to considerable regional variation. This paper presents the ArchiMob corpus, a freely available general-purpose corpus of transcribed spoken Swiss German based on oral history interviews. The corpus is a result of a long design process, intensive manual work and specially adapted computational processing. We first present the modalities of access of the corpus for dialectological, historical and computational research. We then describe how the documents were transcribed, segmented and aligned with the sound source, and summarise a series of experiments that have led to automatically annotated normalisation and part-of-speech tagging layers. Finally, we present several case studies to stimulate the use of the corpus for dialectological research.

Statistics

Downloads

15 downloads since deposited on 18 Dec 2019
15 downloads in the past 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of German Studies
Dewey Decimal Classification: 430 German & related languages
Language: German
Date: 2019
Deposited On: 18 Dec 2019 16:14
Last Modified: 18 Dec 2019 16:14
Publisher: European University Viadrina
ISSN: 1615-3014
OA Status: Gold
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.13092/lo.98.5947
Related URLs: https://www.recherche-portal.ch/permalink/f/5u2s2l/ebi01_prod005557589 (Library Catalogue)

Download

Gold Open Access

Download PDF  'ArchiMob: ein multidialektales Korpus schweizerdeutscher Spontansprache'.
Content: Published Version
Language: German
Filetype: PDF
Size: 1MB