Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations


Clematide, Simon; Frick, Karina; Aepli, Noëmi; Goldman, Jean-Philippe (2016). Crowdsourcing Swiss Dialect Transcriptions for Assessing Factors in Writing Variations. In: Proceedings of the 13th Conference on Natural Language Processing (KONVENS), Bochum, Germany, 19 September 2016 - 21 September 2016, 62-67.

Abstract

In this paper, we systematically analyze writing variations of Swiss German in two existing corpora with Standard German glosses: a corpus of 10,000 short text messages and a corpus of transcribed oral history recordings (90,000 tokens). We show that neither resource is sufficient for assessing the factors behind users' writing variations, and we describe a data collection project that involves a citizen science community to solve this problem. Laymen will independently and redundantly transcribe 1,200 short samples (15-20 seconds) of Swiss German audio material according to their own best practice.

Additional indexing

Item Type: Conference or Workshop Item (Speech), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
06 Faculty of Arts > Center for Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
410 Linguistics
430 German & related languages
Uncontrolled Keywords: Citizen Science, Swiss German, Non-Standard Orthography
Language: English
Event End Date: 21 September 2016
Deposited On: 21 Dec 2016 16:51
Last Modified: 18 Apr 2018 11:48
Publisher: Sprachwissenschaftliches Institut, Ruhr-Universität Bochum
Series Name: Bochumer Linguistische Arbeitsberichte
ISSN: 2190-0949
Funders: SNF CRAGP1_164811/1
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: https://www.linguistics.rub.de/bla/016-konvens2016.pdf
Project Information:
  • Funder: SNSF
  • Grant ID:
  • Project Title: SNF CRAGP1_164811/1

Content: Published Version
Filetype: PDF
Size: 1MB
Publisher License