ZORA (Zurich Open Repository and Archive)

Correlations and predictions of reading times using language models and surprisal

Schneider, Gerold (2022). Correlations and predictions of reading times using language models and surprisal. In: Krug, Manfred; Schützler, Ole; Vetter, Fabian; Werner, Valentin. Perspectives on Contemporary English : Structure, Variation, Cognition. Berlin, Bern, Bruxelles, New York, Oxford, Warszawa, Wien: Peter Lang, 209-243.

Abstract

How well can we predict reading times and thus cognitive processing load? This study first assesses correlations between reading times and then uses linear regression to predict reading times from two corpora. We suggest noise reduction methods using reader means and medians to obtain generalisations across individuals. This leads to much higher correlations, prediction accuracy and model fit. Our best models reach a prediction accuracy that is, on average, 37 % off, or that explains up to 54 % of the variation in our data, according to R^2. As the offness is smaller than the standard deviation, we accurately predict a potential reader. We use surprisal, part-of-speech (POS) tags, syntax, and many other features such as word length as a language model. Discourse-related features, for which we use distributional semantic similarity and the distance to previous occurrences, are shown to play a significant role. Morphosyntactic (POS tags) and syntactic features (dependency labels) are also significant, though with a smaller weight. We also observe that fast readers correlate better with surprisal and our models.
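The pipeline outlined in the abstract (aggregate reading times across readers, regress them on surprisal, then report R^2 and an average "offness") can be illustrated with a minimal sketch. This is not the chapter's implementation: the function names, the single-predictor regression, the reading of "offness" as mean relative error, and all toy numbers below are assumptions made purely for illustration. The chapter itself uses many more features (POS tags, dependency labels, word length, discourse features).

```python
import math
from statistics import mean, median


def surprisal(prob):
    """Surprisal in bits: -log2 of a word's probability in context."""
    return -math.log2(prob)


def aggregate_by_word(readings, how=median):
    """Collapse per-reader times (ms) into one value per word.

    `readings` maps a word id to a list of reading times, one per reader.
    Passing `how=mean` gives reader means instead of medians.
    """
    return {word: how(times) for word, times in readings.items()}


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, preds):
    """Share of variance in ys explained by preds."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def mean_relative_error(ys, preds):
    """Average of |observed - predicted| / observed (assumed 'offness')."""
    return mean(abs(y - p) / y for y, p in zip(ys, preds))


# Hypothetical toy data: word probabilities from some language model and
# reading times (ms) from three readers per word. Not taken from the chapter.
toy_probs = {"w1": 0.5, "w2": 0.25, "w3": 0.125, "w4": 0.0625}
toy_readings = {
    "w1": [180, 200, 260],
    "w2": [210, 240, 300],
    "w3": [250, 280, 400],
    "w4": [300, 320, 380],
}

words = sorted(toy_probs)
xs = [surprisal(toy_probs[w]) for w in words]
per_word = aggregate_by_word(toy_readings, how=median)
ys = [per_word[w] for w in words]

slope, intercept = fit_simple_ols(xs, ys)
preds = [intercept + slope * x for x in xs]
fit_quality = r_squared(ys, preds)
offness = mean_relative_error(ys, preds)
```

Aggregating before fitting is the step the abstract calls noise reduction: the regression then targets a typical reader rather than each individual, which removes between-reader variance from the residuals.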

Additional indexing

Item Type: Book Section, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
06 Faculty of Arts > Zurich Center for Linguistics
06 Faculty of Arts > Linguistic Research Infrastructure (LiRI)
Dewey Decimal Classification: 000 Computer science, knowledge & systems
410 Linguistics
Uncontrolled Keywords: eye-tracking, reading times, surprisal, language models, cognitive linguistics, discourse, syntax
Language: English
Date: 2022
Deposited On: 09 Dec 2022 11:36
Last Modified: 09 Dec 2022 11:36
Publisher: Peter Lang
ISBN: 9783631879733
OA Status: Closed
Publisher DOI: https://doi.org/10.3726/b19739

Statistics

Downloads

7 downloads since deposited on 09 Dec 2022
1 download in the past 12 months