ZORA (Zurich Open Repository and Archive)

Reading Does Not Equal Reading: Comparing, Simulating and Exploiting Reading Behavior Across Populations

Reich, David Robert; Deng, Shuwen; Björnsdóttir, Marina; Jäger, Lena A; Hollenstein, Nora (2024). Reading Does Not Equal Reading: Comparing, Simulating and Exploiting Reading Behavior Across Populations. In: Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, 20 May 2024 - 25 May 2024.

Abstract

Eye-tracking-while-reading corpora play a crucial role in the study of human language processing, and, more recently, have been leveraged for cognitively enhancing neural language models. A critical limitation of existing corpora is that they often lack diversity, comprising primarily native speakers. In this study, we expand the eye-tracking-while-reading dataset CopCo, which initially included only Danish L1 readers with and without dyslexia, by incorporating a new dataset of non-native readers with diverse L1 backgrounds. Thus, the extended CopCo corpus constitutes the first eye-tracking-while-reading dataset encompassing neurotypical L1 readers, L1 readers with dyslexia, and non-native readers, all reading the same materials. We first provide extensive descriptive statistics of the extended CopCo corpus. Second, we investigate how different degrees of diversity of the training data affect a state-of-the-art generative model of eye movements in reading. Finally, we use this scanpath generation model for gaze-augmented language modeling and investigate the impact of diversity in the training data on the model’s performance on a range of NLP downstream tasks. The code can be found here: https://github.com/norahollenstein/copco-processing.

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 410 Linguistics; 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Theoretical Computer Science; Physical Sciences > Computational Theory and Mathematics; Physical Sciences > Computer Science Applications
Language: English
Event End Date: 25 May 2024
Deposited On: 09 Feb 2025 15:45
Last Modified: 10 Feb 2025 21:04
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Official URL: https://aclanthology.org/2024.lrec-main.1187/
Full text (PDF):
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)