
Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models

Säuberli, Andreas; Clematide, Simon (2024). Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models. In: Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024, Turin, Italy, 20 May 2024. ELRA and ICCL, 22-37.

Abstract

Reading comprehension tests are used in a variety of applications, ranging from education to assessing the comprehensibility of simplified texts. However, creating such tests manually and ensuring their quality is difficult and time-consuming. In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items. To this end, we compiled a dataset of German reading comprehension items and developed a new protocol for human and automatic evaluation, including a metric we call text informativity, which is based on guessability and answerability. We then used this protocol and the dataset to evaluate the quality of items generated by Llama 2 and GPT-4. Our results suggest that both models are capable of generating items of acceptable quality in a zero-shot setting, but GPT-4 clearly outperforms Llama 2. We also show that LLMs can be used for automatic evaluation by eliciting item responses from them. In this scenario, evaluation results with GPT-4 were the most similar to human annotators. Overall, zero-shot generation with LLMs is a promising approach for generating and evaluating reading comprehension test items, in particular for languages without large amounts of available data.
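The abstract does not give the exact formula for text informativity, but a minimal sketch consistent with its description (the gap between answerability when respondents see the text and guessability when they do not) might look like the following; the function names and the subtraction are assumptions for illustration, not the paper's definitive definition:

```python
def answerability(responses_with_text):
    """Fraction of correct responses when respondents see the text."""
    return sum(responses_with_text) / len(responses_with_text)

def guessability(responses_without_text):
    """Fraction of correct responses when respondents see only the item."""
    return sum(responses_without_text) / len(responses_without_text)

def text_informativity(with_text, without_text):
    """How much the text itself contributes to answering the item:
    answerability minus guessability (assumed formulation)."""
    return answerability(with_text) - guessability(without_text)

# Hypothetical example: 9 of 10 correct with the text,
# 3 of 10 correct when guessing without it.
score = text_informativity([1] * 9 + [0], [1] * 3 + [0] * 7)
```

Under this reading, an item whose answers can be guessed without the text scores near zero, while an item that genuinely requires the text scores high; responses here could come from human annotators or, as in the automatic-evaluation setting, from an LLM.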

Additional indexing

Item Type: Conference or Workshop Item (Paper), original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Event End Date: 20 May 2024
Deposited On: 05 Jun 2024 18:14
Last Modified: 05 Aug 2024 14:40
Publisher: ELRA and ICCL
OA Status: Green
Official URL: https://aclanthology.org/2024.readi-1.3
Project Information:
  • Funder: Innosuisse
  • Grant ID: PFFS-21-47
  • Project Title: Flagship Inclusive Information and Communication Technologies (IICT)
  • Project Website: https://www.iict.uzh.ch/en.html
Download PDF: 'Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models'
  • Content: Published Version
  • Language: English
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)
Download PDF: 'Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models'
  • Content: Published Version
  • Language: German
  • Description: Summary in easily understandable German
  • Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)

Downloads

19 downloads since deposited on 05 Jun 2024
17 downloads in the past 12 months