Ranking Georeferences for Efficient Crowdsourcing of Toponym Annotations in a Historical Corpus of Alpine Texts


Goldzycher, Janis; Meraner, Isabel; Volk, Martin; Clematide, Simon (2020). Ranking Georeferences for Efficient Crowdsourcing of Toponym Annotations in a Historical Corpus of Alpine Texts. In: 5th Swiss Text Analytics Conference (SwissText) & 16th Conference on Natural Language Processing (KONVENS), Zurich, 23 June 2020 - 25 June 2020. CEUR-WS, online.

Abstract

This paper presents a simple method to rank georeference candidates to optimally support the workflow of a citizen science web application for toponym annotation in historical texts. We implement the general idea of efficient crowdsourcing based on human and artificial intelligence working hand in hand. For named entity recognition, we apply recent neural pretraining-based NER tagger methods. For named entity linking to geographical knowledge bases, we report on georeference ranking experiments testing the hypothesis that textual proximity indicates geographic proximity. Simulation results with online reranking that immediately integrates user verification show further improvements.
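The ranking hypothesis named in the abstract (toponyms that occur close together in a text tend to lie close together geographically) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `haversine_km` and `rank_candidates`, the mean-distance score, and all coordinates are assumptions made for illustration only.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def rank_candidates(candidates, anchors):
    """Sort candidate georeferences by mean distance to anchor coordinates.

    candidates: list of (label, (lat, lon)) for one ambiguous toponym.
    anchors: list of (lat, lon) for toponyms mentioned nearby in the text
             whose location is already known (hypothetical input).
    """
    if not anchors:
        # Nothing to compare against, so keep the knowledge-base order.
        return list(candidates)

    def mean_distance(candidate):
        _, coord = candidate
        return sum(haversine_km(coord, a) for a in anchors) / len(anchors)

    return sorted(candidates, key=mean_distance)


# Illustrative, made-up coordinates for two candidates of the same name.
candidates = [("far candidate", (47.4, 9.3)), ("near candidate", (46.0, 7.7))]
anchors = [(46.02, 7.75)]  # a nearby toponym that is already resolved
ranked = rank_candidates(candidates, anchors)
```

Under the same assumptions, online reranking could be simulated by appending each user-verified coordinate to `anchors` and calling `rank_candidates` again for the toponyms that are still unresolved.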

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Scopus Subject Areas: Physical Sciences > General Computer Science
Uncontrolled Keywords: Named Entity Recognition, NER, Named Entity Linking, NEL, Georeferencing
Language: English
Event End Date: 25 June 2020
Deposited On: 15 Feb 2021 06:30
Last Modified: 30 Oct 2021 07:02
Publisher: CEUR-WS
Series Name: CEUR Workshop Proceedings
ISSN: 1613-0073
OA Status: Green
Free access at: Publisher DOI. An embargo period may apply.
Official URL: http://ceur-ws.org/Vol-2624/paper11.pdf

Download

Green Open Access

Content: Published Version
Language: English
Filetype: PDF
Size: 1MB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)