ZORA (Zurich Open Repository and Archive)

GazPNE2: A General Place Name Extractor for Microblogs Fusing Gazetteers and Pretrained Transformer Models

Hu, Xuke; Zhou, Zhiyong; Sun, Yeran; Kersten, Jens; Klan, Friederike; Fan, Hongchao; Wiegmann, Matti (2022). GazPNE2: A General Place Name Extractor for Microblogs Fusing Gazetteers and Pretrained Transformer Models. IEEE Internet of Things Journal, 9(17):16259-16271.

Abstract

The concept of “human as sensors” defines a new sensing model, in which humans act as sensors by contributing their observations, perceptions, and sensations. This is crucial for the development of the Social Internet of Things, which is an integral part of cyber–physical–social systems. Online social media platforms, as the most active places where users act as social sensors, are responsive to real-world events and are useful for gathering situational information in real time. Unfortunately, posts rarely contain structured geographic information, which hinders their use in applications such as emergency response. We address this limitation by introducing a general approach for extracting place names from tweets, named GazPNE2. It combines global gazetteers (i.e., OpenStreetMap and GeoNames), deep learning, and pretrained transformer models (i.e., BERT and BERTweet), and requires no manually annotated data. It can extract place names at both coarse (e.g., city) and fine-grained (e.g., street and POI) levels, as well as place names with abbreviations. To fully evaluate GazPNE2 and compare it with 11 competing approaches, we use 19 public tweet data sets, containing 38 802 tweets and 22 197 places across the world. The results show that GazPNE2 achieves a much higher F1 (0.8) than the other approaches. Furthermore, we apply GazPNE2 to three large unannotated tweet data sets related to over 20 crisis events (e.g., coronavirus disease 2019), containing 560 040 tweets. An F1 of 0.84 is achieved on 3000 tweets, which are randomly selected from the three data sets and then manually annotated.

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 07 Faculty of Science > Institute of Geography
Dewey Decimal Classification: 910 Geography & travel
Scopus Subject Areas: Physical Sciences > Signal Processing; Physical Sciences > Information Systems; Physical Sciences > Hardware and Architecture; Physical Sciences > Computer Science Applications; Physical Sciences > Computer Networks and Communications
Uncontrolled Keywords: Computer Networks and Communications, Computer Science Applications, Hardware and Architecture, Information Systems, Signal Processing
Language: English
Date: 1 September 2022
Deposited On: 08 Dec 2022 16:11
Last Modified: 28 Oct 2024 02:39
Publisher: Institute of Electrical and Electronics Engineers
ISSN: 2327-4662
Additional Information: Code and data are available on GitHub: https://github.com/uhuohuy/GazPNE2
OA Status: Closed
Publisher DOI: https://doi.org/10.1109/jiot.2022.3150967

Statistics

Citations

24 citations in Web of Science®
28 citations in Scopus®

Downloads

1 download since deposited on 08 Dec 2022
0 downloads in the past 12 months