Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers


Barman, Raphaël; Ehrmann, Maud; Clematide, Simon; Oliveira, Sofia Ares; Kaplan, Frédéric (2021). Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers. arXiv.org 2002.06144v4, University of Zurich.

Abstract

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research efforts that seek to automatically process facsimiles and extract information from them are multiplying, with document layout analysis as a first essential step. Although the identification and categorization of segments of interest in document images have seen significant progress in recent years thanks to deep learning techniques, many challenges remain, including the use of finer-grained segmentation typologies and the handling of complex, heterogeneous documents such as historical newspapers. Moreover, most approaches consider visual features only, ignoring the textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among other things, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models over a strong visual baseline, as well as better robustness to high material variance.

Downloads

20 downloads since deposited on 23 Feb 2021

Additional indexing

Item Type: Working Paper
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Uncontrolled Keywords: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language, Computer Science - Information Retrieval, Computer Science - Machine Learning
Language: English
Date: 19 January 2021
Deposited On: 23 Feb 2021 16:13
Last Modified: 23 Feb 2021 16:13
Series Name: arXiv.org
ISSN: 2331-8422
OA Status: Green
Free access at: Official URL. An embargo period may apply.
Publisher DOI: https://doi.org/10.5281/zenodo.3706863
Official URL: https://arxiv.org/abs/2002.06144
Related URLs: https://jdmdh.episciences.org/7097
Project Information:
  • Funder: SNSF
  • Grant ID: CRSII5_173719
  • Project Title: Media Monitoring of the Past

Download

Green Open Access

Content: Published Version
Language: English
Filetype: PDF
Size: 5MB
Licence: Creative Commons: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)