Text Zoning and Classification for Job Advertisements in German, French and English


Gnehm, Ann-Sophie; Clematide, Simon (2020). Text Zoning and Classification for Job Advertisements in German, French and English. In: Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science, Online, 1 November 2020 - 30 November 2020, 83-93.

Abstract

We present experiments to structure job ads into text zones and classify them into professions, industries and management functions, thereby facilitating social science analyses on labor market demand. Our main contributions are empirical findings on the benefits of contextualized embeddings and the potential of multi-task models for this purpose. With contextualized in-domain embeddings in BiLSTM-CRF models, we reach an accuracy of 91% for token-level text zoning and outperform previous approaches. A multi-tasking BERT model performs well for our classification tasks. We further compare transfer approaches for our multilingual data.

Additional indexing

Item Type: Conference or Workshop Item (Speech), not_refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Sociology
06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 300 Social sciences, sociology & anthropology
Language: English
Event End Date: 30 November 2020
Deposited On: 15 Feb 2021 10:39
Last Modified: 16 Feb 2021 18:00
OA Status: Hybrid
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.18653/v1/2020.nlpcss-1.10

Download

Content: Published Version
Language: English
Filetype: PDF
Size: 257kB
Licence: Creative Commons: Attribution 4.0 International (CC BY 4.0)