Systematically Detecting Patterns of Social, Historical and Linguistic Change: The Framing of Poverty in Times of Poverty


Schneider, Gerold (2022). Systematically Detecting Patterns of Social, Historical and Linguistic Change: The Framing of Poverty in Times of Poverty. Transactions of the Philological Society, 120(3):447-473.

Abstract

The Linguistic DNA project seeks to understand the evolution of philosophy, society and language during the Modern English period. Corpora like Early English Books Online (EEBO), Corpus of Late Modern English Texts (CLMET) and Corpus of Historical American English (COHA) allow us to apply statistical data-driven models extracting patterns to confirm our expectations. As systems biology has revolutionised biology by systematically searching for all patterns, we detect patterns in our data systematically with contextual and distributional semantic approaches, a procedure that could be called systems history. We uncover semantic patterns with methods from text mining, computational linguistics and digital humanities. We normalise the spelling automatically to present-day variants and use bottom-up analyses to step from words to concepts: collocations, topic modelling and distributional semantics. We illustrate the approaches with two case studies: associations of poverty changing across time, and Charles Dickens's social criticism, his vision of helping to improve the situation of the poor. As no gold standard for our task exists, our approaches are exploratory, which entails considerable manual intervention, e.g. sifting candidate lists, reading excerpts and interpreting topic models. A fully automatic approach is currently neither feasible nor envisaged: semi-automatic approaches give researchers the inspiring opportunity to interact with the texts in a constant move between distant and close reading. The different characteristics of the various statistical methods offer complementary perspectives.

Statistics

Downloads

15 downloads since deposited on 16 Dec 2022
12 downloads in the last 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > English Department; 06 Faculty of Arts > Institute of Computational Linguistics; 08 Research Priority Programs > Digital Society Initiative; 06 Faculty of Arts > Zurich Center for Linguistics; 08 Research Priority Programs > Digital Religion(s); 06 Faculty of Arts > Linguistic Research Infrastructure (LiRI)
Dewey Decimal Classification: 820 English & Old English literatures
Scopus Subject Areas: Social Sciences & Humanities > Language and Linguistics; Social Sciences & Humanities > Linguistics and Language
Language: English
Date: November 2022
Deposited On: 16 Dec 2022 07:05
Last Modified: 28 May 2024 01:43
Publisher: Wiley-Blackwell Publishing, Inc.
ISSN: 0079-1636
OA Status: Green
Publisher DOI: https://doi.org/10.1111/1467-968X.12252