Electoral Campaigns and Relation Mining: Extracting Semantic Network Data from Newspaper Articles


Wüest, B; Clematide, S; Bünzli, A; Laupper, D (2011). Electoral Campaigns and Relation Mining: Extracting Semantic Network Data from Newspaper Articles. Journal of Information Technology & Politics, 8(4):444-463.

Abstract

Among the many applications in social science for the entry and management of data, there are only a few software packages that apply natural language processing to identify semantic concepts such as issue categories or political statements by actors. Although these procedures usually allow efficient data collection, most have difficulty in achieving sufficient accuracy because of the high complexity and mutual relationships of the variables used in the social sciences. To address these flaws, we suggest a (semi-) automatic annotation approach that implements an innovative coding method (Core Sentence Analysis) by computational linguistic techniques (mainly entity recognition, concept identification, and dependency parsing). Although such computational linguistic tools have been readily available for quite a long time, social scientists have made astonishingly little use of them. The principal aim of this article is to gather data on party-issue relationships from newspaper articles. In the first stage, we try to recognize relations between parties and issues with a fully automated system. This recognition is extensively tested against manually annotated data of the coverage in the boulevard newspaper Blick of the Swiss national parliamentary elections of 2003 and 2007. In the second stage, we discuss possibilities for extending our approach, such as by enriching these relations with directional measures indicating their polarity.

Downloads

165 downloads since deposited on 22 Mar 2012
23 downloads in the past 12 months

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Political Science
Dewey Decimal Classification: 320 Political science
Uncontrolled Keywords: Computer-assisted content analysis, core sentence approach, electoral research, natural language processing, relation mining
Language: English
Date: 2011
Deposited On: 22 Mar 2012 09:25
Last Modified: 05 Apr 2016 15:34
Publisher: Taylor & Francis Inc.
ISSN: 1933-1681
Funders: Swiss National Science Foundation
Publisher DOI: 10.1080/19331681.2011.567387
Permanent URL: http://doi.org/10.5167/uzh-58296

Download

Content: Published Version
Filetype: PDF - Registered users only
Size: 769kB

Content: Accepted Version
Filetype: PDF
Size: 893kB
