Micro-text classification between small and big data


Christen, Markus; Niederberger, Thomas; Ott, Thomas; Aryobsei, Suleiman; Hofstetter, Reto (2015). Micro-text classification between small and big data. Nonlinear Theory and Its Applications, 6(4):556-569.

Abstract

Micro-texts emerging from social media platforms have become an important source for research. Automated classification and interpretation of such micro-texts is challenging. The problem is exacerbated if the number of texts is at a medium level: too small for effective machine learning, but too large to be analyzed efficiently by humans alone. We present a semi-supervised learning system for micro-text classification that combines machine learning techniques with the unmatched human ability to make demanding, i.e., nonlinear, decisions based on sparse data. We compare our system with human performance and a predefined optimal classifier using a validated benchmark dataset.

Additional indexing

Item Type: Journal Article, not refereed, original work
Communities & Collections: 01 Faculty of Theology > Center for Ethics
    04 Faculty of Medicine > Institute of Biomedical Ethics and History of Medicine
    08 University Research Priority Programs > Ethics
Dewey Decimal Classification: 610 Medicine & health
Language: English
Date: October 2015
Deposited On: 14 Dec 2015 15:20
Last Modified: 05 Apr 2016 19:36
Publisher: Institute of Electronics, Information and Communication Engineers (IEICE)
ISSN: 2185-4106
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.1587/nolta.6.556
Official URL: https://www.jstage.jst.go.jp/article/nolta/6/4/6_556/_pdf
Related URLs: http://www.nolta.ieice.org/data/what.html (Organisation)
    http://www.nolta.ieice.org/data/archives/archives.html (Publisher)
    https://www.jstage.jst.go.jp/article/nolta/6/4/6_556/_article (Publisher)

Download

Content: Published Version
Filetype: PDF (publisher's PDF) - Registered users only
Size: 373kB
