Web conversations about complementary and alternative medicines and cancer: content and sentiment analysis


Mazzocut, Mauro; Truccolo, Ivana; Antonini, Marialuisa; Rinaldi, Fabio; Omero, Paolo; Ferrarin, Emanuela; De Paoli, Paolo; Tasso, Carlo (2016). Web conversations about complementary and alternative medicines and cancer: content and sentiment analysis. Journal of Medical Internet Research, 18(6):e120.

Abstract

Background: The use of complementary and alternative medicine (CAM) among cancer patients is widespread and mostly self-administered. One of the most relevant issues today is patients' nondisclosure of CAM use to their doctors. This general lack of communication exposes patients to dangerous behaviors and to less reliable information channels, such as the Web. The Italian context scarcely differs from this trend. It is now possible to systematically mine and analyze the unstructured information available on the Web to gain insight into people’s opinions, beliefs, and rumors concerning health topics.
Objective: Our aim was to analyze Italian Web conversations about CAM, identify the most relevant Web sources, therapies, and diseases, and measure the related sentiment.
Methods: Data were collected using the Web Intelligence tool ifMONITOR. The workflow consisted of 6 phases: (1) definition of eligibility criteria for the ifMONITOR search profile; (2) creation of a CAM terminology database; (3) generic Web search and automatic filtering, with the results manually revised to refine the search profile and stored in the ifMONITOR database; (4) automatic classification using the CAM database terms; (5) selection of the final sample and manual sentiment analysis using a 1-5 score range; (6) manual indexing of the Web sources and of the types of CAM therapy retrieved. Descriptive univariate statistics were computed for each item: absolute frequency, percentage, central tendency (mean sentiment score [MSS]), and variability (standard deviation σ).
Results: Overall, 212 Web sources, 423 Web documents, and 868 opinions were retrieved. The overall sentiment measured tended toward a good score (3.6 of 5). The standard deviation analysis revealed fairly high polarization in the opinions of the conversation participants (σ≥1). In total, 126 of 212 (59.4%) Web sources retrieved were non-health-related. Facebook (89; 21%) and Yahoo Answers (41; 9.7%) were the most relevant. In total, 94 CAM therapies were retrieved. Most belong to the “biologically based therapies or nutrition” category: 339 of 868 opinions (39.1%), with an MSS of 3.9 (σ=0.83). Within nutrition, “diets” collected 154 opinions (18.4%) with an MSS of 3.8 (σ=0.87); “food as CAM” overall collected 112 opinions (12.8%) with an MSS of 4 (σ=0.68). Excluding diets and food, the most discussed CAM therapy was the controversial Italian “Di Bella multitherapy,” with 102 opinions (11.8%) and an MSS of 3.4 (σ=1.21). Breast cancer was the most mentioned disease: 81 opinions of 868.
Conclusions: Conversations about CAM and cancer are ubiquitous. There is great interest in biologically based therapies, which are perceived as harmless and useful, while the risks related to dangerous interactions or malnutrition are underrated. Our results can help doctors become aware of the implications of these beliefs for clinical practice. Mining Web conversations could be a strategy for gaining insight into people’s perspectives on other controversial topics.
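
The descriptive statistics named in the Methods (absolute frequency, percentage, MSS, and σ) can be illustrated with a minimal Python sketch. It is not the authors' code or the ifMONITOR workflow: the function name `summarize`, the category labels, and the sample scores are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize(opinions):
    """Summarize (category, score) pairs, where score is a 1-5 sentiment rating.

    Returns, per category: absolute frequency, percentage of all opinions,
    mean sentiment score (MSS), standard deviation, and a polarization flag.
    """
    by_category = defaultdict(list)
    for category, score in opinions:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of the 1-5 range: {score}")
        by_category[category].append(score)

    total = sum(len(scores) for scores in by_category.values())
    summary = {}
    for category, scores in by_category.items():
        # Assumption: population standard deviation (pstdev), not sample (stdev).
        sigma = pstdev(scores)
        summary[category] = {
            "n": len(scores),
            "percent": round(100 * len(scores) / total, 1),
            "mss": round(mean(scores), 2),
            "sigma": round(sigma, 2),
            # The abstract treats sigma >= 1 as a sign of polarized opinions.
            "polarized": sigma >= 1,
        }
    return summary


# Hypothetical ratings, invented for illustration only (not study data).
sample = [
    ("diets", 4), ("diets", 5), ("diets", 3), ("diets", 4),
    ("multitherapy", 1), ("multitherapy", 5),
    ("multitherapy", 5), ("multitherapy", 2),
]

for category, stats in summarize(sample).items():
    print(category, stats)
```

In this invented sample, the category whose ratings cluster near the mean is not flagged, while the one whose ratings split between very low and very high scores is flagged as polarized, which mirrors how the abstract reads σ≥1.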

Statistics

Citations

2 citations in Web of Science®
1 citation in Scopus®

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Computational Linguistics
Dewey Decimal Classification: 000 Computer science, knowledge & systems; 410 Linguistics
Language: English
Date: 2016
Deposited On: 27 Dec 2016 11:15
Last Modified: 29 Jun 2017 08:03
Publisher: Gunther Eysenbach
ISSN: 1438-8871
Free access at: Publisher DOI. An embargo period may apply.
Publisher DOI: https://doi.org/10.2196/jmir.5521
Related URLs: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4929351
PubMed ID: 27311444
Other Identification Number: PMCID: PMC4929351

Download

Full text not available from this repository.