Few-Shot Information Operation Detection Using Active Learning Approach


Alizadeh, Meysam; Shapiro, Jacob N (2023). Few-Shot Information Operation Detection Using Active Learning Approach. In: Thomson, Robert L; Al-khateeb, Samer; Burger, Annetta; Park, Patrick; Pyke, Aryn A. Social, Cultural, and Behavioral Modeling. 16th International Conference, SBP-BRiMS 2023, Pittsburgh, PA, USA, September 20–22, 2023, Proceedings. Cham: Springer, 253-262.

Abstract

Previous research has suggested that supervised machine learning can be used to detect information operations (IO) on social media. Most related research assumes that new data will always be available at exactly the time models are set to be updated. In practice, however, the detection and attribution of IO accounts is time-consuming. There is thus a mismatch between the performance assessment procedures in existing work and the real-world problem they seek to solve. We bridge this gap by demonstrating how active learning approaches can extend the application of classifiers by reducing their dependence on new data. We evaluate the performance of an existing classifier when it is updated according to five active learning strategies. Using Twitter data on state-sponsored information operations, the results show that if querying from Twitter is possible, the best active learning strategy requires 5–10 times fewer tweets than the original model while showing only a 1–3% reduction in the average monthly F1 scores across countries and prediction tasks. If querying from Twitter is not possible, the corresponding active learning strategy requires 5–10 times fewer tweets while showing a 1–9% reduction in the average monthly F1 scores. Depending on the country, a handful to a few hundred new ground-truth examples would suffice to achieve reasonable performance.
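
For readers unfamiliar with active learning, the idea of updating a classifier with only a small, carefully chosen set of new labels can be sketched as follows. This is a minimal illustration of uncertainty sampling, one common pool-based strategy; the abstract does not name the five strategies evaluated in the paper, so this should not be read as the authors' method. The function name, the probability values, and the budget are all hypothetical.

```python
def select_most_uncertain(probabilities, budget):
    """Return indices of the `budget` pool items whose predicted
    probability of the positive class (e.g. "IO account") is closest
    to 0.5, i.e. the items the current classifier is least sure about.

    Hypothetical helper for illustration only.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Rank pool items by distance from the decision boundary at 0.5.
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:budget]


# Made-up classifier scores for five unlabeled items in the pool.
pool_probs = [0.97, 0.52, 0.10, 0.45, 0.80]

# With a labeling budget of 2, only the two most ambiguous items
# would be sent for ground-truth annotation before retraining.
to_label = select_most_uncertain(pool_probs, budget=2)
print(to_label)  # [1, 3]
```

In a monthly update loop of the kind the abstract describes, only the selected items would be labeled and added to the training set, rather than the full batch of new data.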

Additional indexing

Item Type: Book Section, refereed, original work
Communities & Collections: 06 Faculty of Arts > Institute of Political Science
Dewey Decimal Classification: 320 Political science
Scopus Subject Areas: Physical Sciences > Theoretical Computer Science; Physical Sciences > General Computer Science
Uncontrolled Keywords: Information operation, Active learning, Text classification
Language: English
Date: 2023
Deposited On: 04 Jan 2024 10:45
Last Modified: 07 Mar 2024 04:50
Publisher: Springer
Series Name: Lecture Notes in Computer Science
ISSN: 0302-9743
ISBN: 9783031431289
OA Status: Closed
Publisher DOI: https://doi.org/10.1007/978-3-031-43129-6_25
Full text not available from this repository.