For more than four decades, quantitative protest event analysis (PEA) has routinely contributed to the testing and refinement of theories of political processes from different perspectives. However, it is widely agreed that PEA faces serious challenges regarding its data sources. Specifically, researchers applying PEA struggle to draw on multiple sources when covering large geographical areas and long time periods. As a consequence, most of the scholarship still focuses on a narrow set of European countries or the United States and does not cover the period since the early 2000s. We bring PEA and computational linguistics together to propose and evaluate an approach that enables political scientists to extend their research designs through data collection that is both more efficient and reliable. The approach relies on hidden topic models, word space models, and named entity recognition to identify and code protest events.