How does media coverage of electoral campaigns distinguish parties and candidates in emerging democracies? To answer this question, we present a multi-step procedure and apply it to South Africa. First, we develop a theoretically informed classification that separates election coverage into “narrow” and “broad” within the entire corpus of news coverage during an electoral campaign. Second, to deploy our classification scheme, we use a supervised machine learning approach to classify news as “broad,” “narrow,” or “not election-related.” Finally, we combine our supervised classification with a topic modeling algorithm (BERTopic) that is based on Bidirectional Encoder Representations from Transformers (BERT), along with other statistical and machine learning methods. The combination of our classification scheme, BERTopic, and associated methods allows us to identify the main election-related themes in broad and narrow election coverage, and how different candidates and parties are associated with these themes. We provide an in-depth discussion of our method for interested users in the social sciences. We then apply our proposed techniques to text from nearly 100,000 news articles published during South Africa’s 2014 campaign and test our empirical predictions about candidate and party coverage of corruption, the economy, health, public infrastructure, and security. The application of our method highlights a nuanced campaign environment in South Africa; candidates and parties frequently receive distinct and substantive coverage on key campaign themes.