Abstract

Large language models (LLMs) such as GPT-4 have the potential to dramatically change the landscape of influence operations. They can generate persuasive, tailored content at scale, making campaigns built on falsified content, such as disinformation and fake accounts, far easier to produce. Advances in self-hosted open-source models mean that adversaries can evade the content moderation and security checks built into large commercial models such as those offered by Anthropic, Google, and OpenAI. New multilingual models make it easier than ever for foreign adversaries to pose as local actors. This article examines the heightened threats posed by synthetic media, as well as the potential these tools hold for creating effective countermeasures. It begins by assessing the challenges posed by a toxic combination of automated bots, human-controlled troll accounts, and more targeted social engineering operations. The second part of the article then assesses the potential for these same tools to improve detection. Promising countermeasures include running internal generative models to bolster training data for detection classifiers, detecting statistical anomalies, identifying output from common prompts, and building specialised classifiers optimised for specific monitoring needs.
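As a purely illustrative sketch of the first countermeasure the abstract names, using internally generated text to bolster training data for a detection classifier, the snippet below shows one possible shape such augmentation could take. It is not taken from the article: the corpora, labels, and the TF-IDF plus logistic regression detector are all placeholder assumptions.

```python
# Hypothetical sketch: augmenting a synthetic-text detector with
# in-house model-generated examples. All data here is placeholder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder corpora: texts known to be human-written, plus texts
# sampled from an internal generative model to enlarge the training set.
human_texts = ["Example of an authentic post.", "Another genuine comment."]
synthetic_texts = ["Example output sampled from an internal model.",
                   "Another generated post used only for training."]

texts = human_texts + synthetic_texts
labels = [0] * len(human_texts) + [1] * len(synthetic_texts)  # 1 = synthetic

# Simple TF-IDF + logistic regression pipeline; a production detector
# would rely on far larger corpora and stronger features.
detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

# Score a new piece of content for the probability it is machine-generated.
print(detector.predict_proba(["Some new post to screen."])[0][1])
```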
