ABSTRACT In the digital era, media content is crucial for political analysis, providing valuable insights through news articles, social media posts, speeches, and reports. Natural Language Processing (NLP) has transformed Political Information Extraction (IE) by automating tasks such as event extraction and sentiment analysis. Traditional NLP methods, while effective, are often task-specific and require specialized expertise. In contrast, Large Language Models (LLMs) powered by Generative Artificial Intelligence (GenAI) offer a more integrated solution. However, domain-specific challenges persist, which led to the development of the Retrieval-Augmented Generation (RAG) framework. RAG enhances LLMs by incorporating external data retrieval, addressing issues related to data availability. To demonstrate RAG’s capabilities, we introduce the Political-RAG system, designed to extract political event information from media content, including Twitter data and news articles. Although initially developed for event extraction, the Political-RAG system lays the foundation for a range of more complex Political IE tasks, including hate speech detection, conflict analysis, political bias assessment, and the evaluation of social trends, sentiment, and opinions.