The present study aimed to identify key topics in online discussions about the use of ChatGPT by applying natural language processing to a large dataset extracted from Reddit. A corpus of 159,971 posts about ChatGPT was collected from the r/ChatGPT subreddit using a custom-made Reddit content scraper written in Python. After data cleaning, the sample was reduced to 119,853 posts, which were subjected to cluster analysis using the open-source IRaMuTeQ software to identify main topics based on co-occurrence in the texts. The clusters were named by a panel of social psychology experts (n=3) who read typical text segments within each cluster. Four thematic clusters emerged, categorized into two main topics: “Society and AI Integration”, focusing on ethical concerns (32.1%), and “Operational Aspects and Applications”, which covers technical and practical facets (67.9%). The latter comprises the clusters “AI Technical Framework”, “Casual AI Interactions”, and “Human-AI Etiquette”. The Reddit discourse provides a broad picture of how users understand ChatGPT, revealing user priorities such as system capabilities and ethical considerations. Notably, the “Human-AI Etiquette” cluster is a novel topic that has received little coverage in the existing literature. The findings underscore the importance of effective prompting for meaningful user engagement with ChatGPT.