Hybrid data mining for pandemic public opinion analysis: integrating sentiment, topic, and geolocation data

Abstract

In the era of global health crises, social media has become both a mirror and an amplifier of public opinion, influencing individual behaviours, policy responses, and the spread of (mis)information. The COVID-19 pandemic has underscored the critical role of social media in shaping public discourse, disseminating information, and influencing public health decision-making. Traditional monitoring techniques, such as surveys and focus groups, lack the timeliness, scalability, and granularity required for fast-moving health emergencies. This study presents a hybrid data mining framework that integrates sentiment analysis, topic modelling, and geolocation analytics to deliver a multidimensional view of pandemic-related public discourse. Using approximately 57,000 COVID-19-related tweets extracted via the Twitter API, the study employs lexicon-based sentiment analysis tools (VADER and TextBlob), Latent Dirichlet Allocation (LDA) topic modelling, and Orange Data Mining’s Document Map geolocation feature to capture public sentiment, thematic structures, and geographic patterns. Results show a predominance of neutral sentiment (52.4%), with major topics including public health measures, vaccination discourse, and misinformation narratives. Geolocation mapping revealed regional variations in sentiment, particularly higher vaccine skepticism in certain countries. The integrated framework demonstrates a reproducible, user-friendly, and region-aware methodology for crisis informatics, offering actionable insights for policymakers and public health agencies. 
The framework aligns with WHO infodemic management guidance and recent ethics recommendations, offering a practical, governance-ready model for health ministries and research institutions in low- and middle-income countries (LMICs) (WHO, in Social listening in infodemic management for public health: ethical guidance, World Health Organization, Geneva, 2025; Bhatt et al. in Public Health Rev 46:11. 10.3389/phrs.2025.00011, 2025 and Cascella et al. in Humanit Soc Sci Commun 12:76. 10.1057/s41599-025-04564-x, 2025).
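
The lexicon-based sentiment step and the regional breakdown described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy lexicon, the function names, and the (country, text) input shape are all assumptions made for illustration. The score normalisation and the ±0.05 cut-offs follow the conventions documented for VADER's compound score.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon for illustration only; VADER and TextBlob
# ship their own, much larger, lexicons and handle negation, emphasis, etc.
TOY_LEXICON = {
    "safe": 1.5, "effective": 2.0, "grateful": 2.5,
    "scared": -2.0, "hoax": -2.5, "dangerous": -2.0,
}

def compound_score(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum word valences, then squash into (-1, 1) via x / sqrt(x^2 + alpha)."""
    total = sum(lexicon.get(tok.strip(".,!?").lower(), 0.0) for tok in text.split())
    return total / math.sqrt(total * total + alpha)

def label(score, threshold=0.05):
    """Map a compound score to positive / negative / neutral."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

def sentiment_shares_by_country(tweets):
    """tweets: iterable of (country, text) pairs -> {country: {label: share}}."""
    counts = defaultdict(Counter)
    for country, text in tweets:
        counts[country][label(compound_score(text))] += 1
    return {
        country: {lab: n / sum(cnt.values()) for lab, n in cnt.items()}
        for country, cnt in counts.items()
    }
```

Calling `sentiment_shares_by_country` on geotagged tweets yields per-country label shares, which is the kind of table a map view could then colour by region. Words missing from the lexicon contribute zero, so sparse or off-topic tweets fall into the neutral class.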

Similar Papers
  • Preprint Article
  • 10.2196/preprints.69983
Discovering Topics and Trends in Artificial Intelligence Chatbots in Medicine: Using Latent Dirichlet Allocation Topic Modeling (Preprint)
  • Dec 12, 2024
  • Ming Yue Ni + 5 more

BACKGROUND With the widespread adoption of the internet and smart devices, chatbots have emerged as significant auxiliary tools for public health activities. Despite the increasing application of chatbots in the medical field, comprehensive assessments of research topics and trends in this area remain relatively scarce. OBJECTIVE This study analyzed the application topics of chatbot technology in the medical field and explored the trends of these topics across different time periods, various journals, and different countries. METHODS In this study, a bibliometric approach was used to systematically search the PubMed, CINAHL, Web of Science and Embase databases for literature on medicine and chatbots between 2004 and 2024. By applying Latent Dirichlet Allocation (LDA) topic modeling, the study identified and analyzed the thematic applications of chatbots in the medical field, and explored the temporal evolution of these topics as well as their distribution characteristics across journals and countries. RESULTS We ultimately identified 3,029 articles for analysis. Utilizing the Latent Dirichlet Allocation (LDA) topic modeling technique, we identified nine core topics from the abstracts: ChatGPT medical quiz accuracy research, digital healthcare support assistants, mental health intervention research, epidemic health conversation application research, cancer patient diagnosis and treatment care, artificial intelligence (AI) healthcare education potential research, natural language processing models, human-computer interaction emotion research, and AI reading assistance systems. This study also found that these topics have shown diverse developmental trajectories over time, reflecting the evolution of research interests. In addition, researchers from different journals and countries have shown significant differences in the topics they focus on. 
CONCLUSIONS This study analyzed the topic distribution, temporal trends, journal, and country distribution characteristics of chatbots in the medical field. The results revealed popular and less researched topics, as well as emerging directions and trends, providing researchers with a tool for rapid identification. These findings not only provide guidance for researchers in selecting research directions but also offer references for journals and countries in determining research priorities, formulating strategic plans, and promoting international collaborative research.

  • Research Article
  • Cite Count: 12
  • 10.1186/s40537-022-00605-3
An intelligent literature review: adopting inductive approach to define machine learning applications in the clinical domain
  • Apr 28, 2022
  • Journal of Big Data
  • Renu Sabharwal + 1 more

Big data analytics utilizes different techniques to transform large volumes of big datasets. The analytics techniques utilize various computational methods such as Machine Learning (ML) for converting raw data into valuable insights. The ML assists individuals in performing work activities intelligently, which empowers decision-makers. Since academics and industry practitioners have growing interests in ML, various existing review studies have explored different applications of ML for enhancing knowledge about specific problem domains. However, in most of the cases existing studies suffer from the limitations of employing a holistic, automated approach. While several researchers developed various techniques to automate the systematic literature review process, they also seemed to lack transparency and guidance for future researchers. This research aims to promote the utilization of intelligent literature reviews for researchers by introducing a step-by-step automated framework. We offer an intelligent literature review to obtain in-depth analytical insight of ML applications in the clinical domain to (a) develop the intelligent literature framework using traditional literature and Latent Dirichlet Allocation (LDA) topic modeling, (b) analyze research documents using traditional systematic literature review revealing ML applications, and (c) identify topics from documents using LDA topic modeling. We used a PRISMA framework for the review to harness samples sourced from four major databases (e.g., IEEE, PubMed, Scopus, and Google Scholar) published between 2016 and 2021 (September). The framework comprises two stages—(a) traditional systematic literature review consisting of three stages (planning, conducting, and reporting) and (b) LDA topic modeling that consists of three steps (pre-processing, topic modeling, and post-processing). The intelligent literature review framework transparently and reliably reviewed 305 sample documents.

  • Research Article
  • 10.54517/esp.v8i3.1958
Analyze IMDb movies by sentiment and topic analysis
  • Oct 25, 2023
  • Environment and Social Psychology
  • Ningjing Ouyang

Movies are an important cultural form, carrying multiple levels of meaning such as artistic, entertainment and social value. Movie review and rating data sets are huge, and deep learning and natural language processing methods are widely used today. Advances in big data and deep learning offer unprecedented opportunities to understand moviegoer behavior and preferences while providing a cost-effective way to gain insights relevant to the entertainment industry. This project conducts sentiment analysis, topic modeling, and visual statistical analysis based on the IMDb movie data set to identify key factors and deeper insights that influence successful decision-making in film production. The project first uses a word embedding method to vectorize the movie review text, and then uses Bidirectional Long Short-Term Memory (Bi-LSTM) to perform sentiment classification. In addition, statistical methods such as visualization were used to reach findings such as the highest average number of movie releases occurring in November, and to identify trends, patterns and relationships between the variables of IMDb movies. Finally, a Latent Dirichlet Allocation (LDA) topic model was constructed, which found that the important topic with increased demand is light entertainment movies, highlighting the commercial feasibility of comedy movies as a profitable business model. In summary, this project uses an emotion-topic fusion analysis method based on the Bi-LSTM emotion classification method and the LDA topic modeling method. The results show that the Bi-LSTM model can better identify positive and negative emotions in movie reviews, and the LDA topic model performs well in mining popular topics.

  • Conference Article
  • Cite Count: 15
  • 10.2991/sekeie-14.2014.47
Text Similarity Computing Based on LDA Topic Model and Word Co-occurrence
  • Jan 1, 2014
  • Minglai Shao + 1 more

The LDA (Latent Dirichlet Allocation) topic model has been widely applied to text clustering owing to its efficient dimension reduction. The prevalent method is to model the text set with an LDA topic model, perform inference by Gibbs sampling, and calculate text similarity with the JS (Jensen-Shannon) distance. However, the JS distance cannot distinguish semantic associations among text topics. To address this defect, a new text similarity computing algorithm based on a hidden topic model and word co-occurrence analysis is introduced. Tests are carried out to verify the clustering effect of this improved algorithm. Results show that this method can effectively improve text similarity computation and text clustering accuracy. Keywords: topic model; LDA (Latent Dirichlet Allocation); JS (Jensen-Shannon) distance; word co-occurrence; similarity
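
As a hedged illustration of the baseline this abstract criticises, the sketch below computes the Jensen-Shannon divergence, and its square root (the JS distance), between two document-topic distributions. The function names and example vectors are assumptions for illustration, not taken from the paper, and the proposed word co-occurrence extension is not reproduced here.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def js_distance(p, q):
    """Square root of the JS divergence, which is a proper metric."""
    return math.sqrt(js_divergence(p, q))

# Three documents, each concentrated on a single topic (illustrative values).
doc_a = [1.0, 0.0, 0.0]
doc_b = [0.0, 1.0, 0.0]
doc_c = [0.0, 0.0, 1.0]
```

With these vectors, `js_distance(doc_a, doc_b)` and `js_distance(doc_a, doc_c)` are equal, even if topics 1 and 2 happened to be semantically close and topic 3 unrelated. The measure treats topics as unordered categories, which is the limitation the abstract points to.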

  • Research Article
  • Cite Count: 8
  • 10.2196/44356
Tweeting for Health Using Real-time Mining and Artificial Intelligence-Based Analytics: Design and Development of a Big Data Ecosystem for Detecting and Analyzing Misinformation on Twitter.
  • Jun 9, 2023
  • Journal of Medical Internet Research
  • Plinio Pelegrini Morita + 4 more

Digital misinformation, primarily on social media, has led to harmful and costly beliefs in the general population. Notably, these beliefs have resulted in public health crises to the detriment of governments worldwide and their citizens. However, public health officials need access to a comprehensive system capable of mining and analyzing large volumes of social media data in real time. This study aimed to design and develop a big data pipeline and ecosystem (UbiLab Misinformation Analysis System [U-MAS]) to identify and analyze false or misleading information disseminated via social media on a certain topic or set of related topics. U-MAS is a platform-independent ecosystem developed in Python that leverages the Twitter V2 application programming interface and the Elastic Stack. The U-MAS expert system has 5 major components: data extraction framework, latent Dirichlet allocation (LDA) topic model, sentiment analyzer, misinformation classification model, and Elastic Cloud deployment (indexing of data and visualizations). The data extraction framework queries the data through the Twitter V2 application programming interface, with queries identified by public health experts. The LDA topic model, sentiment analyzer, and misinformation classification model are independently trained using a small, expert-validated subset of the extracted data. These models are then incorporated into U-MAS to analyze and classify the remaining data. Finally, the analyzed data are loaded into an index in the Elastic Cloud deployment and can then be presented on dashboards with advanced visualizations and analytics pertinent to infodemiology and infoveillance analysis. U-MAS performed efficiently and accurately. Independent investigators have successfully used the system to extract significant insights into a fluoride-related health misinformation use case (2016 to 2021). 
The system is currently used for a vaccine hesitancy use case (2007 to 2022) and a heat wave-related illnesses use case (2011 to 2022). Each component in the system for the fluoride misinformation use case performed as expected. The data extraction framework handles large amounts of data within short periods. The LDA topic models achieved relatively high coherence values (0.54), and the predicted topics were accurate and befitting to the data. The sentiment analyzer performed at a correlation coefficient of 0.72 but could be improved in further iterations. The misinformation classifier attained a satisfactory correlation coefficient of 0.82 against expert-validated data. Moreover, the output dashboard and analytics hosted on the Elastic Cloud deployment are intuitive for researchers without a technical background and comprehensive in their visualization and analytics capabilities. In fact, the investigators of the fluoride misinformation use case have successfully used the system to extract interesting and important insights into public health, which have been published separately. The novel U-MAS pipeline has the potential to detect and analyze misleading information related to a particular topic or set of related topics.

  • Research Article
  • 10.2196/77424
Public Attention to Mpox in China During the Pandemic: Qualitative Analysis of TikTok Data Using Latent Dirichlet Allocation Topic Modeling
  • Aug 21, 2025
  • Journal of Medical Internet Research
  • Donghang Luo + 9 more

BACKGROUND Mpox has reemerged as a global public health concern. With the growing reliance on social media for health information dissemination, understanding public perception through these platforms is essential for designing effective health promotion strategies. OBJECTIVE This study analyzes TikTok data related to mpox using Latent Dirichlet Allocation (LDA) topic modeling. This paper aims to extract key topics and inform targeted health promotion strategies for mpox prevention and control. METHODS Using the “Aisou Jisou” system, we collected TikTok data containing the keyword “Mpox” from April 1, 2022, to March 31, 2025. The dataset comprised 25,672 text entries and associated search terms. We analyzed trends in the Search Index and Target Group Index (TGI) across time, gender, age groups, and provinces. LDA topic modeling was applied to identify latent topics within the text data, and topic evolution was examined during 4 peak months of the Search Index. RESULTS Four major Search Index peaks were identified on TikTok in China: May 2022, July 2023, August 2024, and February 2025. These peaks aligned with key global and national mpox events, including WHO’s declaration of a global mpox outbreak in May 2022 and the detection of clade Ib mpox in China in January 2025. TGI analysis revealed that users aged 18-23 years exhibited the highest engagement. Spatially, Beijing, Tianjin, and Jilin recorded the highest cumulative TGI values (5922.38, 5692.41, and 3579.90, respectively). LDA topic modeling identified 8 primary topics, including transmission and prevention, vaccine concerns, and misinformation. Public attention evolved from general disease knowledge toward issues of stigmatization and vaccine distrust over time. 
Sankey diagrams illustrated shifts in public attention across topics at different Search Index peaks, with “Mpox Transmission and Prevention” receiving the most attention in May 2022 and “Mpox Vaccination and Infection Prevention” in February 2025. CONCLUSIONS TikTok provides real-time insights into public attention during mpox outbreaks, but can also propagate misinformation and stigmatizing narratives. Public health authorities should leverage these platforms for timely communication, actively address misinformation, and mitigate social bias. Tailored strategies are needed to enhance health literacy, minimize stigma, and strengthen outbreak preparedness and response. This study highlights the dual role of social media as both an information source and a potential vector for misinformation, emphasizing the necessity for active monitoring and regulation by health authorities to ensure the accuracy and reliability of disseminated health information.

  • Conference Article
  • Cite Count: 5
  • 10.1109/iccst50977.2020.00094
Sentiment Analysis of Consumer-Generated Online Reviews of Physical Bookstores Using Hybrid LSTM-CNN and LDA Topic Model
  • Oct 1, 2020
  • Yan Wang + 2 more

Physical bookstores lead cultural trends, carry national reading, and provide public cultural services, embodying the cultural soft power of a city. The wide use of Internet e-commerce platforms and changes in people's reading habits have had a great impact on physical bookstores, resulting in poor overall profitability. To support the sustainable development of physical bookstores, we mine and analyze consumer-generated online reviews. In this paper, a sentiment analysis method based on Hybrid LSTM-CNN (Hybrid Long Short-Term Memory-Convolutional Neural Network) and the LDA (Latent Dirichlet Allocation) topic model is proposed. First, the Hybrid LSTM-CNN model is used to classify reviews, and then the LDA topic model is used to extract features of positive and negative reviews. The results show that the Hybrid LSTM-CNN model performs better than the classic LSTM and CNN in sentiment classification. The LDA model reveals that consumers have a positive attitude towards the products, context and ambiance of physical bookstores, and a negative attitude towards price and service. This method studies consumer-generated online reviews of physical bookstores from two aspects, sentiment classification and topic mining, which can help physical bookstore operators receive consumer feedback in time.

  • Research Article
  • Cite Count: 18
  • 10.1016/j.foodcont.2020.107435
A topic model approach to identify and track emerging risks from beeswax adulteration in the media
  • Jul 2, 2020
  • Food Control
  • Agnes Rortais + 8 more


  • Research Article
  • Cite Count: 1
  • 10.2139/ssrn.3708327
Trends in COVID-19 Publications: Streamlining Research Using NLP and LDA
  • Jan 1, 2020
  • SSRN Electronic Journal
  • Akash Gupta + 3 more

Research publications related to the novel coronavirus disease COVID-19 are rapidly growing in number. However, current online literature hubs, even with artificial intelligence, are inadequate for identifying the relative strength of research topics. Hence, we aimed to develop a comprehensive Latent Dirichlet Allocation (LDA) topic model using natural language processing (NLP) techniques, provide visualisations for temporal trends, and apply our methodology to improve existing online literature hubs. Using the search term “COVID”, abstracts were extracted from PubMed®, from January to July 2020 (N=16346). An LDA topic model was trained on 81% of abstracts. Weekly temporal trends were visualised as a heatmap on all abstracts. Then, we tested our methodology on over 23,000 abstracts gathered from January 2020 to September 2020 from LitCovid, a literature hub from the National Center for Biotechnology Information. We use our topic model to subdivide LitCovid’s eight categories into corresponding LDA topics. The optimised LDA topic model, created using PubMed® data, produced 25 comprehensive topics with no significant overlap. There were temporal changes for topics: prominence of “Mental Health” and “Socioeconomic Impact” increased, “Genome Sequence” decreased, and “Epidemiology” remained relatively constant. We identified inadequate representation of “Airborne Transmission Protection”. Importantly, research on masks and PPE is skewed towards clinical applications with a lack of population-based epidemiological research. Our methodology, when applied to LitCovid, identified important topics within each LitCovid category. For example, “Case Report” was split into topics such as “Pulmonary” and “Oncology” as well as the under-represented topics “Haematology” and “Gastroenterology”. Our work allows for comprehensive topic identification and intuitive visualisation of temporal trends in COVID-19 research. 
Implementation of the methodology complements existing online literature hubs and identifies underrepresented topics such as population-based studies on masks that may be of significant public interest. Funding Statement: None to declare. Declaration of Interests: There are no conflicts of interest.

  • Research Article
  • Cite Count: 12
  • 10.3390/make5020029
Evaluating the Coverage and Depth of Latent Dirichlet Allocation Topic Model in Comparison with Human Coding of Qualitative Data: The Case of Education Research
  • May 14, 2023
  • Machine Learning and Knowledge Extraction
  • Gaurav Nanda + 5 more

Fields in the social sciences, such as education research, have started to expand the use of computer-based research methods to supplement traditional research approaches. Natural language processing techniques, such as topic modeling, may support qualitative data analysis by providing early categories that researchers may interpret and refine. This study contributes to this body of research and answers the following research questions: (RQ1) What is the relative coverage of the latent Dirichlet allocation (LDA) topic model and human coding in terms of the breadth of the topics/themes extracted from the text collection? (RQ2) What is the relative depth or level of detail among identified topics using LDA topic models and human coding approaches? A dataset of student reflections was qualitatively analyzed using LDA topic modeling and human coding approaches, and the results were compared. The findings suggest that topic models can provide reliable coverage and depth of themes present in a textual collection comparable to human coding but require manual interpretation of topics. The breadth and depth of human coding output is heavily dependent on the expertise of coders and the size of the collection; these factors are better handled in the topic modeling approach.

  • Research Article
  • Cite Count: 13
  • 10.1093/jamiaopen/ooad112
Topic modeling on clinical social work notes for exploring social determinants of health factors.
  • Jan 4, 2024
  • JAMIA open
  • Shenghuan Sun + 4 more

Existing research on social determinants of health (SDoH) predominantly focuses on physician notes and structured data within electronic medical records. This study posits that social work notes are an untapped, potentially rich source for SDoH information. We hypothesize that clinical notes recorded by social workers, whose role is to ameliorate social and economic factors, might provide a complementary information source of data on SDoH compared to physician notes, which primarily concentrate on medical diagnoses and treatments. We aimed to use word frequency analysis and topic modeling to identify prevalent terms and robust topics of discussion within a large cohort of social work notes including both outpatient and in-patient consultations. We retrieved a diverse, deidentified corpus of 0.95 million clinical social work notes from 181 644 patients at the University of California, San Francisco. We conducted word frequency analysis related to ICD-10 chapters to identify prevalent terms within the notes. We then applied Latent Dirichlet Allocation (LDA) topic modeling analysis to characterize this corpus and identify potential topics of discussion, which was further stratified by note types and disease groups. Word frequency analysis primarily identified medical-related terms associated with specific ICD10 chapters, though it also detected some subtle SDoH terms. In contrast, the LDA topic modeling analysis extracted 11 topics explicitly related to social determinants of health risk factors, such as financial status, abuse history, social support, risk of death, and mental health. The topic modeling approach effectively demonstrated variations between different types of social work notes and across patients with different types of diseases or conditions. 
Our findings highlight LDA topic modeling's effectiveness in extracting SDoH-related themes and capturing variations in social work notes, demonstrating its potential for informing targeted interventions for at-risk populations. Social work notes offer a wealth of unique and valuable information on an individual's SDoH. These notes present consistent and meaningful topics of discussion that can be effectively analyzed and utilized to improve patient care and inform targeted interventions for at-risk populations.

  • Book Chapter
  • Cite Count: 1
  • 10.1007/978-3-030-24268-8_8
Short Text Topic Recognition and Optimization Method for University Online Community
  • Jan 1, 2019
  • Xu Wu + 4 more

The university online community mainly records what happens in target areas and groups of people. It is characterized by timeliness, a strong regional focus, and clear target groups. Compared with Weibo and post-bar, text topic recognition for a university community needs to solve the problems of heavy text noise, fast text updates and short individual texts. To this end, this paper proposes a method of building a university topic model based on the LDA topic model. Through the steps of noise reduction on the original text, LDA (Latent Dirichlet Allocation, a topic model commonly used in machine learning and often used for text categorization) model recognition, and weighted calculation of the recognition results, the event themes that characterize the university online community are obtained. Experiments based on real university online community data show that the topic model of popular university events established by the topic recognition model of this paper can reflect some popular events in colleges and universities, so as to provide reasonable support for university management.

  • Research Article
  • Cite Count: 3
  • 10.1111/jjns.12520
Latent Dirichlet allocation topic modeling of free-text responses exploring the negative impact of the early COVID-19 pandemic on research in nursing.
  • Nov 30, 2022
  • Japan Journal of Nursing Science
  • Madoka Inoue + 4 more

To derive latent topics from free-text responses on the negative impact of the pandemic on research activities and determine similarities and differences in the resulting themes between academic-based and clinical-based researchers. We performed a secondary analysis of free-text responses from a cross-sectional online survey conducted by the Japan Academy of Nursing Science of its members in early 2020. The participants were categorized into two groups by workplace (academic-based and clinical-based researchers). Latent Dirichlet allocation (LDA) topic modeling was used to extract latent topics statistically and list important keywords/text associated with the topics. After organizing similar topics by principal component analysis (PCA), we finally derived topic-associated themes by reading the keywords/texts and determining the similarity and differences of the themes between the two groups. A total of 201 respondents (163 academic-based and 38 clinical-based researchers) provided free-text responses. LDA identified eight and three latent topics for the academic-based and clinical-based researchers, respectively. While PCA re-grouped the eight topics derived from the former group into four themes, no merging of the topics from the latter group was performed resulting in three themes. The only theme common to the two groups was "barriers to conducting research," with the remaining themes differing between the groups. Using LDA topic modeling with PCA, we identified similarities and differences in the themes described in free-text responses about the negative impact of the pandemic between academic-based and clinical-based researchers. Measures to mitigate the negative impact of pandemics on nursing research may need to be tailored separately.

  • Dissertation
  • Cite Count: 1
  • 10.20381/ruor-5492
Content Management and Hashtag Recommendation in a P2P Social Networking Application
  • Jan 1, 2015
  • Keerthi Nelaturu

This thesis focuses on developing an online social network application with a Peer-to-Peer infrastructure motivated by the BestPeer++ architecture and the BATON overlay structure. BestPeer++ is a data processing platform which enables data sharing between enterprise systems. BATON is an open-source project which implements a peer-to-peer overlay with a balanced-tree topology. We designed and developed the components for users to manage their accounts, maintain friend relationships, and publish their content with privacy control, newsfeed, and notification requests in this social networking application. We also developed a Hashtag Recommendation system for this social networking application. A user may invoke a recommendation procedure while writing content. After being invoked, the recommendation procedure returns a list of candidate hashtags, and the user may select one hashtag from the list and embed it into the content. The proposed approach uses the Latent Dirichlet Allocation (LDA) topic model to derive the latent or hidden topics of different content. The LDA topic model is a well-developed data mining algorithm and is generally effective in analyzing text documents of different lengths. The topic model is further used to identify the candidate hashtags that are associated with the texts in the published content through their association with the derived hidden topics. We considered different recommendation methods for the procedure to select candidate hashtags from different content. Some methods consider the hashtags contained in the content of the whole social network or of the user themselves. These are content-based recommendation techniques which match the user’s own profile with the profiles of items. Some methods consider the hashtags contained in the content of friends or of similar users. These are collaborative filtering based recommendation

  • Research Article
  • 10.1177/21582440251390678
Identify Future Trending Topics by Thematic Mapping of the Cinema Phenomenon Using Machine Learning and LDA
  • Oct 1, 2025
  • Sage Open
  • Türker Elitaş + 2 more

This study aims to evaluate 16,891 academic publications in the field of cinema between 1980 and 2024 using bibliometric analysis and topic modeling methods. Based on data obtained from the Web of Science (WOS) and Scopus databases, bibliometric findings were derived, including the distribution of publications by year, the annual number and rate of citations per article, the most productive authors in the field, the production status of authors over time, the countries of authors and the number of articles they published, and the journals with the highest number of publications. The same data were also used to identify prominent word groups and themes in the articles using text mining and Latent Dirichlet Allocation (LDA) topic modeling. As a result of the analysis, 12 main themes emerged based on word-text relationships and the weight of publications. The findings show that cinema studies have developed with increasing momentum over the years and that there has been a growing focus on certain topics. This study systematically examines the development of cinema studies literature through descriptive content analysis and LDA topic modeling. In this respect, it is important in that it systematically reveals the structural and thematic transformation of academic production in the field of cinema and provides a theoretical and methodological basis for future research. It also makes a current and multidimensional contribution to the discipline in terms of revealing the increasingly important digital trends, cultural representations, and interdisciplinary developments in cinema studies.
