Mumps is a viral respiratory disease characterized by facial swelling and transmitted through respiratory secretions. Despite the availability of an effective vaccine, mumps outbreaks have reemerged globally, including in China, where the disease remains a significant public health issue. In Yunnan Province, China, the incidence of mumps has fluctuated markedly and is higher than the overall incidence in mainland China, underscoring the need for improved outbreak prediction methods. Traditional surveillance methods, however, may not be sufficient for timely and accurate outbreak prediction.

Our study aims to leverage the Baidu search index, which represents search volumes from China's most popular search engine, along with environmental data to develop a predictive model for mumps incidence in Yunnan Province.

We analyzed mumps incidence in Yunnan Province from 2014 to 2023 and used time series data from 2016 to 2023, including mumps incidence, the Baidu search index, and environmental factors, to develop predictive models based on long short-term memory networks. Feature selection was conducted using Pearson correlation analysis, and lag correlations were explored through a distributed nonlinear lag model (DNLM). We constructed 4 models with different combinations of predictors: (1) model BE, combining Baidu index and environmental factor data; (2) model IB, combining mumps incidence and Baidu index data; (3) model IE, combining mumps incidence and environmental factor data; and (4) model IBE, integrating all 3 data sources.

The incidence of mumps in Yunnan varied markedly, peaking at 37.5 per 100,000 population in 2019. From 2014 to 2023, the proportion of female patients ranged from 41.3% in 2015 to 45.7% in 2020, consistently lower than that of male patients.
After excluding variables with a Pearson correlation coefficient of <0.10 or a P value of ≥.05, we included 3 Baidu index search term groups (disease name, symptoms, and treatment) and 6 environmental factors (maximum temperature, minimum temperature, sulfur dioxide, carbon monoxide, particulate matter with a diameter of 2.5 µm or less, and particulate matter with a diameter of 10 µm or less) for model development. DNLM analysis revealed that the relative risk consistently increased with rising Baidu index values, while nonlinear associations between temperature and mumps incidence were observed. Among the 4 models, model IBE performed best on the test set, achieving a coefficient of determination of 0.72, with mean absolute error, mean absolute percentage error, and root-mean-square error values of 0.33, 15.9%, and 0.43, respectively.

Our study developed model IBE to predict the incidence of mumps in Yunnan Province, offering a potential tool for early detection of mumps outbreaks. Its performance underscores the potential of integrating search engine data and environmental factors to enhance mumps incidence forecasting and, in turn, to improve public health surveillance and enable rapid responses to outbreaks.
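For readers who want to see how the 4 reported performance measures relate to one another, the following is a minimal sketch of their standard definitions in plain Python. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration, and the study's own implementation may differ (for example, in how it handles months with zero incidence).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MAPE (in percent), and RMSE for paired values.

    Assumes y_true has no zeros (MAPE is undefined there) and is not
    constant (R^2 is undefined when the total sum of squares is zero).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical incidence values (per 100,000), for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Mean absolute error and root-mean-square error are expressed in the same units as the incidence itself, whereas mean absolute percentage error is scale-free, which is why the abstract reports it as a percentage.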