This study investigated risk factors for atrial fibrillation (AF) in the Chongming District of Shanghai, using data from 678 patients treated at a tertiary hospital in the district between 2020 and 2023. The data covered season, C-reactive protein, hypertension, platelet count, and other relevant indicators. The researchers proposed a dual feature-selection method that combines hierarchical clustering with Fisher scores (HC-MFS) and benchmarked it against four established methods. Five classification models were trained on the training set; the best-performing model was then used to evaluate each feature-selection method, and the results were validated with test-set scores. With HC-MFS, the classification model achieved the highest accuracy (0.9118) and the lowest root mean square error (0.2970) among the methods compared. The authors attribute this improvement to the combination and interaction of the two methods, which improves the quality of the selected feature subset. Seasonal change was strongly associated with AF and ranked first in correlation (pr = 0.31, FS = 0.11, DCFS = 0.33). LDL cholesterol, total cholesterol, C-reactive protein, and platelet count, which are associated with inflammatory response and coronary heart disease, were also identified as risk factors that contribute indirectly to AF. The study concludes that machine-learning models can significantly aid clinicians in identifying individuals predisposed to AF, and that AF in the Chongming District correlates strongly with both pathological and climatic factors, especially seasonal variation.
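To make the Fisher-score component of the feature selection concrete, the following is a minimal sketch of the standard Fisher score (between-class scatter divided by within-class scatter, computed per feature). It is not the authors' HC-MFS implementation: the abstract does not say how the hierarchical clustering step is combined with the scores, so that part is omitted. The function name `fisher_scores` and the synthetic two-feature dataset are assumptions made purely for illustration and are not taken from the study.

```python
import numpy as np


def fisher_scores(X, y):
    """Standard per-feature Fisher score (hypothetical helper, not from the paper).

    score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # Guard against division by zero for features that are constant within classes.
    return between / np.where(within > 0, within, np.finfo(float).eps)


# Synthetic data (assumption): feature 0 tracks the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
informative = y + rng.normal(scale=0.3, size=100)
noise = rng.normal(size=100)
X = np.column_stack([informative, noise])

scores = fisher_scores(X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, most discriminative first
print(scores, ranking)
```

In this toy setup the label-tracking feature should receive the higher score and appear first in `ranking`; a real analysis would apply the same scoring to the clinical and seasonal indicators before any clustering-based refinement.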