Abstract

Objective: We aimed to construct a random forest model to predict atrial fibrillation (AF) in Chinese population. Design and method: This study was comprised of 682 237 subjects with or without AF. Each subject had 19 features that included the subjects’ age, gender, underlying diseases, CHA2DS2-VASc score, and follow-up period. The data were split into train and test sets at an approximate 9:1 ratio: 614 013 data points were placed into the train set and 68 224 data points were placed into the test set. In this study, weighted average F1, precision, and recall values were used to measure prediction model performance. The F1, precision, and recall values were calculated across the train set, the test set, and all data. Results: The area under receiving operating characteristic (ROC) curve was also used to evaluate the performance of the prediction model. The prediction model achieved a k-fold cross-validation accuracy of 0.979 (k = 10). In the test set, the prediction model achieved an F1 value of 0.968, precision value of 0.958, and recall value of 0.979. The area under ROC curve of the model was 0.948 (95% confidence interval 0.947–0.949). This model was validated with a separate dataset. Conclusions: This study showed a novel AF risk prediction scheme for Chinese individuals with random forest model methodology.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call