Abstract

In this paper, randomly measured individual sample data are first preprocessed: outliers are removed and the sample features are normalized to the range [0, 1]. Correlation analysis is then used to rank the relevance of the stroke-related features, and weakly correlated factors are discarded. The samples are randomly split into a 70% training set and a 30% testing set. Finally, a random forest model and the XGBoost algorithm, each combined with cross-validation and grid search, are trained on the stroke features. On the testing set, XGBoost achieves an accuracy of 0.9257, higher than the 0.8991 obtained by the random forest model. The XGBoost model is therefore selected to predict stroke for ten people; it predicts that two of them have a stroke and eight do not.
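The preprocessing steps named in the abstract (normalization to [0, 1], correlation-based feature ranking, and a random 70/30 split) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which correlation measure or cutoff was used, so Pearson correlation and the `min_abs_corr` threshold are assumptions, and all function names are hypothetical. Outlier removal and model training are left out because the abstract gives no rule or hyperparameters for them.

```python
import math
import random


def min_max_normalize(column):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant sequence; treat it as 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, labels, min_abs_corr=0.1):
    """Rank features by |correlation| with the label and drop weak ones.

    `features` maps a feature name to its column of values.
    `min_abs_corr` is a hypothetical cutoff, not a value from the paper.
    """
    scores = {name: abs(pearson(col, labels)) for name, col in features.items()}
    kept = [(name, score) for name, score in scores.items() if score >= min_abs_corr]
    return sorted(kept, key=lambda item: item[1], reverse=True)


def train_test_split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]
```

For instance, `rank_features({"age": [1, 2, 3, 4], "flat": [1, 1, 1, 1]}, [0, 0, 1, 1])` keeps only `age`, since the constant column has zero correlation with the label, and `train_test_split_indices(10)` yields seven training indices and three testing indices. The fixed `seed` is there only to make the split reproducible.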
