Abstract

This paper classifies 19 human actions using a dataset of 1.2 million human action records acquired from sensors. An XGBoost classification model is built, and the MIV algorithm is used as an index of each variable's importance to the dependent variable. Features are ranked by the absolute value of their MIV, and the top 10 groups of features are selected to form a reduced dataset. The model is then fine-tuned on the reduced dataset by grid search, with repeated testing to maximize ROC AUC and obtain the optimal model, which achieves a recall of 1 and a precision, F1-score, and AUC of 0.99. To check that the model generalizes well given the limited dataset, a feasible method for evaluating its generalization ability is also designed. The SMOTE-Tomek integrated sampling method computes the k nearest neighbors of each minority-class sample and selects the samples whose neighbor similarity coefficients meet the requirements. New samples are then generated randomly by linear interpolation between a sample and its neighbors. Samples similar to the original training set are generated in this way and fed to the model trained on the original data for prediction and evaluation. On the generated samples, the model attains a precision of 0.98, a recall of 0.99, an F1-score of 0.98, and a ROC AUC of 0.98, which indicates that the model has good generalization ability.
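
The MIV-based feature ranking described above can be illustrated with a minimal sketch. It assumes the common reading of MIV as "mean impact value": each feature is scaled up and down by a fixed fraction, and the mean difference between the two sets of model outputs is taken as that feature's MIV. The 10% perturbation, the function names, and the toy linear scorer standing in for the trained XGBoost model are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Callable, List, Sequence

Row = List[float]


def mean_impact_values(
    predict: Callable[[Sequence[Row]], List[float]],
    rows: Sequence[Row],
    delta: float = 0.10,  # assumed perturbation size, not from the paper
) -> List[float]:
    """Return one MIV per feature.

    For feature j, every row is copied twice: once with feature j scaled
    by (1 + delta) and once by (1 - delta). The MIV is the mean of the
    differences between the model outputs on the two perturbed copies.
    """
    n_features = len(rows[0])
    mivs = []
    for j in range(n_features):
        up = [r[:j] + [r[j] * (1 + delta)] + r[j + 1:] for r in rows]
        down = [r[:j] + [r[j] * (1 - delta)] + r[j + 1:] for r in rows]
        diffs = [u - d for u, d in zip(predict(up), predict(down))]
        mivs.append(sum(diffs) / len(diffs))
    return mivs


def top_k_by_abs_miv(mivs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k features with the largest |MIV|."""
    order = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
    return order[:k]


def toy_predict(rows: Sequence[Row]) -> List[float]:
    """Hypothetical stand-in for a trained model's scoring function."""
    return [3.0 * r[0] - 0.5 * r[1] + 0.0 * r[2] for r in rows]


rows = [[1.0, 2.0, 3.0], [2.0, 1.0, 4.0], [3.0, 3.0, 5.0]]
mivs = mean_impact_values(toy_predict, rows)
selected = top_k_by_abs_miv(mivs, k=2)
print(mivs, selected)
```

In this toy setting the third feature has no effect on the output, so it ranks last and is dropped when two features are kept. With a real classifier, `predict` would typically return a class probability or decision score for each row.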
