In tennis, match outcomes and turning points are influenced by many factors. To explore the factors that affect match fluctuations (changes in the flow of scoring) and to provide suggestions for athletes' match strategies, this paper first identifies general indicators through a literature review and uses logistic regression to determine the effectiveness of the chosen model. Second, it employs Fourier function fitting to identify turning points in the match. Because turning points are scarce, this paper uses the SMOTE oversampling method to expand the dataset. It then tests a random forest classification model, which achieves an accuracy of 93.433%. To improve the model's accuracy, several indicators were added to the original model, raising the accuracy to 98.51%. Finally, to verify the model's results and applicability, the model was applied to other matches, with good results. A sensitivity analysis was conducted and showed good model stability. The model results indicate that the main factors affecting the appearance of turning points are the players' movement distance during the match, whether the depth and width of the return change, the score difference, and the maximum number of consecutive wins. When the model was tested on other types of matches, we found that the importance of these factors may change to some extent, but the results remain satisfactory.
KEYWORDS: Volatility prediction, Random Forest, Logistic regression, Sensitivity analysis.
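As an illustration of the Fourier-fitting step, the following is a minimal sketch and not the paper's implementation. It assumes a uniformly sampled per-point performance series, fits a truncated Fourier series by least squares, and treats a turning point as an index where the fitted curve switches between rising and falling. The function names, the number of harmonics, and this definition of a turning point are all assumptions made for the example.

```python
import math


def fourier_fit(y, n_harmonics):
    """Least-squares truncated Fourier series for uniformly sampled y.

    With uniform samples over one full period and n_harmonics < len(y)/2,
    the discrete sines and cosines are orthogonal, so the least-squares
    coefficients reduce to simple projections.
    """
    n = len(y)
    if not 0 <= n_harmonics < n / 2:
        raise ValueError("need 0 <= n_harmonics < len(y) / 2")
    a0 = sum(y) / n
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2.0 / n * sum(
            v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(y)
        )
        b_k = 2.0 / n * sum(
            v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(y)
        )
        coeffs.append((a_k, b_k))
    return a0, coeffs


def evaluate(a0, coeffs, n):
    """Evaluate the fitted series at the n original sample positions."""
    fitted = []
    for i in range(n):
        value = a0
        for k, (a_k, b_k) in enumerate(coeffs, start=1):
            angle = 2 * math.pi * k * i / n
            value += a_k * math.cos(angle) + b_k * math.sin(angle)
        fitted.append(value)
    return fitted


def turning_points(fitted):
    """Indices where the fitted curve changes from rising to falling or back.

    Assumed definition for this sketch: a sign change in consecutive
    differences of the fitted values.
    """
    points = []
    for i in range(1, len(fitted) - 1):
        left = fitted[i] - fitted[i - 1]
        right = fitted[i + 1] - fitted[i]
        if left * right < 0:
            points.append(i)
    return points


if __name__ == "__main__":
    # Hypothetical series (not data from the paper): a slow swing plus
    # a faster, smaller oscillation standing in for point-level noise.
    series = [
        math.sin(2 * math.pi * i / 40) + 0.1 * math.cos(2 * math.pi * 7 * i / 40)
        for i in range(40)
    ]
    a0, coeffs = fourier_fit(series, n_harmonics=2)
    fitted = evaluate(a0, coeffs, len(series))
    print(turning_points(fitted))
```

Keeping only a few low harmonics smooths out point-level noise, so the detected sign changes reflect broader swings in the series. The indices found this way could then serve as the positive labels for a classifier, which is where the class imbalance mentioned in the abstract would arise.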