Advances in science and technology have led to vast amounts of sports data being generated across various media, and applying statistical methods to extract meaningful information from this data has become essential. This study aimed to predict volleyball match outcomes by running Monte Carlo simulations based on live text commentary data from the 2023-2024 women's professional volleyball season. The variables required for the simulation were derived from the text commentary data, and after 10,000 Monte Carlo iterations the study confirmed that the simulated results accurately mirrored actual match outcomes. In particular, set score predictions were notably accurate when one team performed overwhelmingly better than the other, and the simulation results showed potential for developing customized strategies. However, the simulation had limitations in predicting outcomes in specific scenarios, such as when a team with the higher total score loses 3-2. To address these limitations, future work should model the complex interactions among situational factors and variables more precisely, converting qualitative factors that have been difficult to quantify into objective indicators that can be integrated into the model.
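As a rough illustration of the kind of simulation described above, the following Python sketch runs a Monte Carlo simulation of a best-of-five volleyball match. It is not the study's model: it assumes a single, constant probability `p_a` that team A wins any given rally, whereas the study derives its variables from text commentary data. The function names (`simulate_set`, `simulate_match`, `run_simulation`) and the parameter `p_a` are hypothetical. The only rules encoded are standard rally scoring: sets to 25 points, a deciding fifth set to 15, and a two-point margin required to win a set.

```python
import random
from collections import Counter


def simulate_set(p_a, target, rng):
    """Play one set rally by rally; return (points_a, points_b)."""
    a = b = 0
    while True:
        # Assumption: every rally is won by team A with the same probability p_a.
        if rng.random() < p_a:
            a += 1
        else:
            b += 1
        # A set ends once a team reaches the target with a two-point lead.
        if max(a, b) >= target and abs(a - b) >= 2:
            return a, b


def simulate_match(p_a, rng):
    """Play a best-of-five match; return (sets_a, sets_b, points_a, points_b)."""
    sets_a = sets_b = 0
    points_a = points_b = 0
    while sets_a < 3 and sets_b < 3:
        # The deciding fifth set is played to 15, all other sets to 25.
        target = 15 if sets_a + sets_b == 4 else 25
        a, b = simulate_set(p_a, target, rng)
        points_a += a
        points_b += b
        if a > b:
            sets_a += 1
        else:
            sets_b += 1
    return sets_a, sets_b, points_a, points_b


def run_simulation(p_a, n_iter=10_000, seed=0):
    """Repeat the match n_iter times.

    Returns a Counter of set scores and the number of matches in which
    the winner scored fewer total points than the loser.
    """
    rng = random.Random(seed)
    set_scores = Counter()
    fewer_points_wins = 0
    for _ in range(n_iter):
        sets_a, sets_b, points_a, points_b = simulate_match(p_a, rng)
        set_scores[(sets_a, sets_b)] += 1
        if sets_a > sets_b:
            winner_points, loser_points = points_a, points_b
        else:
            winner_points, loser_points = points_b, points_a
        if winner_points < loser_points:
            fewer_points_wins += 1
    return set_scores, fewer_points_wins
```

Calling `run_simulation(0.52)` returns the frequency of each set score over 10,000 simulated matches, together with a count of matches in which the team with the higher total score lost, which is the scenario the abstract identifies as hard to predict. A constant rally probability is the simplest possible choice; a model closer to the study would presumably condition that probability on situational variables.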