Abstract
This study develops a machine learning-based system for predicting English Premier League (EPL) match outcomes, using Principal Component Analysis (PCA), K-Nearest Neighbors (KNN), Random Forests, and Support Vector Machines (SVM). The analysis covered a large dataset of matches, with the data normalized to ensure consistency across models. Of the methods used, Random Forests performed most robustly, particularly in forecasting wins and losses. However, both Random Forests and SVM had difficulty predicting draws, which identifies an area for further refinement. The prediction probabilities largely fell within a specific range, indicating that the models identified patterns in the data, but significant overfitting was observed: the models performed well on the training data yet struggled to generalize to new, unseen data. This highlights the need for more effective regularization to improve the models' predictive accuracy in real-world scenarios.
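The overfitting described above is usually detected by comparing accuracy on the training data with accuracy on held-out data. The following is a minimal sketch of that check, not the study's implementation: it uses a pure-Python KNN classifier with min-max normalization on synthetic matches. The feature names (`strength_gap`, `form_gap`), the outcome rule, the sample sizes, and the choice of k = 1 are all assumptions made for illustration, since the abstract does not specify the features, dataset, or hyperparameters.

```python
import math
import random


def make_matches(n, rng):
    """Synthetic stand-in for match data (hypothetical features, not EPL data)."""
    rows = []
    for _ in range(n):
        strength_gap = rng.gauss(0.0, 1.0)  # assumed feature: team strength difference
        form_gap = rng.gauss(0.0, 5.0)      # assumed feature on a larger raw scale
        score = strength_gap + rng.gauss(0.0, 1.0)  # outcome depends on strength plus noise
        label = "H" if score > 0.5 else "A" if score < -0.5 else "D"
        rows.append(([strength_gap, form_gap], label))
    return rows


def fit_min_max(rows):
    """Learn per-feature (min, max) from the training rows only."""
    columns = list(zip(*(x for x, _ in rows)))
    return [(min(c), max(c)) for c in columns]


def scale(x, bounds):
    """Min-max normalize one feature vector using the training bounds."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(x, bounds)]


def knn_predict(train, x, k):
    """Majority vote among the k training rows nearest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def accuracy(train, rows, k):
    """Fraction of rows whose predicted outcome matches the true outcome."""
    hits = sum(knn_predict(train, x, k) == y for x, y in rows)
    return hits / len(rows)


rng = random.Random(0)
train_raw = make_matches(300, rng)
test_raw = make_matches(150, rng)

bounds = fit_min_max(train_raw)
train = [(scale(x, bounds), y) for x, y in train_raw]
test = [(scale(x, bounds), y) for x, y in test_raw]

# With k = 1 every training row is its own nearest neighbor, so the model
# memorizes the training set; the held-out score exposes the gap.
train_acc = accuracy(train, train, k=1)
test_acc = accuracy(train, test, k=1)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

A large gap between the two scores is the symptom the abstract reports. In this sketch, raising k is the simplest form of regularization for KNN; the analogous controls for Random Forests and SVM (for example tree depth or the penalty parameter) would need to be tuned on held-out data in the same way.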