Abstract
Predicting the outcome of future soccer matches has challenged data scientists for years. Researchers have used a variety of features to represent team performance and player skill, and these features are then fed to machine learning algorithms to build predictive models. This paper presents a new hybrid approach that combines machine learning and statistical models to predict future match outcomes. The paper analyzes the hidden patterns within a training dataset containing 205,182 soccer match outcomes, played between the 2000/2001 and 2016/2017 seasons. Using feature engineering techniques, the paper explores statistics for individual leagues and teams, discusses the impact of playing at home or away on winning the match, compares the effectiveness of using only recent match outcomes versus all matches in the training set, and evaluates prediction accuracy when creating a separate model for each league versus a single model for all leagues. Two different hybrid models are presented. Our best model achieved a prediction accuracy of 46.6% on the test set, with a ranked probability score of 0.2176.
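The ranked probability score (RPS) reported above rewards forecasts that place probability close to the observed result on an ordered scale, and lower values are better. The following is a minimal sketch of the score for a single match, assuming the standard definition over ordered outcomes and the ordering home win, draw, away win. The function name and the example probabilities are illustrative assumptions, not taken from the paper, and a reported figure such as 0.2176 would typically be an average over all test matches.

```python
from itertools import accumulate


def ranked_probability_score(probs, outcome_index):
    """RPS for one match.

    probs: predicted probabilities over ordered outcomes,
           assumed here to be [home win, draw, away win].
    outcome_index: index of the outcome that actually occurred.
    """
    r = len(probs)
    # One-hot vector for the observed outcome.
    observed = [1.0 if i == outcome_index else 0.0 for i in range(r)]
    # Cumulative distributions of the forecast and the observation.
    cum_pred = list(accumulate(probs))
    cum_obs = list(accumulate(observed))
    # Sum squared differences over the first r - 1 cumulative terms
    # (the last term is always zero) and normalise by r - 1.
    total = sum((p - o) ** 2 for p, o in zip(cum_pred[:-1], cum_obs[:-1]))
    return total / (r - 1)


# Hypothetical forecast: 50% home win, 30% draw, 20% away win.
forecast = [0.5, 0.3, 0.2]
rps_home = ranked_probability_score(forecast, 0)  # home team won
rps_away = ranked_probability_score(forecast, 2)  # away team won
```

Because the score works on cumulative probabilities, the same forecast is penalised more when the away team wins than when the home team wins, since the predicted mass sits further from the observed outcome.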