On Predicting Soccer Outcomes in the Greek League Using Machine Learning

Marios-Christos Malamatinos, George A Papakostas, Eleni Vrochidou

doi:10.3390/computers11090133

Open Access

https://doi.org/10.3390/computers11090133

Journal: Computers
Publication Date: Aug 31, 2022
Citations: 8
License type: CC BY 4.0

Affiliation: International Hellenic University

Abstract

The global expansion of the sports betting industry has brought the prediction of sports event outcomes to the forefront of scientific research. This work evaluates soccer outcome prediction methods, focusing on the Greek Super League. The data analysis comprises data cleaning, Sequential Forward Selection (SFS), feature engineering and data augmentation. The most important features are used to train five machine learning models: k-Nearest Neighbor (k-NN), LogitBoost (LB), Support Vector Machine (SVM), Random Forest (RF) and CatBoost (CB). For comparison, the best model is also tested on the English Premier League and the Dutch Eredivisie, using statistics from the six seasons from 2014 to 2020. Convolutional neural networks (CNNs) with transfer learning are also tested by encoding the tabular data as images; four architectures (DenseNet201, InceptionV3, MobileNetV2 and ResNet101V2) are evaluated using 10-fold cross-validation after grid and randomized hyperparameter tuning. This is the first in-depth investigation of the Greek Super League, identifying important features and comparing the performance of several machine and deep learning models, as well as results across leagues. In all cases, the experimental results show that CB is the most accurate prediction model, reaching 67.73% accuracy, and that the Greek Super League is the most predictable of the three leagues.
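
To make the feature-selection step more concrete, the following is a minimal sketch of Sequential Forward Selection scored by 10-fold cross-validated k-NN accuracy. It is not the authors' implementation: it uses only the Python standard library, a synthetic three-class data set (home win, draw, away win) generated here purely for illustration, and interleaved rather than shuffled folds. All function and variable names are hypothetical.

```python
import random

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(sorted(votes), key=votes.get)

def cv_accuracy(X, y, features, n_folds=10, k=3):
    """Mean k-NN accuracy over n_folds folds, using only the given feature columns."""
    Xs = [[row[j] for j in features] for row in X]
    fold_scores = []
    for fold in range(n_folds):
        # Interleaved folds: sample i belongs to fold i % n_folds (assumption for brevity).
        test_idx = [i for i in range(len(Xs)) if i % n_folds == fold]
        train_idx = [i for i in range(len(Xs)) if i % n_folds != fold]
        train_X = [Xs[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, Xs[i], k) == y[i] for i in test_idx)
        fold_scores.append(hits / len(test_idx))
    return sum(fold_scores) / n_folds

def sequential_forward_selection(X, y, n_select, n_folds=10, k=3):
    """Greedily add the feature that yields the best cross-validated accuracy."""
    selected, remaining = [], list(range(len(X[0])))
    while len(selected) < n_select:
        best = max(remaining, key=lambda j: cv_accuracy(X, y, selected + [j], n_folds, k))
        selected.append(best)
        remaining.remove(best)
    return selected

# Synthetic data (illustrative only): column 0 is strongly informative,
# column 2 is weakly informative, columns 1 and 3 are pure noise.
rng = random.Random(0)
labels = ["H", "D", "A"]  # home win, draw, away win
X, y = [], []
for i in range(150):
    c = i % 3
    X.append([
        c + rng.gauss(0, 0.1),
        rng.uniform(0, 3),
        0.5 * c + rng.gauss(0, 0.3),
        rng.uniform(0, 3),
    ])
    y.append(labels[c])

selected = sequential_forward_selection(X, y, n_select=2)
print("Selected feature indices:", selected)
print("10-fold accuracy with selected features:", cv_accuracy(X, y, selected))
```

In this toy setting the strongly informative column is picked first because it alone separates the three classes. On real match data the selection order would depend on the features, the classifier and the fold assignment.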

Full Text