Modeling outcomes of soccer matches

Alkeos Tsokos, Santhosh Narayanan, Franz Király, Gavin Whitaker, Ioannis Kosmidis, Mihai Cucuringu, Gianluca Baio

doi:10.1007/s10994-018-5741-1

Open Access

https://doi.org/10.1007/s10994-018-5741-1

Abstract

We compare various extensions of the Bradley–Terry model and a hierarchical Poisson log-linear model in terms of their performance in predicting the outcome of soccer matches (win, draw, or loss). The parameters of the Bradley–Terry extensions are estimated by maximizing the log-likelihood, or an appropriately penalized version of it, while the posterior densities of the parameters of the hierarchical Poisson log-linear model are approximated using integrated nested Laplace approximations. The prediction performance of the various modeling approaches is assessed using a novel, context-specific framework for temporal validation that is found to deliver accurate estimates of the test error. The direct modeling of outcomes via the various Bradley–Terry extensions and the modeling of match scores using the hierarchical Poisson log-linear model demonstrate similar behavior in terms of predictive performance.
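For orientation, the basic Bradley–Terry model that these extensions build on can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the example strength values are hypothetical, and the basic model shown here has no draw outcome and no features, both of which the extensions in the paper address.

```python
import math


def bt_win_probability(strength_i: float, strength_j: float) -> float:
    """Probability that team i beats team j under the basic Bradley-Terry model.

    Strengths are on the log scale, so the probability is a logistic
    function of the strength difference. (Hypothetical helper for
    illustration; draws are not modeled here.)
    """
    return 1.0 / (1.0 + math.exp(-(strength_i - strength_j)))


# Illustrative, made-up strengths: a stronger team i against a weaker team j.
p_i_beats_j = bt_win_probability(0.8, 0.2)
p_j_beats_i = bt_win_probability(0.2, 0.8)
```

In this two-outcome form the two probabilities sum to one, which is why a separate mechanism is needed to handle draws.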

Highlights

The current paper stems from our participation in the 2017 Machine Learning Journal (Springer) challenge on predicting outcomes of soccer matches from a range of leagues around the world (MLS challenge, in short).
The suffix (t) indicates features with coefficients varying with matches played. The model indicated by † is the one we used to compute the probabilities for the submission to the MLS challenge. The acronyms are as follows: BL, baseline; CS, Bradley–Terry with constant strengths; LF, Bradley–Terry with linear features; TVC, Bradley–Terry with time-varying coefficients; AFD, Bradley–Terry with additive feature differences and time interactions; HPL, hierarchical Poisson log-linear model.
The sets of features that were used in the LF, TVC, AFD and HPL specifications in Table 4 resulted from ad-hoc experimentation with different combinations of features in the LF specification.

Summary

Introduction

The current paper stems from our participation in the 2017 Machine Learning Journal (Springer) challenge on predicting outcomes of soccer matches from a range of leagues around the world (MLS challenge, in short). The performance of the various modeling approaches in predicting the outcomes of matches is assessed using a novel, context-specific framework for temporal validation that is found to deliver accurate estimates of the prediction error. The direct modeling of the outcomes using the various Bradley–Terry extensions and the modeling of match scores using the hierarchical Poisson log-linear model deliver similar performance in terms of predicting the outcome.
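One common way to score three-outcome probabilistic predictions of this kind is the ranked probability score, which appears among the section headings below. The following is a minimal sketch using the standard definition of that score; the function name, the outcome ordering (home win, draw, away win) and the example probabilities are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def ranked_probability_score(probs: Sequence[float], outcome_index: int) -> float:
    """Ranked probability score for one match (lower is better).

    probs: predicted probabilities over ordered outcomes, assumed here
           to be (home win, draw, away win).
    outcome_index: index of the outcome that actually occurred.
    """
    r = len(probs)
    cum_pred = 0.0  # cumulative predicted probability
    cum_obs = 0.0   # cumulative observed indicator (0 or 1)
    total = 0.0
    # Sum squared differences of the cumulative distributions over the
    # first r - 1 outcomes; the last cumulative term is always zero.
    for k in range(r - 1):
        cum_pred += probs[k]
        cum_obs += 1.0 if k == outcome_index else 0.0
        total += (cum_pred - cum_obs) ** 2
    return total / (r - 1)


# Made-up prediction for a match that ended in a draw (index 1).
score = ranked_probability_score([0.5, 0.3, 0.2], 1)
```

Because the score compares cumulative distributions, a prediction that puts mass on an outcome adjacent to the observed one is penalized less than one that puts the same mass on a distant outcome.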

Data exploration

Feature extraction

Bradley–Terry models and extensions

BL: baseline

CS: constant strengths

LF: linear with features

TVC: time-varying coefficients

AFD: additive feature differences with time interactions

Handling draws

Identifiability

Other data-specific considerations

Model structure

Estimation

MLS challenge

Ranked probability score

Classification accuracy

Meta-analysis

Implementation

Results

Conclusions and discussion

Journal: Machine Learning	Publication Date: Aug 1, 2018
Citations: 20	License type: open-access
