Abstract

Forecasting attendance demand for sports events is a common topic of study in the sports economics literature and has traditionally been addressed through multivariate regression analysis or structural equation modeling. In recent years, a small number of authors have approached the problem using machine learning methods, with promising results. In this article, we investigate the use of analytical techniques from the machine learning toolbox, namely symbolic regression and genetic programming (SR/GP), to determine the best-fitting prediction function relating contextual and panel independent variables to soccer match attendance. For that purpose, we analyze five years of attendance at soccer matches played at a large stadium in Southern Brazil. Two datasets with game-level attendance at matches from two soccer championships are considered, covering the seasons from 2014 to 2019. We also propose the use of expert panels to collect information on relevant candidate independent variables and their interactions to be tested in the prediction models, expediting the feature selection step of the modeling process. From the academic perspective, our study is the first to propose the use of SR/GP to model soccer match attendance, contributing to the limited number of studies that use game-by-game attendance as the dependent variable and develop team-specific attendance models. From the managerial perspective, identifying the factors responsible for systematic variations in match attendance levels enables better sport management and marketing plans.

