ABSTRACT Interest in sports performance analysis is rising, and tracking data holds high potential for game analysis in team sports because of its accuracy and information content. Combined with machine learning approaches, it can provide deeper and more objective insights into the structure of performance. In soccer, the analysis of defensive play has been neglected in comparison with offensive play. The aim of this study is therefore to predict ball gains in defense from tracking data and to identify tactical variables that drive defensive success. We evaluated tracking data from 153 games of the 2020/21 German Bundesliga season. From these data, we derived player metrics (defensive pressure, distance to the ball, and velocity) and team metrics (inter-line distances, numerical superiority, surface area, and spread), each representing a tactical idea. We then trained supervised machine learning classifiers (logistic regression, XGBoost, and Random Forest Classifier) to predict successful (ball gain) vs. unsuccessful (no ball gain) defensive plays. The expert-reduction model (a Random Forest Classifier with 16 features) showed the best prediction performance, which we consider satisfactory (F1-score (test) = 0.57). By analyzing the most important input features of this model, we identify tactical principles of defensive play that appear to be related to gaining the ball: pressing the ball-leading player, creating numerical superiority in areas close to the ball (pressing short pass options), and keeping the defending team compactly organized. These principles can give practitioners valuable insights into the tactical behavior of soccer players that may be related to the success of defensive play.
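As an illustration of the team metrics named above, the following is a minimal sketch of how surface area, spread, and numerical superiority could be computed from one frame of tracking data. The definitions are assumptions chosen for illustration and are not taken from the study: surface area as the area of the convex hull of the player positions, spread as the mean distance of the players to their centroid, and numerical superiority as the difference in player counts within a fixed radius around the ball. All function names and the 10 m default radius are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of the cross product (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def surface_area(points):
    """Assumed definition: area of the convex hull (shoelace formula)."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def spread(points):
    """Assumed definition: mean distance of the players to their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def numerical_superiority(defenders, attackers, ball, radius=10.0):
    """Assumed definition: defenders minus attackers within `radius` of the ball."""
    def near(p):
        return math.hypot(p[0] - ball[0], p[1] - ball[1]) <= radius

    return sum(near(p) for p in defenders) - sum(near(p) for p in attackers)


# Hypothetical positions in metres for a single frame.
defenders = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (5.0, 5.0)]
attackers = [(6.0, 6.0), (30.0, 30.0)]
ball = (5.0, 5.0)

area = surface_area(defenders)
team_spread = spread(defenders)
superiority = numerical_superiority(defenders, attackers, ball, radius=8.0)
```

Per-frame values of this kind would then be aggregated over each defensive play and used as input features for a classifier; how the study performs that aggregation is not described in the abstract.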