Abstract

In professional soccer, the choices made in forming a team lineup are crucial for achieving good results. Players are characterized by different skills and their relevance depends on the position that they occupy on the pitch. Experts can recognize similarities between players and their styles, but the procedures adopted are often subjective and prone to misclassification. The automatic recognition of players’ styles based on their diversity of skills can help coaches and technical directors to prepare a team for a competition, to substitute injured players during a season, or to hire players to fill gaps created by teammates that leave. The paper adopts dimensionality reduction, clustering and computer visualization tools to compare soccer players based on a set of attributes. The players are characterized by numerical vectors embedding their particular skills and these objects are then compared by means of suitable distances. The intermediate data is processed to generate meaningful representations of the original dataset according to the (dis)similarities between the objects. The results show that the adoption of dimensionality reduction, clustering and visualization tools for processing complex datasets is a key modeling option with current computational resources.

Highlights

  • A vast literature on soccer data now exists, but research based on dimensionality reduction, clustering and computer visualization of soccer player data is scarce

  • The results show that the adoption of dimensionality reduction and visualization tools for processing complex data is a key modeling option with current computational resources

  • The uniform manifold approximation and projection (UMAP) proved very effective for visualizing clusters of objects, outperforming other dimensionality reduction, clustering and information visualization techniques in terms of computational time, memory requirements and ability to unveil patterns embedded in the data [57]


Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Dimensionality reduction-based schemes try to preserve, in low-dimensional representations, the information embedded in the original datasets. They include linear methods, such as classic multidimensional scaling (MDS) [47], principal component [48], canonical correlation [49], linear discriminant [50] and factor analysis [51], as well as nonlinear approaches, such as non-classic MDS, or Sammon’s projection [52], isomap [53], Laplacian eigenmap [54], diffusion map [55], t-distributed stochastic neighbor embedding [56] and uniform manifold approximation and projection (UMAP) [57].
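Several of the methods above, multidimensional scaling for instance, take a matrix of pairwise distances between objects as input. The following is a minimal sketch of that first step, assuming players are described by plain numerical skill vectors and compared with the Euclidean distance. The player names, attribute values and helper functions (`euclidean`, `distance_matrix`, `most_similar`) are hypothetical and are not taken from the paper or its dataset.

```python
import math

# Hypothetical skill vectors: names and values are illustrative only,
# not drawn from the paper's dataset.
players = {
    "player_a": [90.0, 85.0, 40.0],
    "player_b": [88.0, 80.0, 45.0],
    "player_c": [50.0, 55.0, 90.0],
}


def euclidean(u, v):
    """Euclidean distance between two equal-length skill vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(vectors):
    """Return the player names and the symmetric matrix of pairwise distances."""
    names = list(vectors)
    matrix = [[euclidean(vectors[p], vectors[q]) for q in names] for p in names]
    return names, matrix


def most_similar(name, vectors):
    """Return the other player whose skill vector is closest to `name`."""
    others = (q for q in vectors if q != name)
    return min(others, key=lambda q: euclidean(vectors[name], vectors[q]))


names, dist = distance_matrix(players)
closest_to_a = most_similar("player_a", players)
```

A matrix such as `dist` is the kind of intermediate data that a dimensionality reduction or clustering method would then turn into a low-dimensional map. Other distances could be substituted for the Euclidean one without changing the structure of the sketch.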

The Uniform Manifold Approximation and Projection
Description of the Dataset
The UMAP for Global Comparison and Visualization of Soccer Players
The UMAP for Local Comparison and Visualization of Soccer Players
Conclusions