Abstract

Objectives: Professional sporting organisations invest considerable resources collecting and analysing data in order to better understand the factors that influence performance. Recent advances in non-invasive technologies, such as global positioning systems (GPS), mean that large volumes of data are now readily available to coaches and sport scientists. However, analysing such data can be challenging, particularly when sample sizes are small and data sets contain multiple highly correlated variables, as is often the case in a sporting context. Multicollinearity in particular, if not treated appropriately, can be problematic and might lead to erroneous conclusions. In this paper we present a novel ‘leave one variable out’ (LOVO) partial least squares correlation analysis (PLSCA) methodology, designed to overcome the problem of multicollinearity, and show how this can be used to identify the training load (TL) variables that most influence ‘end fitness’ in young rugby league players.

Methods: The accumulated TL of sixteen male professional youth rugby league players (17.7 ± 0.9 years) was quantified via GPS, a micro-electrical-mechanical-system (MEMS), and players’ session-rating-of-perceived-exertion (sRPE) over a 6-week pre-season training period. Immediately prior to and following this training period, participants undertook a 30–15 intermittent fitness test (30-15IFT), which was used to determine each player’s ‘starting fitness’ and ‘end fitness’. In total, twelve TL variables were collected, and these, along with ‘starting fitness’ as a covariate, were regressed against ‘end fitness’. However, considerable multicollinearity in the data (VIF > 1000 for nine variables) meant that the multiple linear regression (MLR) process was unstable, and so we developed a novel LOVO PLSCA adaptation to quantify the relative importance of the predictor variables and thus minimise multicollinearity issues. As such, the LOVO PLSCA was used as a tool to inform and refine the MLR process.

Results: The LOVO PLSCA identified the distance accumulated at very-high speed (>7 m·s⁻¹) as being the most important TL variable to influence improvement in player fitness, with this variable causing the largest decrease in singular value inertia (5.93). When included in a refined linear regression model, this variable, along with ‘starting fitness’ as a covariate, explained 73% of the variance in v30-15IFT ‘end fitness’ (p < 0.001) and completely eliminated any multicollinearity issues.

Conclusions: The LOVO PLSCA technique appears to be a useful tool for evaluating the relative importance of predictor variables in data sets that exhibit considerable multicollinearity. When used as a filtering tool, LOVO PLSCA produced an MLR model that demonstrated a significant relationship between ‘end fitness’ and the predictor variable ‘accumulated distance at very-high speed’ when ‘starting fitness’ was included as a covariate. As such, LOVO PLSCA may be a useful tool for sport scientists and coaches seeking to analyse data sets obtained using GPS and MEMS technologies.
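
The leave-one-variable-out idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that 'singular value inertia' means the sum of the singular values of the cross-product matrix between the standardised predictors and the standardised outcome, and it uses synthetic data with three made-up predictors. The function names `zscore`, `inertia` and `lovo_plsca` are hypothetical.

```python
import numpy as np

def zscore(a):
    """Centre each column and scale it to unit sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def inertia(X, y):
    """Sum of the singular values of R = Zx^T Zy (assumed definition of inertia)."""
    R = zscore(X).T @ zscore(y)
    return np.linalg.svd(R, compute_uv=False).sum()

def lovo_plsca(X, y, names):
    """Drop each predictor in turn and record the resulting fall in inertia."""
    full = inertia(X, y)
    drops = {
        name: full - inertia(np.delete(X, j, axis=1), y)
        for j, name in enumerate(names)
    }
    return full, drops

# Synthetic, deliberately collinear data; n matches the squad size in the
# study, but every value here is invented for illustration only.
rng = np.random.default_rng(0)
n = 16
total = rng.normal(size=n)
high_speed = 0.9 * total + 0.3 * rng.normal(size=n)
srpe = 0.8 * total + 0.4 * rng.normal(size=n)
X = np.column_stack([total, high_speed, srpe])
names = ["total_distance", "very_high_speed_distance", "sRPE"]
y = (high_speed + 0.5 * rng.normal(size=n)).reshape(-1, 1)

full, drops = lovo_plsca(X, y, names)
for name, drop in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"{name}: decrease in inertia = {drop:.3f}")
```

Under these assumptions, the predictor whose removal produces the largest decrease in inertia is ranked as the most important. With a single outcome column this reduces to ranking predictors by the absolute value of their correlation with the outcome, so the sketch is more informative when the outcome block has several columns.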

Highlights

  • Professional sporting organisations invest considerable resources collecting and analysing data to better understand the factors that influence athletic performance

  • The ‘leave one variable out’ (LOVO) partial least squares correlation analysis (PLSCA) identified the distance accumulated at very-high speed (>7 m·s⁻¹) as being the most important training load (TL) variable to influence improvement in player fitness, with this variable causing the largest decrease in singular value inertia (5.93)

  • The LOVO PLSCA technique appears to be a useful tool for evaluating the relative importance of predictor variables in data sets that exhibit considerable multicollinearity

Introduction

Professional sporting organisations invest considerable resources collecting and analysing data to better understand the factors that influence athletic performance. Recent advances in wearable technology and computing power mean that large volumes of data are readily available to the applied practitioner [1]. While these data are becoming easier to collect, analysing them can be challenging, particularly when sample sizes are small (i.e. limited by squad size) and the variables are highly correlated, which can make standard least squares regression unstable and firm inference difficult [2,3]. For example, a player’s velocity and acceleration are the first and second derivatives, with respect to time, of the distance travelled. These variables are therefore not independent but highly correlated. It is not surprising, then, that strong correlations have been reported between variables widely used to assess training load (TL) [4,5].
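
One common way to quantify this kind of dependence is the variance inflation factor (VIF), defined for each predictor as 1/(1 − R²), where R² comes from regressing that predictor on all the others. The sketch below is illustrative only: the data are synthetic, the variable names are hypothetical, and the VIFs are computed as the diagonal of the inverse correlation matrix, which is an equivalent form of the same definition.

```python
import numpy as np

def vif(X):
    """VIF for each column: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic example: total distance is almost exactly the sum of the
# distances covered in two speed zones, so the three are nearly collinear.
rng = np.random.default_rng(1)
n = 16
low = rng.normal(3000, 300, n)            # hypothetical metres at low speed
high = rng.normal(400, 80, n)             # hypothetical metres at high speed
total = low + high + rng.normal(0, 5, n)  # near-exact linear combination
unrelated = rng.normal(0, 1, n)           # independent variable for contrast

X = np.column_stack([low, high, total, unrelated])
vifs = vif(X)
for name, value in zip(["low", "high", "total", "unrelated"], vifs):
    print(f"{name}: VIF = {value:.1f}")
```

The three distance variables should show very large VIFs, while the unrelated variable stays much closer to 1. Large VIFs indicate that the corresponding regression coefficients are poorly determined, which is the instability described above.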
