Abstract
An important and challenging problem in the evaluation of baseball players is the quantification of batted-ball talent. This problem has traditionally been addressed using linear regression or machine learning methods. We use large sets of trajectory measurements acquired by in-game sensors to show that the predictive value of a batted ball depends on its physical properties. This knowledge is exploited to estimate batted-ball distributions defined over a multidimensional measurement space from observed distributions by using regression parameters that adapt to batted ball properties. This process is central to a new method for estimating batted-ball talent. The domain of the batted-ball distributions is defined by a partition of measurement space that is selected to optimize the accuracy of the estimates. We present examples illustrating facets of the new approach and use a set of experiments to show that the new method generates estimates that are significantly more accurate than those generated using current methods. The new methodology supports the use of fine-grained contextual adjustments and we show that this process further improves the accuracy of the technique.
Highlights
Radar and optical sensors have been installed in Major League Baseball (MLB) stadiums in recent years and collect several terabytes of data during each game [1]
In order to exploit this principle we developed a new method for distribution estimation that transforms an observed distribution over local regions of measurement space
We show that by modeling the variation in the predictive value of batted balls, the measurement space partitioning (MSP) method improves on the accuracy of existing methods for estimating batted-ball talent level
Summary
Radar and optical sensors have been installed in Major League Baseball (MLB) stadiums in recent years and collect several terabytes of data during each game [1]. Player talent level on batted balls is defined as the expected value of a statistic which can be estimated from a sample of observations. An intuitively appealing measure of talent level is the naive estimate which is the value of the batted ball statistic over a player’s observed sample. Several ML methods have used sensor data to quantify player performance on batted balls These methods include a technique [18] that combines knearest neighbors with a generalized linear model as well as techniques [10] [19] that use kernel density estimates within a Bayesian framework. We show that by modeling the variation in the predictive value of batted balls, the MSP method improves on the accuracy of existing methods for estimating batted-ball talent level Another advantage of the MSP approach is the ability to incorporate fine-grained contextual information into estimates. Prediction is the process of using the observed data to predict the unobserved performance y(j) for each player j
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have