Abstract
We present a systematic approach to prediction based on panel data, which combine information about different interacting subjects at different times (here: two). The corresponding bivariate regression problem can be solved analytically for the final statistical estimation error. Furthermore, this expression is simplified for the special case that the subjects do not change their properties between the last measurement and the prediction period. This statistical framework is applied to the prediction of soccer matches, based on information from the previous and the present season. We determine how well the outcome of soccer matches can be predicted in theory. This optimum limit is compared with the actual quality of the prediction, taking the German premier league as an example. As a key step in the actual prediction process, one has to identify appropriate observables that reflect the strength of the individual teams as closely as possible. A criterion to distinguish different observables is presented. Surprisingly, chances for goals turn out to be much better suited than the goals themselves to characterize the strength of a team. Routes towards further improvement of the prediction are indicated. Finally, two specific applications are discussed.
Highlights
Panel data analysis deals with a regression procedure in which individual subjects as well as information at different times are taken into account [1]
For Gaussian statistics there exists a direct connection between Bayesian inference and a regression analysis; see, e.g., [5]
It is related to the Bayesian approach because it takes into account the impact of additional information as well as the impact of decorrelations on the estimation of future events
Summary
Panel data analysis deals with a regression procedure in which individual subjects as well as information at different times are taken into account [1]. We simplify the general result by assuming that the underlying property of the subject does not change between the final measurement and the prognosis time interval. This assumption does not necessarily hold for the times of earlier measurements. With an explicit expression for the estimator quality, it is possible to judge in detail how relevant the available information is for the prediction process. We can define the limit of optimum prediction and judge how far a specific prediction procedure differs from this limit.
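To make the idea of an explicit estimator-quality expression concrete, the following is a minimal sketch of a generic two-predictor (bivariate) linear regression on standardized variables. It is not the paper's derivation: the synthetic data, the noise levels, and the names `strength`, `x_prev`, `x_curr`, `bivariate_fit`, and `standardize` are all assumptions made for illustration. The closed-form coefficients and residual variance used here are the textbook least-squares results for two zero-mean, unit-variance predictors.

```python
import numpy as np


def standardize(v):
    """Shift to zero mean and scale to unit (population) variance."""
    return (v - v.mean()) / v.std()


def bivariate_fit(x1, x2, y):
    """Closed-form least squares for y ~ a*x1 + b*x2 (standardized inputs).

    Returns the coefficients a, b and the residual variance, which plays
    the role of the statistical estimation error in this toy setting.
    """
    c1 = np.mean(x1 * y)     # correlation of first predictor with target
    c2 = np.mean(x2 * y)     # correlation of second predictor with target
    c12 = np.mean(x1 * x2)   # correlation between the two predictors
    det = 1.0 - c12**2
    a = (c1 - c12 * c2) / det
    b = (c2 - c12 * c1) / det
    resid_var = 1.0 - (c1**2 + c2**2 - 2.0 * c1 * c2 * c12) / det
    return a, b, resid_var


# Synthetic panel (hypothetical): one latent strength per subject, observed
# noisily in a "previous" and a "current" period, plus a future outcome.
rng = np.random.default_rng(0)
n = 5000
strength = rng.normal(size=n)
x_prev = standardize(0.6 * strength + rng.normal(size=n))
x_curr = standardize(strength + 0.5 * rng.normal(size=n))
y = standardize(strength + rng.normal(size=n))

a, b, resid_var = bivariate_fit(x_prev, x_curr, y)

# Cross-check against a numerical least-squares solution.
coef, *_ = np.linalg.lstsq(np.column_stack([x_prev, x_curr]), y, rcond=None)
empirical_resid = np.mean((y - a * x_prev - b * x_curr) ** 2)

print(f"a = {a:.3f}, b = {b:.3f}, residual variance = {resid_var:.3f}")
```

In this sketch the residual variance shows how much predictive value the older measurement adds beyond the most recent one: dropping `x_prev` and refitting with `x_curr` alone gives a residual variance of `1 - c2**2`, which can be compared directly with `resid_var`.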