Abstract

When constructing a regression model, the researcher's primary problem is that the form of the equation connecting the explained and explanatory variables is not known in advance. This initial stage of construction is called the selection of the model's structural specification. Alongside the choice of specification, the question arises of which explanatory variables should be included in the equation. This is the problem of variable selection in regression models. Its essence is to single out, from the set of “candidates” for inclusion, a subset of the most informative ones according to some quality criterion. This article is devoted to the problem of variable selection in regression models estimated by ordinary least squares. A previously proposed approach to selecting a given number of variables, based on mixed 0–1 linear programming, is considered. The unknown parameters in this problem are the beta coefficients of the standardized regression and Boolean variables that indicate whether each factor enters the model. The optimal values of the unknown parameters are found by maximizing the regression's coefficient of determination. Unfortunately, solving this problem requires manually setting the number of selected factors, which often cannot be determined in advance. The goal was therefore to formalize the problem so that its solution also determines the optimal number of selected regressors. For this purpose, the adjusted coefficient of determination, which depends on the number of factors in the model, was used as the objective function. As a result, a mixed integer linear programming problem was formulated. Its unknown parameters are, as before, the beta coefficients and Boolean variables, together with an integer variable: the number of regressors.
Based on data on the prices and characteristics of sedans and hatchbacks produced by the American automobile industry, a computational experiment was carried out that confirmed the correctness of the developed mathematical apparatus. The problem formalized in this work as a mixed integer linear program appears preferable from a computational point of view to the same problem formalized in the modern scientific literature as a mixed integer quadratic program.
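
The role of the adjusted coefficient of determination as a criterion that depends on the number of factors can be illustrated with a minimal sketch. It uses the standard textbook formula, 1 - (1 - R^2)(n - 1)/(n - m - 1), and is not the mixed integer linear programming formulation developed in the article. The function name `adjusted_r2` and all numeric values are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2: float, n: int, m: int) -> float:
    """Adjusted coefficient of determination (standard textbook formula).

    r2 -- ordinary coefficient of determination of the fitted model
    n  -- number of observations
    m  -- number of regressors in the model (intercept not counted)
    """
    if n - m - 1 <= 0:
        raise ValueError("need more observations than regressors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)


# Hypothetical numbers: a fourth regressor raises R^2 only slightly
# (0.800 -> 0.802) on a sample of 30 observations.
three_factors = adjusted_r2(0.800, n=30, m=3)
four_factors = adjusted_r2(0.802, n=30, m=4)

# The ordinary R^2 never decreases when a regressor is added, but the
# adjusted value can, so it penalizes uninformative factors.
print(f"m=3: {three_factors:.4f}")
print(f"m=4: {four_factors:.4f}")
```

With these illustrative inputs the adjusted value is lower for the four-factor model, which is why maximizing it, unlike maximizing the ordinary coefficient of determination, can also settle the number of regressors.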
