Abstract

This paper proposes a weighted cross-validation (WCV) algorithm for selecting, among linear regression models with a change-point under scale mixtures of normal (SMN) distributions, the one that yields the best prediction results. SMN distributions are used to construct regression models that are robust to the influence of outliers on parameter estimation. We therefore relax the usual normality assumption and let the random errors follow an SMN distribution, specifically the Student-t distribution. In addition, we allow the parameters of the regression model to change from a specific, unknown point, called the change-point. In this context, the model parameters, including the change-point, are estimated via an EM-type (Expectation-Maximization) algorithm. The WCV method is then used to select the model that is most robust and has the smallest prediction error, with the weights taken from the E-step of the EM-type algorithm. Finally, numerical examples with simulated and real data (television audience data) illustrate the proposed methodology.

Highlights

  • Linear regression models are widely used to describe the average relationship between a response variable and one or more explanatory variables

  • This work proposes a weighted cross-validation (WCV) method for selecting a regression model with a continuous change-point that uses a heavier-tailed distribution (SMN) as an alternative to the normal distribution, in order to reduce the influence of outliers and allow a high level of predictivity

  • When u_i = 1 for i = 1, ..., n, the results described above coincide with the maximum likelihood estimates of the linear regression model with change-point under normal errors
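
The role of the weights u_i can be illustrated with a minimal sketch. It assumes the standard E-step weight for univariate Student-t errors, u_i = (nu + 1) / (nu + d_i), where d_i is the squared standardized residual. This formula and the names used (student_t_weights, sigma2, nu) are assumptions for illustration and are not taken from the paper's text.

```python
def student_t_weights(residuals, sigma2, nu):
    """Hypothetical helper: E-step weights for univariate Student-t errors.

    Assumes u_i = (nu + 1) / (nu + d_i) with d_i = r_i**2 / sigma2,
    the usual form for a Student-t scale mixture of normals.
    """
    if sigma2 <= 0 or nu <= 0:
        raise ValueError("sigma2 and nu must be positive")
    return [(nu + 1.0) / (nu + (r * r) / sigma2) for r in residuals]


# Illustrative residuals; the last one plays the role of an outlier.
residuals = [0.1, -0.5, 0.3, 8.0]

# Heavy tails (small nu): the outlier receives a much smaller weight.
weights_t = student_t_weights(residuals, sigma2=1.0, nu=4.0)

# Very large nu approximates the normal case: every weight is close to 1.
weights_near_normal = student_t_weights(residuals, sigma2=1.0, nu=1e9)
```

Under this assumed form, a small nu downweights observations with large residuals, while letting nu grow drives every u_i toward 1, which matches the normal-error case stated in the highlight.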


Summary

INTRODUCTION

Linear regression models are widely used to describe the average relationship between a response variable and one or more explanatory variables. The assumption of normally distributed errors is usually adopted in the literature. This assumption becomes unrealistic when the data follow a distribution with heavier tails; moreover, the coefficient estimates are sensitive to extreme observations. Field and Blanchard (1997) and Markatou, Afendras and Agostinelli (2018) extend this idea and propose WCV methods for selecting models that are robust to extreme observations. Following this approach, this work proposes a WCV method for selecting a regression model with a continuous change-point that uses a heavier-tailed distribution (SMN) as an alternative to the normal distribution, in order to reduce the influence of outliers and allow a high level of predictivity.
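
To make the idea of a weighted validation error concrete, here is a minimal sketch. It assumes the criterion is a weighted mean of squared prediction errors on the validation set, normalized by the sum of the weights. The exact criterion used in the paper is not given in this summary, so the function name (weighted_prediction_error), the formula and the numbers below are illustrative assumptions only.

```python
def weighted_prediction_error(y_true, y_pred, weights):
    """Hypothetical criterion: sum(w * (y - yhat)**2) / sum(w)."""
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sq = sum(
        w * (y - p) ** 2 for y, p, w in zip(y_true, y_pred, weights)
    )
    return weighted_sq / total_weight


# Made-up validation data; the last response is an extreme observation.
y_val = [1.0, 2.0, 3.0, 20.0]
y_hat = [1.1, 1.9, 3.2, 4.0]

# Equal weights reproduce the ordinary mean squared prediction error.
err_unweighted = weighted_prediction_error(y_val, y_hat, [1.0] * 4)

# Robust weights (assumed here) shrink the outlier's contribution.
err_weighted = weighted_prediction_error(y_val, y_hat, [1.0, 1.0, 1.0, 0.05])
```

With equal weights the single extreme observation dominates the error; with a small weight on that observation, the criterion mainly reflects how well the candidate model predicts the bulk of the data, which is the motivation for weighting in model selection.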

SPECIFICATION OF THE REGRESSION MODEL
Scale Mixtures of Normal Distributions
The EM Algorithm
Description of the EM-type Algorithm
WEIGHTED CROSS-VALIDATION METHOD
Estimate the average forecast error considering the validation set from
SIMULATION STUDIES
APPLICATION IN TELEVISION AUDIENCE DATA
CONCLUSIONS
