Abstract

In classical regression analysis, ordinary least-squares estimation is the best estimation method when the essential assumptions are satisfied. However, if the data do not satisfy some of these assumptions, the results can be misleading. In particular, outliers violate the assumption of normally distributed residuals in least-squares regression. Robust regression is a modern technique for analyzing data that are contaminated with outliers. The standard setup assumes that the given samples are drawn from a well-behaved distribution, but that an adversary has the power to arbitrarily corrupt a constant fraction of the observed data. With advances in technology, many data problems carry structure in which the number of covariates (p) may exceed the sample size (n), as is the case with high-dimensional datasets. Because of such limitations in high-dimensional problems, the classical approaches may no longer be useful. One of the alternative approaches commonly used is the ridge estimator introduced by Hoerl and Kennard (Technometrics 12:55–67, 1970). Robust ridge regression provides a solution for high-dimensional datasets with outliers. More specifically, when prior information (in the form of non-sample information) is available about the parameter vector, the estimation can be improved. This information, known as uncertain prior information or a restriction, is useful in the estimation procedure, especially when the information available from the sample data is limited. The information may arise from (a) a fact known from theoretical or experimental considerations, (b) a hypothesis that needs to be tested, or (c) an artificially imposed condition to reduce or eliminate redundancy in the description of the model. On the other hand, in some experimental cases, it is not certain whether this prior information holds. The consequence of incorporating non-sample information depends on the quality or reliability of the information introduced in the estimation process. This uncertain prior information, in the form of hypotheses, can be used in two different ways: (a) a preliminary test estimation procedure and (b) Stein-type shrinkage estimation. We consider robust ridge estimation for semiparametric high-dimensional data and propose preliminary test, Stein-type, and positive-rule Stein-type robust estimators. A real-data example is considered to illustrate the efficiency of these estimators.
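As a rough illustration of these ideas, the following minimal Python sketch (not the authors' estimator) fits an unrestricted robust ridge estimate via a Huber loss with an L2 penalty, fits a restricted estimate under an assumed prior restriction, and then forms preliminary test, Stein-type, and positive-rule Stein-type combinations of the two. The simulated data, the restriction, the Wald-type test statistic, and the shrinkage constant are all illustrative assumptions.

    # Minimal sketch (assumptions, not the paper's method): robust ridge via
    # Huber loss + L2 penalty, combined with shrinkage toward a restricted fit.
    import numpy as np
    from scipy.stats import chi2
    from sklearn.linear_model import HuberRegressor

    rng = np.random.default_rng(0)
    n, p = 60, 10
    X = rng.normal(size=(n, p))
    beta_true = np.zeros(p)
    beta_true[:3] = [2.0, -1.5, 1.0]
    y = X @ beta_true + rng.normal(scale=0.5, size=n)
    y[:5] += 15.0                                  # contaminate a few responses

    # Unrestricted robust ridge estimate: Huber loss with ridge (alpha) penalty.
    fit_U = HuberRegressor(epsilon=1.35, alpha=1.0, fit_intercept=False).fit(X, y)
    beta_U = fit_U.coef_

    # Restricted estimate under the assumed prior information beta[3:] = 0.
    fit_R = HuberRegressor(epsilon=1.35, alpha=1.0, fit_intercept=False).fit(X[:, :3], y)
    beta_R = np.zeros(p)
    beta_R[:3] = fit_R.coef_

    # Illustrative Wald-type statistic measuring how far the data are from the
    # restriction; the paper's actual test statistic may differ.
    diff = beta_U - beta_R
    sigma2 = np.var(y - X @ beta_U)
    T_n = n * float(diff @ diff) / max(sigma2, 1e-12)

    q = p - 3                                      # number of restrictions
    d = q - 2                                      # a common shrinkage constant

    # Preliminary test estimator: keep the restricted fit unless the test rejects.
    beta_PT = beta_R if T_n <= chi2.ppf(0.95, df=q) else beta_U
    # Stein-type estimator: shrink beta_U toward beta_R by a data-driven amount.
    beta_S = beta_R + (1.0 - d / T_n) * diff
    # Positive-rule Stein-type: truncate the shrinkage factor at zero.
    beta_PS = beta_R + max(0.0, 1.0 - d / T_n) * diff

The design point this sketch tries to convey is that the Stein-type estimator shrinks the unrestricted estimate toward the restricted one more strongly when the data agree with the prior information, while the positive-rule version keeps the shrinkage factor from becoming negative and over-shooting past the restricted estimate.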
