Abstract
It is not uncommon to find an outlier in the response variable in linear regression. Such a deviant value needs to be detected and scrutinized to find out why it does not agree with its fitted value. Srikantan [1] developed a test statistic for detecting the presence of an outlier in the response variable in a multiple linear regression model. Approximate critical values of this test statistic are available and are based on the first-order Bonferroni upper bound. The exact critical values are not available and, as a result, tests carried out on the basis of these approximate critical values may not be very accurate. In this paper, we obtain more accurate and precise critical values of this test statistic for large sample sizes (herein called asymptotic critical values) to improve on the tests that use these critical values. The procedure uses the exact probability density function of the test statistic to obtain its asymptotic critical values. We then compare these asymptotic critical values with the approximate critical values. Simulation results for linear regression models are used to examine the power of the test statistic. The asymptotic critical values were found to be more accurate and precise, and the power of the test was found to be better when the asymptotic critical values were used.
Highlights
If a value of the response variable deviates from its fitted value considerably more than the other values deviate from theirs, we call such a value an outlier
We introduced an outlier by adding a constant c to the maximum value of the response variable in each dataset, that is, max(yi) + c, i = 1, 2, ..., n
The asymptotic critical values x0 of the test statistic tn were derived in this work from the exact distribution of the test statistic, on the assumption that the lack of independence can be ignored asymptotically (as n → ∞), whereas the upper bound t0 of the critical value of tn in [1] was obtained from the Bonferroni inequality
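The outlier-injection step described in the highlights above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function names `inject_outlier` and `studentized_residuals`, the sample size, the regression coefficients, and the value of c are all hypothetical. The maximum absolute internally studentized residual is used here only as a generic outlier measure; the source does not give the formula for tn, so it should not be read as that statistic.

```python
import numpy as np


def inject_outlier(y, c):
    """Return a copy of y with c added to its maximum value, and the index changed."""
    y_out = np.asarray(y, dtype=float).copy()
    idx = int(np.argmax(y_out))
    y_out[idx] += c
    return y_out, idx


def studentized_residuals(X, y):
    """Internally studentized OLS residuals; X must already contain an intercept column."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverages are the diagonal of the hat matrix, computed via a thin QR factorization.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)
    s2 = resid @ resid / (n - p)  # residual variance estimate
    return resid / np.sqrt(s2 * (1.0 - h))


# Hypothetical simulated dataset: simple linear model with standard normal errors.
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=n)

# Introduce the outlier as max(y_i) + c, with an assumed c = 10.
y_out, idx = inject_outlier(y, c=10.0)

r = studentized_residuals(X, y_out)
max_abs = float(np.max(np.abs(r)))
print(f"modified index: {idx}, largest |studentized residual|: {max_abs:.3f}")
```

Repeating this over many simulated datasets and comparing the resulting statistic against a chosen critical value would give an empirical estimate of power, which is the kind of comparison the abstract describes.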
Summary
If a value of the response variable deviates from its fitted value considerably more than the other values deviate from theirs, we call such a value an outlier. One of the most popular definitions of an outlier was given by [2], who described an outlier as an observation which deviates so much from other observations as if it were generated by a different mechanism. Johnson et al. [5] defined an outlier as an observation which is inconsistent with the remainder of the observations in the dataset in which it occurs. According to Domansk [6], numerous statistical methods for outlier detection have been proposed. He recommended that circumspection (cautiousness), double checking, recalculation, etc. may help