Abstract

It is not uncommon to find an outlier in the response variable in linear regression. Such a deviant value needs to be detected and scrutinized to find out why it does not agree with its fitted value. Srikantan [1] developed a test statistic for detecting the presence of an outlier in the response variable in a multiple linear regression model. Approximate critical values of this test statistic are available and are based on the first-order Bonferroni upper bound. The exact critical values are not available and, as a result, tests carried out on the basis of these approximate critical values may not be very accurate. In this paper, we obtain more accurate and precise critical values of this test statistic for large sample sizes (herein called asymptotic critical values) to improve the tests that use these critical values. The procedure uses the exact probability density function of the test statistic to obtain its asymptotic critical values. We then compare these asymptotic critical values with the approximate critical values. Simulated data from linear regression models were used to examine the power of the test statistic. The asymptotic critical values were found to be more accurate and precise, and the power of the test was found to be better when the asymptotic critical values were used.

Highlights

  • If a value of the response variable deviates considerably more from its fitted value than the other values deviate from theirs, we call such a value an outlier

  • We considered introducing an outlier by adding a constant c to the maximum value of the response variable in each dataset (max(yi) + c, i = 1, 2, ..., n)

  • The principles employed in this work in deriving the asymptotic critical values x0 of the test statistic tn involved using the exact distribution of this test statistic, assuming that the lack of independence can be ignored asymptotically (as n → ∞), while the principles employed by [1] in obtaining the upper bound t0 of the critical value of tn involved using the Bonferroni inequality
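The contamination scheme in the second highlight can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' simulation code: the sample size, the regression coefficients, the value c = 20, and the helper names `inject_outlier` and `max_studentized_residual` are all hypothetical. The statistic computed is the largest absolute internally studentized residual from a simple (one-predictor) regression, which is one common form of a maximum-residual outlier statistic; the exact definition of tn is not given in this excerpt.

```python
import math
import random


def max_studentized_residual(x, y):
    """Largest absolute internally studentized residual from a simple
    linear regression of y on x (with intercept). Assumed stand-in for tn."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate with n - 2 degrees of freedom.
    s2 = sum(e * e for e in residuals) / (n - 2)
    # Leverage of each observation in simple linear regression.
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return max(abs(e) / math.sqrt(s2 * (1.0 - h))
               for e, h in zip(residuals, leverages))


def inject_outlier(y, c):
    """Return a copy of y in which the maximum response is replaced by max(y) + c."""
    y_out = list(y)
    i_max = max(range(len(y_out)), key=lambda i: y_out[i])
    y_out[i_max] = y_out[i_max] + c
    return y_out


# Hypothetical simulation settings (not taken from the paper).
rng = random.Random(0)
n = 30
c = 20.0
x = [float(i) for i in range(1, n + 1)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

y_contaminated = inject_outlier(y, c)
clean_stat = max_studentized_residual(x, y)
contaminated_stat = max_studentized_residual(x, y_contaminated)
print("clean:", clean_stat, "contaminated:", contaminated_stat)
```

In a power study, the contaminated statistic would be compared against a critical value (x0 or t0) over many simulated datasets, and the proportion of rejections would estimate the power. Internally studentized residuals are bounded above by the square root of the residual degrees of freedom, so very large values of c move the statistic toward that bound rather than increasing it without limit.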


Summary

Introduction

If a value of the response variable deviates considerably more from its fitted value than the other values deviate from theirs, we call such a value an outlier. One of the most popular definitions of an outlier was given by [2], who described an outlier as an observation which deviates so much from other observations as if it were generated by a different mechanism. Johnson et al. [5] defined an outlier as an observation which is inconsistent with the remainder of the observations in the dataset in which it occurs. According to Domansk [6], numerous statistical methods for outlier detection have been proposed. He recommended that circumspection (cautiousness), double checking, recalculation, etc. may help

