Abstract

Pearson's chi-squared statistic ($X^2$) does not in general follow a chi-square distribution when it is used for goodness-of-fit testing of a multinomial distribution based on sparse contingency table data. We explore the properties of the $D^2$ statistic of [Zelterman, D., 1987. Goodness-of-fit tests for large sparse multinomial distributions. J. Amer. Statist. Assoc. 82 (398), 624–629] and compare them with those of $X^2$. We also compare the power of goodness-of-fit tests based on $D^2$, $X^2$, and the statistic $L_r$ proposed by [Maydeu-Olivares, A., Joe, H., 2005. Limited- and full-information estimation and goodness-of-fit testing in $2^n$ contingency tables: A unified framework. J. Amer. Statist. Assoc. 100 (471), 1009–1020] when the given contingency table is very sparse. We show three results. First, the variance of $D^2$ is not larger than the variance of $X^2$ under null hypotheses in which all the cell probabilities are positive. Second, the distribution of $D^2$ becomes more skewed as the multinomial distribution becomes more asymmetric and sparse. Third, for the $L_r$ statistic, the power of the goodness-of-fit test depends on the models selected for the testing. The results of a simulation experiment strongly support using both $D^2$ and $L_r$ for goodness-of-fit testing with large sparse contingency table data.
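The relationship between the two statistics can be illustrated with a short sketch. It is not the authors' code. It assumes the form of Zelterman's statistic as it is commonly stated, D^2 = sum_i [(n_i - m_i)^2 - n_i] / m_i with m_i = N p_i, that is, X^2 minus sum_i n_i / m_i. The function names, the NumPy dependency, and the simulation settings are hypothetical choices made only for illustration.

```python
import numpy as np


def pearson_x2(counts, probs):
    """Pearson's X^2 for observed counts under null cell probabilities."""
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    expected = counts.sum() * probs  # m_i = N * p_i
    return float(np.sum((counts - expected) ** 2 / expected))


def zelterman_d2(counts, probs):
    """D^2 in the assumed form: X^2 minus sum_i n_i / m_i."""
    counts = np.asarray(counts, dtype=float)
    probs = np.asarray(probs, dtype=float)
    expected = counts.sum() * probs
    return float(np.sum(((counts - expected) ** 2 - counts) / expected))


def simulate(probs, n, reps=2000, seed=0):
    """Draw `reps` multinomial tables of size n under the null and
    return the X^2 and D^2 values for each table."""
    rng = np.random.default_rng(seed)
    tables = rng.multinomial(n, probs, size=reps)
    x2 = np.array([pearson_x2(t, probs) for t in tables])
    d2 = np.array([zelterman_d2(t, probs) for t in tables])
    return x2, d2


if __name__ == "__main__":
    # A sparse, asymmetric null: 200 cells, 50 observations (assumed values).
    k, n = 200, 50
    probs = np.arange(1, k + 1, dtype=float)
    probs /= probs.sum()
    x2, d2 = simulate(probs, n)
    print("sample variance of X^2:", x2.var(ddof=1))
    print("sample variance of D^2:", d2.var(ddof=1))
```

Under a uniform null with k cells, sum_i n_i / m_i equals k for every table, so in this form the two statistics differ by a constant and have equal variance. Any difference in variance appears only when the cell probabilities are unequal.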
