Abstract

The null distribution of the likelihood ratio test (LRT) of a one-component normal model versus a two-component normal mixture model is unknown. In this paper, we take a bootstrap approach to the likelihood ratio test for testing bimodality of plasma glucose concentrations from the Rancho Bernardo Diabetes Study. The small p-values from this approach support the hypothesis that a bimodal normal mixture model fits the data significantly better than a unimodal normal model. The size and power of the bootstrap-based LRT are evaluated through simulations. The results suggest that a sample size close to 500 would be necessary to attain a power of 90% for detecting unbalanced mixtures with means and variances similar to those in the Rancho Bernardo data. Besides sample size, the power also depends on the means and variances of the two components in the data.

Highlights

  • Bimodality of blood glucose concentrations has been reported in many populations with a high prevalence of diabetes, including Pima Indians (Rushforth et al, 1971), Nauruans from Micronesia (Zimmet and Whitehouse, 1978), Samoans (Raper et al, 1984), Asian Indians who had migrated to South Africa (Steinberg et al, 1970), and Mexican Americans who were ∼50% white (Rosenthal et al, 1985)

  • To assess the performance of the procedure proposed in Section 2 for testing H0: one-component normal model vs. Ha: two-component normal mixture model in (1.1), we evaluate its size and power

  • The p-values from this approach indicate that a bimodal normal model fits the Rancho Bernardo plasma glucose data significantly better than a unimodal normal model


Summary

Introduction

Bimodality of blood glucose concentrations has been reported in many populations with a high prevalence of diabetes, including Pima Indians (Rushforth et al, 1971), Nauruans from Micronesia (Zimmet and Whitehouse, 1978), Samoans (Raper et al, 1984), Asian Indians who had migrated to South Africa (Steinberg et al, 1970), and Mexican Americans who were ∼50% white (Rosenthal et al, 1985).

When the two normal components in the mixture distribution have equal variances, i.e., σ1 = σ2, the p-value provided by the traditional LRT has been shown to be liberal (Thode et al, 1988), and an improved approximation by a chi-square distribution with 2.5 degrees of freedom has been suggested (Ning and Finch, 2000). Because the two components have unequal variances in the Rancho Bernardo data, the p-values in Fan et al (2005) are based on a chi-square distribution with 6 degrees of freedom, which may be adequate or even slightly conservative according to simulations conducted to investigate the distribution of the LRT for the Rancho Bernardo data (Yang, 2005).
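Because the chi-square approximations above are only approximate, a bootstrap p-value can be obtained by simulating the LRT statistic under the fitted one-component model. The following is a minimal sketch of a parametric bootstrap LRT, not the paper's exact algorithm: the function names (`loglik_one`, `loglik_two`, `lrt`, `bootstrap_lrt`), the median-split EM start, the variance floor, and the number of bootstrap replicates are all illustrative assumptions.

```python
import numpy as np

def loglik_one(x):
    # Maximized log-likelihood of a single normal (MLE variance uses 1/n).
    return -0.5 * x.size * (np.log(2 * np.pi * x.var()) + 1.0)

def _log_norm(x, mu, var):
    # Pointwise log-density of N(mu, var).
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def loglik_two(x, n_iter=200, tol=1e-8):
    # EM for a two-component normal mixture with unequal variances.
    # Assumptions: a median-split start and a variance floor that keeps
    # the likelihood from diverging on a degenerate component.
    floor = 1e-3 * x.var()
    lo = x <= np.median(x)
    p = 0.5
    mu = np.array([x[lo].mean(), x[~lo].mean()])
    var = np.maximum(np.array([x[lo].var(), x[~lo].var()]), floor)
    prev = -np.inf
    for _ in range(n_iter):
        logw = np.stack([np.log(p) + _log_norm(x, mu[0], var[0]),
                         np.log1p(-p) + _log_norm(x, mu[1], var[1])])
        mix = np.logaddexp(logw[0], logw[1])
        ll = mix.sum()
        if ll - prev < tol:
            break
        prev = ll
        r = np.exp(logw - mix)                      # E-step: responsibilities
        nk = r.sum(axis=1)
        p = np.clip(nk[0] / x.size, 1e-6, 1 - 1e-6)  # M-step: weight
        mu = (r @ x) / nk                            # M-step: means
        var = np.maximum((r * (x - mu[:, None]) ** 2).sum(axis=1) / nk, floor)
    return ll

def lrt(x):
    # LRT statistic, clamped at 0 in case EM stops below the null fit.
    return max(0.0, 2.0 * (loglik_two(x) - loglik_one(x)))

def bootstrap_lrt(x, n_boot=200, seed=0):
    # Parametric bootstrap: resample from the fitted one-component normal
    # and compare the observed statistic with the simulated null statistics.
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    stat = lrt(x)
    mu0, sd0 = x.mean(), x.std()
    boot = np.array([lrt(rng.normal(mu0, sd0, x.size)) for _ in range(n_boot)])
    p_value = (1 + np.sum(boot >= stat)) / (n_boot + 1)
    return stat, p_value

if __name__ == "__main__":
    # Synthetic unbalanced mixture for illustration only (not the study data).
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0, 1, 240), rng.normal(5, 1, 60)])
    print(bootstrap_lrt(x, n_boot=99, seed=2))
```

The same routine can in principle be reused to estimate size (generate data under the one-component model and record how often the bootstrap p-value falls below the nominal level) and power (generate data under a chosen mixture), at the cost of a nested simulation loop.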

A Bootstrap Approach to the Likelihood Ratio Test for Mixture Models
Algorithm for obtaining p-value of the LRT
The LRT applied to plasma glucose data
Evaluation of Size and Power
Algorithms for evaluating the size and power
Simulation results on size and power
Discussion
