Abstract

In many scientific studies, the underlying data-generating process is unknown and multiple statistical models are considered to describe it. For example, in a factorial experiment we might consider models involving just main effects, as well as those that include interactions. Model averaging is a commonly used statistical technique to allow for model uncertainty in parameter estimation. In the frequentist setting, the model-averaged estimate of a parameter is a weighted mean of the estimates from the individual models, with the weights typically being based on an information criterion, cross-validation, or bootstrapping. One approach to building a model-averaged confidence interval is to use a Wald interval (MA-Wald), based on the model-averaged estimate and its standard error. This has been the default method in many application areas, particularly those in the life sciences. The MA-Wald interval, however, assumes that the studentized model-averaged estimate has a normal distribution, which can be far from true in practice because the model weights are random and data-driven. Recently, the model-averaged tail area Wald interval (MATA-Wald) has been proposed as an alternative to the MA-Wald interval. It only assumes that the studentized estimate from each model has a N(0, 1) or t-distribution when that model is true, and it has been shown to have better coverage in simulation studies. However, when the response variable is skewed, even these relaxed assumptions may not be valid, and use of these intervals might therefore result in poor coverage. We propose a new interval (MATA-SBoot) which uses a parametric bootstrap approach to estimate the distribution of the studentized estimate for each model, when that model is true. This method only requires that the studentized estimate from each model is approximately pivotal, an assumption that will often be true in practice, even for skewed data.
We illustrate use of this new interval in the analysis of a three-factor marine global change experiment in which the response variable is assumed to have a lognormal distribution. We also perform a simulation study, based on the example, to compare the lower and upper error rates of this interval with those for existing methods. The results suggest that the MATA-SBoot interval can provide better error rates than existing intervals when we have skewed data, particularly for the upper error rate when the sample size is small.
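
As a rough illustration of the two existing intervals described above, the following Python sketch computes AIC-based weights, an MA-Wald interval and a MATA-Wald interval from per-model estimates and standard errors. It is a minimal sketch under stated assumptions rather than the authors' implementation: the weights are assumed to be Akaike weights, the MA-Wald standard error is assumed to take the commonly used form that adds each model's squared deviation from the averaged estimate to its variance, the MATA-Wald limits are assumed to solve a weighted tail-area equation with a N(0, 1) reference distribution, and all function names and numerical inputs are hypothetical. The MATA-SBoot interval is not shown, since it would additionally require a parametric bootstrap under each fitted model.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1) reference distribution (assumption)


def aic_weights(aic):
    """Akaike weights: proportional to exp(-0.5 * (AIC_i - min AIC))."""
    best = min(aic)
    raw = [math.exp(-0.5 * (a - best)) for a in aic]
    total = sum(raw)
    return [r / total for r in raw]


def ma_wald_interval(est, se, w, alpha=0.05):
    """Wald interval centred on the model-averaged estimate.

    Assumed standard error: sum_i w_i * sqrt(se_i^2 + (est_i - theta_bar)^2).
    """
    theta_bar = sum(wi * ei for wi, ei in zip(w, est))
    se_bar = sum(
        wi * math.sqrt(si ** 2 + (ei - theta_bar) ** 2)
        for wi, ei, si in zip(w, est, se)
    )
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return theta_bar - z * se_bar, theta_bar + z * se_bar


def mata_wald_interval(est, se, w, alpha=0.05):
    """Limits that make the weighted average of per-model tail areas alpha/2."""

    def averaged_cdf(theta):
        # Decreases from 1 to 0 as theta increases.
        return sum(
            wi * STD_NORMAL.cdf((ei - theta) / si)
            for wi, ei, si in zip(w, est, se)
        )

    def solve(target):
        # Bisection on a bracket wide enough to contain both limits.
        lo = min(est) - 10 * max(se)
        hi = max(est) + 10 * max(se)
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if averaged_cdf(mid) > target:
                lo = mid  # averaged tail area still too large: move right
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return solve(1 - alpha / 2), solve(alpha / 2)


if __name__ == "__main__":
    # Hypothetical inputs for two candidate models (not taken from the paper).
    weights = aic_weights([100.0, 102.0])
    estimates = [1.0, 1.4]
    std_errors = [0.20, 0.30]
    print("weights  :", weights)
    print("MA-Wald  :", ma_wald_interval(estimates, std_errors, weights))
    print("MATA-Wald:", mata_wald_interval(estimates, std_errors, weights))
```

With a single model of weight one, both functions reduce to the ordinary Wald interval, which is a convenient sanity check on the sketch.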

Highlights

  • It is well known that calculation of a confidence interval after selection of a best model ignores model uncertainty and can lead to the interval having poor coverage [1,2,3,4,5]

  • The clearest differences between the methods are for the upper confidence limit, with the MATA-SBoot interval generally having an upper error rate that is closest to the nominal level (Figs 2 to 4)

  • Because the MATA-SBoot interval increases its width to account for skewness, it was always wider than the model-averaged Wald (MA-Wald) and MATA-Wald intervals and usually wider than the percentile bootstrap (PB) interval


Summary

Introduction

It is well known that calculation of a confidence interval after selection of a best model ignores model uncertainty and can lead to the interval having poor coverage [1,2,3,4,5]. A simple alternative is to use an interval based on the full model. In settings where this model provides a good approximation to the “truth”, this will often lead to error rates close to the required levels. Even in these settings, however, a simpler model may provide a narrower interval with good coverage properties. Model averaging offers a compromise between these two types of interval, in that we might expect it to lead to a narrower interval than the full model, whilst providing better coverage than an interval based on a single best model [6, 7].
