Abstract

Construction of confidence intervals or regions is an important part of statistical inference. The usual approach to constructing a confidence interval for a single parameter, or a confidence region for two or more parameters, requires that the distribution of the estimated parameters is known or can be assumed. In reality, the sampling distributions of parameters of biological importance are often unknown or difficult to characterize. Distribution-free nonparametric resampling methods such as bootstrapping and permutation have been widely used to construct the confidence interval for a single parameter. There are also several parametric (ellipse) and nonparametric (convex hull peeling, bagplot and HPDregionplot) methods available for constructing confidence regions for two or more parameters. However, these methods have some key deficiencies, including biased estimation of the true coverage rate, failure to account for the shape of the distribution inherent in the data, and difficulty of implementation. The purpose of this paper is to develop a new distribution-free method for constructing the confidence region that is based on only a few basic geometrical principles and accounts for the actual shape of the distribution inherent in the real data. The new method is implemented in an R package, distfree.cr/R. The statistical properties of the new method are evaluated and compared with those of the other methods through Monte Carlo simulation. Our new method outperforms the other methods regardless of whether the samples are taken from normal or non-normal bivariate distributions. In addition, the superiority of our method is consistent across different sample sizes and different levels of correlation between the two variables. We also analyze three biological data sets to illustrate the use of our new method in genomics and other biological research.

Highlights

  • Confidence interval estimates of individual parameters are more informative than simple point estimates and they are widely used in statistical inference [1,2,3]

  • We develop a new geometry-based, distribution-free approach to constructing the confidence region (CR) for two or more variables

  • It should be a significant complement to the existing parametric and nonparametric methods, including bagplot [16], convex hull peeling [12], and HPDregionplot [25]


Summary

Introduction

Confidence interval estimates of individual parameters are more informative than simple point estimates, and they are widely used in statistical inference [1,2,3]. Construction of confidence intervals or regions for parameters often assumes that the data come from a normal distribution and are balanced. When the distribution is unknown or hard to characterize, several nonparametric procedures are available for constructing confidence intervals or regions. In convex hull peeling [12], the outermost convex hull is identified, the observations on that hull are assigned an index value of one, and these observations are removed from the data. The procedure is repeated, with the index value increased by one at each iteration, until all observations have been assigned an index. The fundamental idea behind HPDregionplot is to use, as the CR, the contour that encloses the desired proportion of the estimated density, based on two-dimensional kernel density estimates [17].
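The peeling procedure described above can be sketched in a few lines. The following Python snippet is only an illustration under stated assumptions, not the paper's implementation (which is an R package): the function names `convex_hull` and `peel_depths` are hypothetical, the hull is computed with Andrew's monotone chain algorithm as a convenient choice, and points lying on a hull edge without being a vertex are treated as interior and peeled in a later layer.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices of 2-D points (Andrew's monotone chain).

    Collinear points on a hull edge are not reported as vertices.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def peel_depths(points):
    """Map each point to the index of the hull layer on which it is removed.

    Layer 1 is the outermost hull; the index grows by one per iteration
    until every observation has been assigned an index.
    """
    remaining = list(points)
    depth = {}
    layer = 1
    while remaining:
        hull = set(convex_hull(remaining))
        for p in remaining:
            if p in hull:
                depth[p] = layer
        remaining = [p for p in remaining if p not in hull]
        layer += 1
    return depth


# Toy data (made up for illustration): two nested squares and a centre point.
sample = [(0, 0), (4, 0), (4, 4), (0, 4),
          (1, 1), (3, 1), (3, 3), (1, 3),
          (2, 2)]
depths = peel_depths(sample)
# Outer corners get index 1, inner corners index 2, the centre index 3.
```

In this sketch a region with a given nominal coverage would then be taken as the hull of the points whose index is at or above some layer; how that layer is chosen is left open here because the excerpt does not specify it.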
