Correcting for Sampling Error in Between-Cluster Effects: An Empirical Bayes Cluster-Mean Approach with Finite Population Corrections
With clustered data, such as students nested within schools or employees nested within organizations, it is often of interest to estimate and compare associations among variables separately for each level. While researchers routinely estimate between-cluster effects using the sample cluster means of a predictor, previous research has shown that this practice leads to biased estimates of coefficients at the between level, and recent research has recommended the use of latent cluster means within the multilevel structural equation modeling framework. However, the latent cluster mean approach may not always be the best choice, as it (a) relies on the assumption that the population cluster sizes are effectively infinite, (b) requires a relatively large number of clusters, and (c) is currently only implemented in specialized software such as Mplus. In this paper, we show how using empirical Bayes estimates of the cluster means (EBM) can also lead to consistent estimates of between-level coefficients, and illustrate how the empirical Bayes estimate can incorporate finite population corrections when information on population cluster sizes is available. Through a series of Monte Carlo simulation studies, we show that the empirical Bayes cluster-mean approach performs similarly to the latent cluster mean approach for estimating the between-cluster coefficients in most conditions when the infinite-population assumption holds, and that applying the finite population correction provides reasonable point and interval estimates when the population is finite. The performance of EBM can be further improved with restricted maximum likelihood estimation and likelihood-based confidence intervals. We also provide an R function that implements the empirical Bayes cluster-mean approach, and illustrate it using data from the classic High School and Beyond Study.
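As a rough illustration of the idea described above, the sketch below shrinks each sample cluster mean toward the grand mean by its estimated reliability, and optionally scales the sampling variance of the mean by the finite population correction 1 - n_j/N_j. It is not the authors' R function: the name `eb_cluster_means`, the `pop_sizes` argument, the method-of-moments variance estimates, and the unweighted grand mean are all assumptions made for this example, and Python is used only for convenience.

```python
from collections import defaultdict
from statistics import mean, variance


def eb_cluster_means(x, cluster, pop_sizes=None):
    """Shrink sample cluster means toward the grand mean (illustrative sketch).

    x         : list of level-1 scores
    cluster   : list of cluster labels, same length as x
    pop_sizes : optional dict {label: population cluster size N_j};
                None treats every cluster as infinitely large
    Requires at least two clusters and more observations than clusters.
    """
    groups = defaultdict(list)
    for value, label in zip(x, cluster):
        groups[label].append(value)

    labels = sorted(groups)
    n = {g: len(groups[g]) for g in labels}
    xbar = {g: mean(groups[g]) for g in labels}

    # Pooled within-cluster variance: SSW / (N - J)
    ssw = sum((v - xbar[g]) ** 2 for g in labels for v in groups[g])
    sigma2 = ssw / (len(x) - len(labels))

    # Sampling variance of each cluster mean, with the optional
    # finite population correction (1 - n_j / N_j)
    samp_var = {}
    for g in labels:
        fpc = 1.0 if pop_sizes is None else 1.0 - n[g] / pop_sizes[g]
        samp_var[g] = fpc * sigma2 / n[g]

    means = [xbar[g] for g in labels]
    grand = mean(means)  # unweighted grand mean (an assumption of this sketch)

    # Method-of-moments between-cluster variance, truncated at zero
    tau2 = max(0.0, variance(means) - mean(samp_var.values()))

    eb = {}
    for g in labels:
        total = tau2 + samp_var[g]
        reliability = tau2 / total if total > 0 else 0.0
        eb[g] = grand + reliability * (xbar[g] - grand)
    return eb
```

In this sketch, the shrunken means would take the place of the sample cluster means as the between-level predictor. When every cluster is fully observed (n_j = N_j), the correction sets the sampling variance to zero and the estimates reduce to the sample means; a likelihood-based fit (ML or REML) would normally replace the simple moment estimates used here.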
- Research Article
11
- 10.1177/0013164415618705
- Jul 19, 2016
- Educational and Psychological Measurement
We investigated methods of including covariates in two-level models for cluster randomized trials to increase power to detect the treatment effect. We compared multilevel models that included either an observed cluster mean or a latent cluster mean as a covariate, as well as the effect of including Level 1 deviation scores in the model. A Monte Carlo simulation study was performed manipulating effect sizes, cluster sizes, number of clusters, intraclass correlation of the outcome, patterns of missing data, and the squared correlations between Level 1 and Level 2 covariates and the outcome. We found no substantial difference between models with observed means or latent means with respect to convergence, Type I error rates, coverage, and bias. However, coverage could fall outside of acceptable limits if a latent mean is included as a covariate when cluster sizes are small. In terms of statistical power, models with observed means performed similarly to models with latent means, but better when cluster sizes were small. A demonstration is provided using data from a study of the Tools for Getting Along intervention.
- Research Article
8
- 10.1037/met0000137
- Mar 1, 2018
- Psychological Methods
The research literature has paid little attention to the issue of finite population at a higher level in hierarchical linear modeling. In this article, we propose a method to obtain finite-population-adjusted standard errors of Level-1 and Level-2 fixed effects in 2-level hierarchical linear models. When the finite population at Level-2 is incorrectly assumed as being infinite, the standard errors of the fixed effects are overestimated, resulting in lower statistical power and wider confidence intervals. The impact of ignoring finite population correction is illustrated by using both a real data example and a simulation study with a random intercept model and a random slope model. Simulation results indicated that the bias in the unadjusted fixed-effect standard errors was substantial when the Level-2 sample size exceeded 10% of the Level-2 population size; the bias increased with a larger intraclass correlation, a larger number of clusters, and a larger average cluster size. We also found that the proposed adjustment produced unbiased standard errors, particularly when the number of clusters was at least 30 and the average cluster size was at least 10. We encourage researchers to consider the characteristics of the target population for their studies and adjust for finite population when appropriate.
- Research Article
7
- 10.1198/108571106x110676
- Jun 1, 2006
- Journal of Agricultural, Biological, and Environmental Statistics
A gene-by-gene mixed model analysis is a useful statistical method for assessing significance for microarray gene differential expression. While a large amount of data on thousands of genes are collected in a microarray experiment, the sample size for each gene is usually small, which could limit the statistical power of this analysis. In this report, we introduce an empirical Bayes (EB) approach for general variance component models applied to microarray data. Within a linear mixed model framework, the restricted maximum likelihood (REML) estimates of variance components of each gene are adjusted by integrating information on variance components estimated from all genes. The approach starts with a series of single-gene analyses. The estimated variance components from each gene are transformed to the “ANOVA components”. This transformation makes it possible to independently estimate the marginal distribution of each “ANOVA component.” The modes of the posterior distributions are estimated and inversely transformed to compute the posterior estimates of the variance components. The EB statistic is constructed by replacing the REML variance estimates with the EB variance estimates in the usual t statistic. The EB approach is illustrated with a real data example which compares the effects of five different genotypes of male flies on post-mating gene expression in female flies. In a simulation study, the ROC curves are applied to compare the EB statistic and two other statistics. The EB statistic was found to be the most powerful of the three. Though the null distribution of the EB statistic is unknown, a t distribution may be used to provide conservative control of the false positive rate.
- Research Article
244
- 10.1111/j.0006-341x.2001.01173.x
- Dec 1, 2001
- Biometrics
Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure. 
We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.
- Research Article
2
- 10.1016/j.jspi.2012.04.001
- Apr 9, 2012
- Journal of Statistical Planning and Inference
Finite population corrections for multivariate Bayes sampling
- Research Article
5
- 10.1046/j.0039-0402.2003.00256.x
- Mar 5, 2004
- Statistica Neerlandica
For a multilevel model with two levels and only a random intercept, the quality of different estimators of the random intercept is examined. Analytical results are given for the marginal model interpretation where negative estimates of the variance components are allowed for. Except for four or five level‐2 units, the Empirical Bayes Estimator (EBE) has a lower average Bayes risk than the Ordinary Least Squares Estimator (OLSE). The EBEs based on restricted maximum likelihood (REML) estimators of the variance components have a lower Bayes risk than the EBEs based on maximum likelihood (ML) estimators. For the hierarchical model interpretation, where estimates of the variance components are restricted being positive, Monte Carlo simulations were done. In this case the EBE has a lower average Bayes risk than the OLSE, also for four or five level‐2 units. For large numbers of level‐1 (30) or level‐2 units (100), the performances of REML‐based and ML‐based EBEs are comparable. For small numbers of level‐1 (10) and level‐2 units (25), the REML‐based EBEs have a lower Bayes risk than ML‐based EBEs only for high intraclass correlations (0.5).
- Research Article
21
- 10.1007/s001220050375
- Jan 1, 1997
- Theoretical and Applied Genetics
Genetic correlations (ρ_g) are frequently estimated from natural and experimental populations, yet many of the statistical properties of estimators of ρ_g are not known, and accurate methods have not been described for estimating the precision of estimates of ρ_g. Our objective was to assess the statistical properties of multivariate analysis of variance (MANOVA), restricted maximum likelihood (REML), and maximum likelihood (ML) estimators of ρ_g by simulating bivariate normal samples for the one-way balanced linear model. We estimated probabilities of non-positive definite MANOVA estimates of genetic variance-covariance matrices and biases and variances of MANOVA, REML, and ML estimators of ρ_g, and assessed the accuracy of parametric, jackknife, and bootstrap variance and confidence interval estimators for ρ_g. MANOVA estimates of ρ_g were normally distributed. REML and ML estimates were normally distributed for ρ_g = 0.1, but skewed for ρ_g = 0.5 and 0.9. All of the estimators were biased. The MANOVA estimator was less biased than REML and ML estimators when heritability (H), the number of genotypes (n), and the number of replications (r) were low. The biases were otherwise nearly equal for different estimators and could not be reduced by jackknifing or bootstrapping. The variance of the MANOVA estimator was greater than the variance of the REML or ML estimator for most H, n, and r. Bootstrapping produced estimates of the variance of ρ_g close to the known variance, especially for REML and ML. The observed coverages of the REML and ML bootstrap interval estimators were consistently close to stated coverages, whereas the observed coverage of the MANOVA bootstrap interval estimator was unsatisfactory for some H, ρ_g, n, and r. The other interval estimators produced unsatisfactory coverages. REML and ML bootstrap interval estimates were narrower than MANOVA bootstrap interval estimates for most H, ρ_g, n, and r.
- Research Article
44
- 10.1007/s00122-006-0246-x
- Mar 17, 2006
- Theoretical and Applied Genetics
Multi-trait (co)variance estimation is an important topic in plant and animal breeding. In this study we compare estimates obtained with restricted maximum likelihood (REML) and Bayesian Gibbs sampling of simulated data and of three traits (diameter, height and branch angle) from a 26-year-old partial diallel progeny test of Scots pine (Pinus sylvestris L.). Based on the results from the simulated data we can conclude that the REML estimates are accurate but the mode of posterior distributions from the Gibbs sampling can be overestimated depending on the level of the heritability. The mean and median of the posteriors were considerably higher than the expected values of the heritabilities. The confidence intervals calculated with the delta method were biased downwardly. The highest probability density (HPD) interval provides a better interval estimate, but could be slightly biased at the lower level. Similar differences between REML and Gibbs sampling estimates were found for the Scots pine data. We conclude that further simulation studies are needed in order to evaluate the effect of different priors on (co)variance components in the genetic individual model.
- Research Article
30
- 10.1016/j.jspi.2006.08.008
- Jan 30, 2007
- Journal of Statistical Planning and Inference
Empirical Bayes estimation in finite population sampling under functional measurement error models
- Research Article
24
- 10.1080/00273171.2016.1236237
- Nov 1, 2016
- Multivariate Behavioral Research
For mixed models generally, it is well known that modeling data with few clusters will result in biased estimates, particularly of the variance components and fixed effect standard errors. In linear mixed models, small sample bias is typically addressed through restricted maximum likelihood estimation (REML) and a Kenward-Roger correction. Yet with binary outcomes, there is no direct analog of either procedure. With a larger number of clusters, estimation methods for binary outcomes that approximate the likelihood to circumvent the lack of a closed form solution such as adaptive Gaussian quadrature and the Laplace approximation have been shown to yield less-biased estimates than linearization estimation methods that instead linearly approximate the model. However, adaptive Gaussian quadrature and the Laplace approximation are approximating the full likelihood rather than the restricted likelihood; the full likelihood is known to yield biased estimates with few clusters. On the other hand, linearization methods linearly approximate the model, which allows for restricted maximum likelihood and the Kenward-Roger correction to be applied. Thus, the following question arises: Which is preferable, a better approximation of a biased function or a worse approximation of an unbiased function? We address this question with a simulation and an illustrative empirical analysis.
- Research Article
10
- 10.1186/s12874-022-01550-8
- Apr 13, 2022
- BMC Medical Research Methodology
Background: Stepped wedge trials are an appealing and potentially powerful cluster randomized trial design. However, they are frequently implemented with a small number of clusters. Standard analysis methods for these trials such as a linear mixed model with estimation via maximum likelihood or restricted maximum likelihood (REML) rely on asymptotic properties and have been shown to yield inflated type I error when applied to studies with a small number of clusters. Small-sample methods such as the Kenward-Roger approximation in combination with REML can potentially improve estimation of the fixed effects such as the treatment effect. A Bayesian approach may also be promising for such multilevel models but has not yet seen much application in cluster randomized trials. Methods: We conducted a simulation study comparing the performance of REML with and without a Kenward-Roger approximation to a Bayesian approach using weakly informative prior distributions on the intracluster correlation parameters. We considered a continuous outcome and a range of stepped wedge trial configurations with between 4 and 40 clusters. To assess method performance we calculated bias and mean squared error for the treatment effect and correlation parameters and the coverage of 95% confidence/credible intervals and relative percent error in model-based standard error for the treatment effect. Results: Both REML with a Kenward-Roger standard error and degrees of freedom correction and the Bayesian method performed similarly well for the estimation of the treatment effect, while intracluster correlation parameter estimates obtained via the Bayesian method were less variable than REML estimates with different relative levels of bias. Conclusions: The use of REML with a Kenward-Roger approximation may be sufficient for the analysis of stepped wedge cluster randomized trials with a small number of clusters. However, a Bayesian approach with weakly informative prior distributions on the intracluster correlation parameters offers a viable alternative, particularly when there is interest in the probability-based inferences permitted within this paradigm.
- Research Article
10
- 10.1080/00949650802635165
- Mar 1, 2010
- Journal of Statistical Computation and Simulation
This article deals with the estimation of a fixed population size through capture-mark-recapture method that gives rise to hypergeometric distribution. There are a few well-known and popular point estimators available in the literature, but no good comprehensive comparison is available about their merits. Apart from the available estimators, an empirical Bayes (EB) estimator of the population size is proposed. We compare all the point estimators in terms of relative bias and relative mean squared error. Next, two new interval estimators – (a) an EB highest posterior distribution interval and (b) a frequentist interval estimator based on a parametric bootstrap method, are proposed. The comparison is then carried among the two proposed interval estimators and interval estimators derived from the currently available estimators in terms of coverage probability and average length (AL). Based on comprehensive numerical results, we rank and recommend the point estimators as well as interval estimators for practical use. Finally, a real-life data set for a green treefrog population is used as a demonstration for all the methods discussed.
- Research Article
27
- 10.1002/env.3170050403
- Dec 1, 1994
- Environmetrics
Assessments of the potential health impacts of contaminants and other environmental risk factors are often based on comparisons of disease rates among collections of spatially aligned areas. These comparisons are valid only if the observed rates adequately reflect the true underlying area‐specific risk. In areas with small populations, observed incidence values can be highly unstable and true risk differences among areas can be masked by spurious fluctuations in the observed rates. We examine the use of Bayes and empirical Bayes methods for stabilizing incidence rates observed in geographically aligned areas. While these methods improve stability, both the Bayes and empirical Bayes approaches produce a histogram of the estimates that is too narrow when compared to the true distribution of risk. Constrained empirical Bayes estimators have been developed that provide improved estimation of the variance of the true rates. We use simulations to compare the performance of Bayes, empirical Bayes, and constrained empirical Bayes approaches for estimating incidence rates in a variety of multivariate Gaussian scenarios with differing levels of spatial dependence. The mean squared error of estimation associated with the simulated observed rates was, on average, five times greater than that of the Bayes and empirical Bayes estimates. The sample variance of the standard Bayes and empirical Bayes estimates was consistently smaller than the variance of the simulated rates. The constrained estimators produced collections of rate estimates that dramatically improved estimation of the true dispersion of risk. In addition, the mean square error of the constrained empirical Bayes estimates was only slightly greater than that of the unconstrained rate estimates. We illustrate the use of empirical and constrained empirical Bayes estimators in an analysis of lung cancer mortality rates in Ohio.
- Research Article
1
- 10.1002/gepi.22501
- Sep 18, 2022
- Genetic epidemiology
Populations of non-European ancestry are substantially underrepresented in genome-wide association studies (GWAS). As genetic effects can differ between ancestries due to possibly different causal variants or linkage disequilibrium patterns, a meta-analysis that includes GWAS of all populations yields biased estimation in each of the populations and the bias disproportionately impacts non-European ancestry populations. This is because meta-analysis combines study-specific estimates with inverse variance as the weights, which causes biases towards studies with the largest sample size, typical of the European ancestry population. In this paper, we propose two empirical Bayes (EB) estimators to borrow the strength of information across populations although accounting for between-population heterogeneity. Extensive simulation studies show that the proposed EB estimators are largely unbiased and improve efficiency compared to the population-specific estimator. In contrast, even though the meta-analysis estimator has a much smaller variance, it yields significant bias when the genetic effect is heterogeneous across populations. We apply the proposed EB estimators to a large-scale trans-ancestry GWAS of stroke and demonstrate that the EB estimators reduce the variance of the population-specific estimator substantially, with the effect estimates close to the population-specific estimates.
- Research Article
15
- 10.1348/000711000159178
- May 1, 2000
- The British journal of mathematical and statistical psychology
In a previous paper, Boik presented an empirical Bayes (EB) approach to the analysis of repeated measurements. The EB approach is a blend of the conventional univariate and multivariate approaches. Specifically, in the EB approach, the underlying covariance matrix is estimated by a weighted sum of the univariate and multivariate estimators. In addition to demonstrating that his approach controls test size and frequently is more powerful than either the epsilon-adjusted univariate or multivariate approaches, Boik showed how conventional multivariate software can be used to conduct EB analyses. Our investigation examined the Type I error properties of the EB approach when its derivational assumptions were not satisfied as well as when other factors known to affect the conventional tests of significance were varied. For comparative purposes we also investigated procedures presented by Huynh and by Keselman, Carriere, and Lix, procedures designed for non-spherical data and covariance heterogeneity, as well as an adjusted univariate and multivariate test statistic. Our results indicate that when the response variable is normally distributed and group sizes are equal, the EB approach was robust to violations of its derivational assumptions and therefore is recommended due to the power findings reported by Boik. However, we also found that both the EB approach and the adjusted univariate and multivariate procedures were prone to depressed or elevated rates of Type I error when data were non-normally distributed and covariance matrices and group sizes were either positively or negatively paired with one another. On the other hand, the Huynh and Keselman et al. procedures were generally robust to these same pairings of covariance matrices and group sizes.