Abstract

In linear regression analysis, the estimator of the variance of the estimated regression coefficients should take into account any clustering in the data, since the standard textbook formula then leads to a severe downward bias in the standard errors. The cluster-robust variance estimator (CRVE) generalizes the classical heteroskedasticity-robust estimator to clustered data. Its justification is asymptotic in the number of clusters. Although the CRVE is an improvement, considerable bias can remain when the number of clusters is small, all the more so when regressors are correlated within clusters. To address these issues, two improved methods have been proposed: one, which we call CR2VE, is based on bias reduced linearization, while the other, CR3VE, can be seen as a jackknife estimator. The latter is unbiased only under very strict conditions, in particular equal cluster sizes. To relax this condition, we introduce in this paper CR3VE-λ, a generalization of CR3VE in which the cluster size is allowed to vary freely between clusters. We illustrate the performance of CR3VE-λ through simulations and show that, especially when cluster sizes vary widely, it can outperform the other commonly used estimators.

Highlights

  • In linear regressions with clustered data, it is common practice to estimate the variance of the estimated parameters using the cluster-robust variance estimator (CRVE hereafter) introduced by Liang and Zeger (1986) as a generalization of the White (1980) heteroskedasticity-robust estimator

  • Bell and McCaffrey (2002) show that in finite samples, with few clusters and error terms that are correlated within clusters, CRVE leads to severely downward-biased standard errors and to misleading inference about the estimated parameters

  • We introduce CR3VE-λ, a cluster-robust variance estimator that is identical to CR3VE in the case of balanced clusters but, in the case of unbalanced clusters, takes the difference in cluster sizes into account such that the computed standard errors are less conservative and unbiased under more general conditions

Summary

Introduction

In linear regressions with clustered data, it is common practice to estimate the variance of the estimated parameters using the cluster-robust variance estimator (CRVE hereafter) introduced by Liang and Zeger (1986) as a generalization of the White (1980) heteroskedasticity-robust estimator. Bell and McCaffrey (2002) show that in finite samples, with few clusters and error terms that are correlated within clusters, CRVE leads to severely downward-biased standard errors and to misleading inference about the estimated parameters. Following Bell and McCaffrey (2002), inference about the estimated parameters can be improved by reducing the bias of CRVE with either bias reduced linearization (BRL), known as CR2VE, or the jackknife estimator v_JK, known as CR3VE, both of which are based on transformed OLS residuals. CR2VE and CR3VE generalize to clustered data the heteroskedasticity-consistent covariance estimators HC2 and HC3 introduced by MacKinnon and White (1985). We introduce CR3VE-λ, a cluster-robust variance estimator that is identical to CR3VE in the case of balanced clusters but, in the case of unbalanced clusters, takes the difference in cluster sizes into account such that the computed standard errors are less conservative and unbiased under more general conditions.
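To make the three baseline estimators concrete, the following is a minimal NumPy sketch of the usual textbook forms of CRVE (labelled CR0 here), CR2VE and CR3VE for OLS, each built from residuals transformed cluster by cluster. It is an illustration under stated assumptions, not the paper's code: the function name `cluster_robust_variances` and the simulated data are hypothetical, no small-sample scaling factor is applied (some presentations of the jackknife multiply by (G-1)/G), and CR3VE-λ is not implemented because its formula is not given in this summary.

```python
import numpy as np


def cluster_robust_variances(X, y, cluster_ids):
    """Return OLS coefficients and the CR0, CR2 and CR3 sandwich estimates.

    Hypothetical helper for illustration. Assumes I - H_gg is invertible
    for every cluster g (i.e. dropping any one cluster leaves X full rank).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)

    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    k = X.shape[1]
    meat = {name: np.zeros((k, k)) for name in ("CR0", "CR2", "CR3")}

    for g in np.unique(cluster_ids):
        idx = cluster_ids == g
        Xg, eg = X[idx], resid[idx]
        n_g = eg.size

        # Cluster block of the hat matrix and its complement I - H_gg.
        Hgg = Xg @ XtX_inv @ Xg.T
        w, Q = np.linalg.eigh(np.eye(n_g) - Hgg)

        # Residual adjustments: identity (CR0), (I - H_gg)^(-1/2) (CR2),
        # and (I - H_gg)^(-1) (CR3).
        adjust = {
            "CR0": np.eye(n_g),
            "CR2": Q @ np.diag(w ** -0.5) @ Q.T,
            "CR3": Q @ np.diag(1.0 / w) @ Q.T,
        }
        for name, A in adjust.items():
            score = Xg.T @ (A @ eg)
            meat[name] += np.outer(score, score)

    V = {name: XtX_inv @ m @ XtX_inv for name, m in meat.items()}
    return beta, V


# Usage on simulated data with unbalanced clusters (sizes are arbitrary).
rng = np.random.default_rng(0)
sizes = [3, 5, 8, 20, 4, 10]
ids = np.repeat(np.arange(len(sizes)), sizes)
n = ids.size
X = np.column_stack([np.ones(n), rng.normal(size=n)])
cluster_effect = rng.normal(size=len(sizes))[ids]  # induces within-cluster correlation
y = X @ np.array([1.0, 2.0]) + cluster_effect + rng.normal(size=n)

beta_hat, V = cluster_robust_variances(X, y, ids)
for name, cov in V.items():
    print(name, np.sqrt(np.diag(cov)))
```

Without the scaling factor, the CR3 matrix computed this way equals the sum over clusters of the outer products of the leave-one-cluster-out coefficient changes, which is why CR3VE can be read as a jackknife estimator.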

Basic Theory
From CR3VE to CR3VE-λ
Monte Carlo Simulations
A Note on Future Research
Conclusions
