Robust IV inference with clustering dependence

Abstract

Linear instrumental variables (IV) models with clustering dependence are widely used in empirical studies, but the common solution, the cluster covariance estimator, often produces undesirable inferential results, especially with weak instruments. In this paper, I propose a method that is robust to both weak IV and (potentially heterogeneous) clustering dependence. The proposed method is based on the idea of Fama–MacBeth estimation, with group-level estimators being a truncated version of the unbiased IV estimator. Truncation stabilizes the group-level estimator by ensuring bounded second moments, thus improving finite-sample performance in weak-instrument settings. Asymptotic validity is shown under both strong and weak IV sequences, as well as under general conditions. The proposed method is applied to study the effect of city compactness on population density.
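
The mechanics can be illustrated with a small numerical sketch. The code below is hypothetical and not the paper's implementation: it truncates per-cluster just-identified IV estimates at a user-chosen bound c (the paper truncates an unbiased IV estimator instead) and aggregates them Fama–MacBeth style.

```python
import numpy as np

def fama_macbeth_truncated_iv(y, x, z, cluster, c=10.0):
    """Fama-MacBeth-style aggregation of truncated per-cluster IV estimates.

    Simplified sketch: each cluster's just-identified IV estimate is clipped
    to [-c, c] before averaging; the paper truncates an *unbiased* IV
    estimator, which is not reproduced here.
    """
    betas = []
    for g in np.unique(cluster):
        m = cluster == g
        num = np.cov(z[m], y[m])[0, 1]           # cov(z, y) within the cluster
        den = np.cov(z[m], x[m])[0, 1]           # cov(z, x) within the cluster
        betas.append(np.clip(num / den, -c, c))  # truncation bounds second moments
    betas = np.asarray(betas)
    beta_hat = betas.mean()
    se = betas.std(ddof=1) / np.sqrt(len(betas))  # Fama-MacBeth standard error
    return beta_hat, se
```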

Similar Papers
  • Research Article
  • Cited by 1
  • 10.1111/biom.13784
Discussion on "Instrumented difference-in-differences" by Ting Ye, Ashkan Ertefaie, James Flory, Sean Hennessy & Dylan S. Small.
  • Nov 8, 2022
  • Biometrics
  • Hyunseung Kang

We reinterpret instrumented difference-in-differences (iDID) under a linear instrumental variables (IV) model. Under the linear IV model, we show why iDID is a clear improvement over two existing methods, difference-in-differences (DID) and a cross-sectional IV analysis. We also re-express some of the assumptions of iDID using familiar, regression-based identification assumptions. We conclude with a method inspired by the linear IV model that can potentially remedy the weak identification problem in iDID.

  • Research Article
  • Cited by 39
  • 10.1080/07474930801960410
Finite Sample Evidence Suggesting a Heavy Tail Problem of the Generalized Empirical Likelihood Estimator
  • May 15, 2008
  • Econometric Reviews
  • Patrik Guggenberger

Comprehensive Monte Carlo evidence is provided that compares the finite sample properties of generalized empirical likelihood (GEL) estimators to the ones of k-class estimators in the linear instrumental variables (IV) model. We focus on sample median, mean, mean squared error, and on the coverage probability and length of confidence intervals obtained from inverting a t-statistic based on the various estimators. The results indicate that in terms of the above criteria, all the GEL estimators and the limited information maximum likelihood (LIML) estimator behave very similarly. This suggests that GEL estimators might also share the “no-moment” problem of LIML. At sample sizes as in our Monte Carlo study, there is no systematic bias advantage of GEL estimators over k-class estimators. On the other hand, the standard deviation of GEL estimators is pronouncedly higher than for some of the k-class estimators. Therefore, if mean squared error is used as the underlying loss function, our study suggests the use of computationally simple estimators, such as two-stage least squares, in the linear IV model rather than GEL. Based on the properties of confidence intervals, we cannot recommend the use of GEL estimators either in the linear IV model.

  • Research Article
  • Cited by 69
  • 10.1017/s0266466603192055
DETECTING LACK OF IDENTIFICATION IN GMM
  • Jan 31, 2003
  • Econometric Theory
  • Jonathan H Wright

One of the key assumptions of the standard linear instrumental variables (IV) model is that the instruments and endogenous variables are correlated. This is the identification assumption, without which the usual IV estimator is neither consistent nor asymptotically normal. If the correlation between the instruments and the endogenous variables is nonzero, but slight, then the conventional Gaussian asymptotic theory for the IV model can nevertheless provide a very poor approximation to the actual sampling distribution of estimators and test statistics. Recognizing the identification assumption on which the IV model relies, it is quite common in the applied literature to test for instrument relevance by a first-stage F-test. The null hypothesis is one of a total lack of identification. A rejection of this hypothesis by no means implies that issues of weak instruments can be ignored (Staiger and Stock, 1997). But a failure to reject this hypothesis is a strong indication of identification difficulties. The first-stage F-test is an important and useful diagnostic in the IV model. The generalized method of moments (GMM) model (Hansen, 1982) nests the linear IV model as a special case. Not surprisingly, analogous issues arise in this model. Researchers have found that, in many contexts, the conventional Gaussian asymptotic theory provides a poor approximation to the sampling distribution of GMM estimators and test statistics. There are many possible reasons why this could happen, but they include identification problems. However, I am aware of no test of the identification condition in the nonlinear-in-parameters GMM model in the existing literature. This paper proposes such a test.
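
As a concrete illustration of the first-stage F-test diagnostic described above, the following minimal sketch (intercept-only restricted model, no additional exogenous controls) computes the F-statistic for the joint relevance of the excluded instruments; it is illustrative code, not the paper's.

```python
import numpy as np
from scipy import stats

def first_stage_F(x, Z):
    """F-test that the excluded instruments Z jointly predict the endogenous x."""
    n, k = Z.shape
    W = np.column_stack([np.ones(n), Z])             # intercept + instruments
    coef = np.linalg.lstsq(W, x, rcond=None)[0]
    rss_full = np.sum((x - W @ coef) ** 2)           # unrestricted residual SS
    rss_restr = np.sum((x - x.mean()) ** 2)          # intercept-only residual SS
    F = ((rss_restr - rss_full) / k) / (rss_full / (n - k - 1))
    p = stats.f.sf(F, k, n - k - 1)
    return F, p
```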

  • Research Article
  • Cited by 27
  • 10.1111/j.1524-4733.2009.00567.x
Too Much Ado about Instrumental Variable Approach: Is the Cure Worse than the Disease?
  • Nov 1, 2009
  • Value in Health
  • Onur Baser

  • Research Article
  • Cited by 55
  • 10.1002/jae.1148
Instrumental variables regressions with uncertain exclusion restrictions: a Bayesian approach
  • Jan 1, 2012
  • Journal of Applied Econometrics
  • Aart Kraay

The identification of structural parameters in the linear instrumental variables (IV) model is typically achieved by imposing the prior identifying assumption that the error term in the structural equation of interest is orthogonal to the instruments. Since this exclusion restriction is fundamentally untestable, there are often legitimate doubts about the extent to which the exclusion restriction holds. In this paper I illustrate the effects of such prior uncertainty about the validity of the exclusion restriction on inferences based on linear IV models. Using a Bayesian approach, I provide a mapping from prior uncertainty about the exclusion restriction into increased uncertainty about parameters of interest. Moderate prior uncertainty about exclusion restrictions can lead to a substantial loss of precision in estimates of structural parameters. This loss of precision is relatively more important in situations where IV estimates appear to be more precise, for example in larger samples or with stronger instruments. I illustrate these points using several prominent recent empirical papers that use linear IV models. An accompanying electronic table allows users to readily explore the robustness of inferences to uncertainty about the exclusion restriction in their particular applications.
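
The qualitative point, that doubt about the exclusion restriction widens uncertainty about the structural parameter, can be sketched crudely. This is not Kraay's Bayesian procedure: the snippet simply draws a hypothetical direct effect of the instrument from a N(0, omega^2) prior and propagates it through the just-identified IV formula.

```python
import numpy as np

def iv_with_exclusion_uncertainty(y, x, z, omega=0.1, draws=5000, seed=0):
    """Crude sensitivity sketch (not the paper's method): how much does a
    N(0, omega^2) prior on a direct effect of z on y widen the interval for
    the structural coefficient?"""
    rng = np.random.default_rng(seed)
    zc = z - z.mean()
    pi = zc @ x / (zc @ zc)                  # first-stage slope
    rho = zc @ y / (zc @ zc)                 # reduced-form slope
    gamma = rng.normal(0.0, omega, draws)    # prior draws for the direct effect
    beta_draws = (rho - gamma) / pi          # implied structural effect
    return np.percentile(beta_draws, [2.5, 50.0, 97.5])
```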

  • Research Article
  • Cited by 3777
  • 10.1198/073500102288618658
A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments
  • Oct 1, 2002
  • Journal of Business & Economic Statistics
  • James H Stock + 2 more

Weak instruments arise when the instruments in linear instrumental variables (IV) regression are weakly correlated with the included endogenous variables. In generalized method of moments (GMM), more generally, weak instruments correspond to weak identification of some or all of the unknown parameters. Weak identification leads to GMM statistics with nonnormal distributions, even in large samples, so that conventional IV or GMM inferences are misleading. Fortunately, various procedures are now available for detecting and handling weak instruments in the linear IV model and, to a lesser degree, in nonlinear GMM.

  • Single Report
  • Cited by 30
  • 10.1920/wp.cem.2010.3110
Sparse models and methods for optimal instruments with an application to eminent domain
  • Oct 22, 2010
  • Alexandre Belloni + 3 more

We develop results for the use of Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, $p$. Our results apply even when $p$ is much larger than the sample size, $n$. We show that the IV estimator based on using Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first-stage is approximately sparse; i.e. when the conditional expectation of the endogenous variables given the instruments can be well-approximated by a relatively small set of variables whose identities may be unknown. We also show the estimator is semi-parametrically efficient when the structural error is homoscedastic. Notably our results allow for imperfect model selection, and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. In developing the IV results, we establish a series of new results for Lasso and Post-Lasso estimators of nonparametric conditional expectation functions which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances which uses a data-weighted $\ell_1$-penalty function. Using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and Post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that $\log p = o(n^{1/3})$.
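
A stripped-down sketch of the Post-Lasso IV idea follows: cross-validated Lasso selects instruments, OLS refits the first stage on the selected set, and the fitted values serve as the instrument in 2SLS. Variable names are illustrative, variables are assumed centered, and the paper's heteroscedasticity-robust data-weighted penalty is not implemented.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def post_lasso_iv(y, x, Z):
    """Simplified Post-Lasso IV: Lasso selects instruments, OLS refits the
    first stage, and the fitted values instrument x in 2SLS.
    Variables are assumed centered (intercepts omitted for brevity)."""
    lasso = LassoCV(cv=5).fit(Z, x)
    selected = np.flatnonzero(lasso.coef_ != 0)
    if selected.size == 0:                      # fall back to all instruments
        selected = np.arange(Z.shape[1])
    Zs = Z[:, selected]
    pi = np.linalg.lstsq(Zs, x, rcond=None)[0]  # Post-Lasso first stage (OLS)
    x_hat = Zs @ pi
    beta = (x_hat @ y) / (x_hat @ x)            # 2SLS with the fitted values
    return beta, selected
```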

  • Research Article
  • Cited by 30
  • 10.2139/ssrn.1910169
Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain
  • Aug 15, 2011
  • SSRN Electronic Journal
  • Alexandre Belloni + 3 more

We develop results for the use of LASSO and Post-LASSO methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p, that apply even when p is much larger than the sample size, n. We rigorously develop asymptotic distribution and inference theory for the resulting IV estimators and provide conditions under which these estimators are asymptotically oracle-efficient. In simulation experiments, the LASSO-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the LASSO-based IV estimator substantially reduces estimated standard errors, allowing one to draw much more precise conclusions about the economic effects of these decisions. Optimal instruments are conditional expectations; and in developing the IV results, we also establish a series of new results for LASSO and Post-LASSO estimators of non-parametric conditional expectation functions which are of independent theoretical and practical interest. Specifically, we develop the asymptotic theory for these estimators that allows for non-Gaussian, heteroscedastic disturbances, which is important for econometric applications. By innovatively using moderate deviation theory for self-normalized sums, we provide convergence rates for these estimators that are as sharp as in the homoscedastic Gaussian case under the weak condition that log p = o(n^{1/3}). Moreover, as a practical innovation, we provide a fully data-driven method for choosing the user-specified penalty that must be provided in obtaining LASSO and Post-LASSO estimates, and we establish its asymptotic validity under non-Gaussian, heteroscedastic disturbances.

  • Research Article
  • Cited by 1
  • 10.1177/09622802241281035
LASSO-type instrumental variable selection methods with an application to Mendelian randomization
  • Nov 15, 2024
  • Statistical Methods in Medical Research
  • Muhammad Qasim + 2 more

Valid instrumental variables (IVs) must not directly impact the outcome variable and must also be uncorrelated with nonmeasured variables. However, in practice, IVs are likely to be invalid. Existing methods can lead to large bias relative to standard errors in situations with many weak and invalid instruments. In this paper, we derive a LASSO procedure for the k-class IV estimation methods in the linear IV model. In addition, we propose a jackknife IV method that uses LASSO to address the problem of many weak invalid instruments in the case of heteroscedastic data. The proposed methods are robust for estimating causal effects in the presence of many invalid and valid instruments, with theoretical guarantees for their performance. In addition, two-step numerical algorithms are developed for the estimation of causal effects. The performance of the proposed estimators is demonstrated via Monte Carlo simulations as well as an empirical application. We use Mendelian randomization as an application, wherein we estimate the causal effect of body mass index on the health-related quality of life index using single nucleotide polymorphisms as instruments for body mass index.
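
The plain jackknife IV (JIVE) building block mentioned above can be written in a few lines; this sketch omits the LASSO penalization and invalid-instrument selection that the paper adds, and assumes centered variables.

```python
import numpy as np

def jackknife_iv(y, x, Z):
    """Plain JIVE: each observation's first-stage fitted value is computed
    leaving that observation out, removing the own-observation bias of 2SLS
    with many instruments. Centered variables assumed."""
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)    # first-stage projection matrix
    h = np.diag(P)                           # leverage values
    x_loo = (P @ x - h * x) / (1.0 - h)      # leave-one-out fitted values
    return (x_loo @ y) / (x_loo @ x)
```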

  • Research Article
  • Cited by 15
  • 10.1097/ede.0000000000000152
Quantitative falsification of instrumental variables assumption using balance measures.
  • Sep 1, 2014
  • Epidemiology
  • M Sanni Ali + 8 more

To the Editor: Instrumental variable analysis has been used to control for unmeasured confounding in nonrandomized studies.1–4 An instrumental variable (1) is associated with exposure, (2) affects outcome only through the exposure, and (3) is independent of confounders.1–4 If these key assumptions are satisfied (together with additional assumptions, such as homogeneity),1,3,4 instrumental variable analysis can consistently estimate the average causal effect of exposure.1,4 However, if one of the assumptions is violated, the estimate can be severely biased.1,3,4 Several methods are available for checking the first assumption,2,4 but there is no well-established method for checking the second and third assumptions. Some authors1,3 have argued that these assumptions are untestable, as they involve unmeasured confounding. Glymour et al5 suggested several approaches (eg, leveraging prior causal assumptions) for evaluating the validity of an instrumental variable, although in certain situations they might fail to identify a biased instrumental variable or inappropriately suggest that a valid instrumental variable is biased. In addition, balance of measured confounders between instrumental variable categories has been used as supportive evidence for the third assumption.2,6 Conversely, an imbalance of measured confounders can falsify this assumption. We propose the standardized difference (SDif), a robust balance measure used in propensity score methods,7,8 to falsify the third assumption by checking independence between an instrumental variable and measured confounders. If measured confounders are insufficiently balanced between instrumental variable categories, indicated by SDif values deviating from zero (eg, >0.10),7 this may also imply an imbalance of unmeasured confounders, even after conditioning on measured confounders (depending on the associations among instrumental variables and measured and unmeasured confounders). In that case, the third assumption is violated; hence, (un)adjusted instrumental variable analysis is inappropriate. However, if measured confounders are balanced, investigators should rely on background knowledge to argue that such balance could carry over to unmeasured confounders.2,6

In a simulation study, we assessed the performance of SDif for quantitatively falsifying the third assumption. In addition, we applied this measure in an empirical study on the relation between β2-agonist use and myocardial infarction, using physician preference as an instrumental variable. For details, we refer to the eAppendix (https://links.lww.com/EDE/A815). Key findings are summarized below. Data were generated with a binary instrumental variable and exposure, continuous confounders (3 measured and 1 unmeasured), and a continuous outcome based on the causal diagram shown in the Figure (panel A). SDif was calculated for the measured confounders.

[Figure: Relation between bias and standardized difference of observed confounders between instrumental variable (IV) categories. Panel A: directed acyclic graph (DAG) of the simulations, where X = exposure, Y = outcome, C = observed confounders, U = unobserved confounder, and Z = IV. Panels B and C: mean standardized difference versus bias of IV estimates (based on 10,000 simulations of 10,000 subjects); observed confounders were excluded from the IV models in panel B and included in panel C, and were adjusted for in the conventional regression analysis in both panels.]

Panel B shows the results of instrumental variable analysis without adjustment for measured confounders. The magnitude of bias in the instrumental variable estimate increased with decreasing balance of measured confounders between instrumental variable categories (eg, for an instrumental variable that was independent of unmeasured confounders, the bias ranged from 0.0 to 6.3 for corresponding SDif values of 0.05–0.60). When the instrumental variable was independent of the measured confounders but associated with the unmeasured confounder, instrumental variable estimates were biased even though the SDif was close to zero. Panel C shows the results of instrumental variable analysis with adjustment for measured confounders. When the instrumental variable was independent of the unmeasured confounder, effect estimates were unbiased. Moreover, the bias in adjusted instrumental variable estimates was smaller than that in unadjusted instrumental variable estimates. Importantly, when the instrumental variable was associated with both measured and unmeasured confounders, estimates from adjusted instrumental variable models were more biased than those from conventional regression analysis adjusting for measured confounders. The pattern of bias was similar when the measured and unmeasured confounders were associated, except that the magnitude of bias was smaller if the confounders were positively correlated (details in the eAppendix, https://links.lww.com/EDE/A815).

Our study shows that the standardized difference can be a useful tool to falsify the third instrumental variable assumption (ie, that the instrumental variable is independent of confounders). However, balance of measured confounders between instrumental variable categories does not guarantee balance of unmeasured confounders. If there is an imbalance of measured confounders between instrumental variable categories, indicating a violation of the third assumption, researchers should consider refraining from instrumental variable analysis, irrespective of possible adjustment for measured confounders.
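
The standardized difference itself is straightforward to compute; a minimal sketch for one measured confounder and a binary instrument follows (illustrative code, not the authors').

```python
import numpy as np

def standardized_difference(c, z):
    """Standardized difference of a measured confounder c between the two
    instrument categories z in {0, 1}; values above roughly 0.10 are read
    as imbalance that falsifies the independence assumption."""
    c1, c0 = c[z == 1], c[z == 0]
    pooled_sd = np.sqrt((c1.var(ddof=1) + c0.var(ddof=1)) / 2.0)
    return (c1.mean() - c0.mean()) / pooled_sd
```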

  • Research Article
  • Cited by 36
  • 10.1016/j.jeconom.2010.01.002
Applications of subsampling, hybrid, and size-correction methods
  • Jan 18, 2010
  • Journal of Econometrics
  • Donald W.K Andrews + 1 more

  • Dissertation
  • 10.6092/unibo/amsdottorato/8813
Essays on bootstrap inference under weakly identified models
  • Apr 9, 2019
  • Riccardo Ievoli

Instrumental variables (IV) are widely used in econometrics to overcome the endogeneity problem in regression models, which occurs when regressors are correlated with the stochastic component. Nonetheless, in applied work, practitioners are often faced with instruments that are collectively "weak", i.e. poorly correlated with the endogenous regressors. Under weak instruments, conventional estimators are no longer consistent and asymptotically normal. Furthermore, bootstrap methods could be useful to improve inference in IV estimation. However, under poorly relevant instruments, the bootstrap is deemed invalid and its use is generally discouraged in applied papers. In this work, we propose a new derivation of bootstrapped IV estimators under weak-instrument asymptotics (Stock and Yogo, 2005) using a residual-based bootstrap method involving fixed or resampled instruments. We prove that the bootstrap counterpart of the estimators, conditionally on the data, converges to a random distribution preserving some patterns (non-normality) of the weak- and irrelevant-instrument scenarios. These issues may also be reflected in bootstrap-based confidence sets and hypothesis testing. In this sense, we explore the usefulness of bootstrap methods to provide information on the weakness (or strength) of the instruments. We consider descriptive indicators and develop new bootstrap-based tests useful for detecting weak instruments in the IV framework. The method essentially builds on Angelini et al. (2016) and allows one to test the normality of a certain number of (possibly standardized) bootstrap replications. Since conventional normality tests can lose power in the presence of more instruments and high endogeneity, we propose new test statistics aimed at testing standard normality of the bootstrap replications. These tests are based on the moments of the standard normal distribution and are asymptotically chi-square distributed under the null hypothesis. In conclusion, we find that, in some cases, bootstrapped estimators may be used to test weak identification.
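
A bare-bones fixed-instrument residual bootstrap of the just-identified IV estimator, in the spirit of the scheme described above (centered variables, no intercepts, purely illustrative), looks like this.

```python
import numpy as np

def residual_bootstrap_iv(y, x, z, B=999, seed=0):
    """Fixed-instrument residual bootstrap of the just-identified IV estimator.
    Structural and first-stage residuals are resampled jointly to preserve
    their correlation (the source of endogeneity). Centered variables assumed."""
    rng = np.random.default_rng(seed)
    zc = z - z.mean()
    beta = zc @ y / (zc @ x)         # IV point estimate
    pi = zc @ x / (zc @ zc)          # first-stage slope
    u = y - beta * x                 # structural residuals
    v = x - pi * z                   # first-stage residuals
    n = len(y)
    boot = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)  # joint resampling of (u, v)
        x_b = pi * z + v[idx]
        y_b = beta * x_b + u[idx]
        boot[b] = zc @ y_b / (zc @ x_b)
    return beta, boot
```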

  • Single Report
  • Cited by 5
  • 10.3386/w13787
A Maximum Likelihood Method for the Incidental Parameter Problem
  • Feb 1, 2008
  • Marcelo Moreira

This paper uses the invariance principle to solve the incidental parameter problem. We seek group actions that preserve the structural parameter and yield a maximal invariant in the parameter space with fixed dimension. M-estimation from the likelihood of the maximal invariant statistic yields the maximum invariant likelihood estimator (MILE). We apply our method to (i) a stationary autoregressive model with fixed effects; (ii) an agent-specific monotonic transformation model; (iii) an instrumental variable (IV) model; and (iv) a dynamic panel data model with fixed effects. In the first two examples, there exist group actions that completely discard the incidental parameters. In a stationary autoregressive model with fixed effects, MILE coincides with existing conditional and integrated likelihood methods. The invariance principle also gives a new perspective to the marginal likelihood approach. In an agent-specific monotonic transformation model, our approach yields an estimator that is consistent and asymptotically normal when errors are Gaussian. In an instrumental variable (IV) model, this paper unifies asymptotic results under strong instruments (SIV) and many weak instruments (MWIV) frameworks. We obtain consistency, asymptotic normality, and optimality results for the limited information maximum likelihood estimator directly from the invariant likelihood. Our approach is parallel to M-estimation in problems in which the number of parameters does not change with the sample size. In a dynamic panel data model with N individuals and T time periods, MILE is consistent as long as NT goes to infinity. We obtain a large N, fixed T bound; this bound coincides with Hahn and Kuersteiner's (2002) bound when T goes to infinity. MILE reaches (i) our bound when N is large and T is fixed; and (ii) Hahn and Kuersteiner's (2002) bound when both N and T are large.

  • Single Book
  • Cited by 17
  • 10.1596/1813-9450-4632
Instrumental Variables Regressions With Honestly Uncertain Exclusion Restrictions
  • May 1, 2008
  • Aart Kraay

The validity of instrumental variables (IV) regression models depends crucially on fundamentally untestable exclusion restrictions. Typically exclusion restrictions are assumed to hold exactly in the relevant population, yet in many empirical applications there are reasonable prior grounds to doubt their literal truth. In this paper I show how to incorporate prior uncertainty about the validity of the exclusion restriction into linear IV models, and explore the consequences for inference. In particular I provide a mapping from prior uncertainty about the exclusion restriction into increased uncertainty about parameters of interest. Moderate prior uncertainty about exclusion restrictions can lead to a substantial loss of precision in estimates of structural parameters. This loss of precision is relatively more important in situations where IV estimates appear to be more precise, for example in larger samples or with stronger instruments. I illustrate these points using several prominent recent empirical papers that use linear IV models.

  • Research Article
  • Cited by 44
  • 10.2139/ssrn.937943
The Reduced Form: A Simple Approach to Inference with Weak Instruments
  • Nov 28, 2007
  • SSRN Electronic Journal
  • Victor Chernozhukov + 1 more

In this paper, we consider simple methods for performing robust inference in linear instrumental variables models with weak instruments. We focus on inference based on the reduced form and show that conventional inference procedures about the relevance of the instruments excluded from the structural equation lead to tests of the structural parameters which are valid even if the instruments are weakly correlated with the endogenous variables. The use of standard heteroskedasticity- and autocorrelation-consistent covariance matrix estimators in constructing these tests also results in inference which is robust to heteroskedasticity, autocorrelation, and weak instruments. We provide a simulation experiment demonstrating that the procedures have correct size and good power in many relevant situations, and we conclude with an empirical example.
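
The reduced-form logic can be sketched directly: under H0: beta = beta0, the quasi-residual y - beta0*x is unrelated to the excluded instruments, so a joint significance test in that regression is robust to weak instruments. The snippet below is illustrative and uses a plain homoskedastic covariance; the paper's point is that HC/HAC versions make the test additionally robust to heteroskedasticity and autocorrelation.

```python
import numpy as np
from scipy import stats

def reduced_form_test(y, x, Z, beta0=0.0):
    """Weak-instrument-robust test of H0: beta = beta0 via the reduced form.
    Homoskedastic covariance for brevity; swap in an HC/HAC estimator for
    robustness to heteroskedasticity and autocorrelation."""
    n, k = Z.shape
    e = y - beta0 * x                            # quasi-residual under H0
    W = np.column_stack([np.ones(n), Z])
    coef = np.linalg.lstsq(W, e, rcond=None)[0]
    resid = e - W @ coef
    sigma2 = resid @ resid / (n - k - 1)
    V = sigma2 * np.linalg.inv(W.T @ W)          # coefficient covariance
    g = coef[1:]                                 # excluded-instrument coefficients
    wald = g @ np.linalg.solve(V[1:, 1:], g)
    return wald, stats.chi2.sf(wald, k)
```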
