Massive increases in American imprisonment since 1974 may have exacerbated inequality among adult men [1–7]. Because the risk of parental imprisonment has increased in tandem with the risk of imprisonment for adult men [8], mass imprisonment might also have exacerbated childhood inequality, but only if parental imprisonment harms children [9–11]. The excellent work by Roettger and colleagues [12] fits into this area. By linking paternal incarceration with children's substance use trajectories, they extend knowledge about the effects of paternal incarceration in an important, and too long ignored, direction [12]. Yet the elephant in the room remains: does parental incarceration cause poor child outcomes? On this point, the authors are candid: too many obstacles to causal inference persist to be certain [12]. Unfortunately, candor (like correlation) takes us but part of the way to establishing causality.

In this commentary, I discuss the substantial obstacles to causal inference in this area, with attention to how future work might try to overcome such obstacles. Discussions of obstacles to causal inference are often onerous, but because removing a (possibly) antisocial father from the household seems at first glance at least as likely to benefit children (or have no effect on them) as to harm them, deciphering whether these associations are causal is no mere academic exercise.

Arguments suggesting that paternal incarceration does not cause poor child outcomes fall into two main groups. The first suggests that stable differences between individuals drive both the risk of imprisonment and poor child outcomes. Most research in this area uses covariate adjustment, propensity scores or fixed effects to address such concerns [5,9–13]. Covariate adjustment and propensity scores both approximate causal relationships only if all factors associated with imprisonment and child outcomes are controlled.
Unfortunately, data sets rarely contain ideal measures of all such factors, and often contain no measures of some, so these methods are unlikely to produce causal estimates. Roettger and colleagues, for instance, include far more extensive controls than most research in this area, yet their only measures of parental drug and alcohol abuse are maternal self-reports of binge drinking [12]. (Some data sets do contain excellent measures, but these data sets rarely represent contemporary children [10].) Another method for dealing with such concerns is a fixed-effects model, which controls all bias due to stable characteristics. As Roettger and colleagues note [12], however, such models can be used only when changes in the explanatory and dependent variables can be linked. Thus, as most incarcerations preceded the first measure of drug use, they cannot use these models [12].

As if this were not bad enough, researchers must contend with a second objection: that something else changed prior to the incarceration, explaining any association. This objection is especially troubling because it suggests that estimates derived from a model adjusting for all stable traits are causal only if no such changes preceded the incarceration. Unfortunately, it is nearly impossible to address this concern using survey data, and no research in this area has done so.

So how do we tackle the elephant? I offer three imperfect suggestions. First, surveys should include more information on factors that shape both the risk of parental incarceration and child outcomes. (If these measures capture change, all the better.) A full list of such measures is beyond the scope of this commentary, but research in this and related areas indicates that better measures of parental criminal justice contact, criminality, drug use and abuse, and social marginalization would take us part of the way towards overcoming omitted variable bias [5,9–11,14].
Although the substantial investment such data collection requires may not have passed a cost–benefit test in 1974, it probably passes it now that the risk of parental imprisonment for black children exceeds 25% [8]. Secondly, researchers considering questions such as those considered by Roettger and colleagues [12] might try to test their hypotheses using data on slightly younger children, because doing so would enable them to use the fixed-effects models that strengthen causal inference even without improvements in survey measures. Finally, researchers should make much greater efforts to exploit natural variation in incarceration in order to provide stronger tests of causality. Such research designs are common in other fields such as economics and experimental psychology [15] and probably have the most potential for helping researchers in this field to generate causal estimates.

Sadly, following these suggestions will not fully remove the elephant. None the less, recent work in this area [5,9–13] provides great motivation for overcoming obstacles to causal inference, as the associations it presents suggest that mass incarceration has the potential to exacerbate inequality not just among men, but also among the children they leave behind.

Declarations of interest: None.