A Comment on Statistical Significance and Standards of Proof

Michael S. Pardo*

* Henry Upson Sims Professor of Law, University of Alabama School of Law.

It is a pleasure to have the opportunity to comment on “Error Costs, Legal Standards of Proof, and Statistical Significance” by Michelle M. Burtis, Jonah B. Gelbach, and Bruce H. Kobayashi (2017). The article provides a clear, sophisticated, and persuasive discussion of the relationships between conventional tests of statistical significance and legal standards of proof. Although tests of statistical significance and legal standards of proof are, in theory, each concerned with similar considerations regarding the types and frequency of inferential errors, the article in my view is correct to argue that they are analytically distinct and that one cannot be mapped onto the other in a straightforward manner. Thus, I agree with their general conclusion: “There is no one level of statistical significance that generally corresponds to the legal standard of proof.” The article’s related discussion of different types of statistical tests is also persuasive. I thus also agree with their general conclusion that likelihood ratio tests (comparing competing explanations supporting each side) map onto legal standards of proof more closely than conventional tests with a fixed significance level.¹

Despite the fact that courts, litigants, and academics have sometimes argued or assumed that evidence in the form of conventional tests of statistical significance (at the typical .05 level, or at some other level) is necessary or sufficient to satisfy a legal standard of proof, the article illustrates clearly the problems with these arguments and assumptions.
Here is the crux of the analysis: conventional tests of statistical significance typically focus on one type of error (false positives) and are less concerned with false negatives. In other words, the thumb is on the scale of not declaring that a relationship exists (e.g., causation or discrimination) unless it is warranted or justified, but there is less concern with failing to declare a relationship when one does, in fact, exist. Legal standards of proof, by contrast, are concerned with both types of errors and their costs. The article demonstrates how likelihood ratio tests that focus on both types of error are likely to improve accuracy and reduce overall error costs compared with tests based on fixed significance levels.²

In this comment, my aim is to situate the article’s analysis in the academic literature and debates on legal standards of proof (focusing on the “preponderance of the evidence” standard). Here is a brief, somewhat simplified picture of standards of proof and their underlying rationales.³ Standards of proof focus on the related goals of accuracy and allocating the risk of error between the parties. Common assumptions about the preponderance standard are that it aims at minimizing total errors and allocating the risk of error roughly evenly among civil litigants. These assumptions are justified, in part, by the further assumption that the costs of each type of error will be roughly similar (and thus reducing total errors is likely to reduce total error costs). Moreover, equalizing the risk of error reflects a principle of equality among civil litigants (see Redmayne 1999, 171–74; Solum 2004, 286–89). Asymmetric error costs then justify higher proof standards (e.g., “clear and convincing evidence” and “beyond a reasonable doubt”), which attempt to skew the risk of error away from false positives.

This simple picture gives rise to several distinct theoretical issues that have generated disagreements among evidence scholars.
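
The simple picture above can be made concrete with a short sketch. It assumes a standard decision-theoretic reading that is not spelled out in this comment: the fact finder holds a probability p that the plaintiff's claim is true, a false positive (wrongly finding for the plaintiff) carries one cost, a false negative (wrongly finding for the defendant) carries another, and the standard of proof is the value of p above which finding for the plaintiff has the lower expected error cost. The function names and the cost figures are hypothetical and chosen only for illustration.

```python
def proof_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability above which finding for the plaintiff minimizes expected error cost.

    Finding for the plaintiff risks a false positive: expected cost (1 - p) * cost_fp.
    Finding for the defendant risks a false negative: expected cost p * cost_fn.
    The two expected costs are equal at p = cost_fp / (cost_fp + cost_fn).
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("error costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def expected_error_cost(p: float, find_for_plaintiff: bool,
                        cost_fp: float, cost_fn: float) -> float:
    """Expected error cost of a decision, given probability p that the claim is true."""
    if find_for_plaintiff:
        return (1 - p) * cost_fp  # the decision is wrong only if the claim is false
    return p * cost_fn            # the decision is wrong only if the claim is true


# Equal error costs (the assumption behind the preponderance standard).
print(proof_threshold(1.0, 1.0))

# Illustrative asymmetry only: a false positive assumed ten times as costly.
print(proof_threshold(10.0, 1.0))
```

With equal costs the threshold is .5, which matches the common reading of the preponderance standard. With the purely illustrative tenfold cost on false positives the threshold rises to roughly .91, which is the sense in which asymmetric error costs are said to justify higher standards.
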
I discuss three such issues as they relate to the analysis in the article.

1. To What Extent Are the Standards Comparative?

Standards of proof are sometimes interpreted as probabilistic thresholds (e.g., > .5) and sometimes as comparative assessments (e.g., a likelihood ratio > 1). These interpretations are not the same, and they imply different outcomes. Suppose a plaintiff’s theory of what happened is .4 probable and the defendant’s alternative theory is .2 probable. Has the plaintiff proven its case by a preponderance of the evidence? Under a fixed > .5 threshold the plaintiff loses; under a comparative reading (.4 versus .2, a ratio of 2 to 1) the plaintiff wins. The issue concerns what to do with the unknown probability space. Should all the unknown possibilities go against the plaintiff, or should they be divided evenly among the parties? A number of scholars (myself included) have argued, from different perspectives, that a comparative account better explains the legal standards and better fits with their assumed goals regarding accuracy and the risk of error (see, e.g., Pardo and Allen 2008; Cheng 2013; Clermont 2013, 149). Although the article focuses on general defense explanations (e.g., “no discrimination”), the likelihood ratio framework appears to likewise embrace a comparative interpretation of legal standards. One question the article leaves open, however, is how the analysis would change in a situation in which the plaintiff and the defendant are each offering more specific, alternative explanations. In other words, suppose the defendant’s explanation is not the null hypothesis (as the article assumes) but instead involves a more specific explanation for the actions. This is related to, but not necessarily the same as, the distinction that the article raises between simple and composite explanations (Burtis et al. 2017). Neither side may be offering a composite of alternatives; there may just be two simple theories that do not fill the entire space of possibilities.

2.
What Are the Criteria That Underlie the Standards?

There is substantial debate on whether the thresholds employed by legal standards of proof are probabilistic thresholds or whether they depend on other criteria (e.g., explanatory thresholds; see Pardo and Allen 2008; Allen and Pardo, forthcoming). The article assumes they are probabilistic thresholds of some sort (expressed by either fixed probability thresholds or likelihood ratios), but there is a wrinkle with this assumption, with potentially deep consequences. Legal standards of proof apply to individual elements of claims, not to claims as a whole, so a verdict for a plaintiff who has proven each element may still create suboptimal results in terms of errors and error costs (see Allen and Jehl 2003; Cohen 1977, 58–67). For example, in a two-element claim, A and B, a plaintiff will win under the preponderance standard by proving each element to .6. But if A and B are probabilistically independent, then the plaintiff’s claim is only .36 probable (.6 × .6). The effect gets worse with more elements.⁴ Once again, this is related to, but not quite the same as, the issue of simple versus composite explanations that the authors note (Burtis et al. 2017). Even simple hypotheses offered by a party may contain multiple legal elements (e.g., “I suffered an adverse employment action because of race” or “The defendant’s product caused my injuries”). In sum, the analysis in the article applies most clearly to legal disputes that involve one legal element and one contested factual issue.

3. What Is the Relationship between Items of Evidence and Standards of Proof?

More generally, the article illustrates well the need to keep separate, as a conceptual matter, evidence, on one hand, and the standard of proof, on the other.
The different types of statistical tests that are examined (those based on fixed significance levels and likelihood ratios) are essentially different types of evidence, and the article provides compelling reasons why the tests are likely to differ in terms of their probative value in proving contested factual issues. As a general matter, the relationship between any item of evidence and the standard of proof is complex, and the probative value of evidence will be defeasible (depending on the other evidence, the specific context, and the contrasting claims and arguments of the parties). This is so for statistical tests regarding discrimination, for relative-risk analysis in proving causation in toxic-tort cases under the preponderance standard, and for random-match probabilities in proving identity in criminal cases under the beyond-a-reasonable-doubt standard, to name just a few vexing examples. This is also true for eyewitness testimony, confessions, or any other nonstatistical evidence. Thus, one lesson to take from the article’s analysis is that various doctrinal rules of thumb (e.g., the 80% rule in measuring disparate impact [Burtis et al. 2017] or requiring a relative risk of ≥ 2.0 in tort cases⁵), or various presumptions that shift a burden of production or persuasion based on particular items of evidence (see Allen et al. 2016, 821–35, 857–59), will always be imperfect guides to legal standards of proof rather than capturing something essential about the relationship between the evidence and the standard of proof.

Notes

1 Burtis et al. (2017) note that “reconciling legal standards of proof and statistical thresholds can be achieved by replacing fixed significance levels with likelihood ratio tests.” In addition to the general conclusions, I found nearly all of the analysis to be persuasive, taking issue with only minor details (for example, the assumption that the “presumption of innocence” in criminal cases refers to prior odds of guilt).
For a contrary view, see Laudan (2006, 90–109).

2 Burtis et al. (2017) illustrate the differences between the tests with an employment-discrimination example. Their analysis of the example should be required reading for any courts, litigants, or academics arguing that there is a clear relationship between statistical significance and legal standards of proof.

3 The assumptions in this picture are each contested, but they are nevertheless common. For a more detailed discussion, see Allen et al. (2016, 803–59).

4 Probabilistic dependence among elements creates similar problems (see Allen and Pardo, forthcoming). The conjunction effect also applies to comparative probabilistic standards. For example, suppose that in a two-element claim the plaintiff proves one element to .9 and the other to .4, and the probability of the defendant’s alternative theory on the elements is .1 and .6. The plaintiff will lose, because the second element falls short of the alternative (.4 vs. .6), despite offering a theory that is six times more likely than the alternative (.36 vs. .06).

5 According to Gold (2011), “courts have equated more than a doubling of relative risk in an exposed group to a more-likely-than-not probability of causation in an exposed individual plaintiff” (1523).

References

Allen, Ronald J., and Sarah A. Jehl. 2003. “Burdens of Persuasion in Civil Cases: Algorithms vs. Explanations.” Michigan State Law Review 2003:893–944.

Allen, Ronald J., and Michael S. Pardo. Forthcoming. “Relative Plausibility and Its Critics.” International Journal of Evidence and Proof.

Allen, Ronald J., Eleanor Swift, David S. Schwartz, Michael S. Pardo, and Alex Stein. 2016. An Analytical Approach to Evidence: Text, Problems and Cases, 6th ed. New York: Wolters Kluwer.

Burtis, Michelle M., Jonah B. Gelbach, and Bruce H. Kobayashi. 2017.
“Error Costs, Legal Standards of Proof, and Statistical Significance.” Supreme Court Economic Review 25:1–58.

Cheng, Edward K. 2013. “Reconceptualizing the Burden of Proof.” Yale Law Journal 122:1254–79.

Clermont, Kevin M. 2013. Standards of Decision in Law. Durham, NC: Carolina Academic.

Cohen, L. Jonathan. 1977. The Probable and the Provable. Oxford: Clarendon.

Gold, Steve C. 2011. “The ‘Reshapement’ of the False Negative Asymmetry in Toxic Tort Causation.” William Mitchell Law Review 37:1507–81.

Laudan, Larry. 2006. Truth, Error, and Criminal Law: An Essay in Legal Epistemology. Cambridge: Cambridge University Press.

Pardo, Michael S., and Ronald J. Allen. 2008. “Juridical Proof and the Best Explanation.” Law and Philosophy 27:223–68.

Redmayne, Mike. 1999. “Standards of Proof in Civil Litigation.” Modern Law Review 62:167–95.

Solum, Lawrence B. 2004. “Procedural Justice.” Southern California Law Review 78:181–321.

Supreme Court Economic Review, Volume 25, 2017. Sponsored by the Antonin Scalia Law School, George Mason University.
Article DOI: https://doi.org/10.1086/699728
Published online December 07, 2018.
© 2018 by the University of Chicago. All rights reserved.