Abstract

"Statistical thinking will one day be as necessary for effective citizenship as the ability to read and write." —H. G. Wells

During the decades following this prediction, statistical thinking has emerged as an increasingly important element of American public discourse. By now, quantitative data and statistical reasoning are regularly featured in media coverage of issues ranging from political polling to economic forecasting. Gradually, the general population is becoming more sophisticated in its statistical thinking. For medical disciplines like gastroenterology, the last few decades have also brought increasing rigor to our statistical thinking. The standards of statistical literacy in our field are advancing rapidly, and this trend has implications for this journal's authors and its readership.

What accounts for the rising level of statistical rigor in fields like ours? One explanation is simply better technology. When I worked on my first research paper in 1971, my conclusions were supported by P values calculated using the t-test. These calculations were cumbersome whether they were done by hand or with a state-of-the-art mechanical calculator. More complex statistical tests (e.g., logistic regressions) were rarely performed because they required a mainframe computer. By contrast, such tests are now easily handled by widely available software and hardware. Thus, technology is no longer a barrier to performing the statistical tests needed for optimal data analysis.

The demand for statistical rigor is also being driven by the trend toward hypothesis-generating, rather than hypothesis-driven, experimentation. Genetic, genomic, and proteomic studies often produce large amounts of primary data and require especially complex and cautious data analysis. For our particular discipline, one additional factor promoting statistical rigor has been an infusion of younger gastroenterologists who have studied advanced statistics as part of their specialized training in epidemiology, outcomes research, and related fields.

As the statistical rigor of our field increases, clinical and laboratory-based investigators are setting higher standards for themselves regarding data analysis. For many authors, it may require special effort to reach the level of statistical rigor expected by some of our recently trained colleagues. However, this effort can bring large benefits by reducing the risk that carefully collected data will be improperly analyzed and thereby lead to erroneous conclusions. All will agree that we should do what we can to reduce avoidable data analysis errors. To this end, it may help to consider some common data analysis errors encountered while reviewing recent manuscripts.

One particularly common error is assuming that data conform to a normal distribution when they are actually bimodal or skewed. When data sets are not normally distributed, median values are often more representative than mean values, the t-test is not valid for intergroup comparisons, and nonparametric tests should be used instead.

A second common error concerns analyses involving multiple tests or comparisons. In these situations, the significance level should be adjusted for chance outcomes (e.g., by correcting for multiple comparisons). A related problem occurs when unexpected findings lead to additional comparisons or subgroup analyses beyond those in the original experimental design. Manuscripts reporting such post hoc analyses should specifically distinguish between the study's primary hypothesis and secondary questions arising after data collection.
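To make the first two errors concrete, here is a minimal sketch of how they might be handled in practice, written in Python with SciPy and statsmodels; the data, group names, and per-endpoint P values are hypothetical and are shown only for illustration.

import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
# Hypothetical serum marker levels in two patient groups; skewed, not normal.
group_a = rng.lognormal(mean=1.0, sigma=0.8, size=40)
group_b = rng.lognormal(mean=1.3, sigma=0.8, size=40)

# Check the normality assumption before reaching for the t-test.
w_stat, p_normality = stats.shapiro(group_a)
print("Shapiro-Wilk p (group A):", p_normality)

# For skewed data, report medians and use a nonparametric test
# (here, Mann-Whitney U) instead of the t-test.
print("median A:", np.median(group_a), "median B:", np.median(group_b))
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print("Mann-Whitney U p:", p_value)

# When several endpoints are compared, adjust for multiple comparisons,
# e.g., with the Holm procedure.
raw_pvalues = [0.04, 0.01, 0.03, 0.20]  # hypothetical per-endpoint P values
reject, adjusted, _, _ = multipletests(raw_pvalues, alpha=0.05, method="holm")
print("Holm-adjusted p values:", adjusted)

The particular tests shown here (Shapiro-Wilk, Mann-Whitney U, Holm correction) are common choices rather than prescriptions; other tests or correction procedures may be more appropriate for a given study design.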
A third type of error can occur when negative conclusions are reached on the basis of a limited number of tested samples. In these cases, groups may seem indistinguishable even though a larger study would show significant differences. This highlights the importance of power calculations both when planning experiments and when interpreting negative data.

Finally, a surprisingly common problem with data analysis involves failing to consider whether the magnitude of an effect is large enough to be important. For novel findings to be conclusive and useful, statistical significance is generally necessary but not sufficient. When studying large populations or using refined assays, we often can detect small effects with high levels of statistical significance. The key point is that once we are convinced that an effect is statistically significant, it is time to ask whether the effect we are studying is large enough to matter. (A brief sketch of such power and effect-size checks appears below.)

Gregor Mendel's example can teach us much about the relationship between data collection and data analysis. Mendel recorded an estimated 21,000 observations during a decade of devoted effort. In spite of all this effort, these data had little intrinsic importance. Fortunately, Mendel then analyzed his data quantitatively, and this is what enabled him to recognize and test the more general patterns revealed by his data. Thus, Mendel's discovery of hereditary principles required more than a careful recording of data; it required quantitative and statistical thinking.

Throughout the world, gastroenterology research groups and clinical units are collecting data of the highest possible quality. By bringing the same level of rigor and thoughtfulness to the analysis of our data, we can ensure that we reach sound conclusions and that our research contributions will be as useful as possible.
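Returning to the power and effect-size points above, the following is a minimal sketch of a prospective sample-size calculation and a retrospective check of the smallest detectable effect, again in Python with statsmodels; the target effect size, power, and group sizes are hypothetical assumptions chosen only for illustration.

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Before the study: how many subjects per group are needed to detect a
# standardized effect of 0.5 with 80% power at alpha = 0.05?
n_per_group = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print("required n per group:", round(n_per_group))

# After a "negative" study: with 20 subjects per group, how large an effect
# could plausibly have been missed?
detectable = analysis.solve_power(nobs1=20, power=0.8, alpha=0.05)
print("smallest effect detectable with n = 20 per group:", round(detectable, 2))

# Statistical significance alone is not enough: a tiny effect can reach
# P < .05 in a large sample, so report the effect size as well and ask
# whether it is clinically meaningful.

Whether an effect of a given size matters is a clinical judgment rather than a statistical one; the calculation only indicates what the study could or could not have detected.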
