Abstract

A major challenge in weather research is associated with the size of the data sample from which evidence can be presented in support of some hypothesis. This issue arises often in severe storm research, since severe storms are rare events, at least in any one place. Although large numbers of severe storm events (such as tornado occurrences) have been recorded, some attempts to reduce the impact of data quality problems within the record of tornado occurrences also can reduce the sample size to the point where it is too small to provide convincing evidence for certain types of conclusions. On the other hand, by carefully considering what sort of hypothesis to evaluate, it is possible to find strong enough signals in the data to test conclusions relatively rigorously. Examples from tornado occurrence data are used to illustrate the challenge posed by the interaction between sample size and data quality, and how it can be overcome by being careful to avoid asking more of the data than what they legitimately can provide. A discussion of what is needed to improve data quality is offered.

Highlights

  • In order to provide a concrete example of how to recognize problems associated with small sample sizes, the database on occurrence of tornadoes that is maintained by the Storm Prediction Center (SPC) is used

  • It might be tempting to consider a 33-year period of record at least marginally long enough to use for detecting cycles with periods of a few years

  • The presence of secular trends in the tornado occurrence data means that long periods of record contain artifacts that are difficult to deconvolve from real meteorological information
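
The caution about a 33-year record can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes annual values that are pure Gaussian white noise (no real cycle at all), and the function names `periodogram`, `largest_peak_share`, and `mean_peak_share` are hypothetical. It measures how much of the spectral power the single largest periodogram peak captures in a 33-point series.

```python
import math
import random


def periodogram(series):
    """Power at Fourier frequencies k = 1 .. (n-1)//2 of the mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    powers = []
    for k in range(1, (n - 1) // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        powers.append((re * re + im * im) / n)
    return powers


def largest_peak_share(series):
    """Fraction of the total periodogram power held by the largest peak."""
    powers = periodogram(series)
    return max(powers) / sum(powers)


def mean_peak_share(n_years=33, n_trials=2000, seed=0):
    """Average largest-peak share over many simulated noise-only records.

    Assumption: each 'year' is an independent Gaussian draw, so any apparent
    cycle in a simulated record is a sampling artifact.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        total += largest_peak_share(noise)
    return total / n_trials


if __name__ == "__main__":
    n_freqs = (33 - 1) // 2
    print("even split per frequency:", 1 / n_freqs)
    print("mean share of largest peak in pure noise:", mean_peak_share())
```

With 16 resolvable frequencies, an even split would give each one 1/16 of the power, yet under this white-noise assumption the largest peak should typically hold roughly a fifth of it. A record this short will therefore almost always show a "dominant cycle" even when none exists, which is why a peak alone is weak evidence.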

Summary

Introduction

For many research topics in meteorology, sample size is an important issue. Tornado-related research, in particular, includes many topics for which sample sizes are insufficient to support certain types of conclusions. There are numerous reasons for this when studying the historical record of tornado occurrences, but at times the sheer size of the dataset can convince the unwary that sufficient data are available to draw robust conclusions when that is not necessarily the case. The goal of this paper is to illustrate how sample size limitations, in combination with secular trends in the tornado occurrence data, can limit the ability to draw valid conclusions from tornado occurrence datasets. Section 2 provides an example illustrating the problem, and section 3 shows how another strong signal can be found in what is essentially the same dataset.

An illustration of small sample size problems
26 August 2007
An example of a strong signal in the data
Findings
Discussion and conclusions
