Small Sample Size and Data Quality Issues Illustrated Using Tornado Occurrence Data

Charles A. Doswell III

doi:10.55599/ejssm.v2i5.10

Abstract

A major challenge in weather research is associated with the size of the data sample from which evidence can be presented in support of some hypothesis. This issue arises often in severe storm research, since severe storms are rare events, at least in any one place. Although large numbers of severe storm events (such as tornado occurrences) have been recorded, some attempts to reduce the impact of data quality problems within the record of tornado occurrences also can reduce the sample size to the point where it is too small to provide convincing evidence for certain types of conclusions. On the other hand, by carefully considering what sort of hypothesis to evaluate, it is possible to find strong enough signals in the data to test conclusions relatively rigorously. Examples from tornado occurrence data are used to illustrate the challenge posed by the interaction between sample size and data quality, and how it can be overcome by being careful to avoid asking more of the data than what they legitimately can provide. A discussion of what is needed to improve data quality is offered.

Highlights

In order to provide a concrete example of how to recognize problems associated with small sample sizes, the database on occurrence of tornadoes that is maintained by the Storm Prediction Center (SPC) is used.
It might be tempting to consider a 33-year period of record at least marginally long enough to use for detecting cycles with periods of a few years.
The presence of secular trends in the tornado occurrence data means that long periods of record contain artifacts that are difficult to deconvolve from real meteorological information.
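
The caution about reading cycles into a 33-year record can be illustrated with a minimal sketch. It does not use the SPC tornado data or reproduce the paper's analysis. It assumes synthetic annual values drawn as Gaussian white noise, which contain no cycle by construction, and the function names (`periodogram`, `peak_share`, `simulate_noise_peaks`) are hypothetical. The sketch measures how much of the total spectral power the single largest periodogram peak captures in such cycle-free series.

```python
import cmath
import random


def periodogram(series):
    """Return (period_in_years, power) pairs for the non-zero Fourier frequencies."""
    n = len(series)
    mean = sum(series) / n
    anomalies = [x - mean for x in series]  # remove the mean before transforming
    result = []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier coefficient at frequency k cycles per record length
        coeff = sum(a * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, a in enumerate(anomalies))
        result.append((n / k, abs(coeff) ** 2 / n))
    return result


def peak_share(series):
    """Fraction of total periodogram power held by the single largest peak."""
    powers = [p for _, p in periodogram(series)]
    return max(powers) / sum(powers)


def simulate_noise_peaks(n_years=33, n_trials=500, seed=0):
    """Peak shares for many synthetic records of pure noise (assumed Gaussian)."""
    rng = random.Random(seed)
    shares = []
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        shares.append(peak_share(noise))
    return shares


shares = simulate_noise_peaks()
median_share = sorted(shares)[len(shares) // 2]
print(f"median share of power in the largest peak (pure noise): {median_share:.2f}")
```

With only 16 resolvable frequencies in a 33-value record, some frequency always holds an above-average share of the power, so an apparent "dominant period" appears even when no cycle exists. A peak found in a real record of this length would therefore need to be compared against a noise baseline of this kind before being interpreted as a cycle.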

Summary

Introduction

For many research topics in meteorology, the issue of sample size is an important one. Tornado-related research, in particular, has many topics where sample sizes are insufficient to draw certain types of conclusions. There are numerous reasons for this to be an issue when studying the historical record of tornado occurrences, but at times the sheer size of the dataset can convince the unwary that sufficient data are available to draw robust conclusions, when that is not necessarily the case. The goal of this paper is to illustrate how sample size limitations, in combination with secular trends in the tornado occurrence data, can limit the ability to draw valid conclusions from tornado occurrence datasets. Section 2 provides an example illustrating the problem, and section 3 shows how another strong signal can be found in what is essentially the same dataset.

An illustration of small sample size problems

26 August 2007

An example of a strong signal in the data

Findings

Discussion and conclusions

Journal: E-Journal of Severe Storms Meteorology	Publication Date: Sep 28, 2021
Citations: 26	License type: CC BY 4.0
