Computing and graphing probability values of Pearson distributions: a SAS/IML macro

Qing Yang, Xinming An, Wei Pan

doi:10.1186/s13029-019-0076-2

Abstract

Background: Any empirical data can be approximated by one of the Pearson distributions using the first four moments of the data (Elderton WP, Johnson NL. Systems of Frequency Curves. 1969; Pearson K. Philos Trans R Soc Lond Ser A. 186:343–414 1895; Solomon H, Stephens MA. J Am Stat Assoc. 73(361):153–60 1978). Thus, Pearson distributions make statistical analysis possible for data with unknown distributions. There are both extant, old-fashioned in-print tables (Pearson ES, Hartley HO. Biometrika Tables for Statisticians, vol. II. 1972) and contemporary computer programs (Amos DE, Daniel SL. Tables of percentage points of standardized Pearson distributions. 1971; Bouver H, Bargmann RE. Tables of the standardized percentage points of the Pearson system of curves in terms of β1 and β2. 1974; Bowman KO, Shenton LR. Biometrika. 66(1):147–51 1979; Davis CS, Stephens MA. Appl Stat. 32(3):322–7 1983; Pan W. J Stat Softw. 31(Code Snippet 2):1–6 2009) available for obtaining percentage points of Pearson distributions corresponding to certain pre-specified percentages (or probability values; e.g., 1.0%, 2.5%, 5.0%, etc.). However, they are of little use in statistical analysis, because we have to rely on unwieldy second difference interpolation to calculate the probability value of a Pearson distribution corresponding to a given percentage point, such as an observed test statistic in hypothesis testing.

Results: The present study develops a SAS/IML macro program that identifies the appropriate type of Pearson distribution from either an input dataset or the values of the four moments, and then computes and graphs probability values of Pearson distributions for any given percentage points.

Conclusions: The SAS macro program returns accurate approximations to Pearson distributions and can help researchers efficiently conduct statistical analysis on data with unknown distributions.

Highlights

Any empirical data can be approximated by one of the Pearson distributions using the first four moments of the data
Contemporary computer programs [5,6,7,8,9] provide a means of obtaining percentage points of Pearson distributions corresponding to certain pre-specified percentages
They are of little use in statistical analysis because we have to employ unwieldy second difference interpolation for both skewness √β1 and kurtosis β2 to calculate a probability value of a Pearson distribution corresponding to a given percentage point, such as an observed test statistic in hypothesis testing
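
To make the idea of choosing a Pearson type from the first four moments concrete, the following is a minimal Python sketch, not the SAS/IML macro itself. It assumes the classical κ criterion, κ = β1(β2 + 3)² / [4(4β2 − 3β1)(2β2 − 3β1 − 6)], where β1 is the squared skewness and β2 the kurtosis, and it assumes population (divide-by-n) moment estimates. The function names `beta_coefficients` and `pearson_type` and the tolerance `tol` are hypothetical and chosen for illustration only; the macro may handle boundary cases differently.

```python
def beta_coefficients(data):
    """Return (beta1, beta2) from divide-by-n central moments.

    beta1 is the squared skewness m3^2 / m2^3; beta2 is the kurtosis m4 / m2^2.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2


def pearson_type(beta1, beta2, tol=1e-8):
    """Classify (beta1, beta2) into a Pearson type with the kappa criterion.

    Illustrative only: tol is an assumed tolerance for the boundary cases
    (normal, Type III, Type V), which are exact equalities in theory.
    """
    if beta2 <= beta1 + 1:
        # No distribution has beta2 below beta1 + 1.
        raise ValueError("beta2 must exceed beta1 + 1")
    d = 2 * beta2 - 3 * beta1 - 6
    if abs(d) < tol:
        # On the line 2*beta2 - 3*beta1 - 6 = 0: normal if symmetric, else Type III.
        return "0" if beta1 < tol else "III"
    if beta1 < tol:
        # Symmetric case (kappa = 0): Type II below the normal kurtosis, Type VII above.
        return "II" if beta2 < 3 else "VII"
    kappa = beta1 * (beta2 + 3) ** 2 / (4 * (4 * beta2 - 3 * beta1) * d)
    if kappa < 0:
        return "I"
    if abs(kappa - 1) < tol:
        return "V"
    return "IV" if kappa < 1 else "VI"


if __name__ == "__main__":
    b1, b2 = beta_coefficients([1, 2, 3, 4, 5])
    print(b1, b2, pearson_type(b1, b2))
```

Once the type is known, the corresponding density can be integrated up to the observed statistic to obtain a probability value directly, which is the step that removes the need for table interpolation.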

Summary

Results

To evaluate the accuracy of the SAS/IML macro program for computing and graphing probability values of Pearson distributions, the parameters of the approximated Pearson distributions calculated by the macro were first compared with the corresponding ones in [1]. The absolute differences between the parameters calculated by the SAS/IML macro and those from [1]’s tables are all very small, with almost all of them less than .001 and a few less than .019. The probability values computed by the SAS/IML macro were then evaluated using the percentage points in [4]’s Table 32 (p. 276) corresponding to probability values of 2.5% and 97.5%, for illustration purposes only.

Conclusions

Background

Discussion