Abstract

Histograms and frequency tables are fundamental tools for describing continuous variables, or discrete variables with many values. Most statistical programs offer little flexibility, do not explicitly state the rules they use to construct histograms, and give no guidelines for constructing interval tables. With some programming or the appropriate procedures, however, all of this can be done in Excel, MATLAB, and R. The objective of this methodological article is to provide an R script that calculates the number and width of class intervals using eight rules that yield a uniform width (four based on the sample size and four based on an optimal width). The script automates the selection of the rule and produces an interval table and a histogram with overlaid density and normal curves. In addition, symmetry is assessed with the D’Agostino test, mesokurtosis with the Anscombe-Glynn test, and normality with the Lilliefors, Anderson-Darling, and Shapiro-Francia tests. The script also calculates three rules that yield variable widths: one for samples of 25 to 39 data points (multiple of 5) and two for samples of at least 40 data points (Mann-Wald and Moore). Once one of these three rules is chosen, it is used to check normality with the likelihood ratio test. An optimal histogram provided by R's base library is also computed. The script is applied to two examples and adapted to small samples (< 25 data points) in a third example. It is concluded that the script can be of practical and didactic use.
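To make the idea of uniform-width rules concrete, here is a minimal sketch in Python. It is not the article's R script. The abstract does not name its eight rules, so the ones shown (Sturges, square root, and Rice for the number of classes; Scott and Freedman-Diaconis for the width) are common textbook rules chosen as an assumption. The function names and the convention of closing the last class on the right are also hypothetical.

```python
import math
import statistics


def bins_from_sample_size(n):
    """Number of classes from rules that depend only on the sample size n.

    These three rules are assumed for illustration; the article's own
    four sample-size rules are not listed in the abstract.
    """
    return {
        "sturges": math.ceil(1 + math.log2(n)),
        "square_root": math.ceil(math.sqrt(n)),
        "rice": math.ceil(2 * n ** (1 / 3)),
    }


def widths_from_spread(data):
    """Class width from rules that use the spread of the data.

    Scott and Freedman-Diaconis are assumed examples of "optimal width"
    rules; they may or may not be among the article's four.
    """
    n = len(data)
    s = statistics.stdev(data)                    # sample standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)   # quartiles (default method)
    return {
        "scott": 3.49 * s * n ** (-1 / 3),
        "freedman_diaconis": 2 * (q3 - q1) * n ** (-1 / 3),
    }


def frequency_table(data, k):
    """Split the range of data into k equal-width classes and count values.

    Classes are closed on the left; the last class is also closed on the
    right so the maximum is counted (an assumed convention).
    Returns a list of (lower limit, upper limit, frequency) tuples.
    """
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all values are equal; no interval width can be defined")
    width = (hi - lo) / k
    counts = [0] * k
    for x in data:
        idx = min(int((x - lo) / width), k - 1)   # clamp the maximum into the last class
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]
```

For a sample of 100 observations, `bins_from_sample_size(100)` gives 8 classes under Sturges and 10 under both the square-root and Rice rules. A width from `widths_from_spread` can be turned into a number of classes by dividing the range of the data by that width and rounding up.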

