Abstract
Distribution counting is a special-purpose algorithm for sorting integers that range from zero to a known maximum (Sedgewick, 1983). In psychology, the algorithm can be used in programs that produce summary results of objective tests, in which scores can range from zero to the number of items, and in similar item-analysis programs. For these types of applications, distribution counting has at least three advantages relative to general-purpose sorting algorithms such as shellsort and quicksort. The first advantage is simplicity. A BASIC translation of a modified version of Sedgewick's (1983) Pascal code, implemented in Optimized Systems Software, Inc., BASIC XL, on a 6502-based Atari 800 microcomputer, consists of straightforward code (see Listing 1). The algorithm actually sorts data by using cumulative frequencies as array indexes. For example, if the cumulative frequency of the ith score is k, then its sorted position is the kth element of the array Score. Once the ith score has been assigned to its sorted position, its associated cumulative frequency is decremented by 1. That is, there are now (k - 1) unsorted scores with a value less than or equal to the ith score. Thus, the array index for the next score equal to the ith will be (k - 1), the index of the next will be (k - 2), and so on. Repeating this process for each score will sort an array into ascending order. The version of the algorithm presented here differs from Sedgewick's (1983) in two ways. First, Sedgewick's version requires two passes through the data to read the data and tally their frequencies. The version presented here does both on the same pass. Second, in Sedgewick's version, the data are read into the array Score, sorted into the array Temp, and then read back into the array Score. To eliminate another pass through the data, the present version reads the data into the array Temp and sorts them directly into the array Score.
However, if the data are already resident in the array Score before sorting (e.g., in an interactive program), they must first be copied into the array Temp, and the benefit of this modification is lost. The second advantage of the algorithm is that it produces, as intermediate results, both the frequency and the cumulative frequency distributions, which are often required in testing applications. Although the basic version of the algorithm maintains only the cumulative distribution at the termination of a sort, it can easily be modified to maintain both.
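The indexing scheme described above can be illustrated with a minimal Python sketch. It is not the BASIC XL listing the abstract refers to. The function name `distribution_sort` and the variable names `freq`, `cum`, and `remaining` are assumptions introduced here for illustration; `temp` and `score` loosely mirror the arrays Temp and Score. The sketch assumes the scores are already held in a list, so it does not reproduce the combined read-and-tally pass, and it decrements a working copy of the cumulative counts so that both distributions remain available after the sort.

```python
def distribution_sort(temp, max_score):
    """Sort integers in the range 0..max_score by distribution counting.

    Returns (score, freq, cum): the sorted list, the frequency of each
    value, and the cumulative frequency of each value.
    """
    # Tally how often each score value occurs.
    freq = [0] * (max_score + 1)
    for s in temp:
        if not 0 <= s <= max_score:
            raise ValueError(f"score {s} is outside 0..{max_score}")
        freq[s] += 1

    # cum[v] = number of scores less than or equal to v.
    cum = []
    running = 0
    for f in freq:
        running += f
        cum.append(running)

    # Assumption of this sketch: decrement a working copy, so that
    # freq and cum both survive the sort unchanged.
    remaining = cum.copy()

    score = [0] * len(temp)
    for s in temp:
        k = remaining[s]      # 1-based sorted position of this score
        score[k - 1] = s      # Python lists are 0-based, hence k - 1
        remaining[s] -= 1     # next equal score goes to position k - 1
    return score, freq, cum


if __name__ == "__main__":
    # Six hypothetical scores on a three-item test (range 0..3).
    score, freq, cum = distribution_sort([3, 0, 2, 3, 1, 0], max_score=3)
    print(score)  # [0, 0, 1, 2, 3, 3]
    print(freq)   # [2, 1, 1, 2]
    print(cum)    # [2, 3, 4, 6]
```

Keeping a separate working copy costs one extra array of length (maximum score + 1), which is one simple way to retain both distributions; other arrangements are possible.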
More From: Behavior Research Methods, Instruments, & Computers