Abstract

The purpose of this study was to examine procedures for estimating the reliability of a criterion-referenced measure in the psychomotor domain. Reliability was defined as the consistency with which examinees were classified into mastery and nonmastery categories. Three trial mastery criteria (6, 7, and 8) were used along with three test mastery criteria (.6n, .7n, and .8n). Motor skill was defined as the first-ball score of each frame in a line of bowling. Because the empirical distribution functions for men and women differed significantly at trial criteria of 6 and 7, separate reliability coefficients were estimated for each sex. The single-administration reliability estimates developed by Huynh and Subkoviak were equally good indicators of the Swaminathan-Hambleton-Algina estimate of P, the proportion of agreement in classifications, obtained when the test was administered on 2 days. Variations in trial and test mastery criteria yielded different proportions of subjects assigned to mastery ...
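
To make the definition of P concrete, the following minimal sketch computes the two-administration proportion of agreement for a handful of examinees. It rests on assumptions not stated in the abstract: a trial counts as a success when the first-ball score is at least the trial criterion, an examinee is a master when the number of successful trials is at least the test criterion fraction of n, and n is the number of frames scored. The function names and the scores are hypothetical and purely illustrative. The sketch does not implement the single-administration estimates of Huynh or Subkoviak.

```python
from fractions import Fraction
from typing import Sequence


def is_master(first_ball_scores: Sequence[int],
              trial_criterion: int,
              test_fraction: Fraction) -> bool:
    """Classify one examinee on one administration.

    Assumption: a trial succeeds when the first-ball score is >= the
    trial criterion, and mastery requires successes >= test_fraction * n.
    Fraction avoids floating-point surprises such as 0.7 * 10 != 7.
    """
    n = len(first_ball_scores)
    successes = sum(1 for s in first_ball_scores if s >= trial_criterion)
    return successes >= test_fraction * n


def proportion_agreement(day1: Sequence[Sequence[int]],
                         day2: Sequence[Sequence[int]],
                         trial_criterion: int,
                         test_fraction: Fraction) -> float:
    """Proportion of examinees given the same classification on both days."""
    if len(day1) != len(day2) or not day1:
        raise ValueError("need the same non-empty set of examinees on both days")
    agree = sum(
        is_master(a, trial_criterion, test_fraction)
        == is_master(b, trial_criterion, test_fraction)
        for a, b in zip(day1, day2)
    )
    return agree / len(day1)


# Hypothetical first-ball scores: four examinees, ten frames, two days.
DAY1 = [
    [7, 8, 6, 9, 7, 10, 5, 8, 7, 6],
    [3, 5, 6, 7, 4, 6, 8, 5, 6, 4],
    [7, 7, 8, 6, 9, 7, 5, 8, 6, 7],
    [9, 10, 8, 7, 9, 6, 8, 7, 10, 9],
]
DAY2 = [
    [8, 7, 7, 9, 6, 10, 7, 8, 5, 9],
    [5, 6, 4, 7, 6, 3, 5, 8, 6, 7],
    [6, 7, 8, 5, 9, 6, 7, 6, 8, 5],
    [8, 9, 7, 10, 6, 9, 8, 7, 9, 10],
]

if __name__ == "__main__":
    # Trial criterion 7 with test criterion .7n (both taken from the abstract).
    print(proportion_agreement(DAY1, DAY2, 7, Fraction(7, 10)))
```

Re-running the function with other pairs of criteria (for example a trial criterion of 6 with .8n) shows how the mastery and nonmastery proportions, and therefore P, shift as the criteria change.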