Tests have an important place in education and psychology, where they are used for purposes such as determining educational quality, identifying educational needs, hiring employees, selecting and placing students, and providing guidance and clinical services. To serve these purposes, test scores must have adequate psychometric properties, namely validity and reliability. Various test theories have helped to create more valid and reliable measurements and, as a result, to make better decisions regarding individuals. In education and psychology, Classical Test Theory (CTT) and Item Response Theory (IRT) are both widely used. CTT assumes that an individual's observed score is the sum of a true score and an error score, while IRT estimates an individual's ability or latent trait from responses to test items (Embretson & Reise, 2000).

When IRT assumptions are met and model-data fit is established, item and ability parameters are invariant; this is regarded as the most important advantage of IRT over CTT. Invariance means that ability parameters can be estimated independently of the particular sample of items, and item parameters independently of the particular sample of examinees. This property makes IRT practical in many applications, for instance test development, computerized adaptive testing, bias studies, test equating and item mapping (Hambleton & Swaminathan, 1985). IRT models fall into two main categories: parametric IRT (PIRT) and nonparametric IRT (NIRT) (Olivares, 2005; Sijtsma & Molenaar, 2002).

Item response models for polytomous items have been developed to analyse ordered items, such as Likert-type attitude items and partial-credit cognitive items, as well as unordered items, such as multiple-choice test items (Ostini & Nering, 2006). These models describe a non-linear relationship between an individual's latent trait and the probability of choosing a particular response category of an item (Embretson & Reise, 2000).
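The contrast between CTT and IRT described above can be written compactly. This is a minimal sketch rather than notation taken from the source: X, T and E are the conventional CTT labels for observed, true and error scores, and the two-parameter logistic curve is assumed here only as one common example of a parametric IRT model for dichotomous items, with a_i and b_i as illustrative discrimination and difficulty parameters for item i and theta_j as the latent trait of person j.

```latex
% CTT: observed score decomposed into true score and error
X = T + E

% IRT (assumed example: two-parameter logistic model for a dichotomous item)
P(U_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\bigl(-a_i(\theta_j - b_i)\bigr)}
```

In the second expression, person and item parameters enter separately, which is the structure that allows invariance to hold when the model fits the data.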
The Graded Response Model (GRM), one of the IRT models developed for polytomous items, is often preferred by researchers in applied work because it suits presentations, portfolios, essays and Likert-type items, all of which have ordered response categories (DeMars, 2010; Ostini & Nering, 2006). To scale tests consisting of polytomous items with accurate GRM estimates, the assumptions of PIRT and model-data fit must be evaluated, and satisfying them requires large samples. At this point, NIRT models draw attention because they offer a practical advantage in determining the psychometric properties of tests with fewer items and respondents (Stout, 2001).

NIRT models are defined as statistical scaling methods that require fewer assumptions than PIRT models for measuring persons and items (Stochl, 2007). They have a wide range of application, including ordinal scales, applied research, sociology, marketing research and health research on quality of life (Sijtsma, 2005). Two families of models are used in the literature: the Mokken model and nonparametric regression estimation models. Each is divided into sub-models. The Mokken model comprises the Monotone Homogeneity Model (MHM) and the Double Monotonicity Model (DMM). Nonparametric regression estimation models include the Kernel Smoothing Approach Model (KSAM), the Isotonic Regression Estimation model and the Smoothed Isotonic Regression Estimation model (Lee, 2007; Sijtsma & Molenaar, 2002). As theoretical work continues, new sub-models are being added to the nonparametric regression estimation family.

As a NIRT model, the MHM requires the assumptions of unidimensionality, local independence and monotonicity, and it describes the relationship between the latent variable and items whose item characteristic curves (ICCs) are homogeneous (unidimensional) and monotone (Meijer & Baneke, 2004; Sijtsma & Molenaar, 2002).
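The monotonicity idea above can be made concrete with a single polytomous item. The following is a minimal sketch, assuming the logistic form of Samejima's GRM, in which the probability of responding in category k or higher is a logistic function of the latent trait. The discrimination `a`, the thresholds and the grid of trait values are hypothetical, and the function names are illustrative rather than part of any library. The sketch computes category probabilities and checks that each cumulative (step) curve and the expected item score do not decrease as the latent trait increases.

```python
import math


def grm_cumulative(theta, a, thresholds):
    """P(X >= k | theta) for k = 1..m under a logistic graded response model."""
    return [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]


def grm_category_probs(theta, a, thresholds):
    """P(X = k | theta) for k = 0..m, as differences of adjacent cumulative curves."""
    cum = [1.0] + grm_cumulative(theta, a, thresholds) + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def is_monotone_nondecreasing(values, tol=1e-12):
    """True if no value is smaller than the one before it (up to tol)."""
    return all(later >= earlier - tol for earlier, later in zip(values, values[1:]))


# Hypothetical four-category Likert-type item (assumed values, not from the source).
a = 1.5                          # discrimination
thresholds = [-1.0, 0.0, 1.2]    # must be strictly increasing
thetas = [x / 2 for x in range(-8, 9)]  # latent trait grid from -4.0 to 4.0

# Category probabilities for a person at theta = 0.
probs_at_zero = grm_category_probs(0.0, a, thresholds)

# Expected item score at each theta (sum of k * P(X = k | theta)).
expected_scores = [
    sum(k * p for k, p in enumerate(grm_category_probs(t, a, thresholds)))
    for t in thetas
]

# Check each step curve P(X >= k | theta) along the theta grid.
step_curves_monotone = all(
    is_monotone_nondecreasing([grm_cumulative(t, a, thresholds)[k] for t in thetas])
    for k in range(len(thresholds))
)
```

A parametric model such as this one builds monotonicity into its functional form. The MHM imposes only the monotonicity requirement itself, without a specific curve shape, so in practice the check would be applied to curves estimated from data rather than to a formula.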
…