A continuous item response model using a censored normal distribution
This study proposes an item response theory (IRT) model for bounded continuous data: the censored normal response model (CNRM). The model has a structure similar to that of Tobit models, which makes it possible to account for ceiling and floor effects on item responses. The CNRM is formulated as a special case of the generalized normal ogive framework, which unifies several existing models. A parameter estimation method using the EM algorithm is presented and applied to simulated and real data. The results suggest that the CNRM provides a computationally efficient and highly interpretable alternative to Molenaar et al.’s (2022) zero-and-one-inflated approach.
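The censoring mechanism described in this abstract can be illustrated with a Tobit-style log-likelihood for a bounded response: a latent normal variable is observed exactly in the interior but only as a bound at the floor or ceiling. This is a minimal sketch under assumed notation (the names mu, sigma and the unit bounds are illustrative, not the paper's):

```python
# Hedged sketch of a Tobit-style censored normal log-likelihood for a bounded
# item response y in [0, 1]. Floor/ceiling responses contribute tail
# probabilities; interior responses contribute the ordinary normal density.
import math


def censored_normal_loglik(y, mu, sigma, lower=0.0, upper=1.0):
    """Log-likelihood of one observed response under censoring at the bounds."""
    def phi(x):  # standard normal pdf
        return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

    def Phi(x):  # standard normal cdf
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    if y <= lower:   # floor effect: latent response at or below the lower bound
        return math.log(Phi((lower - mu) / sigma))
    if y >= upper:   # ceiling effect: latent response at or above the upper bound
        return math.log(1.0 - Phi((upper - mu) / sigma))
    # interior response: ordinary normal density
    return math.log(phi((y - mu) / sigma) / sigma)
```

In an IRT setting, mu would typically be a function of the latent trait (for example, a discrimination-times-distance term), and the total log-likelihood would sum over persons and items.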
- Research Article
2
- 10.1177/00131644221111993
- Jul 21, 2022
- Educational and psychological measurement
Psychometricians have devoted much research and attention to categorical item responses, leading to the development and widespread use of item response theory for the estimation of model parameters and identification of items that do not perform in the same way for examinees from different population subgroups (e.g., differential item functioning [DIF]). With the increasing use of computer-based measurement, use of items with a continuous response modality is becoming more common. Models for use with these items have been developed and refined in recent years, but less attention has been devoted to investigating DIF for these continuous response models (CRMs). Therefore, the purpose of this simulation study was to compare the performance of three potential methods for assessing DIF for CRMs, including regression, the MIMIC model, and factor invariance testing. Study results revealed that the MIMIC model provided a combination of Type I error control and relatively high power for detecting DIF. Implications of these findings are discussed.
- Research Article
16
- 10.1177/00131644231164316
- Apr 4, 2023
- Educational and Psychological Measurement
A recurring question regarding Likert items is whether the discrete steps that this response format allows represent constant increments along the underlying continuum. This question appears unsolvable because Likert responses carry no direct information to this effect. Yet, any item administered in Likert format can identically be administered with a continuous response format such as a visual analog scale (VAS) in which respondents mark a position along a continuous line. Then, the operating characteristics of the item would manifest under both VAS and Likert formats, although perhaps differently as captured by the continuous response model (CRM) and the graded response model (GRM) in item response theory. This article shows that CRM and GRM item parameters hold a formal relation that is mediated by the form in which the continuous dimension is partitioned into intervals to render the discrete Likert responses. Then, CRM and GRM characterizations of the items in a test administered with VAS and Likert formats allow estimating the boundaries of the partition that renders Likert responses for each item and, thus, the distance between consecutive steps. The validity of this approach is first documented via simulation studies. Subsequently, the same approach is used on public data from three personality scales with 12, eight, and six items, respectively. The results indicate the expected correspondence between VAS and Likert responses and reveal unequal distances between successive pairs of Likert steps that also vary greatly across items. Implications for the scoring of Likert items are discussed.
- Research Article
42
- 10.1207/s15327906mbr3704_05
- Oct 1, 2002
- Multivariate Behavioral Research
This article analyzes the relations between two continuous response models intended for typical response items: the linear congeneric model and Samejima's continuous response model (CRM). Using a factor analytical (FA) approach based on the assumption of underlying response variables, I describe how a particular case of the CRM can be considered as a nonlinear counterpart of Spearman's FA model. The mathematical relations between the item-trait regressions, item parameter values, and conditional and marginal distributions of both models are obtained. The results allow (a) the item parameter values of the linear model to be obtained from CRM item parameter values, and (b) the conditions in which the congeneric model will be a good approximation to the CRM to be predicted. The relations described are illustrated using an empirical example and assessed by means of a simulation study.
- Research Article
113
- 10.1027/2698-1866/a000034
- Feb 1, 2023
- Psychological Test Adaptation and Development
The importance of providing structural validity evidence for test scores derived from psychometric test instruments is highlighted by several institutions; for example, the American Psychological Association (2014) demands that evidence for the validity of an instrument's internal structure and its underlying measurement model be provided before it is applied in psychological assessment. Knowledge about the latent structure of the data obtained with a test addresses the major question of what construct(s) the psychological test under investigation measures (Ziegler, 2014, 2020). The study of structural validity is typically addressed with factor analyses when the test scores reflect continuous latent traits. As most submissions to Psychological Test Adaptation and Development (PTAD) deal with the adaptation and further development of existing measures, authors typically test a measurement model that is based on theoretical considerations and prior findings on original versions (or adaptations) of the test under investigation. Our literature review of PTAD's publications showed that more than 90% of the articles contain at least one confirmatory factor analysis (CFA). As editors and reviewers of PTAD, we appreciate that authors are rigorous in providing evidence on the structural validity of their tests' data. However, since PTAD's inception in 2019, we have found that one comment is frequently communicated to authors during the review process, namely, the request to adjust the analytic approach in CFA from maximum likelihood (ML) estimation toward the mean- and variance-adjusted weighted least squares (WLSMV; Muthén et al., 1997) estimator to account for the ordinal nature of the data that psychological instruments typically generate at the item level.
In this editorial, we discuss the rationale behind choosing the WLSMV estimator when analyzing test adaptations and developments that are based on ordinal categorical data and concisely illustrate the problems associated with using the ML estimator (potentially in combination with robust tests of model fit) for such data.
- Research Article
37
- 10.1177/014662169802200402
- Dec 1, 1998
- Applied Psychological Measurement
An item parameter estimation procedure is developed using an EM algorithm for Samejima's (1973) continuous item response model. The potential usefulness of this model is examined, including the density function, the practical meaning of the item parameters, and the statistical properties of the model. The expected a posteriori and maximum likelihood estimates of the person (θ) parameters and their associated standard errors are also described. The item parameter estimation procedure was programmed and evaluated using simulated data. The results show that the estimation procedure performs well in estimating item and θ parameters. These estimation procedures should permit the application of the continuous response model to many measurement problems.
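Samejima's CRM is often written with the logit of the bounded response conditionally normal in the latent trait. The sketch below assumes one such parameterization, z = logit(y) given θ distributed as Normal(a(θ − b), 1/a²), and substitutes a coarse grid search for the EM and scoring steps the abstract describes; the parameterization and all names are illustrative assumptions, not taken from the article:

```python
# Hedged sketch: maximum likelihood estimation of theta under a CRM-style
# model in which z = logit(y) given theta is Normal(a * (theta - b), 1/a**2).
import math


def crm_logit_loglik(ys, a, b, theta):
    """Log-likelihood of bounded responses ys in (0, 1) at a trial theta."""
    s = 1.0 / a                      # assumed conditional sd of the logit response
    total = 0.0
    for y in ys:
        z = math.log(y / (1.0 - y))  # logit transform to the real line
        resid = (z - a * (theta - b)) / s
        # normal log-density of z, plus the Jacobian of the logit transform
        total += -0.5 * resid * resid - math.log(s * math.sqrt(2.0 * math.pi))
        total += -math.log(y * (1.0 - y))
    return total


def crm_theta_mle(ys, a, b, grid=None):
    """Grid-search MLE of theta; a coarse stand-in for Newton or EM steps."""
    grid = grid or [g / 100.0 for g in range(-400, 401)]
    return max(grid, key=lambda t: crm_logit_loglik(ys, a, b, t))
```

With a single noiseless response generated at θ = 1, the grid search recovers θ up to the grid resolution, which is the basic sanity check one would run before moving to EM-based item calibration.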
- Research Article
1
- 10.2333/bhmk.39.183
- Jul 1, 2012
- Behaviormetrika
Probability testing (PT) is a way of responding to multiple-choice test items. In PT, the examinee assigns to each response option his or her subjective probability that it is correct, as an expression of partial knowledge. PT thus draws more item information from examinees than other scoring methods that can be used for multiple-choice items. In this research, a multidimensional continuous item response model for PT is proposed. Moreover, the information function matrix, a method for estimating item parameters, and a method for estimating the subject's vector of latent traits are introduced.
- Research Article
- 10.1080/03610918.2022.2034864
- Jan 28, 2022
- Communications in Statistics - Simulation and Computation
Gaussian copula joint models for mixed correlated longitudinal continuous and count responses with random effects are presented, where the count responses have a zero-inflated power series distribution. To account for associations between the zero-inflated count and continuous responses, we use the Gaussian copula to indirectly specify their joint distribution. A full likelihood-based approach is applied through the inference function for margins (IFM) method to estimate the marginal parameters marginally and the shared parameters jointly. In this method, we use the Monte Carlo EM algorithm to obtain the parameter estimates of the Gaussian copula joint models. To illustrate the utility of the models, some simulation studies are performed. Finally, the proposed models are applied to a medical data set extracted from an observational study in which the correlated responses are the continuous response of body mass index and the power series response of the number of joint damages.
- Research Article
- 10.3390/jintelligence12030026
- Feb 25, 2024
- Journal of Intelligence
Language proficiency assessments are pivotal in educational and professional decision-making. With the integration of AI-driven technologies, these assessments can more frequently use item types, such as dictation tasks, producing response features with a mixture of discrete and continuous distributions. This study evaluates novel measurement models tailored to these unique response features. Specifically, we evaluated the performance of the zero-and-one-inflated extensions of the Beta, Simplex, and Samejima's continuous item response models and incorporated collateral information into the estimation using latent regression. Our findings highlight that while all models provided highly correlated results regarding item and person parameters, the Beta item response model showcased superior out-of-sample predictive accuracy. However, a significant challenge was the absence of established benchmarks for evaluating model and item fit for these novel item response models. There is a need for further research to establish benchmarks for evaluating the fit of these innovative models to ensure their reliability and validity in real-world applications.
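The zero-and-one-inflated structure evaluated in this study can be sketched as a mixture density: point masses at the bounds plus a Beta density on the interior. The parameter names below (pi0, pi1, alpha, beta) are illustrative assumptions, not the study's notation:

```python
# Hedged sketch: density of a zero-and-one-inflated Beta response model.
# pi0 and pi1 are the point masses at 0 and 1; (alpha, beta) shape the
# interior Beta component, which carries the remaining probability mass.
import math


def zoib_density(y, pi0, pi1, alpha, beta):
    """Mixture density: point masses at 0 and 1, Beta(alpha, beta) inside."""
    if y == 0.0:
        return pi0
    if y == 1.0:
        return pi1
    # Beta log-density via log-gamma for numerical stability
    log_b = (math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
             + (alpha - 1.0) * math.log(y) + (beta - 1.0) * math.log(1.0 - y))
    return (1.0 - pi0 - pi1) * math.exp(log_b)
```

In an IRT extension of this density, pi0, pi1, and the Beta mean would each be modeled as functions of the latent trait; the sketch only shows the response distribution at fixed parameter values.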
- Research Article
11
- 10.3758/s13428-012-0229-6
- Jun 26, 2012
- Behavior Research Methods
This study compares two algorithms, as implemented in two different software packages, that have appeared in the literature for estimating item parameters of Samejima's continuous response model (CRM) in a simulation environment. In addition to the simulation study, a real-data illustration is provided, and CRM is used as a potential psychometric tool for analyzing measurement outcomes in the context of curriculum-based measurement (CBM) in the field of education. The results indicate that a simplified expectation-maximization (EM) algorithm is as effective and efficient as the traditional EM algorithm for estimating the CRM item parameters. The results also show promise for using this psychometric model to analyze CBM outcomes, although more research is needed in order to recommend CRM as a standard practice in the CBM context.
- Research Article
34
- 10.1016/j.jmp.2014.06.001
- Jul 24, 2014
- Journal of Mathematical Psychology
Cultural consensus theory for continuous responses: A latent appraisal model for information pooling
- Research Article
10
- 10.1177/0146621618817779
- Dec 12, 2018
- Applied Psychological Measurement
Dual item response theory (IRT) models in which items and individuals have different amounts of measurement error have been proposed in the literature. Any developments in these models, however, are feasible only for continuous responses. This article discusses a comprehensive dual modeling approach, based on underlying latent response variables, from which specific models for continuous, graded, and binary responses are obtained. Procedures for (a) calibrating the items, (b) scoring individuals, (c) assessing model appropriateness, and (d) assessing measurement precision are discussed for all the resulting models. Simulation results suggest that the proposal is quite feasible. A practical illustration is given with an empirical example in the personality domain.
- Research Article
3
- 10.1177/00131644241242789
- Apr 17, 2024
- Educational and Psychological Measurement
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent experience and methodological considerations. Response styles, which are frequently observed in self-reported data, reflect a propensity to answer questionnaire items in a consistent manner, regardless of the item content. These response styles have been identified as causes of skewed scale scores and biased trait inferences. In this study, we investigate the impact of response styles on individuals’ responses within a continuous scale context, with a specific emphasis on extreme response style (ERS) and acquiescence response style (ARS). Building upon the established continuous response model (CRM), we propose extensions known as the CRM-ERS and CRM-ARS. These extensions are employed to quantitatively capture individual variations in these distinct response styles. The effectiveness of the proposed models was evaluated through a series of simulation studies. Bayesian methods were employed to effectively calibrate the model parameters. The results demonstrate that both models achieve satisfactory parameter recovery. Neglecting the effects of response styles led to biased estimation, underscoring the importance of accounting for these effects. Moreover, the estimation accuracy improved with increasing test length and sample size. An empirical analysis is presented to elucidate the practical applications and implications of the proposed models.
- Research Article
37
- 10.1007/s11336-015-9469-6
- Jul 9, 2015
- Psychometrika
With a few exceptions, the problem of linking item response model parameters from different item calibrations has been conceptualized as an instance of the problem of test equating scores on different test forms. This paper argues, however, that the use of item response models does not require any test score equating. Instead, it involves the necessity of parameter linking due to a fundamental problem inherent in the formal nature of these models—their general lack of identifiability. More specifically, item response model parameters need to be linked to adjust for the different effects of the identifiability restrictions used in separate item calibrations. Our main theorems characterize the formal nature of these linking functions for monotone, continuous response models, derive their specific shapes for different parameterizations of the 3PL model, and show how to identify them from the parameter values of the common items or persons in different linking designs.
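For the usual 2PL/3PL parameterization, linking functions of the kind this paper characterizes are affine on the trait scale, θ* = Aθ + B. The sketch below recovers (A, B) by the classical mean-sigma method from common-item difficulty parameters; this is a standard textbook procedure offered for illustration, not the paper's theorems:

```python
# Hedged sketch: mean-sigma linking of two separate calibrations through
# common items. The affine link theta* = A * theta + B is recovered from the
# means and SDs of the common items' difficulty (b) parameters on each scale.
import statistics


def mean_sigma_link(b_old, b_new):
    """Return (A, B) mapping the old scale onto the new one."""
    A = statistics.pstdev(b_new) / statistics.pstdev(b_old)
    B = statistics.mean(b_new) - A * statistics.mean(b_old)
    return A, B


def transform_item(a, b, c, A, B):
    """3PL item parameters on the new scale; the guessing parameter c is invariant."""
    return a / A, A * b + B, c
```

The invariance of c and the reciprocal transformation of a reflect the fact that a(θ − b) must be unchanged under the affine rescaling, which is the identifiability adjustment the abstract refers to.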
- Book Chapter
- 10.1007/978-3-031-27781-8_11
- Jan 1, 2023
Conditional independence assumptions play an important role in many psychometric models, but can sometimes be too restrictive in modeling process data from educational and psychological tests such as response times. For this reason, a continuous speed-accuracy response model is developed that relaxes the assumption of conditional independence of items given latent proficiency (“local” independence). Our model is a generalization of the speed-accuracy response model developed by Maris and van der Maas (Psychometrika, 77:615-633, 2012) in which a scoring rule incorporating both accuracy and speed of item responses is assumed to produce a sufficient statistic for a latent proficiency variable. The assumption of local independence is dropped in a similar way as in the interaction model developed for dichotomous item responses by Haberman (Multivariate and Mixture Distribution Rasch Models, pp. 201–216. Springer, New York, 2007). Recently, Verhelst (Theoretical and Practical Advances in Computer-Based Educational Measurement, pp. 135–160. Springer, Cham, 2019) discussed similar models in the context of exponential family models for continuous item responses. A pairwise conditional maximum likelihood approach is developed to estimate item parameters. The model is illustrated by an application to data from a listening test.
- Research Article
4841
- 10.1198/tech.2006.s417
- Aug 1, 2006
- Technometrics
Multilevel Statistical Models