Item analysis of the Grit Assessment Scale: International version in Portuguese language
Abstract: Grit can be understood as an element of personality, a positive and non-cognitive human characteristic composed of two components: perseverance of effort and consistency of interests. Due to its social, professional, and educational impact, grit has become a construct of great interest. Although some instruments have been developed and adapted for its evaluation in different contexts, the absence of measures in Portuguese led to the development of the Grit Assessment Scale – International Version in Portuguese Language (EAGrit-LP), based on samples from Brazil and Portugal. Even though previous studies have investigated some evidence of validity for the instrument, its items had not yet been individually evaluated. Therefore, the objective of the present study was to perform an item analysis using Item Response Theory. For this purpose, a sample of 1,050 Brazilians and 656 Portuguese individuals responded to the EAGrit-LP (aged between 17 and 71 years old; M = 24.1 years; SD = 9.2). The results showed that all items had appropriate infit and outfit indices and that item difficulty was, in general, low. Six items displayed differential item functioning (DIF) based on participants’ gender and country of origin. These results can be used to revise the scale and guide the development of more challenging items, as well as the development of normative tables specific to gender and country of origin.
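The Rasch infit and outfit indices reported above can be sketched in a few lines. This is an illustrative, stdlib-only sketch of the standard mean-square formulas, not the authors' analysis code; the function names are hypothetical:

```python
import math

def rasch_prob(theta, b):
    """Probability of a keyed response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def infit_outfit(responses, thetas, b):
    """Infit/outfit mean-square statistics for one dichotomous item.

    responses: 0/1 answers; thetas: matching person ability estimates;
    b: the item's difficulty estimate."""
    z2, w = [], []
    for x, t in zip(responses, thetas):
        p = rasch_prob(t, b)
        var = p * (1.0 - p)               # Bernoulli variance of the response
        z2.append((x - p) ** 2 / var)     # squared standardized residual
        w.append(var)
    outfit = sum(z2) / len(z2)            # unweighted mean square (outlier-sensitive)
    infit = sum(z * v for z, v in zip(z2, w)) / sum(w)  # information-weighted
    return infit, outfit
```

Values near 1.0 indicate responses consistent with the model, which is the sense in which items are described as having appropriate fit.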
- Research Article
- 10.1186/s12891-022-05329-2
- Apr 22, 2022
- BMC Musculoskeletal Disorders
Background: Subgrouping of migraine patients according to the pain response to manual palpation of the upper cervical spine has recently been described. Based on the neuroanatomy and the convergence of spinal and trigeminal nerves in the trigeminocervical complex, the cervical segments C1 to C3 are potentially relevant. To date it has not been investigated whether palpation results of all upper cervical segments are based on one underlying construct, which would allow combining the results of several tests. Therefore, the aim of this secondary analysis of a cohort study was to determine whether results from all three segments form one construct. Methods: Seventy-one patients with chronic or frequent episodic migraine, diagnosed according to the International Headache Society classification version 3, were examined by one physiotherapist. Manual palpation using a posterior-to-anterior pressure was performed on the upper three cervical vertebrae unilaterally, left and right. The results of the palpation according to the patients’ responses were combined using factor analysis. In addition, item response theory (IRT) was used to investigate the structure of the response pattern as well as item difficulty and discrimination. Findings: Factor analysis (principal component) showed that the palpation of C3 loads less onto the underlying construct than the palpation of C1 and C2. Considering a cut-off value > 1.0, the eigenvalues of all three segments do not represent one underlying construct. When excluding the results from C3, the remaining items form one construct. The internal consistency of the pain response to palpation of C1 and C2 is acceptable, with a Cronbach’s alpha of 0.69. IRT analysis showed that the rating scale model fits the pain response pattern best. The discrimination value (1.24) was equal for all items. Item difficulty showed a clear hierarchical structure between the palpation of C1 and C2, indicating that people with a higher impairment are more likely to respond with referred pain during palpation of C2. Conclusion: Statistical analysis confirms that results from the palpation of the cervical segments C1 and C2 in migraine patients can be combined. IRT analysis confirmed the ordinal pattern of the pain response and showed the higher probability of a pain response during palpation of C2. The pain response to C3 palpation is not relevant for unidimensional IRT analysis. Trial Registration: German registry of clinical trials (DRKS00015995), registered 20 December 2018, https://www.drks.de/drks_web/setLocale_EN.do
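The rating scale model used in the IRT analysis (Andrich's formulation) assigns each response category a probability from the person measure, the item location, and a set of thresholds shared across items. A minimal illustrative sketch, with hypothetical parameter names, not the study's code:

```python
import math

def rsm_probs(theta, delta, taus):
    """Category probabilities under Andrich's rating scale model.

    theta: person measure; delta: item location;
    taus: shared threshold parameters tau_1..tau_m."""
    # psi_k = exp( sum_{j<=k} (theta - delta - tau_j) ), with psi_0 = 1
    psis, s = [1.0], 0.0
    for tau in taus:
        s += theta - delta - tau
        psis.append(math.exp(s))
    total = sum(psis)
    return [p / total for p in psis]
```

With symmetric thresholds and theta equal to the item location, the two outer categories are equally likely, which makes the ordinal structure of the pain response easy to inspect.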
- Research Article
- 10.1177/0013164411412943
- Aug 4, 2011
- Educational and Psychological Measurement
The relationship between differential item functioning (DIF) and item difficulty on the SAT is such that more difficult items tended to exhibit DIF in favor of the focal group (usually minority groups). These results were reported by Kulick and Hu, and Freedle, and have been discussed enthusiastically in more recent literature. Examining the validity of the original reports of this systematic relationship is important so that we can move on to investigating its causes, and the consequences associated with test score use, more effectively. This article explores the hypothesis that the relationship between DIF and item difficulty observed in the SAT could be due to one of the following: (a) the confounding of DIF and impact caused by shortcomings of the standardization approach and/or (b) random guessing. The relationship between DIF and item difficulty is examined using item response theory, which controls for differences between impact and DIF better than the standardization approach does and also allows us to test the importance of guessing. The results generally support the relationship between item difficulty and DIF, suggesting that the phenomenon reported by earlier research is not a mere artifact of the statistical methodologies used to study DIF.
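The IRT model that lets an analysis separate guessing from DIF is the three-parameter logistic. A minimal sketch of the standard formula (illustrative, not code from the article):

```python
import math

def p3pl(theta, a, b, c):
    """Three-parameter logistic model: probability of a correct response
    given ability theta, discrimination a, difficulty b, pseudo-guessing c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

At theta = b the probability is c + (1 - c)/2, so a nonzero pseudo-guessing floor lifts the lower asymptote of hard items, which is exactly where a guessing artifact could masquerade as DIF.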
- Research Article
- 10.1038/s41598-025-91129-2
- Feb 26, 2025
- Scientific Reports
Background: Patient self-advocacy plays a crucial role in improving cancer patients’ quality of life, but there is no validated instrument to assess this concept among Chinese head and neck cancer patients. This study aimed to cross-culturally translate the Patient Self-Advocacy Scale (PSAS) and evaluate its psychometric properties using classical test theory and item response theory. Methods: The PSAS underwent cross-cultural adaptation based on Brislin’s translation model, and a cross-sectional survey of 302 head and neck cancer patients was conducted at a tertiary hospital in Tianjin from November 2023 to August 2024. Classical test theory was used for item analysis and validation of reliability (internal consistency, test-retest reliability) and validity (content validity, construct validity). Item response theory was applied to evaluate model fit, reliability, item difficulty, and measurement invariance. Results: Classical test theory analysis demonstrated good item discrimination, with item-total correlations ranging from 0.776 to 0.942 and critical ratios from 13.269 to 33.170 (p < 0.05), as well as good internal consistency (Cronbach’s α = 0.942 for the total scale) and test-retest reliability (ICC = 0.840 for the total scale, p < 0.001). I-CVI values ranged from 0.80 to 1.00, with an S-CVI of 0.95. The three-factor model demonstrated good fit (χ2/df = 2.595, RMSEA = 0.090, SRMR = 0.072, CFI = 0.966, IFI = 0.966, TLI = 0.956). Rasch analysis indicated good model fit and reliability (person/item separation index > 1.5, person/item reliability coefficient > 0.9). The Wright map showed good matching between item difficulty and person ability. Differential item functioning (DIF) analysis revealed no significant differences across gender.
Conclusion: The Chinese version of PSAS demonstrates satisfactory psychometric properties among head and neck cancer patients and provides healthcare providers with a tool to assess patients’ self-advocacy, potentially facilitating patient-centered care and self-management in clinical practice and improving patients’ health and quality of life outcomes.
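The corrected item-total correlations used above as a discrimination index can be reproduced with a small stdlib-only sketch (illustrative only; the corrected form excludes the item from its own total so the item does not correlate with itself):

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corrected_item_total(data, item):
    """Correlation of one item with the total of the remaining items.

    data: rows = persons, columns = item scores."""
    scores = [row[item] for row in data]
    rest = [sum(row) - row[item] for row in data]
    return pearson(scores, rest)
```

Items whose corrected correlation is low add noise to the total score and are the usual first candidates for revision.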
- Research Article
- 10.7586/jkbns.2018.20.1.11
- Jan 12, 2017
- Journal of Korean Biological Nursing Science
Purpose: The purposes of this study were to perform item analysis using classical test theory (CTT) and item response theory (IRT), and to establish the validity and reliability of the Korean version of a pressure ulcer prevention knowledge instrument. Methods: The 26-item pressure ulcer prevention knowledge instrument was translated into Korean, and item analysis of the 22 items with an adequate content validity index (CVI) was conducted. A total of 240 registered nurses in 2 university hospitals completed the questionnaire. Each item was analyzed applying CTT and IRT according to a 2-parameter logistic model. Response alternative quality, item difficulty, and item discrimination were evaluated. For testing validity and reliability, the Pearson correlation coefficient and Kuder-Richardson 20 (KR-20) were used. Results: Scale CVI was .90 (item-CVI range = .75-1.00). The total correct answer rate for this study population was relatively low at 52.5%. The quality of the response alternatives was found to be relatively good (range = .02-.83). Item difficulty ranged from .10 to .86 according to CTT and from -12.19 to 29.92 according to IRT. Applying IRT, 12 items had low, 2 medium, and 8 high difficulty. Item discrimination ranged from .04 to .57 applying CTT and from .00 to 1.47 applying IRT. Overall internal consistency (KR-20) was .62 and stability (test-retest) was .82. Conclusion: The instrument had relatively weak construct validity and item discrimination according to IRT. Therefore, cautious use of the Korean version of this instrument is recommended for discrimination purposes, given the many attractive response alternatives and the low internal consistency. Key Words: Psychometrics; Knowledge; Pressure ulcer
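The KR-20 coefficient reported above is Cronbach's alpha specialized to dichotomous items. An illustrative stdlib-only sketch of the formula:

```python
def kr20(data):
    """Kuder-Richardson formula 20 for 0/1 item data (rows = persons)."""
    n, k = len(data), len(data[0])
    totals = [sum(row) for row in data]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # variance of total scores
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in data) / n             # proportion correct on item j
        pq += p * (1.0 - p)                             # item variance p(1-p)
    return (k / (k - 1.0)) * (1.0 - pq / var_t)
```

When items move together the total-score variance dwarfs the summed item variances and KR-20 approaches 1; the .62 reported above indicates only modest internal consistency.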
- Research Article
- 10.4103/1119-3077.151720
- Jan 1, 2015
- Nigerian Journal of Clinical Practice
Item analysis is an effective method in the evaluation of multiple-choice achievement tests. This study aimed to compare the classical and the latent class models used in item analysis, as well as their efficacy in the evaluation of examinations in the medical faculty. The achievement tests in the medical faculty were evaluated using two methods: the classical and the latent class models. Among the classical methods, Cronbach's alpha, split-half methods, item discrimination, and item difficulty were investigated. On the other hand, various models of item response theory (IRT) and their statistics were compared in the group of latent class methods. Reliability statistics had values above 0.87. Item no. 7 was found easy, item no. 45 difficult, and item no. 64 fairly difficult according to the evaluations done by classical and item response theories. In terms of item discrimination, item no. 45 had low, item no. 7 moderate, and item no. 64 high discrimination. The ability distribution graph shows that examinees' abilities were generally sufficient to select the correct choice. In this study, similar results were obtained by classical and latent methods. IRT can be considered perfect at a mathematical level, and if its assumptions are satisfied, it can easily perform assessments and measurements for most types of complex problems. Classical theory is easy to understand and to apply, while IRT is, on the contrary, sometimes rather difficult to understand and to implement.
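Among the classical indices above, the split-half coefficient is the least standardized; a common odd-even variant with the Spearman-Brown step-up is shown here as an illustrative stdlib-only sketch (the split choice and names are my own):

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_sb(data):
    """Odd-even split-half reliability with the Spearman-Brown correction.

    data: rows = persons, columns = item scores."""
    even = [sum(row[0::2]) for row in data]   # half-test 1: items at even indices
    odd = [sum(row[1::2]) for row in data]    # half-test 2: items at odd indices
    r = pearson(even, odd)
    return 2.0 * r / (1.0 + r)                # step up to full test length
```

The Spearman-Brown step is needed because the raw half-half correlation estimates the reliability of a test only half as long as the real one.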
- Dissertation
- 10.4225/03/584102e205d2c
- Dec 2, 2016
Development of the assessment of physiotherapy practice - a standardised and validated approach to assessment of professional competence in physiotherapy
- Research Article
- 10.1177/0013164407310129
- Jan 14, 2008
- Educational and Psychological Measurement
Recent research examining racial differences on standardized cognitive tests has focused on the impact of test item difficulty. Studies using data from the SAT and GRE have reported a correlation between item difficulty and differential item functioning (DIF) such that minority test takers are less likely than majority test takers to respond correctly to easy test items. The statistical techniques used and the effect sizes reported in these studies have been heavily criticized. This study addresses these criticisms by examining the relationship between item difficulty and DIF by using alternative statistical techniques based on item response theory and a different standardized test. The results replicate previous research and provide support for the generalizability of the findings.
- Research Article
- 10.31579/2688-7517/043
- Apr 29, 2022
- Addiction Research and Adolescent Behaviour
Differential Item Functioning (DIF), a statistical property of an item that signals unexpected item performance on a test, occurs when different groups of test takers with the same level of ability perform differently on a single test. The aim of this paper was to examine DIF in Pearson Test of English (PTE) test items. To that end, 250 intermediate EFL learners aged 26-36 in two different fields of study (125 Engineering and 125 Science) were randomly chosen for the analysis. The Item Response Theory (IRT) Likelihood Ratio (LR) approach was utilized to find items showing DIF. The scored items of the 250 PTE test takers were analyzed using the IRT three-parameter model, which includes item difficulty (b parameter), item discrimination (a parameter), and pseudo-guessing (c parameter). The results of the independent samples t-test for comparison of means in the two groups showed that Science participants performed better than the Engineering ones, particularly in the Speaking & Writing and Reading sections. The PTE test was statistically easier for the Science students at the 0.05 level. Linguistic analyses of the DIF items also confirmed the quantitative findings, indicating a far better performance on the part of the Science students.
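The IRT likelihood-ratio approach flags DIF by fitting the model twice, once with the studied item's parameters free to differ across groups and once constrained equal, and comparing fit. A minimal sketch of the resulting test statistic for a single freed parameter (function names are my own; fitting the models themselves is out of scope here):

```python
import math

def lr_dif_stat(loglik_free, loglik_constrained):
    """Likelihood-ratio statistic: 2 * (logL_free - logL_constrained)."""
    return 2.0 * (loglik_free - loglik_constrained)

def chi2_sf_df1(x):
    """Survival function of the chi-square distribution with 1 df,
    computed exactly via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))
```

For example, lr_dif_stat(-1200.3, -1202.9) gives 5.2, and chi2_sf_df1(5.2) falls below 0.05, so that item would be flagged at the 5% level; freeing more parameters raises the degrees of freedom accordingly.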
- Research Article
- 10.1016/j.jcma.2013.02.008
- Apr 18, 2013
- Journal of the Chinese Medical Association
Item response analysis on an examination in anesthesiology for medical students in Taiwan: A comparison of one- and two-parameter logistic models
- Research Article
- 10.6145/jme201302
- Mar 1, 2013
Background: Item analysis is used to ensure the validity of a test. Classical Test Theory (CTT) and Item Response Theory (IRT) are the two main item analysis theories. Objective: This study discussed and compared the advantages and disadvantages of CTT and IRT in screening out potentially problematic test items. Expert opinion and student feedback were also considered before removal of truly problematic items. The study aimed to develop an item analysis procedure to ensure classroom test validity. Method: Eighty-six sixth-year medical students answered a newly developed authentic medical test composed of 48 multiple-choice questions. For item analysis, this study used CTT and IRT methods for the quantitative analysis, while expert opinion and student feedback were used for the qualitative analysis. Cronbach's alpha was used as the coefficient of internal consistency for the whole test. Results: The Cronbach's alpha of the responses to all 48 items in the test was 0.55. Using IRT, 4 items were deleted and the alpha increased to 0.57. Using CTT, 24 items were deleted and the alpha increased to 0.70. Using IRT and CTT together with expert opinion, 21 items were deleted and the alpha increased to 0.71. Conclusions: Both CTT and IRT help to increase test reliability. Compared to IRT, CTT is more effective at increasing test reliability. Moreover, expert opinion and student feedback offer valuable suggestions for item selection. A procedure combining CTT with expert opinion and student feedback is worth considering for item selection.
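The deletion procedure above, dropping items and re-checking alpha, can be sketched as follows (illustrative stdlib-only code, not the study's analysis):

```python
def cronbach_alpha(data):
    """Cronbach's alpha for item data (rows = persons, columns = items)."""
    n, k = len(data), len(data[0])

    def pvar(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_var_sum = sum(pvar([row[j] for row in data]) for j in range(k))
    total_var = pvar([sum(row) for row in data])
    return (k / (k - 1.0)) * (1.0 - item_var_sum / total_var)

def alpha_if_deleted(data, item):
    """Alpha recomputed with one item removed; a rise flags a weak item."""
    reduced = [[v for j, v in enumerate(row) if j != item] for row in data]
    return cronbach_alpha(reduced)
```

An item whose removal raises alpha is pulling against the rest of the scale, which is the quantitative signal the study combined with expert opinion and student feedback.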
- Research Article
- 10.24952/ee.v12i2.12754
- Dec 23, 2024
- English Education : English Journal for Teaching and Learning
Item analysis is used to determine the quality of test items, i.e., whether or not they are applicable for assessing test takers’ ability. Accordingly, our research attempts to measure the quality of self-constructed English test items for 8th-grade students under Classical Test Theory (CTT) and Item Response Theory (IRT) using Rasch models. We examined reliability, item difficulty, discrimination power, and distractor effectiveness according to both theories. Overall, 30 multiple-choice items were handed out to 46 students. The items were analyzed quantitatively using the Quest.exe application. The results showed that the items are reliable, with values of 0.69 (CTT) and 1.0 (IRT), and that item difficulties varied: 12 easy, 14 moderate, and 4 difficult items based on CTT categorizations, while IRT demonstrated similar results. Only 1 item was inadequate for differentiating students’ ability and required revision; furthermore, 17 out of 30 items have effective distractors. This research is expected to contribute to item analysis and to demonstrations of Quest.exe for the same purposes.
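Distractor effectiveness as used above is commonly operationalized by the proportion of examinees choosing each wrong option. A hedged sketch using a common rule of thumb (the 5% threshold and all names are my own choices, not taken from the study):

```python
def distractor_analysis(choices, key, options="ABCD", threshold=0.05):
    """For each option, report the proportion choosing it and whether it
    functions as a distractor (a non-key option chosen by at least
    `threshold` of examinees)."""
    n = len(choices)
    report = {}
    for opt in options:
        p = sum(1 for c in choices if c == opt) / n
        report[opt] = {"prop": p, "functioning": opt != key and p >= threshold}
    return report
```

A distractor chosen by almost nobody adds no information and is usually rewritten, which is how counts like "17 out of 30 items have effective distractors" are obtained.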
- Research Article
- 10.1080/23311908.2021.1923166
- Jun 1, 2021
- Cogent Psychology
Positive psychology nurtures the potent qualities of individuals and aids them in carving a niche for themselves. Based on this theoretical foundation, a non-cognitive trait like grit plays an imperative role in attaining high achievement. Previous studies have identified three dimensions of grit: perseverance of effort, consistency of interest, and adaptability to situations. Recent research has criticized the consistency-of-interest dimension in a collectivist context. The present study provides an account of grit in view of eastern perspectives to check the suitability of the construct in India. The current findings provide a framework for the development and validation of the Multi-Dimensional Scale of Grit, which reveals four dimensions of grit, namely adaptability to situation, perseverance of effort, spirited initiative, and steadfastness in adverse situations. The study also provides insight regarding the duration of goal attainment with respect to grit. The research, conducted over three studies, included Indian university students to develop and examine the psychometric properties of grit. Study 1 focused on item analysis and development of the factor structure through exploratory factor analysis. Study 2 confirmed the previously obtained factor structure through confirmatory factor analysis. In Study 3, the psychometric properties of the scale were measured through test-retest reliability and criterion, convergent, and divergent validity. Results indicated that the Multi-Dimensional Scale of Grit is a reliable and valid measure. They also indicated that the obtained 12 items and four dimensions were in line with the relevant eastern perspective.
- Research Article
- 10.28933/ajerr-2021-09-2609
- Jan 1, 2021
- American Journal of Educational Research and Reviews
There are three purposes of this paper. The first is to present a brief introduction to item response theory in conjunction with marketing research. The second is to present a review of the current uses of item response theory in representative marketing research journals. The third is to present an example that illustrates and contrasts classical test theory and item response theory approaches to item and scale analysis. Several papers relevant to item response theory were recently published in various marketing research journals. Because models under item response theory, from simple to complex, were used without any systematic introduction in marketing research, this paper briefly presents the main concepts of item response theory. For the second purpose, a content analysis was done of 30 item-response-theory-relevant articles in marketing research journals. Articles were sorted based on a taxonomy of item response theory models. Many of the articles reviewed relied on some type of unidimensional dichotomous item response theory model. Articles published within the past 10 years used more complicated item response theory models, both mathematically and statistically, than previously published articles in marketing research journals. Lastly, data from a scale with three Likert-type items with four response categories were analysed using a traditional approach based on item statistics and coefficient alpha, as well as using an item response theory approach employing the graded response model. The main concepts of item response theory were explicated with figures.
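The graded response model applied in the example computes category probabilities as differences of adjacent cumulative logistic curves. An illustrative sketch with generic parameter names (not code from the paper):

```python
import math

def grm_probs(theta, a, bs):
    """Samejima's graded response model: probability of each ordered
    category 0..m given ability theta, discrimination a, and ordered
    between-category thresholds bs (length m)."""
    def p_star(b):
        # cumulative probability of responding at or above this boundary
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    cum = [1.0] + [p_star(b) for b in bs] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(bs) + 1)]
```

Because the thresholds are ordered, the category probabilities always sum to one and shift smoothly toward higher categories as theta grows, which is what makes the model natural for Likert-type items.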
- Research Article
- 10.1111/jocn.14895
- May 7, 2019
- Journal of Clinical Nursing
To validate the Nurse Practitioner Primary Care Organizational Climate Questionnaire (NP-PCOCQ) items using item response theory (IRT) models and conduct differential item functioning (DIF) analysis to test item functioning among nurse practitioners (NPs) practicing in different U.S. states with variable regulations governing NP practice. The NP-PCOCQ is the only NP-specific tool measuring the NP work environment and is being used in different U.S. states with variable NP scope-of-practice regulations, and internationally, to produce evidence about NP work environments within their organisations. A cross-sectional survey design was used to collect data from 278 primary care NPs in New York (NY) and 314 NPs in Massachusetts (MA). NPs completed the 29-item NP-PCOCQ. Data collection involved an online survey in NY and a mail survey in MA in 2012. We used Samejima's graded response model for IRT and ordinal logistic regression for DIF analysis. A STROBE checklist was completed. IRT models yielded discrimination parameters ranging from 0.98-4.65 in NY and 1.25-6.94 in MA. Item difficulty parameters were within the -3 to +3 range, suggesting a fair range of item difficulties exists in the scale. Only five of the 29 items on the NP-PCOCQ exhibited DIF, suggesting some other state-related factor besides the measured construct influenced item responses; these items were therefore removed. Our findings indicate that the shortened, 24-item NP-PCOCQ is capable of measuring the organisational climate of NPs practicing in different U.S. states. The NP-PCOCQ can be used in future research to measure the NP work environment. The tool can also be used by practice administrators to assess the NP work environment and identify deficiencies in order to address them. This evidence can be used by practice administrators to promote favourable work environments for NPs to deliver high-quality care.
- Research Article
- 10.1177/10731911241306370
- Dec 23, 2024
- Assessment
This study examined the item- and scale-level functioning of the Social Appearance Anxiety Scale (SAAS) as well as differential functioning by gender using an item response theory (IRT) analysis. SAAS data collected from 840 college students were analyzed. A graded response model was used to analyze the 16 items comprising the SAAS. The measure was found to be unidimensional in its factor structure, and every item demonstrated high to very high ability to differentiate respondents varying in levels of the underlying trait (i.e., appearance concerns). In addition, we found evidence of differential item functioning (DIF) by gender for four items, corresponding to small effect sizes. Two of these items were related to internal experiences of appearance concerns (e.g., nervousness and discomfort when a flaw is noticed by others) that were more likely to be endorsed by women, and two of the items were related to external evaluative experiences related to appearance (e.g., missing opportunities and life being more difficult) that were more likely to be endorsed by men. Overall, the IRT and DIF results suggest that the SAAS effectively identifies appearance concerns among individuals with low to very high appearance concerns.