  • New
  • Open Access
  • Research Article
  • 10.1177/10731911251407472
"Why Bother? There's Always Another Question": Shortening Bandura's Mechanisms of Moral Disengagement Scale.
  • Jan 3, 2026
  • Assessment
  • Romain Decrop + 4 more

Moral disengagement (MD), or the cognitive strategies used to avoid feelings of guilt in contexts of moral transgression, is an established cognitive risk factor for engagement in antisocial and criminal behaviors. In justice-involved samples, MD is most frequently measured using the 32-item Mechanisms of Moral Disengagement Scale (MMD). The current study aims to develop a short-form version of the MMD with strong psychometric properties and predictive utility. Using data from a longitudinal study of justice-involved youth, we generated theory-driven and data-driven short-form versions of the MMD. We then validated and compared the short-form versions to the full MMD in a different sample of justice-involved youth. Results indicate that a data-driven 11-item short form consistently performed well across both samples. Recommendations are made for future researchers interested in exploring MD, and implications for risk assessment are discussed.

  • New
  • Research Article
  • 10.1177/10731911251407880
In the Beast Mode: Predator Self-Identifications as a Model of Disagreeable Functioning.
  • Jan 3, 2026
  • Assessment
  • Michael D Robinson + 2 more

Some key sources of motivation are likely to be primitive. As a way of assessing such sources of motivation and their links to personality, participants in three studies (total N = 808) were asked to choose animals that they would prefer to be, with all pairs contrasting predator and prey animals. Individuals who select predator animals more often may wish to interact with the environment in a self-serving and callous manner. In support of such thinking, Study 1 linked higher levels of predator self-identification to lower levels of agreeableness and interpersonal warmth. Study 2 extended this model by showing that wishing to be predator animals was linked to self-serving behavior in economic games. Study 3 found inverse relationships between predator preferences and daily agreeableness levels in both between-person and within-person analyses. The findings, in total, highlight a motivation-based orientation to the environment that is disagreeable and self-serving.

  • New
  • Research Article
  • 10.1177/10731911251406404
Toward Digital Assessment of Developmental Dyslexia in Mainland China: Establishing Nationwide Norms With a GAMLSS Approach.
  • Dec 31, 2025
  • Assessment
  • Wenjuan Liu + 8 more

Existing diagnostic instruments for developmental dyslexia (DD) in mainland China are limited in generalizability and typically rely on traditional norming approaches, which require large sample sizes to achieve precision. This study aims to develop and validate the Beijing Normal University Diagnostic Tool for Chinese Mandarin Developmental Dyslexia (BNU-DTCMDD), a DD diagnostic tool with regression-based norms for elementary school students in mainland China. A nationally representative sample of 3,782 first-to-sixth-grade students and a clinical sample of 84 first-to-sixth-grade students diagnosed with specific learning disabilities (SLD) were administered the BNU-DTCMDD, comprising six tasks that measure reading abilities and related cognitive skills. The tool demonstrated high internal consistency (Cronbach's α .73-.99), good test-retest reliability (Pearson's r .68-.99), good structural validity, and reasonable criterion validity (Cohen's d 0.27-0.63). Norms were established using generalized additive models for location, scale, and shape (GAMLSS), yielding percentile curves and Z-scores. Based on the norms, the prevalence of DD was 6.08% in the normative sample and 73.81% in the clinical sample with SLD. The BNU-DTCMDD can diagnose DD in elementary school students in mainland China with good reliability and validity, and its regression-based norms overcome the statistical constraints of traditional norming and support timely diagnosis and intervention for DD.

  • New
  • Research Article
  • 10.1177/10731911251406405
Detecting Suicidal Ideation in Adolescence Using Self-Reported Emotional and Behavioral Patterns: Comparing Machine Learning and Large Language Model Predictions.
  • Dec 31, 2025
  • Assessment
  • Davide Marengo + 1 more

Suicidal ideation in adolescents is a critical public health issue requiring early detection. This study examined whether machine learning (ML) and large language models (LLMs) can detect ideation in 1,197 students (ages 10-15) using self-reported Strengths and Difficulties Questionnaire (SDQ) data. Clinically relevant ideation was defined using Suicidal Ideation Questionnaire-Junior (SIQ-JR) cut-offs. Gemini 1.5 Pro and GPT-4o were prompted to estimate SIQ-JR scores from SDQ responses and demographics; Logistic Regression, Naive Bayes, and Random Forest models were trained on either SDQ data or LLM predictions. LLM predictions correlated with SIQ-JR (ρ = .61) and showed good discrimination across thresholds (area under the curve (AUC) ≥ .83), with item-level associations paralleling self-reports, revealing strong associations with emotional symptoms and peer problems. In cross-validated analyses, the best SDQ-based ML model reached sensitivity = .85 and specificity = .72; the best LLM-based model achieved .80 and .74. Notably, ML models trained directly on SDQ responses consistently outperformed those incorporating LLM predictions across all SIQ-JR thresholds. Nonetheless, LLMs demonstrated promising accuracy in identifying suicidal ideation based on SDQ and demographic data. Further refinement and validation are required before these approaches can be considered viable for clinical implementation.

  • New
  • Research Article
  • 10.1177/10731911251395993
Detecting Cry in Daylong Audio Recordings Using Machine Learning: The Development and Evaluation of Binary Classifiers.
  • Dec 30, 2025
  • Assessment
  • Lauren M Henry + 14 more

Atypical cry in infants/toddlers may serve as an early, ecologically valid, and scalable indicator of irritability, a transdiagnostic mental health risk marker. Machine learning may identify cry in daylong audio recordings, a step toward predicting outcomes. We developed a novel cry detection algorithm and evaluated performance against our reimplementation of an existing algorithm. In PyTorch, we reimplemented a support vector machine classifier that uses acoustic and deep spectral features from a modified AlexNet. We developed a novel classifier combining wav2vec 2.0 with conventional audio features and gradient boosting machines. Both classifiers were trained and evaluated using a previously annotated open-source data set (N = 21). In a new data set (N = 100), we annotated cry and examined the performance of both classifiers in identifying this ground truth. The existing and novel algorithms performed well in identifying ground truth cry in both the data set in which they were developed (AUCs = 0.897, 0.936) and the new data set (AUCs = 0.841, 0.902), underscoring generalization to unseen data. Bayesian comparison demonstrated that the novel algorithm outperformed the existing algorithm, which can be attributed to the novel algorithm's feature space and use of gradient boosting machines. This research provides a foundation for efficient detection of atypical cry patterns, with implications for earlier identification of dysregulated irritability presaging psychopathology.

  • New
  • Research Article
  • 10.1177/10731911251401405
Conceptualization and Measurement of Anxious Freezing.
  • Dec 30, 2025
  • Assessment
  • Maya A Marder + 3 more

Studies of passive freeze behavior, an innate reaction to perceived or actual threat, have largely been concerned with its physical manifestations in the face of imminent danger (e.g., tonic immobility). Relatively little work has examined psychological aspects of the freezing phenomenon (e.g., cognitive freezing and threat evaluation) that may contribute significantly to the freezing episode. The present research considers dimensions of freezing, a set of contexts that may elicit freezing, and ways freezing relates to other internalizing symptoms or previous experiences of traumatic life events. The Anxious Freezing Questionnaire (AFQ) was developed using university samples (N = 653, N = 447, N = 590). Scale development best practices characterized a three-factor solution yielding physical freezing, cognitive freezing, and threat evaluation factors with good reliability and validity that were moderately correlated with, yet distinguishable from, other anxiety scales. Findings indicate that social-evaluative and performance contexts are relevant for freezing episodes. Results showed that previous experiences of traumatic events were significantly associated with higher levels of anxious freezing across all factors. This instrument has promise for identifying individual differences in profiles of anxiety-related freezing, with consideration of dimensional symptoms and a range of freezing-related contexts that may occur in everyday life.

  • New
  • Research Article
  • 10.1177/10731911251401321
Using Generative Artificial Intelligence to Advance Hypothesis-Driven Scale Validation: Identifying Criterion Measures and Generating Precise a Priori Hypotheses.
  • Dec 29, 2025
  • Assessment
  • Kyle D Austin + 3 more

We propose, illustrate, and evaluate the use of artificial intelligence (AI) to advance rigorous hypothesis-driven scale validation. Using a qualitative approach, we found that AI provided useful suggestions for measures to be used as criteria in scale validation research. Using data and expert predictions previously used to validate nine scales/subscales, we evaluated AI's ability to produce precise, psychologically reasonable validity hypotheses. ChatGPT and Gemini produced hypotheses with "inter-trial consistency" similar to experts' "inter-rater consistency," and their hypotheses agreed strongly with experts' hypotheses. Importantly, their hypothesized validity correlations were roughly as accurate (in terms of corresponding with actual validity correlations) as were experts' hypotheses. Replicating across nine scales/subscales, results are encouraging regarding the use of AI to facilitate a precise hypothesis-driven approach to convergent and discriminant validity in a way that saves time with little-to-no cost in psychological or psychometric quality.

  • New
  • Research Article
  • 10.1177/10731911251403907
Self-Esteem Assessment Based on Self-Introduction: A Multimodal Approach to Personality Computing.
  • Dec 29, 2025
  • Assessment
  • Xinlei Zang + 1 more

The present study aimed to develop and validate a multimodal self-esteem recognition method based on a self-introduction task, with the goal of achieving automated self-esteem evaluation. We recruited two independent samples of undergraduate students (N = 211 and N = 63) and collected 40-second self-introduction videos along with Rosenberg Self-Esteem Scale (RSES) scores. Features were extracted from three modalities (visual, audio, and text), and three-class models were trained using the dataset of 211 participants. Results indicated that the late-fusion multimodal model achieved the highest performance (Accuracy, ACC = 0.447 ± 0.019; Macro-averaged F1, Macro-F1 = 0.438 ± 0.020) and further demonstrated cross-sample generalizability when validated on the independent sample of 63 participants (ACC = 0.381, Macro-F1 = 0.379). Reliability testing showed good interrater consistency (Fleiss' κ = 0.723, Intraclass Correlation Coefficient, ICC = 0.745). Criterion-related validity analyses indicated that the proposed method was significantly correlated with life satisfaction, subjective happiness, positive and negative affect, depression, anxiety, stress, relational self-esteem, and collective self-esteem. Moreover, incremental validity analyses indicated that the multimodal model provided additional predictive value for positive affect beyond the RSES. Taken together, these findings provide preliminary evidence that multimodal behavioral features can assist in achieving automated self-esteem evaluation, offering a feasible, low-burden complement to traditional self-report.

  • New
  • Research Article
  • 10.1177/10731911251401306
Evaluating Continuous Performance Tests as Embedded Measures of Performance Validity in ADHD Assessments: A Systematic Review and Meta-Analysis.
  • Dec 28, 2025
  • Assessment
  • Pinar Toptas + 5 more

Assessing the credibility of presented problems is an essential part of the clinical evaluation of attention-deficit/hyperactivity disorder (ADHD) in adulthood. We conducted a systematic review and meta-analysis to examine Continuous Performance Tests (CPTs) as embedded validity indicators. Eighteen studies (n = 3,021; 67 effect sizes) were analyzed: eight simulation studies and ten analogue studies. Moderating variables included study design (simulation vs. criterion) and sample type (student vs. nonstudent). CPTs effectively distinguish between credible and noncredible performance (g = 0.73). Effect sizes were nearly twice as large in simulation studies (g = 0.94) compared to criterion group studies (g = 0.55), underscoring the influence of study design on the interpretation of research findings. Student and nonstudent groups did not differ significantly. CPTs are valuable as embedded validity indicators. Given the moderate effects, clinical decisions should not rely on a single CPT but on a variety of measures.

  • New
  • Research Article
  • 10.1177/10731911251399030
One Construct or Many? Clarifying the Structure and Meaning of Measures of Psychological and Cognitive Flexibility and Their Components in a Community and Chronic Pain Sample.
  • Dec 26, 2025
  • Assessment
  • Jayden Lucas + 5 more

There is a plethora of "flexibility" constructs and measures in psychology, but the extent to which they assess the same or different constructs, and whether flexibility and inflexibility are separate constructs (vs. extremes of the same bipolar continuum), remains underexplored. We examined the distinctiveness of seven different self-report measures of psychological (in)flexibility and cognitive flexibility using an online community (N = 465) and a chronic pain sample (N = 445). We analyzed the latent structure of these questionnaires using item-level exploratory structural equation modeling that controlled for measure-specific variance, and we tested these factors in relation to a range of mental health outcomes (concurrent validity) and discriminant validity measures. Findings indicate that psychological and cognitive flexibility questionnaires can be characterized at multiple levels, including six lower-order components that span individual measures and global factors that account for their shared variance. The six factors were broadly and uniquely associated with clinically relevant variables, including symptoms and well-being. We also found support for the notion that flexibility and inflexibility exist on a single bipolar continuum, rather than being characterized as separate. Implications for clinical assessment in research and intervention settings are discussed.