- New
- Research Article
- 10.3390/jintelligence14050079
- May 5, 2026
- Journal of Intelligence
- Wenbo Du + 2 more
Q-matrix construction is a foundational yet challenging step in cognitive diagnostic assessment (CDA) and has traditionally relied on labor-intensive and subjective methods such as expert judgment and verbal report analysis. This study explores the potential of generative artificial intelligence (GenAI) to optimize this critical process within the domain of EFL reading. By applying three GenAI models (DeepSeek-V3.2, Kimi 2.5, and Doubao 2.0), three purely GenAI-informed Q-matrices (Qmat-DS, Qmat-K, and Qmat-DB) were generated, and through expert revision, a human–AI collaborative Q-matrix (Qmat-DS-H) was obtained. These were compared with an expert-constructed Q-matrix (Qmat-E) and a student-derived Q-matrix (Qmat-S). Using a simulated dataset (N = 1000) and empirical response data from 1083 EFL learners on a diagnostic reading test, the psychometric performance of the six Q-matrices was estimated via the G-DINA, ACDM, and RRUM models. Results demonstrated that the human–AI collaborative Q-matrix consistently outperformed the other five Q-matrices, achieving the best absolute model-data fit, the highest classification accuracy, the most stable item parameters, and the most balanced attribute correlation structure. The purely GenAI-informed Q-matrices showed mixed results: relative fit and slip stability improved somewhat compared to manually constructed Q-matrices, but absolute fit and attribute correlation patterns were variable. The findings substantiate GenAI as a feasible pathway for enhancing the efficiency, consistency, and psychometric quality of Q-matrix construction. This study offers a preliminary framework for advancing CDA development, addressing a key methodological bottleneck in language assessment.
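As a rough illustration of what a Q-matrix encodes, the sketch below computes response probabilities under the basic DINA model, a simpler relative of the G-DINA, ACDM, and RRUM models named in the abstract and not the models the study fitted. The Q-matrix, attribute profile, and guess/slip values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def dina_prob(alpha: Sequence[int], q_row: Sequence[int],
              guess: float, slip: float) -> float:
    """P(correct response) for one examinee on one item under DINA."""
    # The examinee is "prepared" only if every attribute the item
    # requires (q = 1) has been mastered (alpha = 1).
    prepared = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if prepared else guess


# Hypothetical Q-matrix: rows are items, columns are attributes.
Q = [
    [1, 0],  # item 1 requires attribute 1 only
    [0, 1],  # item 2 requires attribute 2 only
    [1, 1],  # item 3 requires both attributes
]

# Hypothetical examinee who has mastered attribute 1 but not attribute 2.
alpha = [1, 0]

probs = [dina_prob(alpha, row, guess=0.2, slip=0.1) for row in Q]
```

Changing a single entry of `Q` changes which examinees count as prepared for an item, which is why competing Q-matrices can yield different model fit and classification accuracy on the same response data.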
- New
- Research Article
- 10.3390/jintelligence14050077
- May 2, 2026
- Journal of Intelligence
- Juyoung Jung + 2 more
In intelligence research, the sharing of item response data from cognitive ability assessments is often restricted by privacy concerns, while traditional parametric simulation methods frequently fail to capture complex response dependencies. This study proposes a neural network copula (NNC) framework for generating synthetic dichotomous item response data that preserves essential psychometric properties without revealing sensitive examinee information. By decoupling the modeling of marginal item probabilities from the dependence structure using a deep autoencoder and kernel density estimation, the framework accommodates the discrete nature of binary item response data while minimizing distributional assumptions. Validation against large-scale empirical data demonstrated high correspondence across multiple facets. At the data consistency level, the NNC-based synthetic data reproduced total score distributions and inter-item correlations. Psychometrically, the method yielded consistent item characteristic curve parameter estimates, item fit statistics, and test information functions. Furthermore, Monte Carlo replications demonstrated algorithmic stability and inferential precision.
- New
- Research Article
- 10.3390/jintelligence14050076
- May 2, 2026
- Journal of Intelligence
- José Antonio Azuela + 5 more
This study examines the validity of the Spanish version of the Scientific Epistemic Beliefs (SEB) Questionnaire among university students in northeastern Mexico, considering multiple sources of evidence. The SEB measures four dimensions of epistemic beliefs: Source, Certainty, Development, and Justification. Data from pilot (n = 150) and main (n = 791) samples were analyzed using Exploratory and Confirmatory Factor Analyses (EFA, CFA), Item Response Theory (IRT), and Differential Item Functioning (DIF). The results provided evidence consistent with a four-factor model, with adequate internal consistency (α = 0.85) and acceptable-to-good fit indices (CFI = 0.944, TLI = 0.936, RMSEA = 0.067, SRMR = 0.071) for a 22-item scale. IRT analyses indicated strong item discrimination, with Source and Certainty covering a broad range of the latent trait, while Development and Justification were more informative at lower to moderate levels. DIF analyses indicated negligible differences in item functioning by gender and academic semester, with minor DIF detected across faculties. Non-parametric analyses identified statistically significant but small differences, with females scoring slightly higher across all dimensions and variations also observed across academic semesters and faculties. Descriptive comparisons with published international data provide contextual evidence within a broader cross-cultural framework.
- New
- Research Article
- 10.3390/jintelligence14040071
- Apr 21, 2026
- Journal of Intelligence
- Lilan Chen + 2 more
Mathematical abilities are critical for the developmental outcomes of children with autism spectrum disorder (ASD). However, little is known about these abilities and their association with the approximate number system (ANS) in preschoolers with ASD outside Western samples, for example among Chinese children. This cross-sectional study examined whether formal and informal mathematical abilities differed between children with and without ASD and assessed the extent to which these abilities were associated with ANS acuity. Participants included 47 children with ASD and 47 typically developing (TD) children aged 3-7 years. All children were assessed on measures of formal and informal mathematical abilities, ANS acuity, and non-verbal IQ. No significant group differences in mathematical abilities were found among children aged 3-5 years. However, among children aged 6-7 years, the ASD group showed significantly lower performance in mathematical abilities compared to their TD peers. ANS acuity was significantly correlated with both formal and informal mathematical abilities in the ASD group, but only with informal mathematical abilities in the TD group. Furthermore, ANS acuity accounted for 5.4% of the unique variance in formal mathematical abilities specifically within the ASD group. The patterns of mathematical abilities and their relationship with ANS acuity differ between preschoolers with and without ASD. These findings suggest a differential association between ANS and formal mathematics learning in children with ASD, highlighting implications for the design of early numeracy interventions.
- New
- Research Article
- 10.3390/jintelligence14040070
- Apr 20, 2026
- Journal of Intelligence
- Huan Kuang
Computer-based assessments generate rich process data that capture examinees' interactions with test items. Using process data from the U.S. PISA 2012 computer-based mathematics assessment sample, this study applied recurrent neural networks to predict item-level correctness and assessment-level latent proficiency. The analysis also examined the impact of expert-engineered features, levels of architectural complexity, action variability, and score variability on model performance. At the item level, most models achieved AUC values around 0.80, indicating good predictive performance. Moderate correlations were observed between latent proficiency from 30 items and predictions based on process data from a subset of items (n = 10). For item-level models, adding expert-engineered features reduced training time and may have improved predictive performance when action variability was low. For the assessment-level models, adding expert-engineered features improved performance. Model complexity, including model type (i.e., standard RNN, GRU, and LSTM), number of nodes, and number of layers, had little effect on accuracy and efficiency. Moreover, items with greater action variability were associated with better model performance. The findings suggest that simple neural network architectures are sufficient for modeling process data with limited action variability and that combining action sequences with expert-engineered features improves accuracy, efficiency, and interpretability.
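To make the AUC figures above concrete, the following sketch computes ROC AUC from its rank interpretation: the probability that a randomly chosen correct response receives a higher predicted score than a randomly chosen incorrect one, with ties counted as one half. The labels and scores are made up for illustration and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical item-level predictions: 1 = correct response, 0 = incorrect.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; for large datasets a sorted-rank implementation would be the usual choice.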
- Research Article
- 10.3390/jintelligence14040069
- Apr 17, 2026
- Journal of Intelligence
- Colin Peperkorn + 1 more
Matrix reasoning tests are frequently used to measure intelligence and identify gifted students across domains. To date, there is limited evidence on the usefulness of contextualised tasks for identifying domain-specific giftedness. In the current study, matrix reasoning tasks tailored to biological contexts were developed and validated for students in grades 3-6. The tasks were evaluated across two research cycles, involving a total of N = 895 students (n1 = 470; n2 = 425). An item analysis based on item response theory indicated acceptable item parameters and fit indices for the final item pool. Correlation analyses revealed moderate-to-strong associations with IQ, assessed via abstract matrix reasoning, as well as with domain-specific achievement in biological inquiry processes. A known-groups comparison revealed that students identified as gifted in biology outperformed a comparison group of peers, providing preliminary known-groups validity evidence for the developed tasks. Overall, the matrix reasoning tasks tailored to biology showed acceptable psychometric properties and positive correlations with achievement in biological inquiry, and the study provided initial evidence of their usefulness for identifying gifted students in biology.
- Research Article
- 10.3390/jintelligence14040068
- Apr 17, 2026
- Journal of Intelligence
- Shangqing Yuan + 5 more
Creative problem solving often fails because people rely on heuristic responses reinforced by prior experience. According to the default-interventionist account, analytic intervention can override these heuristic defaults only when the semantic system provides access to competing representations. We tested this prediction using a modified Chinese Remote Associates Task in which two factors were independently manipulated: semantic accessibility (high vs. low) and situational induction (strong vs. weak). A significant interaction emerged: strong induction impaired performance only under low semantic accessibility, whereas high semantic accessibility was associated with attenuated induction costs. This pattern is consistent with semantic accessibility serving as a cognitive buffer that may support analytic override of induced heuristic defaults. A separate comparison between induction and non-induction trials confirmed that induction reliably produced a mental set. These findings resolve conflicting claims about the role of semantic knowledge in creativity by showing that knowledge both constrains and enables insight depending on its interaction with experience-driven heuristics.
- Research Article
- 10.3390/jintelligence14040067
- Apr 17, 2026
- Journal of Intelligence
- Csongor Toth + 9 more
The increasing prevalence of digital media use among children and adolescents has raised concerns regarding its potential impact on cognitive and communication development. Previous research has linked higher screen exposure to poorer language outcomes; however, the mechanisms underlying these associations remain insufficiently understood, particularly with respect to pragmatic communication. The present study aimed to examine the relationships between daily screen time, executive functioning (EF), and communication-related outcomes, and to test whether EF mediates the association between digital media exposure and pragmatic communication and language performance. A cross-sectional observational study was conducted with 240 children and adolescents aged 6-15 years. Caregivers reported children's daily screen time, digital consumption and communication skills. EF was assessed using performance-based tasks measuring inhibitory control, working memory, and cognitive flexibility. Language performance was evaluated using a standardized composite measure. Pearson correlations, mediation analyses with bootstrapped confidence intervals, and factorial analyses of variance were performed, controlling for age, sex, parental mediation, and educational content exposure. Higher daily screen time was significantly associated with lower EF, weaker pragmatic communication, and poorer language performance. EF was positively related to both pragmatic and language outcomes and partially mediated the relationship between screen time and communication measures. Educational digital content and parental mediation showed positive associations with EF and communication outcomes, whereas recreational content exhibited negative associations. Group comparisons indicated that negative associations between screen exposure and developmental outcomes were more pronounced in younger children. 
These findings suggest that EF may represent a key intermediary mechanism underlying the association between digital media exposure and communication-related development. The results highlight the importance of considering not only the quantity but also the quality and context of children's digital media use, particularly during early developmental stages.
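The mediation analysis with bootstrapped confidence intervals described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses synthetic data with hypothetical effect sizes, omits the covariates the study controlled for (age, sex, parental mediation, educational content exposure), and uses a plain percentile bootstrap. The names `screen`, `ef`, and `comm` are placeholders.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """a*b, with a from m ~ x and b the coefficient of m in y ~ x + m."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, reps=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(reps):
        # Resample cases with replacement, keeping (x, m, y) triples together.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    return estimates[int(tail * reps)], estimates[int((1 - tail) * reps) - 1]


# Synthetic data only: screen time lowers EF, EF raises communication scores.
rng = random.Random(42)
n = 240
screen = [rng.gauss(0, 1) for _ in range(n)]
ef = [-0.5 * s + rng.gauss(0, 1) for s in screen]
comm = [0.6 * e - 0.2 * s + rng.gauss(0, 1) for s, e in zip(screen, ef)]

point = indirect_effect(screen, ef, comm)
ci_low, ci_high = bootstrap_ci(screen, ef, comm)
```

An interval that excludes zero is the usual evidence for an indirect effect; because the synthetic direct path (`-0.2`) is also nonzero, this toy setup corresponds to partial rather than full mediation.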
- Research Article
- 10.3390/jintelligence14040066
- Apr 15, 2026
- Journal of Intelligence
- Abdullah + 7 more
Early diagnosis of cognitive decline is vital for timely treatment of mild cognitive impairment (MCI) and Alzheimer's disease (AD), yet standard clinical assessments often miss subtle longitudinal language changes. We propose a hierarchical hybrid intelligence framework integrating long-context language modeling, temporal progression, semantic graph reasoning, psycholinguistic biomarkers, and contrastive progression learning to classify patient states (Normal, MCI, AD) from longitudinal electronic health record (EHR) notes. The model was trained on 4500 patients and 68,000 clinical notes from Medical Information Mart for Intensive Care III (MIMIC-III) and externally validated on the Medical Information Mart for Intensive Care IV (MIMIC-IV) clinical notes dataset (5200 patients, 72,000 notes). Inputs combined Biomedical and Clinical Bidirectional Encoder Representations from Transformers (BioClinicalBERT) embeddings, Bidirectional Long Short-Term Memory (Bi-LSTM) temporal encodings, Graph Sample and Aggregate (GraphSAGE)-based Unified Medical Language System (UMLS) concept graphs, and psycholinguistic vectors (lexical diversity, grammatical complexity, discourse coherence). On the MIMIC-III hold-out set, the model achieved 99.999% accuracy, a macro F1-score of 0.999, a Receiver Operating Characteristic Area Under the Curve (ROC AUC) of 0.999, and a temporal stability variance of 0.0008. Monte Carlo cross-validation (10,000 folds) yielded 99.997±0.003% accuracy and 0.999±0.001 macro F1. Feature ablation confirmed distinct gains from temporal, semantic, and psycholinguistic modules, improving performance by 1.1% over text-only baselines. Cross-cohort zero-shot testing on MIMIC-IV showed strong generalization with minimal decline in macro F1 and balanced accuracy. 
Explainability analyses, including SHapley Additive exPlanations (SHAP) token/concept attribution, attention maps, counterfactual perturbations, and psycholinguistic importance, revealed clinically interpretable markers, such as pronoun overuse, reduced lexical diversity, and syntactic simplification, as predictors of decline. Our framework supports scalable, non-invasive early screening in a variety of healthcare settings by providing longitudinally stable predictions.
- Research Article
- 10.3390/jintelligence14040065
- Apr 14, 2026
- Journal of Intelligence
- Ziyang Huang + 2 more
Generative artificial intelligence (GenAI) is rapidly transforming design education by enabling new forms of human-AI collaborative learning. However, how GenAI relates to cognitive and motivational processes in design learning contexts remains insufficiently understood. This study examines whether integrating GenAI into visual design instruction is associated with improvements in domain-specific creative performance and explores the relationships among cognitive load, learning motivation, and learning outcomes. A six-week randomized instructional experiment was conducted with 120 undergraduate students majoring in visual communication design. Creative performance was evaluated through blind expert ratings, and the relationships among key variables were analyzed using Partial Least Squares Structural Equation Modeling (PLS-SEM). The results show that GenAI-integrated instruction is associated with higher levels of learning motivation, engagement, and expert-rated creative performance compared with traditional instruction, whereas cognitive-load indicators show comparatively limited predictive strength within the overall model. In addition, Integrated Teaching Alignment (ITA) significantly moderates the relationship between perceived relevance and learning satisfaction. These findings suggest that GenAI may function as an external cognitive support tool, with learning outcomes appearing to be associated with motivational and instructional factors, while cognitive-load indicators show comparatively limited associations within this instructional context.