Background
Machine learning classifications of first-episode psychosis (FEP) using neuroimaging have predominantly analyzed brain volumes. Some studies have examined cortical thickness data, but most of them used parcellation approaches with data from single sites, which limits claims of generalizability. To address these limitations, we conducted a large-scale, multi-site analysis of cortical thickness comparing parcellation and vertex-wise approaches. By leveraging the multi-site nature of the study, we further investigated how different demographic and site-dependent variables affected predictions. Finally, we assessed relationships between the predictions and clinical variables.

Methods
A total of 428 subjects with FEP (147 females, mean age 27.14 years) and 448 healthy controls (HC; 230 females, mean age 27.06 years) were enrolled in 8 centers by the ClassiFEP group. All subjects underwent a structural MRI (sMRI) session and were clinically assessed. Cortical thickness parcellations (68 areas) and full cortical maps (20,484 vertices) were extracted. Supervised classification (linear Support Vector Machine) was used to differentiate FEP from HC within a repeated nested cross-validation (CV) framework, using the NeuroMiner software. A 10-fold CV cycle was employed in both the inner and the outer CV. The nested CV was repeated at the outer cycle by randomly permuting the participants within their groups (10 permutations) and rerunning the CV cycle for each permutation. Feature preprocessing consisted of regression of covariates (age, sex, and site), Principal Component Analysis, and scaling. All preprocessing steps were implemented within the CV.
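The repeated nested CV scheme described above can be sketched as follows. This is a minimal illustration in scikit-learn, not the NeuroMiner pipeline used in the study: the `CovariateResidualizer` class, the `nested_cv_bac` function, the hyperparameter grid, the 80% explained-variance threshold for PCA, the synthetic data, and the reduced fold and repeat counts (5 folds, 2 repeats, chosen only to keep the example fast) are all assumptions. The point it shows is that covariate regression, PCA, and scaling are fitted on training folds only.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


class CovariateResidualizer(BaseEstimator, TransformerMixin):
    """Hypothetical helper: regress the first `n_cov` columns (covariates)
    out of the remaining columns, using coefficients fitted on training data."""

    def __init__(self, n_cov=2):
        self.n_cov = n_cov

    def _design(self, X):
        # Intercept plus covariate columns.
        return np.column_stack([np.ones(len(X)), X[:, :self.n_cov]])

    def fit(self, X, y=None):
        self.coef_, *_ = np.linalg.lstsq(
            self._design(X), X[:, self.n_cov:], rcond=None
        )
        return self

    def transform(self, X):
        return X[:, self.n_cov:] - self._design(X) @ self.coef_


def nested_cv_bac(X, y, n_cov=2, n_repeats=2, n_folds=5, seed=0):
    """Mean balanced accuracy over repeated nested CV (assumed settings)."""
    pipe = Pipeline([
        ("residualize", CovariateResidualizer(n_cov=n_cov)),
        ("pca", PCA(n_components=0.8, svd_solver="full")),
        ("scale", StandardScaler()),
        ("svm", LinearSVC(dual=True, max_iter=10000)),
    ])
    grid = {"svm__C": [0.01, 0.1, 1.0]}  # assumed search grid
    scores = []
    for r in range(n_repeats):
        # A new shuffle per repeat plays the role of the subject permutation.
        inner = StratifiedKFold(n_folds, shuffle=True, random_state=seed + r)
        outer = StratifiedKFold(n_folds, shuffle=True, random_state=seed + 100 + r)
        # Inner loop tunes C; outer loop estimates out-of-sample performance.
        search = GridSearchCV(pipe, grid, cv=inner, scoring="balanced_accuracy")
        scores.append(
            cross_val_score(search, X, y, cv=outer, scoring="balanced_accuracy")
        )
    return float(np.mean(scores))


# Synthetic stand-in data: 2 covariate columns followed by 30 feature columns.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n // 2)
cov = rng.normal(size=(n, 2))
feat = rng.normal(size=(n, 30)) + 0.5 * y[:, None] + cov @ rng.normal(size=(2, 30))
X = np.hstack([cov, feat])
bac = nested_cv_bac(X, y)
print(f"Mean balanced accuracy: {bac:.3f}")
```

Wrapping every preprocessing step in the `Pipeline` is the design choice that matters here: it keeps test-fold information out of the covariate regression, the PCA, and the scaling parameters.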
Further analyses were conducted by stratifying the sample by MRI scanner and sex, and by performing random resampling with progressively smaller sample sizes.

Results
Vertex-wise thickness maps outperformed the parcellation-based approach, with a balanced accuracy (BAC) of 66.2% and an Area Under the Curve (AUC) of 72%, compared to a BAC of 59% and an AUC of 61% for the ROI-based approach. The two BACs were significantly different according to McNemar's test. By stratifying our sample by MRI scanner, we increased the overall BAC to more than 70% and also increased generalizability across sites. Temporal areas were the most influential regions in the classification. The predictive decision scores correlated significantly with age at onset, duration of treatment, and the presence of affective vs non-affective psychosis.

Discussion
Cortical thickness could represent a valid measure to classify FEP subjects, with temporal areas emerging as potential markers in the early stages of psychosis. The assessment of site-dependent variables allowed us to increase the across-site generalizability of the model, thus attempting to address an important machine learning limitation, especially in the framework of large multi-site cohorts and big data analysis.
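For readers unfamiliar with McNemar's test as used to compare two classifiers on the same subjects, the sketch below shows the exact (binomial) variant. The abstract does not state which variant was used, so the exact form, the function name `mcnemar_exact`, and the toy prediction vectors are assumptions. The test relies only on the discordant pairs, that is, subjects classified correctly by one model and incorrectly by the other.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired correct/incorrect outcomes.

    Returns (b, c, p_value), where b counts cases only model A got right
    and c counts cases only model B got right.
    """
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if o and not a)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    # Under H0, discordant pairs split 50/50 between the two models.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy example (not study data): 14 subjects, 9 discordant pairs.
vertex_correct = [True] * 8 + [False] * 1 + [True] * 5
roi_correct = [False] * 8 + [True] * 1 + [True] * 5
b, c, p_value = mcnemar_exact(vertex_correct, roi_correct)
print(b, c, round(p_value, 4))
```

With larger samples, the chi-square approximation of the same test is commonly used instead of the exact form.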