Abstract

Background: The diagnosis of major depressive disorder (MDD) currently relies mainly on the patient's self-report and clinical symptoms. Machine learning methods have been used to identify MDD from resting-state functional magnetic resonance imaging (rs-fMRI) data. However, because of large site differences in multisite rs-fMRI data and the difficulty of sample collection, most current machine learning studies use small rs-fMRI datasets to detect alterations in functional connectivity (FC) or network attributes (NA), which may affect the reliability of their results.
Methods: Multisite rs-fMRI data were used to increase the sample size. FC and NA features were extracted from the rs-fMRI data of 1611 participants (832 patients with MDD (MDDs) and 779 healthy controls (HCs)). The ComBat algorithm was used to harmonize the variance introduced by the multisite effect, and multivariate linear regression was used to remove the age and sex covariates. A two-sample t-test and wrapper-based feature selection methods (support vector machine recursive feature elimination with cross-validation (SVM-RFECV) and LightGBM's "feature_importances_" attribute) were used to select important features. The Shapley additive explanations (SHAP) method was used to quantify each feature's contribution to the best-performing classification model.
Results: The best result was obtained with the LinearSVM model trained on the 136 important features selected by SVM-RFECV. In nested five-fold cross-validation (an outer and an inner loop of five-fold cross-validation) on the 1611 participants, the model achieved an accuracy, sensitivity, and specificity of 68.90%, 71.75%, and 65.84%, respectively. The 136 important features were then tested on a small dataset and yielded excellent classification results after the ratio of patients with depression to HCs was balanced.
Conclusions: The combined use of FC and NA features is effective for classifying MDDs and HCs. The important FC and NA features extracted from the large-sample dataset show some generalization performance and may serve as a reference for the altered brain functional connectivity networks in MDD.
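
The nested five-fold cross-validation with SVM-RFECV described in the Methods and Results can be illustrated with a minimal scikit-learn sketch. It is not the authors' pipeline: the data are synthetic (generated with `make_classification`), ComBat harmonization and covariate regression are omitted, and the function name `nested_cv_metrics`, the elimination step of 5 features, and the `dual=False` solver setting are assumptions made only for illustration. Feature selection runs inside each outer training fold, so the held-out fold never influences which features are kept.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def nested_cv_metrics(X, y, n_splits=5, seed=0):
    """Hypothetical helper: nested CV with RFECV feature selection + linear SVM."""
    outer = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    y_pred = np.empty_like(y)
    for train_idx, test_idx in outer.split(X, y):
        # Inner loop: RFECV picks the number of features using only the outer training fold.
        inner = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        selector = RFECV(LinearSVC(dual=False), step=5, cv=inner, scoring="accuracy")
        model = make_pipeline(StandardScaler(), selector, LinearSVC(dual=False))
        model.fit(X[train_idx], y[train_idx])
        # Outer loop: predict the held-out fold with the selected features.
        y_pred[test_idx] = model.predict(X[test_idx])
    tn, fp, fn, tp = confusion_matrix(y, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / len(y),
        "sensitivity": tp / (tp + fn),  # true positive rate (label 1 = patient, by assumption)
        "specificity": tn / (tn + fp),  # true negative rate (label 0 = control, by assumption)
    }


# Synthetic stand-in for FC/NA features; sizes are arbitrary and far smaller than the study's.
X, y = make_classification(n_samples=200, n_features=30, n_informative=8, random_state=0)
metrics = nested_cv_metrics(X, y)
print(metrics)
```

Pooling the out-of-fold predictions before computing the confusion matrix is one common way to report nested cross-validation results; averaging per-fold metrics is an equally reasonable alternative and may differ slightly.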

