Background: Youths face significant mental health challenges exacerbated by stressful life events, particularly in the context of the COVID-19 pandemic. Immature coping strategies can worsen mental health outcomes.
Methods: This study utilised a two-wave cross-sectional survey design with data collected from Chinese youth aged 14–25 years. The Wave 1 (N = 3038) and Wave 2 (N = 539) datasets were used for model development and external validation, respectively. Twenty-five features, covering demographic information, stressful life events, social support, coping strategies, and emotional intelligence, were input into the model to predict the mental health status of youth, which was treated as their coping outcome. Shapley additive explanations (SHAP) were used to rank the importance of each risk factor during feature selection. The intersection of the top 10 features identified by random forest and by XGBoost was taken as the set of most influential predictors of mental health and used as the final feature set for model development. Machine learning models, including logistic regression, AdaBoost, and a backpropagation neural network (BPNN), were trained to predict the outcomes. The optimum model was selected according to its performance in both internal and external validation.
Results: This study identified six key features that were significantly associated with mental health outcomes: punishment, adaptation issues, self-regulation of emotions, learning pressure, use of social support, and recognition of others' emotions. The BPNN model, trained on the SHAP-selected features, demonstrated superior performance in internal validation (C-index [95% CI] = 0.9120 [0.9111, 0.9129], F-score [95% CI] = 0.8861 [0.8853, 0.8869]). In external validation, the model showed strong discrimination (C-index = 0.9749, F-score = 0.8442) and calibration (Brier score = 0.029).
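The feature-selection rule described in the Methods (keep only the features ranked in the top 10 by both models) and the Brier score reported in the Results can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy importance values are hypothetical, and the importance scores are assumed to be something like mean absolute SHAP values computed separately for each model.

```python
def top_k(importance, k):
    """Return the names of the k features with the largest importance score."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:k]


def shared_top_features(importance_a, importance_b, k=10):
    """Features ranked in the top k by BOTH models, kept in model A's order."""
    top_b = set(top_k(importance_b, k))
    return [name for name in top_k(importance_a, k) if name in top_b]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0 means perfectly confident, perfectly correct predictions.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and equal in length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical importance scores (NOT values from the study).
rf_importance = {"punishment": 0.30, "learning_pressure": 0.20, "age": 0.10}
xgb_importance = {"punishment": 0.25, "gender": 0.22, "learning_pressure": 0.15}

# With k=2, only features in both models' top 2 survive the intersection.
selected = shared_top_features(rf_importance, xgb_importance, k=2)

# Toy labels and predicted probabilities for the calibration metric.
score = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.8, 0.1])
```

Taking the intersection rather than the union is the conservative choice: a feature must be influential under two different tree ensembles to be retained, which is consistent with the abstract ending up with six features rather than ten.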
Limitations: Although the clinical prediction model performed well, the study is still limited by its reliance on self-reported data and by the representativeness of its samples. Causal relationships need to be established to interpret the coping mechanism from multiple perspectives. In addition, the limited data on minority groups may lead to algorithmic unfairness.
Conclusions: Machine learning models effectively identified and predicted mental health outcomes among youths, with the SHAP+BPNN model showing promising clinical applicability. These findings emphasise the importance and effectiveness of targeted interventions supported by clinical prediction models.