Detecting the risk of bullying victimization among adolescents: A large-scale machine learning approach

Wei Yan, Yidan Yuan, Menghao Yang, Peng Zhang, Kaiping Peng

doi:10.1016/j.chb.2023.107817

Abstract

There is increasing interest in using machine learning methods to identify risk factors for problematic behaviors. The current study tested and compared six machine learning algorithms, namely Logistic Regression, Naive Bayes, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), and Light Gradient Boosting Machine (LightGBM), to detect risk factors for both traditional bullying victimization and cyberbullying victimization among Chinese adolescents. The Random Forest and LightGBM algorithms obtained similar accuracy and precision, and both outperformed the other four algorithms. We then combined the feature importance of the LightGBM and Random Forest algorithms to evaluate the predictive power of 40 potentially relevant personal, educational, social, and psychological factors in predicting bullying victimization, which yielded better accuracy and overall performance. These results showed that the combined model can distinguish high-risk from low-risk adolescents for both types of bullying victimization based on a few easily obtained variables. By comparing the relative importance of each factor, the current study also identified mental illness, physical illness, and unhealthy living environments as the strongest predictors of bullying victimization. Thus, the recommended model has considerable practical value for preventing bullying victimization among Chinese adolescents.
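The abstract does not state how the feature importances of the two models were combined. As a minimal sketch of one plausible approach, the Python example below rescales each model's importances to sum to 1 and then averages them per feature. Rescaling matters because the two libraries can report importances on different scales (for example, Random Forest importances in scikit-learn sum to 1, while LightGBM's default importance is a raw split count). The averaging rule, the function names, the "grade" feature, and all numbers are assumptions made for illustration; they are not the authors' method or results.

```python
def normalize(importances):
    """Rescale raw importances so they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return {name: value / total for name, value in importances.items()}


def combine_importances(*models):
    """Average the normalized importances across models.

    Hypothetical combination rule: the abstract does not specify one.
    All models are assumed to report the same set of feature names.
    """
    normalized = [normalize(m) for m in models]
    features = normalized[0].keys()
    return {f: sum(n[f] for n in normalized) / len(normalized) for f in features}


def rank_features(combined):
    """Return feature names ordered from most to least important."""
    return sorted(combined, key=combined.get, reverse=True)


# Made-up illustrative values, NOT results from the study.
rf_importance = {  # already on a 0-1 scale
    "mental_illness": 0.30,
    "physical_illness": 0.25,
    "living_environment": 0.25,
    "grade": 0.20,
}
lgbm_importance = {  # raw split counts
    "mental_illness": 120,
    "physical_illness": 90,
    "living_environment": 60,
    "grade": 30,
}

combined = combine_importances(rf_importance, lgbm_importance)
ranking = rank_features(combined)
print(ranking)
```

With these invented inputs, the combined scores still sum to 1, so each score can be read as a feature's share of the total importance across both models.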

Full Text