This study proposes a machine learning (ML)-based diagnostic model that automatically classifies dental, skeletal and functional Class III malocclusions. The data comprised 46 cephalometric feature measurements from children aged 4 to 14 years (n = 666). The data set was divided into a training set and a test set in a 7:3 ratio. First, we applied the Recursive Feature Elimination (RFE) algorithm to the 46 input parameters and selected 14 significant features. Next, we constructed 10 ML models, trained them on the 14 selected features of the training set using ten-fold cross-validation, and evaluated their average accuracy on the test set. Finally, we conducted an interpretability analysis of the optimal model using SHapley Additive exPlanations (SHAP). The top five models ranked by area under the curve (AUC) were GPR (0.879), RBF SVM (0.876), QDA (0.876), Linear SVM (0.875) and L2 logistic (0.869). The DeLong test showed no statistically significant difference between GPR and the other models (p > 0.05); GPR, which had the highest AUC, was therefore selected as the optimal model. The SHAP feature importance plot revealed that the top five features were SN-GoMe (the ratio of the length of the anterior skull base SN to that of the mandibular base GoMe), U1-NA (maxillary incisor angulation to the NA plane), Overjet (the distance between two lines perpendicular to the functional occlusal plane from U1 and L), ANB (the difference between angles SNA and SNB), and AB-NPo (the angle between the AB and N-Pog lines). Our findings suggest that ML models based on cephalometric data could effectively assist dentists in classifying dental, functional and skeletal Class III malocclusions in children. In addition, features such as SN-GoMe, U1-NA and Overjet can serve as important indicators for predicting the severity of Class III malocclusions.
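The workflow described above (7:3 split, RFE from 46 to 14 features, ten-fold cross-validation, AUC on the test set) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic stand-ins generated with `make_classification`, the RFE base estimator (logistic regression) is a guess, only four of the ten model families are shown, and scikit-learn's `GaussianProcessClassifier` with a fixed kernel is swapped in for the model the abstract calls GPR. The function name `run_sketch` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.feature_selection import RFE
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_sketch(seed=0):
    # Synthetic stand-in for the cephalometric table (assumption):
    # 666 cases, 46 features, 3 classes (dental, skeletal, functional).
    X, y = make_classification(
        n_samples=666, n_features=46, n_informative=14, n_redundant=0,
        n_classes=3, random_state=seed,
    )

    # 7:3 train/test split, stratified by class.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Standardise using training-set statistics only.
    scaler = StandardScaler().fit(X_train)
    X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

    # RFE: drop one feature per step until 14 remain.
    # The base estimator is an assumption; the abstract does not name one.
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=14)
    selector.fit(X_train_s, y_train)
    X_train_sel = selector.transform(X_train_s)
    X_test_sel = selector.transform(X_test_s)

    models = {
        # Fixed kernel (optimizer=None) keeps the sketch fast; an assumption.
        "GP": GaussianProcessClassifier(kernel=1.0 * RBF(3.0), optimizer=None),
        "RBF SVM": SVC(kernel="rbf", probability=True, random_state=seed),
        "QDA": QuadraticDiscriminantAnalysis(),
        "L2 logistic": LogisticRegression(penalty="l2", max_iter=1000),
    }

    results = {}
    for name, model in models.items():
        # Ten-fold cross-validated accuracy on the training set.
        cv_acc = cross_val_score(
            model, X_train_sel, y_train, cv=10, scoring="accuracy"
        ).mean()
        # Refit on the full training set, then score one-vs-rest AUC on the test set.
        model.fit(X_train_sel, y_train)
        auc = roc_auc_score(
            y_test, model.predict_proba(X_test_sel), multi_class="ovr"
        )
        results[name] = {"cv_accuracy": cv_acc, "test_auc": auc}
    return selector.support_, results


if __name__ == "__main__":
    support, results = run_sketch()
    print("selected features:", int(support.sum()))
    for name, scores in results.items():
        print(name, scores)
```

Fitting the scaler and RFE on the training set only keeps the test set untouched until the final AUC computation; whether the study performed feature selection before or after the split is not stated in the abstract.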