Background
Existing methods for predicting academic majors from personality traits have notable gaps, including limited model complexity and limited generalizability. The current study aimed to use advanced Machine Learning (ML) algorithms with smoothing functions to predict completed academic majors from personality subscales.

Methods
We analyzed reports from 59,413 individuals. All algorithms were implemented in R (version 4.1.3, R Core Team, 2021). All model parameters were optimized using resampling and cross-validation (CV). In addition, pseudo-R2 was used as a robust metric to compare model performance; unlike the metrics used in most studies, it accounts for the quality of model-predicted probabilities.

Results
The advanced ML models outperformed logistic regression on both training and test data. Pseudo-R2 and AUC results showed that advanced models such as kNN, GBE, and RF had the highest scores on the test data. The pseudo-R2 values on the test dataset varied across models: the lowest value, .022, belonged to logistic regression, and the highest, .099, to kNN. The agreeableness subscale was the most influential component in predicting the completion of university education, followed by conscientiousness and emotional stability.

Conclusion
The potential of advanced methods to enhance the accuracy and validity of predictions is a promising development in our field. Their performance, particularly in handling large data sets with complex patterns, is a reason for optimism about the future of research in this area.
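The abstract does not state which pseudo-R2 variant was used. As an illustration only, the sketch below assumes McFadden's definition, 1 - LL_model / LL_null, where the null model predicts each class at its observed base rate. It is written in Python rather than the study's R, and the function name `mcfadden_pseudo_r2` and the toy data are hypothetical. It shows why such a metric reflects the quality of predicted probabilities rather than only the predicted labels.

```python
import math


def mcfadden_pseudo_r2(y_true, probs, eps=1e-15):
    """McFadden pseudo-R2 for class-probability predictions (assumed variant).

    y_true: list of integer class labels.
    probs:  list of per-class probability lists, one per observation.
    """
    n = len(y_true)

    # Log-likelihood of the model: log of the probability it assigned
    # to the class that was actually observed (clipped to avoid log(0)).
    ll_model = sum(math.log(max(p[y], eps)) for y, p in zip(y_true, probs))

    # Log-likelihood of the null model, which predicts every observation
    # with the observed class base rates.
    counts = {}
    for y in y_true:
        counts[y] = counts.get(y, 0) + 1
    ll_null = sum(c * math.log(c / n) for c in counts.values())

    if ll_null == 0.0:
        raise ValueError("pseudo-R2 is undefined when only one class is observed")
    return 1.0 - ll_model / ll_null


# Toy data (hypothetical): two classes, four observations.
y = [0, 0, 1, 1]
confident = [[0.8, 0.2], [0.7, 0.3], [0.4, 0.6], [0.1, 0.9]]
base_rate = [[0.5, 0.5]] * 4

r2_confident = mcfadden_pseudo_r2(y, confident)
r2_base_rate = mcfadden_pseudo_r2(y, base_rate)
```

Under this definition, a model that only reproduces the base rates scores 0, and well-calibrated, confident predictions move the score toward 1. Values on this scale are typically much smaller than an ordinary R2.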