Effective pre-hire assessments affect organizational outcomes. Recent developments in machine learning provide an opportunity for practitioners to improve upon existing scoring methods. This study compares the effectiveness of an empirically keyed scoring model with that of a random forest machine learning model for scoring a biodata assessment. Data were collected from two organizations. Data from the first organization (N = 1,410) were used to train the model with training samples of 100, 300, 500, and 1,000 cases, whereas data from the second organization (N = 524) served only as an external benchmark. With the random forest model, predictive validity rose from 0.382 to 0.412 in the first organization, and a smaller increase was seen in the second organization. The study concludes that the predictive validity of biodata measures can be improved using a random forest modeling approach. Additional considerations and suggestions for future research are discussed.