Abstract

This study examines the classification performance, in skeletal sex/ancestry estimation, of the statistical methods binary logistic regression (BLR), multinomial and penalized multinomial logistic regression (MLR, pMLR), and linear discriminant analysis (LDA), and of the machine learning algorithms naïve Bayes classification (NBC), decision trees (DT), random forest (RF), artificial neural networks (ANN), support vector machines (SVM; linear, polynomial, or radial), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGB). The datasets used to test these methods were obtained from a documented human skeletal collection, the Athens Collection, and from the Howells Craniometric data set. To implement the methods, an R package was written that searches for the optimum tuning parameters under cross-validation and performs sex/ancestry classification. Classification performance was found to vary significantly depending on the problem. Of the methods tested, LDA and the machine learning technique of linear SVM exhibit the best performance, with high prediction accuracy and relatively low bias in most of the tests. ANN and pMLR can generally be considered to give satisfactory predictions, whereas NBC (when using metric traits) and DT are the worst of the classification methods examined. The possibility of making the models developed via the machine learning algorithms applicable to other assemblages without the use of a training sample is also discussed.
