Comparison of Machine Learning Strategies in Hazardous Asteroids Prediction

Yao Wang

doi:10.54097/hset.v39i.6527

Abstract

This study applies multiple machine learning classification algorithms to predict hazardous asteroids that orbit near Earth. Seven models are compared: Logistic Regression, K-Nearest Neighbor Classifier, Random Forest Classifier, Decision Tree Classifier, Multinomial Naïve Bayes Classifier, Gradient Boosting Classifier, and Voting Classifier. A confusion matrix is used to evaluate the models, with accuracy, precision, recall, and F1-score as the evaluation metrics. The results show that the Random Forest Classifier has the best overall performance, with the highest accuracy. The Decision Tree, Gradient Boosting, and Voting classifiers also perform well. The Gradient Boosting Classifier greatly reduces the risk posed by hazardous asteroids; that is, it reduces the number of hazardous asteroids that are predicted as non-hazardous. Some models, such as Logistic Regression, rest on assumptions that the data used in the experiment do not satisfy, so their overall performance is poor. It would be better to select data that fit the model. The results also show that combined classifiers perform better. A Voting Classifier can be used to combine the accurate models and obtain a more accurate result by offsetting the disadvantages of each model.
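
The evaluation metrics and the voting idea described above can be illustrated with a minimal sketch in plain Python. It is not the study's implementation: the function names (`confusion_counts`, `metrics`, `hard_vote`) and the toy labels are assumptions made up for illustration, with 1 standing for "hazardous" and 0 for "non-hazardous".

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the hazardous class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # non-hazardous asteroid flagged as hazardous
        elif truth == positive:
            fn += 1  # hazardous asteroid predicted as non-hazardous
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def hard_vote(predictions):
    """Majority vote across models; `predictions` holds one label list per model."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0, 0, 1]
model_b = [1, 0, 1, 0, 0, 0, 0, 0]
model_c = [0, 1, 1, 0, 0, 1, 0, 0]

counts_a = confusion_counts(y_true, model_a)
scores_a = metrics(*counts_a)
voted = hard_vote([model_a, model_b, model_c])
scores_voted = metrics(*confusion_counts(y_true, voted))
```

In this toy case each model misclassifies different asteroids, so the majority vote corrects every individual error. The false-negative count `fn` is the quantity that matters most for risk, since it counts hazardous asteroids predicted as non-hazardous, and recall is the metric that tracks it.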

Full Text