A comparative study on classification models for stock rating prediction

Justin Yap, Trianggoro Wiradinata

doi:10.24246/aiti.v21i1.140-151

Abstract

The digital transformation of the stockbroking industry has led to a significant increase in retail investors, who often lack the expertise to analyse stocks thoroughly. This research addresses that challenge by proposing a classification model that predicts stock ratings ("Reduce", "Hold", "Moderate Buy", and "Buy"), helping retail investors make informed decisions. The data are collected for stocks in the S&P 500 index through web scraping with Beautiful Soup, and the resulting dataset is used to train and test the classification models. Popular stock indicators serve as the attributes for predicting a stock's rating: Exchange, Price, Volume, Market Cap, ROE, ROA, P/E Ratio, EPS, Annual Sales, Net Income, Net Margins, and PB Ratio. The models selected for classification are K-Nearest Neighbors (k-NN), Gaussian Naive Bayes, Support Vector Machine (SVM), Decision Tree, and Random Forest. Grid search is used to tune each algorithm's hyperparameters for optimal performance. Results indicate that the k-NN model outperforms the others, achieving the highest accuracy (0.618644) and weighted F1-score (0.605011). However, all models exhibit relatively low accuracy, which suggests that predicting stock ratings is complex and depends on external factors not considered in the study.
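The two metrics reported above can be illustrated with a minimal sketch. It assumes the usual definition of the weighted F1-score (the per-class F1 averaged with weights proportional to each class's share of the true labels); the abstract does not state how the authors computed it. The function names and the six toy labels below are hypothetical and are not taken from the study's dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Hypothetical toy labels using the four rating classes (not real data).
y_true = ["Buy", "Buy", "Hold", "Hold", "Moderate Buy", "Reduce"]
y_pred = ["Buy", "Hold", "Hold", "Hold", "Moderate Buy", "Hold"]

print(f"accuracy    = {accuracy(y_true, y_pred):.4f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

Because the weights follow class frequency, a rare class such as "Reduce" in this toy example contributes little to the weighted score even when it is never predicted correctly, which is one reason accuracy and weighted F1 can sit close together on imbalanced rating data.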
