Abstract

Stock market forecasting has long been a subject of interest for researchers. The essential market analyses can be integrated with historical stock market data to derive a set of features, and it is crucial to select the features that carry useful information about the aspect being predicted. In this article, we propose coefficient of variation (CV)-based feature selection for stock prediction. CV, a unitless statistical measure, is widely used to quantify variability across data distributions. We calculate the CV of each feature and combine it with an existing method, the k-means algorithm, and with two proposed methods, median range and top-M, to select features with specific characteristics: features belonging to the largest cluster, features falling within a defined range, and features with the highest CV values, respectively. We apply the selected features to backpropagation neural network (BPNN), long short-term memory (LSTM), gated recurrent unit (GRU), and convolutional neural network (CNN) models for stock price and trend prediction. We evaluate the proposed approach against five existing feature selection methods, namely correlation coefficient, Chi2, mutual information, principal component analysis, and variance threshold. The comparison indicates a remarkable performance enhancement on several accuracy-based and error-based metrics, and the Wilcoxon signed-rank test supports this result statistically.
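
The three selection strategies named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation. The feature names and values are made-up toy data. The median-range rule used here (keep features whose CV lies within a fixed `width` of the median CV) is an assumption, since the abstract does not define the range. The one-dimensional k-means with evenly spread initial centers is likewise a simplified stand-in for a library implementation. All function names are hypothetical.

```python
from statistics import mean, median, pstdev


def coefficient_of_variation(values):
    """CV of one feature column: population standard deviation / |mean|."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("CV is undefined for a zero-mean feature")
    return pstdev(values) / abs(mu)


def cv_per_feature(columns):
    """Map each feature name to its CV."""
    return {name: coefficient_of_variation(vals) for name, vals in columns.items()}


def select_top_m(cvs, m):
    """top-M: keep the m features with the highest CV values."""
    return sorted(cvs, key=cvs.get, reverse=True)[:m]


def select_median_range(cvs, width):
    """Median range (assumed rule): keep features with |CV - median CV| <= width."""
    mid = median(cvs.values())
    return [name for name, cv in cvs.items() if abs(cv - mid) <= width]


def select_largest_cluster(cvs, k=2, iterations=50):
    """Cluster the CV values with a simple 1-D k-means; keep the largest cluster."""
    names = list(cvs)
    values = [cvs[name] for name in names]
    ordered = sorted(values)
    # Deterministic initialisation: spread the k centers over the sorted CVs.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign every CV value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Recompute each center as the mean of its members (keep it if empty).
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    sizes = [labels.count(c) for c in range(k)]
    biggest = max(range(k), key=sizes.__getitem__)
    return [name for name, lab in zip(names, labels) if lab == biggest]


# Toy feature columns (hypothetical values, for illustration only).
columns = {
    "close": [100, 102, 98, 101, 99],
    "volume": [10, 30, 20, 40, 50],
    "rsi": [40, 60, 50, 45, 55],
    "macd": [1, 3, 2, 2, 2],
    "sma": [50, 51, 49, 50, 50],
}
cvs = cv_per_feature(columns)

print(select_top_m(cvs, 2))
print(select_median_range(cvs, 0.15))
print(select_largest_cluster(cvs, k=2))
```

Because CV divides the standard deviation by the mean, features measured on very different scales (prices versus volumes, for example) become directly comparable, which is what allows a single ranking or clustering over all features. In practice the selected feature names would then be used to subset the input matrix before training the prediction model.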
