Proposed Feature Selection Approach Research Articles

Stock market prediction is considered an important yet challenging aspect of financial analysis. The difficulty of forecasting arises from the volatile and non-linear nature of the stock market, which is affected by many uncertain factors, ranging from financial ratios to macroeconomic indicators. Recent advances in machine learning, particularly ensembles, have made it possible for academic researchers and financial practitioners to forecast the stock market more efficiently. The novelty of this work is to evaluate how stock returns in an oil-dependent country (i.e., Iran), which has been facing stagflation for a long time due to economic and political issues, are affected by fundamental and macroeconomic indicators. Our main objectives are to (1) find the most important fundamental and macroeconomic indicators that control the stock returns of companies listed on the Tehran Stock Exchange (TSE); (2) compare the performance of newly developed bagging- and boosting-based ensembles in predicting annual real stock returns of the TSE; and (3) develop multiclass classification models to forecast stock returns. Prior studies mainly focused on developing binary classification models, which simply predict whether future stock returns will be positive or negative. We instead design multiclass classification models to provide more information for investors and reduce the uncertainties associated with the prediction. To this end, we first provide a comprehensive list of 57 potential features affecting the stock returns. Next, the data are carefully preprocessed and fed to 14 different bagging- and boosting-based ensembles (e.g., Random Forest, LightGBM, XGBoost, Extra-Trees, AdaBoost, CatBoost) to predict the stock returns. The performance of the ensembles is evaluated through different measures (e.g., accuracy, F-score, G-mean). We then propose a novel feature selection method to identify the features that contribute most to the stock returns.
Our proposed model identifies nearly 65% of the 57 original features as redundant, leaving the 20 most significant features. The selected features are fed to the same ensembles to re-predict the stock returns. Finally, the performance of the stock return forecasts with and without the selected features is compared. To design the ensembles, we employ data from companies listed on the TSE for a 15-year period, spanning 2005 to 2020. Results suggest that boosting ensembles, in general, outperform bagging-based methods. Among the boosting ensembles, XGBoost and AdaBoost provide the best and worst predictive performance, respectively. Among the bagging-like ensembles, Rotation Forest is the most accurate, whereas Random Patches performs the worst. Further, our proposed feature selection approach effectively identifies the most representative features for stock returns prediction and can be used as a reliable framework for future investment decisions.
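
The abstract names accuracy, F-score, and G-mean as evaluation measures for the multiclass models but does not define them. The following is a minimal sketch of one common way to compute them for a multiclass problem, assuming the G-mean is taken as the geometric mean of per-class recall and the F-score is macro-averaged; the paper may use different variants. The class labels ("low", "mid", "high") and the toy predictions are hypothetical and are not data from the study.

```python
from collections import Counter
from math import prod


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for every class that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed multiclass G-mean)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1.0 / len(recalls))


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed F-score variant)."""
    recalls = per_class_recall(y_true, y_pred)
    predicted = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = []
    for label, recall in recalls.items():
        # Precision is defined as 0 when the class is never predicted.
        precision = hits[label] / predicted[label] if predicted[label] else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical three-class return labels, for illustration only.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "low", "mid", "low", "high", "mid"]

print("accuracy:", accuracy(y_true, y_pred))
print("G-mean:  ", g_mean(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

Unlike accuracy, the G-mean drops to zero as soon as any one class is never predicted correctly, which is why it is often reported alongside accuracy when classes are imbalanced.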

Although fuzziness is pervasive in real-world data, fuzzy information is difficult to harness for feature selection and is therefore rarely used. How to exploit fuzzy information efficiently has thus become a major focus of recent feature selection research. In this article, a novel unsupervised feature selection method is proposed that efficiently exploits sparse fuzzy membership. In general, the $\ell_{2,1}$ norm is used to induce sparsity, while the Frobenius norm is used to prevent overfitting. To obtain sparsity and avoid overfitting simultaneously, adaptive loss regularization is introduced to the least-squares regression, such that a sparse and nontrivial projection matrix can be achieved via continuous interpolation between $\ell_{2,1}$ and Frobenius regularization. Additionally, the fuzzy $k$-means problem is embedded in the adaptive loss regression model to avoid the trivial solution caused by the linearity of fuzzy membership. The fuzzy cluster structure of fuzzy $k$-means is thereby exploited for efficient feature selection.
By performing fuzzy clustering and subspace regression simultaneously, the embedded problem is then reformulated into a general quadratic problem with an $\ell_1$ ball constraint. Equipped with an auxiliary variable and the standard augmented Lagrangian method, the quadratic problem, i.e., the corresponding dual problem, can be solved with closed-form solutions for the fuzzy membership and the projection matrix. Empirical results are provided to demonstrate the effectiveness of the proposed feature selection approach.
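
For readers unfamiliar with the two regularizers mentioned above, the block below gives their standard definitions for a projection matrix $W \in \mathbb{R}^{d \times c}$ with rows $w^i$. The abstract does not state the adaptive loss formula, so the third expression is an assumption: it is one form of adaptive loss that appears in the literature, with a hypothetical parameter $\sigma > 0$ controlling the interpolation, and the authors' exact formulation may differ.

```latex
% Standard definitions (rows of W are denoted w^i):
\|W\|_{2,1} = \sum_{i=1}^{d} \|w^i\|_2 ,
\qquad
\|W\|_F^2 = \sum_{i=1}^{d} \|w^i\|_2^2 .

% Assumed adaptive loss with parameter \sigma > 0 (not given in the abstract):
\|W\|_{\sigma} = \sum_{i=1}^{d}
  \frac{(1+\sigma)\,\|w^i\|_2^2}{\|w^i\|_2 + \sigma} .

% Limiting behaviour of the assumed form:
\lim_{\sigma \to 0} \|W\|_{\sigma} = \|W\|_{2,1} ,
\qquad
\lim_{\sigma \to \infty} \|W\|_{\sigma} = \|W\|_F^2 .
```

Under this assumed form, small values of $\sigma$ push whole rows of $W$ toward zero (row sparsity, hence feature selection), while large values behave like a ridge-style penalty.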

Articles published on Proposed Feature Selection Approach

PermDroid a framework developed using proposed feature selection approach and machine learning techniques for Android malware detection

An Adapted Ant Colony Optimization for Feature Selection

A Class Specific Feature Selection Method for Improving the Performance of Text Classification

Causality-Driven Efficient Feature Selection for Deep-Learning-Based Surface Roughness Prediction in Milling Machines

MRI-Based Radiomics Analysis of Levator Ani Muscle for Predicting Urine Incontinence after Robot-Assisted Radical Prostatectomy.

Developing an Artificial Intelligence Based Model for Autism Spectrum Disorder Detection in Children

Feature Selection as a Hedonic Coalition Formation Game for Arabic Topic Detection

Evaluating the performance of ensemble classifiers in stock returns prediction using effective features

Intelligent injury prediction for traumatic airway obstruction.

Sequential feature selection for heart disease detection using random forest

Optimized Signal Quality Assessment for Photoplethysmogram Signals Using Feature Selection.

An Intelligent System for Parkinson's Diagnosis Using Hybrid Feature Selection Approach

Empowering IoT Predictive Maintenance Solutions With AI: A Distributed System for Manufacturing Plant-Wide Monitoring

Classification of surface settlement levels induced by TBM driving in urban areas using random forest with data-driven feature selection

Analyzing the Features Affecting the Performance of Teachers during Covid-19: A Multilevel Feature Selection

A Cost-Efficient MFCC-Based Fault Detection and Isolation Technology for Electromagnetic Pumps

BSSA: Binary Salp Swarm Algorithm With Hybrid Data Transformation for Feature Selection

Regularized Regression With Fuzzy Membership Embedding for Unsupervised Feature Selection

Classification of multi-lingual tweets, into multi-class model using Naïve Bayes and semi-supervised learning

Developing comprehensive geocomputation tools for landslide susceptibility mapping: LSM tool pack
