Abstract
Feature selection is a preliminary step in machine learning and data mining. It identifies the most important and relevant features within a dataset by eliminating redundant or irrelevant ones. The benefits include improved performance in terms of higher prediction accuracy, reduced computational complexity, and more easily interpretable models. In this paper, we present a novel framework that investigates the role of Monte Carlo tree search (MCTS) in feature selection for very high-dimensional datasets. We construct a binary feature selection tree in which each node represents one of two feature states: selected or not selected. The search starts with an empty root node, reflecting that no feature is selected. The search tree is then expanded by adding nodes incrementally through MCTS-based simulations. Following the tree and default policies, every iteration generates an initial feature subset, from which a filter selects the top k features to form the candidate feature subset. The classification accuracy serves as the goodness, or reward, of the candidate feature subset and is propagated backward to the root node along the active path. Finally, the candidate subset with the highest reward is selected as the best feature subset. Experiments on 30 real-world datasets, including 14 very high-dimensional microarray datasets, and comparisons with state-of-the-art methods from the literature demonstrate the efficacy, validity, and significance of the proposed method.
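The search loop the abstract describes can be sketched in a few dozen lines. The following is a minimal, self-contained illustration, not the authors' implementation: a binary include/exclude tree is grown by MCTS, a rollout completes each path into an initial subset, a filter keeps the top k features, and the reward is backed up along the active path. The filter scores, the toy reward function (standing in for classification accuracy on a real dataset), and all constants here are hypothetical.

```python
import math
import random

random.seed(0)

N_FEATURES = 8
RELEVANT = {1, 3, 5}   # toy ground-truth informative features (illustrative only)
TOP_K = 3              # the filter keeps the k highest-scoring features

# Stand-in filter scores: relevant features score higher by construction.
FILTER_SCORE = [random.random() + (2.0 if i in RELEVANT else 0.0)
                for i in range(N_FEATURES)]

def reward(subset):
    """Toy stand-in for the classification accuracy of a candidate subset."""
    hits = len(set(subset) & RELEVANT)
    return hits / len(RELEVANT) - 0.05 * max(0, len(subset) - len(RELEVANT))

class Node:
    def __init__(self, depth, parent=None):
        self.depth, self.parent = depth, parent
        self.children = {}            # decision (True = select feature) -> Node
        self.visits, self.value = 0, 0.0

def ucb(child, parent, c=1.4):
    """UCB1 score used by the tree policy."""
    if child.visits == 0:
        return float("inf")
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def search(iterations=500):
    root, best = Node(0), (-1.0, [])
    for _ in range(iterations):
        node, decisions = root, []
        # Tree policy: descend by UCB1 while the node is fully expanded.
        while node.depth < N_FEATURES and len(node.children) == 2:
            d = max((True, False), key=lambda a: ucb(node.children[a], node))
            node = node.children[d]; decisions.append(d)
        # Expansion: add one unexplored child (incremental tree growth).
        if node.depth < N_FEATURES:
            d = random.choice([a for a in (True, False) if a not in node.children])
            node.children[d] = Node(node.depth + 1, node)
            node = node.children[d]; decisions.append(d)
        # Default policy: random include/exclude for the remaining features.
        while len(decisions) < N_FEATURES:
            decisions.append(random.random() < 0.5)
        initial = [i for i, d in enumerate(decisions) if d]
        # Filter step: keep the top-k features of the initial subset.
        candidate = sorted(initial, key=lambda i: FILTER_SCORE[i],
                           reverse=True)[:TOP_K]
        r = reward(candidate)
        best = max(best, (r, sorted(candidate)))
        # Backpropagation: update statistics along the active path.
        while node is not None:
            node.visits += 1; node.value += r
            node = node.parent
    return best

best_reward, best_subset = search()
print(best_subset, round(best_reward, 2))
```

With the toy reward above, the search quickly concentrates on paths that include the three informative features; in the real framework the reward would come from training a classifier on the candidate subset.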
Highlights
In the present era of big data, most datasets are high-dimensional, ranging from a few hundred to thousands of features.
Monte Carlo tree search (MCTS) is deployed in conjunction with a hybrid of filter and wrapper methods.
H-MOTiFS (Hybrid Monte Carlo Tree Search based Feature Selection): we propose a novel framework based on MCTS to deal with very high-dimensional datasets, with the objective of achieving high accuracy with reduced dimensions.
Summary
In the present era of big data, most datasets are high-dimensional, ranging from a few hundred to thousands of features. The objective of a feature selection algorithm is to find the most significant features while maintaining the underlying structure of the dataset. This helps build better predictive models by achieving high accuracy and reduced time complexity. Meta-heuristic approaches such as Particle Swarm Optimization (PSO) [15]–[18], Bat Algorithms (BA) [19], [20], Ant Colony Optimization (ACO) [21], [22], and Multi-Objective Evolutionary Algorithms [23], [24] belong to this category. Although these meta-heuristic approaches have opened up new horizons in feature selection, they are still in their infancy and need more thorough investigation and research. The significance of MCTS is investigated, and a novel hybrid framework, referred to as H-MOTiFS, is proposed for feature selection in very high-dimensional datasets.