Abstract

As applications producing high-dimensional data have increased tremendously, clustering such data under reduced memory has become a necessity. Feature selection is a typical approach to clustering high-dimensional data: it identifies a subset of the most relevant features from the entire feature set. Our approach proposes a method to efficiently cluster high-dimensional data under reduced memory. An N-dimensional feature selection algorithm, NDFS, is used to identify the subset of relevant features; feature selection removes irrelevant and redundant features from each cluster. In the initial phase of the NDFS algorithm, features are divided into clusters using graph-theoretic clustering methods; in particular, a minimum spanning tree is constructed to efficiently manipulate the feature subsets. The final phase generates the subset of relevant features that are most closely related to the target class, while features in different clusters remain relatively independent. Traditionally, feature subset selection research has focused on searching for relevant features alone. The clustering-based strategy of NDFS has a high probability of producing a subset of features that are both useful and independent.
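The two-phase pipeline the abstract describes (cluster features with a minimum spanning tree over a feature-similarity graph, then keep one target-relevant representative per cluster) can be sketched as follows. This is an illustrative sketch, not the NDFS algorithm itself: the function name `mst_feature_clusters`, the use of Pearson correlation as the similarity measure, and the choice of cutting the k-1 longest MST edges are all assumptions made for the example.

```python
import math

def pearson(x, y):
    # Pearson correlation between two equal-length numeric sequences.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = math.sqrt(sum((a - mx) ** 2 for a in x))
    vy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (vx * vy) if vx and vy else 0.0

def mst_feature_clusters(features, target, k):
    """Cluster features via a minimum spanning tree over a
    correlation-distance graph, then keep one representative per
    cluster (the feature most correlated with the target).
    `features` is a list of columns; returns sorted selected indices.
    Illustrative sketch only, not the published NDFS procedure."""
    n = len(features)
    # Distance = 1 - |correlation|: redundant features are "close".
    dist = [[1 - abs(pearson(features[i], features[j])) for j in range(n)]
            for i in range(n)]
    # Prim's algorithm over the complete feature graph.
    in_tree = {0}
    edges = []
    best = {j: (dist[0][j], 0) for j in range(1, n)}
    while len(in_tree) < n:
        j = min(best, key=lambda v: best[v][0])
        d, i = best.pop(j)
        edges.append((d, i, j))
        in_tree.add(j)
        for v in best:
            if dist[j][v] < best[v][0]:
                best[v] = (dist[j][v], j)
    # Cut the k-1 longest MST edges to form k feature clusters.
    edges.sort(reverse=True)
    kept = edges[k - 1:]
    # Union-find to recover the resulting connected components.
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for _, i, j in kept:
        parent[find(i)] = find(j)
    clusters = {}
    for f in range(n):
        clusters.setdefault(find(f), []).append(f)
    # Keep the feature most relevant to the target from each cluster.
    return sorted(max(c, key=lambda f: abs(pearson(features[f], target)))
                  for c in clusters.values())

# Features 0 and 1 are redundant (nearly collinear); feature 2 is independent.
f0 = [1.0, 2.0, 3.0, 4.0, 5.0]
f1 = [2.1, 4.0, 6.2, 8.1, 9.9]   # roughly 2 * f0
f2 = [5.0, 1.0, 4.0, 2.0, 3.0]
y  = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mst_feature_clusters([f0, f1, f2], y, k=2))  # → [0, 2]
```

Because the redundant pair (f0, f1) is joined by a very short MST edge, cutting the single longest edge leaves them in one cluster, and only the member more correlated with the target survives, which illustrates how the clustering step removes redundancy while the selection step preserves relevance.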
