Abstract

In pattern recognition, the elimination of unnecessary and/or redundant attributes is known as feature selection, and irreducible testors have been used to perform this task. An objective of the Minimum Description Length (MDL) principle applied to feature selection in pattern recognition and data mining is to select the minimum number of attributes in a data set. Consequently, the MDL principle leads us to consider the subset of irreducible testors of minimum length. Several algorithms that find the whole set of irreducible testors have been reported in the literature; however, none of them was designed to generate only minimum-length irreducible testors. In this paper, we propose the first algorithm specifically designed to compute all minimum-length irreducible testors from a training sample. The paper presents experimental results on synthetic and real data in which the performance of the proposed algorithm is contrasted with other state-of-the-art algorithms adapted to generate only irreducible testors of minimum length.

Highlights

  • Feature selection is a significant task in supervised classification and other pattern recognition problems; it focuses on removing irrelevant and/or redundant features [22]

  • We propose the first algorithm designed to compute all minimum-length irreducible testors from a training sample

  • Like most of the algorithms reported for computing irreducible testors, the proposed algorithms operate over the basic matrix instead of the training sample [28] (see the sketch below)
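
The basic matrix mentioned in the last highlight can be derived from the training sample: one Boolean comparison row is built for each pair of objects belonging to different classes, marking the features in which the pair differs, and only the minimal rows (those whose set of 1s does not strictly contain another row's 1s) are kept. The following minimal Python sketch illustrates this construction; it assumes Boolean features, and the function names are our own, not the paper's implementation.

    from itertools import combinations

    def comparison_rows(X, y):
        # One Boolean row per pair of objects from different classes;
        # a 1 marks a feature in which the two objects differ.
        for (xi, yi), (xj, yj) in combinations(zip(X, y), 2):
            if yi != yj:
                yield tuple(int(a != b) for a, b in zip(xi, xj))

    def basic_matrix(X, y):
        # Keep only the minimal rows: drop any row whose set of 1s
        # strictly contains the 1s of another row. Testors over this
        # reduced matrix coincide with testors over the full
        # comparison matrix.
        rows = set(comparison_rows(X, y))

        def contains(r, s):
            # True when every 1 of s is also a 1 of r.
            return all(b <= a for a, b in zip(r, s))

        return [r for r in rows
                if not any(s != r and contains(r, s) for s in rows)]

For example, on a toy sample X = [(1,0,1), (0,0,1), (1,1,0)] with labels y = [0, 0, 1], the two comparison rows are (0,1,1) and (1,1,1), and only (0,1,1) survives as a basic row.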

Summary

INTRODUCTION

Feature selection is a significant task in supervised classification and other pattern recognition problems, focused on removing irrelevant and/or redundant features [22]. An irreducible testor has the same capability as the entire feature set to discern between objects belonging to different classes; this is why irreducible testors have been used for solving several problems in supervised pattern recognition (see, for example, [11], [20], [25], [29], [47]). Some algorithms for data compression, noise elimination, and pattern candidate generation are based on the Minimum Description Length principle [31]. One goal of this principle, applied to feature selection in pattern recognition and data mining, is to select the minimal number of features from a dataset. The rest of this paper is structured as follows: Section II briefly reviews related work; Section III contains the theoretical background; Section IV details our proposal; Section V shows and discusses the experimental results; and Section VI presents our conclusions.
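
To make these notions concrete, the sketch below (illustrative only, not the algorithm proposed in the paper) checks whether a feature subset is a testor over a Boolean basic matrix, and enumerates all minimum-length testors by brute force. Note that every minimum-length testor is automatically irreducible: any proper subset of it is shorter than the minimum and therefore cannot be a testor. The function names are our own assumptions.

    from itertools import combinations

    def is_testor(basic, cols):
        # T is a testor iff every row of the basic matrix has a 1
        # in at least one of the selected columns.
        return all(any(row[c] for c in cols) for row in basic)

    def minimum_length_testors(basic, n_features):
        # Scan feature subsets in increasing size; the first size at
        # which testors appear yields all minimum-length (and hence
        # irreducible) testors.
        for k in range(1, n_features + 1):
            found = [T for T in combinations(range(n_features), k)
                     if is_testor(basic, T)]
            if found:
                return found
        return []

    # Example: for the basic matrix [(0, 1, 1)], features {1} and {2}
    # are the minimum-length irreducible testors.
    print(minimum_length_testors([(0, 1, 1)], 3))  # [(1,), (2,)]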

RELATED WORK
BACKGROUND
PROPOSED ALGORITHM
IN-PLACE SEARCH BASED ON NEXT COMBINATION CALCULATION
SEARCH WITH PRUNING BASED ON FEATURE AND ROW CONTRIBUTIONS
COMPLEXITY ANALYSIS
EXPERIMENTS AND RESULTS
CONCLUSION