Abstract

Feature selection has become a focus of research in engineering data processing, where high-frequency acquisition systems produce large amounts of high-dimensional data. To process such data, engineers often resort to feature extraction methods and statistical theory to convert the original features into new ones. However, the converted data always lose the engineering meaning of the original features, and choosing and applying a conversion method is challenging. In this paper, a hybrid feature selection model is presented that selects the most significant input features from all potentially relevant features. The algorithm combines a filter model with a wrapper model. In the filter model, four variable ranking methods are used to pre-rank the candidate features. These four methods, namely the Pearson correlation coefficient, the Relief algorithm, the Fisher score, and class separability, measure features from different angles and therefore produce different rankings. A weighted voting scheme is thus introduced to re-rank the features, weighting each method by the significance of its effect on the classification error rate of a radial basis function (RBF) classifier. In the wrapper model, a binary search (BS) method and a sequential backward search (SBS) method are used to minimize the number of relevant features while keeping the classification error rate of the RBF classifier below a given threshold. To demonstrate the potential of the method for large-scale engineering data processing, a case study is conducted.
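
The filter-then-wrapper pipeline described above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several parts are assumptions: the abstract does not give the voting formula, so a weighted Borda-style count is used here; the per-method weights are made-up numbers standing in for the methods' significance; the RBF classifier is replaced by a toy error function (toy_error); and the binary search assumes the error does not increase as more top-ranked features are added. All function names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

ErrorFn = Callable[[Sequence[int]], float]


def weighted_vote(rankings: Sequence[Sequence[int]], weights: Sequence[float]) -> List[int]:
    """Merge several best-first rankings with a weighted Borda-style count (assumed scheme)."""
    n = len(rankings[0])
    scores = {f: 0.0 for f in rankings[0]}
    for ranking, w in zip(rankings, weights):
        for pos, f in enumerate(ranking):
            scores[f] += w * (n - pos)  # first place earns n points, last place earns 1
    # Highest score first; ties broken by feature index so the result is deterministic.
    return sorted(scores, key=lambda f: (-scores[f], f))


def min_prefix_by_binary_search(ranked: Sequence[int], error_fn: ErrorFn,
                                threshold: float) -> Optional[int]:
    """Smallest k with error_fn(ranked[:k]) <= threshold.

    Assumes the error is non-increasing in k; returns None if even the
    full feature set misses the threshold.
    """
    if error_fn(ranked) > threshold:
        return None
    lo, hi = 1, len(ranked)
    while lo < hi:
        mid = (lo + hi) // 2
        if error_fn(ranked[:mid]) <= threshold:
            hi = mid        # mid features suffice, try fewer
        else:
            lo = mid + 1    # need more features
    return lo


def backward_prune(selected: Sequence[int], error_fn: ErrorFn, threshold: float) -> List[int]:
    """Sequential backward search: drop one feature at a time while the error stays within threshold."""
    current = list(selected)
    removed = True
    while removed and len(current) > 1:
        removed = False
        for f in reversed(current):  # try the lowest-ranked feature first
            trial = [g for g in current if g != f]
            if error_fn(trial) <= threshold:
                current, removed = trial, True
                break
    return current


# Toy stand-in for the classifier: only features 0, 2 and 4 reduce the error.
INFORMATIVE = {0, 2, 4}


def toy_error(features: Sequence[int]) -> float:
    return 0.5 - 0.15 * len(INFORMATIVE.intersection(features))


# Four hypothetical best-first rankings, one per filter method, with made-up weights.
rankings = [
    [0, 2, 1, 4, 3, 5],
    [2, 0, 1, 4, 5, 3],
    [0, 1, 2, 4, 3, 5],
    [0, 2, 4, 1, 3, 5],
]
weights = [0.9, 0.8, 0.7, 0.6]

ranked = weighted_vote(rankings, weights)
k = min_prefix_by_binary_search(ranked, toy_error, threshold=0.1)
selected = backward_prune(ranked[:k], toy_error, threshold=0.1) if k is not None else []
print(ranked, k, selected)
```

In this toy run the binary search only narrows the candidate set to a prefix of the voted ranking, so an uninformative feature that happens to rank highly can survive it; the backward pass is what removes such a feature. This is one plausible reading of why the two searches are combined.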

