Abstract

Feature engineering is a branch of science that provides tools to support, for example, the preparation of feature spaces for pattern recognition tasks. The present work focuses on the problem of feature extraction. The proposed model is based on the mechanisms of principal component analysis (PCA). It fills a gap in existing feature extraction methods by looking for spaces that best discriminate between classes. This is achieved by rotating the features according to the class centroids. In addition, a measure of their consistency was determined, which allows precise estimation of the number of features for a particular component. Four experiments were conducted in this study. The first two used synthetic datasets, while the remaining two used ten real datasets. The synthetic data made it possible to characterize the method's behavior as a function of the percentage of informative features, the number of input features, the level of imbalance, and the number of output components in the extraction task. The results showed that the developed solution allows for more precise extraction, thus increasing classification quality. Moreover, the method based on class centroids was shown to allow the construction of efficient classifier ensembles.
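
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading of "rotating the features according to the class centroids": the principal axes are computed from the class centroids instead of from all samples. The function names `centroid_pca` and `transform`, the unweighted centering, and the toy data are all assumptions made for illustration. This is not the authors' implementation, and it omits the consistency measure mentioned above.

```python
import numpy as np


def centroid_pca(X, y, n_components):
    """Hypothetical helper: principal axes of the class centroids.

    Assumption: directions are found by an SVD of the centered centroids,
    ignoring class sizes. The paper may define the rotation differently.
    """
    classes = np.unique(y)
    # One centroid per class, shape (n_classes, n_features).
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Center the centroids on their own (unweighted) mean.
    center = centroids.mean(axis=0)
    # Rows of vt are orthonormal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centroids - center, full_matrices=False)
    return center, vt[:n_components]


def transform(X, center, components):
    """Project samples onto the centroid-derived directions."""
    return (X - center) @ components.T


# Toy data: the classes differ only along feature 0, while feature 1
# has the largest variance but carries no class information.
rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0, 0.0], [1.0, 5.0, 1.0], size=(200, 3))
X1 = rng.normal([4.0, 0.0, 0.0], [1.0, 5.0, 1.0], size=(200, 3))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

center, comps = centroid_pca(X, y, n_components=1)
Z = transform(X, center, comps)
```

On this toy data, ordinary PCA would favor the high-variance feature 1, whereas the centroid-based direction aligns mostly with feature 0, which is the one that separates the classes. Note that with k classes the centered centroids span at most k - 1 informative directions, so this sketch cannot supply more meaningful components than that.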
