Abstract

A common problem in classification and regression is the processing inefficiency caused by a large number of input dimensions in the training data set. Many dimensionality reduction approaches have been proposed to address this problem by reducing the number of input dimensions while maintaining the generalization capability of the original data set. However, regression has received less attention than classification. Moreover, most existing methods rely on computations with covariance matrices, which makes the reduction process inefficient. In this paper, we propose a machine-learning-based dimensionality reduction approach for regression problems. For a given set of training instances, a group of clusters is formed such that the instances in the same cluster are similar to each other. One new feature is then extracted from each cluster through a weighted combination of the training instances. Consequently, the dimensionality of the original data set is reduced. The clusters are created incrementally and automatically, so the user does not need to specify the number of clusters in advance. The characteristics of the original data set are substantially retained because all the original features are involved in deriving the extracted features. In addition, computation with covariance matrices is avoided, which keeps the reduction process efficient. Experiments on real-world data sets demonstrate the effectiveness of the proposed approach.
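To make the cluster-then-extract idea concrete, the following is a minimal sketch and not the method of the paper: the abstract does not specify the similarity measure, the weighting scheme, or the extraction formula. The sketch assumes Euclidean distance with a user-chosen `threshold`, equal weights when averaging the instances of a cluster, and a dot product with each cluster centre as the extracted feature. The names `incremental_clusters`, `extract_features`, and `threshold` are hypothetical.

```python
import math


def incremental_clusters(rows, threshold):
    """Form clusters in one pass, without fixing their number in advance.

    Assumption (not from the abstract): a row joins the nearest existing
    cluster when its Euclidean distance to that centre is <= threshold;
    otherwise it starts a new cluster.
    """
    centers, counts = [], []
    for row in rows:
        best, best_dist = None, float("inf")
        for j, center in enumerate(centers):
            d = math.dist(row, center)
            if d < best_dist:
                best, best_dist = j, d
        if best is not None and best_dist <= threshold:
            # Running mean: the centre is an equally weighted combination
            # of the instances assigned to the cluster so far.
            counts[best] += 1
            n = counts[best]
            centers[best] = [c + (x - c) / n for c, x in zip(centers[best], row)]
        else:
            centers.append(list(row))
            counts.append(1)
    return centers


def extract_features(rows, centers):
    """Map each row to one value per cluster (assumed: dot product with the centre).

    Every original feature contributes to every extracted feature,
    and no covariance matrix is computed.
    """
    return [
        [sum(x * c for x, c in zip(row, center)) for center in centers]
        for row in rows
    ]


# Toy data: four instances with four input dimensions.
rows = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.9, 1.1],
]
centers = incremental_clusters(rows, threshold=0.5)
reduced = extract_features(rows, centers)
print(len(rows[0]), "->", len(reduced[0]))
```

In this sketch the number of extracted features equals the number of clusters found, so a smaller `threshold` yields more clusters and a less aggressive reduction.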
