Abstract

Real-world data analysis problems often require nonlinear methods for successful prediction. Kernel methods, e.g. Kernel Principal Component Analysis, are a common way to obtain nonlinear behavior from linear representations in a high-dimensional feature space. Unfortunately, traditional kernel methods do not scale to large or even medium-sized data. Randomized algorithms have recently been proposed to extract nonlinear features in kernel methods; compared with exact kernel methods, this family of approaches speeds up training dramatically while maintaining acceptable classification accuracy. However, these methods do not exploit discriminative features, which significantly limits their classification accuracy. In this paper, we propose a scalable approximate technique called SDRNF that introduces both nonlinear and discriminative features based on randomized methods. By combining randomized kernel approximation with a set of generalized eigenvector problems, the proposed approach is both scalable and accurate on large-scale data. Experiments on two benchmark data sets, MNIST and CIFAR-10, show that our method is fast and scalable, and also achieves better classification accuracy than other competitive kernel approximation methods.
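As a rough illustration of the randomized kernel approximation the abstract refers to, the sketch below builds Random Kitchen Sinks-style random Fourier features for an RBF kernel. The function name and the gamma, n_features, and seed values are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def random_fourier_features(X, n_features=1000, gamma=0.05, seed=0):
    """Approximate an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)
    with explicit random features (Random Kitchen Sinks style).

    X          : (n_samples, n_dims) data matrix
    n_features : number of random features D; larger D -> better approximation
    gamma      : RBF bandwidth parameter (illustrative value, tune per data set)
    """
    rng = np.random.default_rng(seed)
    n_dims = X.shape[1]
    # Frequencies sampled from the Fourier transform of the RBF kernel.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(n_dims, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # Feature map z(x) such that z(x)^T z(y) approximates k(x, y).
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

With such an explicit feature map, training reduces to linear operations on the transformed data, which is what makes this family of methods fast on large data sets.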

Highlights

  • Real-world data analysis problems often require nonlinear methods for successful prediction

  • We evaluate the proposed Scalable and Discriminative Randomized Nonlinear Features (SDRNF) method against other competitive methods, e.g., Random Kitchen Sinks (RKS) and Generalized Eigenvectors for Multi-class (GEM); a sketch of the GEM-style step follows these highlights

  • We propose a scalable, approximate technique called SDRNF that introduces both nonlinear and discriminative features based on randomized methods
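
The highlights mention GEM-style discriminative features obtained from generalized eigenvector problems. The sketch below shows one plausible reading of that step on top of a nonlinear feature matrix, assuming per-class second-moment matrices and a small ridge term for numerical stability; the function name, regularization, and number of directions are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.linalg import eigh

def discriminative_directions(Z, y, class_a, class_b, n_dirs=5, reg=1e-3):
    """Directions v maximizing the ratio (v^T C_a v) / (v^T C_b v) between
    the second-moment matrices of two classes, found as the top generalized
    eigenvectors of the pair (C_a, C_b).

    Z : (n_samples, D) nonlinear feature matrix (e.g. random Fourier features)
    y : (n_samples,) integer class labels as a NumPy array
    """
    Z_a = Z[y == class_a]
    Z_b = Z[y == class_b]
    C_a = Z_a.T @ Z_a / max(len(Z_a), 1)
    C_b = Z_b.T @ Z_b / max(len(Z_b), 1)
    # Small ridge term keeps the right-hand-side matrix positive definite.
    C_b = C_b + reg * np.eye(Z.shape[1])
    # eigh solves the generalized problem C_a v = lambda * C_b v (ascending).
    eigvals, eigvecs = eigh(C_a, C_b)
    # Largest eigenvalues correspond to the most discriminative directions.
    return eigvecs[:, -n_dirs:]
```

In an SDRNF-like pipeline, Z would be the random-feature matrix from the earlier sketch, and the returned columns would serve as discriminative projection directions for a downstream linear classifier.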


Summary

Introduction

Real-world data analysis problems often require nonlinear methods for successful prediction. Randomized algorithms approximate kernel methods and speed up training dramatically while maintaining acceptable classification accuracy, but they do not exploit discriminative features. Kernel Principal Component Analysis (KPCA) [5] and Kernel Discriminant Analysis (KDA) [6] are two common methods for enhancing the compressed representation of the data. They both use the kernel trick to map data into a high-dimensional Reproducing Kernel Hilbert Space, where regular linear PCA and LDA are performed. Both methods are inefficient and hard to use in real applications, especially when the data scale is large.
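To make the scalability concern concrete, here is a minimal exact KPCA sketch with an RBF kernel (gamma and the component count are illustrative assumptions): it must form and eigendecompose an n x n kernel matrix, roughly O(n^2) memory and O(n^3) time, which is what becomes prohibitive at large data scales.

```python
import numpy as np

def exact_kpca(X, n_components=2, gamma=0.05):
    """Minimal exact kernel PCA with an RBF kernel, showing where the cost
    comes from: the full n x n kernel matrix and its eigendecomposition."""
    n = X.shape[0]
    # Full pairwise RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2).
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * sq_dists)
    # Center the kernel matrix in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition of the centered kernel matrix (the expensive step).
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    # Keep the leading components (eigh returns ascending eigenvalues).
    idx = np.argsort(eigvals)[::-1][:n_components]
    # Projections of the training data onto the principal components.
    return eigvecs[:, idx] * np.sqrt(np.clip(eigvals[idx], 0.0, None))
```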

