SDCRKL-GP: Scalable deep convolutional random kernel learning in gaussian process for image recognition

Tingting Wang,Lei Xu,Junbao Li

doi:10.1016/j.neucom.2021.05.092

Abstract

Deep convolutional neural networks have shown great potential in image recognition tasks. However, the fact that the mechanism of deep learning is difficult to explain hinders its development. It involves a large amount of parameter learning, which results in high computational complexity. Moreover, deep convolutional neural networks are often limited by overfitting in regimes in which the number of training samples is limited. Conversely, kernel learning methods have a clear mathematical theory, fewer parameters, and can contend with small sample sizes; however, they are not able to handle high-dimensional data, e.g., images. It is important to achieve a performance and complexity trade-off in complicated tasks. In this paper, we propose a novel scalable deep convolutional random kernel learning in Gaussian process architecture called SDCRKL-GP, which is characterized by excellent performance and low complexity. First, we successfully incorporated the deep convolutional architecture into kernel learning by implementing the random Fourier feature transform for Gaussian processes, which can effectively capture hierarchical and local image-level features. This approach enabled the kernel method to effectively handle image processing problems. Second, we optimized the parameters of deep convolutional filters and Gaussian kernels by stochastic variational inference. Then, we derived the lower variational bound of the marginal likelihood. Finally, we explored the model architecture design space selection method to determine the appropriate network architecture for different datasets. The design space consists of the number of layers, the channels per layer, and so on. Different design space selections improved the scalability of the SDCRKL-GP architecture. We evaluated SDCRKL-GP on the MNIST, FMNIST, CIFAR10, and CALTECH4 benchmark datasets. Taking MNIST as an example, the error rate of classification is 0.60%, and the number of parameters, number of computations and memory access cost of the architecture are 19.088k, 0.984M, and 1.057M, respectively. The experimental results verified that the proposed SDCRKL-GP method outperforms several state-of-the-art algorithms in both accuracy and speed in image recognition tasks. The code is available at https://github.com/w-tingting/deep-rff-pytorch.

Full Text