Abstract

Deep learning has become popular in recent years, driven largely by powerful computing devices such as GPUs. However, many applications, such as face alignment, image classification, and gesture recognition, must be deployed to multimedia devices, smartphones, or embedded systems with limited resources. There is therefore an urgent need for high-performance yet memory-efficient deep learning models. To this end, we design several lightweight deep learning models for different tasks using factorization strategies.

First, we construct a lightweight face alignment model, Block Mobilenet, by proposing a factorization-based deep convolution module named the Depthwise Separable Block (DSB) together with a light but practical module based on the spatial configuration of faces. Experiments on four popular datasets verify that Block Mobilenet achieves better overall performance with a storage size of less than 1 MB. We also design a lightweight module inspired by Singular Value Decomposition (SVD) that reduces the parameters and computations of the original structure.

Second, we explore a general lightweight deep learning module that replaces the convolution layer for image classification: low-rank pointwise residual (LRPR) convolution, yielding a network we call LRPRNet. Essentially, LRPR uses a low-rank approximation to factorize the pointwise convolution while keeping depthwise convolutions as a residual branch that rectifies the low-rank module. Moreover, LRPR is quite general and can be applied directly to many existing network architectures.

Third, given the success of the factorization strategy on convolution operations, we extend factorization to sequence operations such as recurrent neural networks (RNNs), and use it to make evolutionary face alignment lightweight. We propose a computationally efficient deep evolutionary model integrated with 3D Diffusion Heat Maps (DHM), along with an efficient network structure that accelerates the evolutionary learning process through a factorization strategy. We also propose a fast recurrent module to replace the traditional RNN for real-time regression.

Finally, we factorize the features themselves for single image super-resolution (SISR). Factorizing features reduces the feature size and thus the computational cost; however, reducing the spatial size is counter-intuitive for super-resolution. Through this exploration, we present the Hybrid Pixel-Unshuffled Network (HPUN), which factorizes features to stay lightweight while maintaining high performance. Specifically, we use the pixel-unshuffle operation to factorize the input features, and then improve performance with grouped convolution, max-pooling, and a self-residual connection. Experiments on popular benchmarks show that this factorization strategy achieves state-of-the-art performance on SISR.

--Author's abstract
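
To make the factorization ideas above concrete, the sketches below illustrate them in PyTorch. They are assumption-level illustrations rather than the dissertation's exact implementations. The first shows the depthwise separable factorization underlying a module like the DSB: a k x k convolution is split into a per-channel depthwise convolution and a 1x1 pointwise convolution, cutting parameters roughly from k^2 * C_in * C_out to k^2 * C_in + C_in * C_out.

```python
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """Minimal sketch of a depthwise separable convolution block.

    Illustrates the general factorization (depthwise conv followed by a
    pointwise conv); the exact DSB design in the dissertation may differ.
    """
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # Pointwise: 1x1 conv mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```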
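
The SVD-inspired module can be sketched as replacing one dense layer with two thin factors. For example, a 1024 x 1024 layer (about 1.05M parameters) approximated at rank 64 needs only 64 x (1024 + 1024) = 131,072 parameters, an 8x reduction. This is a generic low-rank sketch, not the dissertation's specific design.

```python
import torch.nn as nn

def low_rank_linear(in_features, out_features, rank):
    """Sketch of an SVD-style factorization: one dense layer with
    in_features * out_features weights becomes two thin layers whose
    product approximates it, using rank * (in_features + out_features)
    weights instead."""
    return nn.Sequential(
        nn.Linear(in_features, rank, bias=False),  # plays the role of S V^T
        nn.Linear(rank, out_features, bias=True),  # plays the role of U
    )
```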
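
LRPR can likewise be sketched as a low-rank factorized pointwise convolution plus a depthwise residual branch. The sketch below assumes equal input and output channel counts so the residual adds directly; the published LRPRNet design may handle channel changes differently.

```python
import torch.nn as nn

class LRPRConv(nn.Module):
    """Sketch of low-rank pointwise residual (LRPR) convolution,
    assuming in_ch == out_ch (an illustration-only assumption)."""
    def __init__(self, channels, rank):
        super().__init__()
        # Low-rank factorization of the 1x1 (pointwise) convolution:
        # channels * channels weights -> 2 * channels * rank weights.
        self.reduce = nn.Conv2d(channels, rank, kernel_size=1, bias=False)
        self.expand = nn.Conv2d(rank, channels, kernel_size=1, bias=False)
        # Depthwise convolution serves as the residual branch that
        # rectifies the low-rank approximation.
        self.residual = nn.Conv2d(channels, channels, kernel_size=3,
                                  padding=1, groups=channels, bias=False)

    def forward(self, x):
        return self.expand(self.reduce(x)) + self.residual(x)
```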
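
The abstract does not specify the fast recurrent module, so the next sketch only illustrates how the same low-rank idea carries over to a recurrent cell: the hidden-to-hidden matrix (hidden_size^2 parameters) is replaced by two rank-r factors.

```python
import torch
import torch.nn as nn

class FactorizedRNNCell(nn.Module):
    """Assumption-level sketch of a recurrent cell whose recurrent weight
    matrix is factorized into two low-rank factors; not the dissertation's
    actual fast recurrent module."""
    def __init__(self, input_size, hidden_size, rank):
        super().__init__()
        self.w_in = nn.Linear(input_size, hidden_size, bias=True)
        # hidden_size^2 recurrent weights -> 2 * hidden_size * rank weights.
        self.u = nn.Linear(hidden_size, rank, bias=False)
        self.v = nn.Linear(rank, hidden_size, bias=False)

    def forward(self, x, h):
        # h_next = tanh(W_in x + V U h), with V U approximating W_hh.
        return torch.tanh(self.w_in(x) + self.v(self.u(h)))
```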
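
For the SISR part, pixel-unshuffle resolves the counter-intuition noted above: it trades spatial resolution for channels losslessly, so features shrink spatially without discarding information. The sketch below combines it with a grouped convolution and a self-residual connection; HPUN's actual block (including its max-pooling component) is more elaborate.

```python
import torch.nn as nn

class PixelUnshuffleDown(nn.Module):
    """Sketch of pixel-unshuffle-based feature factorization: spatial size
    shrinks by r while channels grow by r^2 (channels * r * r must be
    divisible by `groups`); only a simplified stand-in for HPUN's block."""
    def __init__(self, channels, r=2, groups=4):
        super().__init__()
        self.unshuffle = nn.PixelUnshuffle(r)  # (C, H, W) -> (C*r^2, H/r, W/r)
        # Grouped conv mixes the rearranged channels cheaply.
        self.conv = nn.Conv2d(channels * r * r, channels * r * r,
                              kernel_size=3, padding=1, groups=groups,
                              bias=False)
        self.shuffle = nn.PixelShuffle(r)      # restore spatial resolution

    def forward(self, x):
        y = self.conv(self.unshuffle(x))
        return x + self.shuffle(y)             # self-residual connection
```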
