Conventional transforms such as the Discrete Walsh-Hadamard Transform (DWHT) and the Discrete Cosine Transform (DCT) have been widely used as feature extractors in image processing but rarely applied in neural networks. However, we found that these conventional transforms can serve as powerful feature extractors in the channel dimension, without any learnable parameters, in deep neural networks. This paper is the first to propose applying conventional transforms in place of pointwise convolution, showing that such transforms can significantly reduce the computational complexity of neural networks without accuracy degradation on various classification tasks and even on a face detection task. Our comprehensive experiments show that the proposed DWHT-based model achieved a 1.49% accuracy gain with 79.4% fewer parameters and 49.4% fewer FLOPs than its baseline model on the CIFAR-100 dataset, while achieving comparable accuracy with 81.4% fewer parameters and 49.4% fewer FLOPs on the SVHN dataset. Additionally, our DWHT-based model showed comparable accuracy with 89.2% fewer parameters and 26.5% fewer FLOPs compared to the baseline models on the WIDER FACE and FDDB datasets.
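The core idea is that a parameter-free transform applied along the channel axis can mix channel information in the way a learnable 1x1 (pointwise) convolution does. The sketch below is not the authors' code; it is a minimal illustration, assuming a power-of-two channel count and an orthonormal fast Walsh-Hadamard transform, with hypothetical names and shapes.

```python
import numpy as np

def fwht_channels(x):
    """Fast Walsh-Hadamard transform over the channel axis.

    x: feature map of shape (C, H, W) with C a power of two.
    Returns a tensor of the same shape; uses no learnable parameters,
    only additions and subtractions plus a final scaling.
    """
    c = x.shape[0]
    assert c & (c - 1) == 0, "channel count must be a power of two"
    y = x.copy()
    h = 1
    while h < c:
        # Butterfly step of the iterative Walsh-Hadamard transform.
        for i in range(0, c, h * 2):
            a = y[i:i + h].copy()
            b = y[i + h:i + 2 * h].copy()
            y[i:i + h] = a + b
            y[i + h:i + 2 * h] = a - b
        h *= 2
    return y / np.sqrt(c)  # orthonormal scaling

# Usage: channel mixing analogous to a 1x1 convolution, but with zero parameters.
feat = np.random.randn(64, 32, 32).astype(np.float32)  # (C, H, W)
mixed = fwht_channels(feat)
print(mixed.shape)  # (64, 32, 32)
```

In this reading, the savings reported in the abstract come from removing the C_in x C_out weight matrix of each replaced pointwise convolution and substituting multiply-accumulate operations with the add/subtract butterflies above.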