Abstract

Deep Convolutional Neural Networks (CNNs) have been extremely effective in many areas of computer vision, particularly in facial recognition, where many CNN models have delivered near-perfect performance on long-standing benchmarks. A major challenge facing CNN architectures today is their need for large amounts of memory and processing power, which hampers their deployment on less powerful devices such as smartphones, whose ubiquity is unmatched. This paper critically analyzes a CNN-based network architecture in which convolutions are approximated by XNOR operations and bit-counts, both of which are binary operations, making the network much faster and more memory-efficient than standard CNN architectures. The aims of this paper are to assess the performance of this architecture on the task of facial recognition, with Minimum Barrier salient object Detection (MBD) as a preprocessing step, and to examine the effect of the self-normalizing properties induced by the Scaled Exponential Linear Unit (SELU) activation function on the convergence of the network, benchmarking it against the convergence of the same network using the widely adopted Rectified Linear Unit (ReLU) activation function. Two widely known datasets, Faces95 and Faces96, pre-processed using MBD for saliency detection, were used to benchmark the performance of the proposed networks. It was observed that the network using SELU converged faster than the one using ReLU on the Faces95 dataset, while both converged at similar rates on the Faces96 dataset. The network with SELU gave better accuracy than the one with ReLU on both datasets.
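The core speed-up the abstract refers to comes from the standard XNOR-network identity: once weights and activations are binarized to {-1, +1} and packed as bits (+1 → 1, -1 → 0), a dot product of length N equals 2 · popcount(XNOR(a, b)) − N, so multiply-accumulate loops collapse into bitwise operations. The following is a minimal illustrative sketch of that identity, not code from the paper; all names (`pack`, `xnor_dot`) are hypothetical.

```python
# Illustrative sketch (assumption: not the paper's implementation) of the
# XNOR + bit-count approximation of a dot product for binarized vectors.
N = 8

def pack(vec):
    """Pack a {-1, +1} vector into an integer bit mask (+1 -> 1, -1 -> 0)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def xnor_dot(a_bits, b_bits, n=N):
    """Dot product of two packed {-1, +1} vectors via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask      # XNOR: 1 where the signs agree
    return 2 * bin(agree).count("1") - n   # (#agree) - (#disagree)

a = [1, -1, -1, 1, 1, 1, -1, 1]
b = [1, 1, -1, -1, 1, -1, -1, 1]

exact = sum(x * y for x, y in zip(a, b))   # ordinary multiply-accumulate
fast = xnor_dot(pack(a), pack(b))          # binary approximation
assert exact == fast == 2
```

A binarized convolution applies this same identity to every sliding-window patch, which is why the network is far cheaper than one using floating-point convolutions.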
