How Deep Should be the Depth of Convolutional Neural Networks: a Backyard Dog Case Study

Alexander N Gorban,Ivan Y Tyukin,Evgeny M Mirkes

doi:10.1007/s12559-019-09667-7

Abstract

The work concerns the problem of reducing a pre-trained deep neuronal network to a smaller network, with just few layers, whilst retaining the network’s functionality on a given task. In this particular case study, we are focusing on the networks developed for the purposes of face recognition. The proposed approach is motivated by the observation that the aim to deliver the highest accuracy possible in the broadest range of operational conditions, which many deep neural networks models strive to achieve, may not necessarily be always needed, desired or even achievable due to the lack of data or technical constraints. In relation to the face recognition problem, we formulated an example of such a use case, the ‘backyard dog’ problem. The ‘backyard dog’, implemented by a lean network, should correctly identify members from a limited group of individuals, a ‘family’, and should distinguish between them. At the same time, the network must produce an alarm to an image of an individual who is not in a member of the family, i.e. a ‘stranger’. To produce such a lean network, we propose a network shallowing algorithm. The algorithm takes an existing deep learning model on its input and outputs a shallowed version of the model. The algorithm is non-iterative and is based on the advanced supervised principal component analysis. Performance of the algorithm is assessed in exhaustive numerical experiments. Our experiments revealed that in the above use case, the ‘backyard dog’ problem, the method is capable of drastically reducing the depth of deep learning neural networks, albeit at the cost of mild performance deterioration. In this work, we proposed a simple non-iterative method for shallowing down pre-trained deep convolutional networks. The method is generic in the sense that it applies to a broad class of feed-forward networks, and is based on the advanced supervise principal component analysis. The method enables generation of families of smaller-size shallower specialized networks tuned for specific operational conditions and tasks from a single larger and more universal legacy network.

Highlights

With the explosive pace of progress in computing, availability of cloud resources and open-source dedicated software frameworks, current artificial intelligence (AI) systems are capable of spotting minute patterns in large data sets and may outperform humans and early-generation AIs in highly complicated cognitive tasks including object detection [1], medical diagnosis [2] and face and facial expression recognition [3, 4]
We propose a technology and an algorithm for constructing a family of the ‘backyard dog’ networks derived from larger pretrained legacy convolutional neural nets (CNN)
The method is generic in the sense that it applies to a broad class of feed-forward networks, and is based on the ASPCA

Summary

Introduction

With the explosive pace of progress in computing, availability of cloud resources and open-source dedicated software frameworks, current artificial intelligence (AI) systems are capable of spotting minute patterns in large data sets and may outperform humans and early-generation AIs in highly complicated cognitive tasks including object detection [1], medical diagnosis [2] and face and facial expression recognition [3, 4]. Cogn Comput (2020) 12:388–397 context of face recognition [4], these include the need for larger volumes of high-resolution and balanced training and validation data as well as the inevitable presence of hardware constraints limiting training and deployment of large models. Hardware limitations, such as memory constraints, restrict adoption, development and spread of technology. These challenges constitute fundamental obstacles for creation of universal data-driven AI systems, including for face recognition

Objectives

Results

Conclusion