Abstract

Formulation of the problem. The problem of deploying large neural networks with complex architectures on modern devices is considered. Such networks perform well, but their speed is sometimes unacceptably low, and the memory required to place them on a device is not always available. The paper briefly describes how these problems can be addressed with pruning and quantization. It then proposes an unconventional type of neural network that can meet requirements on memory footprint, speed, and quality of operation, and describes approaches to training networks of this type. The aim of the work is to describe modern approaches to reducing the size of neural networks with minimal loss of quality and to propose an alternative type of network that is small in size yet highly accurate.

Results. The proposed type of neural network has substantial advantages in terms of its size and the flexibility of its layer settings. By varying the parameters of the layers, one can control the size, speed, and quality of the network; higher accuracy, however, requires a larger memory footprint. To train such a small network, it is proposed to use techniques that allow it to learn complex dependencies from a larger and more complex network. After this training procedure, only the small network is retained, and it can then be deployed on low-power devices with a small amount of memory.

Practical significance. The described methods make it possible to reduce the size of networks with minimal loss of quality. The proposed architecture makes it possible to train simpler networks without applying size-reduction techniques to them. These networks can work with various kinds of data, whether images, text, or other information encoded as a numerical vector.
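To make the pruning and quantization techniques mentioned above concrete, the following is a minimal illustrative sketch in PyTorch; the framework, model architecture, pruning ratio, and quantization settings are assumptions chosen for illustration and are not taken from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    # Hypothetical small classifier; the paper does not specify an architecture.
    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

    # Pruning: zero out the 30% smallest-magnitude weights in each Linear layer.
    for module in model:
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=0.3)
            prune.remove(module, "weight")  # bake the pruning mask into the weights

    # Quantization: store Linear weights as int8 for inference, shrinking the
    # model roughly 4x relative to float32 at a small cost in accuracy.
    quantized_model = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

Magnitude pruning removes the least important weights, while dynamic quantization stores the remaining weights at lower precision; both trade a small amount of accuracy for size, which is exactly the trade-off the abstract describes.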
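The training procedure described under Results, in which a small network learns complex dependencies from a larger and more voluminous one, matches the standard knowledge-distillation setup. The sketch below shows that standard technique, again assuming PyTorch; the temperature and weighting values are common defaults, not values from the paper.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=4.0, alpha=0.5):
        """Blend the teacher's softened predictions with the true hard labels."""
        # Soft targets: KL divergence between the softened student and teacher
        # distributions, scaled by T^2 to keep gradient magnitudes comparable.
        soft = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            F.softmax(teacher_logits / temperature, dim=1),
            reduction="batchmean",
        ) * (temperature ** 2)
        # Hard targets: ordinary cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

During training the teacher runs with gradients disabled; after training only the student is kept, which is what allows deployment on low-power devices with little memory.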
