Abstract
Current methods for recognizing objects in images perform poorly and rely on intellectually unsatisfying approaches. Existing identification systems and methods do not fully solve the identification problem, in particular identification under difficult conditions: interference, variable lighting, various changes to the face, and so on. To address these problems, a local detector for a reprint model of an object in an image was developed and is described. For the local detector, a transforming autoencoder (TA), a neural network model, was developed; this model is a subclass of the general class of reduced-dimension neural networks. In addition to detecting a modified object, the local detector can also recover the object's original shape. A distinctive feature of the TA is that it represents image regions in a compact form and estimates the parameters of an affine transformation. The transforming autoencoder is a heterogeneous network (HN) consisting of a set of smaller networks called capsules. Artificial neural networks should use local capsules that perform fairly complex internal computations on their inputs and then encapsulate the results of these computations in a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual object over a limited domain of viewing conditions and deformations. It outputs both the probability that the object is present in its limited domain and a set of "instance parameters" that may include the precise pose, lighting, and deformation of the visual object relative to an implicitly defined canonical version of that object. The main advantage of capsules that output instance parameters is that they provide a simple way to recognize whole objects by recognizing their parts.
A capsule can learn to output the pose of its visual object as a vector that is linearly related to the "natural" pose representations used in computer graphics. This yields a simple, highly selective test of whether the visual objects represented by two active capsules A and B stand in the correct spatial relationship to activate a higher-level capsule C. The transforming autoencoder solves the problem of identifying facial images under interference (noise) and changes in illumination and viewing angle.
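The capsule mechanism described above can be illustrated with a minimal sketch. The code below is not the authors' implementation; it is a translation-only toy forward pass, with randomly initialised weights and made-up layer sizes, showing the three ingredients the abstract names: recognition units that infer a presence probability and a pose, an externally supplied transformation added to that pose, and generation units that reconstruct the transformed patch gated by the presence probability.

```python
import numpy as np

rng = np.random.default_rng(0)

def capsule_forward(x, delta, params):
    """One capsule of a transforming autoencoder (translation-only toy).

    x      : flattened input image patch, shape (D,)
    delta  : known 2-D shift (dx, dy) between input and target patch
    params : dict of weight matrices (randomly initialised here)
    """
    h = np.tanh(params["W_rec"] @ x)                 # recognition units
    p = 1.0 / (1.0 + np.exp(-(params["w_p"] @ h)))   # probability the object is present
    pose = params["W_pose"] @ h                      # inferred (x, y) position
    pose_out = pose + delta                          # apply the known transformation
    g = np.tanh(params["W_gen"] @ pose_out)          # generation units
    y = p * (params["W_out"] @ g)                    # reconstruction, gated by p
    return p, pose, y

# Hypothetical sizes: 8x8 patch, 16 recognition and 16 generation units.
D, H, G = 64, 16, 16
params = {
    "W_rec":  rng.normal(0, 0.1, (H, D)),
    "w_p":    rng.normal(0, 0.1, (H,)),
    "W_pose": rng.normal(0, 0.1, (2, H)),
    "W_gen":  rng.normal(0, 0.1, (G, 2)),
    "W_out":  rng.normal(0, 0.1, (D, G)),
}

x = rng.normal(size=D)
p, pose, y = capsule_forward(x, delta=np.array([1.0, 0.0]), params=params)
```

In a full transforming autoencoder, many such capsules run in parallel, their gated reconstructions are summed, and the weights are trained so that the sum matches the shifted target patch; handling a full affine transformation replaces the 2-vector pose with a pose matrix.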
Highlights
Financial disclosure: The author has no financial or property interest in any material or method mentioned
A transforming autoencoder (TA), a neural network model, was developed for the local detector. This model is a subclass of the general class of reduced-dimension neural networks
A distinctive feature of the TA is that it represents image regions in a compact form and estimates the parameters of an affine transformation
Summary
The problem of recognizing objects in images remains topical, since existing systems and methods do not fully solve the identification problem under difficult conditions: interference, lighting, various changes to the face, and so on. To solve this problem, a local detector for a reprint model of an object in an image was developed and described. For the local detector, a transforming autoencoder (TA), a neural network model, was developed. The transforming autoencoder is a heterogeneous network (HN) consisting of a set of smaller networks called capsules. It outputs both the probability that an object is present in its limited domain and a set of "instance parameters" that may include the precise pose, lighting, and deformation of the visual object relative to an implicitly defined canonical version of that object. The transforming autoencoder solves the problem of identifying facial images under interference (noise) and changes in illumination and viewing angle. The structure of the local detector of the reprint model of the object in the image.