Abstract
We propose a model for image classification by attentional search. Analogous to how humans scan an image through a sequence of saccades, an attentional window much smaller than the target scans the target via a sequence of "saccades", integrates the acquired information, and makes a classification decision. To process a sequence of attended image segments, the network must have memory, which is incorporated through three kinds of recurrent elements in the network architecture: Elman connections, Jordan connections, and flip-flop neurons. The model is designed as three separate channels: a classifier network, an eye-position network, and a saccade network. Multiple attentional windows with different resolutions and a common center are given as input to the classifier network and the saccade network, while a heat-map representation of the location of the attentional windows is given as input to the eye-position network. The saccade network predicts the next jump of the attentional windows, guided by reward signals derived from the classifier network. The output features of the three channels are concatenated before terminating in two output layers representing the class prediction and the next saccade prediction. The model is trained using the deep Q-learning algorithm. The attentional search model is evaluated on the MNIST handwritten digit, Kannada MNIST, Medical-MNIST, OCTMNIST, and QuickDraw datasets. Translated and Cluttered Translated versions of each dataset are generated to test classification based on local target search, while the original datasets are used to test classification based on search with global target integration. We also evaluate saccade performance on the Extended Yale Face Database B. Across these problem cases, the model exhibits comparable or superior performance to a state-of-the-art recurrent attention model. Demo code is available at this link.
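The three-channel design described above can be sketched as a single forward pass: each channel maps its input to a feature vector, the three feature vectors are concatenated, and two heads produce the class prediction and the saccade Q-values. This is an illustrative sketch only, not the paper's implementation: all dimensions, the tanh activations, the untrained random weights, and the omission of the recurrent elements (Elman, Jordan, flip-flop) are assumptions for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- the paper's actual layer sizes are not given here.
GLIMPSE_DIM = 64   # flattened multi-resolution attentional windows
HEATMAP_DIM = 49   # flattened eye-position heat map (e.g. 7x7)
FEAT_DIM = 32      # per-channel feature size
N_CLASSES = 10     # e.g. MNIST digit classes
N_SACCADES = 9     # discrete next-jump actions (illustrative)

def layer(in_dim, out_dim):
    """Random weight matrix standing in for a trained layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))

# One feature extractor per channel: classifier, eye-position, saccade.
W_cls = layer(GLIMPSE_DIM, FEAT_DIM)
W_eye = layer(HEATMAP_DIM, FEAT_DIM)
W_sac = layer(GLIMPSE_DIM, FEAT_DIM)

# Two output heads on the concatenated channel features.
W_class_head = layer(3 * FEAT_DIM, N_CLASSES)
W_saccade_head = layer(3 * FEAT_DIM, N_SACCADES)

def forward(glimpse, heatmap):
    """One pass: per-channel features -> concatenation -> two heads."""
    features = np.concatenate([
        np.tanh(glimpse @ W_cls),   # classifier channel
        np.tanh(heatmap @ W_eye),   # eye-position channel
        np.tanh(glimpse @ W_sac),   # saccade channel
    ])
    class_logits = features @ W_class_head   # class prediction
    saccade_q = features @ W_saccade_head    # Q-values for next saccade
    return class_logits, saccade_q

logits, q = forward(rng.normal(size=GLIMPSE_DIM), rng.normal(size=HEATMAP_DIM))
print(logits.shape, q.shape)  # (10,) (9,)
```

In deep Q-learning, the saccade head would be trained on temporal-difference targets built from the classification reward, while the class head is trained with a standard supervised loss; the sketch shows only the shared-feature, two-head topology.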