Abstract

This paper presents FocalNet, an iterative information extraction algorithm that uses the concept of foveal attention to post-process the outputs of Deep Neural Networks (DNNs) by performing variable sampling of the input/feature space. FocalNet is integrated into an existing task-driven deep learning model without modifying the weights of the network. The layers at which to perform foveation are selected automatically using a data-driven approach. We apply FocalNet to the task of object detection using a state-of-the-art convolutional detector, RFCN ResNet-101. On the PASCAL VOC 2007 dataset, we achieve a mAP increase of 3.7%. On the MS COCO 2017 validation dataset, we achieve a mAP increase of 0.3%. Further, a larger increase in mAP is observed for a computationally efficient detector (1.7% for Faster R-CNN with ResNet50). In addition to object detection, we show the effectiveness of FocalNet on the problem of single object tracking, with a 0.5% increase in average IoU on the MOT17 dataset over a simple tracking-by-detection approach using DNNs.
