Improving Neural Non-Maximum Suppression for Object Detection by Exploiting Interest-Point Detectors

Charalampos Symeonidis, Ioannis Mademlis, Nikos Nikolaidis, Ioannis Pitas

doi:10.1109/mlsp.2019.8918769

Abstract

Non-maximum suppression (NMS) is a post-processing step in almost every visual object detector. Its goal is to drastically prune the number of overlapping detected candidate regions-of-interest (ROIs) and replace them with a single, more spatially accurate detection. The default algorithm (Greedy NMS) is fairly simple and suffers from drawbacks due to its need for manual tuning. Recently, NMS has been improved using deep neural networks that learn, in a supervised manner, to solve a detection rescoring task based on spatial overlap, where only ROI coordinates are exploited as input. In this paper, neural NMS performance is augmented by feeding the network additional information extracted from the appearance of each candidate ROI. This information captures statistical properties of the spatial distribution of interest-points detected within the corresponding image region. Thus, the deviation in 2D distribution between the interest-points detected inside an ROI that encloses the actual object entirely and those detected inside an ROI that captures it only partially is exploited as a discriminant factor, with the NMS network being implicitly forced to also learn how to solve an additional, appearance-based binary classification task (complete vs. partial object silhouettes). The empirical evaluation on three public person detection datasets leads to state-of-the-art results, at a small computational overhead.
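
For readers unfamiliar with the baseline, the following is a minimal sketch of the commonly used Greedy NMS formulation, not the neural method proposed in this paper. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)` tuples and a hand-picked overlap threshold; the names `iou`, `greedy_nms`, and `iou_threshold` are illustrative choices, not taken from the paper.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2), with x2 > x1 and y2 > y1


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    # Visit candidates in order of decreasing detector confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # A candidate survives only if it does not overlap too much
        # with any higher-scoring box that has already been kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Toy example: two heavily overlapping candidates and one distant candidate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

kept_strict = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.7)
print(kept_strict, kept_loose)
```

In this toy example the first two boxes have an IoU of roughly 0.68, so the second is suppressed at a threshold of 0.5 but retained at 0.7. This sensitivity to a single hand-set threshold is the kind of manual tuning that the abstract identifies as a drawback of Greedy NMS.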

Full Text