Abstract
Recent top-performing methods on PASCAL VOC [6] and ImageNet [13] use object proposals in place of exhaustive window search. The effectiveness of object proposals rests on the assumption that general cues exist to differentiate objects from the background. Since the first work by Alexe et al. [1], many object proposal methods have been introduced [2, 3, 4, 5, 7, 8, 10, 11, 12, 14, 15] and tested on various large-scale datasets [6, 9, 13], and their overall detection rates at different thresholds or numbers of windows have been reported. Yet such partial performance summaries reveal little about a method's strengths and weaknesses, which limits their value for guiding further improvement, and users still have difficulty choosing a method for their applications. A more detailed analysis of existing state-of-the-art methods is therefore critical for future research and applications.

Our contributions are threefold. First, we investigate, for the first time, the influence of object-level characteristics on state-of-the-art object proposal methods. Although similar analyses exist for categorical object detection, to the best of our knowledge little research has been conducted on the object proposal side. Second, we introduce the concept of localization latency to evaluate a method's localization efficiency and accuracy. Third, we create a fully annotated PASCAL VOC dataset with various object-level characteristics to facilitate our analysis. The annotations took us nearly one month to produce and will be released to facilitate further related research.

Our experiments are based on the PASCAL VOC2007 test set, which has been widely used to evaluate object proposal methods. A proposed window $B$ is treated as a detection if its Intersection-over-Union (IoU) with a ground truth bounding box $B_{gt}$,

$$\mathrm{IoU}(B, B_{gt}) = \frac{\mathrm{area}(B \cap B_{gt})}{\mathrm{area}(B \cup B_{gt})},$$

is above a certain threshold $T$. We first study the localization accuracy of the existing methods.
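The IoU detection criterion described above can be illustrated with a short sketch. This is not the authors' evaluation code. The box layout `(x_min, y_min, x_max, y_max)`, the continuous-coordinate area convention (no +1 pixel offset), and the helper names `iou` and `detection_rate` are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in continuous coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_rate(
    proposals: Sequence[Box],
    ground_truths: Sequence[Box],
    threshold: float = 0.5,
) -> float:
    """Fraction of ground-truth boxes matched by at least one proposal
    whose IoU is strictly above the threshold (hypothetical helper)."""
    if not ground_truths:
        return 0.0
    hits = sum(
        any(iou(p, gt) > threshold for p in proposals)
        for gt in ground_truths
    )
    return hits / len(ground_truths)
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so `iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7. Sweeping `threshold` over a range of values gives a detection-rate-versus-threshold curve of the kind the text refers to.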
Region-based methods achieve higher localization accuracy than window-based methods. MCG and SelectiveSearch are the top-performing region-based methods, though the window-based EdgeBox shows comparable performance. Localization accuracy is similar across region-based methods. One potential explanation is that all region-based methods follow a similar pipeline, grouping superpixels using either learned or handcrafted edge measures. A good object proposal method should not only produce candidates with high accuracy but also use as few windows as possible. To summarize a method's performance in terms of both accuracy and number of windows, we propose the localization latency metric.