Abstract

Demand for visual instance retrieval in public security systems is growing rapidly, especially over large-scale image and video databases. Because of their wide range of applications in surveillance scenarios, this paper focuses on retrieval tasks centered on ‘vehicle’ and ‘pedestrian’ targets. Many previous CNN-based methods do not exploit the ensemble abilities of different models, and their accuracy is limited because no single deep architecture is comprehensive. Moreover, some features in the original deep representation are useless for retrieval, whereas an attention-aware compact representation is both more efficient and more effective. To address these problems, we propose a Selective Deep Ensemble (SDE) framework, inspired by the attention mechanism, that combines various models and features in a complementary way. We show that it yields a large improvement at only a slight increase in computational cost. Finally, we evaluate the framework on three public instance-retrieval datasets, VehicleID, VeRi and Market-1501, where it outperforms state-of-the-art methods by a large margin.
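As a rough illustration of the general idea of selecting features from several models and fusing them for retrieval, the following minimal sketch may help. It is not the SDE framework itself: the abstract does not specify how features are selected or combined, so the fixed boolean masks, the concatenation-based fusion, the cosine-similarity ranking, and all function and variable names here are assumptions made for illustration only.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def select_and_fuse(features, masks):
    """Keep the masked dimensions of each model's features, then concatenate.

    features: list of (n_items, d_k) arrays, one per model (hypothetical inputs).
    masks:    list of boolean (d_k,) arrays marking dimensions to keep
              (an assumed stand-in for an attention-based selection).
    """
    parts = [l2_normalize(f[:, m]) for f, m in zip(features, masks)]
    return l2_normalize(np.concatenate(parts, axis=1))

def rank_gallery(query, gallery):
    """Return gallery indices sorted by descending cosine similarity."""
    scores = gallery @ query  # rows are unit length, so this is cosine similarity
    return np.argsort(-scores, kind="stable")

# Toy data: two "models" describing three gallery items and one query.
mask_a = np.array([True, True, False])  # drop the uninformative third dimension
mask_b = np.array([True, True])
gallery_a = np.array([[1.0, 0.0, 9.0], [0.0, 1.0, 9.0], [1.0, 1.0, 9.0]])
gallery_b = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
query_a = np.array([[1.0, 0.0, 0.0]])
query_b = np.array([[1.0, 0.0]])

gallery = select_and_fuse([gallery_a, gallery_b], [mask_a, mask_b])
query = select_and_fuse([query_a, query_b], [mask_a, mask_b])[0]
ranking = rank_gallery(query, gallery)
print(ranking.tolist())
```

In this toy example the constant third dimension of the first model's features would dominate the similarity if it were kept, so masking it out before fusion is what lets the ranking reflect the informative dimensions.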
