Abstract

The rapid development and popularization of video surveillance have highlighted vehicle reidentification (re-ID) as a critical and challenging problem, one that suffers from limited interinstance discrepancy between different vehicle identities and large intrainstance differences for the same vehicle. In this article, we propose a novel multilevel attention network that hierarchically learns an efficient feature embedding for vehicle re-ID. The network uses three kinds of attention: hard local-level attention to localize salient vehicle parts, soft pixel-level attention to refine attended pixels both globally and locally, and spatial attention to enhance the encoder’s spatial awareness of salient regions within the windscreen area. Multigrain features are then learned progressively, from semantic awareness to spatial awareness, guaranteeing intraclass compactness and interclass separability for vehicle re-ID. Extensive experiments validate the effectiveness of each attention component and demonstrate that our approach outperforms state-of-the-art re-ID methods on two challenging datasets: VehicleID and Vehicle-1M.

