Abstract
The goal of fine-grained vehicle recognition is to identify the exact subtype of a vehicle from a given image. It plays an important role in intelligent traffic surveillance systems. Although fine-grained vehicle recognition has attracted growing research interest, it remains an open problem for vehicle images taken from arbitrary viewpoints. In this study, we present a one-stage deep multi-task learning framework for fine-grained vehicle recognition in traffic surveillance, which performs fine-grained vehicle recognition and viewpoint estimation simultaneously. In the proposed framework, fine-grained vehicle recognition is the main task, which classifies images into different vehicle types, and viewpoint estimation is the auxiliary task, which learns viewpoint-aware features that help the main task. We evaluate our method on two commonly used large-scale fine-grained vehicle recognition datasets, BoxCars116k and CompCars. The experimental results show that the proposed multi-task framework improves recognition accuracy by incorporating the viewpoint information of the vehicles. Compared with the state of the art, our approach goes further in that it requires 3D bounding boxes only in the training phase, which is important for later inference with the trained model.
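To illustrate how a main task and an auxiliary task can be trained jointly, the following is a minimal sketch of a combined multi-task objective. It is not the loss used in this study. It assumes that viewpoint estimation is cast as classification over discretized viewpoint bins, that both tasks use softmax cross-entropy, and that the two losses are combined by a weighted sum. The function names and the `aux_weight` parameter are hypothetical and chosen only for illustration.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy for one sample, given raw logits and an integer class index."""
    # Subtract the maximum logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multi_task_loss(type_logits, type_label, view_logits, view_label, aux_weight=0.5):
    """Weighted sum of the main (vehicle type) and auxiliary (viewpoint) losses.

    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    main_loss = softmax_cross_entropy(type_logits, type_label)
    aux_loss = softmax_cross_entropy(view_logits, view_label)
    return main_loss + aux_weight * aux_loss


if __name__ == "__main__":
    # Toy logits: 3 vehicle types and 4 viewpoint bins (both made up).
    loss = multi_task_loss([2.0, 0.5, -1.0], 0, [0.1, 1.2, 0.3, -0.4], 1)
    print(f"combined loss: {loss:.4f}")
```

In such a setup both heads would share one feature extractor, so gradients from the viewpoint loss also shape the features used for type classification. Setting `aux_weight` to zero recovers single-task training of the main task.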