Abstract

The goal of fine-grained vehicle recognition is to identify the exact subtype of a vehicle from a given image. It plays an important role in intelligent traffic surveillance systems. Although fine-grained vehicle recognition has attracted growing research interest, it remains an open problem for vehicle images taken from arbitrary viewpoints. In this study, we present a one-stage deep multi-task learning framework for fine-grained vehicle recognition in traffic surveillance, which performs fine-grained vehicle recognition and viewpoint estimation simultaneously. In the proposed framework, fine-grained vehicle recognition is the main task, which classifies images into different types, and viewpoint estimation is the auxiliary task, which learns viewpoint-aware features that improve the main task. We evaluate our method on two commonly used large-scale fine-grained vehicle recognition datasets, BoxCars116k and CompCars. The experimental results show that the proposed multi-task framework improves accuracy by incorporating the viewpoint information of the vehicles. Compared with state-of-the-art methods, our approach requires 3D bounding boxes only in the training phase, which is important for future inference with the trained model.
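
To illustrate how a main task and an auxiliary task can be trained jointly, the following is a minimal sketch of a weighted multi-task objective. It is not the formulation used in this study: the abstract does not specify the loss. The sketch assumes that viewpoint is discretized into a fixed number of bins and treated as classification, that both heads use softmax cross-entropy, and that the auxiliary term is scaled by a hypothetical weight `aux_weight`. All function and parameter names are illustrative.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of a softmax over `logits` against integer class `target`."""
    m = max(logits)  # subtract the max before exponentiating for stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multitask_loss(type_logits, type_label, view_logits, view_label,
                   aux_weight=0.5):
    """Combine the main (vehicle type) and auxiliary (viewpoint) losses.

    `aux_weight` is a hypothetical hyperparameter, and treating viewpoint as
    a discrete classification target is an assumption of this sketch.
    Returns (total, main, aux) so each term can be logged separately.
    """
    main = softmax_cross_entropy(type_logits, type_label)
    aux = softmax_cross_entropy(view_logits, view_label)
    return main + aux_weight * aux, main, aux


# Example: one image, 3 vehicle types, 4 viewpoint bins (made-up logits).
total, main, aux = multitask_loss([2.0, 0.5, -1.0], 0,
                                  [0.1, 0.3, 1.2, -0.4], 2)
```

In a setup like this, both heads would share one feature extractor, so gradients from the viewpoint term shape the features used for type classification. The viewpoint head is needed only during training and can be ignored at inference.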

