Vision-based Autonomous Vehicle Recognition
Vision-based Automated Vehicle Recognition (VAVR) has attracted considerable attention recently. Driven by emerging deep learning methods, which have powerful feature extraction and pattern learning abilities, vehicle recognition has made significant progress. VAVR is an essential part of Intelligent Transportation Systems. A VAVR system can quickly and accurately locate a target vehicle, which significantly helps improve regional security. A comprehensive VAVR system contains three components: Vehicle Detection (VD), Vehicle Make and Model Recognition (VMMR), and Vehicle Re-identification (VRe-ID). These components perform coarse-to-fine recognition tasks in three steps. In this article, we conduct a thorough review and comparison of the state-of-the-art deep learning-based models proposed for VAVR. We present a detailed introduction to the different vehicle recognition datasets used for a comprehensive evaluation of the proposed models. We also critically discuss the major challenges and future research trends involved in each task. Finally, we summarize the characteristics of the methods for each task. Our comprehensive model analysis will help researchers who are interested in VD, VMMR, and VRe-ID and provide them with possible directions to solve current challenges and further improve the performance and robustness of models.
- Research Article
11
- 10.1016/j.jksuci.2023.101885
- Dec 13, 2023
- Journal of King Saud University - Computer and Information Sciences
Vehicle make and model recognition (VMMR) is a crucial task for developing automatic vehicle recognition (AVR) systems, and has gained significant attention in the fields of computer vision and artificial intelligence in recent years. The ability to automatically identify a vehicle's make and model has numerous practical applications, such as traffic monitoring and vehicle re-identification. This survey paper provides a comprehensive overview of the state-of-the-art techniques developed for the VMMR problem. The survey begins with an introduction to the problem of AVR, followed by a discussion of the various factors that affect the accuracy of recognition, including lighting conditions, viewpoint variations, and occlusions. We then discuss a solution to this problem and provide an overview of the different approaches for VMMR, such as machine learning and deep learning approaches. This survey also provides a comprehensive review of publicly available datasets that have been used for evaluating VMMR methods. Finally, the paper concludes with a discussion of some of the remaining challenges in VMMR, such as the need for large-scale datasets with more diverse vehicle models, the need for more robust methods that can handle variations in lighting and viewpoint, and the need for real-time methods that can operate in a variety of settings. This survey aims to serve as a valuable resource for researchers working in computer vision, including AVR.
- Research Article
8
- 10.3390/s22218439
- Nov 2, 2022
- Sensors
In recent years, Vehicle Make and Model Recognition (VMMR) has attracted a lot of attention as it plays a crucial role in Intelligent Transportation Systems (ITS). Accurate and efficient VMMR systems are required in real-world applications including intelligent surveillance and autonomous driving. The paper introduces a new large-scale dataset and a novel deep learning paradigm for VMMR. The new dataset, dubbed Diverse large-scale VMM (DVMM), collects image samples of the most popular vehicle brands operating in Europe. The novel VMMR framework follows a two-branch architecture in which the branches perform make recognition and model recognition, respectively. A two-stage training procedure and a novel decision module are proposed to process the make and model predictions and compute the final model prediction. In addition, a novel metric based on the true positive rate is proposed to compare the classification confusion of the proposed two-branch, two-stage (2B-2S) framework and the baseline methods. A complex experimental validation is carried out, demonstrating the generality, diversity, and practicality of the proposed DVMM dataset. The experimental results show that the proposed framework provides 93.95% accuracy over the more diverse DVMM dataset and 95.85% accuracy over traditional VMMR datasets. The proposed two-branch approach outperforms the conventional one-branch approach for VMMR over small-, medium-, and large-scale datasets by providing lower vehicle model confusion and reduced inter-make ambiguity. The paper demonstrates the advantages of the proposed two-branch VMMR paradigm in terms of robustness and lower confusion relative to single-branch designs.
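To make the idea of combining make and model predictions concrete, here is a minimal Python sketch. It is not the paper's decision module, whose details the abstract does not give. It assumes a simple product rule in which each model probability is re-weighted by the probability of its make; the function name, the `model_to_make` mapping, and the example probabilities are all hypothetical.

```python
def fuse_make_and_model(make_probs, model_probs, model_to_make):
    """Re-weight each model probability by the probability of its make.

    Assumption: a plain product rule followed by renormalization. The
    decision module described in the abstract may work differently.
    """
    scores = {
        model: prob * make_probs[model_to_make[model]]
        for model, prob in model_probs.items()
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all fused scores are zero")
    fused = {model: score / total for model, score in scores.items()}
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical outputs of the make branch and the model branch.
make_probs = {"Audi": 0.7, "BMW": 0.3}
model_probs = {"A4": 0.40, "3 Series": 0.45, "A6": 0.15}
model_to_make = {"A4": "Audi", "A6": "Audi", "3 Series": "BMW"}

best_model, fused = fuse_make_and_model(make_probs, model_probs, model_to_make)
print(best_model, fused)
```

In this toy case the model branch alone would pick "3 Series", but the make branch strongly favors Audi, so the fused decision switches to "A4". This is the kind of inter-make ambiguity that a second branch can help resolve.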
- Research Article
15
- 10.1109/tits.2021.3131530
- Aug 1, 2022
- IEEE Transactions on Intelligent Transportation Systems
Vehicle Make and Model Recognition (VMMR) requires fast and accurate recognition of a vehicle’s information. Generally, vision-based VMMR methods distinguish vehicle models mainly by locating and extracting the discriminative part features of a vehicle. In this paper, we propose a Lightweight Recurrent Attention Unit (LRAU) to enhance the feature extraction ability of the standard Convolutional Neural Network (CNN) architectures for VMMR. The proposed LRAU extracts the discriminative part features by generating attention masks to locate the keypoints of a vehicle (e.g., logo, headlight). The attention mask is generated based on the feature maps received by the LRAU and the attention state generated by the preceding LRAU. By adding LRAUs to receive the multi-scale feature maps generated by the standard CNN architecture, discriminative features of different scales can be efficiently extracted and combined. We conduct comprehensive experiments on three challenging VMMR datasets to evaluate the proposed VMMR models. Experimental results show that our models perform stably under different environmental conditions. Our models achieve state-of-the-art results with 93.94% accuracy on the Stanford Cars dataset, 98.31% accuracy on the CompCars dataset, and 99.41% accuracy on the NTOU-MMR dataset. Moreover, we demonstrate that our models outperform the traditional machine learning-based VMMR models in terms of recognition accuracy and processing speed. In addition, we construct a one-stage Vehicle Detection and Fine-grained Recognition (VDFR) model by combining our LRAU with a general object detection model. Results show that the proposed VDFR model can achieve excellent performance with real-time processing speed.
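The recurrent mask generation described above can be illustrated with a toy Python sketch. It is not the LRAU itself: the real unit uses learned layers and handles feature maps of different scales. This sketch assumes fixed, hand-picked weights and feature maps of identical spatial size at every step. The function `attention_step` and the `state_weight` parameter are hypothetical names.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_step(feature_map, prev_state, state_weight=0.5):
    """One recurrent attention step over a C x H x W feature map (nested lists).

    Assumption: saliency is the channel-wise mean, and the previous state is
    added with a fixed weight. A trained unit would learn both mappings.
    """
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    state = []
    for i in range(height):
        row = []
        for j in range(width):
            saliency = sum(ch[i][j] for ch in feature_map) / channels
            carried = prev_state[i][j] if prev_state is not None else 0.0
            row.append(saliency + state_weight * carried)
        state.append(row)
    # The mask squashes the state into (0, 1) and gates every channel.
    mask = [[sigmoid(v) for v in row] for row in state]
    attended = [
        [[ch[i][j] * mask[i][j] for j in range(width)] for i in range(height)]
        for ch in feature_map
    ]
    return mask, state, attended


# Two channels of a 2 x 2 feature map; the top-left position is salient.
features = [
    [[2.0, -2.0], [0.0, 0.0]],
    [[2.0, -2.0], [0.0, 0.0]],
]
mask1, state1, _ = attention_step(features, None)
mask2, state2, _ = attention_step(features, state1)
print(mask1[0][0], mask2[0][0])
```

Because the second step receives the first step's state, the mask sharpens over time: the salient position gains weight and the suppressed position loses weight.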
- Conference Article
15
- 10.1109/iscc50000.2020.9219660
- Jul 1, 2020
With increasingly prominent security concerns in Intelligent Transportation Systems (ITS), Vehicle Make and Model Recognition (VMMR) has attracted a lot of attention in recent years. VMMR methods can be widely used in suspicious vehicle recognition, urban traffic monitoring, and automated driving systems. With the development of Vehicle-to-Everything (V2X) technology, the vehicle information recognized by an AI-based VMMR method can be shared among vehicles and other participants within the transportation system, and can help the police quickly locate suspicious vehicles. VMMR is complicated due to the subtle visual differences among vehicle models. In this paper, we propose a novel Recurrent Attention Unit (RAU) to expand the standard Convolutional Neural Network (CNN) architecture for VMMR. The proposed RAU learns to recognize the discriminative part of a vehicle on multiple scales and builds up a connection with the prominent information in a recurrent way. RAU is a modular unit. It can be easily applied to different layers of the vanilla CNN architectures to boost their performance on VMMR. The efficiency of our models is tested on three challenging VMMR benchmark datasets, i.e., Stanford Cars, CompCars, and CompCars Surveillance. The proposed ResNet101-RAU achieves the best recognition accuracy of 93.81% on the Stanford Cars dataset and 97.84% on the CompCars dataset.
- Research Article
- 10.5815/ijigsp.2014.04.08
- Mar 8, 2014
- International Journal of Image, Graphics and Signal Processing
Vehicle Make and Model Recognition (VMMR) has emerged as a significant element of vision-based systems because of its application in access control systems, traffic control and monitoring systems, security systems, and surveillance systems. So far, a number of techniques have been developed for vehicle recognition. Each technique follows a different methodology and classification approach. The evaluation results highlight the recognition technique with the highest accuracy level. In this paper, we describe the workings of various vehicle make and model recognition techniques and compare them on the basis of methodology, principles, classification approach, classifier, and level of recognition. After comparing these factors, we conclude that Locally Normalized Harris Corner Strengths (LHNS) performs best compared to the other techniques. LHNS uses Bayes and K-NN classification approaches for vehicle classification. It extracts information from the frontal view of vehicles for vehicle make and model recognition.
- Conference Article
53
- 10.1109/cvprw.2017.121
- Jul 1, 2017
Vehicle Make and Model Recognition (VMMR) has evolved into a significant subject of study due to its importance in numerous Intelligent Transportation Systems (ITS) and corresponding components such as Automated Vehicular Surveillance (AVS). A highly accurate and real-time VMMR system significantly reduces the overhead cost of resources otherwise required. The VMMR problem is a multiclass classification task with a peculiar set of issues and challenges, such as multiplicity and inter- and intra-make ambiguity among various vehicle makes and models, which need to be solved in an efficient and reliable manner to achieve a highly robust VMMR system. In this paper, facing the growing importance of make and model recognition of vehicles, we present an image dataset with 9,170 different classes of vehicles to advance the corresponding tasks. Extensive experiments conducted using baseline approaches yield superior results for images in our VMMR dataset that are occluded, captured under low illumination, or taken from partial or non-frontal camera views. The approaches presented herewith provide a robust VMMR system for applications in realistic environments.
- Research Article
9
- 10.1109/access.2021.3104340
- Jan 1, 2021
- IEEE Access
Fine-grained vehicle classification from images, also known as Vehicle Make and Model Recognition (VMMR), has become an important research topic in recent years, with a growing number of scientific contributions in multiple application areas, such as autonomous vehicles, surveillance systems, and traffic monitoring and management, among others. Recent techniques based on deep learning have proven to be very effective in addressing this problem, so effective that, based on the state-of-the-art results (above 95% accuracy), it would seem that the problem is practically solved. However, our main hypothesis is that the existing datasets have limited variability, which precludes good and unbiased generalisation of the models trained with them. In particular, it is observed that the test datasets are very similar in nature to those used for training and validation, which makes these benchmarks prone to dataset bias and to overfitting. When these systems are tested with more challenging data or data from different datasets, performance degrades considerably. In this paper, on the one hand, we evaluate state-of-the-art deep learning models to perform fine-grained vehicle classification and explore multiple training techniques, such as curriculum learning or weighted losses, to mitigate the bias between different makes and models and to assess the limits of current approaches. On the other hand, we analyse the existing datasets, present an additional dataset from a challenging scenario, and merge all the data into a cross-dataset that includes common samples and classes from the existing datasets. In this way, we can evaluate geographical, make and model biases, and performance and generalisation capabilities from a more realistic perspective. The obtained results suggest that we are still far from accurate and unbiased vehicle make and model recognition in realistic traffic and driving scenarios.
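As an illustration of the weighted losses mentioned above, the following Python sketch computes inverse-frequency class weights and a weighted cross-entropy. The abstract does not state which weighting scheme the authors used, so the "balanced" heuristic here (number of samples divided by number of classes times the class count) is an assumption, and the class names and probabilities are invented for the example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class) over all samples."""
    numerator = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    denominator = sum(weights[y] for y in labels)
    return numerator / denominator


# Hypothetical imbalanced training set: one frequent and one rare model.
labels = ["golf"] * 8 + ["rare_model"] * 2
weights = inverse_frequency_weights(labels)

# A classifier that always favors the frequent class.
probs = [{"golf": 0.9, "rare_model": 0.1}] * len(labels)

uniform = {cls: 1.0 for cls in weights}
plain_loss = weighted_cross_entropy(probs, labels, uniform)
balanced_loss = weighted_cross_entropy(probs, labels, weights)
print(weights, plain_loss, balanced_loss)
```

With these weights each class contributes equally to the loss regardless of its frequency, so a classifier that ignores the rare model is penalized more heavily than under the unweighted loss.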
- Book Chapter
6
- 10.1007/978-981-16-8129-5_138
- Jan 1, 2022
Increasingly dense traffic is becoming a challenge in our local settings, creating the need for a better traffic monitoring and management system. Fine-grained vehicle classification is a more challenging task than coarse vehicle classification, so a robust approach for vehicle detection and classification into fine-grained categories is required. Existing Vehicle Make and Model Recognition (VMMR) systems have been developed for synchronized and controlled traffic conditions. Robust VMMR in complex, urban, heterogeneous, and unsynchronized traffic conditions remains an open research area. In this paper, vehicle detection and fine-grained classification are addressed using deep learning. To perform fine-grained classification with the related complexities, a local dataset, THS-10, with high intraclass and low interclass variation was prepared. The dataset consists of 4250 vehicle images of 10 vehicle models, i.e., Honda City, Honda Civic, Suzuki Alto, Suzuki Bolan, Suzuki Cultus, Suzuki Mehran, Suzuki Ravi, Suzuki Swift, Suzuki Wagon R and Toyota Corolla. This dataset is available online. Because some makes and models show almost no design variation over the years, vehicle models are not separated by their year of generation. Two approaches are explored and analyzed for classification of vehicles, i.e., fine-tuning and feature extraction from deep neural networks. A comparative study demonstrates that simpler approaches can produce good results in the local environment when dealing with complex issues such as dense occlusion and lane departures, thereby reducing computational load and time. For example, fine-tuning Inception-v3 produced the highest accuracy of 97.4% with the lowest misclassification rate of 2.08%. Fine-tuning MobileNet-v2 and ResNet-18 produced 96.8% and 95.7% accuracies, respectively. Extracting features from the fc6 layer of AlexNet produces an accuracy of 93.5% with a misclassification rate of 6.5%.
Keywords: Vehicle detection, Vehicle classification, Fine-grained classification, Deep learning, Transfer learning, Deep neural networks, Urban traffic scenario
- Book Chapter
3
- 10.1007/978-3-031-14859-0_1
- Aug 28, 2022
Object detection is widely used in computer vision and is critical for a variety of applications. Over half a century of development, object detection methods have advanced continuously and produced numerous approaches with promising results. At present, object detection approaches fall largely into two categories: traditional machine learning methods that use varied computer vision techniques, and deep learning methods. In spite of this evolution, accurate Vehicle Make and Model Recognition (VMMR) remains demanding owing to the similar appearance of different vehicle models. Therefore, this paper presents machine learning and deep learning techniques, along with transfer learning models, for car detection where the classification is generally at the level of make, model, and year. In this paper, the existing techniques centered on traditional machine learning are first introduced and summarized. Then, two main schools of deep learning methods, Convolutional Neural Network (CNN) and ResNeXt50, are selected for analysis. Finally, the methods mentioned are briefly compared and discussed.
Keywords: Vehicle make and model recognition, Stanford cars dataset, ResNeXt50
- Research Article
4
- 10.3390/s23187920
- Sep 15, 2023
- Sensors (Basel, Switzerland)
Vehicle make and model recognition (VMMR) is an important aspect of intelligent transportation systems (ITS). In VMMR systems, surveillance cameras capture vehicle images for real-time vehicle detection and recognition. These captured images pose challenges, including shadows, reflections, changes in weather and illumination, occlusions, and perspective distortion. Another significant challenge in VMMR is multiclass classification, which raises two main issues: (a) multiplicity and (b) ambiguity. Multiplicity concerns the issue of different forms among car models manufactured by the same company, while the ambiguity problem arises when multiple models from the same manufacturer have visually similar appearances or when vehicle models of different makes have visually comparable rear/front views. This paper introduces a novel and robust VMMR model that can address the above-mentioned issues with accuracy comparable to state-of-the-art methods. Our proposed hybrid CNN model selects the best descriptive fine-grained features with the help of Fisher Discriminative Least Squares Regression (FDLSR). These features are extracted from a deep CNN model fine-tuned on the fine-grained vehicle datasets Stanford-196 and BoxCars21k. Using ResNet-152 features, our proposed model outperformed the SVM and FC layers in accuracy by 0.5% and 4% on Stanford-196 and 0.4% and 1% on BoxCars21k, respectively. Moreover, this model is well-suited for small-scale fine-grained vehicle datasets.
- Conference Article
3
- 10.1145/3416013.3426461
- Nov 16, 2020
Recent works on automated vehicle make and model recognition (VMMR) have embraced the use of advanced deep learning models such as convolutional neural networks. In this work, we introduce an adversarial attack against such VMMR systems through adversarially learnt patches. We demonstrate the effectiveness of the adversarial patches against VMMR through experimental evaluations on a real-world surveillance dataset. The developed adversarial patches achieve reductions of up to 37% in VMMR recall scores. It is hoped that this work will motivate future studies in developing VMMR systems that are robust to adversarial learning-based attacks.
- Research Article
2
- 10.1088/1742-6596/1982/1/012077
- Jul 1, 2021
- Journal of Physics: Conference Series
Vehicle target detection technology refers to the process of detecting and recognizing vehicles in image datasets with different backgrounds by means of feature extraction. Vehicle target detection based on deep learning shows obvious advantages in the accuracy and speed of target detection. With the development of science and technology, the detection and recognition of vehicles in UAV aerial images has become an important applied research direction. This paper studies deep learning-based detection and recognition of vehicles in UAV aerial images, and proposes a new deep learning-based algorithm to address two problems of the YOLOv3 algorithm on such images: incomplete vehicle targets cannot be recognized, and vehicles close to the UAV are missed. Experimental verification results show that, compared with existing algorithms, the proposed algorithm can significantly improve the detection accuracy for vehicles in UAV aerial images while ensuring real-time performance.
- Research Article
6
- 10.1177/03611981211019743
- Jul 1, 2021
- Transportation Research Record: Journal of the Transportation Research Board
A vehicle make and model recognition (VMMR) system is a common requirement in the field of intelligent transportation systems (ITS). However, it is a challenging task because of the subtle differences between vehicle categories. In this paper, we propose a hierarchical scheme for VMMR. Specifically, the scheme consists of (1) a feature extraction framework called weighted mask hierarchical bilinear pooling (WMHBP), based on hierarchical bilinear pooling (HBP), which weakens the influence of invalid background regions by generating a weighted mask while extracting features from discriminative regions to form a more robust feature descriptor; (2) a hierarchical loss function that can learn the appearance differences between vehicle brands and enhance vehicle recognition accuracy; (3) collection of vehicle images from the Internet and classification of the images with hierarchical labels, which augments the data to address insufficient data and low picture resolution and improves the model’s generalization ability and robustness. We evaluate the proposed framework for accuracy and real-time performance. The experimental results indicate a recognition accuracy of 95.1% and a speed of 107 FPS (frames per second) on the public Stanford Cars dataset, which demonstrates the superiority of the method and its applicability to ITS.
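The hierarchical loss in item (2) above can be pictured with a short Python sketch. The abstract does not give the loss formula, so this is only one plausible form under stated assumptions: a model-level cross-entropy plus a weighted brand-level cross-entropy, where the brand probability is the sum of the probabilities of that brand's models. The `brand_weight` parameter and the example vehicles are hypothetical.

```python
import math


def hierarchical_loss(model_probs, true_model, model_to_make, brand_weight=0.5):
    """Model-level cross-entropy plus a weighted brand-level cross-entropy.

    Assumption: the brand probability is obtained by summing the predicted
    probabilities of all models that belong to the true brand.
    """
    true_make = model_to_make[true_model]
    make_prob = sum(
        prob for model, prob in model_probs.items()
        if model_to_make[model] == true_make
    )
    model_loss = -math.log(model_probs[true_model])
    make_loss = -math.log(make_prob)
    return model_loss + brand_weight * make_loss


model_to_make = {"Civic": "Honda", "Accord": "Honda", "Corolla": "Toyota"}

# Both predictions give the true model (Civic) the same probability, but the
# first confuses it with a sibling model and the second with another brand.
same_brand_error = {"Civic": 0.2, "Accord": 0.7, "Corolla": 0.1}
other_brand_error = {"Civic": 0.2, "Accord": 0.1, "Corolla": 0.7}

loss_same = hierarchical_loss(same_brand_error, "Civic", model_to_make)
loss_other = hierarchical_loss(other_brand_error, "Civic", model_to_make)
print(loss_same, loss_other)
```

Under this form, a mistake that stays within the correct brand costs less than one that crosses brands, which pushes the network to learn brand-level appearance differences in addition to model-level ones.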
- Book Chapter
- 10.1007/978-3-319-50212-0_17
- Nov 22, 2016
In computer vision, vehicle detection and identification is a very popular research topic. An intelligent vehicle detection application must first be able to detect the ROI (Region of Interest) of a vehicle exactly in order to obtain the vehicle-related information. This paper uses a symmetrical SURF descriptor, which enhances the ability of SURF to detect all possible symmetrical matching pairs, for vehicle detection and analysis. Each vehicle can be found accurately and efficiently from the matching results, even from a single image and without using any motion features. A main advantage of this detection scheme is that it requires no background subtraction. After that, a modified vehicle make and model recognition (MMR) scheme is presented to handle vehicle identification. We adopt a grid division scheme to construct several weak vehicle classifiers and then combine these weak classifiers into a stronger vehicle classifier. The ensemble classifier can accurately recognize each type of vehicle. Experimental results demonstrate the superiority of our method in vehicle MMR.
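The grid division and ensemble idea above can be sketched in a few lines of Python. This is a toy stand-in and not the paper's method: it assumes each grid cell is summarized by its mean intensity, each weak classifier is a nearest-centroid rule on that value, and the ensemble is a plain majority vote. All function names and the tiny 2 x 2 "images" are hypothetical.

```python
from collections import Counter


def split_into_grid(image, rows, cols):
    """Split a 2D list into rows x cols cells (dimensions assumed divisible)."""
    h, w = len(image) // rows, len(image[0]) // cols
    return [
        [line[c * w:(c + 1) * w] for line in image[r * h:(r + 1) * h]]
        for r in range(rows)
        for c in range(cols)
    ]


def cell_mean(cell):
    values = [v for line in cell for v in line]
    return sum(values) / len(values)


def train_weak_classifiers(images, labels, rows, cols):
    """Build one nearest-centroid classifier per grid cell."""
    counts = Counter(labels)
    sums = [dict() for _ in range(rows * cols)]
    for image, label in zip(images, labels):
        for k, cell in enumerate(split_into_grid(image, rows, cols)):
            sums[k][label] = sums[k].get(label, 0.0) + cell_mean(cell)
    return [
        {cls: total / counts[cls] for cls, total in cell_sums.items()}
        for cell_sums in sums
    ]


def predict(image, classifiers, rows, cols):
    """Let every cell vote for its nearest centroid; return the majority."""
    votes = Counter()
    for cell, centroids in zip(split_into_grid(image, rows, cols), classifiers):
        mean = cell_mean(cell)
        votes[min(centroids, key=lambda cls: abs(centroids[cls] - mean))] += 1
    return votes.most_common(1)[0][0]


train_images = [
    [[10, 0], [10, 0]], [[8, 2], [8, 2]],   # model_a
    [[0, 10], [0, 10]], [[2, 8], [2, 8]],   # model_b
]
train_labels = ["model_a", "model_a", "model_b", "model_b"]
classifiers = train_weak_classifiers(train_images, train_labels, 2, 2)

# The bottom-right cell is corrupted, as if occluded, and votes wrongly.
query = [[9, 1], [9, 8]]
prediction = predict(query, classifiers, 2, 2)
print(prediction)
```

Three of the four cells still vote for `model_a`, so the single misleading cell is outvoted. This is the benefit of combining weak per-region classifiers instead of relying on one global decision.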
- Research Article
35
- 10.1109/tits.2018.2835471
- May 1, 2019
- IEEE Transactions on Intelligent Transportation Systems
This paper presents a novel recognition scheme for vehicle make and model recognition (VMMR) from frontal images of vehicles. In general, we introduce some domain knowledge to cope with this task. The structural components contained in the frontal appearance of vehicles present different visual characteristics, and their discriminating ability varies depending on whether vehicle models belonging to the same brand or to different brands are compared. In light of these particularities, we take advantage of the varying discriminating ability of these structural components to perform the recognition task sequentially in two stages. At the first stage, the logo sub-region (which is one of the component-related sub-regions in the region of interest) is used to classify the vehicle models at the brand level. Unlike traditional brand-level classification, in which all models of the same brand are considered a single class, this paper allows multiple sub-classes within one brand class, since the intra-brand models also exhibit a certain degree of diversity. In this way, the problem of inter-class similarity is alleviated. At the second stage, several customized classifiers are trained for each sub-class according to the discriminating ability of the remaining sub-regions. The proposed approach has been tested on a large-scale vehicle image database collected in this paper and has achieved state-of-the-art results.