Abstract

In this paper, we present a conceptually simple, flexible, and general classifier for multi-label classification of road scenes. The method, called a combined classifier, enhances the main multi-label classifier with single-label classifiers using a fusing and stacking strategy. Our model uses the bottleneck ResNetV2-50 [1], [2], a state-of-the-art (SOTA) architecture for image feature extraction, as its network backbone. For each RGB input image (512×256), the model predicts one class for each label group: time of day, location, and weather. Trained and tested on the Berkeley DeepDrive Dataset (BDD100k) [3], a popular large-scale dataset for road scene understanding, the proposed classifier improves accuracy by more than 9% compared to the traditional classifier for multi-label classification problems. To address the class imbalance that commonly exists in road scene datasets, we apply the class-balanced (CB) focal loss of [4]. Overall, our model achieves high accuracy on the validation set: 98.11%, 76.06%, 69.38%, and 80.79% for the time class, location class, weather class, and all-type class, respectively. Our classifier can also be easily applied to other multi-label classification problems, making it an effective approach for transfer learning.
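
To make the class-balanced (CB) focal loss mentioned above more concrete, the following is a minimal sketch of one common formulation, not the implementation used in the paper. It assumes per-class weights of the form (1 - beta) / (1 - beta^n_c), where n_c is the number of training samples in class c, combined with a softmax-based focal term. The values of beta and gamma, the softmax variant, the function names, and the class counts are all illustrative assumptions.

```python
import math


def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights (1 - beta) / (1 - beta ** n_c).

    The weights are rescaled so they sum to the number of classes
    (a common normalization; assumed here, not taken from the paper).
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cb_focal_loss(logits, target, weights, gamma=2.0):
    """Class-balanced focal loss for a single example.

    loss = -w[target] * (1 - p_t) ** gamma * log(p_t),
    where p_t is the softmax probability of the true class.
    """
    p_t = softmax(logits)[target]
    return -weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical class counts for one imbalanced label group.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts)

# The same logits are penalized more when the true class is the rare one.
logits = [0.2, 0.1, 0.3]
loss_rare = cb_focal_loss(logits, target=2, weights=weights)
loss_common = cb_focal_loss(logits, target=0, weights=weights)
```

With gamma set to 0 and uniform weights, this expression reduces to ordinary cross-entropy, which is a convenient sanity check when experimenting with the hyperparameters.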
