Abstract

With the advances made in deep learning, computer vision systems have seen large improvements in performance. However, in many real-world applications such as autonomous driving, the risk associated with incorrect object detections or image segmentations is very high. Standard deep learning models for object detection and segmentation, such as the YOLO family, are often overconfident in their predictions and do not account for uncertainty on out-of-distribution data. In this work, we propose an efficient and effective approach, Monte-Carlo DropBlock (MC-DropBlock), to model uncertainty in YOLO and convolutional vision Transformers for object detection. The proposed approach applies DropBlock at both training and test time to the convolutional layers of models such as YOLO and convolutional Transformers. We show theoretically that this yields a Bayesian convolutional neural network capable of capturing the epistemic uncertainty in the model. Additionally, we capture the aleatoric uncertainty in the data using a Gaussian likelihood. We demonstrate the effectiveness of the proposed approach in modeling uncertainty for object detection and segmentation tasks using out-of-distribution experiments. Experimental results show that MC-DropBlock improves the generalization, calibration, and uncertainty-modeling capabilities of YOLO and convolutional Transformer models for object detection and segmentation.
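
To make the mechanism concrete, the following minimal PyTorch-style sketch (not the authors' released code) illustrates the two ingredients described above: a DropBlock layer that remains stochastic at test time, and Monte-Carlo averaging over several forward passes to obtain a predictive mean together with a sample-variance estimate of epistemic uncertainty. The class `DropBlock2d`, the helper `mc_dropblock_predict`, and parameters such as `num_samples` are illustrative assumptions rather than names from the paper, and the aleatoric (Gaussian-likelihood) component is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DropBlock2d(nn.Module):
    """Simplified DropBlock: zeroes contiguous block_size x block_size regions
    of a feature map with target drop rate p, then rescales the survivors."""

    def __init__(self, p: float = 0.1, block_size: int = 5):
        super().__init__()
        self.p = p
        self.block_size = block_size  # assumed odd and smaller than the feature map

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training or self.p == 0.0:
            return x
        n, c, h, w = x.shape
        # Bernoulli rate for block centres so that roughly p of activations are dropped.
        gamma = (self.p * h * w) / (
            self.block_size ** 2 * (h - self.block_size + 1) * (w - self.block_size + 1)
        )
        centres = (torch.rand(n, c, h, w, device=x.device) < gamma).float()
        # Grow each sampled centre into a block_size x block_size dropped region.
        dropped = F.max_pool2d(
            centres, kernel_size=self.block_size, stride=1, padding=self.block_size // 2
        )
        keep = 1.0 - dropped
        # Rescale so the expected activation magnitude stays unchanged.
        return x * keep * keep.numel() / keep.sum().clamp(min=1.0)


@torch.no_grad()
def mc_dropblock_predict(model: nn.Module, x: torch.Tensor, num_samples: int = 20):
    """Monte-Carlo inference: keep DropBlock stochastic at test time, run
    several forward passes, and summarise their mean and variance."""
    model.eval()
    for m in model.modules():
        if isinstance(m, DropBlock2d):
            m.train()  # leave block dropping active during inference
    samples = torch.stack([model(x) for _ in range(num_samples)], dim=0)
    mean = samples.mean(dim=0)       # predictive mean
    epistemic = samples.var(dim=0)   # sample variance ~ epistemic uncertainty
    return mean, epistemic


# Toy usage: a small convolutional backbone with DropBlock between layers.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    DropBlock2d(p=0.1, block_size=5),
    nn.Conv2d(16, 8, 3, padding=1),
)
mean, uncertainty = mc_dropblock_predict(backbone, torch.randn(2, 3, 64, 64), num_samples=10)
```

In a real detector the same loop would wrap the decoded outputs (box coordinates and class scores) rather than raw feature maps, but the principle is unchanged: keep DropBlock active at inference and aggregate multiple stochastic passes.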
