Abstract

Most Convolutional Neural Network (CNN) based object detectors to date have been optimized for accuracy and/or detection performance on datasets that typically comprise well-exposed 8-bits/pixel/channel Standard Dynamic Range (SDR) images. A major open challenge in this area is detecting objects accurately under extreme or difficult lighting conditions, where detectors trained on SDR images fail. In this paper, we address this issue for the first time by introducing High Dynamic Range (HDR) imaging to object detection. HDR imagery can capture and process ≈13 orders of magnitude of scene dynamic range, similar to the human eye. HDR-trained models are therefore able to extract more salient features under extreme lighting conditions, leading to more accurate detections. However, introducing HDR also presents multiple new challenges, such as the complete absence of resources and previous literature on such an approach. Here, we introduce a methodology to generate a large-scale annotated HDR dataset from any existing SDR dataset, and we validate the quality of the generated dataset via a robust evaluation technique. We also discuss the challenges of training and validating HDR-trained models using existing detectors. Finally, we provide a methodology to create an out-of-distribution (OOD) HDR dataset to test and compare the performance of HDR-trained and SDR-trained detectors under difficult lighting conditions. Results suggest that, using the proposed methodology, HDR-trained models achieve 10–12% higher accuracy than SDR-trained models on a real-world OOD dataset consisting of high-contrast images under extreme lighting conditions.

Highlights

  • Object detection has been an active research area for the past few decades [1], [2]

  • We provide side-by-side comparative results for Faster RCNN, Single Shot Multibox Detector (SSD) 300, and SSD 512 on the generated High Dynamic Range (HDR) data

  • PASCAL VOC test results, Standard Dynamic Range (SDR) vs. HDR: Since our first goal is to evaluate the quality of detection models trained on generated HDR data compared to SDR-trained models, we provide a category-wise comparative AP of SDR


Summary

Introduction

Object detection has been an active research area for the past few decades [1], [2]. The introduction of deep Convolutional Neural Networks (CNNs) has brought about a paradigm shift in visual object recognition tasks, including object detection [3]. A major existing issue with CNN-based detectors is that they are typically trained on datasets consisting of well-exposed images with few or no examples of difficult lighting conditions. The training distributions are typically 8-bits/pixel/channel Standard Dynamic Range (SDR) images (JPEG/PNG format) with ≈3 orders of magnitude of scene dynamic range, as opposed to the ≈13 orders of magnitude of dynamic range seen by the human eye [6], [7]. This leads to a truncated representation of a scene
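
The gap between an 8-bit encoding and a wide-range scene can be illustrated with a minimal sketch. It assumes a purely linear 8-bit encoding with hard clipping at a chosen white point; real SDR pipelines apply gamma or tone curves, which is one reason the figure quoted above is ≈3 orders rather than the log10(255) ≈ 2.4 computed here. The scene values, the white point, and the function names are all hypothetical and are not taken from the paper.

```python
import math


def orders_of_magnitude(lo, hi):
    """Dynamic range between two positive luminance values, in log10 units."""
    return math.log10(hi / lo)


def quantize_8bit(luminance, white_point):
    """Map a linear luminance to an 8-bit code, clipping above white_point."""
    scaled = min(luminance / white_point, 1.0)
    return round(scaled * 255)


# Hypothetical scene luminances (arbitrary linear units): 0.01 up to 10000,
# i.e. six orders of magnitude between the darkest and brightest sample.
scene = [10.0 ** e for e in range(-2, 5)]

# Assumed exposure setting: anything brighter than this saturates to 255.
white_point = 100.0

codes = [quantize_8bit(v, white_point) for v in scene]

# Range between the smallest non-zero code (1) and the largest code (255).
linear_8bit_range = orders_of_magnitude(1, 255)
scene_range = orders_of_magnitude(scene[0], scene[-1])

print("scene range (orders):", round(scene_range, 2))
print("linear 8-bit range (orders):", round(linear_8bit_range, 2))
print("8-bit codes:", codes)
```

Under these assumptions, the two darkest samples collapse to code 0 and the three brightest saturate at code 255, so only the middle of the scene keeps distinct values. This is the kind of truncation the paragraph above describes.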
