Abstract

Object detection is a fundamental computer vision task for many real-world applications. In the maritime environment, this task is challenging due to varying light, viewing distances, weather conditions, and sea waves. In addition, light reflections, camera motion, and illumination changes may cause false detections. To address these challenges, we present three fusion architectures that fuse two imaging modalities: visible and infrared. These architectures combine complementary information from the two modalities at different levels: pixel level, feature level, and decision level, and they employ deep learning to perform both fusion and detection. We investigate the performance of the proposed architectures on a real marine image dataset captured by color and infrared cameras on board a vessel in the Finnish archipelago. The cameras are used for developing autonomous ships and collect data across a range of operating and climatic conditions. Experiments show that the feature-level fusion architecture outperforms the other fusion architectures.

Highlights

  • Object detection is a crucial problem for autonomous vehicles and has been studied for years to make it faster and more efficient

  • Multi-sensor fusion architectures are generally classified into three groups based on the level of data abstraction used for fusion [2]. (1) Early fusion, also called pixel-level fusion, combines raw data from the sensors before applying any information extraction strategies

  • The image fusion method provides essential functionality in our proposed middle fusion architecture


Summary

Introduction

Object detection is a crucial problem for autonomous vehicles and has been studied for years to make it faster and more efficient. A reliable autonomous driving system relies on accurate object detection to provide a robust perception of the environment. In the maritime setting, object detection is challenging due to varying light, viewing distances, weather conditions, and the dynamic nature of the sea. Multi-sensor fusion is a promising approach for achieving accurate object detection, as it exploits the complementary properties of objects captured by multiple sensors. Multi-sensor fusion architectures are generally classified into three groups based on the level of data abstraction used for fusion [2]: (1) early fusion, also called pixel-level fusion, combines raw data from the sensors before applying any information extraction strategies; (2) middle fusion, also called feature-level fusion, fuses the features extracted from each sensor's raw data and performs detection on the fused data; (3) late fusion, also called decision-level fusion, performs detection independently on each sensor's data and fuses the per-sensor outputs at the decision level for final detection.
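To make the three fusion levels concrete, below is a minimal PyTorch sketch. It is an illustrative assumption, not the paper's actual networks: the detector, backbone, and head modules are hypothetical placeholders, early fusion assumes a 4-channel RGB+IR channel stack, and late fusion merges the two detectors' boxes with standard non-maximum suppression.

```python
# Sketch of the three fusion levels (pixel, feature, decision).
# All sub-modules are placeholders; only the fusion wiring is shown.
import torch
import torch.nn as nn
from torchvision.ops import nms


class EarlyFusion(nn.Module):
    """Pixel-level fusion: stack raw RGB and IR frames channel-wise,
    then run a single detector on the 4-channel input."""

    def __init__(self, detector: nn.Module):
        super().__init__()
        self.detector = detector  # assumed to accept 4 input channels

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        # (N, 3, H, W) + (N, 1, H, W) -> (N, 4, H, W)
        return self.detector(torch.cat([rgb, ir], dim=1))


class MiddleFusion(nn.Module):
    """Feature-level fusion: one backbone per modality, feature maps
    concatenated before a shared detection head."""

    def __init__(self, rgb_backbone: nn.Module, ir_backbone: nn.Module,
                 head: nn.Module):
        super().__init__()
        self.rgb_backbone = rgb_backbone
        self.ir_backbone = ir_backbone
        self.head = head

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.rgb_backbone(rgb),
                           self.ir_backbone(ir)], dim=1)
        return self.head(fused)


def late_fusion(rgb_dets: dict, ir_dets: dict, iou_thr: float = 0.5):
    """Decision-level fusion: pool boxes/scores from two independently run
    detectors and suppress duplicate detections with NMS."""
    boxes = torch.cat([rgb_dets["boxes"], ir_dets["boxes"]])
    scores = torch.cat([rgb_dets["scores"], ir_dets["scores"]])
    keep = nms(boxes, scores, iou_thr)
    return boxes[keep], scores[keep]
```

The sketch also makes the trade-off between the levels visible: early fusion requires a detector retrained for the stacked input, middle fusion shares a single head over concatenated per-modality features, and late fusion leaves both single-modality detectors untouched and only merges their outputs.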


