Abstract

Recently, marine biologists have begun using underwater video to study species diversity and fish abundance. These techniques generate large amounts of visual data, so automatic analysis using image processing is necessary: manual processing is time-consuming and labor-intensive. However, automatic processing of underwater images faces numerous challenges, such as high luminosity variation, limited visibility, complex backgrounds, free movement of fish, and high diversity of fish species. In this paper, we propose two new fusion approaches that exploit two convolutional neural network (CNN) streams to merge appearance and motion information for automatic fish detection. These approaches consist of two Faster R-CNN models that share either the same region proposal network or the same classifier. We significantly improve fish detection performance on the LifeClef 2015 Fish benchmark dataset, compared not only with the classic Faster R-CNN but also with state-of-the-art approaches. The best F-score and mAP are 83.16% and 73.69%, respectively.
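To make the shared-component idea concrete, the sketch below shows one way two Faster R-CNN streams (an appearance stream fed RGB frames and a motion stream fed motion maps such as optical flow encoded as images) can be made to share a single region proposal network. This is a minimal illustration using torchvision's generic Faster R-CNN, not the authors' implementation; the class count (15 species plus background), input shapes, and the exact fusion point are assumptions.

```python
import torch
import torchvision

# Hypothetical two-stream setup: one Faster R-CNN per modality.
# num_classes=16 assumes 15 fish species + background (LifeClef 2015 Fish).
appearance = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None, num_classes=16)
motion = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None, num_classes=16)

# "Shared RPN" variant: both streams use the same region proposal module,
# so proposals are learned jointly while each backbone stays stream-specific.
# (The "shared classifier" variant would instead assign motion.roi_heads = appearance.roi_heads.)
motion.rpn = appearance.rpn

# Dummy inputs: an RGB frame and a 3-channel motion map of the same size.
rgb = [torch.rand(3, 480, 640)]
flow = [torch.rand(3, 480, 640)]

appearance.eval()
motion.eval()
with torch.no_grad():
    det_appearance = appearance(rgb)  # list of dicts with boxes, labels, scores
    det_motion = motion(flow)         # detections driven by motion cues
```

In practice, the per-stream detections would then be merged (for example by score-weighted box fusion or non-maximum suppression across streams) to produce the final fish detections; the merging rule here is left open, as the abstract does not specify it.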
