Regular and consistent monitoring of marine ecosystems and fish communities is becoming increasingly important as human pressures grow. To this end, underwater camera technology has become a major tool for collecting large volumes of marine data. As the amount of data collected has outgrown our capacity to process it manually, new means of automatic processing have been explored. Convolutional neural networks (CNNs) have been the most popular method for automatic underwater video analysis in recent years. However, these algorithms typically operate on individual images and do not exploit the temporal information contained in video data. In this paper, we propose a method that couples video tracking with CNN image analysis to perform robust and accurate fish classification on deep-sea videos and to improve automatic classification accuracy. By fusing CNNs with tracking methods, our approach detects 12% more individuals than a CNN alone.
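The abstract does not specify how the CNN outputs and the tracker are fused, so the sketch below is only a minimal illustration of the general idea: per-frame CNN detections are greedily linked into tracks by bounding-box overlap, and each track (i.e. each individual fish) receives a single species label by majority vote over its frames. The function names, the IoU threshold, and the voting rule are assumptions for illustration, not the authors' actual pipeline.

```python
from collections import Counter

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def link_detections(frames, iou_threshold=0.3):
    """Greedily link per-frame CNN detections into tracks by IoU overlap.

    `frames` is a list of frames; each frame is a list of
    (box, label, score) tuples produced by the image classifier.
    Returns a list of tracks, each track a list of (box, label, score).
    """
    tracks = []   # finished tracks
    active = []   # tracks still being extended
    for detections in frames:
        matched = [False] * len(detections)
        next_active = []
        for track in active:
            last_box = track[-1][0]
            best_j, best_iou = -1, iou_threshold
            for j, (box, _, _) in enumerate(detections):
                if not matched[j] and iou(last_box, box) > best_iou:
                    best_j, best_iou = j, iou(last_box, box)
            if best_j >= 0:
                matched[best_j] = True
                track.append(detections[best_j])
                next_active.append(track)
            else:
                tracks.append(track)       # no match: track ends here
        for j, det in enumerate(detections):
            if not matched[j]:
                next_active.append([det])  # unmatched detection starts a new track
        active = next_active
    tracks.extend(active)
    return tracks

def classify_tracks(tracks):
    """Assign one species label per track (individual) by majority vote."""
    results = []
    for track in tracks:
        labels = Counter(label for _, label, _ in track)
        results.append(labels.most_common(1)[0][0])
    return results

# Hypothetical example: two frames of CNN detections (box, label, score).
frames = [
    [((10, 10, 50, 50), "species_a", 0.9)],
    [((12, 11, 52, 51), "species_a", 0.4), ((100, 80, 140, 120), "species_b", 0.8)],
]
tracks = link_detections(frames)
print(len(tracks), classify_tracks(tracks))  # 2 individuals, one label each
```

Counting individuals at the track level rather than the frame level is one plausible way such a fusion could recover fish that the image-only CNN misses in some frames, which is the kind of gain the reported 12% improvement refers to.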