Semi-supervised and weakly-supervised deep neural networks and dataset for fish detection in turbid underwater videos

Mohammad Jahanbakht,Mostafa Rahimi Azghadi,Nathan J Waltham

doi:10.1016/j.ecoinf.2023.102303

Abstract

Fish are key members of marine ecosystems, and they have a significant share in the healthy human diet. Besides, fish abundance is an excellent indicator of water quality, as they have adapted to various levels of oxygen, turbidity, nutrients, and pH. To detect various fish in underwater videos, Deep Neural Networks (DNNs) can be of great assistance. However, training DNNs is highly dependent on large, labeled datasets, while labeling fish in turbid underwater video frames is a laborious and time-consuming task, hindering the development of accurate and efficient models for fish detection. To address this problem, firstly, we have collected a dataset called FishInTurbidWater, which consists of a collection of video footage gathered from turbid waters, and quickly and weakly (i.e., giving higher priority to speed over accuracy) labeled them in a 4-times fast-forwarding software. Next, we designed and implemented a semi-supervised contrastive learning fish detection model that is self-supervised using unlabeled data, and then fine-tuned with a small fraction (20%) of our weakly labeled FishInTurbidWater data. At the next step, we trained, using our weakly labeled data, a novel weakly-supervised ensemble DNN with transfer learning from ImageNet. The results show that our semi-supervised contrastive model leads to more than 20 times faster turnaround time between dataset collection and result generation, with reasonably high accuracy (89%). At the same time, the proposed weakly-supervised ensemble model can detect fish in turbid waters with high (94%) accuracy, while still cutting the development time by a factor of four, compared to fully-supervised models trained on carefully labeled datasets. Our dataset and code are publicly available at the hyperlink FishInTurbidWater.

Full Text