Fast and Accurate Fish Detection Design with Improved YOLO-v3 Model and Transfer Learning

Kazim Raza,Song Hong

doi:10.14569/ijacsa.2020.0110202

Abstract

Object Detection is one of the problematic Computer Vision (CV) problems with countless applications. We proposed a real-time object detection algorithm based on Improved You Only Look Once version 3 (YOLOv3) for detecting fish. The demand for monitoring the marine ecosystem is increasing day by day for a vigorous automated system, which has been beneficial for all of the researchers in order to collect information about marine life. This proposed work mainly approached the CV technique to detect and classify marine life. In this paper, we proposed improved YOLOv3 by increasing detection scale from 3 to 4, apply k-means clustering to increase the anchor boxes, novel transfer learning technique, and improvement in loss function to improve the model performance. We performed object detection on four fish species custom datasets by applying YOLOv3 architecture. We got 87.56% mean Average Precision (mAP). Moreover, comparing to the experimental analysis of the original YOLOv3 model with the improved one, we observed the mAP increased from 87.17% to 91.30. It showed that improved version outperforms than the original YOLOv3 model.

Highlights

Deep learning (DL) is the subfield of Machine learning (ML), which is built on artificial neural networks that can be unsupervised, semi-supervised, or supervised learning
The experiment performed by the DL open-source library TensorFlow 1.11, OpenCV 4.1.1, and coding concluded with the high-level language python 3.5 at Ubuntu 18.04 operating system
The initial and end learning rate set to 1e-4 and 1e-6, respectively, Intersection over Union (IOU) threshold value 0.5, average decay 0.9, and the batch size is 4

Summary

Introduction

Deep learning (DL) is the subfield of Machine learning (ML), which is built on artificial neural networks that can be unsupervised, semi-supervised, or supervised learning. Researchers tried hard to train a deep multi-layer network for decades, but still, before 2006, there were not many successful experiments at that time where they only passed on effective results with one or two hidden layers. Those results were not producing substantial outcomes due to exploding gradients. [7] used CNN models pipeline, including VGG16 and SSD on 9 common species of fish in the Missouri river to classify into category and position. They have achieved 87.22% accuracy in the classification of the fish

Methods

Results

Conclusion