Modern aquaculture uses computer vision to analyze underwater images of fish, which helps optimize water quality and improve production efficiency. This study aims to detect and track fish underwater efficiently using multi-object tracking (MOT). To this end, the FairMOT model, which performs pixel-level object detection and re-identification (Re-ID) simultaneously, was evaluated with two backbones: FairMOT+YOLOv5s and FairMOT+DLA-34. A dataset was constructed for black porgy, a species popular in Korean aquaculture, from underwater videos of five different environments collected from the internet. During training, the FairMOT+YOLOv5s model reduced its training loss rapidly and performed stably. The FairMOT+DLA-34 model achieved an accuracy of 44.1%, an IDF1 of 11.0%, an MOTP of 0.393, and 1 ID switch (IDSW), whereas the FairMOT+YOLOv5s model recorded an accuracy of 43.8%, an IDF1 of 14.6%, an MOTP of 0.400, and 10 ID switches. In summary, FairMOT+YOLOv5s scored higher on IDF1 and MOTP, while FairMOT+DLA-34 achieved slightly higher accuracy and far fewer ID switches.