Abstract

Separating individual pigs from pigpen scenes is crucial for precision farming, and convolutional neural network-based technology can provide a low-cost, non-contact, non-invasive method for pig image segmentation. However, two factors limit progress in this field. On the one hand, individual pigs tend to adhere to one another, and occlusion by objects such as pen railings can easily cause the model to make misjudgments. On the other hand, manual labeling of group-raised pig data is time-consuming, labor-intensive, and prone to labeling errors. Therefore, an individual pig image segmentation model that performs well in single-pig scenarios and can be easily transferred to group-raised environments is urgently needed. To address these problems, taking individual pigs as the research objects, an individual pig image segmentation dataset containing 2066 images was constructed, and a series of algorithms based on fully convolutional networks was proposed for the pig image segmentation task. To capture long-range dependencies and suppress background information such as the pigpen while enhancing information on individual pig parts, channel and spatial attention blocks were introduced into the best-performing decoders, UNet and LinkNet. Experiments show that, with ResNext50 as the encoder and UNet as the decoder as the basic model, adding both attention blocks simultaneously achieves 98.30% and 96.71% on the F1 and IoU metrics, respectively. Compared with the model using the channel attention block alone, the two metrics improve by 0.13% and 0.22%, respectively. Experiments introducing channel and spatial attention separately show that spatial attention is more effective than channel attention: taking VGG16-LinkNet as an example, spatial attention improves the F1 and IoU metrics over channel attention by 0.16% and 0.30%, respectively. Furthermore, feature heatmaps from different decoder layers after adding the different attention information show that the pig segmentation boundary becomes clearer as the number of layers increases. To verify the effectiveness of the individual pig image segmentation model in group-raised scenes, the transfer performance of the model was evaluated in three scenarios: high separation, deep adhesion, and pigpen occlusion. The experiments show that the segmentation results obtained by adding attention information, especially the simultaneous fusion of channel and spatial attention blocks, are more refined and complete. The attention-based individual pig image segmentation model can be effectively transferred to group-raised pig scenes and can provide a reference for their pre-segmentation.

Keywords: pig, image segmentation, Fully Convolutional Network (FCN), attention mechanism, channel and spatial attention

DOI: 10.25165/j.ijabe.20231601.6528

Citation: Hu Z W, Yang H, Lou T T, Yang H W. Concurrent channel and spatial attention in Fully Convolutional Network for individual pig image segmentation. Int J Agric & Biol Eng, 2023; 16(1): 232–242.
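The abstract describes inserting channel and spatial attention blocks into UNet/LinkNet decoders but does not include code. The following is a minimal sketch of how such blocks could be attached to a decoder feature map, assuming CBAM-style attention; the module names, reduction ratio, and placement inside the decoder are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): CBAM-style channel and spatial
# attention blocks that could be applied to a decoder feature map of a
# UNet/LinkNet-style segmentation network.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Re-weights feature channels using global average- and max-pooled context."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling branch
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale


class SpatialAttention(nn.Module):
    """Produces a spatial mask that can highlight pig regions and suppress pen background."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)            # channel-wise average
        mx = x.amax(dim=1, keepdim=True)             # channel-wise max
        mask = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * mask


class DualAttentionBlock(nn.Module):
    """Applies channel attention followed by spatial attention to a decoder feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.spatial(self.channel(x))


if __name__ == "__main__":
    feat = torch.randn(1, 256, 64, 64)               # e.g., one decoder-stage feature map
    out = DualAttentionBlock(256)(feat)
    print(out.shape)                                 # torch.Size([1, 256, 64, 64])
```

In this sketch, one DualAttentionBlock would be applied per decoder stage; whether the paper fuses the two attentions sequentially or in parallel is not specified in the abstract, so the sequential order here is an assumption.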
