Abstract
In pig instance segmentation, the application of traditional computer vision techniques is constrained by sundry obstructions, overlapping pigs, and varying viewpoints in the pig breeding environment. In recent years, attention-based methods have achieved remarkable performance. In this paper, we introduce two types of attention blocks into the feature pyramid network (FPN) framework, which encode semantic interdependencies in the channel dimension (the channel attention block, CAB) and the spatial dimension (the spatial attention block, SAB), respectively. By integrating the associated features, the CAB selectively emphasizes the interdependencies among channels. Meanwhile, the SAB selectively aggregates the features at each position through a weighted sum of the features at all positions. A dual attention block (DAB) is proposed to flexibly integrate CAB features with SAB information. A total of 45 pigs from 8 pens are captured as the experimental subjects. In comparison with such state-of-the-art attention modules as the convolutional block attention module (CBAM), the bottleneck attention module (BAM), and spatial-channel squeeze & excitation (SCSE), embedding the DAB contributes the most significant performance improvement across different task networks with distinct backbone networks. In particular, HTC-R101-DAB (hybrid task cascade with a ResNet-101 backbone) produces the best performance, with the average precision scores AP0.5, AP0.75, AP0.5:0.95, and AP0.5:0.95-large reaching 93.1%, 84.1%, 69.4%, and 71.8%, respectively. Ablation experiments further indicate that the SAB contributes more than the CAB, and that the predictive results first increase and then decrease as the number of merged SABs grows. Moreover, visualization of the attention maps reveals that the attention blocks extract regions with similar semantic information. The attention-based models also deliver outstanding segmentation performance on a public dataset, which evidences the practicability of our attention blocks. Our baseline models are available at https://github.com/zhiweihu1103/pig-instance-segmentation.
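To make the described mechanisms concrete, the following is a minimal PyTorch-style sketch of spatial attention (a weighted sum over all positions), channel attention (affinities among channels), and their fusion into a dual block. The channel-reduction ratio (C/8), the learnable residual weights, and the element-wise-sum fusion in DAB are assumptions following the standard self-attention formulation (cf. DANet) that the abstract's description matches, not the paper's exact implementation.

```python
# Sketch of SAB/CAB/DAB as described in the abstract; layer sizes and the
# fusion strategy are assumptions, not the authors' exact design.
import torch
import torch.nn as nn

class SAB(nn.Module):
    """Spatial attention: each position becomes a weighted sum of all positions."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)        # (B, HW, C/8)
        k = self.key(x).flatten(2)                          # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)                 # (B, HW, HW) position affinities
        v = self.value(x).flatten(2)                        # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)   # weighted sum over positions
        return self.gamma * out + x

class CAB(nn.Module):
    """Channel attention: channel-to-channel affinities from the feature map itself."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        f = x.flatten(2)                                     # (B, C, HW)
        attn = torch.softmax(f @ f.transpose(1, 2), dim=-1)  # (B, C, C) channel affinities
        out = (attn @ f).view(b, c, h, w)
        return self.gamma * out + x

class DAB(nn.Module):
    """Dual attention: fuse the two branches (element-wise sum assumed here)."""
    def __init__(self, channels):
        super().__init__()
        self.sab = SAB(channels)
        self.cab = CAB()

    def forward(self, x):
        return self.sab(x) + self.cab(x)

# Usage: refine one FPN level in place; the output keeps the input shape.
x = torch.randn(2, 256, 64, 64)
y = DAB(256)(x)
```

Because the block is shape-preserving, it can be inserted after any FPN level without changing the downstream task heads, which is why it composes with different task networks and backbones as the comparisons above describe.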