To address the low segmentation accuracy caused by the complex shape of carbon slag in aluminum electrolysis fire-eye images and the blurred boundary between the slag and the surrounding electrolyte, this paper proposes a fire-eye image segmentation model based on an improved U-Net. First, the model reduces the depth of the traditional U-Net to four layers and uses a multiscale dilated convolution module (MDCM) in the down-sampling stage. Second, the Convolutional Block Attention Module (CBAM) is embedded in the skip connections of the network. Together, these changes improve the model's ability to extract contextual features at multiple scales, strengthen the guidance of high-level features over low-level features, and direct the model's attention to critical regions. To alleviate the negative impact of the imbalance between positive and negative examples in the dataset, the weighted binary cross-entropy loss and the Dice loss are used in place of the traditional cross-entropy loss. Experimental results show that the improved model reaches a segmentation accuracy of 88.03% on the fire-eye dataset, 5.61 percentage points higher than U-Net.
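The loss described above can be illustrated with a minimal sketch. The abstract does not state how the two terms are combined or weighted, so the mixing factor `alpha`, the positive-class weight `pos_weight`, and the smoothing constant `smooth` below are assumptions chosen for illustration, not values from the paper. Predictions and labels are flattened to plain lists of per-pixel values.

```python
import math

def weighted_bce(probs, targets, pos_weight=2.0, eps=1e-7):
    """Mean binary cross-entropy with positive pixels scaled by pos_weight.

    pos_weight is a hypothetical value; the paper's weighting is not given here.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum of both masks + smooth)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

def combined_loss(probs, targets, pos_weight=2.0, alpha=0.5):
    """Convex mix of the two terms; alpha = 0.5 is an assumed, not reported, value."""
    bce = weighted_bce(probs, targets, pos_weight)
    dice = dice_loss(probs, targets)
    return alpha * bce + (1.0 - alpha) * dice

# Toy 4-pixel mask: two slag pixels (1) and two background pixels (0).
targets = [1, 0, 1, 0]
good = combined_loss([0.9, 0.1, 0.8, 0.2], targets)
bad = combined_loss([0.2, 0.8, 0.1, 0.9], targets)
```

In this toy case the loss for `good` is lower than for `bad`. The cross-entropy term penalizes per-pixel errors, with missed slag pixels costing more when `pos_weight` exceeds 1, while the Dice term scores region overlap and is therefore less sensitive to how many background pixels the image contains.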