The article proposes InvU-Net, an inverse architecture of the U-Net neural network that differs from the traditional scheme by increasing the dimensionality of images during the initial stages of processing. Two approaches to increasing image resolution were compared: UpSampling2D layers and transposed convolution (Conv2DTranspose) layers. The latter produced better results because its weighting coefficients are learned during training. As part of the study, three InvU-Net modifications were developed and tested: Small, Medium, and Large, which differ in structural complexity, number of layers, and number of parameters. To improve segmentation accuracy, the integration of attention mechanisms was proposed to enhance the relevance of feature processing. Experiments showed that simplifying the attention mechanisms, including reducing the number of parameters and optimizing the integration points, achieves high performance with lower computational complexity. The best-performing model, which incorporated a simplified attention mechanism, achieved 95.6% accuracy, surpassing larger architectures. The results highlight the potential of InvU-Net for segmentation tasks and suggest directions for further optimization, such as employing adaptive attention mechanisms and automating the selection of neural network parameters.
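The difference between the two upsampling approaches compared above can be illustrated with a minimal, dependency-free sketch. It is not the article's implementation and does not call Keras: the function names `upsample_nearest` and `conv_transpose_2d` are hypothetical, and the sketch assumes a single channel, stride 2, no padding, and no bias. It shows that a stride-2 transposed convolution with a fixed 2x2 kernel of ones reproduces nearest-neighbour upsampling, so a layer whose kernel is trained can represent that fixed scheme and others besides.

```python
def upsample_nearest(image, factor=2):
    """Repeat every pixel `factor` times along both axes (no trainable weights)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def conv_transpose_2d(image, kernel, stride=2):
    """Single-channel transposed convolution, no padding, no bias (assumed setup)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - 1) * stride + kh
    out_w = (w - 1) * stride + kw
    out = [[0.0] * out_w for _ in range(out_h)]
    # Each input pixel scatters a scaled copy of the kernel into the output.
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += image[i][j] * kernel[a][b]
    return out


image = [[1.0, 2.0], [3.0, 4.0]]

# Fixed scheme: every pixel is simply duplicated.
fixed = upsample_nearest(image)

# A kernel of ones makes the transposed convolution behave identically.
ones_kernel = [[1.0, 1.0], [1.0, 1.0]]
learned_like = conv_transpose_2d(image, ones_kernel)

# Any other kernel (here an arbitrary illustrative one) yields a different
# upsampling; in a trained network these values would be fitted to the data.
other = conv_transpose_2d(image, [[1.0, 0.5], [0.5, 0.25]])
```

In this sketch the kernel values are set by hand. In a trained network they would be fitted by gradient descent, which is the property the comparison above credits for the better results of Conv2DTranspose.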