In this paper, we propose a novel deep neural network (DNN) architecture with fractal structure and attention blocks. The new method is tested on identifying and segmenting 2D and 3D brain tumor masks in normal and pathological neuroimaging data. To circumvent the problem of limited 3D volumetric datasets with raw images and ground truth tumor masks, we used data augmentation based on affine transformations to significantly expand the training data prior to estimating the network model parameters. The proposed Attention-based Fractal Unet (AFUnet) technique combines the benefits of fractal convolutional networks, attention blocks, and the encoder-decoder structure of Unet. The AFUnet models are fit on training data and their performance is assessed on independent validation and testing datasets. The Dice score is used to measure and contrast the performance of AFUnet against alternative methods, such as Unet, attention Unet, and several other DNN models with comparable numbers of parameters. In addition, we explore the effect of network depth on AFUnet prediction accuracy. The results suggest that with a few network structure iterations, the attention-based fractal Unet achieves good performance. Although deeper nested network structures improve prediction accuracy, they come at a substantial computational cost; the benefits of fitting deeper AFUnet models must therefore be weighed against the extra training time and computational demands. Some of the AFUnet networks outperform current state-of-the-art models and achieve highly accurate and realistic brain-tumor boundary segmentation (contours in 2D and surfaces in 3D). In our experiments, the Dice score is only marginally sensitive to significant inter-model differences. However, the validation loss continues to improve over long AFUnet training periods.
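As a point of reference for the evaluation metric discussed above, the following is a minimal sketch of the Dice similarity coefficient for binary segmentation masks; the function name and the toy masks are illustrative, not part of the paper's codebase:

```python
import numpy as np

def dice_score(pred, truth, eps=1e-7):
    """Dice similarity coefficient between two binary masks.

    pred, truth: boolean (or 0/1) arrays of the same shape,
    e.g. 2D tumor contours or 3D tumor volumes.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    # 2|A ∩ B| / (|A| + |B|); eps guards against two empty masks
    return 2.0 * intersection / (pred.sum() + truth.sum() + eps)

# Example: two 4x4 masks of 8 voxels each, overlapping in 4 voxels
a = np.zeros((4, 4), dtype=bool); a[:2, :] = True
b = np.zeros((4, 4), dtype=bool); b[1:3, :] = True
print(round(dice_score(a, b), 3))  # 2*4 / (8+8) = 0.5
```

The same formula applies unchanged to 3D volumes, since the reductions operate over all voxels regardless of array rank.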
The lower binary cross entropy loss indicates that AFUnet is better at identifying true negative voxels (i.e., normal tissue), which suggests the new method is more conservative. This approach may be generalized to higher-dimensional data, e.g., 4D fMRI hypervolumes, and applied to a wide range of signal, image, volume, and hypervolume segmentation tasks.
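To illustrate why a lower binary cross entropy loss rewards confident identification of normal-tissue (true negative) voxels, here is a minimal sketch; the function and the example probabilities are hypothetical, not the paper's implementation:

```python
import numpy as np

def binary_cross_entropy(p, y, eps=1e-7):
    """Mean binary cross entropy between predicted probabilities p
    and binary ground truth labels y (1 = tumor, 0 = normal tissue)."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)  # avoid log(0)
    y = np.asarray(y, dtype=float)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Mostly normal tissue, one tumor voxel
y = np.array([0, 0, 0, 1])
# A model confident on background voxels vs. one that hedges on them
confident = np.array([0.05, 0.05, 0.05, 0.9])
hedging = np.array([0.40, 0.40, 0.40, 0.9])
print(binary_cross_entropy(confident, y) < binary_cross_entropy(hedging, y))  # True
```

Because healthy tissue dominates most neuroimaging volumes, a model that assigns low tumor probability to background voxels drives the average loss down, consistent with the conservative behavior described above.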