Targeted prostate biopsy guided by multiparametric magnetic resonance imaging (mpMRI) detects more clinically significant lesions than conventional systematic biopsy. Lesion segmentation is required for planning MRI-targeted biopsies. The need to integrate image features available in T2-weighted and diffusion-weighted images poses a challenge in prostate lesion segmentation from mpMRI. A flexible and efficient multistream fusion encoder is proposed in this work to facilitate the multiscale fusion of features from multiple imaging streams. A patch-based loss function is introduced to improve the accuracy of segmenting small lesions. The proposed multistream encoder fuses features extracted in the three imaging streams at each layer of the network, thereby allowing improved feature maps to propagate downstream and benefit segmentation performance. The fusion is achieved through a spatial attention map generated by optimally weighting the contribution of the convolution outputs from each stream. This design gives the network the flexibility to emphasize image modalities according to their relative influence on segmentation performance. The encoder also performs multiscale integration by highlighting the input feature maps (low-level features) with the spatial attention maps generated from convolution outputs (high-level features). The Dice similarity coefficient (DSC), when used as a cost function, is relatively insensitive to segmentation errors in small lesions. We address this issue by introducing a patch-based loss function that averages the DSCs obtained from local image patches. This local average DSC is equally sensitive to large and small lesions, as the patch-based DSCs associated with small and large lesions carry equal weights in the average. The framework was evaluated on 931 sets of images acquired in several clinical studies at two centers in Hong Kong and the United Kingdom.
The training, validation, and test sets contained 615, 144, and 172 sets of images, respectively. The proposed framework outperformed single-stream networks and three recently proposed multistream networks, attaining F1 scores of 82.2% and 87.6% at the lesion and patient levels, respectively. The average inference time for an axial image was 11.8 ms. The accuracy and efficiency afforded by the proposed framework would accelerate the MRI interpretation workflow for MRI-targeted biopsy and focal therapies.
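The patch-based loss described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that the loss averages the DSCs obtained from local image patches, so the non-overlapping square patches, the patch size, the rule of skipping patches with no foreground, and the function names `dice`, `patch_dice_loss`, and `make_mask` are all assumptions made for illustration.

```python
def dice(pred, target, eps=1e-6):
    """Soft Dice similarity coefficient between two flat lists of values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)


def flatten(mask, r0, c0, size):
    """Return the size x size window of a 2D mask starting at (r0, c0) as a flat list."""
    return [v for row in mask[r0:r0 + size] for v in row[c0:c0 + size]]


def patch_dice_loss(pred, target, patch=32, eps=1e-6):
    """One minus the mean Dice over non-overlapping patches.

    Assumption: patches with no foreground in either mask are skipped,
    so that empty background does not dominate the average.
    """
    h, w = len(target), len(target[0])
    scores = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            p = flatten(pred, r, c, patch)
            t = flatten(target, r, c, patch)
            if sum(p) + sum(t) == 0:
                continue  # empty patch, skipped (assumed rule)
            scores.append(dice(p, t, eps))
    return 1.0 - sum(scores) / len(scores) if scores else 0.0


def make_mask(size, boxes):
    """Build a size x size binary mask with rectangular lesions (r0, r1, c0, c1)."""
    mask = [[0.0] * size for _ in range(size)]
    for r0, r1, c0, c1 in boxes:
        for r in range(r0, r1):
            for c in range(c0, c1):
                mask[r][c] = 1.0
    return mask


# Toy case: one large lesion segmented perfectly, one small lesion missed entirely.
target = make_mask(64, [(0, 32, 0, 32), (40, 44, 40, 44)])
pred = make_mask(64, [(0, 32, 0, 32)])

global_loss = 1.0 - dice(flatten(pred, 0, 0, 64), flatten(target, 0, 0, 64))
patch_loss = patch_dice_loss(pred, target, patch=32)
print(f"global Dice loss: {global_loss:.4f}")  # about 0.008
print(f"patch Dice loss:  {patch_loss:.4f}")   # about 0.5
```

In this toy case the global Dice loss barely registers the missed small lesion, whereas the patch average weights the patch containing it equally with the patch containing the large lesion, which is the behavior the abstract attributes to the local average DSC.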