Adversarially Trained Convolutional Neural Networks for Semantic Segmentation of Ischaemic Stroke Lesion using Multisequence Magnetic Resonance Imaging.
Ischaemic stroke is a medical condition caused by occlusion of blood supply to the brain tissue thus forming a lesion. A lesion is zoned into a core associated with irreversible necrosis typically located at the center of the lesion, while reversible hypoxic changes in the outer regions of the lesion are termed as the penumbra. Early estimation of core and penumbra in ischaemic stroke is crucial for timely intervention with thrombolytic therapy to reverse the damage and restore normalcy. Multisequence magnetic resonance imaging (MRI) is commonly employed for clinical diagnosis. However, a sequence singly has not been found to be sufficiently able to differentiate between core and penumbra, while a combination of sequences is required to determine the extent of the damage. The challenge, however, is that with an increase in the number of sequences, it cognitively taxes the clinician to discover symptomatic biomarkers in these images. In this paper, we present a data-driven fully automated method for estimation of core and penumbra in ischaemic lesions using diffusion-weighted imaging (DWI) and perfusion-weighted imaging (PWI) sequence maps of MRI. The method employs recent developments in convolutional neural networks (CNN) for semantic segmentation in medical images. In the absence of availability of a large amount of labeled data, the CNN is trained using an adversarial approach employing cross-entropy as a segmentation loss along with losses aggregated from three discriminators of which two employ relativistic visual Turing test. This method is experimentally validated on the ISLES-2015 dataset through three-fold cross-validation to obtain with an average Dice score of 0.82 and 0.73 for segmentation of penumbra and core respectively.
- Conference Article
4
- 10.1109/ijcnn.2017.7965909
- May 1, 2017
This paper investigates the problem of weakly-supervised semantic segmentation, where image-level labels are used as weak supervision. Inspired by the successful use of Convolutional Neural Networks (CNNs) for fully-supervised semantic segmentation, we choose to directly train the CNNs over the oversegmented regions of images for weakly-supervised semantic segmentation. Although there are a few studies on CNNs-based weakly-supervised semantic segmentation, they have rarely considered the noise issue, i.e., the initial weak labels (e.g., social tags) may be noisy. To cope with this issue, we thus propose graph-boosted CNNs (GB-CNNs) for weakly-supervised semantic segmentation. In our GB-CNNs, the graph-based model provides the initial supervision for training the CNNs, and then the outcomes of the CNNs are used to retrain the graph-based model. This training procedure is iteratively implemented to boost the results of semantic segmentation. Experimental results demonstrate that the proposed model outperforms the state-of-the-art weakly-supervised methods. More notably, the proposed model is shown to be more robust in the noisy setting for weakly-supervised semantic segmentation.
- Research Article
7
- 10.1007/s11042-018-7121-z
- Jan 22, 2019
- Multimedia Tools and Applications
Over the past several years, deep convolutional neural networks have pushed the performance of computer vision systems to soaring heights on semantic segmentation. In this paper, we put forward a novel practical convolutional neural network for effective semantic segmentation. The network is comprised of three modules. The module of widening residual convolutional neural network is firstly used to generate a full-size segmentation with localization of object boundaries. Feature maps generated by widening residual network are integrated by the module called as refine residual feature pyramid. The module of rolling guidance edge reserved layer is used for improving segmentation boundary. The innovations of this paper consist of these points. Widening residual skipped connections are utilized to preserve low-level features allowing accurate boundary localization. The refine residual feature pyramid is incorporated to achieve the observation of existence of objects at multiple scale. For modeling long-range context dependence and increasing accuracy near object boundary, rolling guidance edge reserved layer are implemented. The experimental results, on PASCAL VOC 2012 semantic segmentation dataset and Cityscapes dataset, demonstrate the proposed method can offer an effective way of producing pixel-wise image labeling.
- Research Article
229
- 10.1016/j.isprsjprs.2019.03.015
- Mar 29, 2019
- ISPRS Journal of Photogrammetry and Remote Sensing
A new fully convolutional neural network for semantic segmentation of polarimetric SAR imagery in complex land cover ecosystem
- Research Article
17
- 10.1016/j.sysarc.2024.103242
- Jul 20, 2024
- Journal of Systems Architecture
Evaluating single event upsets in deep neural networks for semantic segmentation: An embedded system perspective
- Research Article
42
- 10.3390/rs14020305
- Jan 10, 2022
- Remote Sensing
Semantic segmentation is one of the significant tasks in understanding aerial images with high spatial resolution. Recently, Graph Neural Network (GNN) and attention mechanism have achieved excellent performance in semantic segmentation tasks in general images and been applied to aerial images. In this paper, we propose a novel Superpixel-based Attention Graph Neural Network (SAGNN) for semantic segmentation of high spatial resolution aerial images. A K-Nearest Neighbor (KNN) graph is constructed from our network for each image, where each node corresponds to a superpixel in the image and is associated with a hidden representation vector. On this basis, the initialization of the hidden representation vector is the appearance feature extracted by a unary Convolutional Neural Network (CNN) from the image. Moreover, relying on the attention mechanism and recursive functions, each node can update its hidden representation according to the current state and the incoming information from its neighbors. The final representation of each node is used to predict the semantic class of each superpixel. The attention mechanism enables graph nodes to differentially aggregate neighbor information, which can extract higher-quality features. Furthermore, the superpixels not only save computational resources, but also maintain object boundary to achieve more accurate predictions. The accuracy of our model on the Potsdam and Vaihingen public datasets exceeds all benchmark approaches, reaching 90.23% and 89.32%, respectively.
- Research Article
134
- 10.1007/s11280-018-0556-3
- Apr 19, 2018
- World Wide Web
Recent years have witnessed the great progress for semantic segmentation using deep convolutional neural networks (DCNNs). This paper presents a novel fully convolutional network for semantic segmentation using multi-scale contextual convolutional features. Since objects in natural images tend to be with various scales and aspect ratios, capturing the rich contextual information is very critical for dense pixel prediction. On the other hand, when going deeper in convolutional layers, the convolutional feature maps of traditional DCNNs gradually become coarser, which may be harmful for semantic segmentation. According to these observations, we attempt to design a multi-scale deep context convolutional network (MDCCNet), which combines the feature maps from different levels of network in a holistic manner for semantic segmentation. The segmentation outputs of MDCCNets are further enhanced using dense connected conditional random fields (CRF). The proposed network allows us to fully exploit local and global contextual information, ranging from an entire scene to every single pixel, to perform pixel-wise label estimation. The experimental results demonstrate that our method outperforms or is comparable to state-of-the-art methods on PASCAL VOC 2012 and SIFTFlow semantic segmentation datasets.
- Research Article
291
- 10.1109/access.2020.2993937
- Jan 1, 2020
- IEEE Access
Predicting the presence of Microaneurysms in the fundus images and the identification of diabetic retinopathy in early-stage has always been a major challenge for decades. Diabetic Retinopathy (DR) is affected by prolonged high blood glucose level which leads to microvascular complications and irreversible vision loss. Microaneurysms formation and macular edema in the retinal is the initial sign of DR and diagnosis at the right time can reduce the risk of non proliferated diabetic retinopathy. The rapid improvement of deep learning makes it gradually become an efficient technique to provide an interesting solution for medical image analysis problems. The proposed system analysis the presence of microaneurysm in fundus image using convolutional neural network algorithms that embeds deep learning as a core component accelerated with GPU(Graphics Processing Unit) which will perform medical image detection and segmentation with high-performance and low-latency inference. The semantic segmentation algorithm is utilized to classify the fundus picture as normal or infected. Semantic segmentation divides the image pixels based on their common semantic to identify the feature of microaneurysm. This provides an automated system that will assist ophthalmologists to grade the fundus images as early NPDR, moderate NPDR, and severe NPDR. The Prognosis of Microaneurysm and early diagnosis system for non - proliferative diabetic retinopathy system has been proposed that is capable to train effectively a deep convolution neural network for semantic segmentation of fundus images which can increase the efficiency and accuracy of NPDR (non proliferated diabetic retinopathy) prediction.
- Research Article
7
- 10.1016/j.zemedi.2021.11.004
- Jan 10, 2022
- Zeitschrift für medizinische Physik
The application of deep neural networks for segmentation in medical imaging has gained substantial interest in recent years. In many cases, this variant of machine learning has been shown to outperform other conventional segmentation approaches. However, little is known about its general applicability. Especially the robustness against image modifications (e.g., intensity variations, contrast variations, spatial alignment) has hardly been investigated. Data augmentation is often used to compensate for sensitivity to such changes, although its effectiveness has not yet been studied. Therefore, the goal of this study was to systematically investigate the sensitivity to variations in input data with respect to segmentation of medical images using deep learning. This approach was tested with two publicly available segmentation frameworks (DeepMedic and TractSeg). In the case of DeepMedic, the performance was tested using ground truth data, while in the case of TractSeg, the STAPLE technique was employed. In both cases, sensitivity analysis revealed significant dependence of the segmentation performance on input variations. The effects of different data augmentation strategies were also shown, making this type of analysis a useful tool for selecting the right parameters for augmentation. The proposed analysis should be applied to any deep learning image segmentation approach, unless the assessment of sensitivity to input variations can be directly derived from the network.
- Research Article
96
- 10.1016/j.compbiomed.2020.104036
- Oct 8, 2020
- Computers in Biology and Medicine
A comparative study of pre-trained convolutional neural networks for semantic segmentation of breast tumors in ultrasound
- Research Article
104
- 10.3390/ijgi9040189
- Mar 25, 2020
- ISPRS International Journal of Geo-Information
Automatic water body extraction method is important for monitoring floods, droughts, and water resources. In this study, a new semantic segmentation convolutional neural network named the multi-scale water extraction convolutional neural network (MWEN) is proposed to automatically extract water bodies from GaoFen-1 (GF-1) remote sensing images. Three convolutional neural networks for semantic segmentation (fully convolutional network (FCN), Unet, and Deeplab V3+) are employed to compare with the water bodies extraction performance of MWEN. Visual comparison and five evaluation metrics are used to evaluate the performance of these convolutional neural networks (CNNs). The results show the following. (1) The results of water body extraction in multiple scenes using the MWEN are better than those of the other comparison methods based on the indicators. (2) The MWEN method has the capability to accurately extract various types of water bodies, such as urban water bodies, open ponds, and plateau lakes. (3) By fusing features extracted at different scales, the MWEN has the capability to extract water bodies with different sizes and suppress noise, such as building shadows and highways. Therefore, MWEN is a robust water extraction algorithm for GaoFen-1 satellite images and has the potential to conduct water body mapping with multisource high-resolution satellite remote sensing data.
- Research Article
14
- 10.3390/rs12081289
- Apr 18, 2020
- Remote Sensing
We studied the applicability of point clouds derived from tri-stereo satellite imagery for semantic segmentation for generalized sparse convolutional neural networks by the example of an Austrian study area. We examined, in particular, if the distorted geometric information, in addition to color, influences the performance of segmenting clutter, roads, buildings, trees, and vehicles. In this regard, we trained a fully convolutional neural network that uses generalized sparse convolution one time solely on 3D geometric information (i.e., 3D point cloud derived by dense image matching), and twice on 3D geometric as well as color information. In the first experiment, we did not use class weights, whereas in the second we did. We compared the results with a fully convolutional neural network that was trained on a 2D orthophoto, and a decision tree that was once trained on hand-crafted 3D geometric features, and once trained on hand-crafted 3D geometric as well as color features. The decision tree using hand-crafted features has been successfully applied to aerial laser scanning data in the literature. Hence, we compared our main interest of study, a representation learning technique, with another representation learning technique, and a non-representation learning technique. Our study area is located in Waldviertel, a region in Lower Austria. The territory is a hilly region covered mainly by forests, agriculture, and grasslands. Our classes of interest are heavily unbalanced. However, we did not use any data augmentation techniques to counter overfitting. For our study area, we reported that geometric and color information only improves the performance of the Generalized Sparse Convolutional Neural Network (GSCNN) on the dominant class, which leads to a higher overall performance in our case. We also found that training the network with median class weighting partially reverts the effects of adding color. The network also started to learn the classes with lower occurrences. The fully convolutional neural network that was trained on the 2D orthophoto generally outperforms the other two with a kappa score of over 90% and an average per class accuracy of 61%. However, the decision tree trained on colors and hand-crafted geometric features has a 2% higher accuracy for roads.
- Research Article
111
- 10.1016/j.isprsjprs.2018.06.005
- Jun 19, 2018
- ISPRS Journal of Photogrammetry and Remote Sensing
Developing a multi-filter convolutional neural network for semantic segmentation using high-resolution aerial imagery and LiDAR data
- Research Article
11
- 10.1007/s11548-020-02185-0
- May 22, 2020
- International Journal of Computer Assisted Radiology and Surgery
The manual generation of training data for the semantic segmentation of medical images using deep neural networks is a time-consuming and error-prone task. In this paper, we investigate the effect of different levels of realism on the training of deep neural networks for semantic segmentation of robotic instruments. An interactive virtual-reality environment was developed to generate synthetic images for robot-aided endoscopic surgery. In contrast with earlier works, we use physically based rendering for increased realism. Using a virtual reality simulator that replicates our robotic setup, three synthetic image databases with an increasing level of realism were generated: flat, basic, and realistic (using the physically-based rendering). Each of those databases was used to train 20 instances of a UNet-based semantic-segmentation deep-learning model. The networks trained with only synthetic images were evaluated on the segmentation of 160 endoscopic images of a phantom. The networks were compared using the Dwass-Steel-Critchlow-Fligner nonparametric test. Our results show that the levels of realism increased the mean intersection-over-union (mIoU) of the networks on endoscopic images of a phantom ([Formula: see text]). The median mIoU values were 0.235 for the flat dataset, 0.458 for the basic, and 0.729 for the realistic. All the networks trained with synthetic images outperformed naive classifiers. Moreover, in an ablation study, we show that the mIoU of physically based rendering is superior to texture mapping ([Formula: see text]) of the instrument (0.606), the background (0.685), and the background and instruments combined (0.672). Using physical-based rendering to generate synthetic images is an effective approach to improve the training of neural networks for the semantic segmentation of surgical instruments in endoscopic images. Our results show that this strategy can be an essential step in the broad applicability of deep neural networks in semantic segmentation tasks and help bridge the domain gap in machine learning.
- Conference Article
1
- 10.1109/aiid51893.2021.9456469
- May 28, 2021
The study of cell nuclei is the starting point of modern medical pathology analysis and new drug development, and the semantic segmentation of pathological tissue slice images is a fundamental task of cell nucleus research[1]. This paper proposes a deep learning convolutional neural network for semantic segmentation of cell nuclei, where V-Net [6] is used as the basic framework for segmentation, and then the channel attention mechanism is added to its skip connections. The experiment is evaluated on the dataset of pathological tissue slice images, publicly released in the 2018 Kaggle Challenge data science bowl. The experimental results show that the improved deep learning convolutional neural network achieves excellent performance on the semantic segmentation task of pathological tissue slice images, and can be used as a tool for automatic segmentation of pathological tissue slice images.
- Research Article
30
- 10.1016/j.neucom.2018.01.022
- Feb 14, 2018
- Neurocomputing
Contour-aware network for semantic segmentation via adaptive depth