Deep boundary‐aware semantic image segmentation

Abstract

While extensive research effort has gone into semantic image segmentation, state‐of‐the‐art methods still suffer from blurry boundaries and mismatched objects because of insufficient multiscale adaptability. In this paper, we propose a two‐branch convolutional neural network (CNN) approach in which one branch captures the multiscale context and the other captures the boundary information. To capture the multiscale context, we embed a self‐attention mechanism into the atrous spatial pyramid pooling network. To capture the boundary information, we fuse low‐level features into boundary feature extraction via a feature fusion layer (FFL), which refines the extracted boundaries. With the FFL, our method produces segmentation results with clearer boundaries. We also propose a new loss function that contains a segmentation loss and a boundary loss. Experiments show that our method predicts object boundaries more clearly and performs better on small‐scale objects.
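The two-term loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of either term, how the boundary ground truth is obtained, or how the terms are weighted, so everything below is an assumption: per-pixel cross-entropy for the segmentation term, binary cross-entropy for the boundary term, a boundary map derived from the label mask by 4-neighbour comparison, and a hypothetical weight `lam`.

```python
import math

def boundary_map(labels):
    """Mark a pixel as boundary (1) if any 4-neighbour has a different label (assumed definition)."""
    h, w = len(labels), len(labels[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != labels[y][x]:
                    out[y][x] = 1
                    break
    return out

def segmentation_loss(probs, labels):
    """Mean per-pixel cross-entropy; probs[y][x] is a list of class probabilities."""
    total, n = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for p, c in zip(prob_row, label_row):
            total -= math.log(max(p[c], 1e-12))
            n += 1
    return total / n

def boundary_loss(pred, target):
    """Mean binary cross-entropy between predicted boundary probabilities and a 0/1 map."""
    total, n = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1 - p)
            n += 1
    return total / n

def total_loss(seg_probs, boundary_pred, labels, lam=1.0):
    """Segmentation loss plus lam times boundary loss; lam is a hypothetical weight."""
    target = boundary_map(labels)
    return segmentation_loss(seg_probs, labels) + lam * boundary_loss(boundary_pred, target)

labels = [[0, 0, 1], [0, 0, 1]]
print(boundary_map(labels))
```

In a real two-branch network the two terms would be computed on the outputs of the segmentation and boundary branches and minimized jointly; this sketch only shows how such terms could be combined.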

Similar Papers
  • Conference Article
  • Cite Count: 79
  • 10.1109/cvprw.2018.00188
A Comparison of Deep Learning Methods for Semantic Segmentation of Coral Reef Survey Images
  • Jun 1, 2018
  • Andrew King + 2 more

Two major deep learning methods for semantic segmentation, i.e., patch-based convolutional neural network (CNN) approaches and fully convolutional neural network (FCNN) models, are studied in the context of classification of regions in underwater images of coral reef ecosystems into biologically meaningful categories. For the patch-based CNN approaches, we use image data extracted from underwater video accompanied by individual point-wise ground truth annotations. We show that patch-based CNN methods can outperform a previously proposed approach that uses support vector machine (SVM)-based classifiers in conjunction with texture-based features. We compare the results of five different CNN architectures in our formulation of patch-based CNN methods. The Resnet152 CNN architecture is observed to perform the best on our annotated dataset of underwater coral reef images. We also examine and compare the results of four different FCNN models for semantic segmentation of coral reef images. We develop a tool for fast generation of segmentation maps to serve as ground truth segmentations for our FCNN models. The FCNN architecture Deeplab v2 is observed to yield the best results for semantic segmentation of underwater coral reef images.

  • Research Article
  • Cite Count: 46
  • 10.1186/s12911-020-01277-w
Can natural language processing help differentiate inflammatory intestinal diseases in China? Models applying random forest and convolutional neural network approaches
  • Sep 29, 2020
  • BMC Medical Informatics and Decision Making
  • Yuanren Tong + 9 more

Background: Differentiating between ulcerative colitis (UC), Crohn’s disease (CD) and intestinal tuberculosis (ITB) using endoscopy is challenging. We aimed to realize automatic differential diagnosis among these diseases through machine learning algorithms. Methods: A total of 6399 consecutive patients (5128 UC, 875 CD and 396 ITB) who had undergone colonoscopy examinations in the Peking Union Medical College Hospital from January 2008 to November 2018 were enrolled. The input was the description of the endoscopic image in the form of free text. Word segmentation and key word filtering were conducted as data preprocessing. Random forest (RF) and convolutional neural network (CNN) approaches were applied to different disease entities. Three two-class classifiers (UC and CD, UC and ITB, and CD and ITB) and a three-class classifier (UC, CD and ITB) were built. Results: The classifiers built in this research performed well, and the CNN had better performance in general. The RF sensitivities/specificities of UC-CD, UC-ITB, and CD-ITB were 0.89/0.84, 0.83/0.82, and 0.72/0.77, respectively, while the values for the CNN of CD-ITB were 0.90/0.77. The precisions/recalls of UC-CD-ITB when employing RF were 0.97/0.97, 0.65/0.53, and 0.68/0.76, respectively, and when employing the CNN were 0.99/0.97, 0.87/0.83, and 0.52/0.81, respectively. Conclusions: Classifiers built by RF and CNN approaches had excellent performance when classifying UC with CD or ITB. For the differentiation of CD and ITB, high specificity and sensitivity were achieved as well. Artificial intelligence through machine learning is very promising in helping inexperienced endoscopists differentiate inflammatory intestinal diseases. Conference: The abstract of this article won the first prize of the Young Investigator Award during the Asian Pacific Digestive Week (APDW) 2019 held in Kolkata, India.

  • News Article
  • Cite Count: 1
  • 10.1016/s1351-4180(12)70438-5
Evonik and BioAmber to cooperate
  • Oct 23, 2012
  • Focus on Catalysts


  • Research Article
  • Cite Count: 107
  • 10.1016/j.eswa.2021.115090
DSANet: Dilated spatial attention for real-time semantic segmentation in urban street scenes
  • Apr 29, 2021
  • Expert Systems with Applications
  • Mohammed A.M Elhassan + 3 more


  • Research Article
  • Cite Count: 10
  • 10.2174/2213275912666190822093403
Genes Expression Classification Through Histone Modification Using Temporal Neural Network
  • Aug 30, 2021
  • Recent Advances in Computer Science and Communications
  • Rajit Nair + 1 more

Background: Gene expression data are high dimensional, which makes them very difficult to classify with traditional machine learning approaches. In this work we propose a model that combines a Convolutional Neural Network and a Recurrent Neural Network, both of which are deep learning models. Its predictions improve on those of other machine learning algorithms. Expressions are generated through histone modification. Methods: To improve accuracy, a deep learning model based on convolutional and recurrent neural networks is proposed. The proposed model uses filters, causal convolutional layers and residual blocks for prediction. Results: We implemented machine learning and deep learning algorithms including Logistic Regression, SVM, CNN, DeepChrome and the proposed Temporal Neural Network. Performance is measured by accuracy, precision and AUC on the training and testing sets. Conclusion: The proposed Temporal Neural Network model performs better than the other machine learning and deep learning algorithms, so it can be successfully applied to gene expression datasets.

  • Conference Article
  • Cite Count: 14
  • 10.1117/12.2049958
Convolutional neural network approach for buried target recognition in FL-LWIR imagery
  • May 29, 2014
  • Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
  • K Stone + 1 more

A convolutional neural network (CNN) approach to recognition of buried explosive hazards in forward-looking long-wave infrared (FL-LWIR) imagery is presented. The convolutional filters in the first layer of the network are learned in the frequency domain, making enforcement of zero-phase and zero-dc response characteristics much easier. The spatial domain representations of the filters are forced to have unit l2 norm, and penalty terms are added to the online gradient descent update to encourage orthonormality among the convolutional filters, as well as smooth first and second order derivatives in the spatial domain. The impact of these modifications on the generalization performance of the CNN model is investigated. The CNN approach is compared to a second recognition algorithm utilizing shearlet and log-Gabor decomposition of the image coupled with cell-structured feature extraction and support vector machine classification. Results are presented for multiple FL-LWIR data sets recently collected from US Army test sites. These data sets include vehicle position information allowing accurate transformation between image and world coordinates and realistic evaluation of detection and false alarm rates.
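The unit-norm constraint and orthonormality penalty mentioned in this abstract can be sketched as follows. The abstract does not state the exact penalty, so the form used here (sum of squared dot products over distinct pairs of flattened filters) is an assumption, as are the function names; the sketch also assumes no filter is all zeros.

```python
import math

def normalize_l2(filt):
    """Scale a flattened, non-zero filter so that its l2 norm is 1."""
    norm = math.sqrt(sum(v * v for v in filt))
    return [v / norm for v in filt]

def orthonormality_penalty(filters):
    """Sum of squared dot products over distinct filter pairs (0 when mutually orthogonal)."""
    penalty = 0.0
    for i in range(len(filters)):
        for j in range(i + 1, len(filters)):
            dot = sum(a * b for a, b in zip(filters[i], filters[j]))
            penalty += dot * dot
    return penalty

# Two toy 2-element "filters"; real filters would be flattened 2-D kernels.
filters = [normalize_l2(f) for f in ([3.0, 4.0], [0.0, 2.0])]
print(filters)
print(orthonormality_penalty(filters))
```

Adding such a term to the training objective pushes unit-norm filters toward mutual orthogonality, which is one plausible reading of "encourage orthonormality".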

  • Research Article
  • Cite Count: 45
  • 10.3390/rs14081874
Earth Observation Mission of a 6U CubeSat with a 5-Meter Resolution for Wildfire Image Classification Using Convolution Neural Network Approach
  • Apr 13, 2022
  • Remote Sensing
  • Muhammad Azami + 4 more

The KITSUNE satellite is a 6-unit CubeSat platform with the main mission of 5-m-class Earth observation in low Earth orbit (LEO), and the payload is developed with a 31.4 MP commercial off-the-shelf sensor, customized optics, and a camera controller board. Even though the payload is designed for Earth observation and to capture man-made patterns on the ground as the main mission, a secondary mission is planned for the classification of wildfire images by the convolutional neural network (CNN) approach. KITSUNE will therefore be the first CubeSat to employ a CNN to classify wildfire images in LEO. In this study, a deep-learning approach is utilized onboard the satellite to reduce the downlink data by pre-processing, instead of the traditional method of performing the image processing at the ground station. The pre-trained CNN models generated in Colab are saved on the RPi CM3+, where an uplink command executes the image classification algorithm and appends the results to the captured image data. On-ground testing indicated that the MiniVGGNet network running on the satellite system could achieve an overall accuracy of 98% and an F1 score of 97% in classifying wildfire events. The LeNet and ShallowNet models were also implemented on the CubeSat and compared, with F1 scores of 95% and 92%, respectively. Overall, this study demonstrated the capability of small satellites to run CNNs onboard in orbit. Finally, the KITSUNE satellite was deployed from the ISS in March 2022.

  • Research Article
  • Cite Count: 8
  • 10.1088/1742-6596/2312/1/012064
Exploration of Pattern Recognition Methods for Motor Imagery EEG Signal with Convolutional Neural Network Approach
  • Aug 1, 2022
  • Journal of Physics: Conference Series
  • Hanina N Zahra + 2 more

As an application of EEG, Motor Imagery based Brain-Computer Interface (MI BCI) plays a significant role in assisting patients with disabilities to communicate with their environment. MI BCI can now be realized through various methods such as machine learning. Many attempts have been made to apply different machine learning approaches to MI BCI, with varying results. While some attempts achieved agreeable results, others failed. This failure may be caused by the separation of the feature extraction and classification steps, which may lead to a loss of information and, in turn, lower classification accuracy. This problem can be solved by integrating feature extraction and classification through a classification algorithm that processes the input data as a whole until it produces the prediction; hence the use of the convolutional neural network (CNN) approach, which is known for its versatility in processing and classifying data in one go. In this study, the CNN exploration involved a task to classify 5 different classes of imagined finger movement (thumb, index, middle, ring, and pinky) based on the processed raw signal provided. The CNN performance was observed for both non-augmented and augmented data; the data augmentation techniques used include sliding window, noise addition, and the combination of the two. The results show that the CNN model achieved an average accuracy of 47% without augmentation, while with sliding window, noise addition, and the combined methods it achieved average accuracies of 57.1%, 47.2%, and 57.5%, respectively.
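The two augmentation techniques named in this abstract, sliding window and noise addition, can be sketched for a single-channel signal as below. The abstract gives no window length, step, or noise level, so `size`, `step`, and `sigma` are hypothetical parameters, and Gaussian noise is an assumption.

```python
import random

def sliding_windows(signal, size, step):
    """Cut a 1-D signal into overlapping windows of length `size`, advancing by `step`."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, step)]

def add_noise(signal, sigma, rng):
    """Return a copy of the signal with zero-mean Gaussian noise of std `sigma` added."""
    return [x + rng.gauss(0.0, sigma) for x in signal]

def augment(signal, size, step, sigma, seed=0):
    """Combined method: window the signal, then add noise to every window."""
    rng = random.Random(seed)
    return [add_noise(w, sigma, rng) for w in sliding_windows(signal, size, step)]

signal = [float(i) for i in range(10)]
print(sliding_windows(signal, size=4, step=2))
print(len(augment(signal, size=4, step=2, sigma=0.1)))
```

Windowing multiplies the number of training examples per trial, which may be why the sliding-window variants show the larger accuracy gains reported above; the noise-only variant leaves the example count unchanged.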

  • Research Article
  • Cite Count: 684
  • 10.1016/j.isprsjprs.2017.11.009
Classification with an edge: Improving semantic image segmentation with boundary detection
  • Dec 5, 2017
  • ISPRS Journal of Photogrammetry and Remote Sensing
  • D Marmanis + 5 more


  • Research Article
  • Cite Count: 20
  • 10.1016/j.apr.2023.101689
An integrated approach of deep learning convolutional neural network and google earth engine for salt storm monitoring and mapping
  • Feb 11, 2023
  • Atmospheric Pollution Research
  • Firouz Aghazadeh + 5 more

This study aims to develop an integrated approach of deep learning convolutional neural network (DL-CNN) and Google Earth Engine (GEE) platform for salt storm modeling and monitoring. First, we selected several ST predisposing factors, including Land Surface Temperature (LST), soil salinity, AOD, NDWI and NDVI, to train models. We then collected 957 Ground Control Points (GCPs) from the study area, which were randomly divided into training (70%) and validation (30%) datasets. Finally, ReLU, cross-entropy, and Adam were employed as the activation function, loss function and optimizer, respectively. Our findings demonstrate the efficiency of an integrated DL-CNN and GEE approach for monitoring salt storms (Overall Accuracy (OA) = 0.93.02, 0.92.99, 0.93.88, and 0.92.01 for years 2002, 2010, 2015 and 2021, respectively). The results also show an increase in the frequency of salt storms in the study area from 2002 to 2021. Such an approach is a promising step toward understanding, controlling, and managing salt storms, and we recommend salt storm spatial monitoring in other areas with similar environmental conditions. In addition, the results of this study provide critical insights into the Lake Urmia drought and its intensive environmental impacts on the health and wellbeing of the residents.
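The random 70%/30% division of the 957 ground control points can be sketched as below. The abstract does not say how the non-integer 70% count is rounded or how the shuffle is seeded, so rounding to the nearest integer and the `seed` parameter are assumptions.

```python
import random

def split_points(points, train_fraction=0.7, seed=0):
    """Shuffle the points and split them into training and validation lists."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))  # rounding rule is an assumption
    return shuffled[:n_train], shuffled[n_train:]

# Integer IDs stand in for the 957 ground control points.
train, val = split_points(range(957))
print(len(train), len(val))
```

Fixing the seed makes the split reproducible, so the same validation points are held out across model runs.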

  • Conference Article
  • Cite Count: 20
  • 10.1109/icdmw.2017.48
Convolutional Neural Network Approach for Mapping Arctic Vegetation Using Multi-Sensor Remote Sensing Fusion
  • Nov 1, 2017
  • Zachary L Langford + 2 more

Accurate and high-resolution maps of vegetation are critical for projects seeking to understand the terrestrial ecosystem processes and land-atmosphere interactions in Arctic ecosystems, such as the U.S. Department of Energy's Next Generation Ecosystem Experiment (NGEE) Arctic. However, most existing Arctic vegetation maps are at a coarse resolution and with a varying degree of detail and accuracy. Remote sensing-based approaches for mapping vegetation, while promising, are challenging in high latitude environments due to frequent cloud cover, polar darkness, and limited availability of high-resolution remote sensing datasets (e.g., ~5 m). This study proposes a new remote sensing based multi-sensor data fusion approach for developing high-resolution maps of vegetation in the Seward Peninsula, Alaska. We focus our detailed analysis and validation study around the Kougarok river, located in the central Seward Peninsula of Alaska. We seek to evaluate the integration of hyper-spectral, multi-spectral, radar, and terrain datasets using unsupervised and supervised classification techniques over a ~343.72 km² area for generating vegetation classifications at a variety of resolutions (5 m and 12.5 m). We first applied a quantitative goodness-of-fit method, called Mapcurves, that shows the degree of spatial concordance between the public coarse resolution maps and k-means clustering values and relabels the k values based on the best overlap. We develop a convolutional neural network (CNN) approach for developing high resolution vegetation maps for our study region in the Arctic. We compare two CNN approaches: (1) breaking up the images into small patches (e.g., 6 × 6) and predicting the vegetation class for the entire patch and (2) semantic segmentation, predicting the vegetation class for every pixel. We also perform accuracy assessments of the developed data products and evaluate varying CNN architectures. The fusion of hyperspectral and optical datasets performed the best, with accuracy values increased from 0.64 to 0.96-0.97 when using a training map produced by unsupervised clustering and Mapcurves labeling for both CNN models.

  • Conference Article
  • Cite Count: 48
  • 10.1109/icasi.2018.8394293
A personalized music recommendation system using convolutional neural networks approach
  • Apr 1, 2018
  • Shun-Hao Chang + 3 more

In this paper, we present a personalized music recommendation system (PMRS) based on the convolutional neural network (CNN) approach. The CNN approach classifies music into different genres based on the audio signal beats of the music. In PMRS, we propose a collaborative filtering (CF) recommendation algorithm that combines the output of the CNN with the log files to recommend music to the user. The log file contains the history of all users who use the PMRS. The PMRS extracts the user's history from the log file and recommends music under each genre. We use the million song dataset (MSD) to evaluate the PMRS. To show the working of the PMRS, we developed a mobile application (an Android version). We used confidence score metrics for the different music genres to check the performance of the PMRS.

  • Research Article
  • Cite Count: 15
  • 10.3397/1/377039
Time-series prediction and forecasting of ambient noise levels using deep learning and machine learning techniques
  • Sep 1, 2022
  • Noise Control Engineering Journal
  • S.K Tiwari + 2 more

Ambient day and night noise level prediction problems have traditionally been addressed using various statistical and machine learning methods. This paper presents time-series prediction and forecasting of ambient noise levels using a support vector machine (SVM) and a deep learning method, the convolutional neural network (CNN) approach. This approach has rarely been reported for modeling ambient noise levels so far, although it has been widely used in air and water pollution prediction and forecasting. The study presents the applications of these techniques in time-series modeling of ambient day and night equivalent noise levels. A case study of ambient noise levels at one site each in commercial, residential, industrial and silence zones is presented. Ten-fold cross-validation is used in the SVM model to train the model effectively and determine the optimized values of the hyper-parameters (γ, ε, C). Also, a CNN with a convolutional and pooling layer architecture is designed with optimum values of batch size, activation function, and filter size, among others. The validity and suitability of the developed SVM and CNN models are ascertained by various statistical tests. The convolutional neural network approach is observed to outperform the SVM model and thus can be a reliable approach for time-series modeling of ambient noise levels, with a prediction error of 2.1 dB(A). The forecasting root mean squared error obtained for all four zones using the CNN model is observed to be less than 2.1 dB(A) for day equivalent noise levels and 1.9 dB(A) for night equivalent noise levels.

  • Conference Article
  • Cite Count: 26
  • 10.1109/indicon47234.2019.9030307
A Convolutional Neural Network Approach Towards Self-Driving Cars
  • Dec 1, 2019
  • Akhil Agnihotri + 2 more

A convolutional neural network (CNN) approach is used to implement a level 2 autonomous vehicle by mapping pixels from the camera input to the steering commands. The network automatically learns the maximum variable features from the camera input, hence requires minimal human intervention. Given realistic frames as input, the driving policy trained on the dataset by NVIDIA and Udacity can adapt to real-world driving in a controlled environment. The CNN is tested on the CARLA open-source driving simulator. Details of a beta-testing platform are also presented, which consists of an ultrasonic sensor for obstacle detection and an RGBD camera for real-time position monitoring at 10Hz. Arduino Mega and Raspberry Pi are used for motor control and processing respectively to output the steering angle, which is converted to angular velocity for steering.

  • Research Article
  • Cite Count: 2
  • 10.47065/josh.v5i1.4380
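Analisis Validasi dan Evaluasi Model Deteksi Objek Varian Jahe Menggunakan Algoritma Yolov5 [Validation and Evaluation Analysis of a Ginger Variant Object Detection Model Using the YOLOv5 Algorithm]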
  • Oct 28, 2023
  • Journal of Information System Research (JOSH)
  • Lydia Palupi + 2 more

Object detection is one of the important techniques in the field of computer vision and image processing. In this study, a validation and evaluation analysis of an object detection model for ginger variants using the YOLOv5 algorithm with a Convolutional Neural Network (CNN) approach was carried out. The dataset used consists of various ginger variants taken from several sources. The dataset is divided into two parts, namely the training data and the testing data. Model training is carried out on the training data using the YOLOv5 algorithm with a CNN approach. Testing is carried out on the testing data to measure the model's performance in detecting ginger variants. The analysis showed that the object detection model can provide quite accurate results, with a detection accuracy rate of 93.9%. The detection of ginger variants can therefore serve as a useful means of verifying variety authenticity across diverse ginger variants. However, several challenges were faced in processing the dataset, such as variations in lighting and different angles of image capture. Therefore, this study provides recommendations for improving the dataset and optimizing parameter settings to improve the performance of the object detection model.
