A Pixel-Level Explainable Approach of Convolutional Neural Networks and Its Application
Convolutional neural networks (CNNs) are now widely used for image classification. Unfortunately, a trained CNN model is a highly complex nonlinear system, and the implicit decision knowledge it carries is often difficult for humans to comprehend. A feasible way to make this decision knowledge understandable is to explain the classification basis of the trained CNN model. To address the insufficient interpretation accuracy of existing methods, this paper presents a novel pixel-level explainable approach based on a guided symbolic execution strategy. Extensive experiments are conducted on CNN models published by the PyTorch team, and the results show that, compared with existing explainable methods, the presented approach interprets the classification basis of input images at the pixel level with 100% accuracy. In addition, a scheme to enhance the adversarial robustness of CNN models is designed based on the presented explainable approach. Evaluation experiments show that the scheme provides an effective way to improve the adversarial robustness of CNN models and transfers across CNN models with different structures.
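The general idea of pixel-level attribution can be illustrated with a much simpler baseline than the paper's guided symbolic execution: occlude each pixel and measure the drop in the class score. The tiny linear "model" below is purely an assumption for illustration, not the paper's method.

```python
# A simple occlusion-based attribution baseline. The linear "model"
# below is a stand-in for a trained CNN, purely an assumption for
# illustration; the paper's guided symbolic execution is different.

def score(image, weights):
    """Toy classifier score: weighted sum of pixel intensities."""
    return sum(w * p for w, p in zip(weights, image))

def occlusion_attribution(image, weights):
    """Credit each pixel with the score drop caused by zeroing it."""
    base = score(image, weights)
    attributions = []
    for i in range(len(image)):
        occluded = list(image)
        occluded[i] = 0.0  # occlude one pixel
        attributions.append(base - score(occluded, weights))
    return attributions

attrs = occlusion_attribution([0.2, 0.9, 0.4], [1.0, 2.0, -0.5])
print(max(range(3), key=lambda i: attrs[i]))  # → 1 (the decisive pixel)
```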
- Research Article
10
- 10.2174/2213275912666190822093403
- Aug 30, 2021
- Recent Advances in Computer Science and Communications
Background: Gene expression data are high dimensional, which makes them very difficult to classify with traditional machine learning approaches. In this work we propose a model that combines a Convolutional Neural Network and a Recurrent Neural Network, both deep learning models; its predictions improve on those of other machine learning algorithms. Gene expression is regulated through histone modification. Methods: To improve accuracy, a deep learning model based on convolutional and recurrent neural networks is proposed. The model uses filters, causal convolutional layers, and residual blocks for prediction. Results: We implemented machine learning and deep learning algorithms, namely Logistic Regression, SVM, CNN, DeepChrome, and the proposed Temporal Neural Network. Performance is measured in terms of accuracy, precision, and AUC on the training and testing sets. Conclusion: The proposed Temporal Neural Network outperforms the other machine learning and deep learning algorithms and can therefore be applied successfully to gene expression datasets.
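The causal convolutional layer mentioned in the Methods can be sketched in plain Python (a minimal, framework-free illustration, not the authors' implementation): each output depends only on the current and past inputs, never on future ones.

```python
def causal_conv1d(x, kernel):
    """Causal 1D convolution: output at step t depends only on
    inputs at steps <= t (the sequence is left-padded with zeros),
    so future values never leak into a prediction."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
            for t in range(len(x))]

print(causal_conv1d([1.0, 2.0, 3.0], [1.0, 1.0]))  # → [1.0, 3.0, 5.0]
```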
- Research Article
2
- 10.47065/josh.v5i1.4380
- Oct 28, 2023
- Journal of Information System Research (JOSH)
Object detection is one of the important techniques in the field of computer vision and image processing. In this study, a validation and evaluation analysis of an object detection model for ginger variants using the YOLOv5 algorithm with a Convolutional Neural Network (CNN) approach was carried out. The dataset consists of various ginger variants collected from several sources and is divided into two parts: training data and testing data. The model is trained on the training data using the YOLOv5 algorithm with a CNN approach, and tested on the testing data to measure its performance in detecting ginger variants. The analysis shows that the model provides quite accurate results, with a detection accuracy of 93.9%, so detection of ginger variants can serve as a useful means of verifying varietal authenticity across diverse ginger variants. However, several challenges were faced in processing the dataset, such as variations in lighting and different angles of image capture. This study therefore recommends improving the dataset and optimizing parameter settings to further improve the performance of the object detection model.
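Detection accuracy for a YOLO-style model is typically scored by the intersection-over-union (IoU) between predicted and ground-truth boxes. A minimal sketch of the standard criterion (not tied to this study's evaluation code):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as
    (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap area 1, union area 7
```

A detection usually counts as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5.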
- Research Article
46
- 10.1186/s12911-020-01277-w
- Sep 29, 2020
- BMC Medical Informatics and Decision Making
Background: Differentiating between ulcerative colitis (UC), Crohn’s disease (CD) and intestinal tuberculosis (ITB) using endoscopy is challenging. We aimed to realize automatic differential diagnosis among these diseases through machine learning algorithms. Methods: A total of 6399 consecutive patients (5128 UC, 875 CD and 396 ITB) who had undergone colonoscopy examinations at Peking Union Medical College Hospital from January 2008 to November 2018 were enrolled. The input was the description of the endoscopic image in the form of free text. Word segmentation and keyword filtering were conducted as data preprocessing. Random forest (RF) and convolutional neural network (CNN) approaches were applied to different disease entities. Three two-class classifiers (UC vs. CD, UC vs. ITB, and CD vs. ITB) and a three-class classifier (UC, CD and ITB) were built. Results: The classifiers built in this research performed well, and the CNN had better performance in general. The RF sensitivities/specificities of UC-CD, UC-ITB, and CD-ITB were 0.89/0.84, 0.83/0.82, and 0.72/0.77, respectively, while the values for the CNN on CD-ITB were 0.90/0.77. The precisions/recalls of UC-CD-ITB when employing RF were 0.97/0.97, 0.65/0.53, and 0.68/0.76, respectively, and when employing the CNN were 0.99/0.97, 0.87/0.83, and 0.52/0.81, respectively. Conclusions: Classifiers built by RF and CNN approaches had excellent performance when classifying UC against CD or ITB. For the differentiation of CD and ITB, high specificity and sensitivity were achieved as well. Artificial intelligence through machine learning is very promising in helping inexperienced endoscopists differentiate inflammatory intestinal diseases. Conference: The abstract of this article won first prize in the Young Investigator Award at the Asian Pacific Digestive Week (APDW) 2019 held in Kolkata, India.
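The sensitivity/specificity pairs reported above follow from the usual confusion-matrix definitions. A small sketch with hypothetical counts (the paper does not report raw confusion matrices; the numbers below are chosen only to mirror the RF UC-CD figures):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts, NOT taken from the paper.
sens, spec = sensitivity_specificity(tp=89, fn=11, tn=84, fp=16)
print(sens, spec)  # → 0.89 0.84
```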
- Conference Article
14
- 10.1117/12.2049958
- May 29, 2014
- Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE
A convolutional neural network (CNN) approach to recognition of buried explosive hazards in forward-looking long-wave infrared (FL-LWIR) imagery is presented. The convolutional filters in the first layer of the network are learned in the frequency domain, making enforcement of zero-phase and zero-DC response characteristics much easier. The spatial-domain representations of the filters are forced to have unit l2 norm, and penalty terms are added to the online gradient descent update to encourage orthonormality among the convolutional filters, as well as smooth first- and second-order derivatives in the spatial domain. The impact of these modifications on the generalization performance of the CNN model is investigated. The CNN approach is compared to a second recognition algorithm that uses shearlet and log-Gabor decompositions of the image coupled with cell-structured feature extraction and support vector machine classification. Results are presented for multiple FL-LWIR data sets recently collected from US Army test sites. These data sets include vehicle position information, allowing accurate transformation between image and world coordinates and realistic evaluation of detection and false alarm rates.
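The unit-l2-norm constraint and the orthonormality penalty can be sketched as follows; this is a plain-Python illustration of the two quantities only, not the paper's frequency-domain implementation.

```python
import math

def l2_normalize(f):
    """Project a filter (flattened to a vector) onto the unit l2 sphere."""
    norm = math.sqrt(sum(v * v for v in f))
    return [v / norm for v in f]

def orthonormality_penalty(filters):
    """Sum of squared pairwise inner products between distinct
    unit-norm filters; zero exactly when the bank is orthonormal."""
    penalty = 0.0
    for i in range(len(filters)):
        for j in range(i + 1, len(filters)):
            dot = sum(a * b for a, b in zip(filters[i], filters[j]))
            penalty += dot * dot
    return penalty

print(l2_normalize([3.0, 4.0]))                          # → [0.6, 0.8]
print(orthonormality_penalty([[1.0, 0.0], [1.0, 0.0]]))  # → 1.0
```

In training, a penalty like this would be added to the loss so that gradient descent pushes the filter bank toward orthonormality.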
- Research Article
43
- 10.3390/rs14081874
- Apr 13, 2022
- Remote Sensing
The KITSUNE satellite is a 6-unit CubeSat platform whose main mission is 5-m-class Earth observation in low Earth orbit (LEO); its payload is developed with a 31.4 MP commercial off-the-shelf sensor, customized optics, and a camera controller board. Although the payload is designed for the main mission of observing Earth and capturing man-made patterns on the ground, a secondary mission is planned: classification of wildfire images with a convolutional neural network (CNN) approach. KITSUNE will therefore be the first CubeSat to employ a CNN to classify wildfire images in LEO. In this study, a deep-learning approach is utilized onboard the satellite to reduce the downlink data by pre-processing, instead of the traditional method of performing the image processing at the ground station. The pre-trained CNN models generated in Colab are saved on an RPi CM3+, where an uplink command executes the image classification algorithm and appends the results to the captured image data. On-ground testing indicated an overall accuracy of 98% and an F1 score of 97% in classifying wildfire events on the satellite system using the MiniVGGNet network. The LeNet and ShallowNet models were also compared and implemented on the CubeSat, with F1 scores of 95% and 92%, respectively. Overall, this study demonstrates the capability of small satellites to run a CNN onboard in orbit. The KITSUNE satellite was deployed from the ISS in March 2022.
- Research Article
8
- 10.1088/1742-6596/2312/1/012064
- Aug 1, 2022
- Journal of Physics: Conference Series
As an application of EEG, the Motor Imagery based Brain-Computer Interface (MI BCI) plays a significant role in helping patients with disabilities communicate with their environment. MI BCI can be realized through various methods, including machine learning. Many attempts using different machine learning approaches for MI BCI have been made, with varying results: some achieved agreeable accuracy, while others failed. One cause of failure may be the separation of the feature extraction and classification steps, which can lose information and in turn lower classification accuracy. This problem can be addressed by integrating feature extraction and classification in a single algorithm that processes the input data end to end, hence the use of the convolutional neural network (CNN) approach, known for its versatility in processing and classifying data in one pass. In this study, the CNN was tasked with classifying five classes of imagined finger movement (thumb, index, middle, ring, and pinky) from the processed raw signal. CNN performance was observed for both non-augmented and augmented data, with augmentation techniques including sliding window, noise addition, and their combination. The results show that the CNN model achieved an average accuracy of 47%, while with the sliding window, noise addition, and combined augmentation techniques it achieved higher average accuracies of 57.1%, 47.2%, and 57.5%, respectively.
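The two augmentation techniques can be sketched generically; the window width, stride, and noise level below are illustrative assumptions, not the study's settings.

```python
import random

def sliding_windows(signal, width, stride):
    """Cut one EEG trial into overlapping windows, multiplying the
    number of training examples per trial."""
    return [signal[s:s + width]
            for s in range(0, len(signal) - width + 1, stride)]

def add_noise(window, sigma, seed=0):
    """Gaussian-noise augmentation: jitter each sample slightly."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in window]

trial = list(range(10))                         # toy 10-sample trial
windows = sliding_windows(trial, width=4, stride=2)
print(len(windows))                             # → 4
```

The combined method would simply apply `add_noise` to each window produced by `sliding_windows`.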
- Research Article
15
- 10.3397/1/377039
- Sep 1, 2022
- Noise Control Engineering Journal
Ambient day and night noise level prediction problems have traditionally been addressed using various statistical and machine learning methods. This paper presents time-series prediction and forecasting of ambient noise levels using a support vector machine (SVM) and a deep learning method, the convolutional neural network (CNN) approach. This approach has rarely been reported for modeling ambient noise levels so far, although it has been widely used in air and water pollution prediction and forecasting. The study presents the application of these techniques to time-series modeling of ambient day and night equivalent noise levels. A case study is presented of ambient noise levels at one site each in a commercial, residential, industrial and silence zone. Ten-fold cross-validation is used to train the SVM model effectively and determine the optimized values of its hyperparameters (γ, ε, C). The CNN, with a convolutional and pooling layer architecture, is designed with optimum values of batch size, activation function, and filter size, among others. The validity and suitability of the developed SVM and CNN models are ascertained by various statistical tests. The convolutional neural network approach is observed to outperform the SVM model and can thus be a reliable approach for time-series modeling of ambient noise levels, with a prediction error of 2.1 dB(A). The forecasting root mean squared error obtained for all four zones using the CNN model is less than 2.1 dB(A) for day equivalent noise levels and 1.9 dB(A) for night equivalent noise levels.
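Ten-fold cross-validation partitions the samples as sketched below (a generic index split for illustration, not the authors' code); each fold is held out once for validation while the rest train the model.

```python
def k_fold_indices(n, k=10):
    """Partition n sample indices into k contiguous folds; each fold
    serves once as the validation set while the others train."""
    base, extra = divmod(n, k)  # spread the remainder over early folds
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(25, k=10)
print(len(folds), sum(len(f) for f in folds))  # → 10 25
```

In practice, time-series data would use a chronological split per fold rather than a random one, to avoid training on the future.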
- Conference Article
25
- 10.1109/indicon47234.2019.9030307
- Dec 1, 2019
A convolutional neural network (CNN) approach is used to implement a level 2 autonomous vehicle by mapping pixels from the camera input to steering commands. The network automatically learns the maximally variable features from the camera input, hence requiring minimal human intervention. Given realistic frames as input, the driving policy trained on the NVIDIA and Udacity datasets can adapt to real-world driving in a controlled environment. The CNN is tested on the CARLA open-source driving simulator. Details of a beta-testing platform are also presented, consisting of an ultrasonic sensor for obstacle detection and an RGBD camera for real-time position monitoring at 10 Hz. An Arduino Mega and a Raspberry Pi are used for motor control and processing, respectively, to output the steering angle, which is converted to angular velocity for steering.
- Conference Article
48
- 10.1109/icasi.2018.8394293
- Apr 1, 2018
In this paper, we present a personalized music recommendation system (PMRS) based on a convolutional neural network (CNN) approach. The CNN classifies music into different genres based on the audio signal beats. In the PMRS, we propose a collaborative filtering (CF) recommendation algorithm that combines the output of the CNN with log files to recommend music to the user. The log file contains the history of all users of the PMRS; the PMRS extracts a user's history from the log file and recommends music under each genre. We use the Million Song Dataset (MSD) to evaluate the PMRS. To demonstrate the PMRS, we developed a mobile application (an Android version), and we used confidence score metrics for the different music genres to check its performance.
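The way the CNN's genre labels and the log file combine can be caricatured as follows. The log and catalog structures are invented for illustration, and the real PMRS uses a full collaborative-filtering algorithm rather than this single-user frequency count.

```python
from collections import Counter

def recommend(user_log, catalog, n=3):
    """Recommend up to n unheard tracks from the user's most-played
    genre (genres come from a CNN classifier, plays from the log)."""
    top_genre = Counter(g for _, g in user_log).most_common(1)[0][0]
    heard = {t for t, _ in user_log}
    return [t for t, g in catalog if g == top_genre and t not in heard][:n]

# Hypothetical (track, genre) data for illustration only.
log = [("a", "jazz"), ("b", "jazz"), ("c", "rock")]
catalog = [("a", "jazz"), ("d", "jazz"), ("e", "jazz"), ("f", "rock")]
print(recommend(log, catalog))  # → ['d', 'e']
```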
- Research Article
26
- 10.23919/jsee.2020.000068
- Oct 1, 2020
- Journal of Systems Engineering and Electronics
Non-orthogonal multiple access (NOMA), featuring high spectrum efficiency, massive connectivity and low latency, holds immense potential as a novel multi-access technique in fifth-generation (5G) communication. Successive interference cancellation (SIC) has been proven to be an effective method of detecting the NOMA signal by ordering the received signal powers and then decoding the signals in turn. However, the error accumulation effect, referred to as error propagation, is an inevitable problem. In this paper, we propose a convolutional neural network (CNN) approach to restore the desired signal impaired by the multiple-input multiple-output (MIMO) channel. In particular, in the uplink NOMA scenario, the proposed method can decode multiple users' information in a cluster instantaneously, without any traditional communication signal processing steps. Simulation experiments conducted over the Rayleigh channel demonstrate that the error performance of the proposed learning system outperforms that of classic SIC detection. Consequently, deep learning has disruptive potential to replace conventional signal detection methods.
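The SIC baseline that the CNN is compared against can be sketched for a noiseless toy case; the BPSK constellation and power levels below are assumptions for illustration only.

```python
import math

def sic_detect(y, powers, constellation=(-1.0, 1.0)):
    """Successive interference cancellation: decode users in
    descending power order, subtracting each hard decision from
    the received superposition y before decoding the next user."""
    order = sorted(range(len(powers)), key=lambda i: -powers[i])
    symbols = [0.0] * len(powers)
    residual = y
    for i in order:
        amp = math.sqrt(powers[i])
        # hard decision: nearest constellation point for this user
        symbols[i] = min(constellation, key=lambda c: abs(residual - amp * c))
        residual -= amp * symbols[i]
    return symbols

# Two users with powers 4 and 1 send +1 and -1; received y = 2 - 1 = 1.
print(sic_detect(2.0 * 1.0 + 1.0 * -1.0, [4.0, 1.0]))  # → [1.0, -1.0]
```

Error propagation is visible in this structure: a wrong hard decision for the strongest user corrupts the residual for every later user.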
- Research Article
1
- 10.7717/peerj-cs.2052
- Aug 12, 2024
- PeerJ. Computer science
Most natural disasters result from geodynamic events such as landslides and slope collapse. These failures cause catastrophes that directly impact the environment and cause financial and human losses. Visual inspection is the primary method for detecting failures in geotechnical structures, but on-site visits can be risky due to unstable soil. In addition, structural design and hostile, remote installation conditions can make monitoring these structures unviable. When a fast and secure evaluation is required, analysis by computational methods becomes feasible. In this study, a convolutional neural network (CNN) approach to computer vision is applied to identify defects in the surfaces of geotechnical structures, aided by an unmanned aerial vehicle (UAV) and mobile devices, aiming to reduce the reliance on human-led on-site inspections. Computer vision algorithms remain underexplored in this field due to particularities of geotechnical engineering, such as limited public datasets and redundant images. We therefore collected images of surface failure indicators from slopes near a Brazilian national road, assisted by a UAV and mobile devices, and propose a custom, low-complexity CNN architecture for an image-aided binary classifier that detects faults in geotechnical surfaces. The model achieved a satisfactory average accuracy of 94.26%. An AUC of 0.99 from the receiver operating characteristic (ROC) curve and the confusion matrix on a testing dataset confirm satisfactory results and suggest that the model's ability to distinguish between the classes 'damage' and 'intact' is excellent, enabling the identification of failure indicators. Early detection of failure indicators on slope surfaces can facilitate proper maintenance and alarms and prevent disasters, as the integrity of the soil directly affects the structures built around and above it.
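The reported AUC corresponds to a simple rank statistic: the probability that a randomly chosen 'damage' image is scored above a randomly chosen 'intact' one. A minimal sketch with made-up scores:

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive is scored above
    a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = damage, 0 = intact) and classifier scores.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.7, 0.1]))  # → 1.0
```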
- Book Chapter
20
- 10.1007/978-3-030-67667-4_28
- Jan 1, 2021
An essential task in predictive maintenance is the prediction of the Remaining Useful Life (RUL) through the analysis of multivariate time series. Using the sliding window method, Convolutional Neural Network (CNN) and conventional Recurrent Neural Network (RNN) approaches have produced impressive results on this task, due to their ability to learn optimized features. However, sequence information is only partially modeled by CNN approaches, and due to the flattening mechanism in conventional RNNs, such as Long Short-Term Memory (LSTM) networks, the temporal information within a window is not fully preserved. To exploit multi-level temporal information, many approaches combining CNN and RNN models have been proposed. In this work, we propose a new LSTM variant called the embedded convolutional LSTM (ECLSTM), in which a group of different 1D convolutions is embedded into the LSTM structure; through this, the temporal information is preserved both between and within windows. Since the hyperparameters of such models require careful tuning, we also propose an automated prediction framework based on Bayesian optimization with the Hyperband optimizer, which allows efficient optimization of the network architecture. Finally, we show the superiority of the proposed ECLSTM approach over state-of-the-art approaches on several widely used benchmark data sets for RUL estimation.
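The sliding-window setup for RUL estimation pairs each window with the cycles remaining after it; a minimal sketch of that labeling step (the window width below is an assumption, not a benchmark setting):

```python
def rul_windows(series, width):
    """Build (window, RUL) training pairs from one run-to-failure
    series: RUL = time steps remaining after the window's last step."""
    n = len(series)
    return [(series[s:s + width], n - (s + width))
            for s in range(n - width + 1)]

# 5 time steps of 2 sensors each; width-3 windows get RULs 2, 1, 0.
pairs = rul_windows([[0, 0], [1, 1], [2, 2], [3, 3], [4, 4]], 3)
print([rul for _, rul in pairs])  # → [2, 1, 0]
```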
- Conference Article
8
- 10.1109/dasc/picom/cbdcom/cyberscitech.2019.00157
- Aug 1, 2019
This paper proposes a method for classifying Japanese Sign Language (JSL) using a combined gathered-image generation technique and a convolutional neural network (CNN) approach. In combined gathered-image generation, the maximum difference from the previous and next images is calculated for each block, and the information of the block with the maximum difference is embedded into an image across all blocks. After the information from all images of a word has been gathered into a single image, CNNs are used to extract features for the classification of JSL words. A multi-class support vector machine (SVM) then classifies words related to greeting and requesting. The mean and standard deviation of the recognition accuracy of the proposed method were experimentally shown to be 84.2% and 4%, respectively. These results suggest that the proposed combined gathered-image generation and CNN approach can provide the information needed to classify 10 JSL words.
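The per-block maximum-difference step can be sketched as follows; the block size and toy frames are illustrative assumptions, and the paper's gathering and embedding details are richer than this.

```python
def max_diff_block(prev, curr, block=2):
    """Return the top-left corner of the block with the largest summed
    absolute difference between two consecutive frames (lists of rows)."""
    h, w = len(curr), len(curr[0])
    best, best_d = (0, 0), -1.0
    for by in range(0, h, block):
        for bx in range(0, w, block):
            d = sum(abs(curr[y][x] - prev[y][x])
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w)))
            if d > best_d:
                best, best_d = (by, bx), d
    return best

prev = [[0] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
curr[2][3] = 5  # motion appears in the bottom-right block
print(max_diff_block(prev, curr))  # → (2, 2)
```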
- Research Article
3
- 10.1029/2023ea003023
- Dec 1, 2023
- Earth and Space Science
Terrestrial water storage anomaly (TWSA), derived from Gravity Recovery and Climate Experiment (GRACE) satellites, has been widely used in hydrology studies. The inversion is commonly achieved by truncating and filtering spherical harmonic coefficients (SHC), whereby the result is characterized by leakage error and low resolution. It remains unclear whether machine learning methods can help resolve this challenging issue. In this study, we present a convolutional neural network (CNN) approach to correct TWSA from GRACE SHC by leveraging the knowledge of the leakage effect determined from global hydrological models (GHMs) and land surface models (LSMs). The CNN approach is implemented in three representative regions in China, that is, the human‐impacted Haihe River Basin, the nature‐impacted Yangtze River Basin and the model‐limited Tibetan Plateau. The results show the following: (a) The recovery performance of CNN at the basin scale is better than that at the grid scale, and the grid‐scale recovery is significantly influenced by the spatial heterogeneity of TWSA and the input GHM/LSMs; (b) The more accurate the GHM/LSMs used for training, the better the recovery performance of CNN; and (c) The trained model retains comparable performance in deriving the TWSA time series from GRACE SHC when compared to that derived from other methods (i.e., scaling factor and mass concentration solutions) with average r = 0.90 and RMSE = 21.50 mm. This study highlights the potential of machine learning to supplement conventional correction methods when deriving the TWSA from GRACE SHC by utilizing signal restoration knowledge learned from multiple accurate GHMs/LSMs.
- Research Article
1
- 10.21608/ijci.2021.63200.1042
- May 8, 2021
- IJCI. International Journal of Computers and Information
Abstract: The world today focuses on protecting human health and combating the outbreak of coronavirus disease (COVID-19), whose extraordinarily contagious infection has disturbed everyone's lives in various ways. For early screening, the reverse transcription polymerase chain reaction (RT-PCR) test is used to examine patients by detecting the RNA material of the virus in their samples. Recent results indicate that applying X-ray and computed tomography (CT) images improves the detection accuracy of this disease. However, the classification of medical images is difficult due to several factors, such as the lack of COVID-19 datasets and the difficulty of identifying the type of infection. Recent research works on COVID-19 detection have been applied to specific datasets; it is therefore vital to validate their performance on various datasets with different imaging conditions. This paper presents a comparison of top-performing CNN models that have recorded the best detection accuracy in image detection and classification: COVID-Net, VGG16, ResNet, Bayesian, DenseNet, and DarkNet. Such CNN approaches can assist medical staff in the early detection of infection. Additionally, we improved the dataset in terms of quality, clarity, and quantity using augmentation techniques. The quantitative results show that DarkNet and COVID-Net yield high detection accuracy when applied to CT and X-ray datasets. We validated our results by training the models on multiple different datasets, using CPU and GPU with various batch sizes and optimizers.