FAST: Boosting Uncertainty-based Test Prioritization Methods for Neural Networks via Feature Selection

Abstract

Due to the vast testing space, the increasing demand for effective and efficient testing of deep neural networks (DNNs) has led to the development of various DNN test case prioritization techniques. However, the fact that DNNs can deliver high-confidence predictions for incorrectly predicted examples, known as the over-confidence problem, causes these methods to fail to reveal high-confidence errors. To address this limitation, in this work, we propose FAST, a method that boosts existing prioritization methods through guided FeAture SelecTion. FAST is based on the insight that certain features may introduce noise that affects the model's output confidence, thereby contributing to high-confidence errors. It quantifies the importance of each feature for the model's correct predictions, and then dynamically prunes the information from the noisy features during inference to derive a new probability vector for the uncertainty estimation. With the help of FAST, the high-confidence errors and correctly classified examples become more distinguishable, resulting in higher APFD (Average Percentage of Fault Detection) values for test prioritization, and higher generalization ability for model enhancement. We conduct extensive experiments to evaluate FAST across a diverse set of model structures on multiple benchmark datasets to validate the effectiveness, efficiency, and scalability of FAST compared to the state-of-the-art prioritization techniques.
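As a rough illustration of the ideas in the abstract, the sketch below ranks test inputs by an uncertainty score and computes APFD for that ranking. It is not the FAST implementation: the Gini-impurity score, the feature pruning via `keep_mask` on a hypothetical linear classification head (`W`, `b`), and all function names are assumptions made for illustration only. APFD follows the standard definition, 1 - (sum of fault positions) / (k * n) + 1 / (2n), for n tests and k faults.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def masked_probs(features, W, b, keep_mask):
    # Assumption: a linear head on top of intermediate features.
    # Features with keep_mask == 0 are treated as noisy and zeroed out
    # before the probability vector is recomputed.
    return softmax((features * keep_mask) @ W + b)

def gini_uncertainty(probs):
    # Gini impurity: 0 for a one-hot prediction, larger when uncertain.
    return 1.0 - np.sum(probs ** 2, axis=1)

def prioritize(probs):
    # Indices of test inputs, most uncertain first.
    return np.argsort(-gini_uncertainty(probs), kind="stable")

def apfd(order, is_fault):
    # order: test indices in prioritized order.
    # is_fault: boolean array, True where the model misclassifies the test.
    order = np.asarray(order)
    is_fault = np.asarray(is_fault, dtype=bool)
    n = len(order)
    k = int(is_fault.sum())
    if n == 0 or k == 0:
        raise ValueError("APFD needs at least one test and one fault")
    # 1-indexed positions at which faults appear in the ranking.
    positions = np.flatnonzero(is_fault[order]) + 1
    return 1.0 - positions.sum() / (k * n) + 1.0 / (2 * n)
```

In this sketch, a ranking that places misclassified inputs earlier yields a higher APFD; passing the output of `masked_probs` rather than the raw softmax output to `prioritize` mimics, in a very simplified way, deriving the uncertainty score from a pruned probability vector.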

Similar Papers
  • Research Article
  • Cite Count: 17
  • 10.1016/j.ijcce.2024.05.004
A novel rice plant leaf diseases detection using deep spectral generative adversarial neural network
  • Jan 1, 2024
  • International Journal of Cognitive Computing in Engineering
  • K Mahadevan + 2 more

The farming industry widely requires automatic detection and analysis of rice diseases to avoid wasting financial and other resources, reduce yield loss, improve processing efficiency, and obtain healthy crop yields. The proposed Deep Spectral Generative Adversarial Neural Network (DSGAN2) method is used for detecting rice plant leaf disease. Initially, images of healthy and unhealthy leaves from the collected dataset are fed in as input. Then, an Improved Threshold Neural Network (ITNN) method is applied to enhance the image quality. Next, a Segment Multiscale Neural Slicing (SMNS) algorithm segments the enhanced image to identify the support-intensive color saturation. After that, the Spectral Scaled Absolute Feature Selection (S2AFS) method is applied to select optimal features and the closest weight from the segmented rice plant leaves. Social Spider Optimization then selects features using the Closest Weight (S2O-FCW) algorithm, which analyzes the feature weight values. Finally, the proposed Soft-Max Logistic Activation Function with DSGAN2 algorithm detects rice plant disease based on the selected features. With an accuracy of 97 %, the model helps farmers identify rice plant diseases. The proposed DSGAN2 system produces a lower false rate than the existing systems, for which the reported figures are 55.2 % for ACPSOSVM-Dual Channels Convolutional Neural Network (APS-DCCNN), 50.4 % for AlexNet, and 49.5 % for Convolutional Neural Network (CNN).

  • Conference Article
  • 10.1109/jictee.2014.6804070
The Significant Matrix without Genetic Algorithm for the feature selection (Significant Matrix 2)
  • Mar 1, 2014
  • Ekapong Chuasuwan

This paper presents an improvement of the Significant Matrix [1], which works together with a Genetic Algorithm to select appropriate features for a decision tree structure. This work reduces running time by eliminating the Genetic Algorithm step. The new method, named “Significant Matrix 2”, is calculated from the relationship between categorical data and a class label in order to determine the threshold for feature selection; the resulting sub-dataset contains the appropriate features for building decision trees. In the feature selection timing experiments, the proposed method runs on average 28 times faster than [1]. In the accuracy experiments, decision tree models are built from the features selected by the proposed method and by a neural network method. The proposed method gives an average classification accuracy of 95.9% over the 11 sample databases, and it selects fewer features than the neural network method [6], using only 48.08% of all features in the example datasets. Furthermore, compared with decision trees built from features selected by another method, the proposed method achieves a higher average classification accuracy. Experimental results show that the proposed method not only provides higher accuracy but also reduces complexity by using fewer features of the dataset.

  • Research Article
  • Cite Count: 568
  • 10.1016/j.dss.2010.11.006
Detection of financial statement fraud and feature selection using data mining techniques
  • Nov 12, 2010
  • Decision Support Systems
  • P Ravisankar + 3 more

  • Research Article
  • 10.33480/pilar.v9i1.4
Prediksi Beban Listrik Jangka Pendek Berbasis Backward Elimination (Short-Term Electrical Load Prediction Based on Backward Elimination)
  • Mar 1, 2013
  • Jurnal Pilar Nusa Mandiri
  • Veti Apriana

Short-term electrical load prediction is one way to generate and distribute electrical energy economically, because it lets the provider determine the load and power demand for some time to come. Many researchers have studied short-term electrical load using neural network methods. However, using a data set with many attributes in a neural network can produce an excessive number of neurons and cause an over-generalization phenomenon, so a feature selection process is needed to reduce the attributes in the data set. This research begins by processing daily system load time series data recorded every 30 minutes. The method used is a neural network based on Backward Elimination, with data from January 2012 as input. Several experiments were conducted to obtain the optimal architecture and generate accurate predictions. The results show that the neural network based on Backward Elimination produces a lower RMSE, 0.018, compared with 0.035 for the plain neural network method.

  • Research Article
  • Cite Count: 36
  • 10.1016/j.asoc.2017.08.007
Building selective ensembles of Randomization Based Neural Networks with the successive projections algorithm
  • Aug 15, 2017
  • Applied Soft Computing
  • Diego P.P Mesquita + 4 more

  • Research Article
  • Cite Count: 2
  • 10.1049/2024/9937803
Efficient Intrusion Detection System Data Preprocessing Using Deep Sparse Autoencoder with Differential Evolution
  • Jan 1, 2024
  • IET Information Security
  • Saranya N + 1 more

The rapid technological advancement of the Internet and communications generates a great amount of data and expands the size of networks. These cutting-edge technologies can give rise to novel network attacks that present security risks, so the communication network must be monitored. An intrusion detection system (IDS) is a tool that guards against intrusions by inspecting network traffic to ensure network integrity, confidentiality, availability, and robustness. Many researchers have focused on IDS with machine learning and deep learning approaches to detect intruders. Yet IDS still face challenges in detecting intruders accurately with a reduced false alarm rate, and in feature selection and detection. High-dimensional data affect the effectiveness and efficiency of feature selection methods. Data are preprocessed into a balanced, normalized, and transformed dataset before the feature selection and classification process. Efficient data preprocessing supports overall IDS performance, with an improved detection rate (DR) and a reduced false alarm rate (FAR). Since datasets are required for the various feature dimensions, this article proposes an efficient data preprocessing method that includes a series of techniques: data balancing using SMOTE, data normalization with power transformation, data encoding using one-hot and ordinal encoding, and feature reduction using a proposed deep sparse autoencoder (DSAE) with differential evolution (DE), all applied to the data before feature selection and classification. The efficiency of the transformation methods is evaluated with recursive Pearson correlation-based feature selection and graphical convolution neural network (G-CNN) methods.

  • Book Chapter
  • Cite Count: 1
  • 10.1007/978-3-642-18129-0_89
Bacteria Foraging Based Agent Feature Selection Algorithm
  • Jan 1, 2011
  • Dongying Liang + 2 more

This paper provides an agent genetic algorithm based on a bacteria foraging strategy (BFOA-L) as the feature selection method, and presents a method that combines a link-like agent structure with a neural network based on the bacteria foraging algorithm (BFOA). It introduces the bacteria foraging (BF) action into feature selection and uses the neural network structure to achieve fuzzy logic inference, so that the weights, which have no definite physical meaning in a traditional neural network, are endowed with the physical meaning of fuzzy logic inference parameters. Furthermore, to overcome the defects of traditional optimization methods, it applies the agent link-like competition strategy to the global optimization process to raise the convergence accuracy. The curve tracing test results show that this algorithm has good stability and high accuracy.

  • Research Article
  • Cite Count: 6
  • 10.26798/jiko.2018.v3i2.127
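IMPLEMENTASI PARTICLE SWARM OPTIMIZATION UNTUK OPTIMALISASI DATA MINING DALAM EVALUASI KINERJA ASISTEN DOSEN (Implementation of Particle Swarm Optimization for Data Mining Optimization in the Evaluation of Assistant Lecturer Performance)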
  • Sep 30, 2018
  • Indah Ariyati + 2 more

Existing complaints about the performance of assistant lecturers reflect a lack of adequate competence, which calls for an accurate process for evaluating assistant lecturers' performance against their duties and obligations over a given period. The evaluation process requires a model with improved accuracy, and selecting features efficiently and effectively is a formidable challenge. We therefore propose a particle swarm optimization method to improve the accuracy of neural network methods, which have problems with feature selection; the features are weighted in a detailed analysis by particle swarm optimization together with neural network learning performance. This study aims to find an alternative solution to the complex task of evaluating assistant lecturers, based on parameters obtained from the UCI Machine Repository. The final results show that the particle swarm optimization method increases accuracy to 75.56% from the previous value of 51.75% and increases the kappa value to 0.632 from the previous value of 0.276. The resulting neural network optimized with particle swarm optimization, with its higher accuracy and kappa value, can be used as a periodic control in evaluating the performance of assistant lecturers.

  • Research Article
  • Cite Count: 15
  • 10.11591/ijra.v8i3.pp194-204
Fuzzy neuro-genetic approach for feature selection and image classification in augmented reality systems
  • Sep 1, 2019
  • IAES International Journal of Robotics and Automation (IJRA)
  • Rajendra Thilahar C + 1 more

In this paper, a new approach for implementing an Augmented Reality system by applying fuzzy genetic neural networks is proposed. It consists of two components, namely feature selection and classification modules. For feature detection, extraction, and selection, the proposed model uses a fuzzy logic based incremental feature selection algorithm, proposed in this work, to recognize the important features in 3D images. Moreover, this paper explains the implementation and results of the proposed algorithms for an Augmented Reality system using image recognition, feature extraction, feature selection, and classification by considering the global and local features of the images. For this purpose, we propose a three-layer fuzzy neural network that has been implemented based on weight adjustments using fuzzy rules in convolutional neural networks, with a genetic algorithm for effective optimization of the rules. The classification algorithm is also based on the fuzzy neuro-genetic approach and consists of two phases, namely a training phase and a testing phase. During the training phase, rules are formed based on objects, and these rules are applied during the testing phase to recognize the objects, which can be used in robotics for effective object recognition. The experiments conducted in this work show that the proposed model is more accurate in 3D object recognition.

  • Conference Article
  • Cite Count: 1
  • 10.1109/scored.2003.1459731
A novel feature selection and extraction method for neural network based transfer capacity assessment of power systems
  • Jun 10, 2016
  • M.M Othman + 2 more

A new feature selection and extraction method is presented in this paper for neural network (NN) based available transfer capability assessment in the deregulated power system. The objective of feature selection and extraction is to speed up the NN training process and to achieve more accurate NN results. The proposed method, known as the SDFT method, is a combination of the sensitivity and discrete Fourier transform methods. Sensitivity analysis is first used to select the input features, followed by the discrete Fourier transform (DFT) method to extract the NN input features. The set of pre-selected data produced by the sensitivity method alone offers no improvement in NN training performance in cases where many features are highly correlated. Thus, the DFT method is used to transform the pre-selected data into a set of meaningful extracted data. To illustrate the effectiveness of the proposed method, a comparative study of the SDFT, DFT, and sensitivity methods is made to investigate how effectively each method extracts and selects the NN features. In this study, the NN based available transfer capability assessment has been performed on the Malaysian power system.

  • Research Article
  • Cite Count: 130
  • 10.1109/tcss.2019.2914499
Forecasting Horticultural Products Price Using ARIMA Model and Neural Network Based on a Large-Scale Data Set Collected by Web Crawler
  • Jun 1, 2019
  • IEEE Transactions on Computational Social Systems
  • Yuchen Weng + 5 more

The sales of agricultural products are an important component of the product supply chain. The price of agricultural products, a social signal of product supply and demand, is affected by many factors, such as climate, price, policy, and so on. Due to the asymmetry between production and marketing information, the price of many agricultural products fluctuates greatly. Horticultural products are especially sensitive to price since they are not suitable for long-term storage. Therefore, forecasting the price of horticultural products is very helpful in designing a cropping plan. In this paper, the AutoRegressive Integrated Moving Average (ARIMA) model, the back propagation (BP) network method, and the recurrent neural network (RNN) method were tested for forecasting the price of agricultural products (cucumber, tomato, and eggplant) in the short term (several days) and the long term (several weeks or months). Large-scale price data for agricultural products were collected from the website using web crawler technology. Since ARIMA requires continuous and periodic data, it is suitable for small-scale periodic data. It performed well on average monthly data but not on daily data. In contrast, the neural network methods (including the BP network and RNN) can predict the daily, weekly, and monthly trends of price fluctuation well and are more suitable for large-scale data. It is expected that the deep learning method represented by a neural network will become the mainstream method of agricultural product price forecasting.

  • Research Article
  • Cite Count: 29
  • 10.1016/j.asoc.2016.04.041
Spectral entropy feature subset selection using SEPCOR to detect alcoholic impact on gamma sub band visual event related potentials of multichannel electroencephalograms (EEG)
  • May 21, 2016
  • Applied Soft Computing
  • T.K Padma Shri + 1 more

  • Research Article
  • Cite Count: 50
  • 10.1016/j.prime.2024.100534
Automatic recognition of Rice Plant leaf diseases detection using deep neural network with improved threshold neural network
  • Mar 29, 2024
  • e-Prime - Advances in Electrical Engineering, Electronics and Energy
  • K Mahadevan + 2 more

  • Research Article
  • Cite Count: 13
  • 10.4467/20838476si.16.012.6193
Data Selection for Neural Networks
  • Jan 1, 2017
  • Schedae Informaticae
  • Mirosław Kordos

Several approaches to joint feature and instance selection in neural network learning are discussed and experimentally evaluated with respect to classification accuracy and dataset compression, also considering their computational complexity. These include various versions of feature and instance selection prior to network learning, selection embedded in the neural network, and hybrid approaches, including solutions developed by us. The advantages and disadvantages of each approach are discussed and some possible improvements are proposed. Keywords: Neural Networks, Data Selection, Feature Selection, Instance Selection

  • Research Article
  • Cite Count: 18
  • 10.1016/j.eswa.2024.124560
A novel interpretability machine learning model for wind speed forecasting based on feature and sub-model selection
  • Jun 27, 2024
  • Expert Systems With Applications
  • Zhihao Shang + 4 more
