Failure Analysis of Static Analysis Software Module Based on Big Data Tendency Prediction

Jian Zhu, Shi Ying, Qian Li, M Irfan Uddin

doi:10.1155/2021/6660830


Open Access

https://doi.org/10.1155/2021/6660830


Journal: Complexity
Publication Date: Mar 25, 2021
Citations: 2
License type: CC BY 4.0

Affiliation: Wuhan University, Nanning Normal University

Abstract

As software continues to develop, unpredictable problems inevitably arise in computer software or programs and disrupt their normal operation. This paper takes static analysis software as the research object, analyzes the errors or failures caused by potential defects in software modules, and proposes a software analysis method based on big data tendency prediction that uses a stacked noise reduction sparse analyzer to predict software defects. The method learns features from the original defect data: by setting different numbers of hidden layers, sparse regularization parameters, and noise ratios, it directly and efficiently extracts the required features at all levels from software defect data, and then classifies and predicts the extracted features in combination with big data. In experimental tests, the presented method outperforms the comparison method in correct rate, accuracy rate, recall rate, F1-measure, AUC value, and running time, which indicates that the approach predicts failures more accurately and can help eliminate software failures in a timely manner.

Highlights

The encoder is used to extract the features of software defect data. There is no need to define the features in advance; the defect data are simply input into the network. The encoder learns a feature representation of the defect data, and that representation is then classified and predicted by a logistic regression classifier, which achieves a good prediction effect.
The prediction models using the squared error cost function and the cross-entropy cost function are essentially the same in prediction accuracy rate, which remains above 0.8 for both, indicating that both models have high predictive capability. The model using the cross-entropy cost function nevertheless has advantages in prediction accuracy, recall, F1-measure, AUC value, and running time, and its running time is only about 1/3 of that of the model using the squared error cost function.
To verify how the depth of the stacked forest and the number of trees affect the performance of the prediction model, software defect prediction models based on the stacked forest are constructed with depths of one, two, three, and four layers and 500 trees.

Summary

Introduction

Loss Cost Function Based on Cross Entropy. Traditional encoders that use the squared error cost function update their parameters inefficiently. To overcome this, the partial derivative of the loss cost function should be independent of the derivative of the activation function [7], namely:

$$\frac{\partial L}{\partial w} = (z_i - x_i)\, y_i \tag{1}$$

In formula (1), ε is the sparse parameter, w is the weight matrix, b is the bias vector, l is the activation function, and $x_i$ is the training sample set. Compared with the squared error cost function, the cross-entropy cost function has an obvious advantage: its partial derivative is independent of the derivative of the activation function, so it is not affected by saturation of the sigmoid function. It also has the following properties:

(1) Nonnegative: within its domain of definition, its value is nonnegative.
(2) The smaller the difference between the reconstructed data $z_i$ and the input data $x_i$, the closer the cost function is to 0 [10,11,12].
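The claim that the cross-entropy gradient does not depend on the derivative of the activation function can be checked numerically. The following is a minimal sketch, not the paper's implementation, for a single sigmoid output unit z = sigmoid(w*y + b). It assumes that y is the hidden-layer value feeding that unit and that x is the input value being reconstructed; all function names are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(a: float) -> float:
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-a))


def squared_error(w, b, y, x):
    """Loss 0.5 * (z - x)^2 for one sigmoid unit z = sigmoid(w*y + b)."""
    z = sigmoid(w * y + b)
    return 0.5 * (z - x) ** 2


def cross_entropy(w, b, y, x):
    """Loss -[x*log(z) + (1 - x)*log(1 - z)] for the same unit."""
    z = sigmoid(w * y + b)
    return -(x * math.log(z) + (1.0 - x) * math.log(1.0 - z))


def grad_squared_error(w, b, y, x):
    """dL/dw for squared error: carries the sigmoid derivative z*(1 - z)."""
    z = sigmoid(w * y + b)
    return (z - x) * z * (1.0 - z) * y


def grad_cross_entropy(w, b, y, x):
    """dL/dw for cross entropy: the sigmoid derivative cancels out."""
    z = sigmoid(w * y + b)
    return (z - x) * y


def numeric_grad(loss, w, b, y, x, h=1e-6):
    """Central finite difference in w, used only to check the formulas."""
    return (loss(w + h, b, y, x) - loss(w - h, b, y, x)) / (2.0 * h)


# Assumed example values: a saturated unit (large pre-activation)
# whose target lies at the opposite end of the sigmoid range.
w, b, y, x = 8.0, 0.0, 1.0, 0.0
print("squared error gradient:", grad_squared_error(w, b, y, x))
print("cross-entropy gradient:", grad_cross_entropy(w, b, y, x))
```

In the saturated case, the squared error gradient is scaled down by the factor z(1 - z), which is close to zero, while the cross-entropy gradient stays proportional to the reconstruction error z - x. This is one plausible reading of why the source reports faster training with the cross-entropy cost function.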

Methods

Results

Conclusion