Abstract

Software risk prediction is the most sensitive and crucial activity of the Software Development Life Cycle (SDLC). It may lead to the success or failure of a project, so risks should be predicted early to make a software project successful. A model is proposed for the prediction of software requirement risks using a requirement risk dataset and machine learning techniques. In addition, a comparison is made among multiple classifiers, namely K-Nearest Neighbour (KNN), Averaged One-Dependence Estimators (A1DE), Naïve Bayes (NB), Composite Hypercube on Iterated Random Projection (CHIRP), Decision Table (DT), Decision Table/Naïve Bayes Hybrid Classifier (DTNB), Credal Decision Trees (CDT), Cost-Sensitive Decision Forest (CS-Forest), J48 Decision Tree (J48), and Random Forest (RF), to identify the technique best suited to the model given the nature of the dataset. These techniques are evaluated using various evaluation metrics, including Correctly Classified Instances (CCI), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE), Root Relative Squared Error (RRSE), precision, recall, F-measure, Matthews Correlation Coefficient (MCC), Receiver Operating Characteristic area (ROC area), Precision-Recall Curve area (PRC area), and accuracy. The overall outcome of this study shows that, in terms of reducing error rates, CDT outperforms the other techniques, achieving 0.013 for MAE, 0.089 for RMSE, 4.498% for RAE, and 23.741% for RRSE. However, in terms of increasing accuracy, DT, DTNB, and CDT achieve better results.
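The four error measures named in the abstract can be illustrated with a minimal sketch. It assumes the common textbook definitions applied to paired numeric values, with the relative measures scaled against a baseline that always predicts the mean of the actual values. The function name `error_metrics` and the sample numbers are hypothetical, and the tooling used in the study may compute these quantities differently (for example, over predicted class probabilities).

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE (%) and RRSE (%) for paired numeric values.

    Assumption: textbook definitions; the relative errors compare the model
    against a baseline that always predicts the mean of `actual`.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")

    mean_actual = sum(actual) / n

    # Total absolute and squared error of the model.
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Total absolute and squared error of the mean-predicting baseline.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    if base_abs == 0:
        raise ValueError("relative errors are undefined when all actual values are equal")

    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": 100.0 * abs_err / base_abs,
        "RRSE": 100.0 * math.sqrt(sq_err / base_sq),
    }


if __name__ == "__main__":
    # Hypothetical values: 1 = risky requirement, 0 = not risky,
    # with predictions given as scores between 0 and 1.
    scores = error_metrics([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, lower values are better for all four measures, and an RAE or RRSE below 100% means the model beats the mean-predicting baseline.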

Highlights

  • The development of software consistently faces uncertain events that may have a negative impact on its success; such events are called software risks [1]

  • K-Nearest Neighbour (KNN), A1DE, Naïve Bayes (NB), Composite Hypercube on Iterated Random Projection (CHIRP), Decision Table (DT), Decision Table/Naïve Bayes Hybrid Classifier (DTNB), Credal Decision Trees (CDT), CS-Forest, J48, and Random Forest (RF) are employed on the risk dataset

  • The evaluation is based on CCI, Mean Absolute Error (MAE) [20,21], Root Mean Square Error (RMSE) [20,21,22], Relative Absolute Error (RAE) [23], Root Relative Squared Error (RRSE) [23], precision [21,24], recall [21], F-measure [21], Matthews Correlation Coefficient (MCC) [25,26], ROC area [27], and PRC area [28]
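The classification measures listed above can be sketched from a confusion matrix. The example below is a minimal illustration that assumes a two-class problem (risky versus not risky) for simplicity; the function name `binary_metrics` and the counts are hypothetical and are not taken from the study, whose dataset may have more than two classes.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute CCI, accuracy, precision, recall, F-measure and MCC.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts of a two-class confusion matrix.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    cci = tp + tn  # correctly classified instances
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # Matthews Correlation Coefficient; defined as 0 here when any
    # marginal total is zero (a common convention).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "CCI": cci,
        "accuracy": cci / total,
        "precision": precision,
        "recall": recall,
        "F-measure": f_measure,
        "MCC": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for 100 requirements.
    for name, value in binary_metrics(tp=40, fp=10, fn=5, tn=45).items():
        print(f"{name}: {value}")
```

ROC area and PRC area are omitted from this sketch because they need ranked prediction scores rather than a single confusion matrix.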


Summary

Introduction

The development of software consistently faces uncertain events that may have a negative impact on its success; such events are called software risks [1]. These risks have a vital impact on software requirements. Software projects are completed on time and within budget. Among the completed projects, many are no more than a mere shadow of their original specification requirements. According to the literature, the reactive strategy is not a mature strategy for assessing risk, since it increases the budget, schedule, and resources while degrading the quality and success of the project. Predicting risks at the requirements stage is more beneficial and raises software productivity. When risks are managed properly in the requirements gathering phase, the probability of software project failure is reduced [1].

