Abstract
Software security has always been a vital concern for developers. Breaches of software systems can result in significant losses of time and confidential data, and software vulnerabilities are the gateway through which attackers harm information systems. It is therefore crucial to build effective software vulnerability prediction models. Machine learning algorithms produce high-performance prediction models, but their performance is affected by the hyperparameter settings of the machine learning methods and by imbalanced datasets. This study investigates the effect of single- and multi-objective hyperparameter optimization (HPO) on software vulnerability prediction models. The paper proposes an experimental methodology that considers both optimizations and applies eight machine learning methods to open-source public PHP datasets (Drupal, Moodle, and PHPMyAdmin). The experimental results show that, after multi-objective hyperparameter optimization, the highest AUC achieved is 0.9774 and the highest F1-score is 0.9222, outperforming the benchmark studies. Random Forest performed satisfactorily on all three datasets. Single- and multi-objective HPO are compared using the post hoc Tukey's HSD test. Furthermore, the effectiveness of the resampling technique SMOTE is examined.
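To make the described pipeline concrete, the sketch below shows one plausible realization of multi-objective HPO (jointly maximizing AUC and F1-score) for a Random Forest vulnerability predictor, with SMOTE applied to the training split. The library choices (scikit-learn, imbalanced-learn, Optuna), the synthetic stand-in dataset, and all search ranges are illustrative assumptions, not the paper's exact experimental setup.

```python
# Hypothetical sketch: multi-objective HPO (AUC + F1) for a Random Forest
# vulnerability predictor with SMOTE resampling. All parameter ranges and
# the synthetic dataset are illustrative assumptions.
import optuna
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced PHP vulnerability dataset.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Oversample the minority (vulnerable) class on the training split only.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)

def objective(trial):
    # Search space is an assumption made for illustration.
    clf = RandomForestClassifier(
        n_estimators=trial.suggest_int("n_estimators", 50, 500),
        max_depth=trial.suggest_int("max_depth", 2, 30),
        random_state=0,
    ).fit(X_res, y_res)
    proba = clf.predict_proba(X_test)[:, 1]
    # Two objectives, both maximized: AUC and F1-score.
    return roc_auc_score(y_test, proba), f1_score(y_test, proba > 0.5)

# Multi-objective study: Optuna returns a Pareto front instead of one best trial.
study = optuna.create_study(directions=["maximize", "maximize"])
study.optimize(objective, n_trials=50)
print([t.values for t in study.best_trials])  # Pareto-optimal (AUC, F1) pairs
```

A single-objective variant of the same study would pass one direction (e.g., `direction="maximize"` on AUC alone), which is the natural baseline for the comparison the paper draws between the two optimization settings.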