An Empirical Study on Application of Word Embedding Techniques for Prediction of Software Defect Severity Level

Lov Kumar, Vipul Kocher, Srinivas Padmanabhuni, Sanjay Misra, Mukesh Kumar, Lalita Bhanu Murthy

doi:10.15439/2021f100

Abstract

The severity level of a software defect indicates the impact of the bug on the execution of the software and how quickly the team needs to address it. Development teams regularly analyze bug reports and prioritize the defects. Manual prioritization based on experience can assign an inaccurate severity, which delays the fixing of critical bugs. It is therefore necessary to automate the assignment of an appropriate severity level from bug reports, so that critical bugs are fixed without delay. This work aims to develop defect severity level prediction models that assign a severity level to defects based on bug reports. Seven different word embedding techniques are applied to the defect description to represent each word not just as a number but as a vector in n-dimensional space, in order to reduce the number of features. The predictive ability of the developed models depends on the vectors extracted from the text, since these vectors are the input to the defect severity level prediction models. Three feature selection techniques are therefore applied to find the right set of relevant vectors. The effectiveness of these word embedding techniques and of the different sets of vectors is evaluated using eleven different classification techniques, with the Synthetic Minority Oversampling Technique (SMOTE) used to overcome the class imbalance problem. The experimental results show that word embedding, feature selection techniques, and SMOTE have the ability to predict the severity level of a defect in software.

Highlights

Applying data mining techniques to software repositories, for tasks such as software fault prediction, maintainability prediction, and the analysis of version control systems, source code, and bug archives, is an emerging field that has received significant research interest in recent times
Seven different word embedding techniques are applied to the defect description to represent each word not just as a number but as a vector in n-dimensional space, in order to reduce the number of features
The predictive ability of the developed models depends on the vectors extracted from the text, since these vectors are the input to the defect severity level prediction models

Summary

Introduction

Applying data mining techniques to software repositories, for tasks such as software fault prediction, maintainability prediction, and the analysis of version control systems, source code, and bug archives, is an emerging field that has received significant research interest in recent times. Forrest et al. observed that finding and fixing defects in software is a time-consuming and expensive process. They found that the median time to repair bugs is 190 days for ArgoUML and 200 days for PostgreSQL. Defect severity level prediction has emerged as a novel research field for the effective allocation of resources and for planning defect fixes based on severity level [3]. These models predict the severity level of a defect, which indicates the effect of the defect on the software. Recent research has used different data mining techniques to extract numerical features from defect descriptions, which are then used for defect severity level prediction with machine learning techniques. There are three main technical challenges in building models that predict the proper severity level of defects from the defect description.
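
The two data-preparation steps named in the abstract, turning a defect description into a vector and oversampling the minority severity class, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-dimensional `TOY_VECTORS` table stands in for a trained word embedding, the bug reports and severity labels are invented, and `smote` is a simplified pure-Python version of the SMOTE interpolation idea, not the implementation or the configuration used in the paper.

```python
import math
import random

# Hypothetical 2-d word vectors; a real study would use a trained embedding.
TOY_VECTORS = {
    "crash": [0.9, 0.1],
    "freeze": [0.8, 0.2],
    "typo": [0.1, 0.9],
    "label": [0.2, 0.8],
}


def embed(text):
    """Average the vectors of the known words in a defect description."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=2, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for vec, label in zip(X, y):
        by_class.setdefault(label, []).append(vec)
    target = max(len(vecs) for vecs in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, vecs in by_class.items():
        # A class with a single sample has no neighbour to interpolate with.
        new = smote(vecs, target - len(vecs), k, seed) if len(vecs) > 1 else []
        X_out.extend(new)
        y_out.extend([label] * len(new))
    return X_out, y_out


# Invented bug reports: four "minor" and two "critical" (the minority class).
reports = [
    ("typo in label", "minor"),
    ("label typo", "minor"),
    ("typo", "minor"),
    ("label", "minor"),
    ("crash", "critical"),
    ("freeze then crash", "critical"),
]
X = [embed(text) for text, _ in reports]
y = [severity for _, severity in reports]
X_bal, y_bal = balance(X, y)
```

After `balance` runs, both severity classes have the same number of vectors, and each synthetic "critical" vector lies on the line segment between the two original "critical" vectors. In a full pipeline, the balanced vectors would then pass through feature selection and on to a classifier.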

Publication Date: Sep 26, 2021
Citations: 10	License type: cc-by
