Abstract

Software bug report classification is a critical process for understanding the nature, implications, and causes of software failures. Classification also enables a fast and appropriate reaction to software bugs. In large-scale projects, however, one must deal with a broad set of bugs of multiple types, and manually classifying them becomes cumbersome and time-consuming. Although several studies have addressed automated bug classification using machine learning techniques, they have mainly focused on academic case studies, open-source software, and unilingual text input. This paper presents our automated bug classification approach, applied and validated in an industrial case study. In contrast to earlier studies, our study is applied to a commercial software system and is based on unstructured bilingual bug reports written in English and Turkish. The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs. Our results show that bug classification can be automated and that the automated classification even performs better than manual bug classification. Our study also shows that the presented approach and the corresponding tools effectively reduce manual classification time and effort.

Highlights

  • Due to software systems’ increased complexity and size, software failures are inevitable in software development projects

  • This paper presents the results of, and lessons learned from, our bug classification approach and the corresponding tools, applied and validated in an industrial case study

  • The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs


Summary

Introduction

Due to software systems’ increased complexity and size, software failures are inevitable in software development projects. In the machine learning approach, the classifier is trained using labeled data; that is, the correct class is already known for each item in the training data. In the manual bug classification approach, no automated tool is used, and the categorization relies entirely on the decisions of human experts. The reviewer/triager inspects the bug reports and classifies the bugs according to the bug classification schema used in the project. Since the reviewer/triager does not know the root cause of the problem, the decision is based only on the explanations given in the reports. We classified 504 bugs manually before introducing machine learning algorithms. The bugs were classified according to Seaman’s bug categories [11].
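The idea of training on labeled bug reports can be illustrated with a minimal sketch. It is not the implementation used in the study: the choice of a multinomial Naive Bayes classifier, the function names `tokenize`, `train`, and `classify`, the sample reports, and the labels `"logic"` and `"ui"` are all assumptions made for illustration, and the labels are not taken from Seaman’s categories.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on non-word characters. \w is Unicode-aware,
    # so letters such as ş, ğ, ü in Turkish reports are kept.
    return re.findall(r"\w+", text.lower())


def train(reports):
    """Count words per label from (text, label) pairs whose labels are known."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of reports
    vocab = set()
    for text, label in reports:
        tokens = tokenize(text)
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_reports = sum(label_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # ignore unseen words
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_reports)
        # Laplace (add-one) smoothing avoids zero probabilities.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled reports (English and Turkish), for illustration only.
training = [
    ("Application crashes with null pointer when saving record", "logic"),
    ("Wrong total calculated in the invoice report", "logic"),
    ("kaydet butonu ekranda görünmüyor", "ui"),
    ("Button label overlaps the text field on the login screen", "ui"),
]
model = train(training)
print(classify(model, "login screen button is misaligned"))
```

With only four training reports the sketch is far too small to be useful in practice; it only shows how known labels drive the prediction for an unseen report.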

