Automated Classification of Unstructured Bilingual Software Bug Reports: An Industrial Case Study Research

Ömer Köksal, Bedir Tekinerdogan

doi:10.3390/app12010338

Open Access

Abstract

Software bug report classification is critical for understanding the nature, implications, and causes of software failures. Furthermore, classification enables a fast and appropriate reaction to software bugs. However, large-scale projects must deal with a broad set of bugs of multiple types. In this context, manually classifying bugs becomes cumbersome and time-consuming. Although several studies have addressed automated bug classification using machine learning techniques, they have mainly focused on academic case studies, open-source software, and unilingual text input. This paper presents our automated bug classification approach, applied and validated in an industrial case study. In contrast to earlier studies, our study is applied to a commercial software system based on unstructured bilingual bug reports written in English and Turkish. The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs. Our results show that bug classification can be automated and can even perform better than manual bug classification. Our study shows that the presented approach and the corresponding tools effectively reduce the manual classification time and effort.

Highlights

Due to software systems’ increased complexity and size, software failures are inevitable in software development projects
This paper presents the results and lessons learned of our developed bug classification approach and the corresponding tools, applied and validated in an industrial case study research
The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs

Summary

Introduction

Due to software systems’ increased complexity and size, software failures are inevitable in software development projects. The machine is trained using labeled data, meaning that the correct class is already known for each item in the training data. In the manual bug classification approach, no automated tool is used, and the categorization relies entirely on the decision of human experts. The reviewer/triager inspects the bug reports and classifies the bugs according to the bug classification schema used in the project. Since the reviewer/triager does not know the root cause of the problem, the decision is based only on the explanations given in the reports. We classified 504 bugs manually before introducing machine learning algorithms. The bugs were classified according to Seaman’s bug categories [11].
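
Training on labeled data, as described above, can be illustrated with a small sketch. The example below is not the classifier used in the study: it assumes a multinomial Naive Bayes model with add-one smoothing, written in plain Python, and the four bug reports, the labels "crash" and "ui", and the names tokenize and NaiveBayesBugClassifier are all hypothetical placeholders (the labels are not Seaman's categories).

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep runs of letters only (digits and punctuation split
    # tokens). Turkish-specific casing rules are NOT handled in this sketch.
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayesBugClassifier:
    """Hypothetical multinomial Naive Bayes classifier for short bug reports."""

    def fit(self, reports, labels):
        # Labeled data: each report comes with its known class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(reports, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = [t for t in tokenize(text) if t in self.vocab]  # drop unseen words
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy bilingual training set (invented for illustration only).
train_reports = [
    "Application crashes with null pointer when saving file",
    "Kaydet butonuna basınca uygulama çöküyor",  # Turkish: app crashes on save
    "Button label overlaps the text field on the settings screen",
    "Ayarlar ekranında buton yazısı taşıyor",  # Turkish: button text overflows
]
train_labels = ["crash", "crash", "ui", "ui"]

clf = NaiveBayesBugClassifier().fit(train_reports, train_labels)
print(clf.predict("uygulama çöküyor"))
print(clf.predict("settings screen button overlaps"))
```

Because English and Turkish words share one vocabulary here, a single model handles both languages without a language-detection step. That is a simplification chosen for the sketch, and a real pipeline would likely need language-aware preprocessing.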

Journal: Applied Sciences
Publication Date: Dec 30, 2021
Citations: 3
License type: CC BY 4.0

Similar Papers

10 - Identification of duplicate bug reports in software bug repositories: a systematic review, challenges, and future scope
Naresh Kumar Nagwani
Data Deduplication Approaches
01 Jan 2020

Mining Publication Papers via Text Mining: A Case Study
Ahmed S Ibrahim ... Mostafa Aref
01 Jan 2020

Demo: Automatically Retrainable Self Improving Model for the Automated Classification of Software Incidents into Multiple Classes
Badal Agrawal ... Mohit Mishra
01 Jul 2021

Software bug severity and priority prediction using SMOTE and intuitionistic fuzzy similarity measure
Rama Ranjan Panda ... Naresh Kumar Nagwani
Applied Soft Computing | VOL. 150
14 Nov 2023
