Abstract
Predicting the fix time of a bug is important for managing the resources and release milestones of a software development project. However, achieving high accuracy in bug-fix time prediction is considered non-trivial. We attribute this difficulty to the lack of continuous, posterior estimation based on the developers' activities that follow the initial bug report. In this paper, we formulate bug-fix time prediction as a continual update of estimates as more activities are observed. Logs of bug-related activities that are streamed to a bug tracking system change the bug reports, which enables us to recalculate predictions over time. To this end, we propose DASENet, a deep learning-based two-staged activity stream embedding model that employs (i) a merged network for extracting contextual features across different types of logs, and (ii) a sequence network for exploring temporal relations among the logs. Through experiments with bug tracking system datasets from open source projects including Firefox, Chromium, and Eclipse, we show that DASENet achieves stable performance; for example, on the Firefox dataset its top-1 accuracy is 4.6 to 8.5 % higher than that of other state-of-the-art works. Our approach also provides a transferable structure that yields robust performance with a small dataset for different tasks: the DASENet model trained with a small dataset of about 900 samples (2 % of a full dataset) performs competitively with the other models trained on a full dataset. To the best of our knowledge, we are the first to employ deep learning on log streams in the context of bug-fix time prediction.
Highlights
Data in a bug tracking system are frequently used as an essential part of managing the schedule, quality, and resources of software development in both industry practice and academic literature.
Several researchers have investigated the problem of predicting bug-fix times, and most of this research has addressed the problem by performing either regression [17], [23]–[25] or classification [18]–[22] based on features extracted from the attributes of bug reports.
In contrast to the purpose of those deep learning-based approaches, we focus on the continual prediction of bug-fix times and adapt deep neural networks for analyzing log streams of bug-related activities.
Summary
Y. Lee et al.: Continual Prediction of Bug-Fix Time Using Deep Learning-Based Activity Stream Embedding.
Data in a bug tracking system are frequently used as an essential part of managing the schedule, quality, and resources of software development in both industry practice and academic literature. To our knowledge, we are the first to propose a continual approach for bug-fix time prediction that exploits deep learning techniques with data streams of bug-related activity logs. We consider the heterogeneity of log types, and we develop joint learning and a merged network. This network structure facilitates automated feature extraction from various types of bug-related activity logs by combining a set of individual embedding networks, each structured for a specific log type. We propose a two-staged learning model, DASENet (Deep learning-based Activity Stream Embedding Network), that leverages the integrated use of a merged network and a sequence network; the former combines different types of logging data into a per-day activity summation, and the latter generates an embedding that reflects all the accumulated per-day activities in a common vector space. We present a data- and time-efficient transfer procedure for variant tasks, which leverages the activity stream embedding of DASENet.
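The two-staged structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the log types, all dimensions, the concatenation used to merge per-type embeddings, and the plain recurrent update standing in for the sequence network are assumptions made only for illustration, and the weights are random and untrained. The sketch shows only the data flow: each day's heterogeneous logs are merged into one per-day vector, the accumulated embedding is updated, and a fresh prediction is emitted after every day.

```python
import math
import random

random.seed(0)

# Hypothetical log types mapped to raw feature sizes (not from the paper).
LOG_TYPES = {"comment": 4, "status_change": 3, "attachment": 2}
EMBED = 3    # per-type embedding size (assumed)
HIDDEN = 5   # size of the accumulated activity stream embedding (assumed)
CLASSES = 3  # number of fix-time buckets (assumed)


def rand_matrix(rows, cols):
    """Random, untrained weights; a real model would learn these."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Stage 1 (merged network stand-in): one embedding per log type,
# concatenated into a single per-day activity vector.
W_TYPE = {t: rand_matrix(EMBED, dim) for t, dim in LOG_TYPES.items()}


def merge_day(day_logs):
    merged = []
    for log_type in LOG_TYPES:
        merged += [math.tanh(v) for v in matvec(W_TYPE[log_type], day_logs[log_type])]
    return merged


# Stage 2 (sequence network stand-in): a plain recurrent update that
# folds each per-day vector into the accumulated embedding.
W_IN = rand_matrix(HIDDEN, EMBED * len(LOG_TYPES))
W_REC = rand_matrix(HIDDEN, HIDDEN)
W_OUT = rand_matrix(CLASSES, HIDDEN)


def continual_predictions(stream):
    """Return one class distribution per day, updated as logs arrive."""
    hidden = [0.0] * HIDDEN
    outputs = []
    for day_logs in stream:
        x = merge_day(day_logs)
        pre = [a + b for a, b in zip(matvec(W_IN, x), matvec(W_REC, hidden))]
        hidden = [math.tanh(p) for p in pre]
        outputs.append(softmax(matvec(W_OUT, hidden)))
    return outputs


# Four days of synthetic activity logs for one bug report.
stream = [
    {t: [random.random() for _ in range(dim)] for t, dim in LOG_TYPES.items()}
    for _ in range(4)
]
preds = continual_predictions(stream)
for day, dist in enumerate(preds, start=1):
    print(day, [round(p, 3) for p in dist])
```

Because the accumulated embedding is kept separate from the output layer, a transfer setup in the spirit of the summary could, under the same assumptions, freeze the two embedding stages and fit only a new output layer for a variant task.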