Software is built through a series of changes, and each change risks introducing bugs. Predicting whether a source code change contains a bug could help developers detect and fix the bug as soon as the change is completed, which accelerates the bug-fixing process and saves limited time and human resources. However, because the underlying bug generation process changes over time, the concept that describes bug-introducing patterns drifts, which makes it difficult to predict latent bugs in source code changes accurately, especially in long-term prediction scenarios. To deal with this problem, a feature-based incremental learning framework is proposed. It comprises three components: (1) an incremental discretization method, which transforms the quantitative features in the corpus incrementally, (2) an incremental feature selection method, which continuously maintains a subset of the most informative features, and (3) an incremental classification algorithm, which updates the classifier dynamically and considers the current best subset of features during prediction. The proposed approach is evaluated on three well-known open source systems: Eclipse, Mozilla, and jEdit. The results show that our approach handles concept drift better than the non-incremental method while keeping both precision and recall stable at a suitable level over time. We also implement a prototype of this learning framework and apply it to a real software development scenario.
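The three components described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the specific algorithms, so the equal-width binning over running min/max, the mutual-information ranking, and the naive Bayes classifier used here are all assumptions, as are the class name `IncrementalBugPredictor` and the feature names `lines_added` and `noise`.

```python
import math
from collections import defaultdict


class IncrementalBugPredictor:
    """Hypothetical sketch: discretize, select features, classify, all online."""

    def __init__(self, n_bins=4, top_k=1):
        self.n_bins = n_bins
        self.top_k = top_k
        self.lo, self.hi = {}, {}             # running min/max per feature
        self.label_counts = defaultdict(int)  # label -> count
        self.counts = defaultdict(int)        # (feature, bin, label) -> count
        self.features = set()

    def _discretize(self, name, value):
        # (1) Equal-width bins over the range seen so far (assumed method).
        # Counts recorded under earlier boundaries are not re-binned.
        lo, hi = self.lo[name], self.hi[name]
        if hi == lo:
            return 0
        idx = int((value - lo) / (hi - lo) * self.n_bins)
        return min(max(idx, 0), self.n_bins - 1)

    def _mutual_information(self, name):
        # (2) Score a feature by mutual information between its bin and the label.
        n = sum(self.label_counts.values())
        mi = 0.0
        for b in range(self.n_bins):
            n_b = sum(self.counts.get((name, b, y), 0) for y in self.label_counts)
            for y, n_y in self.label_counts.items():
                n_by = self.counts.get((name, b, y), 0)
                if n_by:
                    mi += n_by / n * math.log(n_by * n / (n_b * n_y))
        return mi

    def selected_features(self):
        # Keep the top_k highest-scoring features; ties break alphabetically.
        ranked = sorted(self.features,
                        key=lambda f: (-self._mutual_information(f), f))
        return ranked[:self.top_k]

    def learn_one(self, x, label):
        # (3) Update the naive Bayes counts with one labeled change.
        self.label_counts[label] += 1
        for name, value in x.items():
            self.lo[name] = min(self.lo.get(name, value), value)
            self.hi[name] = max(self.hi.get(name, value), value)
            self.features.add(name)
            self.counts[(name, self._discretize(name, value), label)] += 1

    def predict_one(self, x):
        # Naive Bayes with Laplace smoothing over the selected features only.
        n = sum(self.label_counts.values())
        selected = [f for f in self.selected_features() if f in x]
        best, best_score = None, -math.inf
        for y, n_y in self.label_counts.items():
            score = math.log(n_y / n)
            for name in selected:
                b = self._discretize(name, x[name])
                seen = self.counts.get((name, b, y), 0)
                score += math.log((seen + 1) / (n_y + self.n_bins))
            if score > best_score:
                best, best_score = y, score
        return best


# Usage: feed changes one at a time, then predict for a new change.
model = IncrementalBugPredictor(n_bins=4, top_k=1)
stream = [(2, "clean"), (100, "buggy"), (5, "clean"),
          (90, "buggy"), (1, "clean"), (95, "buggy")]
for lines_added, label in stream:
    model.learn_one({"lines_added": lines_added, "noise": 7}, label)
print(model.selected_features(), model.predict_one({"lines_added": 97, "noise": 7}))
```

Because every statistic is a running count, each new change updates the discretizer, the feature ranking, and the classifier in a single pass without retraining on the full history. This sketch never forgets old counts, so a setup aimed at long-term concept drift would likely also need some form of decay or windowing, which the abstract does not specify.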