Abstract

Resource release bugs are a common and serious type of programming bug. However, they are hard to catch with static detection because comprehensive prior knowledge about the release functions is lacking. In this paper, a resource release bug detection method is proposed that applies analogical reasoning to word vectors. First, the functions of the target source code are encoded into word vectors using the word embedding technique from natural language processing. Second, a two-stage reasoning method is developed to automatically identify unknown resource release functions from a few well-known seed functions. The 3CosAvg algorithm is employed for the first stage, and a new algorithm, called 3CosAddExchange, is designed for the second. Finally, the identified release functions are translated into static analysis rules to detect potential bugs. The experiments show that the proposed method is effective and efficient for large-scale software projects. Five previously unknown resource release bugs were detected in the Linux kernel and confirmed by kernel developers.

Highlights

  • In this paper, the word embedding technique, for example, Word2vec [2], is introduced to address the problem

  • In view of the above, we propose a resource-release-related bug detection method based on analogical reasoning

  • The obtained functions are configured into the detection rules of the static detection tool to find resource release bugs in the target project. The proposed method has been applied to the Linux kernel. Through the two-stage reasoning, we identified hundreds of potential release-related functions and successfully detected five unknown bugs, which have been confirmed by the kernel developers. The experimental results show that our method is effective and efficient and can be employed for analyzing and detecting bugs in large-scale software projects
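To illustrate how an identified allocate-release pair could serve as a detection rule, the following is a minimal sketch and not the static detection tool used in the paper. It assumes each execution path has already been reduced to a list of called function names; `foo_alloc`, `foo_use`, and `foo_free` are hypothetical names, and a real analyzer would also track which value is allocated and where ownership is transferred.

```python
def find_unreleased(paths, pairs):
    """Flag allocations that have no later matching release on the same path.

    paths: list of call sequences, one per execution path (assumed input format).
    pairs: dict mapping an allocate function name to its release function name.
    Returns a list of (path_index, allocate_name) findings.
    """
    findings = []
    for index, calls in enumerate(paths):
        for pos, name in enumerate(calls):
            release = pairs.get(name)
            # Report the allocation if its release never appears afterwards.
            if release is not None and release not in calls[pos + 1:]:
                findings.append((index, name))
    return findings


# Hypothetical rule and two toy paths: the second path leaks.
rules = {"foo_alloc": "foo_free"}
toy_paths = [
    ["foo_alloc", "foo_use", "foo_free"],
    ["foo_alloc", "foo_use"],
]
print(find_unreleased(toy_paths, rules))
```

Only the second path is reported, because it calls `foo_alloc` without a later `foo_free`.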


Summary

Introduction

The function call sequences of the target project can be used as training samples (i.e., sentences) to train a word embedding model. Unlike analogical reasoning in NLP, we first identify potential unknown resource allocate-release function pairs instead of reasoning about the unknown resource release functions directly. The function call sequences are extracted from the target project to train a word embedding model and encode all functions as vectors. With a small number of well-known resource allocate-release function pairs, the potential resource-release-related functions are identified via a two-stage reasoning method. The obtained functions are configured into the detection rules of the static detection tool to find resource release bugs in the target project. The proposed method has been applied to the Linux kernel. Through the two-stage reasoning, we identified hundreds of potential release-related functions and successfully detected five unknown bugs, which have been confirmed by the kernel developers. The experimental results show that our method is effective and efficient and can be employed for analyzing and detecting bugs in large-scale software projects.
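The reasoning from seed pairs described above can be sketched with 3CosAvg, which averages the offset between the seed allocate vectors and the seed release vectors, adds it to a query vector, and ranks candidates by cosine similarity. This is a toy illustration only: the three-dimensional vectors are hand-made rather than trained embeddings, `foo_alloc`, `foo_free`, and `foo_init` are hypothetical names, and the paper's second-stage algorithm (3CosAddExchange) is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def mean(vecs):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def three_cos_avg(vectors, seed_pairs, query, topn=1):
    """Rank candidate release functions for `query` using the averaged seed offset."""
    alloc_mean = mean([vectors[a] for a, _ in seed_pairs])
    release_mean = mean([vectors[r] for _, r in seed_pairs])
    # Target = query vector shifted by the average allocate-to-release offset.
    target = [q + r - a for q, r, a in zip(vectors[query], release_mean, alloc_mean)]
    scored = [(cosine(vec, target), name)
              for name, vec in vectors.items() if name != query]
    scored.sort(reverse=True)
    return [name for _, name in scored[:topn]]


# Toy vectors (assumption): the first component loosely encodes "release-ness".
vectors = {
    "kmalloc": [0.0, 1.0, 0.0], "kfree": [1.0, 1.0, 0.0],
    "vmalloc": [0.0, 0.0, 1.0], "vfree": [1.0, 0.0, 1.0],
    "foo_alloc": [0.0, 1.0, 1.0], "foo_free": [1.0, 1.0, 1.0],
    "foo_init": [0.0, 1.0, 0.9],
}
seeds = [("kmalloc", "kfree"), ("vmalloc", "vfree")]
print(three_cos_avg(vectors, seeds, "foo_alloc"))
```

With these toy vectors the top-ranked candidate for `foo_alloc` is `foo_free`. In practice the vectors would come from a word embedding model trained on the extracted call sequences.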

