Abstract
Just‐In‐Time (JIT) defect prediction aims to predict the defect proneness of software changes at the time they are submitted. It has become an active topic in software defect prediction because its predictions are timely and traceable to individual changes. Researchers have proposed many JIT defect prediction approaches. However, these approaches cannot effectively utilise the line labels that mark added or removed lines, and they ignore the noise introduced by defect‐irrelevant files. Therefore, a JIT defect prediction model enhanced by the joint method of line label Fusion and file Filtering (JIT‐FF) is proposed. Firstly, to distinguish added and removed lines while preserving the information in the original software changes, the authors represent the code changes as original, added, and removed code according to the line labels. Secondly, to obtain a semantics‐enhanced code representation, a cross‐attention‐based line label fusion method is proposed that performs complementary feature enhancement. Thirdly, to generate code changes containing fewer defect‐irrelevant files, the authors formalise file filtering as a sequential decision problem and propose a reinforcement learning‐based file filtering method. Finally, based on the generated code changes, CodeBERT‐based commit representation and multi‐layer perceptron‐based defect prediction are performed to identify defective software changes. The experiments demonstrate that JIT‐FF can predict defective software changes more effectively.
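As an illustration of the first step described above, splitting a code change into original, added, and removed code according to line labels, the following is a minimal sketch only. It assumes the change arrives as unified-diff text in which `+` and `-` prefixes act as the line labels. The names `ChangeViews` and `split_by_line_label` are hypothetical, and the abstract does not specify the exact representation JIT‐FF uses, so the choice to keep the labels in the "original" view is an assumption.

```python
from typing import List, NamedTuple


class ChangeViews(NamedTuple):
    """Three views of one code change (hypothetical container, not from the paper)."""
    original: List[str]  # every changed/context line, line labels kept
    added: List[str]     # only added lines, '+' label stripped
    removed: List[str]   # only removed lines, '-' label stripped


def split_by_line_label(diff_lines: List[str]) -> ChangeViews:
    """Split unified-diff lines into original, added, and removed views.

    Simplification: lines starting with '+++ ', '--- ', or '@@' are treated
    as diff headers and skipped, so a removed line whose content itself
    starts with '-- ' would be misclassified as a header.
    """
    original: List[str] = []
    added: List[str] = []
    removed: List[str] = []
    for line in diff_lines:
        if line.startswith(("+++ ", "--- ", "@@")):
            continue  # file and hunk headers carry no code
        original.append(line)  # assumption: the original view keeps labels
        if line.startswith("+"):
            added.append(line[1:])
        elif line.startswith("-"):
            removed.append(line[1:])
    return ChangeViews(original, added, removed)


# Small usage example on a one-hunk diff.
diff = [
    "--- a/f.py",
    "+++ b/f.py",
    "@@ -1,2 +1,2 @@",
    " def f():",
    "-    return 1",
    "+    return 2",
]
views = split_by_line_label(diff)
```

In a pipeline like the one the abstract describes, each of the three views would then be encoded separately before fusion; that encoding and the cross‐attention fusion itself are not sketched here.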