Abstract
Commit messages help developers understand what a code change contains and why it was made. However, poor and even empty commit messages are widespread. To improve commit message quality and development efficiency, many commit message generation methods have been proposed. Nevertheless, previous methods mainly focus on a brief generation problem, in which both the input code change and the output commit message are restricted to short lengths. This restriction raises questions about how well these methods perform in practice. In this paper, we remove the restrictions and move toward a holistic commit message generation problem. In particular, we conduct experiments to evaluate how existing commit message generation methods perform on holistic commit message generation. We choose seven state-of-the-art commit message generation methods and focus on two important scenarios (i.e., the within-project scenario and the cross-project scenario). To support the experiments, we publish HORDA, a holistic commit message dataset whose test data is manually labeled. Our evaluation shows that, when generating holistic commit messages, the IR-based method outperforms non-pre-trained generation-based methods in the within-project scenario, contradicting previous research findings. Further, while pre-trained generation-based methods outperform non-pre-trained generation-based methods, they are still constrained by the limitations of generation models.
More From: ACM Transactions on Software Engineering and Methodology