Abstract

Developers write pull request (PR) descriptions to explain the modifications made in a PR and the reasons behind them. Although PRs help developers improve development efficiency, developers often neglect to write descriptions for them. To alleviate this problem, researchers generally use text summarization models to generate PR descriptions automatically. However, current RNN-based models still face challenges such as low efficiency and out-of-vocabulary (OOV) words, which may limit further performance improvements. To break this bottleneck, we propose a novel model that targets these challenges, named PRHAN (Pull Requests Description Generation Based on Hybrid Attention Network). Specifically, the core of PRHAN is a hybrid attention network, which executes faster than RNN-based models. Moreover, we address the OOV problem by using the byte-pair encoding algorithm to build a vocabulary at the sub-word level; such a vocabulary can represent OOV words by combining sub-word units. To reduce the sensitivity of the model, we incorporate a simple but effective method, label smoothing, into the cross-entropy loss function. We choose three baselines, LeadCM, Transformer, and the state-of-the-art model built by Liu et al., and evaluate all models on an open-source dataset using ROUGE, BLEU, and human evaluation. The experimental results demonstrate that PRHAN is more effective than the baselines. Moreover, PRHAN executes faster than the state-of-the-art model proposed by Liu et al.
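As a rough illustration of the label smoothing idea mentioned in the abstract, the following is a minimal sketch of a label-smoothed cross-entropy loss for a single prediction. The abstract does not give the smoothing strength or the exact variant used in PRHAN, so the value `epsilon=0.1`, the uniform smoothing distribution, and all function names here are assumptions chosen for illustration only.

```python
import math


def smoothed_targets(true_index, num_classes, epsilon=0.1):
    """Return a smoothed target distribution (assumed uniform variant).

    Every class receives epsilon / num_classes, and the true class
    additionally receives 1 - epsilon, so the values sum to 1.
    """
    base = epsilon / num_classes
    return [
        base + (1.0 - epsilon if i == true_index else 0.0)
        for i in range(num_classes)
    ]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def label_smoothed_cross_entropy(logits, true_index, epsilon=0.1):
    """Cross-entropy between the smoothed targets and the model distribution."""
    log_probs = log_softmax(logits)
    targets = smoothed_targets(true_index, len(logits), epsilon)
    return -sum(t * lp for t, lp in zip(targets, log_probs))


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]  # hypothetical scores over a 3-token vocabulary
    print(label_smoothed_cross_entropy(logits, true_index=0, epsilon=0.0))
    print(label_smoothed_cross_entropy(logits, true_index=0, epsilon=0.1))
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy. With a positive `epsilon`, a small share of the target probability is spread over the other tokens, so the loss no longer rewards pushing the predicted probability of the reference token all the way to 1.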
