Abstract
Requirement traceability is a crucial quality factor that strongly impacts the software evolution process and maintenance costs. Automated traceability link recovery techniques are required for a reliable and low-cost software development life cycle. Pre-trained language models have shown promising results on many natural language tasks; however, applying such models to requirement traceability requires large, high-quality traceability datasets and accurate fine-tuning mechanisms. This paper proposes code augmentation and fine-tuning techniques that prepare the MS-CodeBERT pre-trained language model for several types of requirement traceability prediction, including documentation-to-method, issue-to-commit, and issue-to-method links. Three program transformation operations, namely Rename Variable, Swap Operands, and Swap Statements, are designed to generate new high-quality samples that increase the diversity of the traceability datasets. Two- and three-stage fine-tuning mechanisms are proposed to fine-tune the language model for the three types of requirement traceability prediction on the provided datasets. Experiments on 14 Java projects demonstrate a 6.2% to 8.5% improvement in precision, a 2.5% to 5.2% improvement in recall, and a 3.8% to 7.3% improvement in F1 score over the best results of state-of-the-art methods.
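The three augmentation operations are only named in the abstract. As a rough illustration of the idea, the following Python sketch applies toy, string-based versions of them to a Java method; the function names and the regex approach are our assumptions, not the paper's implementation, and a real tool would transform the parse tree and verify that each edit preserves semantics (e.g., that swapped operands are commutative and swapped statements are independent).

import re

def rename_variable(code: str, old_name: str, new_name: str) -> str:
    """Rename Variable: replace whole-word occurrences of an identifier."""
    return re.sub(rf"\b{re.escape(old_name)}\b", new_name, code)

def swap_operands(code: str) -> str:
    """Swap Operands: flip the operands of a '+' between two simple
    identifiers (safe for numeric types; a real tool would check types,
    since Java String concatenation is not commutative)."""
    return re.sub(r"\b(\w+)\s*\+\s*(\w+)\b", r"\2 + \1", code)

def swap_statements(lines: list[str], i: int, j: int) -> list[str]:
    """Swap Statements: exchange two statements; only valid when no
    data or control dependence exists between them."""
    out = lines.copy()
    out[i], out[j] = out[j], out[i]
    return out

method = "int total(int a, int b) { int s = a + b; return s; }"
print(rename_variable(method, "s", "acc"))  # int acc = a + b; return acc;
print(swap_operands(method))                # int s = b + a; return s;

Each operation yields a syntactically different but behaviorally equivalent method, which is what allows the transformed code to serve as an additional, diverse sample linked to the same requirement.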