MTLink: Adaptive multi-task learning based pre-trained language model for traceability link recovery between issues and commits

Yang Deng, Bangchao Wang, Qiang Zhu, Junping Liu, Jiewen Kuang, Xingfu Li

doi:10.1016/j.jksuci.2024.101958

Open Access

https://doi.org/10.1016/j.jksuci.2024.101958

Abstract

Recovering traceability links between issues and commits, known as issue-commit link recovery (ILR), plays a significant role in software maintenance by enhancing developers’ observability in practice. Recent advances in large language models, particularly pre-trained models, have improved the effectiveness of automated ILR. However, the large parameter counts and long training times of these models pose challenges in large software projects. In addition, existing methods often overlook the associations and distinctions among artifacts, which leads to erroneous links. To mitigate these problems, this paper proposes a novel link recovery method called MTLink. It uses multi-teacher knowledge distillation (MTKD) to compress the model and employs an adaptive multi-task strategy to reduce information loss and improve link accuracy. Experiments are conducted on four open-source projects. The results show that (i) MTLink outperforms state-of-the-art methods; (ii) multi-teacher knowledge distillation maintains accuracy despite the reduction in model size; and (iii) the adaptive multi-task tracing method effectively handles confusion caused by similar artifacts and balances each task. In conclusion, MTLink offers an efficient solution for ILR in software traceability. The code is available at https://zenodo.org/records/10321150.
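
To make the idea of multi-teacher knowledge distillation concrete, the following is a minimal, generic sketch and not the MTLink implementation. It assumes a binary link / no-link classifier, equal weighting of teachers, and a fixed temperature and mixing factor; the function names (`softmax`, `kl_divergence`, `multi_teacher_kd_loss`) and the parameters `temperature` and `alpha` are hypothetical choices made for illustration only. The abstract does not specify how MTLink combines its teachers or adapts its task weights.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=2.0, alpha=0.5):
    """Generic distillation loss with several teachers (illustrative only).

    Assumption: teachers are averaged with equal weights; the weighting
    actually used by MTLink is not described in the abstract.
    """
    # Soften each teacher's prediction, then average across teachers.
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    n_teachers = len(teacher_probs)
    avg_teacher = [sum(col) / n_teachers for col in zip(*teacher_probs)]

    # Soft loss: the student imitates the averaged teacher distribution.
    student_soft = softmax(student_logits, temperature)
    soft_loss = kl_divergence(avg_teacher, student_soft) * temperature ** 2

    # Hard loss: ordinary cross-entropy against the ground-truth label.
    hard_loss = -math.log(softmax(student_logits)[label] + 1e-12)

    return alpha * soft_loss + (1 - alpha) * hard_loss


# Two hypothetical teachers that both favour class 0 ("link").
teachers = [[2.0, 0.0], [3.0, -1.0]]
agreeing = multi_teacher_kd_loss([2.5, -0.5], teachers, label=0)
disagreeing = multi_teacher_kd_loss([-2.0, 2.0], teachers, label=0)
```

In this sketch a student whose logits agree with the teachers and the label (`agreeing`) receives a smaller loss than one that contradicts them (`disagreeing`), which is the signal that lets a smaller student model approximate larger teachers.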

Journal: Journal of King Saud University - Computer and Information Sciences
Publication Date: Jan 30, 2024
Citations: 1
License type: CC BY-NC-ND
