This study addresses the challenge of requirements-to-code traceability by proposing a novel model, Genetic Algorithm-XGBoost With Code Dependency (GA-XWCoDe), which integrates eXtreme Gradient Boosting (XGBoost) with a Node2Vec model-weighted code dependency strategy and genetic algorithms for parameter optimisation. XGBoost mitigates overfitting and enhances model stability, while Node2Vec improves prediction accuracy for low-confidence links. Genetic algorithms are employed to optimise model parameters efficiently, reducing the resource intensity of traditional methods. Experimental results show that GA-XWCoDe outperforms the state-of-the-art method TRAceability lInk cLassifier (TRAIL) by 17.44% and Deep Forest for Requirement traceability (DF4RT) by 33.36% in terms of average F1 performance across four datasets. It is significantly superior to all baseline methods at a confidence level of α¡0.01 and demonstrates exceptional performance and stability across various training data scales.
Read full abstract