Software requirement traceability analysis using text mining methods

Poyraz Umut Hatipoglu, Ali Demir, Oguzhan Sereflisan, Anil Atvar, Yusuf Oguzhan Artan

doi:10.1109/siu.2017.7960424

Abstract

In this study, text mining based methods are proposed for requirement traceability analysis, one of the most essential steps in the software life cycle. The aim is to automate the requirements traceability process for the software architecture, which is currently carried out manually by a data analyst. For this purpose, in addition to the tf-idf and Latent Semantic Indexing/Analysis (LSI/LSA) based approaches commonly used in the literature, requirement-to-design matching is performed using the Latent Dirichlet Allocation (LDA) topic modelling technique and word2vec models. The tf-idf based LSI approach achieves the highest classification accuracy, while the LDA based approach produces somewhat lower classification accuracy than the LSI models. The word2vec + tf-idf method, which achieves better classification accuracy than both the word2vec + BOW and BOW-only models, ranks third in performance.
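To make the requirement-to-design matching idea concrete, the following is a minimal sketch of the tf-idf baseline only: each requirement is linked to the design item with the highest cosine similarity between tf-idf vectors. It is not the authors' implementation. The sample texts, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `trace`), the smoothed idf formula, and the whitespace tokenizer are all assumptions made for illustration, and the LSI, LDA, and word2vec steps described in the abstract are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, keeping alphabetic tokens only (an assumed, simplistic tokenizer)."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf (one common variant, chosen here as an assumption).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def trace(requirements, designs):
    """For each requirement, return the index of the most similar design item."""
    vecs = tfidf_vectors(requirements + designs)
    req_vecs = vecs[:len(requirements)]
    des_vecs = vecs[len(requirements):]
    links = []
    for rv in req_vecs:
        scores = [cosine(rv, dv) for dv in des_vecs]
        links.append(max(range(len(scores)), key=scores.__getitem__))
    return links


# Hypothetical sample data, not taken from the study.
requirements = [
    "the system shall encrypt stored user passwords",
    "the system shall export reports as pdf files",
]
designs = [
    "report module renders pdf files for export",
    "password module hashes and encrypts user passwords before storage",
]

links = trace(requirements, designs)
for req, idx in zip(requirements, links):
    print(f"{req!r} -> {designs[idx]!r}")
```

Because this sketch matches on exact surface tokens, "encrypt" and "encrypts" do not match each other. That limitation is one reason a latent-space projection such as LSI or an embedding model such as word2vec might be layered on top of the tf-idf weights.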

Full Text