Software Requirement Specification (SRS) describes a software system to be developed that captures the functional, non-functional, and technical aspects of the stakeholder’s requirements. Retrieval and extraction of software information from SRS are essential to the development of software product line (SPL). Albeit Natural Language Processing (NLP) techniques, such as information retrieval and standard machine learning, have been advocated in the recent past as a semi-automatic means of optimising requirements specifications, they have not been widely embraced. The complexity in the organization’s information makes requirement analysis intricately a challenging task. The interdependence of subsystems and within an organisation drives this complexity. A plain multi-class classification framework may not address this issue. Hence, this paper propounds an automated non-exclusive approach for classification of functional requirements from SRS, using a deep learning framework. Specifically, Word2Vec and FastText word embeddings are utilised for document representation for training a convolutional neural network (CNN). The study was carried out by the compilation of manually categorised relevant enterprise data (AUTomotive Open System ARchitecture (AUTOSAR)), which were also employed for model training. Over a convolutional neural network, the impact of data trained with Word2Vec and FastText word embeddings from SRS documentation were compared to pre-trained word embeddings models, available online.
Read full abstract