Abstract

Software Requirement Specification (SRS) documents describe the requirements and expectations attributed to software products. The structured text in SRS documents guides developers in defining the various functions built during software development. Software-specific entity extraction is an important pre-processing step for various Natural Language Processing (NLP) tasks in the requirements engineering domain, such as entity-centric search systems, SRS document summarization, requirement classification, and requirement quality management. Recent advances in transformer-based models have significantly contributed to NLP and information retrieval, achieving state-of-the-art performance on domain-specific entity extraction tasks. In this study, we employ the transformer models BERT, RoBERTa, and ALBERT for software-specific entity extraction. For this purpose, we annotate three requirement datasets, namely DOORS, SRE, and RQA, with varied sets of software-specific entities. Our numerical study shows that transformer models outperform traditional approaches such as ML-CRF; in particular, BERT variants improve F1-scores by 4% and 5% on the DOORS and SRE datasets, respectively. We conduct entity-level error analysis to examine partial and exact matching of entities and their boundaries. Lastly, we experiment with few-shot learning to create sample-efficient NER systems with a template-based BART model.
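To make the setup concrete, the sketch below shows how a BERT-style encoder can be applied to token-level entity tagging with the Hugging Face transformers library. This is an illustrative sketch only, not the paper's exact pipeline: the tag set (B-/I-SOFTWARE_ENTITY), the bert-base-cased checkpoint, and the example requirement sentence are all assumptions for demonstration, and the classification head shown is untrained until fine-tuned on the annotated SRS data.

```python
# Illustrative sketch of token-classification NER with a BERT-style encoder.
# Assumptions (not from the paper): tag set, checkpoint, and example sentence.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-SOFTWARE_ENTITY", "I-SOFTWARE_ENTITY"]  # hypothetical tag set

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={l: i for i, l in enumerate(labels)},
)

# Hypothetical requirement sentence; in practice each annotated SRS sentence
# would be tokenized and the model fine-tuned before inference.
sentence = "The system shall log every failed login attempt."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, num_labels)
predictions = logits.argmax(dim=-1)[0]       # per-token label ids

# Print one predicted tag per wordpiece token (random until fine-tuned).
for token, pred in zip(
    tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), predictions
):
    print(f"{token:15s} {labels[pred]}")
```

The same token-classification interface applies to RoBERTa and ALBERT checkpoints by swapping the model name, which is one reason encoder-based fine-tuning is a common baseline for domain-specific NER.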
