Rules for automatically extracting conceptual diagrams from software requirements have been in use for some time. However, relying on rules alone makes the extraction system complex to handle. Moreover, the rules are predominantly based on syntactic structure, such as Part-of-Speech tags and the Dependency Grammar of sentences, and rarely on semantics. In this paper, we propose to use a probabilistic approach in combination with the rule-based technique and word embeddings to preserve the semantics of the sentence, which reduces the complexity of the extraction procedure. Further, we advocate a divide-and-conquer policy of extraction instead of extracting classes from one entire use case description: we extract class diagrams from small use cases and then merge them to obtain the final class diagram. Because the class diagrams generated for small use cases can be reused in other, similar software designs, this increases scalability and decreases extraction time. The proposed hybrid approach integrates knowledge from experience. It achieves an F1-score of 90%, whereas the F1-scores of existing methods range from 79% to 88%. The proposed hybrid approach also shows a 19.44% reduction in the number of iterations performed to carry out the extraction procedure for individual use cases, which further reduces the complexity of the extraction procedure.
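The divide-and-conquer step, in which class diagrams extracted from small use cases are merged into one, can be illustrated with a minimal sketch. The abstract does not specify the merge procedure, so everything here is an assumption made for illustration: the `ClassDiagram` structure, the `merge_diagrams` function, the rule that classes with the same name are the same class, and the example use cases.

```python
from dataclasses import dataclass, field


@dataclass
class ClassDiagram:
    # Hypothetical representation: class name -> set of attribute names.
    classes: dict[str, set[str]] = field(default_factory=dict)
    # Relations as (source class, relation label, target class) triples.
    relations: set[tuple[str, str, str]] = field(default_factory=set)


def merge_diagrams(diagrams):
    """Union partial diagrams; same-named classes are assumed to be one class."""
    merged = ClassDiagram()
    for diagram in diagrams:
        for name, attributes in diagram.classes.items():
            # Copy into a fresh set so the input diagrams stay reusable.
            merged.classes.setdefault(name, set()).update(attributes)
        merged.relations |= diagram.relations
    return merged


# Two small, made-up use cases, each with its own partial class diagram.
login = ClassDiagram(
    classes={"User": {"username", "password"}, "Session": {"token"}},
    relations={("User", "association", "Session")},
)
checkout = ClassDiagram(
    classes={"User": {"address"}, "Order": {"total"}},
    relations={("User", "association", "Order")},
)

full = merge_diagrams([login, checkout])
```

Because the partial diagrams are left unmodified by the merge, a diagram such as `login` could be reused when assembling the design of another, similar system, which is the reuse benefit the abstract describes.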