Abstract
This paper presents an architecture that applies document similarity measures to the documentation produced during software development in order to recommend process and people metrics for similar projects. The application judges the similarity of the Service Provision Offer (SPO) document of a newly proposed project to a collection of Project History Documents (PHD) stored in a repository of unstructured texts. The process is carried out in three stages. First, the Offer document is clustered with the set of PHDs most similar to it; this gives an initial indication of whether similar previous projects exist, and establishes similarity. Second, the PHD in that set which is most comparable with the Offer document is determined from several parameters: project effort, project duration (time), project resources (members/size of team), costs, and sector(s) involved; this establishes the comparability of projects. The comparable parameters are extracted using ...
Highlights
The importance of software in today’s industry is without doubt
In order to set the scene for the use case, it is assumed that the Project History Documents (PHD) repository is already populated, and that the repository of the metrics extracted from the PHDs analyzed has been created, including the weights of the metrics as a function of their suitability for being used in projects
We have presented a novel initiative based, on the one hand, on the success of software metrics and, on the other hand, on the use of an organization’s own information
Summary
The importance of software in today’s industry is without doubt. Given the critical role of software, project plans adjusted for time, effort, cost and quality have become a fundamental requirement for organizations producing software. “Data analysis routines can be written to collect derived data from the raw data in the data base.” It is precisely this statement, adapted to the present day, that motivates the current work: to collect metrics and parameters of past projects with the objective of planning future projects with better precision, based on the Offer documentation. The paper then introduces the theory of information extraction and gives an overview of the Natural Language Processing (NLP) techniques used for the practical implementation of such tasks, namely the GATE (General Architecture for Text Engineering) architecture and document clustering methods.
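As an illustration of the first stage described in the abstract (finding the PHDs most similar to an Offer document), the following is a minimal sketch only. It assumes TF-IDF weighting with cosine similarity as a stand-in similarity measure; the source does not state that this is the measure used, and it does not use GATE. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_similar`) and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase alphabetic runs only.
    # A real pipeline would rely on an NLP toolkit for this step.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency, so no weight is zero.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_similar(offer_text, history):
    # history maps a PHD name to its text; returns (name, score) pairs,
    # most similar first.
    names = list(history)
    vectors = tfidf_vectors([offer_text] + [history[n] for n in names])
    offer_vec, phd_vecs = vectors[0], vectors[1:]
    scores = [(n, cosine(offer_vec, v)) for n, v in zip(names, phd_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, for illustration only.
history = {
    "phd_banking": "web portal for retail banking customers with account management and payments",
    "phd_logistics": "warehouse inventory tracking system with barcode scanners",
}
offer = "online banking portal with payments and account statements"
ranking = rank_similar(offer, history)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The ranking produced this way would only cover the similarity stage; choosing the most comparable PHD by effort, duration, resources, costs and sector would need the extracted project parameters, which this sketch does not model.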