Abstract

This paper presents an architecture that applies document similarity measures to the documentation produced during the phases of software development in order to recommend process and people metrics for similar projects. The application judges the similarity of the Service Provision Offer (SPO) document of a newly proposed project to a collection of Project History Documents (PHD) stored in a repository of unstructured texts. The process is carried out in three stages. First, the Offer document is clustered with the set of PHDs most similar to it; this gives an initial indication of whether similar previous projects exist, and signifies similarity. Second, the PHD in that set most comparable with the Offer document is determined from several parameters: project effort, project duration (time), project resources (members/size of team), costs, and sector(s) involved; this indicates comparability of projects. The comparable parameters are extracted using ...

Highlights

  • The importance of software in today’s industry is beyond doubt.

  • To set the scene for the use case, it is assumed that the Project History Documents (PHD) repository is already populated and that the repository of metrics extracted from the analyzed PHDs has been created, including the weight of each metric as a function of its suitability for use in projects.

  • We have presented a novel initiative based, on the one hand, on the success of software metrics and, on the other, on the use of an organization’s own information.

Summary

Introduction

The importance of software in today’s industry is beyond doubt. Given the critical role of software, project plans adjusted for time, effort, cost and quality have become a fundamental requirement for organizations producing software. “Data analysis routines can be written to collect derived data from the raw data in the data base.” It is precisely this statement that motivates the current work, adapted to the present day: to gather the metrics and parameters of past projects in order to plan future projects with better precision, based on the Offer documentation. This is followed by an introduction to the theory of information extraction, and an overview of the Natural Language Processing (NLP) techniques used for the practical implementation of such tasks, namely the GATE (General Architecture for Text Engineering) architecture and document clustering methods.
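
To make the idea of judging an Offer document against stored Project History Documents more concrete, the following is a minimal sketch of a vector-space similarity ranking using TF-IDF weights and cosine similarity. It is an illustration only and not the paper's implementation, which relies on GATE and on clustering methods not reproduced here. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_similar`), the smoothed IDF formula, and the sample texts are all assumptions introduced for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (an assumption; other weighting schemes are possible).
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(offer_text, history):
    """Rank history documents (name -> text) by similarity to the offer."""
    names = list(history)
    vectors = tfidf_vectors([offer_text] + [history[name] for name in names])
    offer_vec, history_vecs = vectors[0], vectors[1:]
    scored = [(name, cosine(offer_vec, vec)) for name, vec in zip(names, history_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical documents, invented purely for illustration.
    offer = "Web portal for a bank with online payments"
    history = {
        "phd_bank": "Online banking portal with payments module",
        "phd_hr": "Payroll and human resources desktop application",
    }
    for name, score in rank_similar(offer, history):
        print(f"{name}: {score:.3f}")
```

A ranking of this kind would correspond only to the first stage described in the abstract (finding candidate similar projects); the later comparison on effort, duration, resources, costs and sector would need the extracted metrics, which this sketch does not model.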

Software Metrics
Personnel in Software Metrics
Information Extraction with Natural Language Processing
Document Clustering Techniques
Vector Space Model
Neural Networks
Latent Semantic Indexing
Support Vector Machines (SVMs)
BMR: Benchmarking Metrics Recommender
Use case scenario
Conclusions and future work