Using Fuzzy Clustering and Software Metrics to Predict Faults in Large Industrial Software Systems

Journals Iosr, Nurudeen Sherif

doi:10.6084/m9.figshare.1203283

Abstract

Faults are a key problem in software systems. Awareness of possible flaws from the start of a project can save money, time and work, and estimating how fault-prone the software is likely to be helps in carrying out software development activities. This paper proposes a model that predicts the possibility of faults in a software system before testing, during development, using Fuzzy Clustering and software metrics. The research aims to predict faults in large software systems by creating clusters and then computing the distance of each point in the data set from each cluster to determine its degree of membership within each cluster.

Reliance on software in our daily lives has increased so much in the last decade that living without devices controlled by software is now almost impossible. Industrial domains such as medical applications, power plants, air traffic control and railway signaling have all integrated software as a fundamental part of their operation. Software engineers have to deal with a large number of quality requirements, such as reliability, safety, availability, performance, maintainability and security, which makes the development of these large software applications very challenging. This industrial reliance on software means that a failure can cause a serious crisis, with effects ranging from economic damage to loss of life. There is therefore an increasing need to ensure the dependability of software systems. Moreover, it is well known that the earlier a problem is identified, the better and more cost-effectively it can be fixed, so it is necessary to predict faults during software development. Numerous techniques and metrics exist for identifying fault-prone modules, and these may aid software developers in performing testing activities during development.
It is almost impossible to produce fault-free software, given the rising complexity of software and the constraints under which it is developed. Such faults lead to software failures, which increase development and maintenance cost and time and decrease customer satisfaction (1).

Data clustering is a basic technique in many modeling algorithms. The objective of clustering is to group the items of a large data set into new collections. One of the most widely accepted contributions to the field of data clustering is Fuzzy C-Means clustering. It has advantages over other clustering methods, notably the ability to use fuzzy logic to split data into clusters of different sizes. Fuzzy C-Means can be seen as a modified version of the k-means algorithm that allows one piece of data to belong to two or more clusters. The degree of membership in a given cluster is related to the inverse of the distance to that cluster (2). Fuzzy C-Means iteratively moves the cluster centers toward the right location within the data set.

This research aims to predict faults in large industrial software systems by creating clusters and then computing the distance of each point in the data set from each cluster to determine its degree of membership within each cluster. Measures such as Mean Absolute Error, Accuracy and Root Mean Square Error help in assessing the prediction of a software system as faulty or fault-free. The literature (3)-(17) presents various types of fault-proneness estimation models. The results are also compared with (18), in which a hierarchical clustering based approach is used for finding fault-prone classes in large software systems.

The paper is organized as follows: Section II reviews related work, Section III explains the methodology followed in this research, and Section IV presents the results of the study. Finally, the conclusions of the research are presented in Section V.
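
The Fuzzy C-Means idea described in this introduction (memberships related to the inverse of the distance to each cluster center, with the centers moved iteratively) can be illustrated with a minimal NumPy sketch. This is not the implementation used in this research: it uses the textbook update rules, and the toy metric values, the fuzzifier m = 2, the 0.5 threshold and the rule that the cluster with the larger metric values is the "faulty" one are all assumptions made only for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook Fuzzy C-Means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Move each center to the membership-weighted mean of the points.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance of every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership is proportional to an inverse power of the distance.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    return centers, U

def evaluate(y_true, y_score, threshold=0.5):
    """Mean Absolute Error, Root Mean Square Error and Accuracy."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    err = y_score - y_true
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    acc = ((y_score >= threshold) == (y_true == 1)).mean()
    return mae, rmse, acc

# Hypothetical module metrics: [lines of code, cyclomatic complexity].
X = np.array([[10, 1], [12, 2], [11, 1],
              [200, 30], [220, 35], [210, 28]], dtype=float)
y_true = np.array([0, 0, 0, 1, 1, 1])  # hypothetical labels, 1 = faulty

centers, U = fuzzy_c_means(X, n_clusters=2)

# Assumption: the cluster whose center has the larger metric values
# is treated as the fault-prone cluster.
faulty = centers.sum(axis=1).argmax()
fault_score = U[:, faulty]  # degree of membership in the faulty cluster

mae, rmse, acc = evaluate(y_true, fault_score)
print("centers:", centers.round(2))
print("MAE:", round(mae, 4), "RMSE:", round(rmse, 4), "Accuracy:", acc)
```

On real data the metrics would normally be rescaled before clustering, since the Euclidean distance is otherwise dominated by the metric with the largest numeric range.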
