Abstract

Hierarchical clustering groups similar entities on the basis of a similarity (or distance) measure and produces a tree-like structure called a dendrogram. A dendrogram represents clusters in a nested manner: at each step an entity either forms a new cluster or merges into an existing one. Because hierarchical clustering has many applications, researchers have worked to improve it. One approach that has received attention combines clustering results: different hierarchical clustering algorithms produce different dendrograms, and their combination has produced more promising results than individual hierarchical clustering. This paper proposes a hierarchical clustering combination (HCC) approach that uses the different types of structural features present in the dendrogram. First, the dendrograms are represented in a (4+N)-dimensional Euclidean space (4+NDES), where 4 is the number of extracted features and N is the number of further features by which the representation can be extended; this yields vector matrices. 4+NDES is a structural representation of the dendrogram that contains both the relative and the absolute features of the entities in the dendrogram. The vector matrices are then aggregated, and the distance between each pair of vectors is calculated using the Euclidean distance measure. The final hierarchy is obtained using a recovery tool such as an individual hierarchical clustering algorithm. 4+NDES-HCC utilizes the structural content of the dendrogram and can flexibly handle an increasing number of features. The proposed approach is tested on software clustering, which plays an important role in the maintenance of software systems. The experimental results and a comparative analysis with existing approaches show the effectiveness of HCC for software clustering.
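
The combination steps described in the abstract (aggregate the vector matrices, compute pairwise Euclidean distances, recover a final hierarchy) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration only: the abstract does not name the four structural features, so the feature values are made-up numbers; element-wise summation stands in for the unspecified aggregation rule; and a naive single-linkage procedure stands in for the recovery tool. The names `aggregate`, `pairwise_euclidean`, and `single_linkage` are hypothetical and are not taken from the paper.

```python
import math

def aggregate(matrices):
    """Element-wise sum of equally shaped vector matrices.
    ASSUMPTION: the abstract does not specify the aggregation rule."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) for j in range(cols)]
            for i in range(rows)]

def pairwise_euclidean(vectors):
    """Full symmetric matrix of Euclidean distances between row vectors."""
    n = len(vectors)
    return [[math.dist(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

def single_linkage(dist):
    """Naive agglomerative clustering with single linkage.
    ASSUMPTION: stands in for the unspecified recovery tool.
    Returns the merges as (members_a, members_b, distance) tuples."""
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance of the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Two hypothetical vector matrices, one per input dendrogram.
# Rows are entities; columns are 4 placeholder structural features (N = 0).
matrix_1 = [[1, 0, 2, 1], [1, 0, 2, 1], [3, 1, 0, 2], [3, 1, 0, 3]]
matrix_2 = [[2, 1, 1, 0], [2, 1, 1, 1], [0, 2, 3, 1], [0, 2, 3, 1]]

aggregated = aggregate([matrix_1, matrix_2])
distances = pairwise_euclidean(aggregated)
merges = single_linkage(distances)

for members_a, members_b, d in merges:
    print(members_a, "+", members_b, "at distance", round(d, 3))
```

Extending the representation from 4 to 4+N features only adds columns to each matrix; the aggregation and distance steps in this sketch are unchanged, which is one way to read the flexibility claimed for 4+NDES.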
