Abstract

This paper proposes a clustering algorithm for datasets with pairwise constraints based on the Centroid Neural Network (Cent.NN). The proposed algorithm, referred to as the Centroid Neural Network with Pairwise Constraints (Cent.NN-PC), uses Cent.NN as its backbone for data clustering and adopts a semi-supervised learning process to handle the pairwise constraints. The energy function of the original Cent.NN algorithm is reformulated for Cent.NN-PC by introducing penalty terms for violated constraints. The weight update procedure of Cent.NN-PC then searches for prototypes that minimize the quantization error on the given dataset while also minimizing the number of violated constraints. To evaluate the performance of Cent.NN-PC, experiments are carried out on six datasets from the UCI database and two bioinformatics datasets from the KEEL repository. The performance of the proposed algorithm is compared with that of the Linear Constrained Vector Quantization Error (LCVQE) algorithm, one of the most commonly used algorithms for data clustering with pairwise constraints. In the experiments, five different numbers of pairwise constraints are used to evaluate clustering performance with constraint sets of different sizes. The results show that Cent.NN-PC outperforms LCVQE on most performance criteria: the total quantization error, the number of violated constraints, and three clustering quality metrics (classification accuracy rate, F-score, and NMI). The experiments also show that Cent.NN-PC provides much more stable clustering results than LCVQE and runs faster.
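The two quantities the abstract says are traded off, quantization error and the number of violated constraints, can be illustrated with a minimal sketch. This is not the Cent.NN-PC energy function or its weight update, which the abstract does not specify. It assumes the usual must-link / cannot-link form of pairwise constraints and a hypothetical constant `penalty` weight per violated constraint; all function and parameter names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Pair = Tuple[int, int]  # indices of two data points


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_nearest(points: List[Point], prototypes: List[Point]) -> List[int]:
    """Label each point with the index of its nearest prototype."""
    return [
        min(range(len(prototypes)), key=lambda k: squared_distance(p, prototypes[k]))
        for p in points
    ]


def quantization_error(points: List[Point], prototypes: List[Point],
                       labels: List[int]) -> float:
    """Total squared distance from each point to its assigned prototype."""
    return sum(squared_distance(p, prototypes[l]) for p, l in zip(points, labels))


def count_violations(labels: List[int], must_link: List[Pair],
                     cannot_link: List[Pair]) -> int:
    """Count constraints broken by a labeling (assumed constraint types)."""
    broken_ml = sum(1 for i, j in must_link if labels[i] != labels[j])
    broken_cl = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return broken_ml + broken_cl


def penalized_energy(points: List[Point], prototypes: List[Point],
                     labels: List[int], must_link: List[Pair],
                     cannot_link: List[Pair], penalty: float = 1.0) -> float:
    """Quantization error plus a constant penalty per violated constraint.

    The constant-penalty form is an assumption made for illustration only.
    """
    return (quantization_error(points, prototypes, labels)
            + penalty * count_violations(labels, must_link, cannot_link))


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
    prototypes = [(0.0, 0.5), (5.0, 5.5)]
    labels = assign_to_nearest(points, prototypes)
    print(labels)
    print(penalized_energy(points, prototypes, labels,
                           must_link=[(0, 1)], cannot_link=[(2, 3)], penalty=2.0))
```

In this sketch a constrained algorithm would prefer prototype configurations whose nearest-prototype labeling lowers `penalized_energy`, accepting a somewhat higher quantization error when that removes violated constraints.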
