Abstract

Clustering is the process of identifying objects with similar intrinsic properties and grouping them into separate clusters. It also serves as a preliminary step in data classification by separating heterogeneous data into reasonably homogeneous groups, which can then be processed further. The existing literature discusses a host of clustering algorithms and their applications, although no single approach suited to all applications has yet emerged. In this paper, we introduce a novel clustering algorithm, YAC2, based on data binning and α-proximity/neighborhood. The binning process is a data transformation step that converts the cardinal values of the attributes into their ordinal equivalents. The algorithm introduces a specialized centroid that is used with the α-proximity and a matching algorithm to partition the data set into sequentially generated clusters. We present the results of applying YAC2 to a set of established benchmark data sets, using a host of evaluation metrics. Our results show that YAC2 performs well, besting a number of well-established algorithms. For several metrics, YAC2 provides average improvements over other algorithms ranging from 6% (on the Iris data set) to 180% (on the Wholesale Customer data set).
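
To make the binning step concrete, the following is a minimal sketch of converting cardinal attribute values into ordinal bin indices. It assumes equal-width bins, which is only one possible choice; the abstract does not specify the binning scheme YAC2 uses, and the function name `bin_to_ordinal`, the parameter `n_bins`, and the sample values are hypothetical and chosen purely for illustration.

```python
from typing import List


def bin_to_ordinal(values: List[float], n_bins: int) -> List[int]:
    """Map each cardinal value to an ordinal bin index in 0..n_bins-1.

    Assumption: equal-width bins over the observed range. This is an
    illustrative choice, not necessarily the scheme used by YAC2.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so they all fall into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value lands in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical attribute values, for illustration only.
attribute_values = [4.3, 5.1, 5.8, 6.4, 7.9]
ordinal = bin_to_ordinal(attribute_values, n_bins=4)
print(ordinal)
```

After this transformation each attribute is represented by a small integer rank rather than its raw magnitude, so comparisons between objects depend on which bin a value falls into rather than on its exact cardinal value.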
