Abstract

Clusterings (or partitions) generated by clustering algorithms are hypotheses that explain the data, and they are best evaluated by experts from the application domain. In general, clustering algorithms allow only limited use of domain knowledge about the cluster formation process. In this study, we propose both a design technique and a new partitioning-based clustering algorithm that can assist the data analyst in looking for a set of meaningful clusters, i.e., clusters that actually correspond to the underlying data structure. We follow an observer metaphor, according to which the perception of a group of objects depends on the observer's position: the closer an observer is to an image, the more details (s)he perceives. Accordingly, we resort to shrinkage to incorporate a regularization term, accounting for the observation point, within the objective function of an otherwise unbiased clustering algorithm. This technique allows the resulting biased algorithm to generate a set of reasonable partitions, i.e., partitions validated by a given cluster validity index, corresponding to views of the data with different levels of granularity (levels of detail) in different regions of the data space. To illustrate the design technique, we adopted the fuzzy c-means (FCM) algorithm as the unbiased clustering algorithm, and we include a convergence theorem assuring that changing the point of observation in the corresponding biased algorithm, FCM with focal point (FCMFP), does not jeopardize its convergence. Experimental studies on both synthetic and real data are included to illustrate the usefulness of the approach. In addition, and as a convenient side effect of using shrinkage, the experimental results suggest that the biased algorithm (FCMFP) not only seems to scale better than successive runs of the unbiased one (FCM) but also, on average, seems to produce clusters exhibiting higher validity index values. The biased algorithm was also observed to be less sensitive to initialization than the unbiased one.
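
To make the shrinkage idea concrete, the following is a minimal sketch rather than the authors' FCMFP. The abstract does not state the regularized objective, so the sketch assumes a ridge-style penalty that pulls every cluster center toward a focal point: J = sum_i sum_k u_ik^m ||x_k - v_i||^2 + zeta * sum_i ||v_i - P||^2. The function name `fcm_focal_point` and the parameters `focal` and `zeta` are hypothetical. Under this assumed objective the membership update is the usual FCM one, and only the center update changes.

```python
import random


def fcm_focal_point(points, c, focal, zeta, m=2.0, iters=100, tol=1e-6, seed=0):
    """Fuzzy c-means with an ASSUMED shrinkage of centers toward `focal`.

    Assumed objective (not taken from the paper):
        J = sum_i sum_k u_ik^m ||x_k - v_i||^2 + zeta * sum_i ||v_i - focal||^2
    With zeta = 0 this reduces to plain FCM.
    Returns the final centers and the memberships computed in the last pass.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Initialize centers with c distinct data points.
    centers = [list(p) for p in rng.sample(points, c)]
    u = []
    for _ in range(iters):
        # Membership update: identical to standard FCM.
        u = []
        for x in points:
            d2 = [max(sum((a - b) ** 2 for a, b in zip(x, v)), 1e-12)
                  for v in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            u.append([w / total for w in inv])
        # Center update: weighted mean shrunk toward the focal point,
        # obtained by setting dJ/dv_i = 0 for the assumed objective.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            denom = sum(w) + zeta
            new_centers.append([
                (sum(wk * x[j] for wk, x in zip(w, points)) + zeta * focal[j]) / denom
                for j in range(dim)
            ])
        shift = max(abs(a - b)
                    for v, nv in zip(centers, new_centers)
                    for a, b in zip(v, nv))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for z in (0.0, 1.0, 10.0):
        found, _ = fcm_focal_point(data, c=2, focal=(5, 5), zeta=z)
        print(z, found)
```

In this sketch, increasing `zeta` draws the centers toward the focal point and so yields coarser views of the data, while `zeta = 0` recovers the unbiased algorithm. How the observation point is parameterized in the actual FCMFP may differ.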
