Abstract
The aim of cluster analysis is to divide a set of multidimensional observations into subsets according to their similarities and dissimilarities. These observations are generally represented as data points scattered through an N-dimensional data space, each point corresponding to a vector of observed features measured on the objects to be classified. Within the statistical approach, many clustering procedures have been proposed that are based on the analysis of the underlying probability density function (pdf) (Devijver & Kittler, 1982).
Independently of cluster analysis, a large amount of research effort has been devoted to image segmentation. To humans, an image is not just an unstructured collection of pixels. We generally agree about the different regions constituting an image because of our visual grouping capabilities. Among the factors that lead to such perceptual grouping, the most important are similarity, proximity and connectedness. The segmentation process can be considered as a partitioning scheme such that:
- Every pixel of the image must belong to a region,
- The regions must be composed of contiguous pixels,
- The pixels constituting a region must share a given property of similarity.
These three conditions can be easily adapted to the clustering process. Indeed, each data point must be assigned to a cluster, and the clusters must be composed of neighbouring data points, since the points assigned to the same cluster must share some properties of similarity.
Given this analogy between segmentation and clustering, some image segmentation procedures based on the analysis of the gray-level function can be adapted to the analysis of multidimensional density functions for pattern classification, assuming there is a one-to-one correspondence between the modes of the underlying pdf and the clusters. In this framework of unsupervised pattern classification, the underlying pdf is estimated on a regular discrete array of sampling points (cf. Section 2).
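As an illustration of estimating a pdf on a regular discrete array of sampling points, the following is a minimal sketch and not the estimator used in this work. It assumes two-dimensional data already scaled to the unit square, a Gaussian Parzen kernel, and hypothetical parameter names (`grid_size`, `bandwidth`) chosen only for this example.

```python
import math

def parzen_grid(points, grid_size=20, bandwidth=0.1):
    """Estimate a 2-D density on a grid_size x grid_size array of sampling points.

    points: iterable of (x, y) pairs, assumed to lie in the unit square.
    Returns a list of rows; density[i][j] is the estimate at the centre
    of grid cell (i, j). Kernel and bandwidth are illustrative assumptions.
    """
    points = list(points)
    n = len(points)
    # Normalisation of an isotropic 2-D Gaussian kernel, averaged over n points.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)
    density = []
    for i in range(grid_size):
        gx = (i + 0.5) / grid_size          # x coordinate of the cell centre
        row = []
        for j in range(grid_size):
            gy = (j + 0.5) / grid_size      # y coordinate of the cell centre
            total = 0.0
            for x, y in points:
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        density.append(row)
    return density

# Two small groups of observations; the array plays the role of a gray-level image.
observations = [(0.25, 0.25), (0.27, 0.24), (0.75, 0.75), (0.74, 0.77)]
density = parzen_grid(observations)
```

The resulting array can be handled like a gray-level image, which is what allows segmentation-style procedures to be applied to it. The result depends strongly on the bandwidth and on the grid resolution, both of which are arbitrary here.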
The idea of using pdf estimation for mode seeking is not new (Parzen, 1962). In very simple situations, the modes can be detected by thresholding the pdf at an appropriate level, using a procedure similar to image binarization. This thresholding scheme can be improved by adapting a probabilistic labelling scheme directly derived from image processing techniques (cf. Section 3).