Abstract

Clustering is the task of classifying patterns or observations into clusters or groups. Generally, clustering in high-dimensional feature spaces has a lot of complications such as: the unidentified or unknown data shape which is typically non-Gaussian and follows different distributions; the unknown number of clusters in the case of unsupervised learning; and the existence of noisy, redundant, or uninformative features which normally compromise modeling capabilities and speed. Therefore, high-dimensional data clustering has been a subject of extensive research in data mining, pattern recognition, image processing, computer vision, and other areas for several decades. However, most of existing researches tackle one or two problems at a time which is unrealistic because all problems are connected and should be tackled simultaneously. Thus, in this paper, we propose two novel inference frameworks for unsupervised non-Gaussian feature selection, in the context of finite asymmetric generalized Gaussian (AGG) mixture-based clustering. The choice of the AGG distribution is mainly due to its ability not only to approximate a large class of statistical distributions (e.g. impulsive, Laplacian, Gaussian and uniform distributions) but also to include the asymmetry. In addition, the two frameworks simultaneously perform model parameters estimation as well as model complexity (i.e., both model and feature selection) determination in the same step. This was done by incorporating a minimum message length (MML) penalty in the model learning step and by fading out the redundant densities in the mixture using the rival penalized EM (RPEM) algorithm, for first and second frameworks, respectively. Furthermore, for both algorithms, we tackle the problem of noisy and uninformative features by determining a set of relevant features for each data cluster. The efficiencies of the proposed algorithms are validated by applying them to real challenging problems namely action and facial expression recognition.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.