Abstract

In this work, we tackle the problem of clustering spherical (i.e., L2-normalized) data vectors using nonparametric Bayesian mixture models with von Mises distributions. Our model is formulated within a nonparametric Bayesian framework known as the Pitman–Yor process mixture model. Unlike finite mixture models, in which determining the number of clusters is a crucial problem that often requires extra effort (e.g., by inspecting information criteria), the proposed model is nonparametric: the number of clusters is assumed to be infinite at the initial stage and is inferred automatically from the data. Moreover, an unsupervised feature selection scheme is incorporated into the proposed model to remove features that do not contribute significantly to the clustering process. We develop a stochastic variational inference algorithm that estimates model parameters, model complexity, and feature saliencies simultaneously through stochastic gradient ascent. We demonstrate the merits of the proposed model for clustering spherical data vectors through experiments on synthetic datasets and two real-world applications, namely topic novelty detection and flower image categorization.
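
As an illustration of two ideas mentioned in the abstract, the following minimal sketch shows (a) L2 normalization of a data vector onto the unit sphere and (b) mixing weights drawn from the standard stick-breaking representation of a Pitman–Yor process, in which the k-th stick proportion is Beta(1 - d, α + k d) for discount d and concentration α. The function names, the truncation level, and the parameter values are assumptions made for this example. It is not the paper's inference algorithm, and the paper may use a different construction.

```python
import math
import random


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def pitman_yor_weights(discount, concentration, truncation, rng):
    """Draw mixing weights from a truncated Pitman-Yor stick-breaking prior.

    Assumes 0 <= discount < 1 and concentration > -discount.
    The truncation level is a hypothetical choice for illustration only.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(1, truncation + 1):
        if k == truncation:
            v = 1.0  # last component takes the rest so the weights sum to 1
        else:
            v = rng.betavariate(1.0 - discount, concentration + k * discount)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


rng = random.Random(0)
unit_vec = l2_normalize([3.0, 4.0, 0.0])
weights = pitman_yor_weights(discount=0.25, concentration=1.0,
                             truncation=20, rng=rng)
```

In a draw like this, most of the probability mass typically falls on a few components, which is the sense in which an effectively infinite mixture can still describe data with a small number of clusters.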
