Developing effective machine learning methods for multimedia data modeling continues to challenge computer vision scientists. The capability of providing effective learning models can have significant impact on various applications. In this work, we propose a nonparametric Bayesian approach to address simultaneously two fundamental problems, namely clustering and feature selection. The approach is based on infinite generalized Dirichlet (GD) mixture models constructed through the framework of Dirichlet process and learned using an accelerated variational algorithm that we have developed. Furthermore, we extend the proposed approach using another nonparametric Bayesian prior, namely Pitman---Yor process, to construct the infinite generalized Dirichlet mixture model. Our experiments, which were conducted through synthetic data sets, the clustering analysis of real-world data sets and a challenging application, namely automatic human action recognition, indicate that the proposed framework provides good modeling and generalization capabilities.