The mixture of Gaussian processes is a powerful statistical learning model that can be effectively applied to curve clustering and prediction. However, the corresponding model selection problem, namely selecting an appropriate number of components in the mixture, is rather difficult to solve. In our previous work, we established a split-and-merge automatic model selection algorithm for mixtures of Gaussian processes along the output space under the framework of Reversible Jump Markov Chain Monte Carlo (RJMCMC). That algorithm can not only determine the actual number of Gaussian processes but also dynamically adjust the Gaussian process components, avoiding dependence on parameter initialization and on the initial partitioning of the dataset during parameter learning. In this study, we propose two algorithms: Penalized Likelihood RJMCMC (PL-RJMCMC), which integrates a penalty term into the likelihood, and Penalized Prior RJMCMC (PP-RJMCMC), which incorporates a penalty term into the prior and operates within the full Bayesian inference framework. Both aim to focus more sharply on determining the number of components during convergence. Furthermore, we prove the geometric ergodicity of the RJMCMC algorithm for the mixture of Gaussian processes model, which ensures convergence to the posterior distribution given sufficiently many iterations. The experimental results demonstrate the robustness of our PP-RJMCMC algorithm in model selection, with superior performance over traditional approaches in curve classification and clustering, and prediction performance comparable to that of the EM algorithm. Although not directly explored in this study, the RJMCMC results can be used to initialize the EM algorithm, which could potentially improve prediction accuracy and accelerate computation.