Model uncertainty has been gaining attention in education and the social sciences, where the dominant methods for handling it are based on Bayesian inference, particularly Bayesian model averaging. However, Bayesian model averaging assumes that the true data-generating model lies within the candidate model space over which averaging takes place. Unlike Bayesian model averaging, Bayesian stacking can account for model uncertainty without assuming that a true model exists. An issue with Bayesian stacking, however, is that it is an optimization technique that uses predictor-independent model weights and is therefore not fully Bayesian. Bayesian hierarchical stacking, proposed by Yao et al., further incorporates uncertainty by placing a hyperprior on the stacking weights. Given the importance of multilevel models commonly applied in educational settings, this paper uses a simulation study and a real-data example to investigate the predictive performance of conventional Bayesian stacking and Bayesian hierarchical stacking, along with two other readily available weighting methods: pseudo-BMA (PBMA) and pseudo-BMA with the Bayesian bootstrap (PBMA+). Predictive performance is measured by the Kullback–Leibler divergence score. Although the differences in predictive performance among these four weighting methods are small, we find that Bayesian hierarchical stacking performs as well as conventional stacking, PBMA, and PBMA+ in settings where a true model is not assumed to exist.
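To make the weighting schemes concrete, the following is a minimal Python sketch, not the authors' implementation, showing how stacking, PBMA, and PBMA+ weights can be computed from a matrix of leave-one-out log pointwise predictive densities. The `lpd` matrix here is simulated for illustration; in practice it would come from PSIS-LOO applied to each fitted candidate model (e.g., via the loo R package or ArviZ).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp, softmax

rng = np.random.default_rng(0)

# Hypothetical input: an (n, K) matrix of leave-one-out log pointwise
# predictive densities, one column per candidate model. Simulated here;
# in practice obtained from PSIS-LOO on each fitted model.
n, K = 200, 3
lpd = rng.normal(loc=-1.0, scale=0.5, size=(n, K))

def stacking_weights(lpd):
    """Maximize sum_i log sum_k w_k p(y_i | y_{-i}, M_k) over the simplex."""
    K = lpd.shape[1]
    # Optimize K-1 unconstrained logits; softmax maps them onto the simplex.
    def neg_obj(z):
        w = softmax(np.append(z, 0.0))
        return -np.sum(logsumexp(lpd + np.log(w), axis=1))
    res = minimize(neg_obj, np.zeros(K - 1), method="BFGS")
    return softmax(np.append(res.x, 0.0))

def pseudo_bma_weights(lpd):
    """PBMA: w_k proportional to exp(elpd_k), with elpd_k = sum_i lpd[i, k]."""
    return softmax(lpd.sum(axis=0))

def pseudo_bma_plus_weights(lpd, n_boot=1000, seed=0):
    """PBMA+: average PBMA weights over Bayesian-bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = lpd.shape[0]
    alpha = rng.dirichlet(np.ones(n), size=n_boot)  # (n_boot, n) resample weights
    elpd_rep = n * alpha @ lpd                      # (n_boot, K) replicated elpd
    return softmax(elpd_rep, axis=1).mean(axis=0)

print("stacking:   ", stacking_weights(lpd))
print("pseudo-BMA: ", pseudo_bma_weights(lpd))
print("pseudo-BMA+:", pseudo_bma_plus_weights(lpd))
```

Hierarchical stacking is omitted from the sketch because its weights are not fixed constants: they are modeled as functions of predictors under a hierarchical prior and estimated with full Bayesian inference rather than by the convex optimization shown above.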