Abstract

This paper proposes the minimization of α-divergences for approximate inference in the context of deep Gaussian processes (DGPs). The proposed method can be considered a generalization of variational inference (VI) and expectation propagation (EP), two previously used methods for approximate inference in DGPs, both of which are based on the minimization of the Kullback-Leibler divergence. The proposed method builds on a scalable version of power expectation propagation, which introduces an extra parameter α that specifies the targeted α-divergence to be optimized. In particular, the method recovers the VI solution when α→0 and the EP solution when α→1. An exhaustive experimental evaluation shows that the minimization of α-divergences via the proposed method is feasible in DGPs and that choosing intermediate values of α between 0 and 1 can give better results in some problems. This means that one can improve on the results of VI and EP when training DGPs. Importantly, the proposed method is compatible with stochastic optimization techniques, enabling it to handle datasets with several million instances.
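To illustrate the interpolation property stated above, the following minimal sketch (not taken from the paper; the helper names `gauss`, `alpha_divergence`, and `kl_gauss` are hypothetical) evaluates the α-divergence between two univariate Gaussians numerically and checks that it approaches the two KL divergences underlying EP and VI as α approaches 1 and 0, respectively:

```python
import numpy as np

def gauss(mu, var):
    """Density of a univariate Gaussian N(mu, var)."""
    return lambda x: np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def alpha_divergence(p, q, xs, alpha):
    """D_alpha(p||q) = (1 - ∫ p^a q^(1-a) dx) / (a (1 - a)),
    with the integral approximated by a Riemann sum on the grid xs."""
    dx = xs[1] - xs[0]
    integral = np.sum(p(xs) ** alpha * q(xs) ** (1.0 - alpha)) * dx
    return (1.0 - integral) / (alpha * (1.0 - alpha))

def kl_gauss(mu1, v1, mu2, v2):
    """Closed-form KL(N(mu1, v1) || N(mu2, v2))."""
    return 0.5 * (np.log(v2 / v1) + (v1 + (mu1 - mu2) ** 2) / v2 - 1.0)

xs = np.linspace(-20.0, 20.0, 200001)
p, q = gauss(0.0, 1.0), gauss(1.0, 2.0)

# alpha close to 1 recovers KL(p||q), the divergence direction targeted by EP;
# alpha close to 0 recovers KL(q||p), the divergence direction targeted by VI.
d_near_ep = alpha_divergence(p, q, xs, 0.999)
d_near_vi = alpha_divergence(p, q, xs, 0.001)
```

Intermediate values of α then trade off between the mass-covering behavior of EP and the mode-seeking behavior of VI, which is the design space the paper explores for DGPs.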
