Abstract
Maximum a posteriori (MAP) estimation plays an important role in many probabilistic models. In many cases, however, the MAP problem is non-convex and intractable. In this work, we propose a novel algorithm, called BOPE, which uses Bernoulli randomness for Online Maximum a Posteriori Estimation. We show that BOPE has a fast convergence rate. In particular, BOPE implicitly employs a prior that acts as a regularizer. This prior differs from the prior of the MAP problem and vanishes as BOPE performs more iterations. This property is significant because it helps reduce severe overfitting of probabilistic models in ill-posed cases, including short text, sparse data, and noisy data. We validate the practical efficiency of BOPE in two contexts: text analysis and recommender systems. Both contexts show the superiority of BOPE over the baselines.
Highlights
Maximum a Posteriori (MAP) estimation is a popular approach to inference in probabilistic models [1], [2]
We evaluate the efficiency of BOPE on the MAP problem in topic models by using Online-BOPE to learn Latent Dirichlet Allocation (LDA), measuring Log Predictive Probability (LPP) and Normalised Pointwise Mutual Information (NPMI), and comparing against other learning algorithms such as Online-Variational Bayes (VB), Online-CVB0, Online-Collapsed Gibbs Sampling (CGS), and Online-OPE
Case study 2 (application to recommender systems): we investigate the application of BOPE to the MAP problem in the Collaborative Topic Model for Poisson distributed ratings (CTMP) [8], which is used in recommender systems
Summary
Maximum a Posteriori (MAP) estimation is a popular approach to inference in probabilistic models [1], [2]. It plays an essential role in various practical scenarios where there exist hidden variables or uncertainty. Adding prior probability information reduces the over-dependence of parameter estimation on the observed data. MAP estimation can thus be seen as a regularization of Maximum Likelihood Estimation (MLE), and it can cope well with small amounts of training data. In MAP estimation, our task is to find

$$x^* = \arg\max_{x \in \Omega} P(x \mid D) \tag{1}$$

where $D$ denotes the observed data, $x$ denotes a hidden/unobserved variable, and $\Omega$ denotes the domain of $x$.
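The regularization view of MAP can be illustrated with a standard textbook example that is not taken from this paper: estimating the success probability of a Bernoulli variable under a Beta prior. By Bayes' rule the posterior is proportional to the likelihood times the prior, so maximizing it amounts to maximizing the log-likelihood plus a log-prior term. The sketch below is only an illustration under assumptions: the function names, the Beta(2, 2) prior, and the counts are hypothetical choices, and none of this is the BOPE algorithm.

```python
def mle_bernoulli(successes, trials):
    """Maximum likelihood estimate of a Bernoulli success probability."""
    return successes / trials


def map_bernoulli(successes, trials, alpha=2.0, beta=2.0):
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    The posterior is Beta(alpha + successes, beta + failures); this returns
    its mode, which is valid when both posterior parameters exceed 1.
    """
    return (successes + alpha - 1.0) / (trials + alpha + beta - 2.0)


# Very little data: 3 successes in 3 trials.
small_mle = mle_bernoulli(3, 3)
small_map = map_bernoulli(3, 3)

# Much more data: 300 successes in 400 trials.
large_mle = mle_bernoulli(300, 400)
large_map = map_bernoulli(300, 400)

print(f"small sample: MLE={small_mle:.3f}, MAP={small_map:.3f}")
print(f"large sample: MLE={large_mle:.3f}, MAP={large_map:.3f}")
```

With three observations the MLE commits to a probability of 1, while the prior pulls the MAP estimate back toward 0.5. With four hundred observations the two estimates nearly coincide, since the influence of the prior shrinks as the data grows.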