Abstract

Observed rating data in Web 2.0 applications consist of user attributes and rating scores, which explicitly reflect users' overall evaluations of events, products and various informative items. However, user preference, which is critical for personalized services, precise marketing, accurate advertising, etc., cannot be observed directly. In this paper, we adopt a Bayesian network (BN) with a latent variable as the knowledge framework, use the latent variable to describe user preference, and propose the user preference Bayesian network (UPBN) to represent the dependence relations among the latent and observed variables. By incorporating the classic expectation maximization (EM) algorithm and the scoring-and-search approach to learning a BN, we focus on UPBN construction from rating data, i.e., learning the probability parameters and the graphical structure. To make the UPBN fit the rating data, we first give constraints on the structure and parameters in terms of the inherent dependencies among user preference, the latent variable and the characteristics of EM. We then present a parallel, constraint-induced algorithm for UPBN construction based on EM, structural EM (SEM) and the Bayesian information criterion (BIC). To handle the large number of iterations of probability computations and guarantee the efficiency of model construction, we implement our algorithms on Spark to cope with the massive intermediate results and large-scale rating datasets. Experimental results show the expressiveness of UPBN for preference modeling and the efficiency of model construction, and also demonstrate that UPBN outperforms some state-of-the-art models for user preference estimation and rating prediction.
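
To make the parameter-learning step concrete, the following is a minimal sketch of EM for a toy network with one discrete latent variable and conditionally independent observed variables, scored with the Bayesian information criterion. It is not the UPBN algorithm: the structure (latent node as the sole parent of every observed node), the function names `em_latent_class` and `bic_score`, and the toy data are all assumptions made for illustration, and the constraints, structural EM and Spark parallelization described above are omitted.

```python
import math
import random


def em_latent_class(data, n_values, k, iters=50, seed=0):
    """EM for a toy model L -> X_1..X_m, where L is a latent variable with k states.

    data: list of tuples of integer-coded observations (hypothetical encoding).
    n_values: number of distinct values of each observed variable.
    Returns (pi, theta, history) with theta[j][c][v] = P(X_j = v | L = c).
    """
    rng = random.Random(seed)
    n, m = len(data), len(n_values)

    def rand_dist(size):
        # Random strictly positive distribution used for initialization.
        w = [rng.random() + 0.5 for _ in range(size)]
        s = sum(w)
        return [x / s for x in w]

    pi = rand_dist(k)
    theta = [[rand_dist(v) for _ in range(k)] for v in n_values]
    history = []  # log-likelihood of the parameters entering each iteration

    for _ in range(iters):
        # E-step: posterior over the latent state for every record.
        resp, loglik = [], 0.0
        for row in data:
            joint = [
                pi[c] * math.prod(theta[j][c][row[j]] for j in range(m))
                for c in range(k)
            ]
            px = sum(joint)
            loglik += math.log(px)
            resp.append([p / px for p in joint])
        history.append(loglik)

        # M-step: re-estimate parameters from expected counts.
        nk = [sum(r[c] for r in resp) for c in range(k)]
        pi = [x / n for x in nk]
        for j in range(m):
            for c in range(k):
                counts = [0.0] * n_values[j]
                for row, r in zip(data, resp):
                    counts[row[j]] += r[c]
                denom = max(nk[c], 1e-12)  # guard against an empty latent state
                theta[j][c] = [x / denom for x in counts]
    return pi, theta, history


def bic_score(loglik, n_values, k, n):
    """BIC in 'higher is better' form: log-likelihood minus a complexity penalty."""
    free_params = (k - 1) + sum(k * (v - 1) for v in n_values)
    return loglik - 0.5 * free_params * math.log(n)


# Toy records: (user attribute in {0, 1}, rating in {0, 1, 2}); invented data.
data = [(0, 2)] * 30 + [(0, 1)] * 10 + [(1, 0)] * 30 + [(1, 1)] * 10
pi, theta, history = em_latent_class(data, n_values=[2, 3], k=2)
score = bic_score(history[-1], n_values=[2, 3], k=2, n=len(data))
```

In a scoring-and-search setting, a score such as `score` would be computed for each candidate structure and the best-scoring candidate kept; here only a single fixed structure is fitted.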
