Abstract

With the rapid development of Internet and sensor technologies, it has become much easier to acquire data from different sources at different dates and times. However, computing the correlation among such heterogeneous data remains a major challenge for data mining and information retrieval. Here, the features of the data from one source are called a view, and the multiview features describe the same data point. In this paper, the hidden correlation between two-view features is used to construct a Heterogeneous (multiview) Topic Model (HTM). In particular, a probabilistic topic model is used for each view, because generative models usually provide much richer features when handling high-dimensional data such as text. However, most existing probabilistic topic models, such as latent Dirichlet allocation, require the form of the probability distribution to be known. To avoid this limitation, the HTM is reduced to a non-negative matrix tri-factorization problem with certain constraints, so that the proposed approach can be used with an arbitrary model.
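
For readers unfamiliar with non-negative matrix tri-factorization, the following is a minimal sketch of the generic, unconstrained problem: approximating a non-negative matrix X by a product F S G^T of three non-negative factors, using standard multiplicative updates for the Frobenius-norm objective. It is not the HTM algorithm: the abstract does not specify the constraints or the update rules used in the paper, and the function name `nmtf`, its parameters, and the random demo matrix are all assumptions made for illustration.

```python
import numpy as np

def nmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Generic NMTF sketch: X (n x m) ~ F (n x k) @ S (k x l) @ G.T (l x m).

    Uses plain multiplicative updates for ||X - F S G^T||_F^2.
    Illustrative only; the HTM constraints are NOT implemented here.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies a factor by the ratio of the negative and
        # positive parts of the gradient, so entries stay non-negative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors

# Hypothetical demo data: a random non-negative 20 x 15 matrix.
X_demo = np.random.default_rng(1).random((20, 15))
F_demo, S_demo, G_demo, errors_demo = nmtf(X_demo, k=3, l=4)
print(f"reconstruction error: {errors_demo[0]:.3f} -> {errors_demo[-1]:.3f}")
```

In this sketch, F and G play the role of low-dimensional factors for the rows and columns of X, and S couples them. How these factors map onto the two views and their topics in the HTM is determined by the paper's constraints, which are not reproduced here.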
