RETRACTED ARTICLE: Multimedia text classification algorithm using potential Dirichlet distribution in mobile cloud computing environment

Xiaohong Zhang, Yan Gao

doi:10.1007/s11042-019-08253-1

Abstract

To address the inaccurate description of news content features and user interest features in mobile cloud computing, this paper proposes a multimedia text classification algorithm based on a multi-tag potential Dirichlet distribution. The algorithm builds on the traditional latent Dirichlet allocation (LDA) model and assumes a linear relationship between user tags and latent topics. A relational matrix is therefore introduced into the LDA model to describe the correspondence between tags and topics, so that the probability distribution of each tag over words can be inferred from the probability distribution of each topic over words. The algorithm first learns the tag-word probability distributions by Gibbs sampling, then uses the learned model parameters to infer the distribution of a new document over tags, and thereby predicts the multiple tags that correspond to the document. To improve the algorithm's ability to handle massive data, its parallel implementation is also improved. The bottleneck lies mainly in the serial nature of global variable updating and communication, so the core idea of the parallelization is that, when training on massive text collections, delayed global updates and asynchronous communication do not affect the final training results. Experiments show that the proposed algorithm greatly improves training efficiency. Its classification accuracy is higher than that of the naive Bayes and support vector machine (SVM) algorithms reported in the literature. The average classification accuracy reaches about 95%, and the method can serve as a general parallel framework for supervised LDA algorithms.
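The linear tag-topic relationship described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the relational matrix, so the code below assumes that each row of the matrix holds non-negative topic weights for one tag and sums to 1; the function name `tag_word_distribution` and all numeric values are hypothetical and chosen only for illustration.

```python
def tag_word_distribution(relation, topic_word):
    """Mix topic-word distributions into tag-word distributions.

    relation[t][k]   : assumed weight of topic k for tag t (each row sums to 1)
    topic_word[k][w] : probability of word w under topic k (each row sums to 1)
    """
    n_topics = len(topic_word)
    n_words = len(topic_word[0])
    return [
        [sum(row[k] * topic_word[k][w] for k in range(n_topics))
         for w in range(n_words)]
        for row in relation
    ]


# Toy values, invented for illustration only (not taken from the paper).
topic_word = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 puts its mass on words 0 and 1
    [0.0, 0.0, 0.5, 0.5],  # topic 1 puts its mass on words 2 and 3
]
relation = [
    [1.0, 0.0],  # first tag corresponds to topic 0 only
    [0.5, 0.5],  # second tag mixes both topics equally
]

tag_word = tag_word_distribution(relation, topic_word)
for row in tag_word:
    print(row)
```

Under this assumption each tag-word row is a convex combination of topic-word rows, so it remains a valid probability distribution over the vocabulary.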

Full Text