Abstract

In this paper, we propose a robot editor called XiaoA to predict the popularity of online news. The prediction method is based on ensemble learning, with component learners such as support vector machine, random forest, and neural network. The page view (PV) of a news article is selected as the surrogate for popularity. The document embedding method Doc2vec is used as the basic analysis tool, and the topic of the news is modeled by Latent Dirichlet Allocation (LDA). Experimental results demonstrate that our robot outperforms the state-of-the-art method on popularity prediction.

Highlights

  • Online news articles attract a large number of Internet users because of their short length and rich content

  • Several classifiers are chosen as our component learners, including random forest (RF), neural network (NN), support vector machine (SVM), logistic regression (LR), nearest centroid (NC), and restricted Boltzmann machine (RBM)

  • This paper presents a robot editor called XiaoA to predict the popularity of online news based on ensemble learning
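
The ensemble idea in the highlights above can be illustrated with a small sketch. This page does not say how XiaoA combines the outputs of its component learners, so the majority vote below is an assumption, not the authors' method. The three threshold rules are hypothetical stand-ins for trained classifiers such as RF, NN, or SVM, and the feature vector and label encoding (1 = popular, 0 = not popular) are placeholders.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A component learner maps a feature vector to a class label.
# Label encoding is an assumption: 1 = popular, 0 = not popular.
Learner = Callable[[Sequence[float]], int]


def threshold_rule(index: int, cutoff: float) -> Learner:
    """Build a toy learner (hypothetical) that votes 1 when one feature exceeds a cutoff."""
    def predict(features: Sequence[float]) -> int:
        return 1 if features[index] > cutoff else 0
    return predict


def majority_vote(learners: List[Learner], features: Sequence[float]) -> int:
    """Combine component predictions by simple majority vote.

    Ties are broken in favor of the label predicted by the
    earliest learner among the tied labels.
    """
    if not learners:
        raise ValueError("at least one component learner is required")
    votes = [learner(features) for learner in learners]
    counts = Counter(votes)
    top = max(counts.values())
    for vote in votes:
        if counts[vote] == top:
            return vote
    raise AssertionError("unreachable: votes is non-empty")


# Three stand-in learners, each looking at a different placeholder feature.
learners = [threshold_rule(0, 0.5), threshold_rule(1, 0.5), threshold_rule(2, 0.5)]

if __name__ == "__main__":
    # Two of three learners vote "popular" for this placeholder vector.
    print(majority_vote(learners, [0.9, 0.8, 0.1]))
```

In practice each stand-in would be replaced by a trained classifier's prediction function, and a weighted vote or stacking could be used instead of a plain majority.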


Summary

Introduction

Online news articles attract a large number of Internet users because of their short length and rich content. Because the popularity of online content is closely tied to revenue, it is important to predict it beforehand. The Blossom bot built by The New York Times addresses this problem well. It is a chat bot within the messaging app Slack that uses machine learning in its backend. It is more valuable to predict the early popularity of a news article. These data are available from various news rankings. We present a method for popularity prediction of online news based on ensemble learning. The contributions of our paper are: 1) We propose a popularity prediction method for online news based on ensemble learning, which outperforms the state-of-the-art method. 2) We evaluate the performance of several classifiers on popularity prediction and draw some meaningful conclusions. 3) We find the relationship between popularity and the news features.

Related work
Problem statement
Original dataset
Data preprocessing
Proposed methodology
Component learners
Ensemble learning
Experiments
Findings
Conclusion
