Abstract

Social media websites (e.g., Twitter, Facebook, Instagram) contain rich information about people's preferences. For example, people often share their thoughts about movies on Twitter. In this paper, we conduct big data analysis of real-time tweets about movies to predict the revenue and success of those movies. In particular, we collect a large dataset of tweets (about 20GB) covering 50 movies over 30 days. Sentiment analysis is applied to the tweets to examine users' preferences towards the movies. In addition, a linear regression model that considers the effect of both positive and negative tweets is built to classify the movies into three categories: lower-than-average movies, average movies, and best-seller movies. We use data from 10 released movies to train the model, which is then used to predict the success of the remaining 40 unreleased movies. The experimental results demonstrate that the movie revenue prediction model is valid and that the success of a movie can be accurately predicted using the proposed linear regression model. We also find that not only positive tweets but also negative tweets can contribute to the success of a movie.
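As an illustration of the modelling idea only, the following minimal sketch fits an ordinary least squares regression of revenue on positive and negative tweet counts and then buckets the predicted revenue into the three categories. It is not the paper's implementation: the function names, the toy training numbers, and the two revenue cut-offs (`LOW_CUT`, `HIGH_CUT`) are all assumptions made up for this example, and the sentiment step that would produce the tweet counts is omitted.

```python
from typing import List, Sequence


def fit_linear_regression(
    features: Sequence[Sequence[float]], targets: Sequence[float]
) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0, *row] for row in features]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(weights: Sequence[float], row: Sequence[float]) -> float:
    """Predicted revenue = intercept + sum(coef * feature)."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


def categorize(revenue: float, low_cut: float, high_cut: float) -> str:
    """Map a predicted revenue onto the three categories named in the abstract."""
    if revenue < low_cut:
        return "lower than average"
    if revenue < high_cut:
        return "average"
    return "best seller"


# Toy training data (invented for illustration, NOT from the paper):
# (positive tweets, negative tweets) in thousands -> revenue in millions.
train_x = [(1, 1), (2, 1), (1, 3), (4, 2), (3, 5)]
train_y = [6.0, 9.0, 8.0, 16.0, 16.0]

# Hypothetical cut-offs separating the three categories.
LOW_CUT, HIGH_CUT = 8.0, 15.0

weights = fit_linear_regression(train_x, train_y)
predicted = predict(weights, (5, 2))  # an "unreleased" movie's tweet counts
label = categorize(predicted, LOW_CUT, HIGH_CUT)
```

In this toy data both coefficients come out positive, which mirrors the abstract's observation that negative tweets can also contribute to a movie's success; with real data the signs and magnitudes would have to be estimated rather than assumed.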
