Abstract

The Internet can be considered an open field for expression about products, politics, ideas, and people. These expressive interactions generate a large amount of data tied to individual users and groups. In that scope, Big Data, along with technologies such as social media, cloud computing, and machine learning, can be used as a toolbox to make sense of the data and to produce efficient analyses and studies of individuals and crowds with regard to market orientation, politics, and industry. Recommendation systems act as a technological pillar here, drawing on sentiment analysis and predictive analysis to make sense of users' data. However, this complex operation comes at a price: each analysis requires its own personalized architecture and tools. In this paper, a novel design of a recommender system is provided, powered by sentiment analysis and predictive models and applied to an example data flow from the social media platform Twitter.

Highlights

  • A significant amount of the data available on the Internet represents a rich source for knowledge extraction, in particular social media such as Twitter or Facebook

  • A proposition on how to extract data from Twitter based on keywords and store it in the Hadoop Ecosystem [1] in order to carry out a sentiment analysis

  • Sentiment analysis [4], or opinion mining, can be described as the computational study of people's opinions, judgments, attitudes, and emotions toward entities and their aspects

Summary

INTRODUCTION

A significant amount of the data available on the Internet represents a rich source for knowledge extraction, in particular social media such as Twitter or Facebook. The way this data is processed enables data scientists to deliver relevant and exploitable results for economic, social, industrial, government-policy, or business purposes. This paper proposes a way to extract data from Twitter based on keywords and store it in the Hadoop Ecosystem [1] in order to carry out a sentiment analysis. This processing allows us to categorize tweets as positive, negative, or neutral using scoring methods. The Naïve Bayes classifier assumes that the predictive variables are independent of one another given the class, even when they are in fact dependent, and that the numerical predictors follow a Gaussian distribution with mean and standard deviation calculated from the training dataset. If the test dataset has missing values, these predictors are omitted from the probability calculation during prediction.
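
To make the Naïve Bayes assumptions above concrete, the following is a minimal sketch in plain Python, not the implementation used in the paper. The feature layout (a polarity score and an exclamation-mark count per tweet), the toy training rows, and the function names fit_gaussian_nb and predict are all hypothetical, chosen only for illustration. Each numerical predictor gets a per-class mean and standard deviation estimated from the training data, and a predictor whose value is missing (None) at prediction time is simply left out of the probability calculation.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-feature (mean, std) from training data."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((x - mean) ** 2 for x in column) / len(column)
            # Small epsilon keeps the std strictly positive.
            stats.append((mean, math.sqrt(var) + 1e-9))
        model[label] = (len(members) / len(rows), stats)
    return model

def log_gaussian(x, mean, std):
    """Log density of a Gaussian with the given mean and std at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def predict(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mean, std) in zip(row, stats):
            if x is None:
                continue  # missing predictor: omitted from the calculation
            score += log_gaussian(x, mean, std)  # independence: log terms add
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (polarity score, exclamation-mark count) per tweet.
train_rows = [(0.8, 2.0), (0.6, 1.0), (0.7, 3.0),
              (-0.7, 0.0), (-0.5, 1.0), (-0.9, 2.0)]
train_labels = ["positive"] * 3 + ["negative"] * 3

model = fit_gaussian_nb(train_rows, train_labels)
print(predict(model, (0.65, None)))  # second predictor is missing
print(predict(model, (-0.6, 1.0)))
```

Working in log space keeps the product of many small densities numerically stable, and skipping a missing predictor amounts to dropping its factor from that product for every class alike.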

Opinion Mining
Ease of Use Big Data and Components
APPROACH
Storage
Data Analysis
EXPERIMENTAL EVALUATION
Predictive Analysis
CONCLUSION