Abstract

We live in the age of Big Data. Social networks, online shopping and mobile services are the main sources of the huge volume of text that users generate. This text gives companies useful insight into how customers view their brand and helps them plan business strategies proactively in order to sustain their trade. Hence, it is essential for enterprises to analyse the sentiment of social media big data in order to make predictions. Because of the variety and sheer amount of this data, sentiment analysis at scale has become difficult. However, open-source Big Data platforms and machine learning techniques make it possible to process large volumes of text in real time. Advances in Big Data and Deep Learning technology have helped overcome the traditional limitations of distributed computing. The primary aim of this work is to perform sentiment analysis on the pipelined architecture of Apache Spark ML in order to speed up computation and improve machine efficiency in different environments. To that end, a hybrid Convolutional Neural Network (CNN)-Support Vector Machine (SVM) model is designed and developed. Here, the CNN is pipelined with the SVM for sentiment feature extraction and classification in order to improve accuracy. The resulting model is more flexible, fast and scalable. In addition, Naive Bayes, SVM, Random Forest and Logistic Regression classifiers are used to measure the efficiency of the proposed system in a multi-node environment. The experimental results demonstrate that, across different evaluation metrics, the hybrid sentiment analysis model outperforms the conventional models. The proposed method makes it convenient to handle big sentiment datasets effectively. It should be of considerable value to corporations, governments and individuals.

Highlights

  • The success of smart devices has made people's daily lives more centred on mobile services

  • The proposed method starts with data pre-processing and feature extraction, then applies the machine learning classifiers Naïve Bayes, Support Vector Machine and Logistic Regression separately under Spark ML, and the proposed Convolutional Neural Network (CNN)-Support Vector Machine (SVM) model in the Spark DL environment

  • The main focus of this study is fast sentiment analysis on Big Data sets
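
The stages named in the highlights (pre-processing, feature extraction, classification) can be illustrated with a minimal single-machine sketch. This is not the paper's Spark ML or CNN-SVM implementation: it uses plain Python, bag-of-words counts as the features, and a multinomial Naive Bayes classifier with add-one smoothing as a stand-in for the conventional classifiers. The stop-word list, the function names and the four training reviews are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "was", "i"}


def preprocess(text):
    """Pre-processing: lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def train_naive_bayes(labelled_reviews):
    """Feature extraction (word counts per class) and model fitting."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labelled_reviews:
        tokens = preprocess(text)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Classification: pick the label with the highest log-posterior."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    vocab_size = len(vocabulary)
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in preprocess(text):
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training reviews, invented for this sketch.
TRAIN = [
    ("I love this phone, the battery is great", "positive"),
    ("Great service and fast delivery", "positive"),
    ("Terrible quality, it broke in a week", "negative"),
    ("I hate the slow and awful support", "negative"),
]

model = train_naive_bayes(TRAIN)
print(predict(model, "great battery"))
print(predict(model, "awful quality"))
```

In a distributed setting each of these functions would correspond to one pipeline stage applied to a partitioned dataset rather than to an in-memory list, which is the role the highlights assign to Spark ML.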


Summary

Introduction

The success of smart devices has made people's daily lives more centred on mobile services. Customer reviews are one such massive source of information: they comprise millions of reviews from different websites, and their number rises every day. Conventional methods cannot process data at this scale, so big data computing platforms such as Apache Hadoop and Apache Spark are designed to incorporate machine learning systems and attain high performance [2], [3]. Sentiment analysis is used in many settings, for example to determine whether product reviews are positive or negative, to judge whether a political party's strategy has been successful, to evaluate film ratings, and to analyse tweets or other social media data [5].
