Abstract

Sentiment analysis has become an active field in both research and industry. The term sentiment refers to a person's emotions or opinions about a particular issue, and sentiment analysis is regarded as a direct application of opinion mining. The large number of tweets posted daily makes Twitter a rich source of textual data and one of the most important data volumes; this data serves different purposes, such as business, industrial, or social ones, depending on the data requirements and the processing needed. The volume of data is very large and grows rapidly every second. Such data is known as big data, and it requires special processing techniques and high computational power to perform the required mining tasks. In this work, we perform sentiment analysis using PySpark, the Python interface for Apache Spark, an open-source distributed processing platform that uses a distributed memory abstraction. The goal of using PySpark is to run applications in parallel on a distributed cluster (multiple nodes). The effectiveness of the proposed approach is demonstrated against other approaches: it achieves better classification results with the Naïve Bayes, Logistic Regression, and Decision Tree classification algorithms. Finally, our solution evaluates the performance of Apache Spark with respect to its scalability.
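
To make the idea of parallel training concrete, the following is a minimal plain-Python sketch, not the PySpark implementation used in this work. It assumes a multinomial Naïve Bayes classifier with add-one smoothing and shows why such a model suits a distributed cluster: each partition counts words independently (a map step), and the partial counts are simply summed (a reduce step). The toy tweets, the function names (`partition`, `count_partition`, `merge`, `train`, `predict`), and the smoothing parameter `alpha` are all hypothetical and chosen for illustration only.

```python
import math
from collections import Counter
from functools import reduce

# Toy labelled tweets (made up for illustration; not data from the paper).
TWEETS = [
    ("i love this phone", "pos"),
    ("great service and great food", "pos"),
    ("what a wonderful day", "pos"),
    ("i hate waiting in line", "neg"),
    ("terrible service and awful food", "neg"),
    ("this phone is awful", "neg"),
]

def partition(data, n):
    """Split data into n chunks, standing in for partitions on cluster nodes."""
    return [data[i::n] for i in range(n)]

def count_partition(chunk):
    """Map step: count (label, word) pairs and labels within one partition."""
    word_counts, label_counts = Counter(), Counter()
    for text, label in chunk:
        label_counts[label] += 1
        for word in text.split():
            word_counts[(label, word)] += 1
    return word_counts, label_counts

def merge(a, b):
    """Reduce step: counts are additive, so partial results are summed."""
    return a[0] + b[0], a[1] + b[1]

def train(data, n_partitions=2):
    """Train by counting per partition and merging the partial counts."""
    partials = map(count_partition, partition(data, n_partitions))
    return reduce(merge, partials)

def predict(model, text, alpha=1.0):
    """Return the label with the highest smoothed log-posterior score."""
    word_counts, label_counts = model
    vocab = {word for (_, word) in word_counts}
    total_docs = sum(label_counts.values())
    scores = {}
    for label, n_docs in label_counts.items():
        total_words = sum(c for (l, _), c in word_counts.items() if l == label)
        score = math.log(n_docs / total_docs)  # log prior
        for word in text.split():
            numerator = word_counts[(label, word)] + alpha
            score += math.log(numerator / (total_words + alpha * len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TWEETS)
```

Because the merged counts do not depend on how the data was split, the trained model is the same for any number of partitions, which is the property that lets this kind of training scale across nodes. In PySpark the same roles would be played by the framework's own partitioning and aggregation rather than by the hypothetical helpers above.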
