Abstract

Modern information systems and new technologies generate terabytes of data every day, and extracting useful information from these massive datasets for decision-making takes considerable work at many levels. Social media and other Internet-based applications have been a major source of big data in recent years. Twitter is among the most widely known social media platforms worldwide, and it, like other platforms, holds unstructured data in abundance. Sentiment analysis based on clustering techniques offers an innovative way to examine the emotions expressed on social media. Unsupervised machine learning methods for sentiment analysis of unstructured big data are discussed here. The Mean Shift clustering method and the Bisecting K-Means algorithm are then compared using metrics that include precision, recall, accuracy, and F1 score. Python and PySpark are used for the data analysis.
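
As an illustration of how cluster assignments could be scored with precision, recall, accuracy, and F1, the following is a minimal sketch in plain Python. It is not the authors' pipeline. It assumes that a small set of labeled tweets is available for evaluation and that each cluster is assigned the majority sentiment of its members; the function names, the "positive"/"negative" labels, and the sample data are hypothetical.

```python
from collections import Counter


def map_clusters_to_labels(cluster_ids, true_labels):
    """Assign each cluster the majority true label among its members (assumed scheme)."""
    votes = {}
    for cluster, label in zip(cluster_ids, true_labels):
        votes.setdefault(cluster, Counter())[label] += 1
    return {cluster: counts.most_common(1)[0][0] for cluster, counts in votes.items()}


def binary_metrics(true_labels, predicted_labels, positive="positive"):
    """Compute precision, recall, accuracy, and F1 for one positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical evaluation data: true sentiment and cluster id for six tweets.
true_labels = ["positive", "positive", "negative",
               "negative", "positive", "negative"]
cluster_ids = [0, 0, 1, 1, 1, 0]

mapping = map_clusters_to_labels(cluster_ids, true_labels)
predicted = [mapping[c] for c in cluster_ids]
metrics = binary_metrics(true_labels, predicted)
print(metrics)
```

The same scoring could be applied to the cluster ids produced by either algorithm, so that both are compared against the same labeled sample.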
