Abstract

Text mining is the process of extracting information and patterns from textual data. Online Social Networks, which have attracted immense attention in recent years, produce enormous volumes of text data reflecting human behaviour as users interact with each other. This data is intrinsically unstructured and ambiguous. It contains incorrect spellings and faulty grammar, leading to lexical, syntactic and semantic ambiguities, which in turn cause incorrect analysis and inappropriate pattern identification. Researchers use various text mining approaches that can help in anomaly detection through topic modeling, the identification of trending topics and hate speech, and the evolution of communities in Online Social Networks. This paper presents a comparative analysis of the performance of four classification algorithms, Support Vector Machine (SVM), Rocchio, Decision Trees and K-Nearest Neighbour (KNN), on a Twitter data set. The experimental study revealed that SVM outperforms the other classifiers and classifies the data set into anomalous and non-anomalous user opinions.

Keywords: Anomaly detection; Social media; Rocchio algorithm; Decision trees; K-Nearest neighbour (KNN) and Support Vector Machine (SVM); Kernel function; Gini index; Entropy and Euclidean distance
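As a rough illustration of two of the classifiers named above, the following minimal sketch implements a Rocchio (nearest-centroid) classifier and a K-Nearest Neighbour classifier over term-frequency vectors, both using Euclidean distance. It is not the paper's implementation: the toy tweets, the labels, the function names and the choice of `k` are all assumptions made up for illustration, and SVM and decision trees are omitted for brevity.

```python
from collections import Counter
import math

# Hypothetical toy tweets, invented for illustration only
# (this is NOT the Twitter data set used in the paper).
TRAIN = [
    ("win free money now click link", "anomalous"),
    ("free prize click now", "anomalous"),
    ("lovely weather in the park today", "normal"),
    ("great match with friends today", "normal"),
]


def build_vocab(texts):
    """Sorted list of every distinct token seen in the training texts."""
    return sorted({word for text in texts for word in text.split()})


def vectorize(text, vocab):
    """Term-frequency vector over a fixed vocabulary; unknown words are ignored."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rocchio_fit(vectors, labels):
    """Rocchio training: compute one centroid (mean vector) per class."""
    centroids = {}
    for label in set(labels):
        members = [v for v, l in zip(vectors, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def rocchio_predict(centroids, vec):
    """Assign the class whose centroid is closest to the query vector."""
    return min(centroids, key=lambda label: euclidean(centroids[label], vec))


def knn_predict(vectors, labels, vec, k=3):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(zip(vectors, labels), key=lambda pair: euclidean(pair[0], vec))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


vocab = build_vocab(text for text, _ in TRAIN)
X = [vectorize(text, vocab) for text, _ in TRAIN]
y = [label for _, label in TRAIN]

centroids = rocchio_fit(X, y)
query = vectorize("click now for free money", vocab)

rocchio_label = rocchio_predict(centroids, query)
knn_label = knn_predict(X, y, query, k=3)
print(rocchio_label, knn_label)
```

On this toy input both classifiers assign the query to the `anomalous` class. A real comparison would need proper tokenisation, a weighting scheme such as TF-IDF, and a held-out test set.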
