Abstract

Social media has become increasingly popular for personal and professional use. As a result, its users have contributed a significant amount of uncontrolled data, and this uncontrolled information may be harmful to individuals or communities. In this paper, we investigate text and image feature selection techniques that can be applied to social media-based text and image data. The aim of this investigation is to demonstrate how optimal classification accuracy can be obtained from limited, linguistically dynamic text and from Web images. To this end, we design an experiment that demonstrates the effectiveness of different feature selection models for text and images. The experiment involves three text feature selection techniques, three image feature selection techniques, and two classifiers, and measures the influence of feature selection on classifier performance. The experiments on Twitter text and image datasets yielded the following findings: (1) classifying short texts is difficult because of the limited number of features and the variation in vocabulary, (2) individual features are less accurate than their combinations in both scenarios (i.e., text and images), (3) for text and image classification, SVM is more computationally expensive than ANN, and (4) ANN-based classification is more accurate than the SVM-based combinations.

