Abstract

Although Twitter has become an important source of information, the number of accessible tweets is too large for users to easily find the information they want. To overcome this difficulty, this paper proposes a method for tweet clustering. Inspired by reports that network representation is useful for multimedia content analysis, including clustering, a network-based approach is employed. Specifically, a consensus clustering method is newly derived for tweet networks that represent relationships among the tweets' semantics and sentiment. The proposed method integrates multiple clustering results obtained by applying successful clustering methods to the tweet networks. Integrating the complementary clustering results obtained from semantic and sentiment features makes accurate clustering of tweets feasible. The contribution of this work lies in its use of these features, which distinguishes it from existing network-based consensus clustering methods that target only the network structure. Experimental results for a real-world Twitter dataset, which includes 65 553 tweets of 25 datasets, verify the effectiveness of the proposed method.

Highlights

  • With the development of social media, new forms of communication and information acquisition have been established [1]

  • Inspired by the reports that network representation is useful for multimedia content analysis including clustering [22], we employ a network-based approach in this paper

  • By newly introducing sentiment features as well as semantic features extracted from textual data, we enable the accurate clustering of tweets


Summary

INTRODUCTION

With the development of social media, new forms of communication and information acquisition have been established [1]. The number of accessible tweets is too large for users to find the information they want. To overcome this difficulty, it is useful to create an overview of many tweets by grouping tweets with similar topics. Methods have been proposed for tweet clustering using semantic features extracted from the message body [6]–[9], user information [10], hashtags [11]–[13], and combinations of these [14], [15]. We newly derive a consensus clustering scheme that integrates multiple clustering results obtained by applying two successful methods, the Louvain method [23] and the signed Louvain method [19], to the semantic and sentiment networks.
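As a rough illustration of what integrating multiple clustering results can look like, the sketch below applies a generic co-association (pairwise agreement) consensus to two label lists. It is not the scheme derived in the paper, whose details are not given in this summary. The label lists are hypothetical stand-ins for the outputs of the Louvain and signed Louvain methods on the semantic and sentiment networks, and the function names and the `threshold` parameter are assumptions made for this example.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = {}
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[(i, j)] = counts.get((i, j), 0) + 1
    return {pair: c / len(partitions) for pair, c in counts.items()}


def consensus_clusters(partitions, threshold=0.5):
    """Link pairs whose agreement exceeds `threshold` and return one label
    per item, taken from the connected components of the linked pairs."""
    n = len(partitions[0])
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), score in co_association(partitions).items():
        if score > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as consecutive integers in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical cluster labels for six tweets (illustrative values only).
semantic_labels = [0, 0, 0, 1, 1, 1]   # e.g. from a semantic network
sentiment_labels = [0, 0, 1, 1, 2, 2]  # e.g. from a sentiment network

consensus = consensus_clusters([semantic_labels, sentiment_labels])
print(consensus)
```

With only two input partitions and the default threshold, two tweets end up together only when both partitions agree, so the consensus is the intersection of the two groupings. Lowering the threshold merges tweets that share a cluster in either partition.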

RELATED WORK
GENERATION OF MULTIPLE CLUSTERING RESULTS
EXPERIMENTAL RESULTS
CONCLUSION