Time-sensitive clustering evolving textual data streams

Mohamed Ammar, Minyar Sassi Hidri, Adel Hidri

doi:10.1504/ijcat.2020.10030073

Abstract

Clustering a stream of text documents is an emerging subject of interest, since it is widely used to analyse content in social media and e-journals. The aim is to find structure in unlabelled data based on a similarity criterion. However, few works have addressed this problem, so a new document clustering approach adapted to streams of text data is proposed and tested on news article data sets. Words are given a distributed representation, and a bottom-up approach represents documents as vectors on a unit hypersphere. The proposed approach is rooted in the spherical k-means (SPKM) algorithm and its underlying mixture of von Mises-Fisher (vMF) distributions. It yields results comparable to those of the baseline batch algorithm for stable data streams and superior results for rapidly evolving data streams.
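To make the building blocks named in the abstract concrete, here is a minimal sketch of plain batch spherical k-means on documents embedded on the unit hypersphere. It is not the authors' streaming, time-sensitive method. The abstract does not say how documents are built from word vectors, so averaging word vectors and normalising is an assumption made here. The toy 2-D embeddings and the helper names `doc_to_unit_vector` and `spherical_kmeans` are hypothetical, chosen only for illustration.

```python
import numpy as np

def doc_to_unit_vector(tokens, word_vectors):
    """Assumed bottom-up step: average known word vectors, then normalise to unit length."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        raise ValueError("document contains no known tokens")
    mean = np.mean(vecs, axis=0)
    norm = np.linalg.norm(mean)
    if norm == 0.0:
        raise ValueError("document vector has zero length")
    return mean / norm

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Batch spherical k-means: assign by cosine similarity, re-normalise centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct documents.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Rows of X and centroids are unit vectors, so the dot product is the cosine.
        labels = np.argmax(X @ centroids.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the previous centroid for an empty cluster
            total = members.sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0.0:
                centroids[j] = total / norm
    labels = np.argmax(X @ centroids.T, axis=1)
    return labels, centroids

# Hypothetical toy embeddings: two loosely separated "topics" in 2-D.
word_vectors = {
    "match": np.array([1.0, 0.1]), "goal": np.array([0.9, 0.2]),
    "team": np.array([0.8, 0.0]), "vote": np.array([0.1, 1.0]),
    "election": np.array([0.2, 0.9]), "party": np.array([0.0, 0.8]),
}
docs = [["match", "goal"], ["team", "goal"], ["vote", "election"], ["party", "vote"]]

X = np.vstack([doc_to_unit_vector(d, word_vectors) for d in docs])
labels, centroids = spherical_kmeans(X, k=2)
print(labels)
```

Because every vector has unit length, maximising the dot product is the same as maximising cosine similarity, which is why the sketch never computes Euclidean distances. A streaming variant would have to update centroids incrementally and discount older documents; how the paper does that is not stated in the abstract.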


Published in: International Journal of Computer Applications in Technology

Similar Papers

Efficient streaming text clustering
Shi Zhong
Neural Networks | VOL. 18
01 Jul 2005

Social media sentiment analysis through parallel dilated convolutional neural network for smart city applications
Muhammad Alam ... L.V Yunrong
Computer Communications | VOL. 154
19 Feb 2020

Social Media as Distribution Tool
Mikko Villi
29 Apr 2019

THE IMPACT OF SOCIAL MEDIA ACTIVITY, INTERACTIVITY, AND CONTENT ON CUSTOMER SATISFACTION: A STUDY OF FASHION PRODUCTS
Muhammad Tahir Jan ... Johan De Jager
EURASIAN JOURNAL OF BUSINESS AND MANAGEMENT | VOL. 8
01 Jan 2020
