Abstract

Common text-clustering algorithms often require the number of clusters to be specified a priori as a hyperparameter, and require all documents to be available beforehand. From a practical point of view this is a shortcoming, especially for applications that process dynamically changing corpora, e.g. search engines. Novel graph-based clustering algorithms have recently been developed to group similar topics, represented by documents or terms, into clusters without user intervention. This paper compares classical clustering algorithms with graph-based algorithms in order to evaluate the state of the art and identify further optimizations, especially for SeqClu (a sequential clustering algorithm), which is the focus of the authors' work.

