A modified fuzzy relational clustering approach for sentence-level text

Sikder Tahsin Al-Amin, Mahade Hasan, M M A Hashem

doi:10.1109/eict.2015.7392016

Abstract

This paper proposes a fuzzy relational clustering (FRC) approach that finds similar sentences in a set of sentences and groups them into clusters. To measure sentence similarity, FRC uses both word-to-word similarity and word-order similarity. Word-to-word similarity is computed with the Jiang and Conrath (JnC) similarity measure using the WordNet database. Order similarity is calculated from the joint word set. Because a sentence may relate to more than one theme, FRC takes a fuzzy clustering approach and uses the FRECCA algorithm to cluster the sentences. The algorithm works within an Expectation-Maximization framework in which the importance of a sentence is expressed by its PageRank score, which is treated as a likelihood. The PageRank scores and mixing coefficients are initialized with uniform random number generation. Applied to a quotation dataset of different classes, the method proved capable of identifying similar sentences and grouping them in a cluster. FRC was also applied to a news article dataset, where it likewise produced good results.
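
As an illustration of the order-similarity step, the following is a minimal Python sketch. It assumes the commonly used formulation in which each sentence is mapped to a vector of word positions over the joint word set and the similarity is 1 - ||r1 - r2|| / ||r1 + r2||; the abstract does not state the exact formula, so this is an assumption. The function names are hypothetical, and the sketch matches words exactly instead of falling back to the most similar word through WordNet.

```python
import math


def joint_word_set(s1, s2):
    """Distinct words of both sentences, in order of first appearance."""
    joint = []
    for word in s1 + s2:
        if word not in joint:
            joint.append(word)
    return joint


def order_vector(sentence, joint):
    """1-based position of each joint word in the sentence, 0 if absent.

    Assumption: exact string match only; a repeated word is assigned
    the position of its first occurrence.
    """
    return [sentence.index(w) + 1 if w in sentence else 0 for w in joint]


def order_similarity(s1, s2):
    """Assumed formulation: 1 - ||r1 - r2|| / ||r1 + r2||."""
    joint = joint_word_set(s1, s2)
    r1 = order_vector(s1, joint)
    r2 = order_vector(s2, joint)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    # Two empty sentences have no ordering to disagree on.
    return 1.0 - diff / total if total else 1.0


a = "the dog chased the cat".split()
b = "the cat chased the dog".split()
print(order_similarity(a, a))  # identical word order
print(order_similarity(a, b))  # same words, different order
```

Two sentences with the same words in the same order score 1.0 under this formulation, while swapping "dog" and "cat" lowers the score even though the word sets are identical, which is the information the order term contributes alongside word-to-word similarity.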

Full Text