Abstract

This paper proposes a fuzzy relational clustering (FRC) method that finds similar sentences in a set of sentences and groups them into clusters. To measure sentence similarity, FRC combines word-to-word similarity and word-order similarity. Word-to-word similarity is computed with the Jiang and Conrath (JnC) measure using the WordNet database, and order similarity is computed from the joint word set of the two sentences. Because a sentence may relate to more than one theme, FRC takes a fuzzy clustering approach and uses the FRECCA algorithm to cluster sentences. The algorithm follows the Expectation-Maximization framework, in which the importance of a sentence is expressed by its PageRank score, which is treated as a likelihood. The PageRank scores and mixing coefficients are initialized with uniformly distributed random numbers. Applied to a dataset of quotations from different classes, the method was able to identify similar sentences and group them into clusters. FRC was also applied to a news article dataset, where it gave good results.
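
As an illustration of the order-similarity component, the following is a minimal sketch and not the paper's implementation. It assumes a common formulation in which each sentence is mapped to a vector of word positions over the joint word set, and similarity is 1 - ||r1 - r2|| / ||r1 + r2||. The function names `order_vector` and `order_similarity` are hypothetical, tokenization is a plain whitespace split, and words are matched by exact string equality only (no WordNet lookup).

```python
import math


def order_vector(tokens, joint):
    """Position (1-based) of each joint-set word in the sentence, 0 if absent.

    Assumption: a repeated word is represented by its first occurrence.
    """
    first_pos = {}
    for i, word in enumerate(tokens, start=1):
        first_pos.setdefault(word, i)
    return [first_pos.get(word, 0) for word in joint]


def order_similarity(sentence1, sentence2):
    """Word-order similarity of two sentences over their joint word set."""
    t1 = sentence1.lower().split()
    t2 = sentence2.lower().split()
    # Joint word set: distinct words of both sentences, in first-seen order.
    joint = list(dict.fromkeys(t1 + t2))
    r1 = order_vector(t1, joint)
    r2 = order_vector(t2, joint)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    # Two empty sentences are treated as identical (an assumption).
    return 1.0 - diff / total if total else 1.0


if __name__ == "__main__":
    print(order_similarity("the dog chased the cat", "the cat chased the dog"))
```

Two sentences with the same words in the same order score 1.0, and reordering the words lowers the score. In the full method this value would be combined with the JnC word-to-word similarity; how the two are weighted is not stated here.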
