Abstract

A statistical measure for finding similar sentences is developed using a novel fuzzy clustering framework that organizes text from one or more documents into clusters. Traditional fuzzy clustering approaches are not applicable to sentence clustering because most sentence similarity measures do not represent sentences in a common metric space. An enhanced fuzzy clustering algorithm is applied to sentence datasets to group related sentences. The PageRank algorithm is used to highlight the more relevant objects among the clusters through each object's PageRank score. An Expectation-Maximization (EM) framework is developed to identify overlapping clusters of semantically related sentences. Experiments on a quotations dataset and a news article dataset show that, in matching semantically similar sentences, our system outperforms the baseline method and projection methods. The proposed method scores 34% higher in similarity scoring of related sentences. Clustering performance is also analyzed in terms of entropy and purity, and the method yields higher purity and lower entropy. The experimental results demonstrate that our method is capable of identifying overlapping clusters of semantically related sentences and can be used in a variety of text mining tasks.
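The abstract reports clustering quality in terms of purity and entropy without defining them. The sketch below shows one common way to compute both from a hard cluster assignment and reference class labels. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `purity_and_entropy`, the use of base-2 logarithms, and the size-weighted average over clusters are choices made here, and a fuzzy clustering would first need each sentence assigned to its highest-membership cluster.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(cluster_ids, class_labels):
    """Return (purity, entropy) for a hard clustering.

    Assumed definitions (not taken from the paper):
      purity  = (1/N) * sum over clusters of the largest class count
      entropy = sum over clusters of (n_j/N) * H_j, where H_j is the
                base-2 Shannon entropy of the class mix in cluster j
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how many items of each class fall in each cluster.
    members = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        members[cluster][label] += 1

    n = len(cluster_ids)
    purity = sum(max(counts.values()) for counts in members.values()) / n

    entropy = 0.0
    for counts in members.values():
        size = sum(counts.values())
        h = -sum((k / size) * math.log2(k / size) for k in counts.values())
        entropy += (size / n) * h  # weight each cluster by its share of items

    return purity, entropy
```

Under these definitions, a clustering that matches the reference classes exactly gives a purity of 1 and an entropy of 0, which corresponds to the "more purity and less entropy" direction the abstract describes.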
