Abstract
Document clustering is a long-established technique used in fields such as data mining, information retrieval, knowledge discovery from data, and pattern recognition. The large volumes of textual data now being created have made document clustering techniques increasingly important. Although various document clustering techniques have been studied in recent years, clustering quality remains an area of concern. In particular, the majority of existing document clustering methods do not account for semantic relationships and, as a result, give unsatisfactory clustering results. Semantic relationships take into account the context in which a term is used rather than relying solely on its isolated meaning. In recent years, considerable effort has gone into applying semantics to document clustering. This paper presents a survey of the research papers studied and highlights the merits and demerits of each clustering algorithm, with the aim of giving future research a more focused direction.