Abstract
Most web users rely on search engines to find specific information. A key factor in the success of a web search engine is its ability to rapidly return good-quality results for queries built from specific terms. This paper aims to retrieve more relevant documents from a huge corpus for the information a user requires. We propose a particle swarm optimization algorithm based on latent semantic indexing (PSO+LSI) for text clustering. The PSO family of bio-inspired algorithms has recently been applied successfully to a number of real-world clustering problems. We use an adaptive inertia weight (AIW) that balances exploration and exploitation in the search space. PSO can be combined with LSI to achieve the best clustering accuracy and efficiency. This framework returns more relevant documents to the user and fewer irrelevant ones. The results show that, for all numbers of dimensions, PSO+LSI is faster than PSO+K-means using the vector space model (VSM). The PSO+LSI method with 1000 terms takes 22.3 s to reach its best performance, at 150 dimensions.
Key words: Vector space model, particle swarm optimization (PSO) algorithm, latent semantic indexing, text clustering, adaptive inertia weight.
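The general idea of PSO-based clustering with a time-varying inertia weight can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the AIW formula, so a linearly decreasing inertia weight is used here as a stand-in, and the function names (`inertia_weight`, `fitness`, `pso_cluster`), the parameter values, and the toy 2-D vectors (standing in for LSI-reduced document vectors) are all assumptions made for illustration.

```python
import math
import random


def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight (assumed stand-in for the paper's AIW)."""
    return w_start - (w_start - w_end) * t / t_max


def fitness(position, docs, k):
    """Mean distance from each document vector to its nearest centroid."""
    dim = len(docs[0])
    centroids = [position[i * dim:(i + 1) * dim] for i in range(k)]
    return sum(min(math.dist(d, c) for c in centroids) for d in docs) / len(docs)


def pso_cluster(docs, k, n_particles=20, t_max=100, c1=1.5, c2=1.5, seed=0):
    """Search for k centroids; each particle is one flat vector of k centroids."""
    rng = random.Random(seed)
    size = k * len(docs[0])
    lo = min(min(d) for d in docs)
    hi = max(max(d) for d in docs)
    pos = [[rng.uniform(lo, hi) for _ in range(size)] for _ in range(n_particles)]
    vel = [[0.0] * size for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, docs, k) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness seen so far, one entry per iteration
    for t in range(t_max):
        # Large w early favours exploration, small w late favours exploitation.
        w = inertia_weight(t, t_max)
        for i in range(n_particles):
            for j in range(size):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                pos[i][j] += vel[i][j]
            f = fitness(pos[i], docs, k)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history


# Hypothetical toy data: six 2-D vectors standing in for LSI-reduced documents.
docs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
best, score, history = pso_cluster(docs, k=2)
```

Here `best` holds the two centroids as one flat list of four numbers, and `history` records the best fitness after each iteration, which can only stay level or decrease. In a full pipeline, the input vectors would come from projecting a term-document matrix onto a small number of LSI dimensions before clustering.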