Abstract

Latent Semantic Indexing (LSI) is a concept-based method for information retrieval. A term-document matrix is built using term weighting techniques and then mapped to a conceptual space by a matrix decomposition such as Singular Value Decomposition. As the number of documents and key terms grows, so does the term-document matrix, which becomes difficult to manage: a very large matrix needs more memory to store and more computation to process. On the assumption that distribution can reduce both the required memory and the run time, we studied and implemented a distributed LSI. For further improvement, we also cluster the documents. In this combination, a term-document matrix is rebuilt for each cluster, and retrieval is performed over this set of term-document matrices. We evaluate the combined method on the Hamshahri Collection, the largest collection in the Persian language. The evaluation shows a remarkable improvement over the non-combined LSI method.
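The per-cluster step can be illustrated with a minimal sketch. It assumes that cluster labels are already known and uses a simple TF-IDF weighting; the abstract names neither the clustering algorithm nor the weighting scheme, so both are assumptions, and the function name `build_cluster_matrices` is hypothetical. The decomposition step is omitted here; each resulting matrix would then be decomposed separately.

```python
import math
from collections import Counter, defaultdict


def build_cluster_matrices(docs, cluster_of):
    """Build one weighted term-document matrix per cluster.

    docs:       dict mapping doc id -> list of tokens
    cluster_of: dict mapping doc id -> cluster label (assumed given)
    Returns a dict mapping cluster label -> (terms, doc_ids, matrix),
    where matrix[i][j] is the weight of terms[i] in doc_ids[j].
    """
    # Group document ids by their cluster label.
    groups = defaultdict(list)
    for doc_id in docs:
        groups[cluster_of[doc_id]].append(doc_id)

    matrices = {}
    for label, doc_ids in groups.items():
        counts = {d: Counter(docs[d]) for d in doc_ids}
        # Only terms that occur in this cluster get a row.
        terms = sorted({t for d in doc_ids for t in counts[d]})
        n = len(doc_ids)
        # Document frequency, computed within the cluster only.
        df = {t: sum(1 for d in doc_ids if t in counts[d]) for t in terms}
        # Assumed weighting: raw term count times log(1 + n / df).
        matrix = [
            [counts[d][t] * math.log(1 + n / df[t]) for d in doc_ids]
            for t in terms
        ]
        matrices[label] = (terms, doc_ids, matrix)
    return matrices


# Toy data, invented purely for illustration.
docs = {
    "d1": ["lsi", "svd"],
    "d2": ["lsi", "index"],
    "d3": ["football", "goal"],
}
cluster_of = {"d1": "ir", "d2": "ir", "d3": "sport"}
matrices = build_cluster_matrices(docs, cluster_of)
```

Because each matrix has rows only for terms that occur in its own cluster, the individual matrices are smaller than a single matrix over the whole collection, which is the memory effect the abstract refers to.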
