Abstract

In intelligent information retrieval (IR), latent semantic indexing (LSI) is a popular technique for retrieving information that is related in meaning rather than by lexical matching. A core component of the process is the singular value decomposition (SVD), which is used to remove lexical noise from the term-document matrix (TDM). Mathematical modelling of noise reduction in LSI is an important topic that demands attention. This paper introduces some observations on aspects of this topic. The work addresses a definition of noise in text processing and seeks to determine the best structure for the TDM, that is, the structure that would facilitate efficient searching within LSI.
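
For orientation, the SVD-based noise reduction mentioned above is commonly written as a rank-k truncation of the TDM. The sketch below uses the standard textbook notation, which is an assumption here and not taken from the paper: A is the TDM with terms as rows and documents as columns, r is its rank, the sigma_i are its singular values in decreasing order, and k is a retained rank chosen by the user. The paper's own definition of noise may differ from this formulation.

```latex
\[
  A = U \Sigma V^{T},
  \qquad
  A_k = U_k \Sigma_k V_k^{T} = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T},
  \qquad k < r = \operatorname{rank}(A).
\]
```

In this reading, the discarded part A - A_k (the terms with i > k) is what gets treated as lexical noise, and A_k is the closest rank-k matrix to A in the Frobenius norm.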

