Abstract

Many documents describing halal products are available on Internet web pages. A user may search for halal-related information with query words and is then presented with a list of documents relevant to the query. We investigate topic analysis techniques such as Latent Semantic Analysis (LSA). For retrieval, frequency-based inverted indexing and latent semantic indexing (LSI) are used to discover important associations between terms and terms, terms and documents, and documents and documents. Cosine similarity is used to measure the similarity between the query words and the terms as well as the documents. We develop a prototype and evaluate the techniques on a Malay test collection containing documents extracted from a translated Al-Quran collection, a translated hadith collection, and web pages written in Malay. The results show that LSI outperformed the exact frequency-based technique despite its longer processing time during indexing. We compare and discuss the results obtained with latent semantic indexing against those obtained with conventional frequency analysis.
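
The frequency-based baseline described above can be illustrated with a minimal sketch. It is not the authors' prototype: the function names `build_index` and `rank`, the whitespace tokenisation, the use of raw term frequencies as weights, and the three toy Malay documents are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: raw term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq
    return index


def rank(query, index):
    """Rank documents by cosine similarity of raw term-frequency vectors."""
    q_vec = Counter(query.lower().split())
    q_norm = math.sqrt(sum(f * f for f in q_vec.values()))

    # Squared length of every document vector, read off the index.
    norms = defaultdict(float)
    for postings in index.values():
        for doc_id, freq in postings.items():
            norms[doc_id] += freq * freq

    # Dot products, accumulated only over the postings of query terms.
    dots = defaultdict(float)
    for term, q_freq in q_vec.items():
        for doc_id, freq in index.get(term, {}).items():
            dots[doc_id] += q_freq * freq

    scores = {d: dot / (q_norm * math.sqrt(norms[d])) for d, dot in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy collection, not taken from the paper's test collection.
docs = {
    "d1": "makanan halal dan minuman halal",
    "d2": "produk kosmetik halal",
    "d3": "resipi kek coklat",
}
index = build_index(docs)
results = rank("makanan halal", index)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

Because this index matches terms exactly, a document that shares no term with the query (here `d3`) is never returned. LSI addresses that limitation by applying a truncated singular value decomposition to the term-document matrix and comparing queries and documents in the resulting lower-rank space, at the cost of extra work during indexing.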
