Abstract

Latent Semantic Indexing (LSI) successfully retrieves the relations between text documents and achieves dynamic clustering based on their conceptual content. However, inaccurate extraction of essential sentences, long retrieval time, and redundancy pose significant challenges that have to be addressed. In addition, LSI does not address the problem of searching documents with multimedia content. This paper introduces a vantage-point-based approach, Vantage Point Latent Semantic Indexing (VP-LSI), to improve the accuracy and speed of retrieving clustered multimedia web documents. Web documents clustered with a Rayleigh cluster SOM are indexed by the vantage point of the predominant word in each cluster. Similarly, vantage points of multimedia objects are used to cluster multimedia documents. The vantage point of a text cluster is calculated from the relative strength of the predominant word in the Rayleigh clusters of the domain area, using its mean and standard deviation. The vantage point of an image object is estimated from the mean and standard deviation of its histogram values. The cluster objects are then indexed to the documents. Finally, a user query is searched against the indexed Rayleigh cluster objects to fetch more accurate web documents in less time. The experimental evaluation searches web documents across multiple domain areas available in research repositories. The results show that the proposed method achieves better search accuracy, shorter web document retrieval time, and a better indexing rate than the existing Lexical Semantic Indexing approach.
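
The abstract does not give the exact formulas, so the following is only a minimal sketch of how an image vantage point built from the mean and standard deviation of histogram values might be computed and used for lookup. The function names (`histogram`, `vantage_point`, `nearest_cluster`), the bin count, the use of the population standard deviation, and the Euclidean nearest-match rule are all assumptions made for illustration, not the authors' method.

```python
import math
from statistics import mean, pstdev

def histogram(pixels, bins=4, max_value=256):
    """Count grey-level pixel values into equal-width bins (assumed layout)."""
    counts = [0] * bins
    width = max_value / bins
    for p in pixels:
        # Clamp to the last bin so the maximum value stays in range.
        counts[min(int(p // width), bins - 1)] += 1
    return counts

def vantage_point(values):
    """Summarise a list of values as (mean, standard deviation).

    Assumption: the vantage point is simply this pair; the paper may
    combine the two statistics differently.
    """
    return (mean(values), pstdev(values))

def nearest_cluster(index, query_vp):
    """Return the key of the indexed vantage point closest to query_vp.

    Assumption: closeness is plain Euclidean distance.
    """
    return min(index, key=lambda name: math.dist(index[name], query_vp))

# Hypothetical usage: index two clusters, then look up a query image.
index = {
    "cluster_a": vantage_point(histogram([0, 10, 100, 200, 255])),
    "cluster_b": vantage_point(histogram([250] * 40)),
}
query = vantage_point(histogram([5, 20, 90, 210, 240]))
print(nearest_cluster(index, query))
```

Comparing a query against one vantage point per cluster, rather than against every document, is what would make the lookup cheaper than an exhaustive search under these assumptions.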
