Web Document Clustering and Visualization Results of Semantic Web Search Engine Using V-Ranking

S K Jayanthi, S Prema

doi:10.7763/ijcte.2011.v3.350

Abstract

As the number of available Web pages grows, it becomes more difficult for users to find documents relevant to their interests. Clustering is the classification of a data set into subsets (clusters) so that the data in each subset ideally share some common trait, often proximity according to a defined distance measure. Because queries are short, keyword-based approaches are not well suited to document clustering. This paper describes a new Web document clustering method that makes use of user logs, which identify the documents users have selected for a query. The similarity between two queries may be deduced from the common documents the users selected for them. The paper shows that a combination of content-based and session-based clustering is better than using either method alone. The clustered documents are arranged based on V-Ranking. The paper proposes displaying the results of a semantic search engine in visual mode using the V (Visual) Ranking algorithm and a bookshelf data structure. Semantic web search results are visualized as web graphs, representations of web structure overlaid with information and pattern tiers, which give the viewer a qualitative understanding of the information content. The paper also proposes a clustering-based approach to support the comprehension of web applications. The approach is based on a clustering process that first computes the dissimilarity between web pages using Latent Semantic Indexing, a well-known information retrieval technique, and then groups similar pages. A prototype has also been implemented to automate the clustering process. Results are reported for the different clustering algorithms applied to the static pages of three web applications developed using JSP technology.
Documents clustered using both content-based and session-based clustering are ordered by V-Ranking. The ordered documents are arranged on the shelves of a bookshelf data structure, and the web search result is finally presented in visual mode; for this reason the ranking algorithm is named visual ranking. The main focus of this paper is the processing of the results returned by an information retrieval system. Although relevance depends on the quality of the results, effective results processing offers an alternative way to improve relevance for the user. Given current expectations, this processing consists of an organization step and a visualization step. The proposed approach organizes the results according to their meaning using a Bookshelf Data Structure and visualizes them in a 3D scene to increase the representation space. This processing, still neglected in some information retrieval systems, is becoming increasingly important. Its two main requirements are good document organization and effective visualization; accordingly, the main directions of this paper are a clustering method and a 3D visualization.
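
The idea of combining content-based and session-based similarity between queries can be illustrated with a minimal sketch. The abstract does not state the exact formulas, so everything specific here is an assumption made for illustration: Jaccard overlap is used for both the keyword similarity and the shared-click similarity, the two are mixed with a hypothetical weight `alpha`, and the click log is toy data.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(q1, q2, clicks, alpha=0.5):
    """Mix keyword overlap with overlap of clicked documents.

    alpha is a hypothetical mixing weight (not taken from the paper):
    alpha=1.0 uses content only, alpha=0.0 uses the click log only.
    """
    # Content-based part: shared keywords between the two query strings.
    content = jaccard(q1.lower().split(), q2.lower().split())
    # Session-based part: documents users selected for both queries.
    session = jaccard(clicks.get(q1, ()), clicks.get(q2, ()))
    return alpha * content + (1 - alpha) * session


# Toy click log (assumed data): query -> documents users selected for it.
clicks = {
    "jaguar speed": {"doc1", "doc2"},
    "fastest big cat": {"doc1", "doc2", "doc3"},
    "jaguar car price": {"doc7"},
}

related = combined_similarity("jaguar speed", "fastest big cat", clicks)
unrelated = combined_similarity("jaguar speed", "jaguar car price", clicks)
print(related, unrelated)
```

In this toy log, the first pair shares no keywords but shares clicked documents, while the second pair shares a keyword but no clicks, so the combined score ranks the first pair as more similar. This is the kind of case where short queries make keyword-only comparison unreliable.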
