Domain Specific Web Pages Research Articles

The World Wide Web (WWW), today's predominant source for Information Retrieval (IR), is essentially a set of hyperlinked documents. A web page that contains many related hyperlinks can satisfy a user's needs in a single page, so IR systems should give high priority to such pages. When assigning a rank to a web page, existing web mining techniques such as Hypertext Induced Topic Selection (HITS) and page ranking algorithms focus on the number of in-links and out-links of the page. Rather than relying only on link counts, discovering the semantic relations between a web page and the hyperlinks it contains can improve the quality of IR systems. Rhetorical Structure Theory (RST) is widely used to find semantic relations between text fragments by analysing the discourse structure of a text. In this paper, we propose a novel approach that uses RST-based discourse relations to find the semantic relation between a web page and the hyperlinks present in it. We have implemented and evaluated our approach on an IR system using 500 Tamil-language and 50 English-language tourism-domain web pages. We also compare the proposed approach with an existing page ranking algorithm.


Neural networks have been used in various applications on the World Wide Web, but most of them rely only on the available input-output examples without incorporating Web-specific knowledge, such as Web link analysis, into the network design. In this paper, we propose a new approach in which the Web is modeled as an asymmetric Hopfield Net. Each neuron in the network represents a Web page, and the connections between neurons represent the hyperlinks between Web pages. Web content analysis and Web link analysis are also incorporated into the model by adding a page content score function and a link score function into the weights of the neurons and the synapses, respectively. A simulation study was conducted to compare the proposed model with traditional Web search algorithms, namely a breadth-first search and a best-first search using PageRank as the heuristic. The results showed that the proposed model performed more efficiently and effectively in searching for domain-specific Web pages. We believe that the model can also be useful in other Web applications such as Web page clustering and search result ranking.
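The model described in this abstract can be pictured with a small sketch. The code below is an illustration only, not the authors' implementation: the function name `spread_activation`, the `tanh` squashing function, the clamping of seed pages, and all scores are assumptions made for this example. It treats pages as neurons, directed hyperlinks as asymmetric weighted synapses (the link score), and scales each page's activation by a page content score.

```python
import math


def spread_activation(links, content_score, seeds, max_iter=50, eps=1e-4):
    """Hypothetical spreading-activation search over a tiny web graph.

    links:         dict mapping (source, target) -> link score; the graph is
                   directed, so (a, b) does not imply (b, a) (asymmetric net).
    content_score: dict mapping page -> content score in [0, 1].
    seeds:         set of starting pages, clamped at activation 1.0.
    """
    pages = set(content_score)
    act = {p: (1.0 if p in seeds else 0.0) for p in pages}

    for _ in range(max_iter):
        new_act = {}
        for p in pages:
            if p in seeds:
                new_act[p] = 1.0  # seed pages stay fully active
                continue
            # Weighted sum of activation arriving over incoming hyperlinks.
            incoming = sum(w * act[src]
                           for (src, dst), w in links.items() if dst == p)
            # Squash the input, then scale by the page's content score.
            new_act[p] = content_score[p] * math.tanh(incoming)

        change = sum(abs(new_act[p] - act[p]) for p in pages)
        act = new_act
        if change < eps:  # the network has settled
            break
    return act


if __name__ == "__main__":
    links = {("a", "b"): 1.0, ("b", "c"): 1.0}
    content = {"a": 1.0, "b": 0.9, "c": 0.1, "d": 0.8}
    result = spread_activation(links, content, seeds={"a"})
    for page in sorted(result, key=result.get, reverse=True):
        print(page, round(result[page], 3))
```

Pages whose final activation is highest would be visited or returned first. Under these assumptions, a page needs both relevant content and a path of links from the seed pages to score well; a page with strong content but no incoming links stays inactive.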


Articles published on Domain Specific Web Pages

An Approach to Page Ranking Based on Discourse Structures

Prioritize the ordering of URL queue in Focused crawler

Incorporating Web Analysis Into Neural Networks: An Example in Hopfield Net Searching
