Abstract

This paper studies the use of neural networks in a stochastic simulation of the number of rejected Web pages per search query. Evaluating the quality of search engines should involve not only the resulting set of Web pages but also an estimate of the rejected set of pages. The iterative radial basis function (RBF) neural network developed by G. Meghabghab and G. Nasr (1999) was adapted to an actual evaluation of the number of rejected Web pages on four search engines, viz. Yahoo, Alta Vista, Google and Northern Light. Nine input variables were selected for the simulation. Typical stochastic simulation meta-models use regression models in response surface methods; here, an RBF instead divides the resulting set of responses to a query into accepted and rejected Web pages. The RBF meta-model was trained on 937 examples drawn from a set of 9,000 different simulation runs over the nine input variables. The results show that the number of rejected Web pages for a specific set of search queries on these four engines is very high. Moreover, a goodness measure of a search engine for a given set of queries can be designed as a function of the coverage of the search engine and the normalized age of a new document in the resulting set for the query. The study concludes that, unless search engine designers address the issues of rejected Web pages, indexing, and crawling, the usage of the Web as a research tool for academic and educational purposes will remain hindered.
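
The abstract does not reproduce the network itself, but the following minimal sketch illustrates, under stated assumptions, how an RBF meta-model of this kind could classify a query's result pages as accepted or rejected from nine input variables. The synthetic data, the number of basis centres, the Gaussian width, and the 0.5 decision threshold are illustrative assumptions only; the actual iterative RBF of Meghabghab and Nasr (1999), its feature definitions, and its training procedure are not given here. Fitting the output-layer weights by linear least squares is a standard RBF shortcut and stands in for the authors' iterative training.

import numpy as np

# Minimal sketch of an RBF meta-model that labels result pages as
# "accepted" (1) or "rejected" (0). Data and hyperparameters below are
# placeholders, not values from the paper.

rng = np.random.default_rng(0)

N_FEATURES = 9      # nine input variables, as in the abstract
N_TRAIN = 937       # training-set size reported in the abstract
N_CENTERS = 25      # assumed number of radial basis centres
WIDTH = 1.0         # assumed Gaussian width (sigma)

# Placeholder training data standing in for the simulation runs.
X = rng.normal(size=(N_TRAIN, N_FEATURES))
y = (X.sum(axis=1) > 0).astype(float)   # dummy accepted/rejected labels

# Choose centres by sampling training points (a common simple choice).
centers = X[rng.choice(N_TRAIN, N_CENTERS, replace=False)]

def rbf_design(X, centers, width):
    """Gaussian RBF activations plus a bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((X.shape[0], 1))])

# Fit the output-layer weights by linear least squares.
Phi = rbf_design(X, centers, WIDTH)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

def predict_rejected(X_new):
    """Return True where the model predicts a page would be rejected."""
    scores = rbf_design(X_new, centers, WIDTH) @ w
    return scores < 0.5

# Example: estimate the rejection rate on fresh simulated queries.
X_test = rng.normal(size=(200, N_FEATURES))
print("estimated fraction rejected:", predict_rejected(X_test).mean())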
