Abstract

The rapid growth of the Internet and of support for interoperability protocols has increased the number of Web-accessible sources (WebSources). Current wrapper mediator architectures need to be extended with a Wrapper Cost Model (WCM) for WebSources that can estimate the response time (delay) of accessing a source, as well as other relevant statistics. In this paper we present a Web Prediction Tool (WebPT) that the WCM uses to estimate delays. We compare WebPT learning with the more traditional Neural Network (NN) learning for this environment. Both the WebPT and the NN learn from query feedback (qfb), i.e., the response times observed when accessing WebSources. Experimental data was collected from several sources, and the dimensions that were significant in estimating the response time were determined. These include Time of day, Day, and Quantity of data. Both the WebPT and the NN use these dimensions to learn the response times (delays) of a particular source and then to predict the expected response time for a given query. We note that WebPT learning is always online, i.e., it learns from each new query feedback. NN training can also be online (per-pattern learning), but this is time consuming and can be very sensitive to the choice of training parameters. The more common and robust approach is offline batch learning (per-epoch). We compared WebPT learning with both types of NN learning in a number of experiments. The ease of training the WebPT makes it preferable to the per-pattern NN. Further, the prediction errors of the WebPT and the NN were comparable. We conclude that both the online WebPT and the more sophisticated NN learning are useful in constructing a Wrapper Cost Model for the dynamic Web environment.
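
To make the idea of online learning from query feedback concrete, the following is a minimal sketch and not the authors' WebPT, whose internals the abstract does not describe. It assumes a simple stand-in technique: a running mean of observed delays kept per bucket of the three dimensions named above (Time of day, Day, Quantity of data). The class name `OnlineDelayPredictor`, the bucket widths, and the fallback to a global mean are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class OnlineDelayPredictor:
    """Hypothetical running-mean delay predictor; NOT the paper's WebPT."""

    def __init__(self, hour_bucket=4, size_bucket_kb=100):
        self.hour_bucket = hour_bucket        # hours per time-of-day bucket (assumed)
        self.size_bucket_kb = size_bucket_kb  # KB per quantity bucket (assumed)
        self.count = defaultdict(int)         # feedback count per bucket
        self.mean = defaultdict(float)        # running mean delay per bucket
        self.global_count = 0
        self.global_mean = 0.0

    def _key(self, hour, weekday, size_kb):
        # Map a query onto a (time of day, day, quantity) bucket.
        return (hour // self.hour_bucket, weekday,
                int(size_kb // self.size_bucket_kb))

    def update(self, hour, weekday, size_kb, delay_s):
        """Learn from a single query feedback (online, one observation at a time)."""
        key = self._key(hour, weekday, size_kb)
        self.count[key] += 1
        self.mean[key] += (delay_s - self.mean[key]) / self.count[key]
        self.global_count += 1
        self.global_mean += (delay_s - self.global_mean) / self.global_count

    def predict(self, hour, weekday, size_kb):
        """Return the bucket's mean delay, or the global mean for unseen buckets."""
        key = self._key(hour, weekday, size_kb)
        if self.count.get(key, 0) > 0:
            return self.mean[key]
        return self.global_mean
```

Each call to `update` folds one observed response time into the estimate, which mirrors the per-feedback learning described for the WebPT. A batch-trained model would instead collect many such observations and fit them together in repeated passes.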
