Abstract

Since the beginning of the Web, finding interesting or useful information on it has been an important problem. Existing approaches mainly include keyword-based search, wrapper-based information extraction, Web query, and explicit user preferences. All of these approaches find relevant information by relying on explicit specifications from the user. In this paper, we argue that this is insufficient. There is another type of information that is also of great interest: unexpected information, which the user did not anticipate. Finding unexpected information is important in many applications. For example, it is useful for a company to find unexpected information about its competitors. As more and more companies venture into e-commerce and use the Web to promote their products and services, finding interesting information on their Web sites is becoming an increasingly important issue for business intelligence. Because a typical commercial site has a large number of pages and there are many relevant sites, it is very difficult for a human user to view every page to discover interesting or useful information. Automated assistance is needed. In this paper, we first study the issue of information interestingness in the context of the Web. We then propose a number of methods to help the user find various types of interesting (including unexpected) information on a Web site. A system called WebCompare, based on the proposed methods, has been implemented. Experimental results and applications show that the system is both useful in practice and efficient.
