Abstract

With the explosive growth of data available on the World Wide Web (WWW), the discovery and analysis of useful information from the Web has become a practical necessity. Data mining is primarily concerned with the discovery of knowledge and aims to provide answers to questions that people do not know how to ask. It is not an automatic process but one that exhaustively explores very large volumes of data to uncover otherwise hidden relationships. The process extracts high-quality information that can be used to draw conclusions based on relationships or patterns within the data. Web mining applies data mining techniques to the Internet by analyzing server logs and other personalized data collected from customers to provide meaningful information and knowledge. A web access pattern, the sequence of accesses frequently pursued by users, is an interesting and useful kind of knowledge in practice (Pei, 2000). Today, web browsers provide easy access to myriad sources of text and multimedia data. With approximately 4.3 billion documents online, 20 million new web pages published each day (Tanasa and Trousse, 2004), and more than 1,000,000,000 pages indexed by search engines, finding the desired information is not an easy task (Pal et al., 2002). Web mining is now a popular term for the techniques used to analyze data from the World Wide Web (Pramudiono, 2004). A widely accepted definition of web mining is the application of data mining techniques to web data. As an important extension of data mining, web mining is an integrated technology that draws on various research fields, including computational linguistics, statistics, informatics, artificial intelligence (AI) and knowledge discovery (Fayyad et al., 1996; Lee and Liu, 2001). With regard to the type of web data, Srivastava et al. (2002) classified web mining into three categories: Web Content Mining, Web Structure Mining, and Web Usage Mining (see Figure 1).
