Abstract
Websites on the internet are a useful source of information in our day-to-day activities. Web Usage Mining (WUM) is one of the major applications of data mining, artificial intelligence, and related techniques to web data; it predicts users’ visiting behaviour and infers their interests by analyzing usage patterns. WUM has become a significant area of research in computer and information science. The weblog is one of the major sources containing information about the links users visited, their browsing patterns, and the time spent on a page or link; this information can be used in applications such as adaptive websites, personalized services, customer profiling, pre-fetching, and the creation of attractive websites. WUM consists of preprocessing, pattern discovery, and pattern analysis. Log data is typically noisy and unclear, so preprocessing is essential for effective mining. In the preprocessing phase, data cleaning removes records of graphics, videos, and format information, records with a failed HTTP status code, and records generated by robots. In the second phase, user behaviour is organized into a set of clusters using Weighted Fuzzy-Possibilistic C-Means (WFPCM); each cluster consists of “similar” data items based on user behaviour and navigation patterns, for use in pattern discovery. In the third phase, user behaviour is classified for analysis using an Adaptive Neuro-Fuzzy Inference System with Subtractive Algorithm (ANFIS-SA). The performance of the proposed work is evaluated in terms of accuracy, execution time, and convergence behaviour using the anonymous Microsoft web dataset.
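The data cleaning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log layout (Apache-style Combined Log Format), the extension list, the rule that a status code of 400 or above counts as failed, and the robot heuristics are all assumptions made for illustration, and the names `LOG_PATTERN`, `MEDIA_EXTENSIONS`, `ROBOT_HINTS`, `is_relevant`, and `clean_log` are hypothetical.

```python
import re

# Assumption: Apache-style Combined Log Format. The source does not
# specify the log layout, so this pattern is illustrative only.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Assumption: illustrative lists, not taken from the paper.
MEDIA_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".ico",
                    ".css", ".js", ".mp4", ".avi")
ROBOT_HINTS = ("bot", "crawler", "spider")


def is_relevant(record):
    """Return True if a parsed log record should survive cleaning."""
    url = record["url"].split("?", 1)[0].lower()
    # Drop graphics, videos, and format (style/script) requests.
    if url.endswith(MEDIA_EXTENSIONS):
        return False
    # Drop failed requests (assumed here to mean status 4xx and 5xx).
    if int(record["status"]) >= 400:
        return False
    # Drop robot traffic: robots.txt requests and bot-like user agents.
    if url == "/robots.txt":
        return False
    agent = (record["agent"] or "").lower()
    return not any(hint in agent for hint in ROBOT_HINTS)


def clean_log(lines):
    """Parse raw log lines and keep only the relevant records."""
    cleaned = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        record = match.groupdict()
        if is_relevant(record):
            cleaned.append(record)
    return cleaned
```

Calling `clean_log` on an iterable of raw log lines returns a list of dictionaries, one per retained request. A fuller pipeline would typically also flag every request from a host that fetched `/robots.txt`; the sketch drops only the individual robot-like records to stay short.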
Highlights
Given the enormous amount of data stored in files, databases, and other repositories, it is increasingly important to develop powerful means for the analysis, and possibly the interpretation, of such data in order to extract interesting knowledge that could help in decision-making. The rapid growth of information accessible through the Internet makes that information increasingly complex to manage
It is observed that the proposed ANFIS-Subtractive Algorithm (SA) approach has higher prediction accuracy, since it predicts the user navigation pattern more accurately than the other approaches compared
The accuracy obtained by the proposed ANFIS-SA is 98.52%, whereas the accuracies obtained by the LCS and ANFIS approaches are 88.36% and 89.29%, respectively
Summary
The rapid growth of information accessible through the Internet makes that information increasingly complex to manage. The advent of the World Wide Web (WWW) has overwhelmed home computer users with a vast amount of information (Berners-Lee et al., 1994). On almost any topic one needs, one can find pieces of information made available by other internet citizens, ranging from individual users who upload an inventory of their record collections to major companies that do business through the web. Internet activity resulting from user interaction produces a huge quantity of data, which accumulates in web access log files. WUM exploits data mining techniques to analyze user access to websites.