Abstract
We present a technique for simultaneously mining Web navigation patterns and maximally frequent, context-sensitive itemsets (URL associations) from the historical user access data stored in Web server logs. We introduce Hierarchical Unsupervised Niche Clustering (H-UNC), a new hierarchical clustering technique that exploits the symbiosis between clusters in feature space and genetic biological niches in nature. We use H-UNC as part of a complete system for knowledge discovery in Web usage data. Our approach does not require fixing the number of clusters in advance, is insensitive to initialization, can handle noisy data and general non-differentiable similarity measures, and automatically provides profiles at multiple resolution levels. Our experiments show that the algorithm not only extracts meaningful user profiles on real Web sites, but also discovers associations between distinct URLs on a site at no additional cost. Unlike content-based association methods, our approach discovers associations between Web pages based only on user access patterns, not on page content. Also, unlike traditional context-blind association discovery methods, H-UNC discovers context-sensitive associations that are meaningful only within a limited context or user profile.
More From: International Journal of Computational Intelligence and Applications