Combining the Web content and usage mining to understand the visitor behavior in a Web site

J Velasquez, H Yasuda, T Aoki

doi:10.1109/icdm.2003.1251004

Abstract

A Web site is a semi-structured collection of different kinds of data whose purpose is to show relevant information to a visitor and thereby capture her or his attention. Understanding the specific preferences that define visitor behavior in a Web site is a complex task. As an approximation, this behavior is assumed to depend on the content, the navigation sequence, and the time spent on each page visited. These variables can be extracted from the Web log files and from the Web site itself, using Web usage mining and Web content mining respectively. Combining these variables, a similarity measure among visitor sessions is introduced and used in a clustering algorithm that identifies groups of similar sessions, allowing the analysis of visitor behavior. To demonstrate the methodology's effectiveness, it was applied to a Web site, showing the benefits of the described approach.
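
To make the idea of a combined session similarity concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes that a session is an ordered list of (page, seconds spent) pairs, that each page's content is a sparse term-weight vector, and that pages are compared position by position along the navigation sequence. The names `cosine`, `session_similarity`, and `page_vectors` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def session_similarity(a, b, page_vectors):
    """Illustrative similarity between two sessions.

    a, b: lists of (page_id, seconds_spent) in visit order.
    page_vectors: dict mapping page_id to a term-weight dict.

    Assumption (not taken from the paper): pages are compared at the
    same position in each sequence, and each comparison is weighted by
    how close the two visit times are (ratio of shorter to longer).
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    total = 0.0
    for (page_a, t_a), (page_b, t_b) in zip(a, b):
        if t_a <= 0 or t_b <= 0:
            time_ratio = 0.0
        else:
            time_ratio = min(t_a / t_b, t_b / t_a)
        total += time_ratio * cosine(page_vectors[page_a], page_vectors[page_b])
    return total / n


# Hypothetical toy data: three pages and two sessions.
page_vectors = {
    "home": {"news": 1.0, "products": 1.0},
    "catalog": {"products": 2.0, "price": 1.0},
    "contact": {"address": 1.0},
}
session_1 = [("home", 30), ("catalog", 60)]
session_2 = [("home", 15), ("catalog", 60)]
score = session_similarity(session_1, session_2, page_vectors)
```

A pairwise score of this kind could then be supplied to a clustering algorithm that works from similarities between sessions; which algorithm the authors use is not stated in the abstract.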
