Abstract
This paper presents a new indiscernibility-based rough agglomerative hierarchical clustering algorithm for sequential data. In this approach, the indiscernibility relation has been extended to a tolerance relation with the transitivity property being relaxed. Initial clusters are formed using a similarity upper approximation. Subsequent clusters are formed using the concept of constrained-similarity upper approximation wherein a condition of relative similarity is used as a merging criterion. We report results of experimentation on msnbc web navigation dataset that are intrinsically sequential in nature. We have compared the results of the proposed approach with that of the traditional hierarchical clustering algorithm using vector coding of sequences. The results establish the viability of the proposed approach. The rough clusters resulting from the proposed algorithm provide interpretations of different navigation orientations of users present in the sessions without having to fit each object into only one group. Such descriptions can help web miners to identify potential and meaningful groups of users.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.