Mining Fuzzy Weighted Browsing Patterns from Time Duration and with Linguistic Thresholds

Ming-Jer Chiang, Shyue-Liang Wang, Tzung-Pei Hong

doi:10.3844/ajassp.2008.1611.1621

Abstract

World-wide-web applications have grown very rapidly and have made a significant impact on computer systems. Among them, web browsing for useful information is perhaps the most common. Because of its heavy use, efficient and effective web retrieval has become an important research topic. Web-mining techniques have been developed for this purpose. In this research, a new fuzzy weighted web-mining algorithm is proposed that processes web-server logs to discover useful user browsing behaviors from the time durations of the pages browsed. Since the time durations are numeric, fuzzy concepts are used to process them and to form linguistic terms. In addition, different web pages may differ in importance. The importance of each web page is evaluated by managers in linguistic terms, which are then transformed and averaged into fuzzy sets of weights. Each linguistic term is then weighted by the importance of its page. Only the linguistic term with the maximum cardinality for a page is used in the later mining process, which reduces the time complexity. The minimum support is also specified linguistically, which is more natural and understandable for human beings. An example is given to illustrate the proposed approach.

Highlights

World-wide-web applications have recently grown very rapidly and have made a significant impact on computer systems
Input to the algorithm: a set of n web log records; a set of m web pages with their importance evaluated by d managers; three sets of membership functions, for browsing duration, web page importance and minimum support respectively; and a predefined linguistic minimum support value α
A simple example shows how the proposed algorithm can be used to generate fuzzy weighted browsing patterns for clients' browsing behavior from the log data in a web server

Summary

INTRODUCTION

World-wide-web applications have recently grown very rapidly and have made a significant impact on computer systems. Since the time durations are numerical while the page importance and the minimum support are linguistic, fuzzy-set concepts are used to process them. The proposed fuzzy weighted web-mining algorithm uses the set of membership functions for importance to transform managers’ linguistic evaluations of the importance of web pages into fuzzy weights. It then calculates the weighted supports of the linguistic terms of web pages from the browsing sequences.

Input to the algorithm: a set of n web log records; a set of m web pages with their importance evaluated by d managers; three sets of membership functions, for browsing duration, web page importance and minimum support respectively; and a predefined linguistic minimum support value α.

Denote the d-th tuple in $D_i$ as $D_{id}$.

Step 6: Transform the duration value $v^{g}_{id}$ of the web page $I_g$ in $D_{id}$ into a fuzzy set $f^{g}_{id}$, represented through its membership values $f^{g}_{id1}, \ldots$ in the fuzzy regions of $I_g$.

Step 8: Calculate the count of each fuzzy region $R_{gk}$ in the browsing sequences.

Step 9: Find $\text{max-count}_g$, the largest count among the fuzzy regions of web page $I_g$.
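The fuzzification and counting steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the region names, the triangular breakpoints in `DURATION_REGIONS`, and the choice to take the maximum membership when a page appears more than once in a sequence are all assumptions made for illustration, as are the function names.

```python
from typing import Dict, List, Tuple

# Hypothetical triangular membership functions for browsing duration (seconds).
# Names and breakpoints are illustrative assumptions, not values from the paper.
DURATION_REGIONS: Dict[str, Tuple[float, float, float]] = {
    "Short": (0.0, 0.0, 70.0),
    "Middle": (20.0, 70.0, 120.0),
    "Long": (70.0, 120.0, 120.0),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangle with feet a, c and peak b.

    a == b gives a left shoulder, b == c a right shoulder.
    """
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(duration: float) -> Dict[str, float]:
    """Map a numeric duration to a membership value per fuzzy region."""
    return {name: triangular(duration, *abc) for name, abc in DURATION_REGIONS.items()}


def region_counts(sequences: List[List[Tuple[str, float]]]) -> Dict[Tuple[str, str], float]:
    """Sum, over browsing sequences, the membership of each (page, region)."""
    counts: Dict[Tuple[str, str], float] = {}
    for seq in sequences:
        best: Dict[Tuple[str, str], float] = {}
        for page, duration in seq:
            for region, mu in fuzzify(duration).items():
                key = (page, region)
                # Assumption: a page revisited within one sequence contributes
                # its largest membership value for that region.
                best[key] = max(best.get(key, 0.0), mu)
        for key, mu in best.items():
            counts[key] = counts.get(key, 0.0) + mu
    return counts


def max_count_region(counts: Dict[Tuple[str, str], float]) -> Dict[str, Tuple[str, float]]:
    """For each page, keep only the fuzzy region with the largest count."""
    best: Dict[str, Tuple[str, float]] = {}
    for (page, region), c in counts.items():
        if page not in best or c > best[page][1]:
            best[page] = (region, c)
    return best


# Two toy browsing sequences of (page, duration in seconds) tuples.
sequences = [[("A", 35.0), ("B", 120.0)], [("A", 95.0)]]
counts = region_counts(sequences)
selected = max_count_region(counts)
```

Keeping a single region per page, as `max_count_region` does, mirrors the abstract's remark that only the linguistic term with the maximum cardinality is carried into later mining, which shrinks the candidate set.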

[Figure: membership functions, with axes labeled "Membership value" and "Weight"]
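The transformation of managers' linguistic importance evaluations into an averaged fuzzy weight can be sketched in Python as follows. This is a hedged illustration only: the term names and triangular parameters in `IMPORTANCE_TERMS`, the function names, and the normalization of the count by the number of records are assumptions, not details taken from the paper. The final comparison against the linguistic minimum support is omitted.

```python
from typing import Dict, List, Tuple

Triangle = Tuple[float, float, float]  # (left foot, peak, right foot)

# Hypothetical membership functions for importance on a [0, 1] weight axis.
IMPORTANCE_TERMS: Dict[str, Triangle] = {
    "Very Unimportant": (0.0, 0.0, 0.25),
    "Unimportant": (0.0, 0.25, 0.5),
    "Ordinary": (0.25, 0.5, 0.75),
    "Important": (0.5, 0.75, 1.0),
    "Very Important": (0.75, 1.0, 1.0),
}


def average_weight(evaluations: List[str]) -> Triangle:
    """Average the managers' linguistic evaluations of one page.

    Each term is mapped to its triangular fuzzy set and the three
    defining points are averaged component by component.
    """
    triangles = [IMPORTANCE_TERMS[term] for term in evaluations]
    d = len(triangles)
    return (
        sum(t[0] for t in triangles) / d,
        sum(t[1] for t in triangles) / d,
        sum(t[2] for t in triangles) / d,
    )


def weighted_support(count: float, n: int, weight: Triangle) -> Triangle:
    """Scale a page's fuzzy weight by the support of its chosen region.

    Assumption: support is the region count divided by the number of
    records n; multiplying a triangular fuzzy number by a non-negative
    scalar scales each of its three points.
    """
    support = count / n
    return (support * weight[0], support * weight[1], support * weight[2])


# Three managers rate the same page.
w = average_weight(["Important", "Very Important", "Ordinary"])
ws = weighted_support(2.0, 4, (0.5, 0.75, 1.0))
```

The result of `weighted_support` is itself a triangular fuzzy set, so deciding whether it meets a linguistic minimum support would require a fuzzy comparison or ranking step rather than a plain numeric threshold.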

CONCLUSION AND FUTURE