Websites are used frequently to access a wide range of information on the Internet, so a system is needed to handle web usage data efficiently. The preprocessed data then feeds the knowledge discovery process. The proposed data preprocessing system for web usage mining takes a new approach based on a path completion algorithm. A user session identification algorithm is also implemented to append user data automatically, and pages missing from user access paths are identified and added using a referrer-based method, which addresses the problems caused by proxy servers and local caching. Maximal forward reference and reference length algorithms are also considered, with the average reference length of auxiliary pages estimated in advance. The input data come from web access logs, which the server records whenever a client makes a request. Applied to the web access log, the proposed path completion algorithm efficiently restores the visitor's lost information and improves the reliability of the access data.
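
The abstract does not spell out the path completion algorithm, so the following is only a minimal sketch of one common referrer-based heuristic, not the authors' implementation. It assumes each session is a time-ordered list of `(url, referrer)` pairs; the function name `complete_path` and the sample pages are hypothetical. When a request's referrer is not the page visited immediately before it, the sketch assumes the visitor stepped back through cached pages (which never reach the server log) and re-inserts those pages.

```python
from typing import List, Optional, Tuple


def complete_path(requests: List[Tuple[str, Optional[str]]]) -> List[str]:
    """Return the access path with back-navigation pages re-inserted.

    `requests` is one session's log entries as (url, referrer) pairs in
    time order. This is an illustrative heuristic, not the paper's code.
    """
    path: List[str] = []
    for url, referrer in requests:
        # A mismatch between the referrer and the previous page suggests
        # the visitor went back through locally cached pages.
        if path and referrer and referrer != path[-1] and referrer in path:
            # Index of the most recent visit to the referrer page.
            last_seen = len(path) - 1 - path[::-1].index(referrer)
            # Pages stepped back through, ending at the referrer itself.
            backtrack = path[last_seen:len(path) - 1][::-1]
            path.extend(backtrack)
        path.append(url)
    return path


session = [("A", None), ("B", "A"), ("C", "B"), ("D", "A")]
print(complete_path(session))  # ['A', 'B', 'C', 'B', 'A', 'D']
```

Here the request for `D` carries referrer `A` although the last logged page was `C`, so the sketch infers the cached revisits to `B` and `A` before `D`. Requests whose referrer never appears in the session are left unchanged.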