Abstract

This paper discusses the significance of data preprocessing methods and the steps involved in obtaining the required content from web logs. A complete preprocessing technique is proposed to prepare the web log for the extraction of user patterns. A data cleansing algorithm eliminates extraneous entries from the web log, while a filtering algorithm discards irrelevant attributes from the log file. Outliers are detected and removed from the dataset, and users and sessions are identified. The performance of the data cleansing process is evaluated by adopting a wrapper approach, in which the resulting cleaned dataset is clustered using five different clustering algorithms, namely Farthest First, K-means, COBWEB, the make density based algorithm, and Expectation Maximization, to assess the quality of the web log data.
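The abstract does not spell out the cleansing, filtering, or session identification algorithms, so the following is only a minimal sketch of what such steps commonly look like, not the authors' method. It assumes log lines in Common/Combined Log Format, treats image, stylesheet, and script requests as extraneous, identifies a user by the (IP address, user agent) pair, and splits sessions on a 30-minute inactivity gap. The names `clean`, `sessionize`, `IGNORED_SUFFIXES`, and `SESSION_TIMEOUT` are hypothetical and chosen for illustration.

```python
import re
from datetime import datetime, timedelta

# Common/Combined Log Format; this field layout is an assumption, not from the paper.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)
# Resource types treated as extraneous entries (assumed list).
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")
# Inactivity gap that starts a new session (a common heuristic, assumed here).
SESSION_TIMEOUT = timedelta(minutes=30)


def clean(lines):
    """Drop extraneous entries and keep only the attributes used later."""
    records = []
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # malformed line
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # non-GET or unsuccessful request
        path = m["url"].split("?", 1)[0].lower()
        if path.endswith(IGNORED_SUFFIXES) or path == "/robots.txt":
            continue  # embedded resource or crawler probe
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        # Attribute filtering: keep user key, timestamp, and URL only.
        records.append({"user": (m["ip"], m["agent"] or ""),
                        "time": when, "url": m["url"]})
    return records


def sessionize(records):
    """Group each user's requests into sessions split by inactivity gaps."""
    sessions = {}
    for rec in sorted(records, key=lambda r: (r["user"], r["time"])):
        user_sessions = sessions.setdefault(rec["user"], [])
        if user_sessions and rec["time"] - user_sessions[-1][-1]["time"] <= SESSION_TIMEOUT:
            user_sessions[-1].append(rec)
        else:
            user_sessions.append([rec])
    return sessions
```

Calling `sessionize(clean(lines))` on an iterable of raw log lines returns a dictionary that maps each user key to a list of sessions, each session being a time-ordered list of requests. Outlier removal and the clustering-based evaluation described in the abstract are left out of this sketch.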

