Data Collection and Preprocessing in Web Usage Mining: Implementation and Analysis

Mohammed Ali Mohammed, Rula A Hamid, Reem Razzaq Abdulhussein

doi:10.25195/ijci.v50i2.486

Abstract

Data collection and data preprocessing are crucial stages in web usage mining, mainly because of the unstructured, diverse, and noisy nature of log data. During data collection, log file datasets are loaded and merged. Effective and comprehensive data preprocessing plays a vital role in ensuring the efficiency and scalability of algorithms used in the pattern discovery phase of web usage mining. This work aims to address these phases by introducing two innovative approaches. The first approach focuses on determining the device used for accessing the web, distinguishing between computers and mobile devices. The second approach aims to determine user sessions and complete paths by utilizing the referrer URL. The entire preprocessing pipeline has been implemented using the C# programming language, and the source code is available on GitHub at the following link: https://github.com/Mohammed91/Web-Usage-Mining.
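The two approaches named in the abstract can be illustrated with a small sketch. This is not the authors' implementation (theirs is written in C# and lives in the linked repository); it is a minimal Python illustration under stated assumptions. The abstract does not say how devices are detected, so matching substrings in the User-Agent field is an assumption, as are the marker list, the 30-minute timeout, the `Hit` record, and all function names. Referrers are assumed to be already normalized to the same form as requested URLs, with "-" meaning no referrer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed marker list; real User-Agent strings vary far more than this.
MOBILE_MARKERS = ("mobi", "android", "iphone", "ipad")


def classify_device(user_agent: str) -> str:
    """Label a request as 'mobile' or 'computer' from its User-Agent."""
    ua = user_agent.lower()
    return "mobile" if any(marker in ua for marker in MOBILE_MARKERS) else "computer"


@dataclass
class Hit:
    """One cleaned log entry (hypothetical record layout)."""
    ip: str
    time: datetime
    url: str
    referrer: str  # "-" when the log has no referrer
    user_agent: str


def build_sessions(hits, timeout=timedelta(minutes=30)):
    """Group hits into sessions using the referrer.

    A hit joins an existing session of the same (ip, user_agent) when its
    referrer is a page already seen in that session and the session was
    active within `timeout`; otherwise it starts a new session.
    """
    sessions = []
    by_user = {}  # (ip, user_agent) -> sessions belonging to that user
    for hit in sorted(hits, key=lambda h: h.time):
        key = (hit.ip, hit.user_agent)
        target = None
        for session in reversed(by_user.get(key, [])):
            recent = hit.time - session[-1].time <= timeout
            if recent and any(h.url == hit.referrer for h in session):
                target = session
                break
        if target is None:
            target = []
            by_user.setdefault(key, []).append(target)
            sessions.append(target)
        target.append(hit)
    return sessions


def complete_path(session):
    """Rebuild the full navigation path, re-inserting pages revisited
    through the browser's back button (cached, so absent from the log)."""
    path = []
    for hit in session:
        if path and hit.referrer in path and path[-1] != hit.referrer:
            # The user went back to the referrer: replay the backward steps.
            last = len(path) - 1 - path[::-1].index(hit.referrer)
            path.extend(reversed(path[last:-1]))
        path.append(hit.url)
    return path
```

For example, a visitor who requests /a, /b, /c and then /d with referrer /a must have stepped back through /b to /a, so `complete_path` yields /a, /b, /c, /b, /a, /d. A request with no referrer, or one arriving after the timeout, opens a new session.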

Journal: Iraqi Journal for Computers and Informatics
Publication Date: Nov 16, 2024
License type: CC BY-NC-ND 4.0
