On verifying the authenticity of e-commercial crawling data by a semi-crosschecking method

Tran Khanh Dang, Duc Minh Chau Pham, Duc Dan Ho

doi:10.1108/ijwis-10-2018-0075

Open Access

https://doi.org/10.1108/ijwis-10-2018-0075

Abstract

Purpose: Data crawling in e-commerce for market research often comes with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.

Design/methodology/approach: The data modification problem requires careful examination, in which the data are re-collected and their reliability is verified by overlapping the two datasets. The approach uses different anomaly detection techniques to determine which data are potentially fraudulent and should be re-collected. The paper also proposes a data selection model that uses the data's weights of importance in addition to anomaly detection. The goal is to significantly reduce the amount of data in need of verification while still guaranteeing high authenticity. Empirical experiments are conducted with real-world datasets to evaluate the efficiency of the proposed scheme.

Findings: The authors examine several techniques for detecting anomalies in the data of users and products, which achieve an accuracy of approximately 80 per cent. Integration with the weight selection model is also shown to detect more than 80 per cent of the existing fraudulent data while avoiding the accidental inclusion of data that are not fraudulent, especially when the proportion of frauds is high.

Originality/value: With the rapid development of e-commerce, fraud detection on e-commerce data and in Web crawling systems is a new and necessary research topic. This paper contributes a novel approach to the data authentication problem in crawling systems, which has not been studied much.
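The semi-crosschecking idea described in the abstract (score the crawled data for anomalies, combine the score with an importance weight, re-collect only the top-ranked subset, and compare the two datasets) can be illustrated with a minimal sketch. The abstract does not state which anomaly detectors or which weighting formula the authors use, so everything here is an assumption made for illustration: the median/MAD score, the multiplicative combination of score and weight, the `price` and `weight` fields, the `budget` and `tolerance` parameters, and the sample data are all hypothetical.

```python
from statistics import median

def anomaly_scores(values):
    """Robust deviation score: distance from the median, scaled by the MAD.
    (Assumed stand-in for the paper's unspecified anomaly detectors.)"""
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1.0  # avoid division by zero
    return [abs(v - med) / mad for v in values]

def select_for_recheck(records, budget):
    """Rank records by anomaly score times importance weight (assumed rule)
    and return the ids of the `budget` highest-ranked records."""
    scores = anomaly_scores([r["price"] for r in records])
    ranked = sorted(
        zip(records, scores),
        key=lambda pair: pair[1] * pair[0]["weight"],
        reverse=True,
    )
    return [r["id"] for r, _ in ranked[:budget]]

def crosscheck(records, recollected, tolerance=0.01):
    """Compare re-collected prices with the crawled ones and return the ids
    whose relative difference exceeds `tolerance`."""
    by_id = {r["id"]: r for r in records}
    mismatched = []
    for rid, new_price in recollected.items():
        old_price = by_id[rid]["price"]
        if abs(new_price - old_price) > tolerance * max(abs(old_price), 1e-9):
            mismatched.append(rid)
    return mismatched

# Hypothetical crawled data: record "d" has an implausible price.
records = [
    {"id": "a", "price": 10.0, "weight": 1.0},
    {"id": "b", "price": 10.5, "weight": 1.0},
    {"id": "c", "price": 9.8, "weight": 1.0},
    {"id": "d", "price": 95.0, "weight": 2.0},
    {"id": "e", "price": 10.2, "weight": 0.5},
]

to_recheck = select_for_recheck(records, budget=2)
# Pretend these values came from a second, independent crawl.
suspicious = crosscheck(records, {"d": 12.0, "c": 9.8})
```

Only the selected subset is re-collected, which is what makes the check "semi" rather than a full crosscheck; how large the budget should be, and how scores and weights are actually combined, would follow the paper's model rather than this sketch.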

Journal: International Journal of Web Information Systems
Publication Date: Oct 7, 2019
Citations: 5