Abstract
In online advertising, the user-clicks dataset based fraudulent publishers’ classification models exhibit poor performance due to high skewness in class distribution of the publishers. The nearest-neighbor based classification techniques are popularly used to reduce the impact of class skewness on performance. The Nearest-Neighbor techniques use Prototype Selection (PS) methods to select promising samples before classifying them for reducing the size of training data. Although Nearest-Neighbor techniques are simple to use and reduce the negative impact of the loss of potential information, they suffer from higher storage requirements and slower classification speed when applied on datasets with skewed class distributions. In this paper, we propose a Quad Division Prototype Selection-based k-Nearest Neighbor classifier (QDPSKNN) by introducing quad division method for handling uneven class distribution. The quad-division divides the data into four quartiles (groups) and performs controlled under-sampling for balancing class distribution. It reduces the size of the training dataset by selecting only the relevant prototypes in the form of nearest-neighbors. The performance of QDPSKNN is evaluated on Fraud Detection in Mobile Advertising (FDMA) user-click dataset and fifteen other benchmark imbalanced datasets to test its generalizing behaviour. The performance is also compared with one baseline model (k-NN) and four other prototype selection methods such as NearMiss-1, NearMiss-2, NearMiss-3, and Condensed Nearest-Neighbor. The results show improved classification performance with QDPSKNN in terms of precision, recall, f-measure, g-mean, reduction rate and execution time, compared to existing prototype selection methods in the classification of fraudulent publishers as well as on other benchmark imbalanced datasets. Wilcoxon signed ranked test is conducted to demonstrate significant differences amid QDPSKNN and state-of-the-art methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Engineering Science and Technology, an International Journal
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.