A Comparative Analysis of Sampling Techniques for Click-Through Rate Prediction in Native Advertising

Nadir Sahllal, El Mamoun Souidi

doi:10.1109/access.2023.3255983

Open Access

Journal: IEEE Access
Publication Date: Jan 1, 2023
Citations: 2
License type: CC BY-NC-ND 4.0

Affiliation: Mohammed V University

Abstract

Native advertising is a popular form of online advertising whose style and function resemble those of the native content displayed on online platforms, such as news, sports, and social websites. Because native ads capture users’ attention better, they have gained popularity on many online platforms and among advertisers. In advertising, Click-Through Rate (CTR) prediction is essential but challenging due to data sparsity: non-clicks constitute most of the data, whereas clicks form a significantly smaller portion. This study compares the performance of 19 class imbalance approaches, using four traditional classifiers, to determine the most effective imbalance methods for our native ads dataset. The data are real traffic data from Finland, collected over seven days and provided by the native advertising platform ReadPeak. The resampling methods include seven undersampling techniques, four oversampling techniques, four hybrid sampling techniques, and four ensemble systems. The findings demonstrate that class imbalance learning can enhance the model’s classification capacity by as much as 20%. In general, oversampling is comparatively more stable; however, undersampling performed best with Random Forest. Our study also demonstrates that the imbalance ratio plays an important role in both model performance and feature importance.
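
To make the terms "undersampling", "oversampling", and "imbalance ratio" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and covers only the simplest random variants of the two families, which may differ from the specific techniques evaluated in the study. The toy data (1,000 impressions, 20 of them clicks) and all function names are hypothetical, chosen for illustration only.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    pairs = [(row, label)
             for label, group in groups.items()
             for row in rng.sample(group, target)]
    rng.shuffle(pairs)
    return [row for row, _ in pairs], [label for _, label in pairs]


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows until every class is as large as the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        pairs.extend((row, label) for row in group + extra)
    rng.shuffle(pairs)
    return [row for row, _ in pairs], [label for _, label in pairs]


# Hypothetical toy data: 1,000 impressions, of which 20 are clicks (label 1).
rows = list(range(1000))
labels = [1 if i < 20 else 0 for i in range(1000)]

ratio_before = imbalance_ratio(labels)               # 980 non-clicks / 20 clicks
under_rows, under_labels = random_undersample(rows, labels)
over_rows, over_labels = random_oversample(rows, labels)
```

Undersampling shrinks the training set to 40 rows here, discarding most non-click examples, while oversampling grows it to 1,960 rows by repeating click examples. In practice, resampling of this kind is applied to the training split only, so that evaluation still reflects the original class distribution.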

Full Text