Abstract

Besides ID-type features, advertisement click logs contain many other significant features, which makes click-through rate prediction more difficult. In this study, we convert the original features into meaningful numerical ones, which reduces sparsity and redundancy. To address class imbalance, we propose a downsampling algorithm based on the K-means model that first groups the large samples and then divides them into sensible features using heuristic methods. To further improve the feature representation, we select and combine features with a Gradient Boosting Decision Tree model and process the resulting high-dimensional features with logistic regression. Experiments on the Tencent SOSO dataset show that our approach outperforms state-of-the-art baseline methods by 0.05% on average in terms of R² and by 50.5% on average in terms of RMSE.

Highlights

  • The rapid development of the Internet provides a broad platform for the advertising industry

  • We propose a downsampling algorithm based on the K-means model that addresses the class imbalance problem at the data level while alleviating the loss of useful information caused by downsampling

  • To address current research difficulties, we first perform feature extraction based on the experimental data and an analysis of the actual business, with the aim of reducing feature redundancy and sparsity and improving feature expression
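
The page does not spell out the downsampling algorithm, so the following is only a minimal sketch of one plausible reading: cluster the majority-class samples with K-means, then keep a share of each cluster proportional to its size, so that no region of the majority class is discarded entirely. The function names (`kmeans`, `kmeans_downsample`), the parameters `n_keep` and `k`, and the proportional-quota rule are all assumptions made for illustration, not the authors' method.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on lists of floats; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def kmeans_downsample(majority, n_keep, k=3, seed=0):
    """Keep roughly n_keep majority samples, drawn from every cluster
    in proportion to its size (assumed rule, for illustration only)."""
    labels = kmeans(majority, k, seed=seed)
    rng = random.Random(seed)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        if not members:
            continue
        # At least one sample per non-empty cluster, so no cluster vanishes.
        quota = max(1, round(n_keep * len(members) / len(majority)))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept
```

Under this reading, `kmeans_downsample(non_click_rows, n_keep=len(click_rows))` would shrink the non-click class to about the size of the click class before training; both variable names are hypothetical.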


Summary

Introduction

The rapid development of the Internet provides a broad platform for the advertising industry. Internet advertising [1] offers a wide user base, strong interactivity, and real-time flexibility, which has gradually shifted the advertising industry toward it. Most search engines earn their profit by placing text advertisements next to search results. According to eMarketer data, total online advertising revenue (including PC and mobile advertising revenue) in the global advertising market reached $207.3 billion in 2018, an increase of 13.76%. In 2018, Google was still the dominant player in the mobile advertising market: its revenue in the mobile Internet advertising market was 63.7 billion, a market share of 40.59%. Facebook followed, with Internet advertising revenue of 50.1 billion and a market share of 31.92%, and Alibaba ranked third with a market share of 14.49%.
