An extensible approach for real-time bidding with model-free reinforcement learning

Yin Cheng, Luobao Zou, Zhiwei Zhuang, Jingwei Liu, Bin Xu, Weidong Zhang

doi:10.1016/j.neucom.2019.06.009

Abstract

In this paper, we propose an extensible model-free reinforcement learning (RL) framework for real-time bidding (RTB) in display advertising. The framework can be applied to simple environments and extended to the more comprehensive setting in which a demand-side platform (DSP) bids for multiple advertisers at the same time. To process new information collected via real-time interaction with the environment, we first introduce an extensible model based on the distribution of the recharging probability. Substantial effort is devoted to designing the reward function so that it alleviates the sparsity of the click signal. In contrast to prior works, which assumed that the distributions of the feature vectors and the dealing price were already known, the proposed scheme is highly feasible and can address dynamic environments. Furthermore, a fund-recharging mechanism is introduced to transform the RTB model into an endless task, which allows the policy to be optimized in a farsighted rather than a myopic manner. Experiments on both small- and large-scale real datasets demonstrate the state-of-the-art performance of the proposed framework.

Full Text