Abstract
Online advertising enables advertisers to reach customers with personalized ads. Advertisers need to determine the right targets for their ads and how much they are willing to pay to engage those targets. A large portion of online ads are priced using real-time auctions, so advertisers need to decide which targets to bid on in these auctions. Collaborating with one of the largest ad-tech firms in the world, we develop new algorithms that help advertisers bid optimally on target portfolios while taking into account some limitations inherent to online advertising. We study this problem as a Multi-Armed Bandit (MAB) problem with periodic budgets. At the beginning of each time period, the advertiser needs to determine which portfolio of targets to select to maximize the expected total revenue (revenue from clicks/conversions), while keeping the total cost of auction payments within the advertising budget. In this paper, we formulate the problem and develop an Optimistic-Robust Learning (ORL) algorithm that uses ideas from Upper Confidence Bound (UCB) algorithms and robust optimization. We prove that the expected cumulative regret of the algorithm is bounded. Additionally, simulations on synthetic and real-world data show that the ORL algorithm reduces regret by at least 10-20% compared to benchmarks.
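To make the per-period decision concrete, the following is a minimal sketch of one way to combine an optimistic (UCB-style) revenue estimate with a conservative cost estimate when choosing a portfolio under a budget. It is not the ORL algorithm from the paper, whose details the abstract does not give. The function names, the confidence radius, the assumption that revenues and costs lie in [0, 1], and the greedy ratio rule are all illustrative assumptions.

```python
import math


def confidence_radius(t, n):
    # Assumed UCB-style radius; untried targets (n == 0) are treated as n == 1.
    return math.sqrt(2.0 * math.log(t + 1) / max(n, 1))


def select_portfolio(rev_mean, cost_mean, pulls, t, budget):
    """Pick targets for period t (hypothetical helper, not the paper's method).

    Revenue is estimated optimistically (upper bound) and cost is estimated
    conservatively (also an upper bound), both clipped to the assumed [0, 1]
    range. Targets are added greedily by revenue-to-cost ratio while the
    conservative total cost stays within the budget.
    """
    bounds = []
    for i, n in enumerate(pulls):
        r = confidence_radius(t, n)
        rev_ucb = min(1.0, rev_mean[i] + r)
        cost_ucb = min(1.0, cost_mean[i] + r)
        bounds.append((rev_ucb / max(cost_ucb, 1e-9), cost_ucb, i))

    chosen, spent = [], 0.0
    for _, cost_ucb, i in sorted(bounds, reverse=True):
        if spent + cost_ucb <= budget:
            chosen.append(i)
            spent += cost_ucb
    return chosen


def update(rev_mean, cost_mean, pulls, i, revenue, cost):
    # Incremental running averages after observing target i in one period.
    pulls[i] += 1
    rev_mean[i] += (revenue - rev_mean[i]) / pulls[i]
    cost_mean[i] += (cost - cost_mean[i]) / pulls[i]
```

In each period, an advertiser would call `select_portfolio`, observe revenue and cost for the chosen targets, and call `update` for each one. Using an upper bound on cost makes the selection cautious about overspending, at the price of sometimes leaving budget unused.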