Abstract

Motivated by pricing in ad exchange markets, we consider the problem of robust learning of reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers’ valuations for an item depend on the context that describes the item. However, the seller does not know the relationship between the context and buyers’ valuations, i.e., buyers’ preferences. The seller’s goal is to design a learning policy that sets reserve prices by observing past sales data, and her objective is to minimize her revenue regret, where the regret is computed against a clairvoyant policy that knows buyers’ heterogeneous preferences. Given the seller’s goal, utility-maximizing buyers have an incentive to bid untruthfully in order to manipulate the seller’s learning policy. We propose learning policies that are robust to such strategic behavior. These policies use the outcomes of the auctions, rather than the submitted bids, to estimate the preferences while controlling the long-term effect of the outcome of each auction on future reserve prices. When the market noise distribution is known to the seller, we propose a policy called Contextual Robust Pricing (CORP) that achieves a T-period regret of O(d log(Td) log(T)), where d is the dimension of the contextual information. When the market noise distribution is unknown to the seller, we propose two policies whose regrets are sublinear in T.
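
To make the two quantities named in the abstract concrete, the following is a minimal Python sketch of the seller's revenue in a single second-price auction with a reserve price, and of cumulative regret against a benchmark. It is an illustration under simplifying assumptions, not the paper's CORP policy: the function names `second_price_revenue` and `cumulative_regret` are hypothetical, a single reserve is applied to all buyers, and the benchmark revenues are taken as given.

```python
from typing import Sequence


def second_price_revenue(bids: Sequence[float], reserve: float) -> float:
    """Seller revenue in one second-price auction with a single reserve.

    Assumption: one common reserve for all buyers (a simplification).
    The item sells only if the highest bid clears the reserve; the winner
    then pays the larger of the second-highest bid and the reserve.
    """
    if not bids:
        return 0.0
    ranked = sorted(bids, reverse=True)
    highest = ranked[0]
    second = ranked[1] if len(ranked) > 1 else 0.0
    if highest < reserve:
        return 0.0  # no sale
    return max(second, reserve)


def cumulative_regret(
    benchmark_revenues: Sequence[float],
    policy_revenues: Sequence[float],
) -> float:
    """Sum over periods of (benchmark revenue - policy revenue)."""
    return sum(b - p for b, p in zip(benchmark_revenues, policy_revenues))
```

For bids of 5 and 3, a reserve of 4 raises the payment from 3 to 4, while a reserve of 6 yields no sale. This is why the choice of reserve matters for revenue, and why buyers may want to influence how the seller learns it.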
