Offline Multi-Action Policy Learning: Generalization and Optimization

Zhengyuan Zhou, Susan Athey, Stefan Wager

doi:10.1287/opre.2022.2271

Abstract

As a result of the digitization of the economy, more and more decision makers from a wide range of domains have gained the ability to target products, services, and information provision based on individual characteristics. Examples include selecting offers, prices, advertisements, or emails to send to consumers, choosing a bid to submit in a contextual first-price auction, and determining which medication to prescribe to a patient. The key to enabling this is to learn a treatment policy from historical observational data in a sample-efficient way, thereby uncovering the best personalized treatment recommendation. In “Offline Policy Learning: Generalization and Optimization,” Z. Zhou, S. Athey, and S. Wager provide a sample-optimal policy learning algorithm that is computationally efficient and that learns a tree-based treatment policy from observational data. In the quest toward fully automated personalization, the work provides a theoretically sound and practically implementable approach.


Journal: Operations Research
Publication Date: Jun 7, 2022
Citations: 51

Similar Papers

A SuperLearner-enforced approach for the estimation of treatment effect in pediatric trials.
Danila Azzolina ... Dario Gregori
Digital health | VOL. 9
01 Jan 2023

Distributionally Robust Batch Contextual Bandits
Nian Si ... Zhengyuan Zhou
Management Science | VOL. 69
31 Mar 2023

Prospects for climate-scale regional numerical modelling for the Arabian Gulf and Qatar's marine region
Y Sinan Husrevoglu ... Ebrahim S Al-Ansari
01 Jan 2015

Train Localization Environmental Scenario Identification Using Features Extracted from Historical Data
Tao Zhang ... Debiao Lu
01 Jan 2020
