Abstract
Practitioners and academics have long appreciated the benefits of covariate balancing when conducting randomized experiments. For web-facing firms running online A/B tests, however, it remains challenging to balance covariate information when experimental subjects arrive sequentially. In this paper, we study an online experimental design problem, which we refer to as the online blocking problem. In this problem, experimental subjects with heterogeneous covariate information arrive sequentially and must be immediately assigned to either the control or the treated group. The objective is to minimize the total discrepancy, which is defined as the weight of the minimum-weight perfect matching between the two groups. To solve this problem, we propose a randomized experimental design, which we refer to as the pigeonhole design. The pigeonhole design first partitions the covariate space into smaller spaces, which we refer to as pigeonholes, and then, as experimental subjects arrive at each pigeonhole, balances the number of control and treated subjects within that pigeonhole. We analyze the theoretical performance of the pigeonhole design and show its effectiveness by comparing it against two well-known benchmark designs: the matched-pair design and the completely randomized design. We identify scenarios in which the pigeonhole design offers greater benefits than the benchmark designs. To conclude, we conduct extensive simulations using Yahoo! data and show a 10.2% reduction in variance when the pigeonhole design is used to estimate the average treatment effect. This paper was accepted by George Shanthikumar, data science. Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2023.02184.
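The two steps of the pigeonhole design described above (partition the covariate space, then balance arms within each pigeonhole) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: covariates are assumed to lie in the unit hypercube, pigeonholes are assumed to be equal-sized grid cells, and the balancing rule (flip a fair coin when a cell is balanced, otherwise assign to the under-represented arm) is one plausible reading of "balances the number of control and treated subjects". The names `pigeonhole_assign` and `cells_per_dim` are hypothetical.

```python
import random
from collections import defaultdict


def pigeonhole_assign(covariates, cells_per_dim=4, seed=0):
    """Sequentially assign subjects to control (0) or treated (1).

    Illustrative sketch only: assumes covariates lie in [0, 1]^d and
    that pigeonholes are equal-sized grid cells.
    """
    rng = random.Random(seed)
    # Running (treated minus control) count for each pigeonhole.
    imbalance = defaultdict(int)
    assignments = []
    for x in covariates:
        # Locate the grid cell; a coordinate equal to 1.0 is clamped
        # into the last cell along that dimension.
        cell = tuple(min(int(v * cells_per_dim), cells_per_dim - 1) for v in x)
        if imbalance[cell] == 0:
            # Cell is balanced: assign by a fair coin flip.
            arm = rng.randint(0, 1)
        else:
            # Cell is unbalanced: assign to the under-represented arm.
            arm = 0 if imbalance[cell] > 0 else 1
        imbalance[cell] += 1 if arm == 1 else -1
        assignments.append(arm)
    return assignments, dict(imbalance)
```

Under this rule, the number of treated and control subjects in any pigeonhole never differs by more than one, so the overall imbalance is bounded by the number of pigeonholes. A finer grid keeps matched subjects closer in covariate space but leaves more cells with an unmatched subject.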