An evolutionary algorithm for the direct optimization of covariate balance between nonrandomized populations.

Frank Kleinjung,Hooman Sedghamiz,Stephen Privitera,Alexander Hartenstein,Tatsiana Vaitsiakhovich

doi:10.1002/pst.2352

Abstract

Matching reduces confounding bias in comparing the outcomes of nonrandomized patient populations by removing systematic differences between them. Under very basic assumptions, propensity score (PS) matching can be shown to eliminate bias entirely in estimating the average treatment effect on the treated. In practice, misspecification of the PS model leads to deviations from theory and matching quality is ultimately judged by the observed post-matching balance in baseline covariates. Since covariate balance is the ultimate arbiter of successful matching, we argue for an approach to matching in which the success criterion is explicitly specified and describe an evolutionary algorithm to directly optimize an arbitrary metric of covariate balance. We demonstrate the performance of the proposed method using a simulated dataset of 275,000 patients and 10 matching covariates. We further apply the method to match 250 patients from a recently completed clinical trial to a pool of more than 160,000 patients identified from electronic health records on 101 covariates. In all cases, we find that the proposed method outperforms PS matching as measured by the specified balance criterion. We additionally find that the evolutionary approach can perform comparably to another popular direct optimization technique based on linear integer programming, while having the additional advantage of supporting arbitrary balance metrics. We demonstrate how the chosen balance metric impacts the statistical properties of the resulting matched populations, emphasizing the potential impact of using nonlinear balance functions in constructing an external control arm. We release our implementation of the considered algorithms in Python.

Full Text