Transfer learning is crucial for training models that generalize to unlabeled target populations using labeled source data, especially in real-world studies where label scarcity and covariate shift are common. While most research focuses on model estimation, there is limited literature on transfer inference for model accuracy despite its importance. We introduce a novel Doubly Robust Augmented Model Accuracy Transfer InferenCe (DRAMATIC) method for point and interval estimation of commonly used classification performance measures in an unlabeled target population with labeled source data. DRAMATIC derives and evaluates a potentially misspecified risk model for a binary response, leveraging high-dimensional adjustment features from both source and target data. It builds on an imputation model for the response mean and a density ratio model to characterize distributional shifts. The method constructs doubly robust estimators that are valid when either model is correctly specified and certain sparsity assumptions hold. Simulations show negligible bias in point estimation and satisfactory empirical coverage levels in confidence intervals. The utility of DRAMATIC is illustrated by transferring a genetic risk prediction model and its accuracy evaluation for type II diabetes across two patient cohorts in Mass General Brigham (MGB). Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
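To make the doubly robust idea concrete, here is a minimal sketch of a generic augmented estimator of a target-population mean under covariate shift. It is a textbook-style construction and not necessarily the exact DRAMATIC estimator; the function names `dr_target_mean` and `imputed_error`, and the assumption that the imputation values and density ratio weights have already been fitted elsewhere, are ours for illustration. The estimate combines an imputation term averaged over the unlabeled target sample with a density-ratio-weighted residual correction averaged over the labeled source sample.

```python
import numpy as np


def dr_target_mean(h_src, m_src, w_src, m_tgt):
    """Generic doubly robust (augmented) estimate of E_target[h(Y, X)].

    All inputs are hypothetical, pre-computed quantities:
      h_src : observed h(Y_i, X_i) on labeled source samples
      m_src : imputed E[h(Y, X) | X_i] on the same source samples
      w_src : estimated density ratio p_target(X_i) / p_source(X_i)
      m_tgt : imputed E[h(Y, X) | X_j] on unlabeled target samples
    """
    h_src = np.asarray(h_src, dtype=float)
    m_src = np.asarray(m_src, dtype=float)
    w_src = np.asarray(w_src, dtype=float)
    m_tgt = np.asarray(m_tgt, dtype=float)

    # Imputation term: average the model-based prediction over the target sample.
    imputation_term = m_tgt.mean()
    # Augmentation term: reweighted source residuals correct imputation bias.
    augmentation_term = np.mean(w_src * (h_src - m_src))
    return imputation_term + augmentation_term


def imputed_error(mu, pred):
    """Conditional misclassification probability for a binary response.

    mu   : imputed P(Y = 1 | X)
    pred : hard predictions in {0, 1} from the risk model being evaluated
    """
    mu = np.asarray(mu, dtype=float)
    pred = np.asarray(pred)
    # If we predict 1, we err when Y = 0 (prob 1 - mu); if we predict 0, when Y = 1.
    return np.where(pred == 1, 1.0 - mu, mu)
```

Intuitively, if the imputation model is correct, the weighted residuals average to roughly zero and the first term carries the estimate; if instead the density ratio is correct, the weighted source terms recover the target mean and the imputation contributions approximately cancel. In a high-dimensional setting, the nuisance fits would typically come from regularized regressions, which this sketch leaves out.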