Reliable Machine Learning via Structured Distributionally Robust Optimization

Data sets used to train machine learning (ML) models often suffer from sampling biases and underrepresent marginalized groups. Standard ML models are trained to optimize average performance and therefore perform poorly on tail subpopulations. In “Distributionally Robust Losses for Latent Covariate Mixtures,” John Duchi, Tatsunori Hashimoto, and Hongseok Namkoong formulate a distributionally robust optimization (DRO) approach for training ML models to perform uniformly well over subpopulations. They design a worst-case optimization procedure over structured distribution shifts salient in predictive applications: shifts in (a subset of) the covariates. The authors propose a convex procedure that controls worst-case subpopulation performance and provide finite-sample (nonparametric) convergence guarantees. Empirically, they demonstrate their worst-case procedure on lexical similarity, wine quality, and recidivism prediction tasks and observe significantly improved performance on unseen subpopulations.
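
To make the worst-case idea concrete, the sketch below (a minimal Python illustration, with hypothetical names such as `worst_case_subpop_loss` and `alpha`) computes the generic worst-case subpopulation loss: the average loss over the hardest alpha-fraction of examples, rather than over the whole sample. This is not the authors' exact procedure, which restricts the worst case to shifts in (a subset of) the covariates and optimizes a convex dual formulation, but it conveys the objective that replaces average performance.

```python
# Illustrative sketch only (not the paper's exact procedure): the generic
# worst-case subpopulation loss, i.e. the average loss over the hardest
# alpha-fraction of examples. Names and the example data are hypothetical.
import numpy as np

def worst_case_subpop_loss(losses: np.ndarray, alpha: float) -> float:
    """Average loss over the worst-performing alpha-fraction of the sample.

    Equals the supremum of E_Q[loss] over all subpopulations Q that make up
    at least an alpha fraction of the empirical distribution.
    """
    n = losses.size
    k = int(np.ceil(alpha * n))              # number of tail examples involved
    worst = np.sort(losses)[::-1][:k]        # k largest losses, descending
    weights = np.full(k, 1.0 / (alpha * n))  # each tail point gets mass 1/(alpha*n)
    weights[-1] = 1.0 - weights[:-1].sum()   # fractional mass on the boundary point
    return float(weights @ worst)

# Example: per-example losses from some model; alpha = 0.1 targets the
# hardest 10% subpopulation instead of the average case.
rng = np.random.default_rng(0)
losses = rng.exponential(scale=1.0, size=1000)
print("average loss:   ", losses.mean())
print("worst-case loss:", worst_case_subpop_loss(losses, alpha=0.1))
```

Minimizing a worst-case objective of this kind, rather than the sample average, is what drives the uniform performance across unseen subpopulations that the authors report.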