Buildings account for 40% of the energy consumption and 31% of the CO2 emissions in the United States. Energy retrofits of existing buildings provide an effective means to reduce building energy consumption and carbon footprints. A key step in retrofit planning is to predict the effect of various potential retrofits on energy consumption. Decision-makers currently look to simulation-based tools for detailed assessments of a large range of retrofit options. However, simulations often require detailed building characteristic inputs, substantial expertise, and extensive computational power, presenting challenges for considering portfolios of buildings or evaluating large-scale policy proposals. Data-driven methods offer an alternative approach to retrofit analysis that could be more easily applied to portfolio-wide retrofit plans. However, current applications focus heavily on evaluating past retrofits, providing little decision support for future retrofits. This paper uses data from a portfolio of 550 federal buildings to demonstrate a data-driven approach that generalizes the heterogeneous treatment effects of past retrofits to predict future savings potential and support retrofit planning. The main findings are the following: (1) predicted savings vary widely across retrofitted buildings; (2) GSALink (a dashboard tool and fault detection system), commissioning, and HVAC investments had the highest average savings among the six actions analyzed; and (3) targeting high savers offers a 110–300 billion Btu improvement potential for the portfolio in site energy savings (the equivalent of 12–32% of the portfolio-total site energy consumption).