Abstract

We present a global evaluation of monthly precipitation and temperature forecasts from 16 seasonal forecasting models within the North American Multi-Model Ensemble (NMME) Phase-1 system, using Multi-Source Weighted-Ensemble Precipitation version 2 (MSWEP-V2; precipitation) and Climatic Research Unit TS4.01 (CRU-TS4.01; temperature) data as reference. We first assessed forecast skill for lead times of 1–8 months using the Kling–Gupta efficiency (KGE), an objective performance metric that combines correlation, bias, and variability. Next, we carried out an empirical orthogonal function (EOF) analysis to compare the spatiotemporal variability structures of the forecasts with those of the reference data. We found that, in most cases, precipitation skill was highest at the first lead time (i.e., the forecast for the month of initialization) and dropped rapidly thereafter, while temperature skill was much higher overall and better retained at longer lead times, which is indicative of stronger temporal persistence. An assessment over 21 regions and four seasons showed that skill had strong regional and seasonal dependencies. Some tropical regions, such as the Amazon and Southeast Asia, showed high skill even at longer lead times for both precipitation and temperature. Rainy seasons were generally associated with high precipitation skill, while temperature skill was low during winter. Overall, precipitation forecast skill was highest for the NASA, NCEP, CMC, and GFDL models, while for temperature the NASA, CFSv2, COLA, and CMC models performed best. The spatiotemporal variability structures were better captured for precipitation than for temperature. Simple averaging of the forecasts did not produce noticeably better results, emphasizing the need for more advanced weight-based averaging schemes.
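To make the skill metric concrete, the following is a minimal sketch of how a KGE score could be computed for one forecast series against a reference series. It assumes the original three-component formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), with r the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means. The abstract does not state which KGE variant was used, so this choice, along with the function name `kge` and its arguments, is an assumption made for illustration only.

```python
import math


def kge(forecast, reference):
    """Kling-Gupta efficiency of a forecast series against a reference.

    Assumed formulation (not confirmed by the abstract):
        KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
    where r is the Pearson correlation, alpha = sd(forecast) / sd(reference),
    and beta = mean(forecast) / mean(reference).
    """
    # Keep only time steps where both series have valid values.
    pairs = [(f, o) for f, o in zip(forecast, reference)
             if not (math.isnan(f) or math.isnan(o))]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid forecast/reference pairs")

    mean_f = sum(f for f, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n

    # Population (divide-by-n) moments; the choice cancels in r and alpha.
    var_f = sum((f - mean_f) ** 2 for f, _ in pairs) / n
    var_o = sum((o - mean_o) ** 2 for _, o in pairs) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in pairs) / n

    sd_f, sd_o = math.sqrt(var_f), math.sqrt(var_o)
    if sd_f == 0 or sd_o == 0 or mean_o == 0:
        raise ValueError("KGE is undefined for constant series or zero reference mean")

    r = cov / (sd_f * sd_o)      # correlation component
    alpha = sd_f / sd_o          # variability component
    beta = mean_f / mean_o       # bias component
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A perfect forecast gives KGE = 1, and the score decreases as correlation, variability, or bias departs from its ideal value of 1. Because beta is a ratio of means, temperature series would need to be on a scale where the reference mean is well away from zero (for example kelvin) for this sketch to behave sensibly.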