Abstract

Wildfires emit vast amounts of aerosols and trace gases into the atmosphere, exerting myriad effects on air quality, climate, and human health. Ensemble forecasting has been proposed as a way to reduce the large uncertainties in wildfire air pollution forecasts. This study presents the development of a multi‐model ensemble (MME) wildfire air pollution forecast over North America. The ensemble members include regional models (GMU‐CMAQ, NACC‐CMAQ, and HYSPLIT), global models (GEFS‐Aerosols, GEOS5, and NAAPS), and a global ensemble (ICAP‐MME). The performance of the ensemble forecast was evaluated against aerosol optical depth (AOD) retrieved by MAIAC and VIIRS‐SNPP and against AirNow surface PM2.5 measurements during the 2020 Western United States “Gigafire” events (August–September 2020). Compared to the individual models, the ensemble mean significantly reduced the biases and produced more consistent and reliable forecasts during extreme fire events. For AOD forecasts, the ensemble mean improved model performance, for example raising the correlation with VIIRS AOD to 0.62, compared with 0.33–0.57 for the individual models. The ensemble mean also yielded the best overall RANK (a composite indicator of four statistical metrics) when compared to both VIIRS and MAIAC AOD. For the surface PM2.5 forecast, the ensemble mean outperformed the individual models, with the strongest correlation (0.60 vs. 0.43–0.54 for individual models), lowest fractional bias (0.54 vs. 0.55–1.32), highest hit rate (87% vs. 40%–82%), and highest RANK (2.83 vs. 2.40–2.81). Finally, the ensemble shows the potential to provide probability forecasts of air quality exceedances. Such exceedance probability forecasts can be further applied to early warnings of extreme air pollution episodes during large wildfire events.
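The ensemble mean and the exceedance probability mentioned above can be illustrated with a minimal sketch. The abstract does not state how members are weighted or which threshold defines an exceedance, so this example assumes equal weights, takes the exceedance probability to be the fraction of members forecasting a value above the threshold, and uses an illustrative PM2.5 threshold of 35 µg/m³. The function names and the forecast values are hypothetical and are not taken from the study.

```python
from statistics import fmean


def ensemble_mean(member_forecasts):
    """Equal-weight mean across members at each site.

    member_forecasts: list of per-member lists, all on the same sites.
    """
    return [fmean(values) for values in zip(*member_forecasts)]


def exceedance_probability(member_forecasts, threshold):
    """Fraction of members forecasting a value above `threshold` at each site."""
    n_members = len(member_forecasts)
    return [
        sum(value > threshold for value in values) / n_members
        for values in zip(*member_forecasts)
    ]


# Hypothetical PM2.5 forecasts (ug/m3): four members, three sites.
members = [
    [12.0, 40.0, 150.0],
    [20.0, 30.0, 90.0],
    [16.0, 50.0, 210.0],
    [8.0, 36.0, 30.0],
]
THRESHOLD = 35.0  # assumed exceedance threshold, for illustration only

mean_forecast = ensemble_mean(members)
prob = exceedance_probability(members, THRESHOLD)

print(mean_forecast)  # [14.0, 39.0, 120.0]
print(prob)           # [0.0, 0.75, 0.75]
```

In this toy case the second and third sites have the same exceedance probability (0.75) even though their ensemble means differ by a factor of three, which is one reason a probability forecast can complement the mean when issuing warnings.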