Abstract

This study presents a comprehensive evaluation of mean and extreme surface air temperature and precipitation over the Tibetan Plateau as simulated by the Coupled Model Intercomparison Project Phase 6 (CMIP6) multimodel ensemble. Simulations from 29 climate models are compared against gridded observations for the period 1961–2012. The models reasonably reproduce the overall spatial patterns of the 1961–2012 averages of seasonal mean temperature and precipitation. Nevertheless, they tend to underestimate mean temperature and overestimate precipitation accumulations. Specifically, the multimodel mean estimates of seasonal average temperature over the plateau are 0.5°C–2.6°C colder than observed, while the corresponding estimates of seasonal precipitation accumulation are 218% (spring: March–May), 76% (summer: June–August), 129% (autumn: September–November), and 533% (winter: December–February) of those observed. The models also capture the observed increasing trends of mean temperature and precipitation in all seasons but underestimate the rates of increase for both variables. Models’ ability to simulate temperature and precipitation extremes is also evaluated using a set of extreme indices defined by the Expert Team on Climate Change Detection and Indices. On average, models tend to underestimate the averages of annual maximum daily maximum temperature, annual minimum daily minimum temperature, frost days, warm nights, and consecutive dry days but overestimate the corresponding averages of annual maximum 5-day precipitation accumulation and simple daily intensity. Generally, models simulate the signs of the trends in extreme indices but underestimate their magnitudes and misrepresent their spatial patterns.