Abstract

Model ensembles may provide estimates of uncertainties arising from unknown initial conditions and model deficiencies. Often, the ensemble mean is taken as the best estimate, and quantities such as the mean‐squared error between the ensemble mean and observations decrease with the number of ensemble members. But the ensemble size is often limited by available resources, so it would be useful to know how many ensemble members are needed before the error saturates. The dependence on ensemble size is often estimated by subsampling a large ensemble, but this strategy requires that the large ensemble is already available. Fortunately, in many situations, the dependence on ensemble size follows simple analytical relations when the quantity of interest (such as the mean‐squared error between ensemble mean and observations) is calculated over many grid points or time points. This holds both for ensemble means and for the related sampling variance. Here, we present such relations and demonstrate how they can be used to estimate the gain from increasing the ensemble size. Whereas previous work has mainly focused on the size of the model ensemble, we recognize that uncertainties in observations also play a role. We therefore also study the effect of using the mean of an ensemble of reanalyses. We show how the analytical relations can be used to estimate the point at which the gain from increasing the size of the model ensemble is dwarfed by the gain from increasing the number of reanalyses. We demonstrate these points using two climate model ensembles: a large multimodel ensemble and a large single‐model initial‐condition ensemble.
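As a rough illustration of the idea that the error of the ensemble mean can follow a simple relation in the ensemble size, the sketch below uses synthetic data and assumes the members are independent and exchangeable at every point. Under that assumption the mean-squared error of an n-member mean, averaged over many points, behaves like MSE(n) = a + b/n, where b is the mean inter-member variance and a is the saturation level. This form, the function names `mse_of_ensemble_mean` and `predicted_mse`, and all numerical settings are assumptions made for illustration; they are not taken from the paper and its relations may differ.

```python
import numpy as np


def mse_of_ensemble_mean(members, obs):
    """MSE between the ensemble mean and observations, averaged over points.

    members: array of shape (n_members, n_points); obs: array of shape (n_points,).
    """
    return float(np.mean((members.mean(axis=0) - obs) ** 2))


def predicted_mse(members, obs, n):
    """Predict the MSE of an n-member mean from the ensemble at hand.

    Assumes independent, exchangeable members, so that MSE(n) = a + b / n.
    """
    n_avail = members.shape[0]
    # b: unbiased inter-member variance, averaged over points
    b = float(members.var(axis=0, ddof=1).mean())
    # a: saturation level, obtained by removing the sampling term b / n_avail
    a = mse_of_ensemble_mean(members, obs) - b / n_avail
    return a + b / n


# Synthetic test bed (all values are arbitrary illustrative choices).
rng = np.random.default_rng(0)
n_points = 20000
truth = rng.normal(size=n_points)
bias = rng.normal(scale=0.5, size=n_points)  # error shared by all members
large = truth + bias + rng.normal(size=(40, n_points))
obs = truth  # observations taken as error-free in this sketch

small = large[:5]  # pretend only five members are available
predicted = predicted_mse(small, obs, 40)  # extrapolate to 40 members
empirical = mse_of_ensemble_mean(large, obs)  # what 40 members actually give
print(f"predicted MSE(40) = {predicted:.3f}, empirical MSE(40) = {empirical:.3f}")
```

In this sketch a five-member ensemble is enough to extrapolate the error of a 40-member mean because the averaging over many points makes the estimates of a and b stable. Observational uncertainty is ignored here; it would need its own term, which the sketch does not attempt to model.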