I am grateful to the editors for giving me the opportunity to comment on the recent paper “Ensemble Averaging and the Curse of Dimensionality,” by Bo Christiansen (Christiansen 2018). I will limit my comments to the theoretical material in section 2.
It is intriguing that in ensembles of climate simulations, the multimodel mean (MMM) sometimes outperforms even the best ensemble member in root-mean-square error (RMSE), computed over the pixels of a spatial field; a formal statement is given below. This outperformance is present in several fields in Fig. 9.7 of chapter 9 of the Fifth Assessment Report (FAR) of the IPCC (IPCC 2013). Prompted by this figure, in Rougier (2016) I provided a set of conditions that imply this outcome. To paraphrase my interpretation of these conditions, if individual simulators are tuned more on their overall bias than their large pixel errors, then we would expect the MMM to outperform all ensemble members, when the number of pixels is large.
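The comparison described above can be illustrated with a minimal sketch. It is a hypothetical toy model rather than the conditions of Rougier (2016): each member is assumed to equal the observations plus a constant bias plus independent pixel noise, and the names `toy_ensemble`, `biases`, and `noise_sd` are introduced here purely for illustration. The sketch computes the RMSE over pixels for every member and for the MMM under two bias configurations.

```python
import math
import random


def rmse(field, obs):
    """RMSE of one field against the observations, averaged over pixels."""
    return math.sqrt(sum((x - z) ** 2 for x, z in zip(field, obs)) / len(obs))


def member_and_mmm_rmse(members, obs):
    """Return (RMSE of each member, RMSE of the multimodel mean)."""
    k = len(members)
    # Pixel-by-pixel average across the k members.
    mmm = [sum(pixel_values) / k for pixel_values in zip(*members)]
    return [rmse(m, obs) for m in members], rmse(mmm, obs)


def toy_ensemble(n, biases, noise_sd=1.0, seed=0):
    """Hypothetical toy model: member j = obs + biases[j] + independent pixel noise."""
    rng = random.Random(seed)
    obs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    members = [[z + b + rng.gauss(0.0, noise_sd) for z in obs] for b in biases]
    return members, obs


# Small offsetting biases, large pixel noise: averaging removes most of the noise.
members, obs = toy_ensemble(n=5000, biases=[0.5, -0.5, 0.3, -0.3])
balanced_members, balanced_mmm = member_and_mmm_rmse(members, obs)

# One unbiased member, the rest sharing a large bias: the MMM inherits that bias.
members, obs = toy_ensemble(n=5000, biases=[0.0, 2.0, 2.0, 2.0])
skewed_members, skewed_mmm = member_and_mmm_rmse(members, obs)

print("balanced: best member", min(balanced_members), "MMM", balanced_mmm)
print("skewed:   best member", min(skewed_members), "MMM", skewed_mmm)
```

In the first configuration the MMM has a smaller RMSE than every member; in the second the unbiased member beats the MMM. Which outcome occurs in this toy setting depends on the assumed biases, not on the number of pixels alone.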
The challenge in the mathematics was the correlations between all of the RMSEs (more on this below). I tackled this challenge in a rather blunt way, by considering the asymptotic limit as the number of pixels grows without bound. In this case, the concentration of means around their expectations plus some additional asymptotic theory produced my key result, result 2. This made for rather a technical paper, and I was keen to read Dr. Christiansen’s geometric explanation.





In my view, (1) is not appropriate for ensembles of climate simulators [see Rougier et al. (2013) for a technical discussion]. In my analysis in Rougier (2016), I was careful not to require the assumption of zero bias for every member of the ensemble, let alone the stronger assumption that the ensemble members and the observations are jointly independent and identically distributed (IID). The IID assumption is inappropriate both a priori, given what we know about how climate simulators are constructed and tuned, and a posteriori, in the light of the “genealogy” evidence of Knutti et al. (2013).
Putting these reservations aside, we need to be clear about what Dr. Christiansen must show with his geometric explanation:
In this notation,
Now there is another difficulty with Dr. Christiansen’s model—a kind of logical trap. For his model, I have already proved that the MMM always has the smallest RMSE for sufficiently large n. And yet this is not what we see in, say, the different output fields of Fig. 9.7 of the FAR: in some output fields the MMM is best, whereas in others it is little better than the median. To save his model, Dr. Christiansen would have to argue that n in the FAR is not large enough to enforce convergence. But if n is not large enough, then an explanation based on large n is vacuous. This issue does not arise in my model because each output field can have a different configuration of biases across the simulators, and the performance of the MMM depends on the configuration of biases.
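The large-n behavior of a jointly IID model can be sketched with a short calculation. This is an illustration under stated assumptions, not a reconstruction of the proof referred to above: the symbol X_{ij} for member j at pixel i, the common finite variance σ², and independence across the n pixels are all assumptions introduced here.

```latex
% Assumptions (introduced for illustration): X_{ij} is member j at pixel i,
% Z_i is the observation at pixel i, all jointly IID with finite variance
% \sigma^2, and independent across the n pixels (so the law of large numbers applies).
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n} (X_{ij} - Z_i)^2
  &\xrightarrow{\;n\to\infty\;}
  \operatorname{Var}(X_{ij}) + \operatorname{Var}(Z_i) = 2\sigma^2,
  \qquad j = 1,\dots,k, \\
\frac{1}{n}\sum_{i=1}^{n} (\bar{X}_i - Z_i)^2
  &\xrightarrow{\;n\to\infty\;}
  \frac{\sigma^2}{k} + \sigma^2 = \Bigl(1 + \frac{1}{k}\Bigr)\sigma^2,
  \qquad \bar{X}_i = \frac{1}{k}\sum_{j=1}^{k} X_{ij}.
\end{align*}
```

Because 1 + 1/k < 2 whenever k ≥ 2, the limiting mean squared error of the MMM lies strictly below that of every member, so under these assumptions the MMM has the smallest RMSE once n is large enough. Nothing in these limits depends on the output field beyond the common scale σ², which is why an IID model cannot accommodate fields in which the MMM is not the best performer.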
As I have already remarked, the challenge in the mathematics is that the k + 1 RMSEs are all correlated, because of the common term Z, and because
The nature of Dr. Christiansen’s explanation is to treat each of the terms Z,
Dr. Christiansen’s treatment of
Also in section 2, there is the striking claim on p. 1590 that Dr. Christiansen’s model and his geometric explanation can resolve a long-standing regularity, which is that the RMSE of the MMM is often about 30% smaller than the median of the RMSEs of the k ensemble members. Unfortunately Dr. Christiansen’s explanation again relies on
In summary, section 2 of Dr. Christiansen’s paper aims to give a geometric explanation of the outperformance of the MMM. Unfortunately his explanation is not valid, owing to its violation of the probability calculus. I also maintain that Dr. Christiansen’s model is too restrictive for the current ensemble of climate simulators, and its strong conclusions are refuted empirically.
Acknowledgments
This research was supported by the EPSRC SuSTaIn Grant EP/D063485/1.
REFERENCES
Christiansen, B., 2018: Ensemble averaging and the curse of dimensionality. J. Climate, 31, 1587–1596, https://doi.org/10.1175/JCLI-D-17-0197.1.
IPCC, 2013: Climate Change 2013: The Physical Science Basis. Cambridge University Press, 1535 pp., https://doi.org/10.1017/CBO9781107415324.
Knutti, R., D. Masson, and A. Gettelman, 2013: Climate model genealogy: Generation CMIP5 and how we got there. Geophys. Res. Lett., 40, 1194–1199, https://doi.org/10.1002/grl.50256.
Rougier, J., 2016: Ensemble averaging and mean squared error. J. Climate, 29, 8865–8870, https://doi.org/10.1175/JCLI-D-16-0012.1.
Rougier, J., M. Goldstein, and L. House, 2013: Second-order exchangeability analysis for multimodel ensembles. J. Amer. Stat. Assoc., 108, 852–863, https://doi.org/10.1080/01621459.2013.802963.