Climate change detection and attribution (D&A) studies rely on historical simulations with specified combinations of forcings, drawn from a broad set of climate models, to quantify the greenhouse gas and other contributions to observed climate change. In the most recent CMIP exercise, CMIP5 (Taylor et al. 2012), natural-only (NAT) and greenhouse gas–only (GHG) simulations were prioritized among the possible experiments, in addition to the all-forcing (ALL) simulations. However, the question of how to design this set of experiments optimally to address specific scientific questions has not been explicitly considered.

Here we compare four possible strategies: ALL+GHG+NAT experiments as in CMIP5, ALL+AER+NAT, ALL+AER+GHG, and AER+GHG+NAT, where AER denotes simulations of the response to aerosols alone. To make this comparison, we focus on how accurately each strategy allows the past GHG-induced warming to be estimated, a question closely related to providing an observationally based constraint on the transient climate response (TCR).

In standard D&A analyses, the observations *Y* are regressed onto the simulated responses to the individual forcings:

*Y* = Σ_*i* *β_i* *X_i* + *ε*, (2)

where *X_i* denotes the simulated response to forcing *i*, the *β_i* are unknown scaling factors, and *ε* denotes internal variability. Note that in (2), the simulated responses to the individual forcings may be inferred using any of the four sets of simulations discussed above under the assumption of linear additivity (i.e., *X*_ALL = *X*_GHG + *X*_AER + *X*_NAT).
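The linear-additivity assumption can be made concrete with a short sketch; all numbers below are illustrative placeholders, not actual model output:

```python
import numpy as np

# Decadal-mean temperature responses (°C); illustrative values only.
X_ALL = np.array([0.10, 0.20, 0.40, 0.70])      # all forcings
X_AER = np.array([-0.05, -0.10, -0.20, -0.30])  # aerosol-only
X_NAT = np.array([0.02, 0.00, -0.02, 0.01])     # natural-only

# Under linear additivity, X_ALL = X_GHG + X_AER + X_NAT, so the
# ALL+AER+NAT strategy can infer the GHG-only response without
# simulating it explicitly:
X_GHG = X_ALL - X_AER - X_NAT
```

Any three of the four experiments determine the fourth in the same way, which is what makes the four strategies formally interchangeable under this assumption.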

Here we use Monte Carlo simulations (MCSs) to address this question. While the real world has only one realization, such MCSs allow us to generate a large sample of virtual observations and model responses under controlled assumptions. Based on (2), we generate a large sample of virtual observations *Y* and virtual responses *X_i*, from which the GHG scaling factor *β*_GHG is estimated. The accuracy of each estimate can then be assessed against its *true* value, which is known because it was set to unity in the MCSs.

Realistic values are used both for the *true* responses to the individual forcings and for the *true* covariance matrix of internal variability.

Our analysis is based on decadal averages of 1901–2010 global temperature projected onto T2 spherical harmonics. The covariance matrix of internal variability is assumed to be known (i.e., uncertainty in its estimation is not considered). A more detailed description of such MCSs is provided in Ribes et al. (2013; see their Fig. 1, black lines). Finally, we assume a CMIP phase 6 (CMIP6) experimental design with a “budget” of 25 historical climate change simulations that can be allocated, in varying numbers, to experiments with different combinations of forcings. We assume that such a design would require modeling groups to perform 10 ALL simulations, as has been proposed, and that a total of 15 additional experiments could be requested with different combinations of forcings, consistent with the number of historical simulations that individual modeling centers were able to provide for CMIP5.
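The MCS logic can be sketched as follows. This is a deliberately simplified illustration, not the study's implementation: ordinary least squares stands in for the optimal-fingerprinting estimator of Ribes et al. (2013), internal variability is taken as spatially white with a known standard deviation, and all numerical values (responses, ensemble sizes, noise level) are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

p = 10                            # number of spatio-temporal coordinates
n_mc = 5000                       # number of Monte Carlo replicates
sigma = 0.3                       # internal-variability std dev (assumed known)
n_runs = np.array([10, 10, 5])    # illustrative ensemble sizes per experiment

# Hypothetical "true" responses to GHG, AER, and NAT forcings (columns),
# with decreasing amplitudes as in the text.
X_true = rng.normal(size=(p, 3)) * np.array([1.0, 0.5, 0.2])
beta_true = np.ones(3)            # scaling factors set to unity, as in the MCSs

errors = np.empty(n_mc)
for k in range(n_mc):
    # Virtual observations: true signal plus one draw of internal variability.
    y = X_true @ beta_true + rng.normal(scale=sigma, size=p)
    # Ensemble-mean responses are noisy too; variance shrinks with ensemble size.
    X_hat = X_true + rng.normal(scale=sigma / np.sqrt(n_runs), size=(p, 3))
    beta_hat, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    errors[k] = beta_hat[0] - beta_true[0]   # error on the GHG scaling factor
```

Repeating such an experiment for each candidate allocation of the 15 non-ALL runs, and comparing the resulting error distributions, mirrors the comparison of the four strategies.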

Results from the MCSs are shown in Fig. 2a in terms of the square root of the 90% quantile of squared errors (Rq90SE) on the estimated GHG scaling factor *β*_GHG.
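The Rq90SE metric itself is straightforward to compute from a sample of estimation errors; a minimal sketch (the error values are toy numbers for illustration):

```python
import numpy as np

def rq90se(errors):
    """Square root of the 90% quantile of squared errors (Rq90SE)."""
    return float(np.sqrt(np.quantile(np.asarray(errors) ** 2, 0.9)))

# Toy sample of estimation errors (beta_hat - 1), for illustration only.
errs = [-0.2, 0.1, 0.05, -0.3, 0.15]
score = rq90se(errs)
```

Unlike a root-mean-square error, this metric characterizes the tail of the error distribution, penalizing strategies that occasionally produce large misestimates.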

Over the four strategies considered, the highest accuracy (i.e., lowest Rq90SE) in estimating *β*_GHG is obtained with the ALL+AER+NAT strategy, in which the GHG response is inferred through linear additivity rather than simulated explicitly.

Overall, these results suggest that accurate estimation of the signals with the smallest amplitudes (primarily NAT-only, then AER-only) is critical for D&A, consistent with the fact that the estimation of smaller-amplitude signals is more affected by internal variability than that of larger signals. The conclusion that explicit modeling of the AER response should be prioritized over modeling of the GHG response is expected to be robust to uncertainty in the *true* AER response, as our analysis was based on a simulated response to aerosols that was particularly strong. Considering a weaker AER response could lead to a slightly increased Rq90SE but would not be expected to change the ranking of the strategies.

This work suggests that an ALL+AER+NAT strategy be adopted for the upcoming CMIP6 D&A exercise. It also suggests that a large NAT ensemble is helpful: this would also be useful for studies that only seek to separate anthropogenic and natural responses. Nevertheless, a few limitations should be mentioned. First, our brief study is based on a single criterion (estimation of TCR) and a single method (standard D&A analysis). Results may also be influenced by the number of runs considered (here 25, with 10 allocated to ALL-forcing runs). Second, the study was conducted under a perfect model assumption, which means that modeling uncertainty was not taken into account. A more sophisticated statistical framework would have to be considered to account for such uncertainty. Third, forcings other than GHG, AER, and NAT have been neglected. Fourth, considerations other than D&A might be taken into account in designing this set of experiments—for example, related to our understanding of the physical processes involved in the response to each forcing, our confidence in the specifications of the different forcings, or the technical difficulty of carrying out consistent individual-forcing simulations in a range of models with different levels of complexity and different processes represented.

## Acknowledgments

AR gratefully acknowledges support from the Canadian Sea Ice and Snow Evolution network (CanSISE; http://www.cansise.ca), PCIC, and CNRM that facilitated a recent research visit to CCCMA and PCIC. The authors are grateful to Slava Kharin and Qiuzi Wen for useful comments on a previous version of this manuscript.

## REFERENCES

Allen, M., and P. Stott, 2003: Estimating signal amplitudes in optimal fingerprinting, part I: Theory. *Climate Dyn.*, **21**, 477–491, doi:10.1007/s00382-003-0313-9.

Ribes, A., L. Terray, and S. Planton, 2013: Application of regularised optimal fingerprinting to attribution. Part I: Method, properties and idealised analysis. *Climate Dyn.*, **41**, 2817–2836, doi:10.1007/s00382-013-1735-7.

Ribes, A., and L. Terray, 2013: Application of regularised optimal fingerprinting to attribution. Part II: Application to global near-surface temperature. *Climate Dyn.*, **41**, 2837–2853, doi:10.1007/s00382-013-1736-6.

Taylor, K., R. Stouffer, and G. Meehl, 2012: An overview of CMIP5 and the experiment design. *Bull. Amer. Meteor. Soc.*, **93**, 485–498, doi:10.1175/BAMS-D-11-00094.1.