Ocean–Land Teleconnections and Chaotic Atmospheric Variability

Randal D. Koster, Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, Maryland

Siegfried D. Schubert, Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, Maryland, and Science Systems and Applications, Inc., Lanham, Maryland

Anthony M. DeAngelis, Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, Maryland, and Science Systems and Applications, Inc., Lanham, Maryland

Yehui Chang, Global Modeling and Assimilation Office, NASA Goddard Space Flight Center, Greenbelt, Maryland, and Morgan State University, Baltimore, Maryland

Adam A. Scaife, Met Office Hadley Centre, Met Office, Exeter, Devon, United Kingdom, and Department of Mathematics and Statistics, University of Exeter, Exeter, United Kingdom


Abstract

A large ensemble of multidecadal atmospheric general circulation model (AGCM) simulations is examined to determine a quantity central to the model’s potential predictability—the fraction of the simulated monthly air temperature (T2M) variability that is tied to the imposed sea surface temperature (SST) boundary conditions as opposed to the background atmospheric noise. Combining this information with ensemble simulation data from other AGCMs in turn allows an intermodel comparison of two separate quantities: model potential T2M predictability and the underlying (in the absence of noise) teleconnections between SSTs and continental T2M. To a large extent, the models tend to agree with each other regarding both—they all show, for example, the expected highest predictability in the tropics as well as low potential predictability in central Asia, and, particularly for DJF and MAM, they show very similar ocean–land teleconnections throughout the Americas. However, the models do show some differences in teleconnections, indicating room for model improvement that could, in principle, lead to benefits for long-term prediction. Importantly, by combining the model results with observational temperature data, we provide a new estimation of real-world predictability, a property of Nature that is not directly observable. The latter results suggest, with caveats, that for monthly T2M in the tropics, AGCMs tend to overestimate the ratio of predictable variance to noise-derived variance.

© 2025 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Randal Koster, randal.koster@gmail.com.


1. Introduction

The Earth system is characterized by complex interactions between its oceanic, atmospheric, land surface, and cryospheric components. These interactions can imprint themselves on the variability of key meteorological quantities such as near-surface air temperature (T2M) and precipitation—quantities of great relevance to society. A long-lived (multimonth) tropical sea surface temperature (SST) anomaly, for example, may induce a similarly long-lived rainfall or temperature anomaly above it and perhaps in downwind land regions (Kumar et al. 2013). A large-scale soil moisture anomaly may survive for a month or more, encouraging a similar time scale in the overlying T2M and, to a lesser extent, precipitation (Koster et al. 2011). However, while the slower-changing components of the Earth system have the potential to impart variability and memory to atmospheric quantities, thereby imparting some predictability to these quantities (e.g., Smith et al. 2012; Kushnir et al. 2019), atmospheric variables are also subject to the whims of dynamical chaos—the atmosphere’s unpredictable variability can indeed swamp out the externally forced signals, rendering them irrelevant. Chaos imposes fundamental limits on our ability to predict important meteorological variations, with useful predictability usually decreasing with increasing lead time (Lorenz 1963).

Establishing the contribution of unpredictable chaotic atmospheric variability (hereafter often referred to as “noise”) to the variability of, say, T2M over a specific continental region is a critical first step toward quantifying the ability to predict that region’s T2M at subseasonal-to-seasonal time scales. One way to analyze the climate system’s unpredictable variability is through ensembles of simulations with forecast systems consisting of coupled ocean, sea ice, atmosphere, and land models. An early analysis was performed by Murphy (1990), who demonstrated that an ensemble mean produced a modestly more accurate forecast than did the individual simulations comprising the ensemble. Siegert et al. (2016) provided a Bayesian framework for characterizing, in the context of an ensemble forecasting system, the strength of a forced signal in the presence of noise, using it specifically to analyze the predictability of the North Atlantic Oscillation (NAO). Eade et al. (2014) introduced the “ratio of predictable components” (RPC) metric as a means of characterizing, through analysis of an ensemble of coupled model simulations, the strength of the forced response in the presence of unpredictable variability. Using this framework, they found in their model a weak amplitude of the predictable signal relative to the unpredictable noise—paradoxically, the model could then better predict an observed time series than it could predict its own ensemble members. This paradox was further illuminated by Scaife and Smith (2018) and examined using a Markov model by Zhang and Kirtman (2019).

So-called “AMIP style” (Gates 1992) atmospheric model simulations—simulations in which the atmospheric model runs uncoupled from the ocean model, instead responding to prescribed, time-varying fields of SST and sea ice—are also subject to chaotic atmospheric dynamics (e.g., Scaife et al. 2009). For examining the specific question of how a given time series of SST fields imparts a signature on the evolution of Earth’s atmospheric fields, the AMIP-style simulation framework has a unique advantage. An ensemble of parallel AMIP-style simulations, each simulation using the same set of time-varying SST boundary conditions, will provide (after a suitable spinup period) multiple possible realizations of multidecadal weather consistent with those SSTs. Averaging across the ensemble members filters out much of the model’s atmospheric noise, allowing the isolation of the underlying SST-forced signal.

There are, however, dangers with reading too much into AMIP-style analyses. For example, using such an approach, Rodwell et al. (1999) and Mehta et al. (2000) demonstrated a more accurate reproduction of the behavior of the NAO when an ensemble of SST-forced simulations was averaged together, ostensibly suggesting—given the predictability of SSTs at longer time scales—that long-term predictions of the NAO might be possible. In remarking on these two studies, however, Bretherton and Battisti (2000) point out that in the real world, SSTs themselves respond to ocean–atmosphere interactions (a fully coupled seasonal forecast simulation, given imperfections in the modeled atmospheric and oceanic components and given chaotic dynamics, would likely not reproduce the SST time series prescribed in the AMIP-style simulations), reducing the relevance of the two studies to multimonth predictability. True predictability involves coupled ocean–atmosphere behavior, such as that examined in later studies of surface NAO predictability in coupled forecast systems (Scaife et al. 2014; Athanasiadis et al. 2017).

This said, SSTs do persist from month to month. Because the ocean has substantially more thermal inertia than the atmosphere, most persistent signals in the atmosphere at, say, the monthly time scale are likely to come from the ocean, giving AMIP-style analyses some relevance for predictability studies. For the specific question of how a given time series of SST fields guides the atmosphere in the presence of atmospheric noise, without added interpretations regarding long-term prediction, an ensemble of AMIP-style simulations can provide useful insight. Note, in addition, that there generally appears to be little difference in the SST-forced responses in coupled and uncoupled climate model simulations (Chen and Schneider 2014) although some regions, including the Indian Ocean, do show differences (Copsey et al. 2006).

Accordingly, in this paper, we examine a large ensemble (45 members) of multidecadal (1981–2014) AMIP-style simulations to quantify a model’s chaotic background variability. While this part of the study is reminiscent of past predictability studies, we then supplement the analysis with data from other AMIP-style ensembles (smaller ensembles, produced by different AGCMs) to quantify similarities in their chaotic variability as well as the agreement in their underlying ocean–land teleconnections. This latter comparison, used in conjunction with observational T2M data, allows a heretofore unexplored means of estimating predictability levels in the real world. Through a novel approach, in addition to characterizing and comparing the predictability characteristics of the models examined, we, in effect, provide new estimates of this otherwise unmeasurable real-world property.

Our modeling and data sources are provided in section 2, and our methodologies for teasing out unpredictable variability and teleconnection metrics from the model results and observations are provided in section 3. Results are presented in section 4, followed by discussion and conclusions in sections 5 and 6.

2. Data and modeling systems

a. GEOS modeling system

The analyses in this paper focus largely on a 45-member ensemble of AMIP-style simulations with the National Aeronautics and Space Administration (NASA) Goddard Earth Observing System (GEOS) atmospheric general circulation model (AGCM; Molod et al. 2015). The version of the AGCM used here is essentially the same as that underlying the NASA Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), reanalysis (Gelaro et al. 2017); it is run on a cubed-sphere grid and includes state-of-the-art representations of a host of atmospheric and land surface processes. Two important differences, however, should be noted between the AGCM used here and that underlying MERRA-2. First, we run the AGCM here at a coarser resolution (1° × 1°, rather than the ∼1/2° × 1/2° resolution underlying MERRA-2). Second, we use a tendency bias correction (TBC) approach to mitigate some of the model’s long-term biases (Chang et al. 2019). We apply diurnally varying and seasonally varying TBC corrections derived from the increments (forecast minus analysis) computed during the long-term MERRA-2 reanalysis itself; because the corrections applied are climatological, containing no prescribed interannual variability, the TBC-corrected AGCM remains a fully free-running global atmospheric model.

Each simulation in the 45-member ensemble covers the period 1981–2014, a period chosen for consistency with the other datasets used in our analysis. The simulations follow AMIP protocols, being forced by prescribed, observations-based fields of sea surface temperature and sea ice that vary from day to day and from year to year. For the present analysis, we process the simulated near-surface air temperature (T2M) values into monthly averages, our aim being to understand the sources of monthly-scale T2M variability over continental areas.

Detrending the data was found to be necessary to isolate the signals borne of interannual SST variability from those associated with long-term SST trends. Trends were removed on a monthly basis; at each grid cell for a given month, the ensemble mean monthly T2M was computed for each year in 1981–2014, and the resulting time series was regressed against the year index. Then, for a given year, the T2M value computed with the regressed relationship was subtracted from each ensemble member’s monthly T2M.
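
To make the procedure concrete, the following is a minimal NumPy sketch of this per-month detrending; the array layout (members × years × months × lat × lon) and the variable names are our own illustrative assumptions, not the GEOS processing code.

```python
import numpy as np

def detrend_monthly(t2m):
    """Remove, for each calendar month and grid cell, the linear trend fit to the
    ensemble-mean T2M series (1981-2014) from every ensemble member.

    t2m : float array of shape (n_members, n_years, 12, nlat, nlon)  [assumed layout]
    """
    n_members, n_years = t2m.shape[:2]
    years = np.arange(n_years, dtype=float)             # year index as the regressor
    ens_mean = t2m.mean(axis=0)                         # (n_years, 12, nlat, nlon)
    detrended = t2m.copy()
    for month in range(12):
        y = ens_mean[:, month].reshape(n_years, -1)     # (n_years, ncells)
        slope, intercept = np.polyfit(years, y, deg=1)  # per-cell linear fit
        trend = (np.outer(years, slope) + intercept).reshape(ens_mean[:, month].shape)
        detrended[:, :, month] -= trend[np.newaxis]     # subtract from every member
    return detrended
```

For a single realization (such as the observations discussed below), the same logic applies with the ensemble dimension of length one.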

b. Observed monthly temperature data

Gridded daily temperature (T2M) data at 0.5° × 0.5° resolution are available from the Climate Prediction Center (CPC; https://www.esrl.noaa.gov/psd/data/gridded/data.cpc.globaltemp.html). The raw source of the data is station-based measurements of minimum daily temperature Tmin and maximum daily temperature Tmax at the 2-m height; these point measurements were spatially interpolated by CPC onto the 0.5° × 0.5° grid. We converted the gridded Tmin and Tmax values into daily T2M using T2M = (Tmin + Tmax)/2, and we then spatially and temporally averaged the daily temperatures into monthly averages on the atmospheric model’s 1° × 1° grid. Finally, the T2M values for the period 1981–2014 were detrended on a monthly basis using the approach outlined above for the GEOS data, though computing the trend, of course, from the single time series of T2M values for a given month rather than from an ensemble mean.
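
A minimal sketch of this processing chain is given below, assuming the CPC fields are available as netCDF with dimensions named time, lat, and lon; the file and variable names are hypothetical, and the actual archive layout may differ.

```python
import xarray as xr

# Hypothetical file and variable names for the CPC 0.5-degree daily data.
tmin = xr.open_dataset("cpc_tmin.nc")["tmin"]
tmax = xr.open_dataset("cpc_tmax.nc")["tmax"]

t2m_daily = 0.5 * (tmin + tmax)   # daily-mean proxy, as described in the text

# Average 2x2 blocks of 0.5-degree cells onto the model's 1-degree grid,
# then form calendar-month means.
t2m_monthly = (t2m_daily
               .coarsen(lat=2, lon=2, boundary="trim").mean()
               .resample(time="MS").mean())
```

The monthly detrending then proceeds exactly as sketched above for the model output, but with a single time series per grid cell.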

We should note that using the mean of the minimum and maximum observed temperatures to approximate the daily T2M, while a common practice, can result in T2M values that are biased high (Bernhardt et al. 2018). This is an unavoidable limitation of the available observational data. Nevertheless, for our analysis framework, the relevant information in the observed T2M values lies not in their absolute magnitudes but in their time variability (specifically, their time correlation with other fields at the monthly scale), which arguably is minimally affected by this bias.

c. Supplemental AMIP experiments using independent models

For our analysis, we require estimates of T2M variations from a number of additional atmospheric general circulation models. For this, we utilize AMIP-style simulations produced by five other AGCMs as part of their contribution to the CMIP6 project (Eyring et al. 2016). See Table 1 for a list of these models and suitable references. While some models produced data over longer time periods, each model produced monthly T2M data for the 1981–2014 period with at least 10 ensemble members. To simplify the interpretation of our results, we use only 10 ensemble members from each AGCM.

Table 1.

Modeling systems providing monthly T2M data. Each model performed AMIP-style simulations as part of its contribution to the CMIP6 project; for each model, data from 10 ensemble members covering the period 1981–2014 were extracted for this study.


For the present work, for each of the five models, we regridded the T2M data for a given month to the GEOS model’s 1° × 1° grid using a nearest-neighbor approach. The 1° × 1° data for each model were then detrended independently on a monthly basis using the approach outlined above for the GEOS model’s data. One difference between these CMIP6 simulations and the GEOS simulations is the use in the former of time-varying constituent forcing—a forcing that, like SSTs, could impart a signal to the generated monthly T2M fields. We presume this impact to be small, even negligible, given that the applied detrending would largely remove the impacts of trends in the imposed constituents. Still, the lack of such time variations in the GEOS simulations should be interpreted as an additional source of error for the GEOS time series.

3. Analysis framework

a. Relative contributions of boundary forcing and unpredictable variability

We assume that the anomaly relative to a long-term climatic mean of a given variable Y(t) can be represented in terms of a boundary-forced response, B(t), and a contribution from internal, unpredictable variability, ε(t). In our study, Y is the monthly averaged 2-m air temperature, or T2M, anomaly. For climate model m, we have
$$Y_{mn}(t) = B_m(t) + \varepsilon_{mn}(t), \tag{1}$$
where the extra subscript n refers to the nth member of an N-member ensemble of simulations performed with model m. As indicated in the equation, each simulation in a model ensemble includes the same boundary-forced response, Bm(t), but experiences a different time series of internal variability, εmn(t).
An important facet of our analysis is the characterization of the signal-to-total variance ratio within the model in the presence of model noise—the ability of N ensemble members to capture the behavior of a supplemental ensemble member, the latter being considered “truth” for the analysis. We characterize this ability with the square of the correlation between the supplemental ensemble member’s time series and the time series computed from the mean of the original N members. With this in mind, let Ym(t) represent the ensemble mean value of Y:
$$Y_m(t) = \frac{1}{N}\sum_{n=1}^{N} Y_{mn}(t). \tag{2}$$
Section S1 in the online supplemental material shows how the following equation can be derived for the desired quantity, Corr2(Ymn, Ym):
$$\mathrm{Corr}^2(Y_{mn}, Y_m) = \frac{\rho_m^2}{1 + (1 - \rho_m^2)/(\rho_m^2 N)}, \tag{3}$$
where
$$\rho_m^2 = \mathrm{Corr}^2(Y_{mn}, B_m). \tag{4}$$
{For a stronger connection to its observational analog discussed below, (3) implicitly assumes the availability of N + 1 ensemble members. In practice, of course, if we have N ensemble members to work with, N − 1 rather than N would be used in the denominator of (3)}. As indicated in (4), ρm2 is the square of the temporal correlation between Ymn and Bm in (1). According to this definition, the quantity ρm2 represents the fraction of the variance of Ymn that is “explained” by variations in the boundary forcing. It is indeed equivalent to the model’s signal-to-total variance ratio, given that ε represents uncorrelated noise. The quantity ρm2 is central to this study. A ρm2 of 0 would imply that the boundary forcing does not contribute at all to the time variability of Ymn(t), whereas a ρm2 of 1 would indicate that Ymn(t) is completely determined by the boundary forcing and is not affected at all by chaotic atmospheric dynamics.
Note that in the limit of an infinite number of ensemble members, (3) reduces to
$$\mathrm{Corr}^2_{\max}(Y_{mn}, Y_m) = \rho_m^2. \tag{5}$$
The quantity ρm2 thus turns out to serve as the upper limit to Corr2(Ymn, Ym), the asymptote in a plot of Corr2(Ymn, Ym) against ensemble size.
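
As a concrete illustration of (3) and (5), the short sketch below (our own code, not from the paper) evaluates the expected squared correlation for a hypothetical signal fraction and shows the approach to the asymptote.

```python
def expected_corr2(rho2, n_members):
    """Eq. (3): squared correlation between a single member and the mean of
    N other members, given a signal-to-total variance ratio rho2."""
    return rho2 / (1.0 + (1.0 - rho2) / (rho2 * n_members))

# Illustrative value only; the approach to the asymptote rho2 (Eq. 5) is the point.
rho2 = 0.4
for n in (1, 5, 10, 45, 1000):
    print(n, round(expected_corr2(rho2, n), 3))   # 0.16, 0.308, 0.348, 0.387, 0.399
```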
Now consider a separate climate model q, a model with its own teleconnection characteristics and noise levels and its own ensemble size, Nq. In analogy to (1)–(3), we would have
$$Y_{qn}(t) = B_q(t) + \varepsilon_{qn}(t), \tag{6}$$
$$Y_q(t) = \frac{1}{N_q}\sum_{n=1}^{N_q} Y_{qn}(t), \tag{7}$$
$$\mathrm{Corr}^2(Y_{qn}, Y_q) = \frac{\rho_q^2}{1 + (1 - \rho_q^2)/(\rho_q^2 N_q)}. \tag{8}$$
Manipulation of (3) and (8) (see section S1 in the supplemental material) provides an equation for the square of the temporal correlation between the ensemble-mean time series from the two models:
$$\mathrm{Corr}^2(Y_m, Y_q) = \frac{\mathrm{Corr}^2(B_m, B_q)}{\left[1 + (1 - \rho_m^2)/(\rho_m^2 N)\right]\left[1 + (1 - \rho_q^2)/(\rho_q^2 N_q)\right]}, \tag{9}$$
where Corr2(Bm, Bq) has an important interpretation: it represents the extent to which the models m and q agree with each other on the underlying teleconnections between the SST fields and the quantity of interest—on how the two models’ time series of Y would compare if all complications associated with noise could be removed (if both models, for example, had an infinite number of ensemble members). If in the first model, for example, air temperatures in the northwestern United States were mostly influenced by SST variations in the eastern North Pacific, whereas in the second, due to a different simulation of teleconnections, these air temperatures were mostly influenced by SSTs further to the south, the resulting value of Corr2(Bm, Bq) would be less than unity.
We will be examining Corr2(Bm, Bq) for different model combinations; to calculate it, we recognize that Corr2(Ym, Yq) can be computed directly from model output (i.e., by correlating time series of ensemble means) and that estimates of ρm2 and ρq2 can be derived from each model’s output data using a fitting approach, as illustrated in section 4a below. A simple transformation of (9) gives the equation we accordingly use:
$$\mathrm{Corr}^2(B_m, B_q) = \left[1 + \frac{1 - \rho_m^2}{\rho_m^2 N}\right]\left[1 + \frac{1 - \rho_q^2}{\rho_q^2 N_q}\right]\mathrm{Corr}^2(Y_m, Y_q). \tag{10}$$
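
The calculation implied by (10) is simple once the ensemble-mean series and the fitted ρ2 values are in hand; the sketch below (our own illustrative code and naming) spells it out for a single grid cell.

```python
import numpy as np

def corr2(x, y):
    """Squared Pearson correlation between two 1-D anomaly time series."""
    return np.corrcoef(x, y)[0, 1] ** 2

def corr2_forced(ybar_m, ybar_q, rho2_m, rho2_q, n_m, n_q):
    """Eq. (10): agreement of the boundary-forced signals of models m and q,
    inferred from their ensemble-mean series and their fitted rho^2 values."""
    inflate_m = 1.0 + (1.0 - rho2_m) / (rho2_m * n_m)
    inflate_q = 1.0 + (1.0 - rho2_q) / (rho2_q * n_q)
    return inflate_m * inflate_q * corr2(ybar_m, ybar_q)
```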
Finally, we consider model m’s ability to capture real-world variations in the quantity of interest. Using the subscript o to represent the real-world quantities, we have
$$Y_o(t) = B_o(t) + \varepsilon_o(t). \tag{11}$$
The quantities Bo(t) and εo(t) cannot, of course, be directly observed, but conceptually we can assume them to exist.
The square of the correlation between the observed variable Yo(t) and the ensemble mean Ym(t) from model m provides a first-order indication of how well that model can reproduce Nature—the fraction of the variance found in Nature that is “explained,” or captured, by the model for a given ensemble size. The equations above can be manipulated (see section S1 of the supplemental material) to produce
$$\mathrm{Corr}^2(Y_o, Y_m) = \frac{\rho_o^2\,\mathrm{Corr}^2(B_o, B_m)}{1 + (1 - \rho_m^2)/(\rho_m^2 N)}, \tag{12}$$
where, in analogy to the definition of ρm2, ρo2 represents the square of the temporal correlation between Yo(t) and Bo(t)—the fraction of the real world’s variance that is explained by the boundary forced signal rather than by internal variability. According to this equation, and in agreement with intuition, the model’s agreement with Nature increases with the ensemble size N. As N approaches infinity, we obtain the highest fraction of variance that can be explained with the modeling system—the maximum degree to which an ensemble mean can capture the time variations measured in Nature:
$$\mathrm{Corr}^2_{\max}(Y_o, Y_m) = \rho_o^2\,\mathrm{Corr}^2(B_o, B_m). \tag{13}$$
The two factors on the right-hand side of (13) represent two distinct barriers to the model’s ability to capture the observed variations. The first, ρo2, is again the square of the correlation in Nature between the boundary-forced response and the value actually obtained; this will be less than one due to the noise—the chaotic dynamics—endemic to Nature. A model, by definition, would never be able to reproduce the exact evolution of Nature’s chaotic dynamics and thus could never hope to reproduce the observed signal perfectly. Note that ρo2 is intrinsic to Nature itself; the same value of ρo2 applies regardless of the modeling system analyzed. In contrast, the second term is model dependent; Corr2(Bo, Bm) will only be 1 if the model simulates fully realistic time variability (but not necessarily magnitude) of the teleconnections between the boundary forcing (the SST variations in this study) and the variable Y. Of course, given model biases and flaws, such equivalence is highly unlikely, and Corr2(Bo, Bm) will thus typically be less than 1.

The quantity ρo2 indeed describes a fundamental property of the real world’s predictability: stated another way, it is the relative degree to which an observed quantity such as monthly T2M is controlled by SST variability rather than by chaotic noise. Unfortunately, while fundamental, ρo2 cannot be directly measured. Past studies have approximated ρo2 using model-based ρm2 estimates (e.g., Koenigk and Mikolajewicz 2009), i.e., by assuming that climate models have accurate signal-to-total variance ratios. We make a nod to this approach ourselves below, though we will be emphasizing throughout that ρm2 can be quite different from ρo2 and does not even provide its upper limit—in some cases, the signal-to-noise ratio is too low in models, leading to an underestimate of the real world’s ρo2.

Importantly, in this study, we also introduce and use a novel second approach to estimating ρo2, one based on (13):
$$\rho_o^2 = \frac{\mathrm{Corr}^2_{\max}(Y_o, Y_m)}{\mathrm{Corr}^2(B_o, B_m)} \approx \frac{\left[\rho_o^2\,\mathrm{Corr}^2(B_o, B_m)\right]_{\mathrm{est}}}{\mathrm{Corr}^2(B_o, B_m)_{\mathrm{est}}}, \tag{14}$$
where [ρo2Corr2(Bo,Bm)]est is determined directly from the joint analysis of model and observational data (as outlined in some detail in sections 4a and 4c), and where Corr2(Bo, Bm)est, an estimated value of Corr2(Bo, Bm), is derived from the joint analysis of the ensemble data from different model combinations. Our approach to determining Corr2(Bo, Bm)est is in fact simple. For a given pair of models m and q, we can compute Corr2(Bm, Bq) with (10), correlating the two models’ time series against each other to determine Corr2(Ym, Yq), and applying the methodology outlined in section 4a to produce each model’s ρ2. Pairing GEOS with each of the five CMIP6 models listed in Table 1, we accordingly obtain five separate values of Corr2(Bm, Bq). We then make the following major assumption: that Corr2(Bo, Bm) can be approximated by the mean of these five model-based estimates. Effectively, we assume that the boundary-forced signals underlying the GEOS model correlate with those underlying the real world at about the same level as they correlate with those underlying the five CMIP models. [Note that this is not at odds with the signal-to-noise paradox discussed in section 1 (Eade et al. 2014; Scaife and Smith 2018; Zhang and Kirtman 2019), which relates to the amplitude of the boundary-forced signal rather than to its temporal variability. In a sense, the signal-to-noise paradox relates more to biases in ρm2 relative to ρo2 than to the independent teleconnection behavior characterized with Corr2(Bo, Bm) or Corr2(Bm, Bq).] Naturally, this assumption is itself unsatisfying; one might wonder, for example, whether models tend to show more similarities to each other than any of them show to the real world, thereby biasing the Corr2(Bm, Bq) estimates. This is certainly a possibility, and we acknowledge the associated limitations to this part of the analysis. An analysis in the supplemental material (section S2) addressing this question, however, produces some encouraging results: the time series from the GEOS model ensemble does not prove to be more similar to individual time series from the CMIP6 models than to the observational data themselves. This suggests that the models do not tend to have similar errors due to some common deficiency and thus supports the idea that the Corr2(Bm, Bq) values may, on average, serve as a first-order approximation to Corr2(Bo, Bm).
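
In code, the estimator in (14) reduces to a single division once the pieces are assembled; the sketch below (our own naming) takes the fitted product from the GEOS–observations comparison and the five GEOS-versus-CMIP6 values of Corr2(Bm, Bq).

```python
import numpy as np

def estimate_rho2_obs(product_est, corr2_bmbq_pairs):
    """Eq. (14): divide the fitted product rho_o^2 * Corr^2(B_o, B_m) by a proxy
    for Corr^2(B_o, B_m), here the mean of the five GEOS-vs-CMIP6 values of
    Corr^2(B_m, B_q).  NaNs mark pairs excluded by the rho^2 > 0.05 screen."""
    proxy = np.nanmean(corr2_bmbq_pairs)
    return product_est / proxy
```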

The new estimates of ρo2 derived with (14) are independent of those determined through the more common approach of simply equating ρo2 and ρm2, for in the conceptual framework above, B and ρ are independent of each other. In other words, a high consistency in teleconnections, as characterized by Corr2(Bo, Bm) being close to one, does not imply a high consistency between the model’s and Nature’s signal-to-total variance ratios (ρm2 and ρo2), and vice versa. The key distinction in the two approaches is that values of the fundamental real-world predictability metric ρo2 obtained with (14) do not rely on an assumption of accurate signal-to-total variance ratios in the model; they instead rely on our ability to estimate the model’s independent success in simulating ocean–land teleconnections—i.e., on our ability to estimate Corr2(Bo, Bm).

We must emphasize here that in this analysis, we will not be evaluating the absolute magnitudes of the boundary-forced variance or of the variance induced by internal variability. An excessive ρ2 for a model, for example, could result from either an underestimate of the internal variability or an overestimate of the impact of the signal. Here, we are interested only in whether the model captures the correct fraction of the total variance induced by boundary forcing as opposed to noise, a critical quantity in itself.

This overall framework—particularly the part associated with estimating model values of ρm2—is essentially consistent with those offered already by Eade et al. (2014), Siegert et al. (2016), and Scaife and Smith (2018). Murphy (1990) also provides similar but alternative forms of some of the above equations; indeed, Eq. (2) of that paper can be shown to be mathematically identical to (3) above. The framework as expressed above, however—particularly its focus on Corr2(Bo, Bm) and Corr2(Bm, Bq)—will facilitate our forthcoming analysis.

b. Details regarding ensemble sampling and correlation calculations

In some of our analyses, we will be examining correlations as a function of ensemble size. For a given ensemble size K in the determination of Corr2(Yo, Ym), we randomly select K members (without replacement) from the available members (45 available members for GEOS, 10 for the other models) and call this our reduced ensemble; after computing the desired correlation with this reduced set, we repeat the process 99 times. The correlation value for ensemble size K is the average over the 100 computed correlations.

For the model-based Corr2(Ymn, Ym) calculations, we consider five individual ensemble members in turn as Ymn in (3). Note that Ym in the equation accordingly differs each time, since each choice of Ymn defines a different set of remaining ensemble members. The five resulting correlations for a given ensemble size, each already representing an average over 100 computed values, are then averaged into the final result for that size. The use of five ensemble members (rather than 45 or 10) as Ymn here was considered a reasonable and tractable compromise given the need to perform at least some averaging in the face of the tremendous computational burden associated with each calculation.

Each individual correlation calculation is based on 102 sample pairs: 3 monthly averages per year for 34 years (1981–2014). As a consequence of the detrending described earlier, climatological monthly averages are removed from each value before computing the correlations so as not to capture trivial correlations associated with seasonality.
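
The subsampling procedure is easy to express in code; below is a minimal sketch (our own, with illustrative array shapes) of the repeated random draws used to build each point of the correlation-versus-ensemble-size curves.

```python
import numpy as np

def mean_corr2_for_size(members, target, k, n_draws=100, seed=0):
    """Average, over random K-member subsets drawn without replacement, of the
    squared correlation between the subset-mean T2M anomaly series and a target
    series (another ensemble member or the observations).

    members : array (n_members, n_times) of anomaly time series
    target  : array (n_times,)
    """
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n_draws):
        idx = rng.choice(members.shape[0], size=k, replace=False)
        subset_mean = members[idx].mean(axis=0)
        vals.append(np.corrcoef(subset_mean, target)[0, 1] ** 2)
    return float(np.mean(vals))
```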

4. Results

a. Quantification of unpredictable variability within the GEOS model

Our analysis focuses first on the GEOS model rather than on the CMIP models. Given its much larger ensemble size, we expect the characterization of noise for GEOS to be the most robust.

Figure 1 shows, with blue dots, the degree to which the GEOS model’s ensemble mean captures the monthly T2M time series produced by a single ensemble member [i.e., Corr2(Ymn, Ym)] as a function of ensemble size for a grid cell in North America. We use (3) to produce the blue curve through the dots, with ρm2 determined via a fitting procedure—we find the single value of ρm2 that produces the lowest root-mean-square error (RMSE) between the dots and the curve. The ρm2 value so found indeed serves as the asymptote for the relationship, shown in the figure as a dashed blue line. This asymptote is interpreted as the model’s underlying strength of connection between the boundary forcing and a single simulation’s monthly T2M time series. Stated another way, 1 minus this asymptote is interpreted as the fraction of monthly T2M variability in a simulation that stems from chaotic noise.

Fig. 1.

Representative examples of how key model relationships vary with ensemble size, focusing on the simulation of monthly mean air temperature, T2M. Blue dots: Variation of Corr2(Ymn, Ym) with ensemble size, where Corr2(Ymn, Ym) characterizes the ability of the ensemble mean to capture the variability produced by a single ensemble member. Red dots: Variation of Corr2(Yo, Ym) with ensemble size, where Corr2(Yo, Ym) characterizes the ability of the ensemble mean to capture the variability seen in the observations. The lines through the dots are determined through a fitting procedure, which provides as a matter of course the indicated asymptotes, shown as dashed lines (see text). Results shown are for JJA.


It is important to emphasize that our procedure does not simply equate ρm2 to the correlation achieved with all 45 ensemble members. By fitting the curve and determining the asymptote, we are effectively equating ρm2 to the correlation that would be achieved with an infinite number of ensemble members.
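
A minimal version of this fitting step might look as follows; the routine below is our own sketch, using SciPy's bounded scalar minimizer rather than whatever optimizer was actually used, and it returns the ρm2 that best matches the sampled correlations.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_rho2(ensemble_sizes, corr2_values):
    """Find the single rho^2 minimizing the RMSE between Eq. (3) and the sampled
    Corr^2(Y_mn, Y_m) values (the blue dots in Fig. 1); rho^2 is the asymptote."""
    sizes = np.asarray(ensemble_sizes, dtype=float)
    dots = np.asarray(corr2_values, dtype=float)

    def rmse(rho2):
        predicted = rho2 / (1.0 + (1.0 - rho2) / (rho2 * sizes))
        return np.sqrt(np.mean((predicted - dots) ** 2))

    return minimize_scalar(rmse, bounds=(1e-4, 1.0), method="bounded").x
```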

Note that the results in Fig. 1 are representative; all of the grid cells we examined show similarly strong (often even better) fits through their corresponding blue dots, though we caution that this success may be specific to the particular variable, monthly T2M, that we are examining here. Such success in the fitting provides confidence that the analysis framework represented by (1)(4) properly characterizes the relative contributions of signal and noise in the modeling system. In Fig. 2, we show global maps of ρm2 so obtained for GEOS as a function of season. The salient spatial pattern is a strong signal-to-total variance ratio in the tropics and much smaller values for the ratio as one moves poleward, a feature that is well documented in previous studies (e.g., Koster et al. 2000) and that leads to enhanced predictability in tropical seasonal forecasts (Scaife et al. 2017; Kumar et al. 2013). Remoteness from the oceans also appears to be a factor, with the lowest potential values of ρm2 often appearing deep in the heart of Eurasia. The ρm2 maps show some interesting seasonal variations, with, for example, North America showing lower ρm2 values in SON, Australia showing somewhat lower values in JJA, and the Sahel showing low values in MAM but higher values in JJA.

Fig. 2.

Derived global distribution of ρm2 (the fractional contribution of SST boundary forcing to monthly T2M variability in the GEOS atmospheric model) for (a) DJF, (b) MAM, (c) JJA, and (d) SON. The shading contour at 0.029 represents the 99% confidence level that the null hypothesis (i.e., no underlying signal) is invalid, as determined by Monte Carlo analysis. Similarly, values above 0.037 indicate that the null hypothesis is invalid at the 99.9% confidence level.


Note that in these maps, any colored shading indicates at least a 99% confidence that the null hypothesis of no underlying signal is invalid. The value of 0.025 delimiting the lowest colored shading was determined from a Monte Carlo analysis in which appropriately sized ensembles of randomly generated time series, i.e., time series of normally distributed numbers produced with a random number generator, were processed in precisely the same manner as the AGCM data to quantify the statistics associated with that single null hypothesis. That same Monte Carlo analysis, by the way, indicates a 99.9% confidence that the null hypothesis is invalid for any value above 0.037.
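
The flavor of such a Monte Carlo test is sketched below; this is our own simplified illustration (a single draw per subset size, arbitrary sample counts), not an attempt to reproduce the exact thresholds quoted above.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def null_rho2(n_members=45, n_times=102, rng=None):
    """Fit Eq. (3) to a signal-free ensemble of white-noise 'T2M' series,
    mimicking the processing applied to the real model output."""
    rng = np.random.default_rng() if rng is None else rng
    data = rng.standard_normal((n_members, n_times))
    sizes = np.arange(2, n_members)
    corr2 = []
    for k in sizes:
        idx = rng.choice(n_members, size=k + 1, replace=False)
        truth, subset = data[idx[0]], data[idx[1:]]
        corr2.append(np.corrcoef(subset.mean(axis=0), truth)[0, 1] ** 2)
    corr2 = np.asarray(corr2)

    def rmse(rho2):
        pred = rho2 / (1.0 + (1.0 - rho2) / (rho2 * sizes))
        return np.sqrt(np.mean((pred - corr2) ** 2))

    return minimize_scalar(rmse, bounds=(1e-6, 1.0), method="bounded").x

rng = np.random.default_rng(1)
samples = [null_rho2(rng=rng) for _ in range(1000)]
print(np.percentile(samples, [99.0, 99.9]))   # null-hypothesis shading thresholds
```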

b. Comparison of the GEOS model’s noise and teleconnection characteristics with those in other models

Now that the equations and fitting approach have been demonstrated for the GEOS model, we can take a look at the CMIP AGCMs listed in Table 1 to shed some light on how different models compare in regard to boundary forcing and noise levels. Because only 10 ensemble members are used for each of the CMIP models, the estimates obtained for them are presumably more uncertain than those for GEOS; still, enough data are available for some useful joint analyses. Indeed, as a check, we estimated ρm2 using a 10-member subset of the 45-member GEOS ensemble and found substantial agreement with the values found using all 45 members (see section S3 in the supplemental material), supporting the idea that an ensemble of 10 is adequate, at least to first order, for fitting the curve and determining the asymptote ρm2. Results below are for boreal spring (MAM). Results for other seasons are provided in the supplemental material (see section S4).

For each of the five CMIP models, curve fitting along the lines of that shown in Fig. 1 (blue curve) was performed to determine spatial distributions of ρm2, the fraction of the T2M variance explained by the boundary-forced signal. Results are shown in Fig. 3, with the GEOS results for MAM repeated in Fig. 3f. The different models agree on many features of the distributions, showing, for example, higher values in the tropics, negligible values in central Asia, and a dipole of higher values along the western coast of North America. The overall similarity is encouraging given that the six models use distinct structures and formulations (with the exception of NorCPM1 and CESM2, which use similar atmospheric models). It also supports the idea that the models are capturing, to first order, the same elements of the spatial distributions of the relative strength of the boundary-forced signal in Nature.

Fig. 3.

Spatial distribution of ρm2, the fraction of the simulated T2M variance explained by the boundary-forced signal, for six different AGCMs. (a) MIROC6. (b) CESM2. (c) NorCPM1. (d) ACCESS. (e) IPSL. (f) GEOS. The GEOS results are based on a 45-member ensemble, and those for the other AGCMs are based on 10-member ensembles (though in all cases, the values effectively represent what would be obtained with an infinite number of ensemble members). The shading contour at 0.052 represents, for the CMIP models, the 99% confidence level that the null hypothesis (i.e., no underlying signal) is invalid, as determined by Monte Carlo analysis. The corresponding value for the 99.9% confidence level is 0.067.


Corr2(Bm, Bq), the degree to which two different models m and q agree with each other on the boundary-forced signal at a given location once the impact of noise is removed, is computed with (10). Figures 4a–e show Corr2(Bm, Bq) between the GEOS model and each of the five CMIP models. We focus here on the GEOS model’s correlations with the CMIP models rather than on correlations among the different CMIP models because the size of the GEOS ensemble would presumably provide some extra robustness to the estimates. Note also that while strong agreement in the boundary-forced signal is theoretically possible in the presence of substantial model noise, that noise can make quantifying the agreement difficult. Indeed, with ρm2 or ρq2 close to zero, the calculation of a reliable Corr2(Bm, Bq) value using (10) is essentially impossible. To account for this, we assign undefined values to the Corr2(Bm, Bq) calculations from (10) if either ρm2 or ρq2 falls below an arbitrary threshold of 0.05. The undefined values are whited out in the figure.

Fig. 4.

Spatial distribution of Corr2(Bm, Bq), the degree to which the different CMIP models agree with the GEOS model on the temporal variations in the boundary-forced signal. (a) Corr2(BGEOS, BMIROC6). (b) Corr2(BGEOS, BCESM2). (c) Corr2(BGEOS, BNorCPM1). (d) Corr2(BGEOS, BACCESS). (e) Corr2(BGEOS, BIPSL). (f) Arithmetic mean of the Corr2(Bm, Bq) values. Values are considered undefined (and shown as white) if the internal variability of either model involved in a calculation is overwhelmingly high (i.e., if ρm2 or ρq2 falls below a threshold of 0.05). Undefined values are not utilized in the calculation of the arithmetic mean.


Figures 4a–e indicate that the teleconnections inherent in the GEOS system generally agree with those of the CMIP models in the Americas and along the northern coast of Australia, with squared correlations often exceeding 0.7. Agreement is weaker in Africa (though still in the neighborhood of 0.5) and is particularly weak in southern Asia. Excessive internal variability prevents a determination of the agreement in Europe and most of the rest of Asia. Figure 4f encapsulates these results by providing the arithmetic mean of the five Corr2(Bm, Bq) fields, with undefined values excluded from the mean calculations. To our knowledge, Fig. 4 represents a unique, first-of-its-kind comparison of teleconnection behavior between models. The indicated degree of disagreement underlines some uncertainty in this quantity; it indeed suggests that models could be improved in this regard, which could lead to substantial gains in long-range prediction.

Again, results for other seasons are provided in the supplemental material (section S4). For these other seasons, we find that, as with MAM, the ρm2 distributions are very similar across the models, both in terms of pattern and magnitude. We also find that, as with MAM, Corr2(Bm, Bq) for DJF is strong in the Americas and Australia and weaker in Africa and South Asia. The Corr2(Bm, Bq) values, however, are somewhat weaker in the Americas during JJA and SON, typically closer to 0.5, again indicating substantial room for the improved modeling of teleconnections.

c. Quantification of maximum realizable skill with the GEOS model

The red dots in Fig. 1 show the square of the correlation, as a function of ensemble size, between the ensemble mean T2M from GEOS and the observed monthly value (section 2b) at the sample grid cell. Here, having already obtained the model’s ρm2 from the calculations in section 4a, we need only fit the value of the product ρo2Corr2(Bo,Bm) in such a way that the red curve, computed with (12), best captures the relationship indicated by the red dots as determined by minimizing the RMSE between the two. The product ρo2Corr2(Bo,Bm) serves as the asymptote for the relationship, shown as the red dashed line in Fig. 1. Again, the strong fit of the red curve through the dots is representative of the fits obtained across the globe.
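
The corresponding fit for the red curve is the same one-parameter RMSE minimization, but with ρm2 held fixed; a sketch (our own code) follows.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_product(ensemble_sizes, corr2_obs_vs_mean, rho2_m):
    """Fit the asymptote rho_o^2 * Corr^2(B_o, B_m) of Eq. (12), holding the
    model's rho_m^2 fixed at its previously fitted value (red curve in Fig. 1)."""
    sizes = np.asarray(ensemble_sizes, dtype=float)
    dots = np.asarray(corr2_obs_vs_mean, dtype=float)
    inflate = 1.0 + (1.0 - rho2_m) / (rho2_m * sizes)   # denominator of Eq. (12)

    def rmse(product):
        return np.sqrt(np.mean((product / inflate - dots) ** 2))

    return minimize_scalar(rmse, bounds=(0.0, 1.0), method="bounded").x
```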

Figure 5 shows, as a function of season, the global distribution of the ρo2Corr2(Bo,Bm) values derived in this way {equivalent to [ρo2Corr2(Bo,Bm)]est in (14)}—the degree to which the model can capture, with an infinite number of ensemble members, the observed time variations of T2M. Monthly temperature variations in tropical South America are seen to be well reproducible by the model in DJF and MAM, sometimes with almost half the variance explained. Moderate levels of skill are possible during DJF in eastern Australia and much of Europe and during JJA in the Sahel and northwest Asia.

Fig. 5.

Derived global distribution of ρo2Corr2(Bo,Bm) (the fraction of the monthly T2M variance that the GEOS model can hope to reproduce, assuming an infinite number of ensemble members) for (a) DJF, (b) MAM, (c) JJA, and (d) SON. The shading contour at 0.059 represents the 99% confidence level that the null hypothesis (i.e., no underlying signal) is invalid, as determined by Monte Carlo analysis. The corresponding value for the 99.9% confidence level is 0.083.


Of course, an infinite number of ensemble members is hardly realizable, limiting the practical usefulness of Fig. 5. Note, however, that having quantified both ρm2 and ρo2Corr2(Bo,Bm), we can use (12) to determine how well a given ensemble size might perform. Figure 6a shows, for JJA, the square of the correlation between observed monthly T2M and the model ensemble mean value when the ensemble consists of five members. Figures 6b–d then show, respectively, the increase in this skill metric when the ensemble size is increased from 5 to 20 members, from 20 to 50 members, and from 50 to 100 members.
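
For a single grid cell, the finite-ensemble skill implied by (12) can be read off directly; the numbers below are arbitrary illustrative values and are not taken from the paper's maps.

```python
def corr2_obs_vs_ensmean(product, rho2_m, n_members):
    """Eq. (12): expected squared correlation between the observations and an
    N-member ensemble mean, given the asymptote product = rho_o^2 * Corr^2(B_o, B_m)
    and the model's signal fraction rho_m^2."""
    return product / (1.0 + (1.0 - rho2_m) / (rho2_m * n_members))

for n in (5, 20, 50, 100):   # illustrative values of product and rho_m^2
    print(n, round(corr2_obs_vs_ensmean(product=0.30, rho2_m=0.25, n_members=n), 3))
# 0.188, 0.261, 0.283, 0.291 -- diminishing returns beyond ~20 members at this cell
```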

Fig. 6.

(a) Square of the correlation [as determined with (12)] between the observed and ensemble mean monthly T2M for JJA, assuming a five-member ensemble. (b) Increase in this skill metric when the ensemble size is increased from 5 to 20 members. (c) Increase in the skill metric when the ensemble size is increased from 20 to 50 members. (d) Increase in the skill metric when the ensemble size is increased from 50 to 100 members.


As expected, increasing the ensemble size has little impact in areas like northeastern Asia, which has little maximum realizable skill in the first place (Fig. 5c). Increasing the ensemble size also has little impact in the tropics, since most of the skill is already achieved with the small ensemble size of 5 (cf. Figs. 5c and 6a, noting the differences in the color bars). In contrast, going to 20 ensemble members has a significant impact on skill in many other areas, including the Sahel and much of Canada. Based on the more limited skill increases in going from 20 to 50 members (Fig. 6c), one might conclude that 20 members is adequate for capturing the SST-forced signal in monthly T2M. A section of northern Eurasia (just east of Scandinavia), however, continues to show a benefit in skill as the ensemble size is increased, even in going from 50 to 100 members (Fig. 6d). This appears to be consistent with the large ensemble size needed to provide skillful predictions of the winter NAO (Scaife et al. 2014, their Fig. 3). Of course, the skill obtained there with even 1000 members would still be suboptimal due to the imperfect reproduction of teleconnections in the model.

d. Implications for quantifying the signal-to-total ratio present in nature

The quantity ρo2, as used in (12) and (13), is a fundamental characterization of predictability in Nature, one that exists independently of models. Again, ρo2 describes the fraction of the real world’s T2M variance that is controlled by the time variation of SST fields as opposed to chaotic atmospheric noise. We emphasize again that ρo2 cannot be directly measured or otherwise quantified with observations alone. Past estimates of the quantity have relied on the determination of model-based predictability, assuming the two are the same (e.g., Koenigk and Mikolajewicz 2009). Such model-based estimates of ρo2 are indeed effectively provided in Fig. 3, which shows ρm2, the model climate’s version of ρo2, for six different models. In a sense, the average ρm2 from the six models, presented in Fig. 7, could be interpreted as a fully model-based estimate of this unmeasurable property. However, one must bear in mind that such model-based predictability can be less than the true predictability; the model-based estimates do not serve as upper bounds for Nature’s values (Kumar et al. 2013; Scaife and Smith 2018).

Fig. 7.

Spatial distribution of ρm2, the fraction of the simulated T2M variance explained by the boundary-forced signal, averaged over the individual values obtained for six different AGCMs. (a) DJF. (b) MAM. (c) JJA. (d) SON. Note that in contrast to Figs. 2, 3, and 5, colored shading here is not tied specifically to a 99% confidence level.


Of course, one of the reasons using ρm2 to estimate ρo2 is unsatisfying is that no observational data whatsoever are used to produce the ρm2 estimates—we are fully at the mercy of the models’ abilities to reproduce the true predictable signals and the behavior of chaotic dynamics in Nature. For this reason, we now examine our alternative and independent second approach to estimating ρo2, one that does use observations. Analyzing the time series of observational T2M data in conjunction with model data allows the estimation of the product ρo2Corr2(Bo,Bm)—the maximum skill (relative to observations) a given model might attain with an unlimited number of ensemble members. The distribution of this product was provided for the GEOS model in Fig. 5. All that would be needed to obtain ρo2 from such fields is an estimate of Corr2(Bo, Bm)—an estimate for how the teleconnections in the real world, in the absence of noise, agree with those in the model. As noted in section 3a, we obtain this estimate by pairing GEOS with each of the five CMIP6 models in turn and then averaging the five resulting Corr2(Bm, Bq) estimates obtained with (10). We thus make the admittedly major assumption, supported by the analysis in section S2 of the supplemental material, that this average is a reasonable proxy for Corr2(Bo, Bm).

We reemphasize here an important point: the quantities Corr2(Bo, Bm) and ρm2 are conceptually independent of each other—they represent distinctly different facets of model behavior. Accordingly, dividing ρo2Corr2(Bo,Bm) by our model-based estimate of Corr2(Bo, Bm) provides a second estimate of ρo2, one that is fully independent of the more common approach of simply utilizing ρm2. It is this second approach that represents a novel application of the framework in section 3a to the estimation of real-world values of ρo2.

We present in Fig. 8 estimates of ρo2 obtained by dividing the ρo2Corr2(Bo,Bm) in Fig. 5 by the arithmetic mean of the Corr2(Bm, Bq) values derived with (10). Again, we focus on the ρo2Corr2(Bo,Bm) estimates from GEOS given the larger ensemble size available for that model. Values are not available everywhere; as before, if internal variability is so large that it overwhelms our ability to determine Corr2(Bm, Bq)—that is, if the derived ρm2 or ρq2 lies below the imposed 0.05 threshold—we assume the associated ρo2 values are undefined.

Fig. 8.

Estimates of the degree to which SST boundary forcing controls the time variability of T2M in the real world (as measured with ρo2) obtained by dividing the estimates of ρo2Corr2(Bo,Bm) in Fig. 5 by the arithmetic mean of the model-based Corr2(Bm, Bq) estimates, as shown for MAM in Fig. 4f. (a) DJF. (b) MAM. (c) JJA. (d) SON. Values deemed undefined are masked in white. Note that in contrast to Figs. 2, 3, and 5, colored shading here is not tied specifically to a 99% confidence level.


Where the ρo2 values are defined, we see in Fig. 8 some of the patterns that were determined for ρm2 in Fig. 7, particularly the high relative control of boundary forcing on T2M variability in the tropics and the lower control toward midlatitudes. Some more specific features seen in the average ρm2 fields of Fig. 7 are also seen in the new ρo2 estimates in Fig. 8, such as higher ρo2 values in the Amazon during DJF than in JJA or SON and a dipole of high values along western North America in MAM. The “splotchiness” of the patterns in Fig. 8 presumably stems from a similar quality of the estimated ρo2Corr2(Bo,Bm) fields, which in turn must relate to the character of the T2M observations being used—the observations have their own sources of error (with a limited density of measurement stations in many parts of the world; see section 5) and represent, in any case, only a single 34-yr time series with concomitant sampling error. Presumably, the true underlying patterns are smoother. The point of Fig. 8 is to provide first-order patterns of ρo2 based on this new approach.

Naturally, all of these ρo2 estimates, whether from Fig. 7 or Fig. 8, are uncertain, and because ρo2 is an unmeasurable property of Nature, definitive uncertainty estimates for the quantity are themselves unattainable. To the extent that, as in past studies, ρm2 can be said to approximate ρo2, the intermodel ρm2 variations seen in Fig. 3 and Figs. S9–S11 provide a flavor for the uncertainty. (It is worth noting in this context that the different models are indeed generally consistent in the spatial structures, and even in the magnitudes, of their ρm2 fields.) Considering the newer approach presented in Fig. 8, we can illustrate the uncertainty qualitatively by dividing, at each grid cell, the value of ρo2Corr2(Bo,Bm) by both the maximum and minimum value of Corr2(Bo, Bm) obtained from the five models rather than by their arithmetic mean. The resulting “lower bounds” and “upper bounds” for ρo2 are provided for MAM in Fig. 9 and for the other seasons in the supplemental material (section S5). The blackened areas in Fig. 9b indicate an unrealistic perfect predictability, a reflection of the limitations of the analysis approach. Even with such limitations, the minimum and maximum fields do show the same spatial structures, and they do significantly constrain the potential ranges of the ρo2 values in most areas.

Fig. 9.

Qualitative indication of the uncertainty associated with estimating ρo2 from ρo2Corr2(Bo,Bm). (a) Estimates of the lower bound of ρo2 obtained by dividing ρo2Corr2(Bo,Bm) by the maximum value of Corr2(Bm, Bq) found at the grid cell among the five model combinations. (b) As in (a), but for the upper bound, obtained by dividing ρo2Corr2(Bo,Bm) by the minimum value of Corr2(Bm, Bq) found at the grid cell. Values deemed undefined are masked in white.


5. Discussion

The AMIP-style ensembles examined here focus only on SST boundary forcing and do not address, for example, the potential impact of land boundary forcing on meteorological signals. Presumably, if realistic yearly varying soil moisture and/or vegetation conditions were prescribed along with the realistic SSTs in modified versions of the AMIP-style ensembles, the computed values of ρo2 and ρm2 would increase over certain continental regions—particularly, as suggested by past studies (e.g., Koster et al. 2004, 2011), in the transition zones between humid and dry continental climates. While land states do evolve with atmospheric states in Nature and in models, a study of the effects of imposed soil moisture on the overlying meteorology would nonetheless be relevant to the science of predictability—a land surface initialization in a forecast system may be “remembered” for weeks into a forecast period, thereby increasing, through land–atmosphere interaction, the potential for a meteorological signal to emerge through the atmosphere’s chaotic noise. A proper analysis of how the dynamic land boundary condition might imprint itself onto ρo2 and ρm2 is left for future study.

It is worth mentioning again that the ρo2 estimates in Fig. 8 rely on the assumption that the Corr2(Bm, Bq) values averaged over the five different model combinations can serve as a proxy for Corr2(Bo, Bm)—for the measure of how well the time variability of the boundary-forced signal in the GEOS model agrees with that in Nature. We have no way of properly evaluating the adequacy of this assumption; again, models may share systematic errors that bias the model-based correlations (though see section S2 in the supplemental material). Nevertheless, we emphasize again that the approach derives the ρo2 estimates in a wholly novel way, one that uses ensembles from multiple independent models in conjunction with the observational temperature record. Accordingly, the ρo2 estimates in Fig. 8 can be considered a new and independent contribution to existing attempts to quantify this unmeasurable property of Nature. Perhaps combining the five CMIP 10-member ensembles into a single 50-member ensemble for comparison with the GEOS ensemble would produce more robust results than averaging over the five Corr2(Bm, Bq) values; we leave this question for future research.

In addition to all of the caveats already discussed, another involves the reliability of the observed T2M data. While the estimation of ρm2 is not limited by data availability—the model provides complete spatial and temporal coverage over the multiple decades—the accuracy of estimated quantities involving ρo2 will be limited in areas that are not well covered by measurement stations. Figure 10 shows, for the different 0.5° × 0.5° grid cells that contain measurement stations (again, this is the resolution of the original CPC T2M data), the fraction of the 1981–2014 period for which monthly temperature observations are available. The red dots indicate complete temporal coverage; in areas where the red dots are relatively close together (e.g., the United States, Europe, China), we can expect the spatial interpolation performed to fill in the neighboring grid cells with reasonable values given the large spatial scales of temperature anomalies. In some areas (e.g., the Sahara, the Amazon, and the far northern reaches of North America and Asia), however, measurements are sparse, and interpolated T2M values—and thus the derived ρo2 and ρo2Corr2(Bo,Bm) values—will be questionable. A detailed evaluation of how these features affect the accuracy of the gridded T2M data and the associated ρo2 is beyond the scope of this study. We provide Fig. 10 mainly to emphasize that the T2M data themselves are not perfect and to provide some qualitative guidance on where the results should be more reliable.

Fig. 10. For the 0.5° × 0.5° grid cells containing at least one T2M measurement station (considering here the dimensions of the raw CPC data arrays), the fraction of the 408 months during 1981–2014 for which a monthly T2M value can be computed based on at least some submonthly measurements.

Having noted this particular caveat about the temperature record, we present in Fig. 11 the differences between Figs. 7 and 8, that is, the differences between the ρo² values estimated from ρo² Corr²(Bo, Bm) and those estimated as ρm². Figure 10 indicates that the observational temperature record in the tropics, and indeed across much of Africa, is relatively poor. To the extent, however, that the temperature record there is valid, Fig. 11 shows that the ρo² values estimated using the newer approach tend, in the tropics, to be somewhat lower than the ρm² values characterizing the AGCMs. That is, while the spatial patterns in the fields generated with the two approaches are similar overall, the models tend to overestimate, in the tropics, the fraction of the total variance attributable to potentially predictable signals and hence underestimate the fraction attributable to noise, a result consistent with earlier studies (e.g., Eade et al. 2014).

Fig. 11. Differences between the ρo² values illustrated in Fig. 8 and the model-averaged ρm² values shown in Fig. 7. (a) DJF. (b) MAM. (c) JJA. (d) SON.

As for the extratropics, several studies (e.g., Eade et al. 2014; Scaife and Smith 2018; Zhang and Kirtman 2019; Cottrell et al. 2024) have identified behavior corresponding to a "signal-to-noise paradox": a model's ensemble mean can often capture the observed variations of a quantity better than it can capture the variations simulated by a single ensemble member. In the context of the mathematical framework utilized here, the signal-to-noise paradox is consistent with ρo² exceeding ρm², i.e., with models having too small a fraction of the total variance explained by the predictable signal. We do see some indication of this in the extratropics in Fig. 11, but the paradox does not appear to be as widespread as indicated in some of the earlier studies. In this regard, two important points must be emphasized. First, the variable we consider here (monthly averaged T2M) may have unique predictability properties. Second, and more important, our ability to estimate ρo² is severely limited in areas for which the models' inherent noise is large, i.e., for which the value of ρm² is close to zero. The whited-out areas in the extratropics of Fig. 11 show where ρm² is too small to allow a reliable determination of the difference ρo² − ρm². These whited-out areas largely coincide with the occurrence of the paradox in the earlier studies; in these areas, our analysis approach is not reliable enough to address the phenomenon.
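To make the masking step just described concrete, the difference field of Fig. 11 can be sketched as below. This is a schematic reconstruction rather than the production code behind the figure; the arrays rho_o2 (the Fig. 8 estimate) and rho_m2 (the model-averaged value of Fig. 7) are assumed inputs, and the 0.05 reliability threshold is illustrative.

import numpy as np

def masked_difference(rho_o2, rho_m2, threshold=0.05):
    """Return rho_o2 - rho_m2 with unreliable cells set to NaN (whited out).

    Cells where rho_m2 is near zero (i.e., where the model's internal noise
    dominates) are masked, mirroring Fig. 11; positive values elsewhere would
    be qualitatively consistent with the signal-to-noise paradox.
    """
    diff = rho_o2 - rho_m2
    unreliable = (rho_m2 < threshold) | ~np.isfinite(rho_o2)
    return np.where(unreliable, np.nan, diff)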

Finally, note that we have attempted in this work to minimize the use of the terms "prediction," "predictability," and "potential predictability." The quantities ρo² and ρm² are certainly key elements of real-world and model predictability, respectively, and thus their quantification here has strong relevance to predictability studies. However, our results are based on AMIP simulations, which effectively assume that SST variations are perfectly predictable and do not evolve with atmospheric conditions; that is, we assume here that the effects of ocean–atmosphere coupling can be neglected. The most we might claim is that, given the ocean's thermal inertia, our ρo² and ρm² are relevant to predictability out to a few months' lead, but even then we would need to understand how the character of the atmosphere's unpredictable variability might differ in a coupled atmosphere–ocean system. Placing this work firmly in the context of predictability and prediction would require additional analysis. What we effectively provide here is a stepping stone toward that fuller understanding.

Even in light of all these caveats, the framework provided herein does have unique value. In addition to allowing us to quantify an AGCM's internal SST-forced signal and its diminishment by chaotic atmospheric noise, it has provided a means of extending this information, through a novel approach, into new observational estimates of ρo².

6. Conclusions

A mathematical framework is provided for processing an ensemble of AMIP-style atmospheric model simulations into estimates of ρm², a measure of the relative degree to which a given location's simulated monthly T2M is determined by the global SST distribution rather than by unpredictable atmospheric variability. Model teleconnections and atmospheric noise, of course, have been studied in the past; the novel contribution of the present study is twofold: we provide an intercomparison of the noise and teleconnection characteristics of multiple AGCMs, and, just as important, we introduce a new approach to estimating the corresponding fields of ρo² for the real world through the quantification of the teleconnection metric Corr²(Bm, Bq) across the different models. While the ρo² patterns obtained in this way are, to first order, similar to those obtained using the more common approach of using ρm² to estimate ρo², our analysis suggests that the ρm²-based approach underestimates the fraction of tropical T2M variance attributable to noise or, equivalently, overestimates the fraction attributable to the predictable signal. Furthermore, the fact that the computed Corr²(Bm, Bq) values are often well below one underlines the uncertainty present in current climate models concerning ocean–land teleconnections, indicating considerable room for model improvement in this regard.
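To summarize the workflow in code form, the sketch below shows one simple way the ingredients discussed above (a model's boundary-forced variance fraction, the cross-model teleconnection agreement, and the observation-based product) combine into an estimate of ρo². The signal estimate here uses a basic noise-corrected variance decomposition rather than the correlation asymptote-fitting procedure actually employed in this study, and all names are placeholders; it is a conceptual sketch, not the analysis code.

import numpy as np

def boundary_forced_fraction(ens):
    """Rough estimate of the SST-forced fraction of monthly T2M variance.

    ens : anomaly array of shape (n_members, n_months) for one grid cell and
    season. A simple variance decomposition stands in for the asymptote
    fitting used in the paper.
    """
    n = ens.shape[0]
    noise_var = ens.var(axis=0, ddof=1).mean()                    # chaotic spread
    signal_var = max(ens.mean(axis=0).var(ddof=1) - noise_var / n, 0.0)
    return signal_var / (signal_var + noise_var)

def estimate_rho_o2(obs_product, corr2_bm_bq):
    """Divide the observation-based product rho_o^2 Corr^2(B_o, B_m) by the
    mean cross-model agreement Corr^2(B_m, B_q), as in Fig. 8."""
    return obs_product / np.mean(corr2_bm_bq)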

Overall, the results presented herein serve to demonstrate how the framework could be applied to evaluate unpredictable T2M variability within any modeling system. Indeed, in principle, the framework can be used to examine the SST impacts on any meteorological variable (e.g., precipitation) if measurements of the variable spanning decades are available for analysis.

Acknowledgments.

This work was supported by the NASA Modeling, Analysis, and Prediction (MAP) Program (NNH20ZDA001N) and by the National Climate Assessment Enabling Tools project (WBS281945.02.03.05.13) at NASA’s Global Modeling and Assimilation Office (GMAO). Computational resources supporting this work were provided by the NASA High-End Computing (HEC) Program through the NASA Center for Climate Simulation (NCCS) at GSFC. We thank Wei Shi for help with the observed T2M data, and we thank the modeling centers that made the CMIP6 data used in our analysis available.

Data availability statement.

CMIP6 model simulation data are available from https://esgf-node.llnl.gov/search/cmip6/. Output from the GEOS AGCM simulations can be made available upon request. The gridded daily temperature (T2M) data are available from the Climate Prediction Center (CPC; https://www.esrl.noaa.gov/psd/data/gridded/data.cpc.globaltemp.html).

REFERENCES

  • Athanasiadis, P. J., and Coauthors, 2017: A multisystem view of wintertime NAO seasonal predictions. J. Climate, 30, 1461–1475, https://doi.org/10.1175/JCLI-D-16-0153.1.

  • Bernhardt, J., A. M. Carleton, and C. LaMagna, 2018: A comparison of daily temperature-averaging methods: Spatial variability and recent change for the CONUS. J. Climate, 31, 979–996, https://doi.org/10.1175/JCLI-D-17-0089.1.

  • Bethke, I., and Coauthors, 2021: NorCPM1 and its contribution to CMIP6 DCPP. Geosci. Model Dev., 14, 7073–7116, https://doi.org/10.5194/gmd-14-7073-2021.

  • Boucher, O., and Coauthors, 2020: Presentation and evaluation of the IPSL-CM6A-LR climate model. J. Adv. Model. Earth Syst., 12, e2019MS002010, https://doi.org/10.1029/2019MS002010.

  • Bretherton, C. S., and D. S. Battisti, 2000: An interpretation of the results from atmospheric general circulation models forced by the time history of the observed sea surface temperature distribution. Geophys. Res. Lett., 27, 767–770, https://doi.org/10.1029/1999GL010910.

  • Chang, Y., S. D. Schubert, R. D. Koster, A. M. Molod, and H. Wang, 2019: Tendency bias correction in coupled and uncoupled global climate models with a focus on impacts over North America. J. Climate, 32, 639–661, https://doi.org/10.1175/JCLI-D-18-0598.1.

  • Chen, H., and E. K. Schneider, 2014: Comparison of the SST-forced responses between coupled and uncoupled climate simulations. J. Climate, 27, 740–756, https://doi.org/10.1175/JCLI-D-13-00092.1.

  • Copsey, D., R. Sutton, and J. R. Knight, 2006: Recent trends in sea level pressure in the Indian Ocean region. Geophys. Res. Lett., 33, L19712, https://doi.org/10.1029/2006GL027175.

  • Cottrell, F. M., J. A. Screen, and A. A. Scaife, 2024: Signal-to-noise errors in free-running atmospheric simulations and their dependence on model resolution. Atmos. Sci. Lett., 25, e1212, https://doi.org/10.1002/asl.1212.

  • Danabasoglu, G., and Coauthors, 2020: The Community Earth System Model Version 2 (CESM2). J. Adv. Model. Earth Syst., 12, e2019MS001916, https://doi.org/10.1029/2019MS001916.

  • Eade, R., D. Smith, A. Scaife, E. Wallace, N. Dunstone, L. Hermanson, and N. Robinson, 2014: Do seasonal-to-decadal climate predictions underestimate the predictability of the real world? Geophys. Res. Lett., 41, 5620–5628, https://doi.org/10.1002/2014GL061146.

  • Eyring, V., S. Bony, G. A. Meehl, C. A. Senior, B. Stevens, R. J. Stouffer, and K. E. Taylor, 2016: Overview of the Coupled Model Intercomparison Project phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev., 9, 1937–1958, https://doi.org/10.5194/gmd-9-1937-2016.

  • Gates, W. L., 1992: An AMS continuing series: Global change—AMIP: The Atmospheric Model Intercomparison Project. Bull. Amer. Meteor. Soc., 73, 1962–1970, https://doi.org/10.1175/1520-0477(1992)073<1962:ATAMIP>2.0.CO;2.

  • Gelaro, R., and Coauthors, 2017: The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2). J. Climate, 30, 5419–5454, https://doi.org/10.1175/JCLI-D-16-0758.1.

  • Koenigk, T., and U. Mikolajewicz, 2009: Seasonal to interannual climate predictability in mid and high northern latitudes in a global coupled model. Climate Dyn., 32, 783–798, https://doi.org/10.1007/s00382-008-0419-1.

  • Koster, R. D., M. J. Suarez, and M. Heiser, 2000: Variance and predictability of precipitation at seasonal-to-interannual timescales. J. Hydrometeor., 1, 26–46, https://doi.org/10.1175/1525-7541(2000)001<0026:VAPOPA>2.0.CO;2.

  • Koster, R. D., and Coauthors, 2004: Regions of strong coupling between soil moisture and precipitation. Science, 305, 1138–1140, https://doi.org/10.1126/science.1100217.

  • Koster, R. D., and Coauthors, 2011: The second phase of the global land–atmosphere coupling experiment: Soil moisture contributions to subseasonal forecast skill. J. Hydrometeor., 12, 805–822, https://doi.org/10.1175/2011JHM1365.1.

  • Kumar, A., M. Chen, and W. Wang, 2013: Understanding prediction skill of seasonal mean precipitation over the tropics. J. Climate, 26, 5674–5681, https://doi.org/10.1175/JCLI-D-12-00731.1.

  • Kushnir, Y., and Coauthors, 2019: Towards operational predictions of the near-term climate. Nat. Climate Change, 9, 94–101, https://doi.org/10.1038/s41558-018-0359-7.

  • Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141, https://doi.org/10.1175/1520-0469(1963)020%3C0130:DNF%3E2.0.CO;2.

  • Mehta, V. M., M. J. Suarez, J. V. Manganello, and T. L. Delworth, 2000: Oceanic influence on the North Atlantic Oscillation and associated Northern Hemisphere climate variations: 1959–1993. Geophys. Res. Lett., 27, 121–124, https://doi.org/10.1029/1999GL002381.

  • Molod, A., L. Takacs, M. Suarez, and J. Bacmeister, 2015: Development of the GEOS-5 atmospheric general circulation model: Evolution from MERRA to MERRA2. Geosci. Model Dev., 8, 1339–1356, https://doi.org/10.5194/gmd-8-1339-2015.

  • Murphy, J. M., 1990: Assessment of the practical utility of extended range ensemble forecasts. Quart. J. Roy. Meteor. Soc., 116, 89–125, https://doi.org/10.1002/qj.49711649105.

  • Rodwell, M. J., D. P. Rowell, and C. K. Folland, 1999: Oceanic forcing of the wintertime North Atlantic Oscillation and European climate. Nature, 398, 320–323, https://doi.org/10.1038/18648.

  • Scaife, A. A., and D. Smith, 2018: A signal-to-noise paradox in climate science. npj Climate Atmos. Sci., 1, 28, https://doi.org/10.1038/s41612-018-0038-4.

  • Scaife, A. A., and Coauthors, 2009: The CLIVAR C20C project: Selected twentieth century climate events. Climate Dyn., 33, 603–614, https://doi.org/10.1007/s00382-008-0451-1.

  • Scaife, A. A., and Coauthors, 2014: Skillful long-range prediction of European and North American winters. Geophys. Res. Lett., 41, 2514–2519, https://doi.org/10.1002/2014GL059637.

  • Scaife, A. A., and Coauthors, 2017: Tropical rainfall, Rossby waves and regional winter climate predictions. Quart. J. Roy. Meteor. Soc., 143 (702), 1–11, https://doi.org/10.1002/qj.2910.

  • Siegert, S., D. B. Stephenson, P. G. Sansom, A. A. Scaife, R. Eade, and A. Arribas, 2016: A Bayesian framework for verification and recalibration of ensemble forecasts: How uncertain is NAO predictability? J. Climate, 29, 995–1012, https://doi.org/10.1175/JCLI-D-15-0196.1.

  • Smith, D. M., A. A. Scaife, and B. P. Kirtman, 2012: What is the current state of scientific knowledge with regard to seasonal and decadal forecasting? Environ. Res. Lett., 7, 015602, https://doi.org/10.1088/1748-9326/7/1/015602.

  • Tatebe, H., and Coauthors, 2019: Description and basic evaluation of simulated mean state, internal variability, and climate sensitivity in MIROC6. Geosci. Model Dev., 12, 2727–2765, https://doi.org/10.5194/gmd-12-2727-2019.

  • Zhang, W., and B. Kirtman, 2019: Understanding the signal-to-noise paradox with a simple Markov model. Geophys. Res. Lett., 46, 13 308–13 317, https://doi.org/10.1029/2019GL085159.

  • Ziehn, T., and Coauthors, 2020: The Australian Earth System Model: ACCESS-ESM1.5. J. South. Hemisphere Earth Syst. Sci., 70, 193–214, https://doi.org/10.1071/ES19035.
