## Abstract

The decadal predictability of three-dimensional Atlantic Ocean anomalies is examined in a coupled global climate model [the third climate configuration of the Met Office Unified Model (HadCM3)] using a linear inverse modeling (LIM) approach. It is found that the evolution of temperature and salinity in the Atlantic, and the strength of the meridional overturning circulation (MOC), can be effectively described by a linear dynamical system forced by white noise. The forecasts produced using this linear model are more skillful than other reference forecasts for several decades. Furthermore, significant nonnormal amplification is found under several different norms. The regions from which this growth occurs are found to be fairly shallow and located in the far North Atlantic. Initially, anomalies in the Nordic seas impact the MOC and the anomalies then grow to fill the entire Atlantic basin, especially at depth, over one to three decades. It is found that the structure of the optimal initial condition for amplification is sensitive to the norm employed, but the initial growth seems to be dominated by MOC-related basin-scale changes, irrespective of the choice of norm. The consistent identification of the far North Atlantic as the most sensitive region for small perturbations suggests that additional observations in this region would be optimal for constraining decadal climate predictions.

## 1. Introduction

Climate variability on decadal time scales is strongly influenced by the oceans. The Atlantic Ocean is particularly important because of its central role in the overturning circulation. Furthermore, there is evidence from models that variations in the Atlantic overturning circulation, and associated impacts on climate, are potentially predictable on decadal time scales (e.g., Griffies and Bryan 1997; Collins and Sinha 2003; Pohlmann et al. 2004; Sutton and Hodson 2005; Collins et al. 2006). This evidence suggests that predictions for the climate of the next few decades should be properly initialized using information about the present ocean state, rather than relying solely on the simulated response to changing radiative forcing (Solomon et al. 2007).

There has recently been significant progress in the development of properly initialized decadal prediction systems (Smith et al. 2007; Keenlyside et al. 2008; Meehl et al. 2009). Nevertheless, such systems are still at an early stage of development, and there are many challenges ahead. One important need is to improve understanding of the processes that govern error growth. As in numerical weather prediction, understanding the growth of perturbations is essential to design ensembles for sampling uncertainty efficiently (e.g., Molteni et al. 1996) and also to improve observing systems in a targeted, cost-effective way. This latter point is especially important for the oceans because of the high cost of subsurface observations.

The purpose of our study is to investigate the growth, on decadal time scales, of perturbations in the Atlantic Ocean. More specifically, we aim to estimate optimal perturbations, these being the perturbations that grow most rapidly, in a linear sense, as quantified by an appropriate metric, over a defined time interval (e.g., Farrell 1988). These perturbations have been examined in simple models using a variety of techniques (e.g., Lohmann and Schneider 1999; Zanna and Tziperman 2005; Sévellec et al. 2008). The methodology that we employ is the linear inverse modeling (LIM) approach of Penland and Sardeshmukh (1995, hereafter referred to as PS95), which we apply to output from a coupled general circulation model (GCM). The LIM approach is attractive in that it provides a way to estimate optimal “nonnormal” perturbations with vastly less computational expense than alternative methods (e.g., empirical singular vectors) (Kleeman et al. 2003). LIM has been exploited to study decadal variability and predictability in the Pacific using observations (e.g., Newman 2007; Alexander et al. 2008) and amplification of the Atlantic thermohaline circulation (THC) in simple box models (e.g., Tziperman and Ioannou 2002). Recently, Tziperman et al. (2008) used LIM to examine decadal predictability in the Atlantic Ocean in a coupled GCM [the Geophysical Fluid Dynamics Laboratory Climate Model version 2.1 (GFDL CM2.1)] and found significant rapid, nonnormal amplification from anomalies near the surface in the Labrador Sea, which grew into anomalies in a larger region of the North Atlantic. In their model, this growth limits the predictability of the ocean temperature and salinity fields, as well as the THC, to about 8 years.

However, it is important to determine if this predictability limit is found in other GCMs (and in the real ocean), and there is evidence that different GCMs behave in diverse ways. Griffies and Bryan (1997) analyzed a previous version of the GFDL model and found that the predictability of the deep ocean lasted 10–20 yr. Collins and Sinha (2003) found that the THC in the third climate configuration of the Met Office Unified Model (HadCM3) was potentially predictable for up to 50 yr. This analysis was extended to a multimodel study by Collins et al. (2006), who found that several GCMs had significant predictability for 20 yr. There was also evidence that predictability varies depending on the exact initial conditions, for example, whether the overturning is relatively weak or strong.

In this paper we adopt the LIM approach to explore Atlantic Ocean predictability in HadCM3. This analysis, although similar to Tziperman et al. (2008), was developed independently and has a different emphasis. The forecast skill and the robustness of the identified optimal perturbations to the LIM methodology, under several different norms, are explored in detail. The suggested physical mechanisms for amplification identified by the linear model are shown to occur in the full GCM. These robustness checks are vital for the successful application of these results to operational decadal climate predictions and to the design of future ocean monitoring systems.

This paper is structured as follows. In section 2 we briefly describe the HadCM3 model and the data used. Section 3 summarizes the use of LIM, and the justification for this approach, with more details in appendixes A and B. Section 4 demonstrates the ability of the constructed linear model to produce skillful forecasts for several decades. The estimation of the nonnormal amplification and optimal initial conditions is presented in section 5, and we conclude and discuss the implications for predictability in section 6.

## 2. Model description and data used

In the subsequent analysis we have used data from an extended (1600 yr) preindustrial control run of the Hadley Centre climate model, HadCM3. The model details are given in Gordon et al. (2000) and references therein; here we give a brief summary. HadCM3 is a global coupled ocean–atmosphere model with an atmospheric resolution of 2.5° × 3.75° and 19 vertical levels. The ocean component has a resolution of 1.25° × 1.25° with 20 vertical levels. The model does not require flux adjustment to maintain a stable climate. The mean state of the ocean model matches observed values to within 1 K and 1 psu in most regions (Gordon et al. 2000; Pardaens et al. 2003). We do not analyze the first 500 years of data to minimize the influence of spinup effects in the ocean.

### a. EOF basis

Three-dimensional (3D) bivariate correlation EOFs of the annual mean data for temperature and salinity in the Atlantic, for 1100 yr, were estimated by Hawkins and Sutton (2007, hereafter referred to as HS07), who describe the details of their construction. In summary, the domain used is 20°S–90°N, 100°W–20°E using 12 depth levels from the surface to 1800 m. The state vector contains both salinity and temperature anomalies, which are normalized to have unit variance at each grid point and are also weighted by their contribution to local density. We also ensure that each volume of water is treated equally by weighting by the volume of each grid box. HS07 showed that the two leading modes have significant multidecadal and centennial variability and are well correlated with an index of the meridional overturning circulation (MOC) on decadal time scales. These two modes also show good potential predictability, which is explored further here.

In this study we represent the state of the Atlantic Ocean using the leading 30 EOFs, which have been scaled so that the standard deviation of each principal component (PC) is proportional to the square root of the fraction of variance of the full three-dimensional fields explained. The eigenspectrum is shown in Fig. 1 as the solid black line. The dark gray line represents the cumulative variance explained and shows that these 30 EOFs together account for 71% of the total variance of this large (105 150 × 1100 elements) system. The leading four PCs are shown in Fig. 2.

### b. Overturning index

The strength of the meridional overturning circulation in this model has been found to vary on a wide range of time scales,^{1} including an interdecadal mode (∼25 yr) (Dong and Sutton 2005), a multidecadal mode (∼70 yr) (Vellinga and Wu 2004), and a centennial mode (∼150 yr) (HS07). In this paper we will examine whether small stochastic perturbations can amplify and excite MOC variability.

The meridional overturning index (MOI) used in this study is defined as the anomaly from the time-mean meridional streamfunction averaged over the latitude band 27.5°–32.5°N at a depth of 1000 m. This is the same as used by HS07, but here it has been decadally smoothed. The mean overturning strength for this smoothed index is 16.5 Sv (Sv ≡ 10^{6} m^{3} s^{−1}), with a standard deviation of 0.59 Sv, and is hereafter denoted as MOI_{GCM}. The light gray line in Fig. 1 shows that the leading 30 PCs account for 81% of the variance of this index, at zero lag.

## 3. Linear inverse modeling

We now briefly describe the linear inverse modeling approach, following PS95. The more technical details are given in appendix A.

### a. The linear model

The evolution of variables *y* in a GCM can be represented as

where *F* is a complex nonlinear operator. We assume here that the dynamics are effectively linear, that is, that the nonlinear dynamics have a much shorter time scale than the linear dynamics (Penland 1996). The temporal evolution of the GCM can then be approximated as a stochastically forced linear dynamical system:

where **x** is a state vector, *ξ* is a forcing term, and 𝗕 is a matrix defining the temporal evolution of the state vector. We discuss later how well this linear model performs for the ocean state in HadCM3.

If 𝗕^{T}𝗕 ≠ 𝗕𝗕^{T} then the system is described as nonnormal, which results in eigenvectors of 𝗕 that are nonorthogonal. This is the case in most systems based on fluid dynamical equations such as that being considered here (e.g., Farrell 1988).

To reduce the dimensionality of the system we consider just the evolution in the subspace of the leading modes of variability in the data (in this case, the EOFs^{2}). The matrix 𝗕 that best models the PCs of the GCM data can be estimated through the data covariance matrices as described in appendix A. Properties of the noise, *ξ*, can be similarly estimated.

The eigenvectors of 𝗕 are sometimes known as the empirical normal modes (ENMs) (PS95), and their properties for our dataset are discussed further in appendix A. These modes are not usually promising for examining growth as they tend to be overdamped, oscillating modes. For our dataset, one exponentially decaying mode shows a small transient amplification when initialized in certain phases, but larger growth is possible through the interference (or sum) of different modes (see section 5).

The linear model described makes several assumptions about the linear nature of the system and the properties of the stochastic forcing. Previous authors have described several tests to justify these assumptions, and the application of these tests to our dataset is discussed in appendix B. We restrict our analysis to the leading 30 EOFs as these tests are not well passed when more EOFs are considered.

### b. Making forecasts

The matrix 𝗕 can be used to make forecasts of x for any lead time *τ* :

where **x̂** denotes the predicted value of *x*, and our general linear propagator

For comparison, we also model the evolution of each PC (i.e., each element of **x**) as an autoregressive (AR1) process, which uses a diagonal matrix for 𝗣_{τ}. Results using both the full and diagonal propagators are shown later.

As well as forecasting the PC state vector **x**, it is also possible to forecast any index of variability. Here we focus on the overturning strength and define a vector **m** that optimally combines the PCs to create a reconstructed overturning index,

by minimizing the variance of the residuals between MOI_{GCM} and MOI_{recon} (e.g., Tziperman et al. 2008). The reconstructed MOI can then be forecast in a similar way to Eq. (3),

and compared with the actual MOI_{recon} (and MOI_{GCM}) to give an estimate of the skill of the forecast.

## 4. Forecasts and predictability

Having demonstrated that our system is behaving appropriately (appendix B) we can proceed to use the linear modeling approach described above with more confidence. First we consider the skill of the forecasts produced with Eqs. (3) and (6).

### a. Forecasting the leading PCs

Figure 2 shows the four leading PCs (gray lines) and forecasts for lead times of up to 50 yr (black lines) using our linear model. The forecasts seem to show considerable skill, even with multidecadal lead times. We now quantitatively assess this skill by comparing the linear model forecast errors to reference forecasts such as climatology (𝗣_{τ} = 0) and persistence (𝗣_{τ} = 1). We also compare with the theoretical error due to the presence of the unpredictable white noise forcing in Eq. (2)—this is termed a “perfect” linear model. Penland (1989) showed that the prediction error covariance matrix, assuming the system was purely linear, would be

where 𝗖(0) is the covariance matrix of *x*, defined in Eq. (A4). As we are fitting a linear model to a system that is probably slightly nonlinear, the actual errors using the propagator will be different and normally larger. So, the differences between the actual and theoretical errors are a measure of the nonlinearity of the system and a test of our linear assumptions.

Figure 3 shows the rms errors in the forecasts for the total vector **x** (top left panel) and the leading five PCs for a perfect linear model [Eq. (7), dashed black line], the full linear model (solid black line), the restricted diagonal model [solid gray line, which is equivalent to a damped persistence (AR1) forecast], and a pure persistence forecast (dashed gray line) relative to a climatological forecast. The full linear model comfortably beats climatology and the diagonal model for up to 100 yr in all examples shown. The full linear model is also close to the theoretical limit, demonstrating that our system is behaving roughly linearly. The largest nonlinearity appears in the leading three PCs.

The rms errors shown in Fig. 3 are potentially optimistic because the errors and propagator are estimated using the same data. In Fig. 4 we test how important this is by showing the rms errors from just the last 100 yr calculated using the full propagator (dashed black line) and a propagator calculated using just the first 1000 yr, that is, independent data (solid gray line). The solid black lines are repeated from Fig. 3. Although the rms errors tend to increase when using independent data, they are not dramatically worse, particularly for the first few decades. All further results are shown using the full propagator. The large differences between the solid and dashed black lines demonstrate that the rms errors can be state dependent.

### b. Forecasting the overturning

We now consider forecasts of the overturning indices, which rely on the higher PCs not considered in detail above, although it should be noted that not all of the MOI variability is captured by the linear model (see appendix B).

#### 1) The reconstructed MOI

The reconstructed MOI, defined in Eq. (5), has a correlation of 0.90 with MOI_{GCM}.^{3} The time series of both indices are shown in Fig. 5, which also shows forecasts of the MOI for lead times of up to 50 yr (black lines) for our full linear model. Again, the forecasts show some predictive ability, especially for the low-frequency variability, even at long lead times.

In a similar way to Eq. (7), Tziperman et al. (2008) showed that the theoretical error covariance for MOI_{recon}, assuming a perfectly linear system, is

The bottom left panel of Fig. 3 shows the rms errors for the forecast of MOI_{recon} for the full linear model (solid black line), the perfect linear system (dashed black line), the diagonal model (solid gray line), and the persistence forecast (dashed gray line). It can be seen that the full linear model is again close to the theoretical limit and outperforms all the reference forecasts for several decades.

#### 2) The GCM overturning strength

As the MOI cannot be perfectly reconstructed from our PC basis, the skill of the linear model in forecasting MOI_{recon} is not necessarily what we would like to know for practical purposes. A more relevant forecast is for MOI_{GCM}, and the bottom right panel of Fig. 3 quantifies the skill of the linear model in predicting this decadally filtered index. It is found that for lead times less than ∼4 yr damped persistence and pure persistence forecasts for MOI_{GCM} (solid gray and dashed gray lines respectively) perform best. At longer lead times the linear propagator model (solid black line) is superior and performs better than a climatological forecast for more than 100 yr. The lead time at which the linear model forecast reaches the level of 50% of the climatological variance is ∼10 yr. This compares well to three ensemble “perfect model” integrations of this GCM shown by the black points and thin black lines (Collins and Sinha 2003).^{4} At zero lag, using 30 (2) PCs it is possible to estimate MOI_{GCM} with a rms error of 0.26 (0.42) Sv.

### c. Predicting SST anomalies

The skill of the linear inverse model in predicting annual mean Atlantic SST anomalies is now demonstrated. Figure 6 shows anomaly correlation maps for the LIM forecast, a damped persistence forecast, and the difference in the fraction of variance explained between the two, for 1-yr and 10-yr lead times. The LIM shows considerable skill, even at a lead time of 10 yr, over most of the Atlantic and outperforms damped persistence in most of the basin at both lead times. The greatest increase in skill, over damped persistence, at short lead times is in the tropical South Atlantic and in the northeast Atlantic. At longer lead times, the North Atlantic Current (NAC) region shows the largest increase in skill. These results might encourage use of the LIM methodology with historical observations in the Atlantic to improve forecasts for several years to decades ahead (e.g., Newman 2007 for the Pacific Ocean) without the computational expense of running ensembles of GCM integrations.

## 5. Optimal perturbations

Making the forecasts as described above is important to demonstrate that the linear model has skill, but the main focus of this study is to examine the growth of anomalies. As already mentioned, none of the individual empirical normal modes (ENMs) give substantial transient amplification. To obtain larger growth the damped, oscillating ENMs need to interfere constructively. This is only possible in a nonnormal system as the ENMs are nonorthogonal.

### a. Calculating transient amplification and optimal initial conditions

Using the definition of the norm of a vector **v**,

where 𝗡 is a “norm kernel,” we can define the amplification of our system at a lead time *τ* as

where 𝗟 and 𝗡 are the initial and final norm kernels, respectively. We then look for initial conditions **x**_{0}, that maximize this amplification. This is achieved (Farrell 1988; Tziperman and Ioannou 2002) by solving the generalized eigenvalue equation

for *λ* and **x**_{0} at different lead times *τ*. The largest eigenvalue, *λ*_{max}, is an estimate of the largest transient amplification possible for that lead time. The optimal initial conditions are normalized so that **x**_{0}^{T}𝗟**x**_{0} = 1, and the spatial patterns corresponding to **x**_{0} indicate the regions most sensitive to small perturbations, where observations would potentially be the most useful to constrain predictions.

### b. Choice of norm kernel

The choice of norm kernels, 𝗟 and 𝗡, must be considered carefully as it affects both the estimated optimal initial conditions and the growth of anomalies (Tziperman et al. 2008). Usually the two norm kernels are chosen to be the same, although there is no restriction for this to be the case. Table 1 summarizes the different norms used in this study (described below), and the sensitivity to this choice is discussed later.

#### 1) The quadratic, SST, and total variance norms

The simplest norm uses the identity matrix for the initial and final norm kernels (𝗟 = 𝗡 = 𝗜). This quadratic norm (sometimes called an energy norm) finds maximal growth in the space defined by the leading PCs. We attempt to interpret this growth by considering what any amplification means in the original data space.

It is possible to construct a norm kernel for any subset of the data,

where 𝗘 is a matrix (of size *n* × *p*) that consists of each of the *p* retained EOF patterns, restricted to the *n* ocean grid points in the field of interest, for example, just the surface temperature data (*p* = 30 and *n* = 4768 for our dataset) to construct an SST norm (𝗡_{SST}), which we consider later; and 𝗪 is a matrix of weights. For the SST norm, 𝗪 consists of a latitude-dependent area weighting. More generally, if 𝗘 represents the entire 3D EOF fields for temperature and salinity (*p* = 30, *n* = 105 150 for our dataset), then 𝗡 is a norm that maximizes changes in total variance. Note especially that this is not the identity matrix unless the various weightings (𝗪) used in the original EOF calculation (i.e., for volume, contribution to local density, and local standard deviation—see HS07) are all applied to 𝗘 to make it orthogonal. Thus, the quadratic norm can be interpreted as a variance norm that takes account of all these weightings. We will also consider a different variance norm (denoted by 𝗡_{V}) that includes only the weightings for differing volumes of each grid box and for local standard deviation (although this norm still includes some covariance between temperature and salinity).^{5} To ensure comparable sizes of anomalies, 𝗡_{SST} and 𝗡_{V} are normalized to have unit determinant. When using both these norms we use the same initial and final norm kernel (𝗟 = 𝗡).

#### 2) The MOC norm

Previous authors, starting with Tziperman and Ioannou (2002), have considered the transient amplification of the overturning strength by defining an MOC norm,

where ɛ is a “small” number and 𝗡_{ɛ} is a suitable nonsingular norm, and use 𝗡_{MOC} as both the initial and final norm kernels. The extra small term (or “regularization”) is required, as **mm**^{T} is singular (see Tziperman and Ioannou 2002; Tziperman et al. 2008 for more details).

This approach constrains the optimal initial MOI condition to be small (𝒪[ɛ]), resulting in a large (𝒪[ɛ^{−1}]) amplification for *τ* > 0,

which has the interesting feature that the initial and final norm kernels are dominated by different terms. Thus, the optimal initial conditions depend strongly on the perturbing norm kernel 𝗡_{ɛ} with the additional restriction that the initial MOI anomaly be virtually zero. If we instead choose different initial and final norms (𝗟 = 𝗡_{V} and 𝗡 = 𝗡_{MOC}), then we can eliminate the regularization (ɛ = 0) and find small variance anomalies that grow into large MOI anomalies; this is the approach we adopt. This also means that the optimal initial MOI condition need not be small; that is, it can be 𝒪[1].

### c. Amplification under the variance norm

We first focus in detail on the maximum amplification under the variance norm. The optimal initial conditions for growth can be found by solving Eq. (11) for the leading eigenvector (**x**_{0}) and maximum amplification (*λ*_{max}) for different norms and lead times. Figure 7 shows the maximum amplification curves (*λ*_{max} as a function of *τ*) under the variance norm, retaining different numbers of EOFs. The maximum transient amplification increases as the number of EOFs increases, so the amplification found should therefore be considered as a lower bound on the possible nonnormal amplification. But, Fig. 7 also demonstrates that the overall shape of the maximum amplification curve is remarkably consistent with the number of EOFs used. The weights in the leading eigenvectors, and hence the optimal initial conditions, are also fairly consistent as the number of EOFs is increased (not shown). We can therefore be reasonably confident that the optimal initial conditions and amplification process are robust.

It is found that the maximum transient amplification, using 30 EOFs, occurs at 36 yr and involves an amplification factor of ∼10. The thin black curve in Fig. 7 shows the evolution of this largest growing mode. The level of maximum amplification is slightly larger than that found for the GFDL CM2.1 model by Tziperman et al. (2008), although the time to reach maximum amplification is far longer in HadCM3 (36 yr compared to ∼8 yr). We now focus on the mechanisms of amplification of this maximal growth.

The evolution of temperature and salinity anomalies from the initial (x_{0}) to final (𝗣_{τ}**x**_{0}) states, projected into physical space and integrated from the surface to 1800-m depth, is shown in Fig. 8. The initial state has fairly localized and shallow (Fig. 9) anomalies concentrated in the North Atlantic and Nordic seas that grow into anomalies affecting most of the Atlantic basin, with a strong northeast–southwest dipole structure. The inference is that a small anomaly with the structure of the initial state will grow into the final state over 36 yr. It is of interest that the anomalies tend to grow in volume rather than amplitude and are substantially density compensating and therefore involve changes in the spiciness of the water masses. The evolution of these anomalies shows propagation from the North Atlantic southward along the western boundary into the tropical Atlantic, probably due to a combination of wave and advective processes, followed by spreading into the interior.

Figure 9 shows how these anomalies are distributed with depth, integrated over two latitude bands. The dashed lines represent the initial state and the solid lines are the final state. It can be seen that the initial anomalies are concentrated in the top 500 m, especially in the subpolar gyre. The final state is dominated by deeper anomalies (below 500 m) in the tropical Atlantic. The locations highlighted in the initial state indicate where “observations” would be most valuable for constraining predictions. The shallow nature of the initial anomalies is encouraging, as these may be more easily observable than anomalies at deeper levels. Also note that, although the sign of the anomalies is arbitrary in this linear model, the GCM behaves similarly for both signs of anomalies (see Fig. 12 later).

Tziperman et al. (2008) also found that the optimal initial conditions were located in the far North Atlantic in the GFDL CM2.1 model. In their maps of dynamic topography there are also hints of a similar western boundary propagation. It is encouraging to see similar processes at work in two different models.

The evolution of the MOC during this amplification is also of interest. The solid black line in Fig. 10 shows that the negative MOI_{recon} anomaly amplifies for the first 10 years and then recovers. Thorpe et al. (2001) showed that the MOC strength in HadCM3 is strongly correlated with the meridional density gradient in the North Atlantic, and the evolution of anomalous meridional density gradients in Fig. 8 shows the same effect with anomalous low density in the far North Atlantic occurring when the MOC is anomalously weak. De Coëtlogon et al. (2006) also found correlations between Gulf Stream position and transport changes and the MOC in several GCMs, and similar relationships between Gulf Stream transport and gradients in potential energy anomalies have been identified in observations (Curry and McCartney 2001).

Figures 8 and 10 indicate that there may be two stages to the amplification. A faster (∼10 yr) growth due to changes in the high latitudes—the optimal perturbation is a low density anomaly (dominated by low salinity) in the surface layers of the Nordic seas that inhibits convection. The mean density of water flowing through Denmark Strait is also lowered (not shown), and these factors are known to lead decreases in the strength of the MOC in HadCM3 (Hawkins and Sutton 2008). There is also propagation along the western boundary and a corresponding change in the MOC. After a few years, the convection can restart and the MOC recovers. During the second stage (years ∼10 onward), there is slower growth spreading the warm and salty anomalies to greater depths and into the ocean interior, especially in the tropics.

### d. Comparing different norms

One of the uses for optimal initial conditions is to perturb ocean analyses to help design efficient decadal climate forecast ensembles (e.g., Smith et al. 2007; Meehl et al. 2009). Therefore, we now examine the sensitivity of our results to the choice of norm for a fixed optimization lead time of 10 yr. Figure 11 shows the optimal initial conditions for each norm. The patterns are very similar with large values found in the far North Atlantic, especially in the Labrador Sea and Nordic seas, that is, the regions of deep ocean convection. The pattern for the SST norm is slightly different with more weight in the tropical North Atlantic, especially in salinity. However, after 10 yr growth, the amplified patterns are fairly similar (not shown).

Figure 10 considers the development of MOI anomalies for the different norms for the fixed optimization time of 10 yr. The evolution is similar for the variance and MOC norms, and also the quadratic norm (not shown), with growth to the largest MOC extremum^{6} occurring at ∼8–12 yr followed by a decay. The SST norm shows a similar evolution but with the largest MOC anomaly at around 20 yr.

Table 1 gives the lead time of largest possible amplification for each norm. Using the SST norm there is larger nonnormal amplification—a factor of ∼32 at a lead time of 5 yr. The MOC norm shows a far smaller maximal amplification of just ∼1.35 from 1 to 10 yr.

These results suggest that (i) the optimal initial condition is dependent on the norm chosen, (ii) the growth of anomalies over the first decade appears dominated by an MOC perturbation under all norms, and (iii) the time scale to maximum amplification and the pattern at maximum amplification are sensitive to the norm chosen.

### e. Does this amplification actually occur?

The calculation of these optimal initial conditions is a mathematical construct, so it is important to examine whether the amplification described actually occurs. One way to test this would be to impose the optimal initial conditions as anomalies in the full GCM and run an ensemble of integrations. First, this is a computationally expensive test of the kind that we are trying to avoid and, second, these patterns are “average” amplifying perturbations, which are not optimized for any particular initial condition. A simpler, and perhaps more relevant, test is to use the natural internal variability generated by the GCM.

Although it is extremely unlikely that, at any one particular time, the anomalies in the GCM control integration will exactly match the optimal initial condition estimated above, we can study the projection of the GCM anomalies for all 1100 yr onto the optimal initial and final states (see PS95).

This simple estimate of the relevance of the optimal initial condition is shown in Fig. 12, which plots, in the space of the leading PCs, the projection of the state vector **x**(*t*) onto the optimal initial condition **x**_{0}(*τ*) against the projection of **x**(*t* + *τ*) onto the maximally amplified state, 𝗣_{τ}**x**_{0}(*τ*) for all *t*, for a fixed optimization time of *τ* = 10 yr, for each norm. If this linear nonnormal amplification is occurring in the GCM, then all of the points should lie near the dashed lines (a slope of unity). The relation will not be perfect owing to the integrated effect of noise throughout the evolution of the anomaly, but it is reassuring that the best-fit slopes are positive, with good correlations. This demonstrates that this type of amplification does, indeed, occur in the GCM and that our linear model is capturing this. Although there are small differences from a slope of unity, departures from linearity are modest.

Although the structure of the growth appears to occur in the same way as the linear model suggests, the observed level of amplification in the GCM typically reaches ∼50% of the maximum linear estimate (not shown). This is not surprising as we expect the nonlinearities in the GCM to impact our linear estimates.

These results give us confidence that our estimated optimal initial conditions are realistic and giving rise to amplification in the GCM in the way described by our simple linear model.

## 6. Conclusions and implications

We have analyzed the decadal variability and predictability of 3D Atlantic Ocean anomalies in a GCM (HadCM3) using a linear inverse modeling approach. The main findings can be summarized as follows:

The evolution of temperature and salinity in the Atlantic Ocean as captured by the leading 30 three-dimensional EOFs can be effectively described by a linear dynamical system, forced by white noise.

Forecasts, using the linear model, of Atlantic SSTs, the strength of the overturning, and of the leading PCs are more skillful than damped persistence and climatological forecasts for several decades.

Significant nonnormal amplification by a factor of ∼10 can occur in the linear model framework under the variance norm over 36 yr. Similar processes of amplification are also found in the natural variability of the full GCM.

The regions from which growth occurs under the variance, quadratic, and MOC norms are found in the top 500 m of the far North Atlantic. Anomalies in this region amplify to fill most of the Atlantic basin, especially at depth, over 1–3 decades. The mechanism of amplification appears to initially involve anomalies in the Nordic seas followed by southward propagation of anomalies along the western boundary affecting the MOC and subsequent spreading into the interior.

The structure of the optimal initial conditions is sensitive to the choice of norm, whereas the initial growth appears similar and dominated by MOC-related basin-scale changes for all norms considered.

Tziperman et al. (2008) used a similar methodology to that described here, using a quadratic norm, on the GFDL CM2.1 model and found a transient amplification of comparable amplitude that occurred over the first 10 yr—similar to the first stage of growth found here for HadCM3. They concluded that the predictability in this model decays rapidly over 10 yr, and this is also seen in the errors in their forecasts, which grow more rapidly than for HadCM3. In HadCM3 the transient growth is slower, indicating greater predictability in this model. This is also demonstrated by the significant skill found in the forecasts, even at long lead times, and also by previous studies (e.g., Collins and Sinha 2003).

There are two main reasons for performing this kind of analysis: First, the initial perturbations indicate where observations would be most valuable to help constrain decadal climate predictions. The similar location (far North Atlantic) for the optimal initial conditions found in both HadCM3 and GFDL CM2.1 is encouraging, and suggests additional ocean observations in these regions could be valuable, though the spatial density of ocean observations that would be required is still an open question. Second, any future operational decadal climate prediction system will rely on ensemble forecasts from analyzed initial conditions and will require optimal perturbations, such as these, to efficiently explore the uncertainty in the initial state and hence provide more accurate forecast uncertainties.

It should be noted that, although we have attempted to justify the various assumptions about the linearity of the system, it needs to be rigorously demonstrated that the identified regions are, indeed, the most sensitive to perturbations using integrations of the full GCM (e.g., Kleeman et al. 2003). A further paper will explore this issue in more detail. Further work will also investigate the application of these methods to different GCMs.

## Acknowledgments

We thank Myles Allen for suggesting the use of the linear inverse modeling approach, and Cécile Penland for valuable discussions. We also thank Laure Zanna and the anonymous reviewer for their comments, which helped improve the paper. EH is funded by the U.K. Natural Environment Research Council under the thematic Rapid Climate Change programme. RS is supported by a Royal Society University Research Fellowship.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

### APPENDIX A

#### Methodology Details

##### Constructing a linear propagator

Here we give more details (also see PS95) of the estimation of 𝗕, starting from Eq. (2). If we assume (for now) that the forcing term (*ξ*) is zero, then Eq. (2) is a purely linear system, and we can introduce a linear propagator 𝗚 so that

where **x** is our PC state vector and *δ* is the lead time. This assumption of linearity is not going to be strictly true, so we assume that the nonlinear terms can be considered as a forcing term (*ξ*) in Eq. (2). This forcing is generally taken to be Gaussian white noise (in time). These assumptions of linearity and Gaussianity are tested for our system in appendix B. The optimal linear propagator 𝗚(*δ*) can be found from the covariance matrices of **x** (PS95),

in which angle brackets denote an average over all *t*; 𝗖(0) is a diagonal matrix with elements proportional to the fraction of the variance explained by each EOF. In this notation,

and should be independent of the lead time (*δ*) used to estimate it. The general linear propagator is then [cf. Eq. (4)]

Although, in principle, we are free to choose any lead time *δ* to calculate 𝗕, there are physical constraints that limit this choice. To ensure that the modes of variability are physically realistic decaying modes, it is necessary that the eigenvalues of 𝗚(*δ*) have positive real part and magnitude less than unity. It is found for our 3D EOF system that using a lead time of *δ* = 1 yr will satisfy this constraint, but longer lead times do not. All results presented here therefore use the propagator derived for *δ* = 1 yr.

##### Empirical normal modes

The eigenvectors of 𝗕 are sometimes known as the empirical normal modes (PS95), and the corresponding eigenvalues *β _{j}* are generally complex:

Table A1 summarizes the properties of the ENMs and shows that two modes in our system are exponentially decaying modes (*w _{j}* = 0) and the rest are oscillatory, complex conjugate pairs with decay times −1/

*σ*and periods 2

_{j}*π*/

*w*. The shortest period for one of these oscillatory modes is 4.4 yr (ENM 3/4). Erroneous results for 𝗚 can be obtained if the lag (

_{j}*δ*) is chosen to be larger than half the shortest period, and large uncertainties can occur near this half period (PS95). This is the so-called Nyquist problem and is the reason why the propagator matrix could not be reliably estimated for

*δ*> 1 yr in our 30-EOF system.

##### Noise EOFs

As the individual ENMs decay there must be an energy input from the stochastic forcing (*ξ*) to ensure stationary statistics (a fluctuation–dissipation relation). This relation (Penland and Matrosova 1994; PS95) is

and defines the covariance matrix of the forcing 𝗤 = 〈*ξξ*^{T}〉*dt*. Although the forcing is white noise in time, the eigenvectors of 𝗤 correspond to coherent, three-dimensional spatial patterns of forcing, sometimes called “noise EOFs.” The most likely source for noise in the ocean is the atmosphere, so it is hoped that these noise EOFs are dominated by surface anomalies that resemble known surface flux patterns due to coherent atmospheric forcing states. In our case, the leading noise EOFs are dominated by anomalies in the top ∼350 m, with the spatial pattern of the first noise EOF (representing around 13% of the variance) similar to the observed ocean response to changes in the North Atlantic Oscillation (not shown).

### APPENDIX B

#### Validation of the Assumptions in the Methodology

There are several tests that can be used to justify the assumptions made in constructing the linear model and to check that our system is indeed acting like a Gaussian white-noise-forced version of Eq. (2). Section 4a analyses the skill of the forecasts produced, and here we consider some further tests.

##### Are the linearity assumptions valid?

The evolution of the PCs will not be perfectly linear, but the skill of the forecasts shown in Fig. 3 gives us confidence in this assumption. Also, as previously discussed, for a linear system, **B** should be independent of *δ*. Unfortunately we cannot check this as 𝗕 can only be reliably calculated for *δ* = 1 yr, using 30 EOFs. If fewer EOFs are used, then this test is reasonably well passed (not shown).

##### Is the noise forcing consistent?

As the noise matrix 𝗤 is a covariance matrix, it should be positive definite; that is, the eigenvalues should all be positive. It is found that, when using 30 EOFs, two of the eigenvalues are slightly negative (∼−10^{−5}). We also find that, if only the leading 15 EOFs are retained, then all the eigenvalues are positive. This may indicate the presence of a small amount of nonlinearity found in the low variance, higher EOFs, or that we do not have enough data to reliably determine the higher EOFs. Previous studies (e.g., PS95) have found similar results and concluded that it does not influence their results significantly.

##### Are the statistics consistent?

Another approach to testing the linear model assumptions is to integrate a forward model, constructed from Eq. (2), and generate artificial data with the same linear propagator as that estimated from the GCM data (see Penland and Matrosova 1994 and PS95 for details). This artificial data should have similar statistics to the real data. Newman (2007) compared the power spectra of the components output by the forward model with the power spectra derived directly from the GCM data and from observations, and we perform a similar test here.

We integrated our forward model for 55 000 years with a time step of 1 month, and split the output into 50 independent segments, so that each segment had the same length (1100 yr) as the GCM data, to act as a Monte Carlo ensemble. The stochastic forcing used in the forward model is generated using the noise matrix 𝗤 with the two negative eigenvalues suppressed.

The solid lines in Fig. B1 show the power spectra for the leading five PCs, and these should be compared to the dashed lines that indicate the ensemble mean from the forward model, with 95% confidence limits shaded gray. It can be seen that the power spectra derived directly from the data are within the estimated confidence levels for the leading five principal components, indicating that the linear model can reproduce the frequency statistics of the distribution well. The large uncertainty range in the power spectra indicates that even the statistics of 1100-yr segments can be very variable and each segment may not fully represent all the frequencies of natural variability. Figure B2 shows the same for the domain means of SST, sea surface salinity (SSS), temperature and salinity at a depth of 1500 m, and the MOI. It is not surprising to see that the linear model is better at representing the variability of the deep ocean than the surface layers. The lower panel of Fig. B2 indicates that not all of the MOI variability is completely represented by the linear model. Also seen is the similarity of the spectra for MOI_{GCM} and MOI_{recon} for periods greater than about 20 yr, except for the 25-yr spectral peak which is missing in MOI_{recon}. These differences will limit the skill of the linear model to predict the overturning indices.

##### Considering Gaussianity

If a Gaussian white-noise-forced linear model is a good representation of our system, then the statistics of *x* should be Gaussian, but it is found that (only) the leading two PCs have significantly non-Gaussian skewness. This could be an indication of nonlinear processes that are not well represented by Gaussian white noise or due to the low-frequency variability (Fig. B1) found in these two leading PCs that is not fully sampled in the 1100 years of analyzed data. As a further check we examined the statistics of different segments of the stochastically forced forward model described above and found significantly non-Gaussian skewness for the leading two PCs in many segments. The statistics were only Gaussian if the segment considered was long enough to capture enough cycles of the coherent oscillation. Even so, the skewness of the real PC1 is outside the 95% confidence limits derived from the forward model (not shown). Non-Gaussianity is possible if the noise is multiplicative rather than additive (e.g., Sura and Sardeshmukh 2008), but this is a small caveat on our results.

## Footnotes

*Corresponding author address:* Ed Hawkins, Dept. of Meteorology, University of Reading, Reading, RG6 6BB, United Kingdom. Email: e.hawkins@reading.ac.uk

^{2}

Note that, although Farrell and Ioannou (2001) argue that EOFs may not be the optimal choice for investigating nonnormal growth, we find that they describe the system adequately for our purposes.

^{3}

The linear model should not be expected to predict the variability over time scales of a few years, which is dominated by the Ekman response to atmospheric variability. This correlation increases to 0.94 if the MOI has a 30-yr filter applied, rather than the 10-yr filter used throughout this study. The correlation is 0.70 using just the leading two PCs.

^{4}

Note, though, that the MOC index used by Collins and Sinha (2003) is different—they used annual means of the overturning at 45°N and 670-m depth, and subtracted off the Ekman part of the overturning, which is likely to enhance skill.

^{5}

Note that in Tziperman et al. (2008) the quadratic and variance norms are identical, as they do not include a weighting owing to a contribution to local density in their EOF estimation. We will focus on the variance norm in particular to more directly compare to Tziperman et al. (2008), but the results for the quadratic norm are very similar.

^{6}

Note again that the sign of the anomaly is arbitrary.