## Abstract

A key aspect in designing an efficient decadal prediction system is ensuring that the uncertainty in the ocean initial conditions is sampled optimally. Here one strategy for addressing this issue is considered by investigating the growth of optimal perturbations in the third climate configuration of the Met Office Unified Model (HadCM3) global climate model (GCM). More specifically, climatically relevant singular vectors (CSVs), the small perturbations that grow most rapidly for a specific set of initial conditions, are estimated for decadal time scales in the Atlantic Ocean. It is found that reliable CSVs can be estimated by running a large ensemble of integrations of the GCM. Amplification of the optimal perturbations occurs for more than 10 yr, and possibly up to 40 yr. The identified regions for growing perturbations are found to be in the far North Atlantic, and these perturbations cause amplification through an anomalous meridional overturning circulation response. Additionally, this type of analysis potentially informs the design of future ocean observing systems by identifying the sensitive regions where small uncertainties in the ocean state can grow maximally. Although these CSVs are expensive to compute, ways in which the process could be made more efficient in the future are identified.

## 1. Introduction

Successful predictions of changes in climate for the forthcoming decade, especially on regional and local scales, depend on predicting both the continuing response of the climate to radiative forcings and the natural decadal climate variations that mainly arise from slow changes in the oceans (e.g., Hawkins and Sutton 2009b). Most current climate projections, including the Intergovernmental Panel on Climate Change’s (IPCC) Fourth Assessment Report (AR4; Meehl et al. 2007), only consider the changes in radiative forcings. The potential for enhanced skill has led to the design of decadal climate prediction systems that initialize climate forecasts from the observed ocean state (e.g., Smith et al. 2007; Keenlyside et al. 2008; Pohlmann et al. 2009). Initialized forecasts also offer a new way of testing and validating climate models, and a multimodel intercomparison is planned for the next IPCC report (Meehl et al. 2009).

The Atlantic Ocean is of particular importance because of its central role in the global overturning circulation. There is evidence from global climate models (GCMs) that variations in the Atlantic meridional overturning circulation (MOC) are largely responsible for the observed decadal changes in Atlantic sea surface temperatures (SSTs) and hence can influence the atmosphere, and that these variations are potentially predictable on decadal time scales (e.g., Griffies and Bryan 1997; Pohlmann et al. 2004; Collins et al. 2006; Hawkins et al. 2010, manuscript submitted to *Climate Dyn.*).

One key challenge is to improve our understanding of error growth on decadal time scales, and this is important for several reasons. First, understanding why, and how quickly, a climate model, initialized from the observations, diverges from subsequent observations is valuable for identifying the specific physical processes responsible for error growth. Improving the representation of these key processes in GCMs might then become a priority.

Second, identifying the ocean regions that are most sensitive to small perturbations will also allow the design of optimal observing systems to constrain predictions in a targeted, cost-effective way. This targeting is especially important for the oceans because of the high cost of subsurface observations. Third, understanding perturbation growth is required to design efficient ensembles to sample initial condition uncertainty (e.g., Molteni et al. 1996). For example, in the Met Office’s Decadal Prediction System (DePreSys; Smith et al. 2007), which is based on the third climate configuration of the Met Office Unified Model (HadCM3) GCM, an ensemble of forecasts is currently generated by using essentially identical ocean states, but with different atmospheric states, in the initial conditions. This causes the forecasts to diverge, even in the ocean, giving a spread, or uncertainty, in the prediction. However, there is evidence that the spread of these ensemble members is too narrow when compared with the observations; that is, the estimated uncertainties in the forecasts are not reliable (Smith et al. 2007). There is therefore a need to design initial condition perturbations that will ensure a reliable spread.

Optimal perturbations, which are the perturbations that grow most rapidly over a defined time interval (e.g., Farrell 1988), are routinely used in numerical weather prediction (NWP) to ensure reliable ensemble spreads and to identify regions for targeted observations (e.g., Palmer et al. 1998). Several techniques exist for generating such optimal perturbations, such as singular vectors (e.g., Buizza and Palmer 1995) or breeding vectors (e.g., Toth and Kalnay 1997). In NWP it is the rapidly growing weather modes that are important. On longer (seasonal to decadal) time scales, however, it is necessary to remove any effects of the rapidly growing weather perturbations and instead focus on the fastest-growing climate perturbations.

Several methods exist for estimating optimal perturbations for climate time scales. Singular vectors have been estimated in simplified climate models (e.g., Zanna and Tziperman 2005; Sevellec et al. 2007; Zanna et al. 2011) and both Sevellec et al. (2008) and Czeschel et al. (2010) examined the sensitivity of the MOC to perturbations and surface forcings using adjoints of ocean-only models. However, it is important to consider the role of coupled processes. To our knowledge, no explicit adjoint has yet been developed for a fully coupled GCM and there is therefore a need to consider approximate methods. Other studies (e.g., Tziperman et al. 2008; Hawkins and Sutton 2009a) have estimated optimal perturbations for decadal predictions of the Atlantic Ocean in GCMs using linear inverse modeling [LIM; see, e.g., Penland and Sardeshmukh (1995) for more details] instead. This statistical methodology allows the calculation of an optimal perturbation averaged over many different initial conditions and, thus, removes the “weather noise.” Both Tziperman et al. (2008) and Hawkins and Sutton (2009a, hereafter HS09) found that the far North Atlantic was the optimal initial perturbation region in the GCMs considered. In the HadCM3 GCM, growth occurred for more than 30 yr through an initial induced change in the MOC, followed by a slower spreading of anomalies southward and into the deeper ocean (HS09). Tziperman et al. (2008) found a similar mechanism in the Geophysical Fluid Dynamics Laboratory Climate Model, version 2.1 (GFDL CM2.1), GCM, but the growth occurred over shorter lead times.

However, it is important to note that the growth of perturbations not only depends on the structure of the perturbation and the norm used to measure growth, but also on the ocean state at the time the perturbation occurs; that is, the growth of perturbations is initial state dependent, even in very simple systems (e.g., Palmer 1993). Kleeman et al. (2003, hereafter KTM03) describe a methodology to estimate climatically relevant singular vectors (CSVs) from a GCM for specific initial states by running a large ensemble of integrations to average out the weather noise. They applied their methodology to estimate CSVs for seasonal forecasting of SST in the Pacific Ocean, and found that their leading CSV grew into a structure resembling the mature phase of an El Niño–Southern Oscillation event. However, for decadal climate predictions, it is likely that perturbations over the entire ocean depth, and to salinity as well as temperature, will result in the largest growth.

Our study investigates the growth, on decadal time scales, of perturbations in the Atlantic Ocean. More specifically, we aim to estimate the optimal perturbations in a GCM for specific initial conditions with the motivation of using the perturbations in decadal predictions of the Atlantic Ocean. We therefore extend and adapt the methodology of KTM03 to allow the estimation of CSVs for decadal, rather than seasonal, time scales. Doing this requires including information from the surface and deeper ocean, and considering the effects of both temperature and salinity, and hence density. We use the HadCM3 GCM to allow a direct comparison with the more statistical LIM approach (HS09).

This paper is structured as follows. Section 2 describes the model and methodology used, and section 3 examines the properties of the leading CSV. In section 4 we confirm that the estimated CSV works as predicted, and examine the mechanism of amplification. We conclude and discuss our results in section 5.

## 2. Model description and methodology

### a. Details of HadCM3

In the subsequent analysis we use the third Hadley Centre global, coupled ocean–atmosphere model (HadCM3). Gordon et al. (2000) describe HadCM3 in detail, and here we give a brief summary. HadCM3 has an atmospheric resolution of 2.5° × 3.75° and 19 vertical levels coupled to an ocean component that has a resolution of 1.25° × 1.25° with 20 vertical levels. The model does not require flux adjustment to maintain a stable climate. The mean state of the ocean model matches the observed values to within 1 K and 1 psu in most regions (Gordon et al. 2000; Pardaens et al. 2003). The relatively coarse resolution of the model allows for the several millennia of integrations required in our analysis (see below) to be run in a reasonable time. Smith et al. (2007) describe the use of this GCM for decadal climate predictions and demonstrate the enhanced skill of initialized retrospective forecasts from 1981 to 2001, as compared to uninitialized projections with the same model.

### b. Estimation of climatic singular vectors

A general dynamical system can be written as

$$\frac{d\boldsymbol{\phi}}{dt} = F(\boldsymbol{\phi}), \tag{1}$$

where **ϕ** is the state vector of the system and *F* represents a complicated, nonlinear operator. For “small” perturbations (*δ***ϕ**), the evolution of the system can be approximated by

$$\delta\boldsymbol{\phi}(t_0 + \Delta t) = \mathsf{R}\,\delta\boldsymbol{\phi}(t_0), \tag{2}$$

where 𝗥 is a linear operator. The singular vectors of the system, which are the perturbations that amplify maximally over time Δ*t*, are the eigenvectors of 𝗥^{T}𝗥 with the largest eigenvalues (e.g., Buizza and Palmer 1995), where 𝗥^{T} is the transpose of 𝗥.
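As a minimal illustration of this definition, the sketch below computes the leading singular vector of a small propagator from the eigen-decomposition of 𝗥^{T}𝗥. The 2 × 2 matrix is a hypothetical toy example, not a quantity from HadCM3; it is chosen so that every eigenmode decays while a suitably chosen perturbation still grows transiently.

```python
import numpy as np

def leading_singular_vector(R):
    """Return (x, lam): the unit-norm initial perturbation that maximizes
    ||R x|| / ||x||, and the corresponding amplification factor lam."""
    # R^T R is symmetric, so eigh applies; eigenvalues are returned ascending.
    evals, evecs = np.linalg.eigh(R.T @ R)
    x = evecs[:, -1]
    lam = np.sqrt(evals[-1])
    return x, lam

# Toy non-normal propagator (hypothetical values). Both eigenvalues are
# below 1, so each eigenmode decays over the interval.
R = np.array([[0.8, 2.0],
              [0.0, 0.5]])
x, lam = leading_singular_vector(R)
print("amplification:", lam)
```

The amplification exceeds 1 here even though both eigenvalues of the toy matrix are smaller than 1, which is why singular vectors rather than eigenmodes are used to identify optimal growth over a finite interval.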

For an atmosphere–ocean coupled GCM, this conceptually simple framework becomes more complicated as 𝗥 is not directly available. First, the state vector of the system is enormous, requiring the system to be represented in a reduced state space. This reduction is often achieved by isolating the leading modes of variability, for example, by using empirical orthogonal functions [EOFs; see Farrell and Ioannou (2001), Moore and Kleeman (2001) and KTM03 for extensive discussions]. Second, when considering climatic time scales, the presence of weather noise requires a modified strategy because different realizations of the weather may cause different climate outcomes. To address this issue, it is necessary to average over many realizations of the weather to obtain a climatically relevant singular vector. Thus, a propagator 𝗥 can be estimated (Gardiner 1985; KTM03) using

$$\langle \delta\boldsymbol{\psi}(t_0 + \Delta t) \rangle = \mathsf{R}\,\delta\boldsymbol{\psi}(t_0), \tag{3}$$

where **ψ** is the reduced state vector of the system and ⟨ ⟩ represents an ensemble mean. The different ensemble members are constructed with tiny perturbations to **ψ**(*t*_{0}), which cause the weather to diverge.

### c. Application to decadal time scales in HadCM3

KTM03 used the above methodology to estimate singular vectors for SST on seasonal time scales in the Pacific Ocean. We extend the methodology to estimate climatic singular vectors for decadal time scales in the Atlantic basin. Amplification on decadal time scales will likely involve the deeper ocean due to the dynamics of the meridional overturning circulation and, hence, the density structure. It is therefore necessary to consider three spatial dimensions, and both salinity and temperature, rather than just SST.

The details of this process are now described. In all that follows we use annual mean fields, except for the instantaneous initial conditions.

1. Calculate the leading 3D joint correlation^{1} EOFs (denoted *E*_{i}) of ocean temperature and salinity from a long (1600 yr) preindustrial control integration of HadCM3 in the chosen domain (20°S–90°N, 100°W–20°E, from the surface to 1800-m depth).^{2} Hawkins and Sutton (2007, hereafter HS07) describe the estimation and properties of these EOFs in more detail, but it is important to note that the temperature and salinity fields are weighted by their contributions to local density anomalies.

2. Choose a suitable initial condition from the control integration; we chose 1 December, after 1050 yr of the control run. This choice was made partly because of the availability of the model initial conditions and partly because the model state in the Atlantic was not particularly unusual at that time (in the subspace of the leading two EOFs; see HS07). The instantaneous ocean temperature and salinity anomalies from the time mean are shown in Fig. 1a, and the meridional overturning index (MOI) around the start date is shown in Fig. 1b.

3. Run a control ensemble (with *K* members) from the chosen initial condition for *T* years. The ensemble mean of the ocean temperature and salinity fields over the chosen domain is denoted by **ψ**_{0}(*t*). The different ensemble members are generated by perturbing the SST field globally by different realizations of random Gaussian white noise with variance 10^{−8} K. These small perturbations cause the atmospheric states to diverge rapidly.

4. Run *M* separate ensembles (each with *K* members) for *T* years from the chosen initial condition, with the *i*th ensemble perturbed in salinity and temperature by the *i*th EOF (*E*_{i}), so that

   $$\boldsymbol{\psi}_i(t_0) = \boldsymbol{\psi}_0(t_0) + E_i. \tag{4}$$

   We scale the EOFs by a factor of 0.4 to try to remain in the linear regime. The spatial domain of the EOFs is extended to all depths and into the Arctic by regressing the appropriate principal components (PCs) against salinity and temperature in these regions. Furthermore, the anomalies are tapered to zero at the southern boundary (20°S) to minimize any sudden unwanted ocean readjustment. The different members of each ensemble are generated as for the control ensemble. The ensemble means of ocean temperature and salinity are denoted by **ψ**_{i}(*t*), where 1 ≤ *i* ≤ *M*.

5. The difference between the mean of ensemble *i* and the mean of the control ensemble is

   $$P_i(t) = \boldsymbol{\psi}_i(t) - \boldsymbol{\psi}_0(t), \tag{5}$$

   where *t* represents *t*_{0} + Δ*t*, following Eq. (3). An approximate (tangent linear) propagator matrix (𝗥) is assumed to model the evolution of the ensemble mean anomalies in the subspace of the leading *M* EOFs. So, from Eq. (3),

   $$\mathsf{E}\,(\mathsf{W}\cdot\mathsf{P})^{T} = \mathsf{R}\,\mathsf{E}\,(\mathsf{W}\cdot\mathsf{E})^{T} + \text{residual}, \tag{6}$$

   where the rows of 𝗘 are the EOFs (*E*_{i}), the rows of 𝗣 are *P*_{i}, 𝗪 is a matrix of weights, and · represents the matrix entry-wise product. In the estimation of the EOFs various weightings are used for latitude, depth, local standard deviation, and contribution to local density (see HS07); 𝗪 is a combination of all these factors and is required to ensure the deep ocean is represented and that each volume of water is treated equally.

6. Equation (6) is solved approximately for 𝗥 by minimizing the residual term. As 𝗣 is lead time dependent, the propagator 𝗥 can be estimated for different lead times.

7. The CSVs (**x**) are then the solutions of

   $$\mathsf{R}^{T}\mathsf{N}\,\mathsf{R}\,\mathbf{x} = \lambda^{2}\,\mathbf{x}, \tag{7}$$

   where 𝗡 is the norm under which the amplification (*λ*^{2}) is maximized. The CSVs, **x**, can then be transformed back into real space (**x**^{T}𝗘). In this analysis, a quadratic norm is used (𝗡 = 𝗜, the identity matrix), although section 3c briefly explores the sensitivity to this choice.

8. To demonstrate that the CSV amplifies as predicted by the linear propagator, the amplified singular vector in real space, (𝗥**x**)^{T}𝗘, can be compared to the pattern that is derived from running an additional ensemble of the full GCM with the same initial conditions perturbed with the CSV.

9. Steps 2–8 can be repeated for a different initial condition from the control integration.
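The procedure above can be sketched end to end on synthetic data. The example below is only an illustration under stated assumptions: the "GCM" is replaced by a hypothetical linear response plus Gaussian noise (`run_ensemble`), the EOFs are random orthonormal vectors, the weights are uniform, and the projection convention (columns holding EOF coefficients) is our choice for the sketch rather than a detail taken from the paper. None of the names or numerical values come from HadCM3.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M, K = 50, 4, 16          # grid points, EOFs retained, members per ensemble

# Hypothetical orthonormal "EOFs" (rows of E) and uniform weights W.
E = np.linalg.qr(rng.standard_normal((n, M)))[0].T      # shape (M, n)
W = np.ones((M, n))

# Hypothetical "true" propagator in EOF space, used only to make fake data.
R_true = np.array([[0.9, 1.5, 0.0, 0.0],
                   [0.0, 0.7, 0.0, 0.0],
                   [0.0, 0.0, 0.5, 0.3],
                   [0.0, 0.0, 0.0, 0.4]])

def run_ensemble(initial_anomaly, noise=0.02):
    """Stand-in for K model integrations: linear response plus 'weather'
    noise. Returns the ensemble-mean anomaly field at one lead time."""
    coeffs = R_true @ (E @ initial_anomaly)           # response in EOF space
    members = coeffs @ E + noise * rng.standard_normal((K, n))
    return members.mean(axis=0)

scale = 0.4                                           # EOF scaling, as in step 4
psi0 = run_ensemble(np.zeros(n))                      # control ensemble mean
P = np.array([run_ensemble(scale * E[i]) - psi0 for i in range(M)])  # rows P_i

# Project onto the EOFs: column j holds the EOF coefficients of ensemble j.
A = E @ (W * (scale * E)).T      # initial perturbations (scale * identity here)
B = E @ (W * P).T                # ensemble-mean responses
# Least-squares solution of B = R A, i.e. minimize the residual.
R_est = np.linalg.lstsq(A.T, B.T, rcond=None)[0].T

# CSVs under the identity norm: eigenvectors of R^T N R.
N = np.eye(M)
evals, evecs = np.linalg.eigh(R_est.T @ N @ R_est)
lam, x = np.sqrt(evals[-1]), evecs[:, -1]
csv_pattern = x @ E              # leading CSV mapped back to real space
print("estimated amplification:", lam)
```

Because the propagator is estimated once per lead time and the norm enters only in the final eigenproblem, both choices can be varied after the ensembles have been run, which is the practical advantage noted in the text.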

This methodology has the advantage that the choice of norm and optimization lead time can be made after the ensembles have been completed; they do not have to be chosen beforehand. The sensitivity to these choices can therefore be explored without further model integrations. However, we note that this process estimates the optimally growing mode in the subspace of the leading EOFs of the ocean variability. It is possible that a more optimally amplifying mode exists that we cannot capture with this approximation.

### d. Choice of parameters

This method thus requires an “ensemble of ensembles,” with a total of *K* × (*M* + 1) members, each run for *T* years, per initial condition. This is potentially an extremely expensive computational task. Previous results with HadCM3 (HS09) indicate that amplification can occur for more than 30 yr in the Atlantic. Each ensemble was therefore chosen to be run for *T* = 40 yr, with *K* = 16 members, to ensure a reliable estimate of the mean response. Computational resources allowed us to run *M* = 8 ensembles for a single initial condition^{3}; these leading eight EOFs account for 46% of the total 3D variance in temperature and salinity. The sensitivity to these choices is explored below.

An operational decadal prediction system would likely only consider singular vectors for, at most, 10-yr lead times, and hence the computational cost could be reduced by at least a factor of 4 in this case. As shown below, this reduced lead time for optimization may also allow smaller ensembles to be run, reducing the computational expense further.
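The cost figures quoted in this section follow from simple arithmetic, which the short sketch below reproduces. The function name is ours, and the eight-member configuration is a hypothetical illustration of a smaller ensemble rather than a setup used in this study.

```python
def model_years(K, M, T):
    """Total simulated years for one initial condition: K members in each of
    M perturbed ensembles plus one control ensemble, each run for T years."""
    return K * (M + 1) * T

full = model_years(K=16, M=8, T=40)    # configuration used here
short = model_years(K=16, M=8, T=10)   # same ensembles, 10-yr lead time only
small = model_years(K=8, M=8, T=10)    # hypothetical: 8 members, 10 yr

print(full, full / short, full / small)   # prints: 5760 4.0 8.0
```

The first number corresponds to the "nearly 6000 yr" of integration used for the single initial condition, and the ratios show how shortening the lead time and halving the ensemble size each reduce the cost.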

## 3. The climatic singular vectors

### a. The leading CSV

The leading CSVs and singular values (*λ*) are estimated for lead times up to 40 yr following the above methodology.

We first consider the leading singular vector, optimized for maximum growth at a lead time of 10 yr, which shows an amplification of *λ* ≈ 2 (see below); this is an example of a perturbation that could be used in an operational decadal prediction system. The optimal initial pattern for temperature, salinity, and density is shown in Fig. 2. Note that the surface perturbation, in both temperature and salinity, has large values in the North Atlantic Current (NAC) region, and in the Nordic and Labrador Seas, which is where deep convection occurs in this GCM (Grist et al. 2007). At deeper levels, the NAC and northern coast of South America regions dominate. Note also that the temperature and salinity fields are substantially density compensating, but temperature tends to dominate. The optimal perturbation has a north–south density gradient in the far northern Atlantic, which is likely to cause an MOC acceleration in this GCM (Thorpe et al. 2001; Dong and Sutton 2005).

### b. Robustness of CSVs

Before examining the amplification in more detail, it is first necessary to establish that the CSVs are robust to the various choices made in the methodology, especially the number of EOFs retained (*M*) and the number of ensemble members used (*K*).

#### 1) Number of EOFs required

The leading singular values (*λ* as a function of lead time), using different numbers of EOFs in their estimation, are shown in Fig. 4, for a fixed *K* = 16. Significant amplification is seen, even with a relatively small number of EOFs, and the amplification seems to be converging for eight EOFs (solid orange line). The shapes of the curves when using different numbers of EOFs are quite consistent, suggesting that the dominant features of the amplification are captured in the leading EOF modes. The double-peak structure suggests that there may be two time scales and mechanisms of amplification, or that we do not have enough ensemble members to adequately describe the growth for lead times longer than ∼15 yr (see below).^{4}

It is also necessary to ensure that the initial CSV patterns themselves are converging. Figure 5 shows the leading CSV pattern for temperature, using different numbers of EOFs, optimized for maximum amplification after 33 yr. Again, convergence of the optimally growing pattern can be seen, with increasing resolution of small-scale features, especially in the North Atlantic Current region. A similar convergence is seen for the salinity field (not shown). The structure of the final amplified state also converges (not shown), although the amplitude of the anomalies increases, consistent with Fig. 4.

#### 2) Number of ensemble members required

Figure 6 shows how the maximal amplification changes for differing numbers of ensemble members (*K*), when using *M* = 8 EOFs. If just four ensemble members are used, then a large maximal amplification is seen (*λ* > 10, dashed gray line). The maximal amplification is smaller when using more ensemble members as the true ocean mean is being captured more effectively. For lead times larger than about 15 yr, the amplification does not appear to have converged even when using 16 members, and so this amplification estimate must be viewed as an upper bound. It can be seen that 8–12 members might be enough for exploring amplification for lead times of 10 yr or less, which would further reduce the computational cost of this type of analysis in an operational decadal prediction system.

KTM03 also found that the singular value converged relatively slowly when adding additional ensemble members, but that the CSV initial pattern converged more quickly. Figure 7 shows how the initial CSV pattern for temperature converges with the number of members used in each ensemble, always using eight EOFs and for a lead time of 33 yr. The similarity of the spatial patterns when using 12 and 16 members suggests that the pattern has converged. A similar convergence is seen for the salinity field (not shown). The structure of the final amplified state also converges (not shown), although the amplitude of the anomalies decreases, consistent with Fig. 6.
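The tendency for small ensembles to overestimate the amplification can be reproduced in a toy setting. In the sketch below, which is purely illustrative, the true propagator is a hypothetical multiple of the identity and sampling error in each ensemble mean is modeled as Gaussian noise whose standard deviation falls as 1/√*K*; the system size, noise level, and true amplification are all assumptions.

```python
import numpy as np

def mean_leading_sv(K, n_trials=200, noise=1.0, seed=0):
    """Average leading singular value of a propagator estimated from
    K-member ensemble means, in a toy 8-dimensional linear system."""
    rng = np.random.default_rng(seed)
    M = 8
    R_true = 1.5 * np.eye(M)          # hypothetical: true amplification is 1.5
    svs = []
    for _ in range(n_trials):
        # Sampling error of an ensemble mean scales as noise / sqrt(K).
        R_est = R_true + noise / np.sqrt(K) * rng.standard_normal((M, M))
        svs.append(np.linalg.svd(R_est, compute_uv=False)[0])
    return float(np.mean(svs))

for K in (4, 8, 16, 64):
    print(K, round(mean_leading_sv(K), 2))
```

In this toy system the estimated amplification decreases toward the true value as members are added, because the maximization picks out whichever direction the residual noise happens to inflate. This is consistent with treating the estimates at long lead times as upper bounds.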

### c. Sensitivity to choice of norm

Although the amplification is explored below for the optimal perturbations already described using the identity matrix as the norm, 𝗡 = 𝗜, we briefly explore the sensitivity of the CSVs to this choice. Following HS09, we consider, first, a norm that specifically finds the optimal perturbations that cause growth in anomalies of the MOC (denoted by 𝗡_{MOC} in HS09) and, second, a norm that maximizes the changes in total variance (denoted by 𝗡_{V} in HS09). In both cases, the optimal initial perturbations and patterns of growth over the first decade are almost identical to using the identity matrix as the norm (not shown), giving us confidence that our growth is robust.

Specific norms that explore growth in just SSTs, or in certain regions of the Atlantic, would produce different results, but exploring them is beyond the scope of this paper.
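How a different norm changes the optimal perturbation can be sketched for a small propagator. The matrix and the restricted norm below are hypothetical stand-ins (they are not the 𝗡_{MOC} or 𝗡_{V} norms of HS09); the restricted norm simply measures the final anomaly in one component only.

```python
import numpy as np

def leading_csv(R, N=None):
    """Leading eigenvector and amplification factor of R^T N R for a
    symmetric, positive semidefinite norm matrix N (identity if omitted)."""
    N = np.eye(R.shape[0]) if N is None else N
    evals, evecs = np.linalg.eigh(R.T @ N @ R)
    return evecs[:, -1], np.sqrt(max(evals[-1], 0.0))

# Hypothetical 3-mode propagator.
R = np.array([[0.9, 1.5, 0.2],
              [0.0, 0.7, 0.1],
              [0.0, 0.0, 1.2]])
N_sub = np.diag([1.0, 0.0, 0.0])   # measure growth in the first mode only

x_I, lam_I = leading_csv(R)        # identity norm
x_s, lam_s = leading_csv(R, N_sub) # restricted norm
overlap = abs(x_I @ x_s)           # 1 would mean identical initial patterns
print(lam_I, lam_s, overlap)
```

An overlap close to 1 between the two initial patterns would indicate that the optimal perturbation is insensitive to the norm, which is the kind of check described for the first decade above.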

## 4. Do the estimated CSVs work?

The CSVs described above are estimated from a linear approximation to the nonlinear GCM. Therefore, the next step in the analysis is to examine whether the calculated CSVs actually produce similar growth when they are added to the same initial conditions and run forward in HadCM3. Note also that the sign of the perturbation is arbitrary as we have assumed linearity, and so we consider both positive and negative projections of the leading CSV.

### a. Scaling the CSV

A key question when using the estimated CSV is choosing the amplitude of the initial condition perturbation. An operational decadal climate forecast system will use an ocean analysis as the initial conditions, but this ocean state will be uncertain. For the CSV to be of optimal use in such a system, the amplitude of the perturbation pattern should be close to the uncertainty in the ocean analysis. Thus, the estimated CSV needs to be scaled to an appropriate amplitude.

Although it is not easy to estimate a scaling rigorously, an estimate can be obtained (G. Smith 2009, personal communication) from the spread between different estimates of ocean analyses produced by different research groups [e.g., the Climate Variability and Predictability (CLIVAR) Global Synthesis and Observations Panel (GSOP)]. This method gives an estimate of approximately 0.4 × the estimated CSV pattern, which results in a perturbation amplitude corresponding to 10%–20% of the interannual variability. In operational NWP, the amplitudes of perturbations are broadly similar in terms of percentage of the daily variability (Toth and Kalnay 1997).
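One simple way to express the amplitude of a scaled perturbation as a fraction of the interannual variability is sketched below. The RMS-based measure, the function name, and the synthetic fields are all assumptions made for illustration; the scaling in this study was derived from the spread between ocean analyses, not from this calculation.

```python
import numpy as np

def relative_amplitude(pattern, interannual_std):
    """RMS of the pattern expressed as a fraction of the local interannual
    standard deviation (an assumed amplitude measure)."""
    return float(np.sqrt(np.mean((pattern / interannual_std) ** 2)))

# Hypothetical fields on a small grid, for illustration only.
rng = np.random.default_rng(1)
interannual_std = 0.5 + rng.random((4, 6))        # always positive
csv_pattern = 0.3 * interannual_std * rng.standard_normal((4, 6))

scaled = 0.4 * csv_pattern                        # scaling factor from the text
frac = relative_amplitude(scaled, interannual_std)
print(f"perturbation amplitude: {100 * frac:.0f}% of interannual variability")
```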

### b. Level of amplification

To ensure a large response, we choose to consider the largest amplifying CSV (i.e., that which amplifies maximally after 33 yr) and run two further ensembles from the same initial state, perturbed with the positive and negative projections of the CSV pattern, scaled by 0.4. Figure 8 shows that the amplification of the positive projection of the leading CSV (red line) tracks the shape of the statistically predicted amplification (thin black line) for around 12 yr, before the curves diverge. The negative projection tracks the statistically predicted amplification for around 25 yr.^{5} We would not necessarily expect the linear approximations made in the estimation of the leading CSV to hold for the very long lead times, but it is reassuring that, for at least a decade, we find that the amplification is well represented by the linear model.

We are most confident about our amplification estimates for lead times up to ∼10 yr (Fig. 6). It is of interest to note that these lead times are also when we are most confident that the actual amplification occurs as predicted (Fig. 8). If more ensemble members were used in estimating the CSVs, then perhaps the lead times for which we are confident about our amplification estimates could be extended, but this would make the procedure more computationally expensive.

### c. Patterns of amplification

It has been shown above that the level of amplification of the largest growing CSV approximately matches the linear prediction for at least 10 yr. However, to be confident about using the estimated CSV, it is also necessary that the patterns of amplification evolve similarly to that predicted by the linear model. Figure 9 shows the predicted temperature fields in the subspace of the leading eight EOFs after 10 yr of amplification (top row), and the actual amplification for both the positive (middle row) and negative (bottom row) projections of the leading CSV. Figure 10 shows the same for the salinity fields.

It is clear from Figs. 9 and 10 that the amplified patterns are very similar to those predicted by the linear approximation. Note that we have scaled the patterns by the appropriate ratios of the estimated amplification factors shown in Fig. 8 for a direct comparison. The largest discrepancies are at the deeper levels, and are slightly larger in salinity than in temperature, but overall these results are encouraging, and give us confidence that our estimates of the CSVs are robust and are working as intended.
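The comparison here is visual, but a similar check could be made quantitative. The sketch below rescales a simulated pattern by the ratio of amplification factors and computes a weighted pattern correlation with the linear prediction; the correlation metric and both function names are our additions and are not part of the analysis in this paper.

```python
import numpy as np

def rescale(actual, lam_predicted, lam_actual):
    """Scale the simulated pattern by the ratio of amplification factors so
    that its amplitude is directly comparable with the linear prediction."""
    return actual * (lam_predicted / lam_actual)

def pattern_correlation(a, b, weights=None):
    """Weighted, centered correlation between two fields on the same grid."""
    a, b = np.ravel(a), np.ravel(b)
    w = np.ones(a.shape) if weights is None else np.ravel(weights)
    w = w / w.sum()
    a_anom = a - np.sum(w * a)
    b_anom = b - np.sum(w * b)
    cov = np.sum(w * a_anom * b_anom)
    return cov / np.sqrt(np.sum(w * a_anom**2) * np.sum(w * b_anom**2))
```

In practice the weights would be chosen to reflect grid-cell volume so that each volume of water contributes equally, in the same spirit as the weighting used for the EOFs.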

### d. Amplification mechanism

We now briefly consider how the amplification of anomalies occurs over the first 15 yr. Figure 11a shows the ensemble mean temperature anomaly of the CSV perturbed integrations, averaged over years 8–12. The growth in the North Atlantic Current region is typical of an MOC response in this GCM and an increase in the strength of the subpolar gyre (e.g., Vellinga and Wu 2004; Dong and Sutton 2005), and as noted in section 3a, the optimal initial perturbation has a density gradient that is likely to induce an MOC response (Thorpe et al. 2001) through enhanced deep water formation. The time scale of amplification seems to be set by the advective time scales (a few years) in the subpolar gyre (also see Dong and Sutton 2005). What processes set the time scales of the propagation at depth remains an open question, though a combination of wave and advective processes is likely.

Figure 11b shows the regression coefficient between ocean temperature and an index of the MOC (defined as the strength of the MOC at 50°N at 1000-m depth to consider the higher latitudes where the amplification occurs), but scaled to show the temperature change expected per 0.5-Sv (1 Sv ≡ 10^{6} m^{3} s^{−1}) increase in the MOC. The patterns closely resemble each other, especially in the deeper levels, suggesting that the change in the MOC is responsible for a significant part of the amplification.
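A regression map of this kind can be computed with a few lines of array code. The sketch below is a generic least-squares regression of a field on an index, scaled to a chosen change in the index; the function name and argument layout (time as the leading axis) are assumptions, and no HadCM3 data are involved.

```python
import numpy as np

def regress_on_index(field, index, per_unit=0.5):
    """Least-squares slope of `field` (time, ...) on `index` (time,),
    scaled to the change expected for a `per_unit` change in the index
    (e.g., temperature change per 0.5 Sv of MOC)."""
    idx = index - index.mean()
    anom = field - field.mean(axis=0)
    slope = np.tensordot(idx, anom, axes=(0, 0)) / np.sum(idx ** 2)
    return per_unit * slope
```

For example, calling `regress_on_index(temperature, moc_index)` with a hypothetical `temperature` array of shape (years, depth, lat, lon) and an MOC index in Sv would return a map with the spatial shape of the field, in kelvins per 0.5 Sv.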

Figure 12 shows the response of the MOC for the mean of years 8–12 after the perturbation, for both positive and negative projections. In both panels there is a basin-wide increase of the MOC of 0.2–0.6 Sv (shown as the response to the positive pattern), and this persists for ∼15 yr (not shown). Although the details differ between the positive and negative projections (e.g., the large anomaly in the positive projection near the latitude of the North Atlantic Current), they are broadly consistent in the sign and magnitude of the MOC response. The differences are possibly due to nonlinearities, such as in the deep water formation processes. In addition, although the MOC was in a fairly neutral state at the chosen start date (Fig. 1b), one might expect a lagged response to the previous slight decline of the MOC, which would make the MOC response depend nonlinearly on the sign of the perturbation.

It is therefore likely that the MOC response to the perturbed density gradients is the primary driver of the amplification. This finding is in line with previous papers examining MOC variability in this GCM, and with HS09, who also found that an anomalous MOC is responsible for amplification of anomalies in HadCM3. Wider climate impacts of this amplification include a northward shift in the intertropical convergence zone (ITCZ) and a decrease in Arctic sea ice, consistent with the MOC amplification (not shown).

## 5. Discussion and conclusions

We have examined the growth of Atlantic Ocean anomalies on decadal time scales in an atmosphere–ocean coupled GCM (HadCM3) to estimate the perturbations that amplify most rapidly. The main findings can be summarized as follows:

By extending the methods of KTM03 to three dimensions and longer time scales, we have demonstrated that initial-state-dependent climatically relevant singular vectors (CSVs) for use in decadal climate predictions can be reliably estimated for a GCM.

For the Atlantic domain in HadCM3, the optimal initial regions for growth are found to be the North Atlantic Current region and the main convection sites in the Labrador and Nordic Seas, especially in the top ∼700 m. These regions are therefore identified as being the areas where additional ocean observations would maximally inform decadal climate predictions, for this GCM and particular initial conditions.

An amplification factor of ∼2 is predicted at a lead time of 10 yr. The levels of amplification found in the full nonlinear GCM are consistent with those predicted by the linear model for around 15 yr, but not on longer lead times.

Growth of the perturbations mainly occurs in situ, and the mechanism of amplification relies on a perturbation of the Atlantic MOC, which initially grows, and then persists for ∼15 yr.

Although resources have only allowed a single initial condition to be investigated here, the methods used could be streamlined by focusing only on shorter lead times (i.e., up to 10 yr) and, hence, also reducing the number of ensemble members required. This could reduce the computational resources required by a factor of ∼8.

We note that our growth estimates have assumed that we can represent the variability with the leading EOFs, but it is possible that a more optimally amplifying mode exists that we cannot capture with this approximation.

In a previous study (HS09) we estimated the optimal perturbations for HadCM3 over the same Atlantic domain using a statistical method, namely linear inverse modeling (LIM). The LIM estimates are independent of the specific initial state and, hence, represent an average optimal perturbation, whereas the CSV approach has estimated perturbations for a single set of initial conditions. However, the identified regions (the far North Atlantic) are similar in both approaches [and also for a LIM analysis of the GFDL CM2.1 GCM; Tziperman et al. (2008)], giving us more confidence in the results. In planned future work we will compare the reliability of the amplifications of the LIM and CSV perturbations in both idealized and operational decadal prediction experiments.

Finally, we note that these approaches may help guide the development of ocean observing systems for prediction purposes. It is suggested that additional observations in the far North Atlantic would be of most benefit for improving decadal climate predictions, such as those planned for the forthcoming IPCC Fifth Assessment Report (AR5; Meehl et al. 2009). It must be noted, however, that uncertainties in the ocean state below 2000 m are greater than near the surface, mainly because the recent growth in Argo observations is limited to the upper 2000 m, and this has implications for validating predictions also. A future study will examine the robustness of these suggested observation regions across a range of GCMs.

## Acknowledgments

We thank the three anonymous reviewers for their constructive comments, which helped improve the paper. The research leading to these results has received funding from the European Community’s seventh framework programme (FP7/2007-2013) under Grant Agreement GA212643 (THOR: “Thermohaline Overturning - at Risk,” 2008–2012), from the U.K. Natural Environment Research Council under the thematic Rapid Climate Change programme and from NCAS-Climate. RS was also supported by a Royal Society University research fellowship.

## Footnotes

*Corresponding author address:* Ed Hawkins, Dept. of Meteorology, University of Reading, Reading RG6 6BB, United Kingdom. Email: e.hawkins@reading.ac.uk

^{1} Moore and Kleeman (2001) discuss why it is important to use correlation EOFs to retain regions with small absolute variability, but from where anomalies can still grow rapidly (e.g., the deep ocean).

^{2} We note that potential global teleconnections from the Antarctic Circumpolar Current or Agulhas region may be missed by restricting the EOFs to northward of 20°S, but this was done because of computational constraints.

^{3} This choice of parameters amounts to nearly 6000 yr of model integration, using around 20 CPU years.

^{4} Alternative explanations could include growth temporarily occurring in covariances of temperature and salinity, causing the apparent amplification to decrease, or even that there is growth in density anomalies through cancellation of temperature and salinity anomalies.

^{5} Note that this ensemble was only run for 31 yr due to computational constraints.