## Abstract

Many subgrid-scale (SGS) parameterizations in climate models contain empirical parameters and are thus data dependent. In particular, it is not guaranteed that the SGS parameterization still helps the model to produce the correct climate projection in the presence of an external perturbation (e.g., because of climate change). Therefore, a climate dependence of tuning parameters is proposed, using the fluctuation–dissipation theorem (FDT). The FDT provides an estimation of the changes in the statistics of a system, caused by a small external forcing. These estimations are then used to update the SGS parameterization. This procedure is tested for a toy atmosphere given by a quasigeostrophic three-layer model (QG3LM). We construct a low-order climate model for this toy atmosphere, based on a reduced number of its empirical orthogonal functions (EOFs), equipped with either an empirical deterministic or an empirical stochastic SGS parameterization. External forcings are considered that are either a local anomalous heat source in the extratropics or a global dynamical forcing represented by individual EOF patterns. A quasi-Gaussian variant of the FDT is able to successfully update the SGS parameterization leading to an improvement in both amplitude and correlation between the low-order climate model and the QG3LM, in case of a perturbed system. The stochastic closure exhibits nearly no improvement compared to the deterministic parameterization. The application of a more sophisticated non-Gaussian FDT algorithm (i.e., the blended short-time/quasi-Gaussian FDT) yields only marginal improvement over the simple quasi-Gaussian FDT.

## 1. Introduction

One key problem in climate modeling is the response of the climate system to an external forcing. In particular, the anthropogenic influence on the climate is of interest. This includes the global mean temperature, the sea level, precipitation patterns, or atmospheric phenomena such as ENSO. The current procedure to estimate such a response is by means of sensitivity studies using general circulation models (GCMs). However, because of the complexity of GCMs, those sensitivity studies are quite expensive.

It might be tempting to use alternatively so-called low-order climate models, based, for example, on optimal basis patterns (Achatz et al. 1995; Kwasniok 1996; Selten 1997; Achatz and Schmitz 1997; Achatz and Opsteegh 2003a,b; Kwasniok 2004, 2007). The result is a reduction of dimension while the overall quality is supposed to stay comparable to that of a regular GCM. Because of their lower state-space dimension such models are potentially less demanding on computing power than the latter. The dimension reduction also entails, however, as in any multiscale system, the use of additional subgrid-scale (SGS) parameterizations that take the effect of neglected basis patterns into account.

Such SGS parameterizations are often data driven (e.g., Achatz and Branstator 1999; Achatz and Opsteegh 2003a,b; Kravtsov et al. 2009). That is, a suitable function is tuned against simulated data (or real observations) to minimize the residual error between the low-order climate model and the data. However, such a data-driven SGS parameterization is inherently dependent on the data against which it is tuned. This was shown by Achatz and Branstator (1999) for a low-order model consisting of a filtered two-layer model projected onto empirical orthogonal functions (EOFs) of a GCM and a simple linear SGS parameterization. The reduced model was successful in capturing the characteristics of the GCM dynamics, such as the average transient eddy fluxes and the statistics of the first and second moment. However, the low-order model was not able to capture the dynamical changes because of the presence of a local anomalous heating in the tropics. A similar result was found by Achatz and Opsteegh (2003b) for a reduced EOF model based on the primitive equations.

This problem is not limited to low-order models. Even state-of-the-art GCMs contain empirical parameters. Those models also employ SGS parameterizations, partially based on physical considerations, but nonetheless are usually highly tuned to the present-day climate. Consequently, they suffer from data dependence as well. One could argue that the effect of data dependence in a physical SGS parameterization is negligible. However, the fact that the tuning of GCMs and climate models is a common and even necessary step underlines the impact of the empirical parameters. One example for the dependence on the empirical parameters is the study of Rockel and Geyer (2008), which shows that regional climate models cannot be transferred to a different climatic zone without retuning.

There are approaches to derive SGS parameterizations directly from model equations. Verkley and Severijns (2014) introduced a SGS parameterization based on maximum entropy principle (Verkley et al. 2016), which does not require any fitting. In the stochastic mode reduction (Majda et al. 2001, 2003) the nonlinear self-interaction of the SGS processes is described by an empirical Ornstein–Uhlenbeck process. This Ornstein–Uhlenbeck process is then used to derive a stochastic SGS parameterization explicitly from the model equations. Franzke et al. (2005) introduced a seamless variant of the stochastic mode reduction that drops the assumption of an empirical Ornstein–Uhlenbeck process. Consequently, the required lag correlations for the stochastic SGS closure cannot be calculated analytically from the Ornstein–Uhlenbeck process but are obtained directly from data. Either way, while there is still some tuning involved, this explicitly derived SGS closure should be more robust to climate change than traditional parameterizations. Nevertheless, the stochastic mode reduction requires a strict time-scale separation that is not necessarily given in (atmospheric) GCMs (Franzke and Majda 2006). An alternative method based on response theory, which instead of a time-scale separation relies on weak coupling, was proposed by Wouters and Lucarini (2012) and used to derive parameterizations (Wouters et al. 2016; Demaeyer and Vannitsem 2017). Recently, Lucarini and Wouters (2017) derived explicit expressions on how this parameterization changes when the background state of the system is perturbed. This response-theory ansatz is conceptually close to the approach we take in this study.

In particular, in this paper we follow the route of Achatz et al. (2013). For cases where the tuning parameters of the SGS parameterization are based on the statistics of the system in question, the authors propose to use the fluctuation–dissipation theorem [FDT; see Marconi et al. (2008) for a recent review] to estimate the change of the statistics caused by an external forcing. In general, the FDT connects a suitable covariance function of a dynamical system at statistical equilibrium with the response of the system, caused by a small external forcing or a sufficiently small change in model parameters (Risken 1984). Since the FDT only depends on a covariance function, no detailed knowledge of the governing equations of the system is required. However, for the FDT to work, several constraints have to be fulfilled. In general, the system in question must be in statistical equilibrium and have a time-invariant probability density function (PDF), which must be differentiable. Although extensions to nonequilibrium systems (Lucarini and Sarno 2011; Ragone et al. 2016) and to time-periodic cases are possible (Majda and Wang 2010; Gritsun 2010), the condition of differentiability is generally violated in systems which exhibit deterministic chaos. Usually, the system attractor of such systems is fractal. The differentiablity can be ensured if some suitable random noise is added to the system to smooth the PDF (Zeeman 1988).

Even though the atmosphere and the climate system are not strictly consistent with the conditions above, Leith (1975) suggested the use of the FDT for an approximation of the climate response. Since then numerous studies have applied the so-called quasi-Gaussian FDT [qG-FDT; Majda et al. (2005), where the equilibrium PDF is assumed to be close to Gaussian] to data from various idealized climate models (e.g., Bell 1980; North et al. 1993; Gritsun and Dymnikov 1999; Gershgorin and Majda 2010; Achatz et al. 2013; Fuchs et al. 2015; Lutsko et al. 2015), simple GCMs (Gritsun and Branstator 2007; Gritsun et al. 2008; Ring and Plumb 2008), and even coupled atmosphere–ocean GCMs (Gritsun and Branstator 2016). However, all studies showed mixed results regarding the FDT performance. It remains unclear why this is the case. A recent study of Hassanzadeh and Kuang (2016) indicated that this could be due to nonnormal response operators. This nonnormality might lead to a strong interaction between resolved and unresolved EOFs which cannot be captured by an FDT-response operator in an EOF subspace. Another possibility is that the forcing might project onto a stable direction of the attractor, resulting in a response that is not covered by the unperturbed model PDF (Gritsun and Lucarini 2017). Furthermore, it is possible that the Gaussian assumption may contribute to the error more than expected. Therefore, Cooper and Haynes (2011) suggest relaxing the Gaussian approximation of the equilibrium PDF by the use of a nonparametric kernel method. Alternatively, Abramov and Majda (2007, 2008, 2009) introduced a blended short-time/quasi-Gaussian (ST/qG-FDT) algorithm that avoids the differentiation of the PDF by use of a tangent linear model. Extensions of non-Gaussian response theory to higher orders in the perturbation amplitude are possible as well (Lucarini and Wouters 2017). While those methods generally yield superior results, compared to the simple qG-FDT, they are significantly more expensive.

Therefore, Achatz et al. (2013) decided to use the qG-FDT. The authors considered a toy atmosphere based on the barotropic vorticity equation on the sphere and a corresponding reduced model based on EOFs. The linear SGS parameterization of the latter model is determined completely by first and second moments of the toy atmosphere. In addition, they used a local anomalous heating located in the midlatitudes to investigate the robustness of the closure caused by a changing climate. Their qG-FDT estimations of the first moments were reasonably good. However, the second moments exhibited a considerably higher error. Consequently, the resulting changes in the tuning parameters of the closure were not usable. Therefore, Achatz et al. (2013) introduced a reduced form of qG-FDT, basically discarding the estimations for the second moments, resulting in reasonably good closure updates. Furthermore, they showed that the use of the reduced-qG-FDT (rqG-FDT) modified closure yielded a better agreement with the reference model than the unmodified reduced model, in the case of a perturbed climate. Moreover, for sufficiently high EOF dimension the reduced model with modified SGS parameterization even outperformed the direct qG-FDT estimation. Nevertheless, because of the negligence of the qG-FDT estimation of the second moments, only a part of the SGS parameterization could be updated. The failure of the qG-FDT could be associated with the constraints of the theorem. The nonlinear dynamics of the smallest-scale processes might not act as a stochastic forcing and thus a sufficiently smooth PDF may not be given, since the toy atmosphere has no clear time-scale separation. Furthermore, only barotropic Rossby waves and no fast synoptic-scale processes (e.g., baroclinic instability) or gravity waves are present in this model.

Consequently, the present study pursues the approach of Achatz et al. (2013) further, however, by applying it to a baroclinic toy atmosphere, given by a quasigeostrophic three-layer model (QG3LM). As it turns out, the increased complexity of the model and the presence of baroclinic instability leads to an overall better performance of the qG-FDT, allowing a successful update of the closure parameters.

This paper is organized as follows. In section 2 the QG3LM and the low-order climate model, based on the leading EOFs of the QG3LM, are introduced. Furthermore, data-driven SGS parameterizations are constructed. The models are perturbed by both local anomalous forcings and global anomalous forcings presented in section 3. Moreover, we recapitulate the qG-FDT and explain the climate dependence of the SGS parameterization in more detail. Afterward we present the results of the experiments in section 4. Section 5 summarizes and discusses the findings and gives some conclusions.

## 2. Models

### a. The quasigeostrophic three-layer model

In this paper the QG3LM of Marshall and Molteni (1993) is considered as a toy atmosphere. The QG3LM is a model of medium complexity containing baroclinic Rossby waves. The model is governed by the quasigeostrophic potential vorticity equation on the sphere:

at the pressure levels of 200, 500, and 800 hPa denoted by *i* = 1, 2, and 3, respectively. Here **q** is the quasigeostrophic potential vorticity; is the streamfunction; denotes the standard Jacobian operator; and **D** is a function containing temperature relaxation, Ekman friction, and hyperdiffusion. Furthermore, ** η** denotes the stretching vorticity,

*f*is the Coriolis parameter, and is the normalized orography (cf. Marshall and Molteni 1993). The constant vorticity forcing

**S**is tuned against 10 winter seasons of reanalysis data from ECMWF (Liu and Opsteegh 1995), which enables the model to simulate a realistic Northern Hemispheric winter.

The QG3LM has a spectral discretization with a triangular truncation of T21. This leads to *N* = 1449 degrees of freedom. The model is integrated forward in time using the leapfrog scheme with a time step of . For all following results a daily model output is used. Furthermore, for all integrations the first 10 000 days have been discarded to eliminate spinup effects.

Equation (1) can be rewritten by the introduction of a state vector , that is,

where the function represents the right-hand side of (1).

### b. The low-order climate model

For the construction of the low-order climate model the QG3LM is projected onto EOF space. In particular, the deviation of the state vector from its time mean (i.e., , where denotes a time average) is projected onto the leading *M * EOFs. Therefore, the state vector is expressed as

where is the so-called principal component vector, is a matrix containing the leading *M* EOFs **e** as columns, and is the time-dependent truncation error.

The EOFs are defined by the eigenvalue problem:

where denotes the explained energy variance of the QG3LM by the *k*th EOF and is the total energy metric of Ehrendorfer (2000). The exact formula of is given in the appendix. The EOFs have been calculated from a 100 000-day integration of the QG3LM. It turns out that 540 EOFs are necessary to explain 90% of the variance of the analyzed data. Spectra with this kind of flatness are typical when an energy metric is applied to data that are not filtered in time (e.g., Achatz and Branstator 1999; Achatz and Opsteegh 2003a).

where **s** is the SGS tendency error resulting from the truncation error within .

A suitable parameterization is required to address the SGS error. This results in the low-order model equation:

where and denotes the parameterization error [i.e., ]. To address the SGS error, we choose a linear deterministic parameterization (Achatz and Branstator 1999; Achatz and Opsteegh 2003a), that is,

where and are a constant vector and matrix, respectively. Minimizing the norm of the parameterization error (i.e., ) yields as optimal closure parameters:

where denotes the SGS tendency error approximated by centered differences in time. Thus, the closure is completely determined by first and second statistical moments of the QG3LM. For the calculation of the closure parameters **r** and , a 6 × 10^{6}-day integration of the QG3LM is used to safely exclude sampling errors. This huge amount of data is not necessary for a reliable application of the FDT, but needed for the convergence of the response due to the external forcing (see section 5). Hence, only for consistency we use a time series of the same length here.

Furthermore, we consider a stochastic parameterization given by an Ornstein–Uhlenbeck process:

where is a constant diagonal matrix and denotes an increment of a Wiener process. The noise in (12) (and in all following equations) is interpreted in the sense of Stratonovich (i.e., as a limit of physical noise with vanishing memory). Using a maximum likelihood approach (Honerkamp 1994) to estimate the closure parameters, we obtain for the deterministic part of (12) the same equations as for in (11), whereas the optimal noise amplitude is given by

In conclusion, the low-order climate model with parameterization reads as follows:

where in the case of a deterministic closure and **Σ** given by (13) in the case of a stochastic parameterization . Thus, for the remainder of this paper we write the reduced model in this general form.

Generally, such semiempirical low-order models tend to overestimate the variance (Achatz and Branstator 1999; Achatz and Opsteegh 2003a). To counter this behavior, we tune the hyperdiffusion of the reduced model by adjusting the diffusion time scale (see Marshall and Molteni 1993) to minimize the relative error [see (25)] of the variance. The numerical values for both the deterministic and stochastic parameterization are given in Table 1.

Figure 1 displays the performance of the reduced model with deterministic parameterization in the 500 EOF case, in terms of both the mean streamfunction (top) and the variance (bottom) of the streamfunction. In particular, Figs. 1a and 1c show the result of the QG3LM at 200 hPa (where the response is strongest among all three layers) projected onto the 500 EOF space, whereas Figs. 1b and 1d show the equivalent result of the reduced model in (14). In general, the parameterization works quite well: the mean is virtually identical, whereas in the variance only minor differences are visible. For smaller EOF truncations of the low-order models as well as for the models with stochastic parameterization, the results are qualitatively similar (not shown).

## 3. Climate-dependent parameterization

### a. External anomalous forcing

The low-order model with parameterization reproduces both qualitatively and quantitatively the QG3LM. However, it remains to be seen how well the reduced model responds to an external forcing. For this purpose we use a time-independent local anomalous heating, representing the effect of a sea surface anomaly resulting from a change in the ocean circulation (Branstator and Haupt 1998; Achatz and Branstator 1999; Achatz and Opsteegh 2003b; Achatz et al. 2013). As a quasigeostrophic potential vorticity forcing , the discretized heating can be written as

where is the amplitude of the heating, *R* is the universal gas constant, denotes the pressure difference of the respective layers, is the surface pressure, is the Coriolis parameter at 45°N, denote the position at which the forcing is centered, and is the horizontal extension of the forcing. For and the anomalous forcing is set to zero. The pressure in between the layers are given by and , respectively. Furthermore *r* denotes the Rossby deformation radius [see Marshall and Molteni (1993) for the exact values].

In contrast to Branstator and Haupt (1998), the anomalous heating is placed in the midlatitudes since the QG3LM exhibits relatively small variances at the equator, resulting in a poor EOF representation of an anomalous forcing placed there. Therefore, for all following experiments the latitude position is fixed at , whereas the longitude position varies at . The most compact model to be investigated is the 20-EOF model. Hence, the anomalous forcing has always been projected onto the first 20 EOFs. Projection onto more EOFs also increases the response amplitude, resulting in nonlinear behavior of the response, especially in the second moments. Perturbing the system with the same external forcing but with opposite signs (i.e., ) should result in a pattern correlation of −1 if the response is purely linear. However, even with restriction onto the first 20 EOFs, this test shows an average pattern correlation of only about −0.75 for the second moments. On the other hand, the first moments exhibit on average a correlation stronger than −0.95.

The left column of Fig. 2 shows the equilibrium response of the QG3LM at 200 hPa due to a local anomalous heating at , calculated from a 6 × 10^{6}-day time series. Figure 2a displays the response in mean streamfunction, whereas Fig. 2c shows the response in mean zonal wind. The maximum values in wind speed are about 2 m s^{−1}, which is comparable to the expected change in mean zonal wind speed caused by anthropogenic climate change (Lorenz and DeWeaver 2007).

It might be a strong simplification to model a climate change by a single local anomalous forcing placed in the extratropics. One would rather expect that climate change manifests on a global scale. Therefore, in order to drop the locality in the external forcing and thus be more realistic, we are also considering global anomalous forcings given by individual EOFs multiplied by a small constant, that is, (14) is supplemented by

where and . The values of are chosen in such a way that the turbulent energy in the response of the global anomalous forcing is comparable to that of the local anomalous forcing (*E*_{turb} ≈ 470 TJ kg^{−1}). The exact numerical values are given in Table 2.

The right column of Fig. 2 shows, as a representative example, the response to the global anomalous forcing of EOF 1. The response is similar to the response of the local anomalous forcing. Overall, the amplitude is slightly weaker with maximum zonal wind speeds of about 1.5 m s^{−1}.

### b. The perturbed low-order climate model

The simplest ansatz to implement an anomalous forcing in the reduced model is the a priori (i.e., the naïve) low-order model:

where is either given by (16) or (where denotes projection onto the first 20 EOFs) in the case of a global anomalous forcing or local anomalous forcing, respectively.

The performance of the a priori model is illustrated in Figs. 3a and 3b. Shown is the response in variance of streamfunction for a local anomalous forcing placed at . Figure 3a shows the result of the QG3LM projected onto the first 500 EOFs; Fig. 3b shows the result of the 500 EOF a priori low-order model in (17). In the response of the a priori low-order model the minima over the Pacific Ocean and Atlantic Ocean are barely visible. Furthermore, the amplitude of the maxima over the Pacific Ocean is too strong. On the other hand, the minimum over Greenland is too pronounced. Still, the pattern correlation is reproduced well. Qualitatively similar results are found for the global anomalous forcings (not shown).

The reason for the incorrect response of the a priori low-order model is the dependence of the closure parameters on the training dataset, which is from the unperturbed model (Achatz et al. 2013; Lucarini and Wouters 2017). The actual reduced climate model is given by

where , , and are the changes in the SGS parameterization resulting from the altered climate of the model. In particular, the updated closure parameters read as follows:

where and . Consequently, if the changes in the statistical moments (i.e., , and ) are known, it will be easy to update the SGS parameterization.

One way to obtain the changes in the statistical moments is to directly analyze data from the perturbed QG3LM. Using the resulting a posteriori parameterization clearly fixes the issues of the a priori model. This can be seen in Fig. 3c. Similar results as for the response in the variance are found for the response in the mean streamfunction (not shown), although, in this case, already the a priori low-order model reproduces the response quite well.

Nevertheless, a recalculation of the QG3LM with the anomalous forcing contradicts the very idea behind efficient low-order climate models. In addition, the problem outlined in Figs. 3a and 3b is not only limited to low-order climate models. All models containing empirical tuning parameters suffer from data dependence, including state-of-the-art GCMs, paleoclimate models, and many SGS parameterizations. However, for climate projections it is impossible to retune the empirical parameters against a new dataset. Therefore, other approaches are necessary.

### c. Fluctuation–dissipation theorem

The climate dependence of the SGS parameterization in (18) is introduced by the FDT, as suggested by Achatz et al. (2013). In this study we focus mainly on the conventional qG-FDT. However, to evaluate the approximation of Gaussianity we additionally investigate selected cases with the more sophisticated ST/qG-FDT.

The qG-FDT states for the steady-state response of an arbitrary function of the state in the reduced EOF space (Gritsun et al. 2008)

where is a constant forcing. A detailed derivation of expression (22) can be found, for example, in Risken (1984). For practical purposes we integrate (22) up to 100 days. The calculation of the lag integral is done via the efficient “Cooper–Haynes” algorithm presented in the appendix of Lutsko et al. (2015), however, here applied to the Simpson’s rule.

In addition to the qG-FDT, we also apply the blended ST/qG-FDT algorithm of Abramov and Majda (2008) to selected cases. In contrast to the qG-FDT, the ST-FDT makes no assumption on the shape of the PDF of the system. Thus, the resulting response is generally more accurate than (22). The price for the higher accuracy, however, is the need of the tangent linear model. However, since the tangent linear model is inherently unstable the ST-FDT response is only useful for short times (up to ). Therefore, Abramov and Majda (2008) propose to combine the ST-FDT and qG-FDT by replacing the qG-FDT response operator in (22) with the ST-FDT response operator at short times, while keeping the qG-FDT operator for longer times. The resulting blended ST/qG-FDT reads as follows:

where the tangent linear model , initialized with and integrated to time *τ*, is calculated solving

and is the Jacobian of the right-hand side of the QG3LM in (3) in the full EOF space. Furthermore, in (23) , where depends on the dimension of **h**. Both the instability of the tangent linear model and the ambiguous choice of the parameter , however, limit the robustness of the blended ST/qG-FDT algorithm. Here , determined by trial and error (i.e., by comparing the response of the qG-FDT and ST-FDT for various ).

## 4. Experiments

### a. Setup

For the test of the FDT-adjusted closure, we consider 12 different cases, corresponding to the different positions of the local anomalous forcing (see section 3a). In the case of the global anomalous forcing we use the first five EOFs as stated in (16). For the computation of the FDT response operators the same 6 × 10^{6}-day time series is used as for the calculation of the closure. Furthermore, we run the QG3LM with the anomalous forcing for each case. With the resulting perturbed time series the actual response of the respective moments are calculated to quantify the performance of the FDT algorithms. We restrict our investigation to five low-order models with truncations of 20 (30% explained variance), 50 (46%), 100 (60%), 200 (73%), and 500 (89%) EOFs, respectively.

For the evaluation of the FDT-adjusted closure we consider the following four different low-order models:

a priori: no climate-dependence as in (17).

qG-FDT: closure corrections calculated by qG-FDT.

a posteriori: the true closure corrections, obtained from a run of the QG3LM with anomalous forcing.

rqG-FDT: as the qG-FDT but with (Achatz et al. 2013).

As it turns out, no useful closure updates can be obtained for **Σ** for the local anomalous forcing. In this case, we could keep **Σ** constant and set . However, this might suppress the effect of the update of the remaining part of the closure with the FDT. Therefore, for the local anomalous forcing we consider only reduced models with a deterministic parameterization (i.e., ). To test the effect of the stochasticity we focus on the cases with the global anomalous forcing.

For the quantification of the results we calculate for and a relative error given by

where the norm is either the 2-norm (if the argument is a vector) or the Frobenius norm (if the argument is a matrix).

### b. Results

#### 1) Local anomalous forcing

Figure 4 (top) shows a boxplot of relative error of the various statistical moments entering (20), summarizing the qG-FDT predictions for all local anomalous forcing cases and each EOF truncation. Clearly the qG-FDT is unable to provide a correct estimate of for any EOF truncation. Apart from that, the remaining first moments (i.e., and ) are predicted remarkably well with an average median of around 0.1 for all truncations. The median of relative error of the second moments (i.e., and ) are on average 0.3–0.4, considerably higher. In general, the performance of the qG-FDT for and is independent of the EOF truncation. In fact, for higher EOF truncations the spread of relative error decreases for both moments (although, for this is not visible). However, all moments containing the discretized SGS error (i.e., , , and ) exhibit an increase in both median and spread with rising EOF truncation.

Figure 4 (bottom) shows the equivalent evaluation of the closure corrections in (20) resulting from the qG-FDT estimation of the statistical moments. The constant term has the best result with an average median of 0.15 for all EOF truncations. In contrast, the linear term has a considerably higher error in the median that even reaches for the 500 EOF low-order model. Generally, the quality of the qG-FDT predicted closure corrections decrease with increasing EOF dimension. This is due to the increase of the error of (see Fig. 4, top) which directly influences . Consequently, this trend translates to as well, since it is dependent on in (20). The same holds for , although the high error of renders the closure update of the noise amplitude useless for all EOF truncations.

Figure 4 is not sufficient for a final evaluation of the qG-FDT-predicted closure corrections. The utility of the method is eventually decided by the performance of the reduced models including the adjusted closures. However, before showing summarizing plots, we first focus on a representative example, given by the 500 EOF low-order model with a local anomalous forcing located at .

Figure 3 shows the response in variance of streamfunction of the QG3LM projected onto 500 EOFs (Fig. 3a), three low-order models with different deterministic SGS parameterizations (Figs. 3b–d), and the direct qG-FDT estimation (Fig. 3e). The QG3LM shows mainly a response in the Northern (winter) Hemisphere. In particular, the response of the QG3LM exhibits a band of maxima at roughly 60°N, spanning nearly the whole globe. Furthermore, multiple minima are located over the Pacific Ocean, the Atlantic Ocean, and Greenland. Overall, the pattern correlation of all reduced models and the direct qG-FDT estimation are, at about 0.90, quite high. Yet, prominent differences are visible in the amplitude. The a priori low-order model (Fig. 3b) shows discrepancies as described in section 3b, resulting in a relative error of . The a posteriori low-order model (Fig. 3c) fairs better with , because of stronger minima and a slightly reduced maximum over the Pacific Ocean. The response of the low-order model with the qG-FDT-adjusted SGS parameterization (Fig. 3d) is virtually identical to the a posteriori low-order model that can also be seen from the relative error of . In contrast, the direct qG-FDT estimation (Fig. 3e) overestimates the amplitude of the response by a factor of 2 resulting in a relative error of .

Similar results are found for the meridional momentum flux (Fig. 5) and the meridional entropy flux (not shown), although for these fluxes the estimation using the qG-FDT directly also provides useful results.

Finally, we summarize the results of the reduced models with adjusted closures in Fig. 6. Figure 6 has a similar structure as Fig. 4; however, it shows the performance of the response of the first moment (, left panel) and covariance (, right panel) of the streamfunction of the reduced models with the various adjusted deterministic closures and the direct qG-FDT estimation, compared to the QG3LM. In general, for the response in the first moment the a priori low-order model with the unmodified closure performs the worst for all EOF truncations, whereas the a posteriori model yields generally the best result of all low-order models. Furthermore, we see a systematic improvement for the low-order model with qG-FDT-adjusted SGS parameterization compared to the a priori low-order model and the model with the rqG-FDT-adjusted closure. Still, for EOF truncations smaller than 500 EOFs, the direct qG-FDT yields by far the best correspondence with the actual response. However, the higher the EOF truncation, the better fair the reduced models in general. In fact, the qG-FDT-adjusted low-order model with 500 EOFs yields a comparable result to the direct qG-FDT estimation.

The finding of the response of the covariance (right panel) is qualitatively similar, independent of the EOF truncation. However, the results are useful only for low-order models with more than 200 EOFs (i.e., ). Additionally, for 500 EOFs we find that the reduced model with adjusted closure begins to outperform the direct qG-FDT estimation of the response in covariance. Although, in those cases even the a priori low-order model yields a better result than the qG-FDT.

Even though the FDT estimations of the change of the statistics are linear, we can significantly improve the description of the perturbed dynamics by the low-order model compared to the a priori case. In particular, we do not use the FDT estimations directly but apply them on the empirical parts of the SGS closure. This allows us to account for the nonlinearity explicitly when running the low-order model.

#### 2) Global anomalous forcing

Figure 7 shows the qG-FDT estimation of the changes in the statistical moments and the skill of the resulting closure updates, in the case of the global anomalous forcing. Qualitatively, the findings in the case of the local anomalous forcing are reproduced. However, the overall performance of the qG-FDT estimations is increased. In general, the spread of estimates is significantly reduced for all statistical moments. In the second moments, the average median of the relative error of is below 0.2, whereas the relative error of is basically unchanged in comparison to the local anomalous forcing case. In addition, the estimation of is comparable to the second moments, with an average median of 0.15. Furthermore, the remaining first moments exhibit on average a relative error below 0.1. Interestingly, the quality of the resulting closure corrections are (apart from ) nearly identical compared to the local anomalous forcing case. This indicates that the considered SGS parameterization is highly sensitive to . Nevertheless, the good estimation of allows a useful update of , which was not possible for the local anomalous forcing case. Consequently, the stochastic parameterization in (12) can be studied under the perturbation with a global forcing.

For direct comparison we first show in Fig. 8 the results of the low-order models with adjusted deterministic closures in the case of the global anomalous forcing. In general, the results of the local anomalous forcing are reproduced, however, the performance of all low-order models and the direct qG-FDT is systematically better. In particular, the spread is significantly reduced. Figure 9 shows the reduced models with adjusted stochastic parameterization in the case of the global anomalous forcing. Overall, the performance is qualitatively similar to the deterministic case. A slight improvement in terms of spread is visible for all reduced models and all EOF truncations. Nevertheless, the net effect by the introduction of the noise term is quite small. The results of the low-order models with a posteriori and rqG-FDT closure are not shown, since their behavior is qualitatively the same as in Fig. 6.

#### 3) ST/qG-FDT

To estimate the error of the Gaussian approximation in the qG-FDT, we predict the changes in the statistical moments of the 20 EOF and 200 EOF case using the ST/qG-FDT by Abramov and Majda (2007). The result is given in Fig. 10 (top). Generally, the performance of the ST/qG-FDT is qualitatively similar to that of the qG-FDT (Fig. 7). For both FDT algorithms, the first moments are significantly better estimated compared to the second moments. Furthermore, the quality of the response of the first moments are equivalent. Nevertheless, we see an improvement in the ST/qG-FDT estimations in terms of spread for the second moments, in particular for . However, running the reduced models with ST/qG-FDT-adjusted closure yield similar results compared to the models with qG-FDT-updated closure in Fig. 6 (not shown). This might indicate an inherent issue of the ST/qG-FDT. However, the median of estimated by ST/qG-FDT (Fig. 10) stays roughly the same compared to the qG-FDT estimation (Fig. 7). It turns out that for our simulations the deterministic part of the SGS parameterization is highly sensitive toward this specific moment. Thus, this might explain the lack of improvement in the low-order models with ST/qG-FDT-adjusted closures.

## 5. Summary and discussion

In this study we addressed the problem of data dependence of empirical tuning parameters and the resulting inability of models containing such parameters to respond correctly to external perturbations. We considered the QG3LM of Marshall and Molteni (1993) as a toy atmosphere. In addition, we constructed a low-order climate model based on EOFs and equipped with a purely data-driven deterministic or stochastic SGS parameterization. While the low-order model with both closures reproduced the statistics (in terms of both amplitude and correlation) of the QG3LM correctly, it generated an incorrect response if the system was perturbed by an anomalous forcing. This is a known issue of (semi) empirical low-order models (Achatz and Branstator 1999; Achatz and Opsteegh 2003b), which is because the parameterization is tuned to the unperturbed statistics. Consequently, such low-order models are unsuitable for climate projections. However, we want to emphasize that this is a fundamental problem in climate modeling, which is not only limited to low-order climate models. Even physical parameterization schemes such as radiation or convection contain to some degree empirical components (e.g., aerosol distribution, vegetation, or entrainment of convective cells). One could hope that the physical basis should make those closures more robust to climate change than their purely data-driven counterparts. However, there are indications that physical SGS closures suffer from data dependence as well. For example, Rockel and Geyer (2008) show that a regional climate model, tuned to simulate regional climate over middle Europe well, yields incorrect results if applied over South America without retuning. Schirber et al. (2015) show that several gravity wave parameterization schemes can be tuned successfully to help a GCM simulate a realistic quasi-biennial oscillation. Yet, in a climate-change experiment the response of the quasi-biennial oscillation depends strongly on the chosen scheme.

The present study follows the ansatz of Achatz et al. (2013) by tackling the problem by means of the FDT, which is related to a corresponding response-theory approach of Lucarini and Wouters (2017). If the empirical parameters are based on a minimization of an objectively formulated error (i.e., it depends on the statistics of the system), a priori estimations of the changes in the corresponding statistics could help reduce the data dependence. The study of Achatz et al. (2013), however, potentially suffered from the barotropic toy atmosphere used there, with relatively weak high-frequency variability, and hence limited applicability of stochastic approaches. The hypothesis is that a setting with baroclinic instability and corresponding synoptic-scale activity should be better suited to an FDT ansatz. For a corresponding test, this study focuses on the simplest form of FDT: the qG-FDT. In addition, we considered two types of perturbations, namely, local anomalous forcings (localized heat sources) and global anomalous forcings, represented up to a factor by individual EOFs.

For both anomalous forcing types, the estimations of the response of the first moments were remarkably accurate. The estimations of the second moments exhibited a systematically higher error. This is consistent with Gritsun et al. (2008) who also show a generally higher error for second moments. In contrast to the literature (e.g., Gritsun and Branstator 2007; Gritsun et al. 2008), the problem in this study is even more challenging, since the reference used for the evaluation of the FDT is the true a posteriori response of the perturbed QG3LM, including all nonlinearities. The large amount of data used in this study allows us to exclude a convergence problem. Moreover, we analyzed the response in the second moments with respect to its linearity. There we found no link between the (non)linearity of the response and the quality of the FDT estimations. At least for a transient response, Majda et al. (2005, p. 68) proved that the qG-FDT estimation of the first moment is more accurate than that of the second moment .

Furthermore, we found that the statistical moments containing the discretized SGS error **s** (i.e., , , and ) experienced a trend, that is the error of the qG-FDT estimation increased with higher EOF truncations. This could be due to a larger contribution of the stable manifold (Gritsun and Lucarini 2017) at higher EOF truncations. On the other hand, it can also be understood by observing that the size of the SGS error gets smaller as the EOF truncation is relaxed. This results in a small signal-to-noise ratio for moments containing the discretized SGS error. Repeating the experiments with only half of the time series shows an increase in this trend.

The qG-FDT estimations of the statistics were used to update the closures. Here we distinguished four cases: the a priori low-order model (no change in the parameterization), the a posteriori low-order model (retuned parameterization), the qG-FDT low-order model (closure updated with the qG-FDT estimations), and the rqG-FDT low-order model [closure updated with qG-FDT estimations of first moments only; see Achatz et al. (2013) for details].

In general, for the response in mean and covariance of streamfunction the update of the deterministic closure by the qG-FDT yielded a better agreement with the QG3LM compared to the a priori low-order model. However, for the response in covariance of streamfunction, the result of the low-order models with EOF truncations below 200 EOFs were not useful (i.e., ). In those cases, the simple linear closure does not suffice to describe the SGS processes. For the 500 EOF low-order models we found that the qG-FDT-updated closure starts to outperform the direct qG-FDT estimation of the response in both mean and covariance. Nevertheless, even a comparable quality to the direct qG-FDT result is already a success, since the low-order model provides a time series that can be further analyzed, whereas the FDT only provides a response.

Overall, the qG-FDT estimations of the changes of the statistics worked better than in Achatz et al. (2013). This was to be expected since the QG3LM with its relatively fast baroclinic instability better fulfills the constraints of the qG-FDT. Consequently, our simulations show that discarding the FDT estimations of the second moments (i.e., using the rqG-FDT low-order model) yields inferior results compared to the qG-FDT low-order model, for all considered cases.

If the forcing projects only on one of the first five EOFs (i.e., global anomalous forcing) instead of a combination of the first 20 EOFs (i.e., local anomalous forcing), the FDT performs significantly better. Lutsko et al. (2015) argue that taking more EOFs into account results in higher uncertainty of the response operator, since higher EOFs are inherently noisier. However, we consider quite long time series, so one would expect that the first 20 EOFs are sufficiently well sampled. Furthermore, we are confident that enough data were used to construct the FDT response operators. In fact, using only 3 × 10^{6} (or even fewer) days to generate the qG-FDT operators and comparing them to the 6 × 10^{6}-days a posteriori response yielded nearly the same results as presented in this paper. Moreover, if we split up the 6 × 10^{6}-days time series into six parts and, thus generate an ensemble of qG-FDT operators, we can show that the mean operators are quite similar to the operators used in this study, while the ensemble spread is quite small. Thus, it seems that the worse performance of the FDT in case of the local anomalous forcing is due to its “hot spot” nature, which is in agreement with findings of Fuchs et al. (2015).

Because of the overall worse performance of the FDT in the case of the local anomalous forcing, we were unable to obtain useful updates for the noise amplitude in the stochastic closure. Therefore, only for the global anomalous forcing could a stochastic parameterization be considered. Qualitatively, the reduced models with stochastic closure produce similar results compared to the deterministic case described above. However, quantitatively no significant improvement is evident.

The superior ST/qG-FDT blended algorithm, which was exemplary applied on the 20 EOF and 200 EOF case, yielded an improvement, especially for . However, the SGS parameterization used in this study is highly sensitive to . Therefore, the resulting closure correction was only marginally better compared to the qG-FDT result. Apparently, the derivations from Gaussianity are negligible in our study. Yet, because of the calculation of the tangent linear model, the computational cost of the ST/qG-FDT algorithm is significantly higher than that for the simple qG-FDT. Thus, we see our comparatively simple approach (i.e., using the qG-FDT) supplementary to the more sophisticated ST/qG-FDT, at least if one is interested in the equilibrium response of the model.

All in all, the FDT provides useful estimations of the changes of the statistics in a perturbed climate, which allows for a successful update of the empirical parameters in an atmospheric model. However, since the FDT is a linear response theory, this method can only be applied to small forcings. Furthermore, it is restricted to forcings which project into unstable directions of the attractor (Gritsun and Lucarini 2017). Still, FDT offers a first step toward climate-dependent tuning parameters in data-driven SGS parameterizations. One possible route to reduce further the climate dependence of parameterizations would be to apply the present approach within the stochastic mode reduction procedure (Majda et al. 2001, 2003). If, instead of the seamless approach (Franzke et al. 2005; Franzke and Majda 2006), the coefficients of the Ornstein–Uhlenbeck process are estimated directly (Dolaptchiev et al. 2013a,b; Zacharuk et al. 2018), the latter can be updated using the FDT. For future studies, more complex models (e.g., GCMs), containing unbalanced motion (e.g., gravity waves) or moisture should be considered. In addition, it might be interesting to study whether the FDT is able to estimate the changes caused by a realistic external forcing. A related open issue is how the FDT (or other response-theory approaches) might be used to predict the sensitivity of tuning parameters of standard physical SGS parameterizations (e.g., convection, turbulence, gravity waves, and clouds) to external conditions. Since climate change might operate to a considerable degree via such SGS processes, this could eventually be quite relevant for sensitivity studies of the climate system as a whole.

## Acknowledgments

The authors thank Valerio Lucarini and the two anonymous reviewers for their helpful comments, which led to an improvement of this manuscript. MP thanks Erwan Brisson for the useful discussion. MP and UA thank German Research Foundation (DFG) for partial support through Grant AC 71/7-1. SD and MZ thank the DFG for partial support through Grant DO 1819/1-1. AG received support from the Russian Foundation for Basic Research (Grant 16-55-12015).

### APPENDIX

#### Total Energy Metric

The state vector is defined as

where is the spectral expansion coefficient of the streamfunction , *m* is the zonal wavenumber, *n* is the total wavenumber, and denotes the real (imaginary) part of the coefficient, respectively. Furthermore, the boundary conditions of the QG3LM leads to trivial components for and zero imaginary components for and (Ehrendorfer 2000). Consequently, the state vector can be reduced to

where the maximum wavenumbers of 21 are given by the spectral truncation of T21. The total energy metric by Ehrendorfer (2000) is defined as

where the matrices on the diagonal are given by

Here for and is equal to 1 otherwise; *a* = 6370 km is Earth’s radius; and and are the Rossby radii of deformation of the 200–500- and 500–800-hPa layers, respectively. In combination with the state vector in (A2) this energy metric yields

that is, the total energy of the flow.

## REFERENCES

*Stochastic Dynamical Systems: Concepts, Numerical Methods, Data Analysis.*Wiley-VCH, 535 pp.

*Stochastic Physics and Climate Modeling*, T. Palmer and P. Williams, Eds., Cambridge University Press, 35–72.

*Information Theory and Stochastics for Multiscale Nonlinear Systems*. CRM Monograph Series, Vol. 25, American Mathematical Society, 133 pp.

*The Fokker-Planck Equation: Methods of Solution and Applications.*Springer-Verlag, 454 pp.

*Quart. J. Roy. Meteor. Soc.,*https://doi.org/10.1002/qj.3396, in press.

## Footnotes

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).