## 1. Introduction

Data assimilation is the process of combining dynamical models and observations to estimate the state of a dynamical system (Kalnay 2002; Evensen 2009; Majda and Harlim 2012). There are many data assimilation algorithms, and ensemble Kalman filters (EnKFs; Evensen 1994; Burgers et al. 1998; Houtekamer and Mitchell 1998) are a class of algorithms that are particularly amenable to applications in atmospheric and oceanic data assimilation (e.g., Houtekamer et al. 2005; Whitaker et al. 2008; Szunyogh et al. 2008; Houtekamer et al. 2009). Computational costs prevent ensemble data assimilation systems from using models that resolve all the active scales of the atmosphere or ocean. The use of low-resolution models leads to two kinds of error: forecast model error and observation model error, the latter being associated with the contribution of subgrid scales to the observations, and known as representativeness error, or representation error. The present investigation is focused on forecast model errors associated with low-resolution models.

Model error can come from many sources (e.g., incorrect parameters); the focus here is on two kinds of model error associated with low resolution: the numerical/truncation error in modeling large-scale self-interaction and the error associated with subgrid-scale interactions. The true state of the dynamical system can be partitioned into a large-scale part that is represented on the low-resolution model grid and a subgrid-scale part that is not represented. The low-resolution model attempts to predict the evolution of the large-scale part. Even when the subgrid-scale part is zero, there are model errors associated with truncation errors in the numerical discretization of the large-scale dynamics; this type of model error can be reduced by increasing the accuracy of the numerical discretization. Of course, the subgrid scales are generally not zero and they influence the evolution of the large scales. Subgrid-scale parameterizations attempt to model the effect of the subgrid scales on the large scales, and improving subgrid-scale parameterizations can reduce large-scale model error. Nevertheless, because the state of the subgrid scales is not precisely known there is in principle a nonzero minimum of possible model error due to the uncertainty concerning the state of the subgrid scales; information barriers of this type are discussed by Branicki and Majda (2012).

Model errors need to be accounted for in ensemble data assimilation algorithms, and one common approach in the context of EnKFs is covariance inflation, either multiplicative (Anderson and Anderson 1999) or additive (Mitchell and Houtekamer 2000). Multiplicative inflation multiplies the prior ensemble perturbations by an inflation factor that can vary in space and time. The inflation factor can be hand-tuned, which involves considerable computational expense, or it can be adaptively estimated as part of the filtering algorithm (e.g., Anderson 2007, 2009; Li et al. 2009; Miyoshi 2011). Additive inflation is implemented by drawing random samples from a specified model error distribution, and adding these to the prior ensemble before the analysis update. It assumes that model error is independent of the model state, which is convenient but typically erroneous. Methods of estimating the model error distribution are discussed by Zupanski and Zupanski (2006). Additive inflation is better suited to accounting for model error than multiplicative inflation (Whitaker and Hamill 2012), though the latter can mitigate the effects of many different kinds of errors in ensemble filtering, like sampling errors. Covariance inflation techniques attempt to account for model error in the analysis, but they do not reduce the model error.
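As a minimal sketch of the two inflation operations just described, the following Python fragment applies them to a toy ensemble stored as an array of shape (members, state variables). The function names, the toy sizes, and the model error sampler are hypothetical, and reading "10% inflation" as a factor of 1.10 applied to the perturbations (rather than to the variance) is an assumption.

```python
import numpy as np

def multiplicative_inflation(ensemble, factor):
    """Scale the perturbations about the ensemble mean by `factor`."""
    mean = ensemble.mean(axis=0, keepdims=True)
    return mean + factor * (ensemble - mean)

def additive_inflation(ensemble, sample_model_error, rng):
    """Add one independent zero-mean model error sample to each member."""
    n_members = ensemble.shape[0]
    noise = np.stack([sample_model_error(rng) for _ in range(n_members)])
    return ensemble + noise

rng = np.random.default_rng(0)
prior = rng.standard_normal((100, 16))   # toy sizes: 100 members, 16 variables

# Assumption: "10% inflation" means perturbations are multiplied by 1.10.
inflated = multiplicative_inflation(prior, 1.10)

# Hypothetical state-independent error sampler (white noise, std 0.5).
perturbed = additive_inflation(prior, lambda r: 0.5 * r.standard_normal(16), rng)
```

Multiplicative inflation leaves the ensemble mean unchanged and scales the variance by the square of the factor, whereas additive inflation injects variance with a structure fixed by the sampler, regardless of the current ensemble.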

Additive inflation is analogous to stochastic parameterization, the difference being that random perturbations are added to the forecast before every assimilation cycle, versus being added at every time step of the numerical time integration scheme. In contrast to additive inflation, stochastic parameterizations can act to *reduce* model error (e.g., Shutts 2005; Berner et al. 2012; Frenkel et al. 2012). Since stochastic parameterizations typically increase ensemble spread they have an effect similar to covariance inflation, and can thus be viewed as accounting for as well as reducing model error. Unlike additive inflation, stochastic parameterization accounts for model error without assuming that it is independent of forecast error: even when the stochastic noise terms added to the model are independent of the model state, their integrated effect depends on the model.
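The last point can be illustrated with a toy calculation, which is only a sketch and not part of the experiments: for a hypothetical scalar linear model x_{k+1} = a x_k, the forecast variance produced by state-independent noise added at every time step depends on the model coefficient `a`, while noise added once per assimilation cycle does not. The names `a`, `n_steps`, and `noise_var` are illustrative.

```python
def spread_after_cycle(a, n_steps, noise_var, per_step):
    """Forecast variance of x_{k+1} = a * x_k after one cycle, from zero spread."""
    var = 0.0
    for _ in range(n_steps):
        var = a * a * var             # the model amplifies or damps existing spread
        if per_step:
            var += noise_var          # stochastic parameterization: noise every step
    if not per_step:
        var += n_steps * noise_var    # additive inflation: one lump sum per cycle
    return var

for a in (0.9, 1.0, 1.1):
    per_cycle = spread_after_cycle(a, 10, 0.01, per_step=False)
    every_step = spread_after_cycle(a, 10, 0.01, per_step=True)
    print(a, per_cycle, every_step)
```

The once-per-cycle variance is the same for every `a`, whereas the per-step variance grows with `a`, because noise injected early in the cycle is propagated by the model dynamics.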

A well-designed stochastic parameterization should outperform covariance inflation, so the latter can be used as a benchmark for evaluating stochastic parameterizations in the context of ensemble data assimilation (Whitaker and Hamill 2012). Houtekamer et al. (2009) compared additive inflation and stochastic parameterization in a low-resolution ensemble filtering context and found that additive inflation had a significant positive impact on performance, while a stochastic backscatter parameterization (SKEBS; Berner et al. 2009) had no positive impact. Whitaker and Hamill (2012) found that additive inflation and SKEBS had similarly positive impacts on the performance of an ensemble assimilation scheme.

We perform ensemble data assimilation experiments in the idealized setting of two-layer, doubly periodic quasigeostrophic (QG) turbulence with observations of the top-layer streamfunction, analogous to observations of sea surface height. Stochastic subgrid-scale parameterizations have been developed for this setting by Grooms and Majda (2013, 2014, hereafter GM13 and GM14, respectively) and Grooms et al. (2015, hereafter GLM15). GM13 and GM14 developed stochastic superparameterization (SP) for this setting [the connection of stochastic SP with the approach of Randall et al. (2013) is discussed by Majda and Grooms (2014)]; stochastic SP generates a stochastic forcing of quasigeostrophic potential vorticity conditional on the local large-scale variables. GLM15 extended the stochastic SP algorithm to include temporal correlation, and developed a simplified backscatter scheme that is independent of the state of the large-scale variables. The stochastic backscatter schemes used here model backscatter in the inverse cascade regime of quasigeostrophic turbulence, as appropriate to eddy-permitting ocean models. In contrast, backscatter schemes in operational weather forecasting models model backscatter associated with a wider range of physical processes (Shutts 2005, 2013).

In this idealized setting we find that a nonadaptive additive inflation scheme has no positive impact, uniform (nonadaptive) multiplicative inflation has a positive impact, and stochastic parameterization has a greater positive impact on assimilation performance. We also find that moving from second-order to fourth-order discretization leads to significant improvement in the performance of the assimilation algorithm, comparable to the effect of the stochastic parameterizations. The use of fourth-order numerics in combination with the stochastic parameterization yields our best results, slightly better than the use of the stochastic parameterization alone.

The configuration of the high-resolution truth model and of several imperfect low-resolution models is described in section 2, and the configuration of the ensemble assimilation system is described in section 3. The results of the assimilation experiments are presented in section 4, and conclusions are offered in section 5.

## 2. Two-layer quasigeostrophic model configuration

*r* specifies the strength of linear bottom friction (Ekman drag), and

*t* and

*c* are used throughout to denote barotropic and baroclinic components, respectively.

The simulations are carried out in a square periodic domain of half the width of the domains used in GM13, GM14, and GLM15, but with the same grid resolution; the computational grid has 256 × 256 points. The reference simulations use spectral numerics and fourth-order Runge–Kutta time stepping as described in GM13 and GM14. The deformation wavenumber is *f* plane with *β* plane with *β*-plane case considered here and are omitted for brevity. The experiments are carried out in these two significantly different regimes to ensure that the results (i.e., the performance of the data assimilation with different model configurations) are robust.

The *f*-plane scenario is dominated by small-scale vortices with spatially homogeneous statistics. In the *β*-plane scenario the flow organizes into three zonal jets (see Fig. 1c) that act as a barrier to meridional transport. The meridional heat flux is defined as

In the *f*-plane case the (nondimensional) climatological heat flux is 442, and in the *β*-plane case the climatological heat flux is 2.3. The large reduction in heat flux in the *β*-plane case is partly due to the strong zonal jets and partly due to the fact that the *β*-plane case is less energetic. GM14 showed that in both scenarios the heat flux is generated by scales resolved on the coarse model grid.

### Imperfect models

We consider several different imperfect models on a low-resolution grid of

One imperfect model, denoted “spectral” in the results, simply uses the same governing equations and spectral discretization as the perfect model (with no stochastic subgrid-scale parameterization), but uses a different value of the hyperviscous coefficient *f*-plane scenario uses *β*-plane scenario uses

_{j}are stochastic subgrid-scale parameterizations. These terms are described further in appendix A.

By analogy with ocean models and some atmospheric models, the remaining imperfect models do not use a spectral discretization of the nonlinear terms. Instead, they use either the second-order or fourth-order energy- and enstrophy-conserving finite-difference discretizations of Arakawa (1966). Models using the second-order discretization are denoted FD2, and models using the fourth-order discretization are denoted FD4. All of the other terms are still discretized using the same spectral method as the reference simulations.
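For readers unfamiliar with the scheme, the second-order Arakawa (1966) Jacobian on a doubly periodic grid can be sketched as follows. This is an illustration of the standard three-term average, not the code used in the experiments; the function names and the use of `np.roll` for periodic indexing are choices made here, and the fourth-order variant used in the FD4 models is not shown.

```python
import numpy as np

def _shift(a, di, dj):
    """Periodic field whose (i, j) entry is a[i + di, j + dj]."""
    return np.roll(a, shift=(-di, -dj), axis=(0, 1))

def arakawa_jacobian(psi, q, d):
    """Second-order Arakawa Jacobian J(psi, q) = psi_x q_y - psi_y q_x.

    Axis 0 is x, axis 1 is y, and d is the (uniform) grid spacing.
    """
    def p(di, dj):
        return _shift(psi, di, dj)

    def z(di, dj):
        return _shift(q, di, dj)

    # Three consistent second-order forms; their average conserves
    # both energy and enstrophy.
    j_pp = ((p(1, 0) - p(-1, 0)) * (z(0, 1) - z(0, -1))
            - (p(0, 1) - p(0, -1)) * (z(1, 0) - z(-1, 0)))
    j_px = (p(1, 0) * (z(1, 1) - z(1, -1)) - p(-1, 0) * (z(-1, 1) - z(-1, -1))
            - p(0, 1) * (z(1, 1) - z(-1, 1)) + p(0, -1) * (z(1, -1) - z(-1, -1)))
    j_xp = (z(0, 1) * (p(1, 1) - p(-1, 1)) - z(0, -1) * (p(1, -1) - p(-1, -1))
            - z(1, 0) * (p(1, 1) - p(1, -1)) + z(-1, 0) * (p(-1, 1) - p(-1, -1)))
    return (j_pp + j_px + j_xp) / (12.0 * d * d)

rng = np.random.default_rng(1)
psi = rng.standard_normal((16, 16))
q = rng.standard_normal((16, 16))
jac = arakawa_jacobian(psi, q, d=1.0)
# Discrete conservation: these sums vanish to rounding error.
print(jac.sum(), (psi * jac).sum(), (q * jac).sum())
```

The three sums correspond to conservation of mean potential vorticity, energy, and enstrophy by the discrete advection operator.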

To separate the effects of the numerical discretization we run models with and without stochastic subgrid-scale (SGS) parameterizations. We consider two stochastic SGS parameterizations: a backscatter scheme from GLM15 that is white in time and independent of the model state, and stochastic superparameterization (SP) from GM14 that is temporally correlated (following GLM15) and dependent on the model state. The stochastic SGS parameterizations are further described in appendix A. (Experiments were also run with the temporally correlated backscatter scheme from GLM15, but the results were almost identical to the white-in-time scheme and are omitted. Also note that the stochastic SP scheme includes a backscatter component.)

In the *f*-plane case all the FD2 and FD4 models, with and without stochastic SGS parameterizations, use the same viscosity coefficient *β*-plane case. The FD2 and FD4 models without stochastic SGS parameterizations are more sensitive to this parameter in the *β*-plane case, so the values are tuned to improve the model climatology. The FD2 model without stochastic SGS terms uses

The time-mean streamfunction spectra *β*-plane case. In the *f*-plane case (Fig. 1a) the imperfect models without stochastic SGS terms have far too little variability, by nearly two orders of magnitude, because they lack backscatter whereas all methods with stochastic SGS terms have accurate spectra. There is no time-mean structure in the *f*-plane case.

In the *β*-plane case the perfect model develops three barotropic zonal jets, whose time- and zonal-mean profiles are shown in black in Fig. 1c. The imperfect models with stochastic SP (FD2 and FD4) both have three reasonably accurate jets. The FD4 model without SGS terms has four jets, and the FD2 model with backscatter has three weak jets. The remaining models have too many jets that are too weak. The streamfunction spectra are shown in Fig. 1b. The models with SP are the most accurate, followed by the FD4 model without SGS terms and the FD2 model with backscatter, and finally by the FD2 and spectral models without SGS terms.

Without stochastic SGS parameterizations, the FD4 model has a bit more energy than the FD2 model. This is true in both cases, but on the *β* plane the FD4 model has almost the correct amount of energy, whereas it has far too little on the *f* plane. The increased energy in the FD4 model may be explained as follows. The imperfect models are able to extract energy from the imposed background shear through baroclinic instability, though less efficiently than the full-resolution model. This energy enters the system at a relatively small scale, though still representable on the coarse grid [the peak linear instability is at wavenumber *f*-plane case the transfer of energy to large scales is presumably primarily due to subgrid-scale interactions, whereas on the *β* plane the transfer from subgrid scales accounts for a smaller proportion of the total; this might explain why the FD4 model is much more accurate on the *β* plane than on the *f* plane.

The heat flux generated by the models is presented in Table 1. The accuracy of the heat flux generated by the imperfect models is not consistent across the two parameter regimes. In the *f*-plane case the models without stochastic SGS parameterizations all have far too little energy, and as a result generate far too little heat flux (less than 100 vs the true value of 442), while the models with stochastic parameterizations all produce too much heat flux. In the *β*-plane case the models exhibit a wide range of heat fluxes, with the least-accurate model (FD2) having the best heat flux. The spectral model has too little heat flux, and the remaining models all have too much.

Time mean *f*-plane and *β*-plane scenarios.

## 3. Ensemble assimilation system configuration

The ensemble assimilation experiments use the EAKF (Anderson 2001) with 100 ensemble members and with observations of the upper-layer streamfunction *f* plane) or 1 (*β* plane); the observations are assimilated serially. Observation errors in both cases are about 10% of the climatological variability of

Following Keating et al. (2012) we compute the eddy turnover time *Z* is the time-averaged total enstrophy

The imperfect model ensembles were initialized by adding random samples from a homogeneous, spatially uncorrelated Gaussian random field with variance equal to the observational error variance to the exact state of the large-scale part of the perfect model. We perform 500 assimilation cycles. The first 200 assimilation cycles are discarded and the remaining 300 are used to compute performance statistics.
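The serial EAKF update for directly observed grid points can be sketched as below. This is a minimal illustration in the two-step form (update the observed variable, then regress the increments onto the state), assuming each observation is a single state component with known error variance; covariance localization and inflation are omitted, and the function names and array layout (members by state variables) are hypothetical.

```python
import numpy as np

def eakf_update_one_obs(ensemble, obs_index, obs_value, obs_var):
    """Assimilate one scalar observation of state component `obs_index`."""
    y = ensemble[:, obs_index]                    # prior observed ensemble
    prior_mean = y.mean()
    prior_var = y.var(ddof=1)
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs_value / obs_var)
    # Deterministic shift and contraction of the observed ensemble.
    dy = post_mean + np.sqrt(post_var / prior_var) * (y - prior_mean) - y
    # Regress the observation increments onto every state variable.
    anomalies = ensemble - ensemble.mean(axis=0)
    cov_xy = anomalies.T @ (y - prior_mean) / (len(y) - 1)
    return ensemble + np.outer(dy, cov_xy / prior_var)

def eakf_serial(ensemble, obs_indices, obs_values, obs_var):
    """Assimilate observations one at a time, updating the ensemble in between."""
    for index, value in zip(obs_indices, obs_values):
        ensemble = eakf_update_one_obs(ensemble, index, value, obs_var)
    return ensemble

rng = np.random.default_rng(0)
prior = rng.standard_normal((100, 5))             # toy: 100 members, 5 variables
posterior = eakf_serial(prior, [2, 4], [0.3, -0.1], obs_var=0.25)
```

Because the observations are processed serially, each update sees the ensemble already adjusted by the preceding observations.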

Angle-averaged spatial correlation functions for *f*-plane scenario (solid) and *β*-plane scenario (dashed).

Citation: Monthly Weather Review 143, 10; 10.1175/MWR-D-15-0032.1

In the experiments with the unparameterized FD2 model we used multiplicative and additive inflation, alone and in combination. We tested constant multiplicative inflation factors between 1% and 10%; larger inflation factors led to large errors in the heat flux estimation. To implement additive inflation we diagnosed 500 samples of model error for the unparameterized FD2 model, and used the results to develop an algorithm to generate random samples from an approximate error distribution, as described in appendix B. The diagnosed model error in the *β*-plane scenario was very inhomogeneous, with large model error variance located near the peaks of the zonal jets. As a result, we only developed an additive error approximation for the *f*-plane scenario, where the model errors are approximately homogeneous. The additive inflation method was implemented following Mitchell and Houtekamer (2000) by simply adding zero-mean samples from the approximate model error distribution to the prior ensemble at the beginning of each assimilation cycle (as opposed to, e.g., at each time step of the numerical integration, as in the stochastic parameterizations).

## 4. Results

The results of the assimilation experiments for the *f*-plane and *β*-plane scenarios are presented in Tables 2 and 3, respectively. The results for the FD2 model with neither inflation nor subgrid-scale parameterization are presented in boldface, and serve as a baseline for evaluating improvements. For both scenarios the time-mean RMS errors in the forecast and analysis of

Filter performance statistics for the *f*-plane scenario. The format is forecast

Filter performance statistics for the *β*-plane scenario. The format is forecast

In the *f*-plane case the RMS errors in the components of velocity are all relatively high, and are fairly similar for all the models: between 21 and 26 for the top layer and between 19 and 15 for the bottom layer, compared to climatological variability of 33 and 27 for the top and bottom layers. Despite these relatively large errors, the pattern correlation (PC) in the velocity is relatively high and better differentiates between the models; Table 2, therefore, shows the pattern correlations for the velocity, rather than the RMS errors. The instantaneous PC between the analysis upper-layer meridional velocity

### a. The f plane

In the *f*-plane scenario the baseline FD2 model exhibits modest accuracy despite model error and a relatively sparse observation network: RMS errors for the analysis estimate of the streamfunctions (3.6 and 3.3 for

The use of additive inflation, denoted FD2+, has no significant effect on estimates of streamfunctions or velocities, but degrades the performance of the heat flux estimation. In contrast, 10% multiplicative inflation, denoted FD2

The model error sampling algorithm described in appendix B gives a faithful approximation to the model error spectrum and to the correlation of the error in the top and bottom layers, so it is not immediately clear why it fails to improve the performance of the assimilation system, especially in light of previous results (e.g., Whitaker and Hamill 2012) indicating that additive inflation is better suited to accounting for model error than multiplicative inflation. One possibility is that the ensemble is sufficiently spread and does not need inflation, but this is contradicted by the success of multiplicative inflation. Another possibility is that the model error is strongly correlated with the model state, contrary to the independence assumption underlying additive inflation. This was tested using the 500 samples of model error diagnosed as described in the foregoing section: the absolute value of the correlation coefficient between the model error and the streamfunction was less than 0.02 for both layers. Of course, the model error might be correlated with the gradient of the streamfunction or some other derived quantity, but the assumption of independence is at least not trivially wrong. A potential explanation is provided by comparing samples of the diagnosed model error to samples obtained from the algorithm in appendix B. Figures 3a and 3b show the upper-layer streamfunction

(a) A snapshot of the upper-layer streamfunction

The use of higher-order numerics reduces model error and leads to significant improvement in the assimilation performance: the FD4 model without inflation is comparable to the FD2 model with 10% multiplicative inflation. The spectral model reduces the model error further, resulting in even more accurate estimates for the streamfunctions and velocities. The assimilation performance for heat flux degrades when moving from the FD2 to the FD4 and spectral models; this mirrors the decrease in accuracy of the heat flux climatology shown in Table 1.

Addition of a white noise backscatter independent of the model variables in the FD2 model leads to significant improvements in filtering the streamfunctions and velocities, and a small improvement in heat flux. The backscatter reduces model error, as seen in Fig. 1, and inflates the forecast covariance. The results are better than the FD4 model and better than the FD2 model with 10% inflation, and are nearly comparable to the spectral model, but with better heat flux. The temporally correlated, state-dependent stochastic SP scheme with the FD2 model results in only slightly better performance than the backscatter method. Pairing the stochastic SP scheme with the FD4 model leads to further incremental improvements in estimating the streamfunctions and velocities, but has a mildly detrimental impact on heat flux estimation.

For the *f*-plane scenario covariance inflation, stochastic parameterization, and high-order numerics all lead to improvements in assimilation quality, and the combination of high-order numerics with stochastic SP is more accurate than covariance inflation with the FD2 model.

### b. The β plane

In the *β*-plane scenario the baseline FD2 model exhibits modest accuracy. RMS errors for the analysis estimate of the streamfunctions, 0.97 and 0.86 for *f*-plane case. This is partially due to the success of the filter in modeling the correlations between the upper and lower layers, and partially due to the fact that the lower layer has less climatological variability.

Similar to the *f*-plane case, multiplicative inflation degrades heat flux estimation, though in contrast to the *f*-plane case it causes the heat flux to increase erroneously. It is not clear why multiplicative inflation degrades the accuracy of the heat flux estimate. Streamfunction estimation improves as the inflation factor increases up to an optimum value of 4%, past which it degrades. As shown in Table 3, 4% inflation improves the filter performance significantly, decreasing RMS errors in the analysis estimates of

The use of higher-order numerics in the FD4 model reduces model error, and leads to significant improvement in the assimilation performance for all variables—streamfunctions, velocities, and heat flux. This is somewhat surprising since the FD4 model has a too-large climatological heat flux of 7. The FD4 model without inflation performs better than the FD2 model with optimized (albeit nonadaptive) multiplicative inflation. The spectral model reduces the model error further, resulting in even more accurate estimates for the streamfunctions and velocities, but a slightly worse estimate for heat flux, in line with the too-low climatological heat flux of the spectral model (Table 1).

The FD2 model with white noise backscatter is slightly better than the FD2 model with inflation when it comes to filtering the streamfunctions and velocities and has a small improvement in heat flux, though the latter is still too small. This is somewhat surprising since the climatological heat flux for the white noise backscatter model is too large, at 4.3 (Table 1). The FD2 model with backscatter is not quite as accurate as the FD4 model. The FD2 model with stochastic SP improves on the white noise backscatter, and is very similar to the FD4 model, and the FD4 model with stochastic SP leads to further improvements in streamfunction and velocity estimates. The spectral model without a stochastic parameterization has better estimates of streamfunctions and velocities than the FD models with stochastic SP, and a comparable (though slightly worse) heat flux.

## 5. Conclusions

Ensemble data assimilation systems for atmosphere and ocean science use computational models that are unable to resolve all the active dynamical scales. Two types of model error result from the use of low-resolution models: truncation error in the numerical discretization of the large-scale dynamics and errors associated with subgrid-scale interactions. Additive and multiplicative covariance inflation are methods for accounting for model errors in ensemble data assimilation algorithms (Anderson and Anderson 1999; Mitchell and Houtekamer 2000), but inflation only accounts for model errors and does not reduce them. The use of high-order numerical schemes and subgrid-scale parameterizations reduces low-resolution model errors; stochastic subgrid-scale parameterizations also inflate ensemble spread, similar to additive and multiplicative inflation. Covariance inflation can act as a baseline for evaluating the effect of stochastic parameterizations in ensemble data assimilation systems (Whitaker and Hamill 2012), but previous studies using a particular stochastic subgrid-scale parameterization (SKEBS; Berner et al. 2009) have not found significant improvements compared to covariance inflation (Houtekamer et al. 2009; Whitaker and Hamill 2012). We compare inflation and stochastic parameterization in the context of idealized quasigeostrophic turbulence, and find that a simple model-independent white noise backscatter scheme (from GLM15) is comparable to tuned, nonadaptive multiplicative inflation, while stochastic superparameterization (GM13, GM14) gives better results. We attribute this success to the ability of a well-designed stochastic parameterization to both reduce and account for model error. A similar result has recently been obtained by Ha et al. (2015) in the context of atmospheric data assimilation, where a version of the SKEBS scheme outperformed an adaptive inflation scheme.

The performance of ensemble assimilation systems based on low-resolution models can be improved through the use of stochastic parameterizations, by carefully estimating and accounting for model error distributions (Zupanski and Zupanski 2006), and through sophisticated adaptive covariance inflation techniques (Anderson 2007, 2009; Li et al. 2009; Miyoshi 2011). But in our setting the most straightforward way to improve the assimilation performance was to move from a second-order to a fourth-order spatial discretization. The effect of higher-order numerics may seem counterintuitive because the model does not resolve the true solution; the explanation is that the large-scale part of the true solution that is *represented* on the coarse model grid is not equally well *resolved* by different numerical methods. The effect of higher-order numerics was comparable to that of our stochastic subgrid-scale parameterizations, though the latter had the added benefit of improving the low-resolution model climatology, much more so than the use of higher-order numerics. Janjic et al. (2011) developed fourth-order advection schemes in the context of global atmospheric models, but found minimal improvement in forecast accuracy (not in an ensemble assimilation context) compared to second-order schemes. The impact of increased-order numerics is presumably dependent on the situation; when the fields are well resolved, numerical error is small and higher-order numerics will not have much impact. Similarly, when the fields are very poorly resolved (i.e., when there is significant variation at the grid scale) neither low- nor high-order methods are accurate. However, in situations where structures are only partially resolved, as in our idealized eddy-permitting setting, higher-order numerics are able to provide increased accuracy.
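The dependence on resolution can be illustrated with the standard modified-wavenumber analysis of plain centered differences, sketched below. This is a textbook illustration for a one-dimensional first derivative, not an analysis of the Arakawa Jacobians used in the experiments; the function names are hypothetical, and `kd` denotes wavenumber times grid spacing.

```python
import numpy as np

def relative_error_fd2(kd):
    """Relative error of the second-order centered derivative of exp(i k x)."""
    return 1.0 - np.sin(kd) / kd

def relative_error_fd4(kd):
    """Relative error of the fourth-order centered derivative of exp(i k x)."""
    return 1.0 - (8.0 * np.sin(kd) - np.sin(2.0 * kd)) / (6.0 * kd)

# Well resolved (32 points per wavelength) through grid scale (2 points).
for points_per_wavelength in (32, 8, 4, 2):
    kd = 2.0 * np.pi / points_per_wavelength
    print(points_per_wavelength, relative_error_fd2(kd), relative_error_fd4(kd))
```

Both errors are small for well-resolved waves and both approach 100% at the grid scale; the fourth-order scheme offers the largest absolute gain at intermediate resolutions, which is the partially resolved regime discussed above.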

Covariance inflation will remain an important and useful technique in ensemble data assimilation. But the ability of stochastic parameterizations to reduce model error, and not just to account for it, underscores the need for further development and improvement of stochastic parameterizations in oceanic and atmospheric models. Higher-order spatial discretizations can also reduce error, though this depends on the degree to which structures are resolved by the computational grid (Janjic et al. 2011). Our results suggest that the data-assimilation performance of eddy-permitting ocean models might benefit from higher-order numerics, even though we find that higher-order numerics do not necessarily improve model climatology.

## Acknowledgments

The authors gratefully acknowledge input from three anonymous reviewers and funding from ONR MURI Grant N00014-12-1-0912.

## APPENDIX A

### Stochastic Subgrid-Scale Parameterizations

_{j}in the imperfect model Eqs. (3) and (4). The simplified scheme from GLM15, labeled “backscatter” in the results, takes the following form:

*A* is a tunable parameter controlling the amplitude of the backscatter and

*θ* is a random field taking values in

*θ* gains temporal correlation by being modeled (at each coarse grid point) by a Wiener process on the circle:

*W* is a Wiener process independent of all other coarse grid points. The parameter

*σ* controls the decorrelation time.
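A Wiener process on the circle can be simulated as in the following sketch. It assumes the form d*θ* = *σ* d*W* with *θ* wrapped to [0, 2π), integrated with an Euler–Maruyama step; the grid size, step size, and value of `sigma` are placeholders rather than the values used in the experiments.

```python
import numpy as np

def step_phase(theta, sigma, dt, rng):
    """One Euler-Maruyama step of d(theta) = sigma dW, wrapped to [0, 2*pi)."""
    increment = sigma * np.sqrt(dt) * rng.standard_normal(theta.shape)
    return np.mod(theta + increment, 2.0 * np.pi)

rng = np.random.default_rng(0)
theta0 = rng.uniform(0.0, 2.0 * np.pi, size=(128, 128))  # hypothetical coarse grid
theta = theta0.copy()
sigma, dt, n_steps = 1.0, 0.01, 100                       # placeholder values
for _ in range(n_steps):
    theta = step_phase(theta, sigma, dt, rng)

# Under the assumed form, the phase decorrelates as exp(-sigma**2 * t / 2).
elapsed = n_steps * dt
autocorr = np.mean(np.cos(theta - theta0))
print(autocorr, np.exp(-0.5 * sigma**2 * elapsed))
```

Each grid point receives an independent increment, and a larger `sigma` shortens the decorrelation time, consistent with the role of *σ* described above.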

The values of the tunable parameters *A* and *σ* are included here for completeness. In the *f*-plane case *A* by the square root of the time step size in the numerical integration algorithm (necessary for robustness when changing step size; see GLM15), whereas the stochastic SP scheme multiplies by the time step. In the *β*-plane case *f*-plane case the stochastic SP scheme uses *β*-plane case it uses *σ* is due to the difference in the subgrid-scale energy: in the *f*-plane case there is more energy and the subgrid-scale terms decorrelate faster.

## APPENDIX B

### Model Error Parameterization

The model error for the FD2 model is estimated by initializing the FD2 model from the large-scale state of the perfect model, running both models forward for an interval of

In the *β*-plane scenario the model error is inhomogeneous, as shown in Fig. B1. The middle panel shows that the time-mean error in the barotropic streamfunction

(left) Time-mean barotropic streamfunction, (middle) time-mean model error in the barotropic streamfunction for the FD2 model, and (right) standard deviation of model error in the barotropic streamfunction for the FD2 model.

The model error in the *f*-plane scenario is approximately homogeneous and isotropic, as shown in Fig. B2. The left panel shows the time-average of the square amplitude of the barotropic model error [

The *f*-plane FD2 streamfunction model error statistics: (a) barotropic model error spectrum; (b) baroclinic model error spectrum; and (c) angle-integrated barotropic error spectrum from the diagnostics (solid) and from the approximate sampling algorithm (solid, circles), and angle-integrated baroclinic error spectrum from the diagnostics (dashed) and from the approximate sampling algorithm (dashed, circles).

The *f*-plane model error sampling algorithm:

1a) Generate independent samples of a standard normal random variable, and arrange them on the coarse grid.

1b) Take the discrete Fourier transform. For wavenumbers with , multiply the Fourier coefficients by exp ; set the remaining coefficients to zero.

1c) Take the inverse discrete Fourier transform, and rescale so that the sample has unit variance. This is the unscaled barotropic error sample.

2a) Repeat step 1a to generate a new random field.

2b) Take the discrete Fourier transform. For wavenumbers with multiply the coefficients by ; set the remaining coefficients to zero.

2c) Take the inverse discrete Fourier transform, and rescale so that the sample has unit variance.

2d) Add the unscaled barotropic error sample from step 1c to the sample from step 2c and divide by 2. The result is the unscaled baroclinic error sample.

3) Multiply the unscaled barotropic sample by 1.47 and the unscaled baroclinic sample by 0.61 to get the barotropic and baroclinic error samples, respectively. The top layer sample is the sum of the barotropic and baroclinic samples, and the bottom layer sample is the barotropic minus the baroclinic sample.
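The filtering pattern shared by steps 1a to 1c and 2a to 2c (white noise, spectral weighting with a cutoff, inverse transform, rescaling) can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function name `filtered_unit_variance_field`, the grid size, the cutoff wavenumber, and the exponential weight `exp(-decay * |k|)` are all hypothetical stand-ins, because the wavenumber condition and the filter exponent of the algorithm are not reproduced in the text above.

```python
import numpy as np

def filtered_unit_variance_field(n, cutoff, decay, rng):
    """Return a homogeneous, isotropic random field on an n x n periodic grid.

    The cutoff and the weight exp(-decay * |k|) are hypothetical choices,
    not the values used in the algorithm described in the text.
    """
    # Step a: independent standard normal samples arranged on the grid.
    noise = rng.standard_normal((n, n))
    # Step b: discrete Fourier transform and isotropic spectral weighting.
    coeffs = np.fft.fft2(noise)
    k1d = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    kmag = np.hypot(kx, ky)
    # Weight retained wavenumbers; set the remaining coefficients to zero.
    weights = np.where(kmag <= cutoff, np.exp(-decay * kmag), 0.0)
    # Step c: inverse transform (real up to roundoff), rescale to unit variance.
    field = np.fft.ifft2(coeffs * weights).real
    return field / field.std()

rng = np.random.default_rng(0)
sample = filtered_unit_variance_field(32, cutoff=8.0, decay=0.3, rng=rng)
```

Because the weights depend only on the wavenumber magnitude, the resulting field is statistically homogeneous and isotropic, which is the property the sampling algorithm relies on.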

This error sampling algorithm is clearly ad hoc. However, it produces a homogeneous and isotropic random field with properties very similar to the diagnosed model error. For example, the diagnosed standard deviation for the barotropic model error is 1.49, while the algorithm has a standard deviation of 1.47. The diagnosed standard deviation for the baroclinic model error is 0.63, while the algorithm has a standard deviation of 0.61. The diagnosed local correlation between the barotropic and baroclinic error is 0.48, while the algorithm generates a correlation of 0.5. The diagnosed correlation between the model error in the top and bottom layers is 0.75, which is matched exactly by the algorithm. Finally, the barotropic and baroclinic model error spectra are accurately reproduced by the algorithm, as shown in the right panel of Fig. B2.
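The layer correlation quoted above follows from the barotropic and baroclinic statistics, since the top layer is the barotropic plus the baroclinic sample and the bottom layer is the barotropic minus the baroclinic sample. The short Python check below is a sketch using only the rounded values given in the text; the function name `layer_correlation` is a hypothetical helper, and because the inputs are rounded to two decimals the result should only be expected to agree with 0.75 to within about 0.01.

```python
import math

def layer_correlation(sigma_bt, sigma_bc, rho):
    """Correlation between top = bt + bc and bottom = bt - bc.

    sigma_bt, sigma_bc: barotropic and baroclinic standard deviations.
    rho: local correlation between barotropic and baroclinic errors.
    """
    # cov(bt + bc, bt - bc) = var(bt) - var(bc); the cross terms cancel.
    cov = sigma_bt**2 - sigma_bc**2
    cross = 2.0 * rho * sigma_bt * sigma_bc
    var_top = sigma_bt**2 + sigma_bc**2 + cross
    var_bottom = sigma_bt**2 + sigma_bc**2 - cross
    return cov / math.sqrt(var_top * var_bottom)

# Values quoted in the text for the sampling algorithm and the diagnostics.
algorithm_corr = layer_correlation(1.47, 0.61, 0.5)
diagnosed_corr = layer_correlation(1.49, 0.63, 0.48)
```

Both sets of rounded inputs give a layer correlation close to 0.75, consistent with the agreement reported between the algorithm and the diagnostics.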
