1. Introduction
Most numerical weather prediction (NWP) centers operationally produce both deterministic and ensemble forecasts. The deterministic forecast represents the best single estimate of the atmospheric conditions in the future, whereas the ensemble forecast provides information on the range of possible conditions that could occur given the uncertainties inherent in all aspects of the prediction system.
Data assimilation is used to provide the analyses (i.e., initial conditions) for both deterministic and ensemble forecasts. For deterministic forecasts, variational data assimilation approaches are most often used. These include three-dimensional variational data assimilation (3DVar), four-dimensional variational data assimilation (4DVar), and, more recently, ensemble–variational assimilation (EnVar; Buehner et al. 2013; Kleist and Ide 2015; Wang and Lei 2014). For ensemble data assimilation approaches, the goal is to produce an ensemble of model states consistent with the probability density of the initial condition uncertainty. Several of these ensemble techniques are based on Monte Carlo simulation in which all uncertain components of the prediction system are randomly perturbed in a way that is consistent with the presumed uncertainty of each component. This includes the observations and forecast model parameters. It is also common to directly modify the ensemble spread of the complete model state to account for multiple sources of uncertainties (e.g., Houtekamer et al. 2009; Whitaker and Hamill 2012). Such data assimilation approaches include the perturbed-observation ensemble Kalman filter (EnKF; Houtekamer et al. 2005) and the ensemble of data assimilations (EDA; Isaksen et al. 2010). Other EnKF algorithms, referred to as ensemble square root filters, rely on a modification to the original algorithm to avoid the need to perturb the observations (e.g., Whitaker and Hamill 2002; Tippett et al. 2003; Hunt et al. 2007). Alternatively, some centers compute perturbations with an approach not directly related to data assimilation (e.g., singular vectors, bred vectors; Buizza et al. 2005) and these are then added to the deterministic analysis for initializing the ensemble forecast.
Ensembles of short-term forecasts are also used within several types of data assimilation algorithms to either partially or fully specify the background error covariances. This includes the EnKF (both the perturbed observation version and all of the ensemble square root filter variants), EnVar, and some implementations of 4DVar that use ensemble covariances to specify the background error covariances at the beginning of the data assimilation time window (Buehner et al. 2010; Clayton et al. 2013).
Several NWP centers currently employ a variational data assimilation approach for their deterministic forecasts and a separate EnKF system that is used for both initializing the ensemble forecasts and for providing ensemble covariances for the deterministic system [e.g., Environment and Climate Change Canada (ECCC), the National Centers for Environmental Prediction (NCEP), and the Met Office]. The data assimilation procedures used within current EnKF systems differ fundamentally from the variational approach by relying on either the serial assimilation of individual observations (or small batches of observations) (e.g., Whitaker and Hamill 2002; Tippett et al. 2003; Houtekamer et al. 2005) or an algorithm that independently updates spatial subdomains by simultaneously assimilating all surrounding observations (e.g., Hunt et al. 2007). Several practical benefits may be realized by using the same data assimilation approach for both deterministic and ensemble prediction, including a reduction in the effort required to develop and maintain the computer code and an improved consistency of the impacts from major changes made to the two systems. To that end, the goal of the present study is to evaluate a new approach for performing the data assimilation step within a perturbed-observation EnKF by adapting the variational algorithm used for deterministic data assimilation. The analysis increment is computed with a variational assimilation approach separately for the ensemble mean and for all of the ensemble perturbations (i.e., the deviations of each member from the ensemble mean). To obtain a computationally efficient algorithm, a much simpler configuration is used for the ensemble perturbations, whereas the configuration used for the ensemble mean is similar to that used for the deterministic system. Since the new approach is essentially an EnKF implemented with a variational approach, it is called VarEnKF.
The next section includes a description of the VarEnKF approach together with its expected benefits. Section 3 provides a description of the numerical data assimilation experiments performed using either a standard EnKF approach, VarEnKF, or a combination of the two. The results from these experiments are presented in section 4. The final section provides the conclusions.
2. VarEnKF approach
a. General approach
The most straightforward approach for using variational data assimilation within an EnKF is to simply run independent data assimilation cycles for each ensemble member. Each assimilates independently perturbed observations, while the other sources of uncertainty are simulated with appropriate random perturbations. This is similar to the EDA approach (Isaksen et al. 2010) and the so-called system simulation approach (Houtekamer et al. 1996), except that in those approaches, unlike with EnVar and the EnKF, the assimilation does not fully use the ensemble covariances. However, if the analysis step for each member were performed with a variational approach that uses the ensemble of background states to define the background error covariances, such as EnVar, this would be theoretically equivalent (other than the unavoidable differences related to the use of a different solution technique for obtaining each member’s analysis increment) to the perturbed-observation EnKF (e.g., Fairbairn et al. 2014). Because a full data assimilation system is used for each member with the same complexity as a typical deterministic system, the computational cost is comparable to that of the deterministic system times the number of members, which is much higher than the cost of current EnKF approaches. Consequently, this would typically limit the number of ensemble members to O(10), too few to fully specify the background error covariances. For example, the Canadian EnKF currently uses 256 members; in contrast, Météo-France and ECMWF both use an EDA consisting of only 25 4DVar members. Because of this small ensemble size, a large amount of filtering is applied to the EDA ensemble covariances by combining members over several analysis times, using a wavelet-diagonal approach for the correlations, and using spatially smoothed ensemble variance estimates (Berre et al. 2015). Simplifying the data assimilation procedure used for all the members, such as by using 3DVar instead of 4DVar to allow for a larger ensemble size, would result in a degradation in the quality of the resulting ensemble forecasts.
An approach with the potential to be competitive with current EnKF algorithms in terms of computational cost is based on a variational “mean-pert” approach, first suggested by Lorenc et al. (2014). Similar to many formulations of the ensemble square root filter, the idea of this approach is to compute the analysis increment for the ensemble mean separately from the increment for the deviation of each member from the ensemble mean (hereafter referred to as the ensemble perturbations). Then, both are computed with a variational technique, such as the four-dimensional version of EnVar (4DEnVar). The increment for the ensemble mean shifts all background members by the same amount in phase space and therefore has no effect on the ensemble spread, whereas the increments for the ensemble perturbations modify the ensemble spread with no effect on the ensemble mean. We hypothesize that the overall quality of the ensemble forecasts depends to a much greater extent on the analysis increment for the ensemble mean as compared with those for the ensemble perturbations. This is supported by a comparison of various ensemble prediction systems that found that the ensemble mean has a large impact on the quality of ensemble forecasts (e.g., Buizza et al. 2005) and that simple approaches for computing the initial ensemble perturbations may be as effective as more theoretically and computationally complex approaches (e.g., Magnusson et al. 2008; Raynaud and Bouttier 2016). If this hypothesis is valid, then the overall computational cost of obtaining the analysis ensemble could be significantly reduced by simplifying only the component of the assimilation procedure that is used for the ensemble perturbations. This is critical, since an independent calculation of the analysis increment for the ensemble perturbation is required for each member. This computation must be highly efficient and parallelized to be feasible for a large ensemble of O(100) members. It must be emphasized, however, that no simplifications need be made to the assimilation procedure, such as EnVar, used to compute the analysis increment for the ensemble mean.
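To make the separation concrete, the following is a minimal NumPy sketch of the mean-pert update (the array shapes and the function name are illustrative assumptions, not part of any operational code): the mean increment shifts every member by the same amount, while the perturbation increments change only the deviations from the mean and therefore only the spread.

```python
import numpy as np

def mean_pert_update(Xb, mean_increment, pert_increments):
    """Hedged sketch of the mean-pert analysis update.

    Xb              : (nmembers, nstate) background ensemble
    mean_increment  : (nstate,) analysis increment for the ensemble mean
    pert_increments : (nmembers, nstate) increments for the deviations of
                      each member from the ensemble mean
    """
    xb_mean = Xb.mean(axis=0)
    Xb_pert = Xb - xb_mean                 # background perturbations

    xa_mean = xb_mean + mean_increment     # shifts all members equally
    Xa_pert = Xb_pert + pert_increments    # modifies the spread only
    Xa_pert -= Xa_pert.mean(axis=0)        # enforce zero-mean perturbations

    return xa_mean + Xa_pert               # analysis ensemble

# Toy check that the two effects are separable.
rng = np.random.default_rng(0)
Xb = rng.standard_normal((4, 3))
Xa = mean_pert_update(Xb, np.ones(3), 0.1 * rng.standard_normal((4, 3)))
```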
The only simplification proposed by Lorenc et al. (2014) is a significant reduction in the number of iterations used in the minimization of the cost function for computing the increments for the ensemble perturbations. As will be shown later, these increments generally act to reduce the ensemble spread. Consequently, a reduction in the number of iterations would lead to less reduction in the ensemble spread during the analysis step, which could potentially be compensated for by a reduction to the multiplicative or additive ensemble inflation used to maintain a realistic spread (e.g., Houtekamer et al. 2009; Whitaker and Hamill 2012). However, the use of EnVar with this simplification alone would likely not reduce the cost to a point that is comparable with many EnKF algorithms. Additional simplifications should therefore be considered for computing the perturbation increment, including the following:
a reduction in ensemble size used to compute the ensemble background error covariances;
the use of 3D instead of 4D ensemble background error covariances;
the use of only the static climatological background error covariance matrix that is typically based on homogeneous and isotropic correlations (i.e., 3DVar);
a decrease in spatial resolution or a more severe spectral truncation for the static climatological background error covariance matrix; and
a reduction in the number of observations assimilated.
b. Benefits of VarEnKF approach
Only a single execution of variational data assimilation is required to compute the analysis increment for the ensemble mean. Nearly the same configuration as the deterministic system can be used for this, including a much larger volume of assimilated observations than what is currently used in most operational EnKF systems. The lower number of assimilated observations in most EnKF algorithms follows from the fact that their computational cost scales linearly with the number of observations. This increase in the number of assimilated observations for computing the analysis increment for the ensemble mean in VarEnKF could potentially improve the quality of ensemble forecasts. Some centers (e.g., NCEP; National Weather Service 2015) obtain a similar benefit from directly using the high-resolution deterministic analysis to replace the EnKF ensemble mean analysis at each analysis time. In this case, the EnKF data assimilation procedure is, in effect, only used to compute the analysis increments for the ensemble perturbations.
Because it uses the same data assimilation approach as the deterministic system, VarEnKF has the practical advantage that only a single data assimilation algorithm need be developed and maintained, reducing the required software development effort. For example, the introduction of a new type of observation or an improved treatment of the error covariances need only be implemented once for it to be available for both the deterministic and ensemble prediction systems. Moreover, a much higher degree of consistency would exist between the deterministic and ensemble systems if they were to employ the same assimilation algorithm and use the same set of observations (for computing the ensemble mean analysis increment). Consequently, the impact on forecast accuracy from any major change, such as adding a new type of observation, would be similar in both systems. This would likely reduce the overall amount of numerical experimentation required.
The variational approach also allows for covariance localization to be applied directly to the background error covariances in gridpoint space (Lorenc 2003; Buehner 2005) instead of partially in observation space, as in the EnKF (e.g., Houtekamer et al. 2005). This is an important distinction for nonlocal observations, such as satellite radiances. Campbell et al. (2010) showed that gridpoint space localization can result in improved analysis accuracy as compared with observation space localization when assimilating satellite radiances in an idealized experimental context. For some situations, however, Lei and Whitaker (2015) showed that localization in observation space can be superior, though the difficulty of assigning an appropriate single vertical location to each radiance observation remains.
The use of a variational approach within VarEnKF could lead to additional benefits from covariance modeling approaches that are more readily implemented in the context of the variational approach. These include the use of hybrid covariances that combine localized ensemble covariances with static climatological covariances (e.g., Hamill and Snyder 2000; Lorenc 2003) and scale-dependent covariance localization (Buehner and Shlyaeva 2015). In addition to these covariance modeling approaches, the use of the variational quality control technique for rejecting potentially erroneous observations can also be easily incorporated in the calculation of the ensemble mean analysis increment (Anderson and Järvinen 1999).
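For reference, the hybrid covariance mentioned above is commonly written as a weighted sum of the static climatological matrix and the localized ensemble covariances; a generic sketch of this standard form (the symbols below are not taken from this paper) is

```latex
\mathbf{B}_{\mathrm{hyb}}
  = \beta_{c}^{2}\,\mathbf{B}_{c}
  + \beta_{e}^{2}\,\bigl(\mathbf{L}\circ\mathbf{P}^{e}\bigr),
```

where $\mathbf{B}_{c}$ is the static climatological covariance matrix, $\mathbf{P}^{e}$ the sample covariance of the ensemble, $\mathbf{L}$ the localization matrix applied through an element-wise (Schur) product, and $\beta_{c}^{2}$ and $\beta_{e}^{2}$ the weights given to the two components.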
c. Formulation
The VarEnKF approach is built on two cost functions: one for the ensemble mean increment [Eq. (8)] and one for the ensemble perturbation increments [Eq. (9)]. The cost function for the ensemble perturbation increments employs a configuration with a much lower computational cost (e.g., fewer assimilated observations, a much simpler configuration for the background error covariance matrix) than that used for the ensemble mean increment. This results in a significant reduction in the overall computational cost of computing the O(100) ensemble member perturbation increments, while maintaining the highest quality possible for the ensemble mean analysis increment. The ensemble of analyses is then obtained by summing the ensemble mean increment and the ensemble perturbation increment for each member.
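Equations (8) and (9) themselves are not reproduced above. As a rough, hedged sketch only, a standard incremental mean-pert formulation would minimize cost functions of the following general form, where the notation (and the omission of the temporal dimension) is an assumption introduced here for illustration:

```latex
J_{\mathrm{mean}}(\Delta\bar{\mathbf{x}})
  = \tfrac{1}{2}\,\Delta\bar{\mathbf{x}}^{\mathrm T}\,\mathbf{B}_{\mathrm{hyb}}^{-1}\,\Delta\bar{\mathbf{x}}
  + \tfrac{1}{2}\,\bigl(\mathbf{H}\Delta\bar{\mathbf{x}}-\bar{\mathbf{d}}\bigr)^{\mathrm T}
    \mathbf{R}^{-1}\bigl(\mathbf{H}\Delta\bar{\mathbf{x}}-\bar{\mathbf{d}}\bigr),
  \qquad \bar{\mathbf{d}} = \mathbf{y} - H\bigl(\bar{\mathbf{x}}^{b}\bigr),

J_{k}(\Delta\mathbf{x}'_{k})
  = \tfrac{1}{2}\,\Delta{\mathbf{x}'_{k}}^{\mathrm T}\,\mathbf{B}_{c}^{-1}\,\Delta\mathbf{x}'_{k}
  + \tfrac{1}{2}\,\bigl(\mathbf{H}\Delta\mathbf{x}'_{k}-\mathbf{d}'_{k}\bigr)^{\mathrm T}
    \mathbf{R}^{-1}\bigl(\mathbf{H}\Delta\mathbf{x}'_{k}-\mathbf{d}'_{k}\bigr),
  \qquad \mathbf{d}'_{k} = \mathbf{y}'_{k} - \mathbf{H}\,\mathbf{x}'^{\,b}_{k},
```

where $\Delta\bar{\mathbf{x}}$ is the ensemble mean increment, $\Delta\mathbf{x}'_{k}$ the perturbation increment for member $k$, $\mathbf{y}'_{k}$ the random observation perturbation for that member, $H$ ($\mathbf{H}$) the nonlinear (tangent linear) observation operator, $\mathbf{R}$ the observation error covariance matrix, and $\mathbf{B}_{\mathrm{hyb}}$ and $\mathbf{B}_{c}$ the background error covariance matrices used for the mean and perturbation increments, respectively (hybrid and static climatological in the configurations described in section 3).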
3. Description of numerical experiments
A series of numerical data assimilation experiments are performed to evaluate the impact on the ensemble analyses and forecasts of using a variational approach to compute the analysis increments within a perturbed observation ensemble data assimilation system. Table 1 provides a summary of the configurations of these data assimilation experiments. For comparison purposes, the EnKF-control experiment uses the same assimilation algorithm as in the ECCC operational system that serially assimilates batches of observations [described by Houtekamer et al. (2014a), with subsequent modifications described by Gagnon et al. (2014) and Gagnon et al. (2015)]. The VarEnKF experiment employs the variational approach for both the ensemble mean analysis increment and the ensemble perturbation analysis increments, while the EnVar-mean and EnVar-mean2 experiments use the variational approach only for the ensemble mean increment and rely on the serial batch EnKF algorithm for the ensemble perturbation increments. To test the forecast error sensitivity to the set of assimilated observations, the EnVar-mean2 experiment assimilates a much larger set of observations for computing the ensemble mean analysis increment than the EnVar-mean experiment.
Table 1. Summary of the configurations used for the numerical experiments.
All experiments begin from the same ensemble at 0000 UTC 28 December 2014. The initial ensemble is generated by adding, to a single 24-h deterministic forecast, a set of 256 random Gaussian perturbations computed using the static climatological background error covariances from the variational data assimilation system. No results are considered from the first six days to allow the ensemble to spin up from these initial conditions. The data assimilation experiments are then run with four analyses per day (i.e., 6-h assimilation windows) until 1200 UTC 15 January 2015 for the EnVar-mean and EnVar-mean2 experiments, and until 1200 UTC 28 January 2015 for the EnKF-control and VarEnKF experiments. Five-day, 20-member ensemble forecasts and an additional “control member” forecast are launched twice per day, at 0000 and 1200 UTC. As in the operational system, the ensemble forecasts are initialized with the first 20 analysis ensemble members after they are recentered about the mean of the full 256-member analysis ensemble. This ensemble mean is also used to initialize the control member forecast.
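A minimal sketch of the recentering step just described (variable names are illustrative): the first 20 members are shifted so that their mean matches the mean of the full 256-member analysis ensemble.

```python
import numpy as np

def recenter(subset, full_ensemble):
    """Shift `subset` so that its mean equals the mean of `full_ensemble`.

    subset        : (20, nstate)  members used for the medium-range forecasts
    full_ensemble : (256, nstate) complete analysis ensemble
    """
    return subset - subset.mean(axis=0) + full_ensemble.mean(axis=0)
```

The shifted members keep their original deviations; the full-ensemble mean, which also initializes the control member forecast, becomes their common center.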
All experiments are performed using the same set of configurations of the Global Environmental Multiscale (GEM) model (Girard et al. 2014) for the 256-member ensemble within the data assimilation cycle and for the 20-member medium-range ensemble forecasts. This includes the use of the same digital filter initialization procedure applied to the full analysis fields in all experiments. These configurations differ from those used operationally starting 15 December 2015 (Gagnon et al. 2015) in that the model top was raised from 2 to 0.1 hPa and the number of vertical levels increased from 74 to 80 to be more consistent with the deterministic prediction system (Buehner et al. 2015). Together with these changes, microwave radiances from channels 13 and 14 of AMSU-A instruments are additionally assimilated in these experiments. Compared with the operational EnKF configuration, these changes result in large improvements for the ensemble forecasts in the stratosphere at 100 hPa and above, but with little impact in the troposphere (results not shown). Also unlike the operational system, which uses fewer vertical levels for the medium-range ensemble forecasts than for the background states in the data assimilation cycle, the same 80 vertical levels are used in all model configurations. In all experiments, a set of random perturbations is added to the analysis ensemble to simulate the additional uncertainty due to various sources of system error. As in the operational EnKF, these are computed using the same static climatological background error covariances used in the variational data assimilation system, but after multiplying the standard deviations for all variables by 0.33 when initializing the forecasts for the ensemble of background states within the data assimilation cycle, and by 0.66 when initializing the medium-range ensemble forecasts.
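A sketch of how such system error perturbations might be drawn (the square-root factorization of the climatological covariance matrix and the function are assumptions introduced here; only the 0.33 and 0.66 scaling factors come from the text):

```python
import numpy as np

def system_error_perturbations(n_members, B_sqrt, scale, rng):
    """Draw random perturbations with covariance (scale**2) * B_climatological.

    B_sqrt : (nstate, m) square-root factor such that B = B_sqrt @ B_sqrt.T
    scale  : 0.33 within the assimilation cycle, 0.66 when initializing the
             medium-range ensemble forecasts (values from the text)
    """
    xi = rng.standard_normal((n_members, B_sqrt.shape[1]))
    return scale * xi @ B_sqrt.T

rng = np.random.default_rng(1)
B_sqrt = np.linalg.cholesky(0.5 * np.eye(5))   # toy stand-in for B**0.5
cycle_perts = system_error_perturbations(256, B_sqrt, 0.33, rng)
forecast_perts = system_error_perturbations(20, B_sqrt, 0.66, rng)
```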
The configuration of EnVar used for the ensemble mean is similar to the version of the deterministic analysis that has been used operationally since 15 November 2015 [Buehner et al. (2015), with subsequent modifications described by Qaddouri et al. (2015)]. Unlike the operational version, however, the ensemble covariances are obtained using each experiment's own 256-member ensemble with the model top at 0.1 hPa. Also, hybrid covariances are used with only a 10% contribution from the static climatological covariances.
As in the operational EnKF, a smaller subset of observation types is assimilated in the EnKF-control and EnVar-mean experiments (referred to as the “EnKF subset”) than in the operational deterministic system (referred to as the “full set”). The EnVar-mean2 and VarEnKF experiments both use the full set of observation types for computing only the ensemble mean increment. For the ensemble perturbation increments, the EnVar-mean2 experiment uses the EnKF subset, whereas the VarEnKF experiment uses the EnKF subset with the addition of ground-based GPS data. The specific types of observations assimilated in each experiment for computing both the ensemble mean increment and the ensemble perturbation increments are summarized in Table 2. As an example of the difference in the number of observations used in each experiment, for the 6-h time window centered at 1800 UTC 12 January 2015, the number of assimilated observations of all types was 1.22 million in the EnKF subset versus 3.14 million in the full set of observations. Of these observations, 0.85 million in the EnKF subset and 2.77 million in the full set of observations were satellite radiances.
Table 2. Summary of the types of observations assimilated in each experiment for computing the analysis increments for the ensemble mean and ensemble perturbations. Note that AMW refers to atmospheric motion winds from both geostationary and polar-orbiting satellites.
The observation error covariances differ between the EnVar used for computing the ensemble mean analysis increment and the EnKF serial batch algorithm. For the EnVar configuration, nonzero interchannel observation error correlations are included for all types of satellite radiance observations. These error correlations were estimated using the method of Desroziers et al. (2005) as described by Heilliette and Garand (2015). In combination with the inclusion of these correlations, it was found to be beneficial to decrease the error variances for water vapor and surface-sensitive channels, which are the most highly intercorrelated. These are the same observation error covariances as used in the EnVar to initialize the operational deterministic prediction system. In that system, it was found that inclusion of nonzero correlations did not significantly affect the convergence of the cost function minimization. In comparison, all observation error correlations are assumed to be zero in the EnKF serial batch algorithm used in this study, consistent with the operational EnKF configuration.
The VarEnKF experiment uses a much simpler configuration for computing the increments to the ensemble perturbations than it does for the increment to the ensemble mean. This includes, as already mentioned, the assimilation of far fewer observations (Table 2). In addition, only the static climatological background error covariances at a reduced spectral truncation of T108 are used (instead of the spectral truncation of T270 used for the ensemble mean). Since the ensemble covariances are not used in specifying the background error covariances for the ensemble perturbation increments, this configuration is effectively a 3DVar.
A cross-validation approach is used in the serial batch EnKF algorithm to avoid an excessive reduction in ensemble spread during the analysis step that would otherwise occur (Mitchell and Houtekamer 2009). Following this approach, the 256-member background ensemble is split into 8 subensembles, and only the other 7 of these (224 members) are used to estimate the background error covariances when computing the analysis increments for the 32 members of each subensemble (Houtekamer et al. 2014b). However, by calculating the ensemble mean analysis increment separately from the ensemble perturbation increments in the EnVar-mean, EnVar-mean2, and VarEnKF experiments, this problem is avoided. Consequently, the entire 256-member ensemble of background states is used to estimate the ensemble background error covariances when computing the ensemble mean analysis increment.
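The cross-validation grouping used by the serial batch EnKF can be summarized schematically as follows (the index handling is illustrative only):

```python
import numpy as np

n_members, n_groups = 256, 8
groups = np.arange(n_members).reshape(n_groups, -1)   # 8 subensembles of 32

for k in range(n_groups):
    update_members = groups[k]                                   # 32 members
    covariance_members = np.delete(groups, k, axis=0).ravel()    # other 224
    # In the serial batch EnKF, the background error covariances used to
    # update `update_members` are estimated only from `covariance_members`.
    assert update_members.size == 32 and covariance_members.size == 224
```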
4. Results from numerical experiments
a. Analysis and background ensembles
The impact on the ensemble spread of using the simplified variational approach instead of the serial batch EnKF algorithm for computing the ensemble perturbation analysis increments is shown for a single analysis time for surface pressure in Fig. 1. Only the ensemble spreads from the EnVar-mean2 and VarEnKF experiments are compared, since these experiments use identical configurations for computing the ensemble mean increment. This isolates the impact of using 3DVar instead of the serial batch EnKF algorithm for the ensemble perturbation increments, the simplification that is critical for giving the VarEnKF approach a computational cost comparable to that of the current EnKF. The overall amplitude and large-scale spatial variation of the background and analysis ensemble spreads are very similar in the EnVar-mean2 and VarEnKF experiments. The background ensemble spreads for the two experiments also have many smaller-scale features in common, including areas of relatively large spread in the extratropical oceanic regions, likely caused by rapid error growth related to low pressure systems. Several of these areas, however, have higher background spread in the VarEnKF experiment than in the EnVar-mean2 experiment. As expected, the analysis ensemble spread is generally reduced in each experiment relative to the background ensemble spread. This is most noticeable in areas of high background ensemble spread. In some areas with low background ensemble spread (e.g., over Greenland), however, the analysis spread is actually larger than the background spread, likely due to the addition of the random system error perturbations. As a result, the amount of spatial variation of the ensemble spread is reduced in the analysis ensembles relative to the background ensembles.
Fig. 1. The ensemble spread standard deviation of surface pressure (in hPa) at 0000 UTC 10 Jan 2015 for the (top) EnVar-mean2 and (bottom) VarEnKF experiments computed from the 256-member (left) background (6-h forecast) ensemble and (right) analysis ensemble. The analysis ensemble spread is computed after the addition of the random perturbations that simulate the various sources of system uncertainty.
The relationship between the analysis and background ensemble spreads is more clearly shown by computing the relative frequency of occurrence of different combinations of the background and analysis spread over all grid points for each experiment, as shown in Fig. 2. The combined effect of adding the perturbation analysis increments and the random system error perturbations results in a slight flattening of the relationship between the analysis and background spread, consistent with the reduction in spatial variation of the analysis ensemble spread seen in Fig. 1. The analysis spread also appears to deviate more from the background spread in the VarEnKF experiment (Fig. 2b) than in the EnVar-mean2 experiment (Fig. 2a). Figure 3 also shows the relative frequency of occurrence of different combinations of ensemble spread, but comparing either the background spread (Fig. 3a) or the analysis spread (Fig. 3b) between the two experiments. This shows that the two approaches generally result in very similar background and analysis ensemble spread across all grid points, except that for areas of low ensemble spread (around 0.4 hPa), the VarEnKF experiment has slightly lower spread than the EnVar-mean2 experiment. This can be explained by the differences in the background error covariances used for computing the ensemble perturbation increments. As part of the static climatological covariances used in the VarEnKF approach, the background error variance is nearly spatially constant, whereas the EnVar-mean2 experiment uses the spatially varying background error variance obtained from the ensemble spread. Consequently, in areas of low background ensemble spread the EnVar-mean2 experiment will correctly produce lower-amplitude analysis increments for the ensemble perturbations than the VarEnKF experiment. Since these analysis increments generally act to reduce the ensemble spread [as seen in Eq. (5)], the spread will be reduced less strongly in areas with low spread in the EnVar-mean2 experiment, and the opposite will occur in areas of high ensemble spread. This may also explain why several regions of high background ensemble spread have larger values in the VarEnKF experiment than in the EnVar-mean2 experiment. For example, large ensemble spread in the North Sea related to the rapidly developing cyclone “Nina” is more notable in the VarEnKF experiment (cf. Figs. 1a and 1c). To partially remove this difference between the two approaches, a simple modification with negligible additional computational cost could be made to the VarEnKF approach: use the background ensemble spread to specify the background error variances, while still using the static climatological correlations.
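The joint-frequency diagnostic shown in Figs. 2 and 3 amounts to a two-dimensional histogram of spread values over all grid points, normalized so that the largest bin equals 1; a sketch (the bin and range choices are assumptions) is:

```python
import numpy as np

def spread_histogram(spread_x, spread_y, bins=50, max_spread=3.0):
    """Normalized 2D frequency of occurrence of two spread fields."""
    hist, xedges, yedges = np.histogram2d(
        spread_x.ravel(), spread_y.ravel(),
        bins=bins, range=[[0.0, max_spread], [0.0, max_spread]])
    return hist / hist.max(), xedges, yedges

# e.g., background vs analysis surface pressure spread for one experiment:
# freq, xe, ye = spread_histogram(background_spread, analysis_spread)
```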
Fig. 2. Normalized frequency of occurrence of the ensemble spread standard deviation of surface pressure (in hPa) from the analysis ensemble vs from the background ensemble for the (a) EnVar-mean2 and (b) VarEnKF experiments at 0000 UTC 10 Jan 2015. The frequencies are normalized such that the maximum value is 1. The solid contour indicates the 0.05 level.
Fig. 3. As in Fig. 2, but for the EnVar-mean2 vs the VarEnKF experiments computed from (a) the background ensemble and (b) the analysis ensemble.
b. Control member forecast
The control member forecasts from each experiment are evaluated by comparing them with the independent atmospheric analyses from the ERA-Interim reanalysis (Dee et al. 2011). This comparison is done after first using a spatial averaging procedure for both the forecasts and the ERA-Interim analyses to interpolate them onto a global 1.5° latitude–longitude grid. Consequently, this evaluation does not include scales that cannot be resolved on this relatively low-resolution grid. Since the length of the experiments after the spinup period is relatively short (i.e., 26 forecasts over 13 days or 52 forecasts over 26 days), only the globally averaged forecast error statistics were computed, and only for forecasts up to a maximum lead time of 72 h.
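A sketch of the global verification statistics used below (the text does not state whether latitude weighting is applied, so the cos-latitude weights here are an assumption):

```python
import numpy as np

def global_error_stats(forecast, verifying_analysis, lats_deg):
    """Globally averaged bias and error standard deviation on a regular
    latitude-longitude grid (e.g., the 1.5 degree verification grid)."""
    error = forecast - verifying_analysis                # (nlat, nlon)
    w = np.cos(np.deg2rad(lats_deg))[:, None] * np.ones_like(error)
    w /= w.sum()                                         # assumed area weights
    bias = np.sum(w * error)
    std = np.sqrt(np.sum(w * (error - bias) ** 2))
    return bias, std
```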
Figure 4 shows the global error standard deviation and bias over the 13-day period measured relative to ERA-Interim analyses on pressure levels between 1000 and 10 hPa for the 72-h forecasts initialized from the analysis ensemble means from the EnVar-mean (red) and EnKF-control (blue) experiments. A significant reduction in error standard deviation and bias is seen at several levels in the troposphere for the temperature and geopotential height. Significant improvements to the bias for geopotential height extend into the stratosphere. Recall that the only difference between these experiments is the assimilation procedure used for the ensemble mean (Table 1). Tests performed with the deterministic prediction system to isolate the impact of only including the interchannel observation error correlations showed that a reduction in global 72-h forecast error standard deviation for temperature of up to 0.02 K can be expected from this change, with little impact on the mean error (Heilliette and Garand 2015). The length of the experiments is likely too short to determine definitively if the larger improvements seen in Fig. 4 (up to 0.05 K for the temperature standard deviation) are due to the use of EnVar with hybrid background error covariances for the ensemble mean instead of the serial batch EnKF algorithm. A significant improvement of EnVar-mean relative to EnKF-control, extending over even more pressure levels, is also seen for the 24- and 48-h global forecasts (not shown).
Fig. 4. The error in the 72-h global control member forecasts relative to ERA-Interim analyses from the EnVar-mean (red) and EnKF-control (blue) experiments, computed over the period 0000 UTC 3 Jan–1200 UTC 15 Jan 2015. Both the standard deviation (solid curves) and average (dashed curves) of the error are shown for (a) zonal wind, (b) relative humidity, (c) geopotential height, and (d) temperature. Boxes containing numbers on the right and left sides of each panel indicate the level of statistical significance that the error standard deviation and the mean error, respectively, are different between the two experiments. The color of each box (red for EnVar-mean, blue for EnKF-control) indicates the experiment with the error statistic closer to zero.
A similar comparison of 72-h global forecast error is shown in Fig. 5 for the EnVar-mean2 (red) and EnKF-control (blue) experiments. A small additional reduction in the forecast error standard deviation of all variables in the troposphere can be seen due to the assimilation of a much larger volume of satellite observations for computing the ensemble mean analysis increment (cf. Figs. 4 and 5). This comparison shows the benefit of using a configuration of EnVar nearly identical to the configuration used for the current deterministic prediction system when computing the ensemble mean analysis increment, since both experiments use the serial batch EnKF algorithm for the perturbation increments.
Fig. 5. As in Fig. 4, but showing the 72-h global forecast error relative to ERA-Interim analyses for the EnVar-mean2 (red) and EnKF-control (blue) experiments.
As in section 4a, the impact of changing the approach for computing the ensemble perturbation analysis increments is evaluated by comparing the VarEnKF (red) and EnVar-mean2 (blue) experiments (Fig. 6). The use of a highly simplified variational approach for the ensemble perturbation increments appears to have a nearly neutral impact on the accuracy of the control member forecast relative to using the serial batch algorithm. A small negative impact is seen only for zonal wind and geopotential height in the stratosphere. This negative impact, however, is much smaller than the positive impact of computing the ensemble mean analysis increment using EnVar with hybrid background error covariances, nonzero interchannel observation error correlations for satellite radiances, and a much larger number of assimilated observations.
Fig. 6. As in Fig. 4, but showing the 72-h global forecast error relative to ERA-Interim analyses for the VarEnKF (red) and EnVar-mean2 (blue) experiments.
All comparisons shown previously were based on only a 13-day period. To obtain a more robust result, the VarEnKF and EnKF-control experiments were extended until 1200 UTC 28 January 2015, a total period of 26 days. A direct comparison of these experiments (Fig. 7) shows the combined impact of using a variational approach for computing the analysis increments for both the ensemble mean and perturbations and all of the other changes for computing the ensemble mean increment. Even though the period is twice as long, most of the improvements seen in the comparison of EnVar-mean2 versus EnKF-control (Fig. 5) are maintained in the VarEnKF versus EnKF-control comparison. The statistical significance of the difference appears to be lower for tropospheric wind and geopotential height in this comparison than in the EnVar-mean2 versus EnKF-control comparison. This can be partially explained by the increased variability in the error standard deviations for both experiments during the second half of the 26-day period. Figure 8 shows the time series of error standard deviation for the global 500-hPa geopotential height forecasts during the entire period. For the 72-h forecast lead time, the error standard deviation is shown to increase substantially for both experiments during the second half of the period. The VarEnKF experiment produces forecasts with a lower global error standard deviation for 77% of the forecasts during this period, further supporting the significance of the improvement relative to the EnKF-control experiment. The time series for the 24- and 120-h forecast lead times show that this improvement is even more robust for the shorter forecasts, whereas the forecast accuracy is less systematically improved for the longer forecasts. Qualitatively similar and statistically significant improvements for the VarEnKF experiment were also seen in comparisons of forecast error relative to radiosonde observations (not shown).
Fig. 7. As in Fig. 4, but showing the 72-h global forecast error relative to ERA-Interim analyses for the VarEnKF (red) and EnKF-control (blue) experiments and over the longer period of 0000 UTC 3 Jan 2015–1200 UTC 28 Jan 2015.
Fig. 8. Time series of the error standard deviation (solid curves) in the global control member forecasts of 500-hPa geopotential height relative to ERA-Interim analyses from the VarEnKF (red) and EnKF-control (blue) experiments computed over the period 0000 UTC 3 Jan 2015–1200 UTC 28 Jan 2015. Results are shown for forecast lead times of (a) 24, (b) 72, and (c) 120 h. Note that the proportion of the 52 forecasts for which the global error standard deviation is lower for the VarEnKF experiment is 100%, 77%, and 69% for the 24-, 72-, and 120-h forecast lead times, respectively. The dotted lines are linear fits to the time evolution of the standard deviations.
c. Ensemble forecast
The analysis ensembles are used to initialize 20-member medium-range ensemble forecasts. Figure 9 shows an example of the ensemble spread for surface pressure computed from the 24- and 72-h forecasts valid at 0000 UTC 10 January 2015 from three experiments. The ensemble spread from the EnVar-mean2 experiment (Figs. 9c,d) is very similar to that from the EnKF-control experiment (Figs. 9a,b), especially for the 24-h lead time. This demonstrates that the changes to the assimilation approach used for the ensemble mean have only a small visible impact on the medium-range ensemble forecast spread, at least for surface pressure. This is somewhat surprising given the significant impact on the accuracy of the control member forecasts from this change (Fig. 5). A slightly more noticeable difference is apparent when comparing the ensemble spread from the VarEnKF experiment (Figs. 9e,f) with the EnVar-mean2 experiment. While most of the same areas of large spread are seen in both, the amplitude of the spread is different in several of these. This is caused by the difference in the assimilation procedure used for computing the analysis increments for the ensemble perturbations, though this change had little impact on the accuracy of the control member forecasts (Fig. 6).
Fig. 9. Ensemble spread of surface pressure (in hPa) for the (top) EnKF-control, (middle) EnVar-mean2, and (bottom) VarEnKF experiments computed from the 20-member ensemble forecasts with (left) 24- and (right) 72-h lead times valid at 0000 UTC 10 Jan 2015.
The continuous ranked probability score (CRPS) was computed relative to radiosonde zonal wind and temperature observations to evaluate changes to the overall accuracy of the ensemble forecasts for the VarEnKF experiment relative to the EnKF-control experiment over the 26-day period. The CRPS measures the error in the cumulative probability distribution computed from the ensemble members relative to observations (e.g., Hersbach 2000). The CRPS for the VarEnKF (red) and EnKF-control (blue) experiments is shown in Fig. 10 for three pressure levels (250, 500, and 850 hPa) as a function of lead time between 24 and 120 h. Consistent with the impacts on the control member forecasts, the change of the assimilation procedure for the ensemble mean results in a reduction to the CRPS in the VarEnKF experiment relative to the EnKF-control experiment for both temperature and zonal wind at the three pressure levels shown and for most lead times. This also indicates that the use of a simplified variational assimilation approach instead of the serial batch EnKF algorithm for the ensemble perturbation increments has a small impact on the overall quality of the ensemble forecasts, relative to the changes for the ensemble mean. The statistical significance of these differences was determined using the two-sided 90% confidence interval computed using a bootstrap method (Candille et al. 2007). By this measure, the improvements to the forecasts are statistically significant for both variables, the three pressure levels shown, and for most lead times (as indicated with the colored circles in the figure).
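For reference, the CRPS for a single observation can be computed directly from the ensemble members using the standard kernel identity CRPS = E|X - y| - 0.5 E|X - X'| (e.g., Hersbach 2000); the aggregation over stations, levels, and cases that produces Fig. 10 is omitted in this sketch:

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of an ensemble forecast for one scalar observation,
    using CRPS = E|X - y| - 0.5 * E|X - X'|."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2

# Example: a 20-member forecast verified against one radiosonde value.
print(crps_ensemble(np.random.default_rng(2).normal(0.0, 1.0, 20), 0.3))
```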
Fig. 10. The continuous ranked probability score (CRPS) relative to radiosonde observations of (left) zonal wind and (right) temperature at (top) 250, (middle) 500, and (bottom) 850 hPa for lead times between 24 and 120 h for the VarEnKF (red) and EnKF-control (blue) experiments. The small colored circles indicate the lead times for which the difference in CRPS is statistically significant, with lower values for VarEnKF in all cases. The CRPS is computed over the period 0000 UTC 3 Jan 2015–1200 UTC 28 Jan 2015.
d. Computational cost
In this section, the computational cost of the serial batch EnKF algorithm used in the current operational system is compared with the overall cost of the VarEnKF configuration evaluated in this study. The comparison focuses mostly on the cost of calculating the ensemble perturbation analysis increments since this dominates the overall cost of the VarEnKF approach. The computational cost of the EnVar-mean2 experiment would be simple to determine since it equals the cost of the serial batch EnKF algorithm used for the ensemble perturbations plus the cost of the full EnVar used for the ensemble mean, with the two being executed simultaneously.
The 3DVar used in the VarEnKF experiment to compute the ensemble perturbation increments takes approximately 90 s to complete the 70 iterations of the cost function minimization for one member using 128 processors on the IBM Power7 cluster at ECCC. Currently, 16 such jobs are submitted in parallel, each computing the analysis increments for 16 members sequentially, taking a total of about 24 min (using 2048 processors). This compares with the four-dimensional version of EnVar used for the ensemble mean analysis increment that currently takes about 16 min on 512 processors to also complete 70 iterations. Therefore, the EnVar requires approximately 40 times the computational resources of the 3DVar due to both the larger number of assimilated observations and the use of 4D background error covariances obtained from the 256 ensemble members. The EnVar for the ensemble mean can be executed in parallel with multiple 3DVar jobs for the ensemble perturbations. Moreover, since the analysis increment for each ensemble perturbation is independent of all the others, their calculation scales perfectly up to a very large number of processors. It would therefore be straightforward to reduce the execution time for computing the perturbation increments, if more processors were available, by simply splitting the 256 members over more than 16 jobs (up to a maximum of 256 separate parallel jobs).
For comparison, the assimilation algorithm of serially assimilating batches of observations to update the entire ensemble, as used in the current EnKF, takes about 11 min using 2048 processors. This algorithm, however, also requires a separate preprocessing of the observations that takes another 8 min on 256 processors. Accounting for this additional execution time (assuming it cannot be significantly decreased by increasing the number of processors beyond 256) results in a total of approximately 19 min. Therefore, the total execution time of VarEnKF is only moderately higher than that of the current EnKF algorithm, but with the benefit of only needing to develop, maintain, and test a single data assimilation system for both deterministic and ensemble prediction. This increase in computational cost relative to the current EnKF algorithm is small compared with the two orders of magnitude increase that would be required to use the basic EDA approach with an ensemble of 256 data assimilation systems, each equivalent to the deterministic system. The total execution time of VarEnKF could be reduced to match the execution time of the current EnKF, when including its observation preprocessing step, by increasing the total number of processors used for the ensemble perturbation analysis increments from 2048 to 2560.
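The timing figures quoted above can be checked with simple processor-time arithmetic; the sketch below only restates the numbers given in the text and assumes, as the text does, that wall time scales inversely with the number of parallel jobs.

```python
import math

# Processor-seconds for one 3DVar perturbation increment vs the EnVar mean increment.
pert_3dvar = 90 * 128                  # 90 s on 128 processors per member
mean_envar = 16 * 60 * 512             # 16 min on 512 processors
print(mean_envar / pert_3dvar)         # ~42.7, i.e., roughly 40 times the resources

# Wall time for all 256 perturbation increments: 16 parallel jobs of 128 processors,
# each computing 16 member increments sequentially (2048 processors in total).
print(16 * 90 / 60)                    # 24 min

# With 20 parallel jobs (2560 processors), each job handles ceil(256 / 20) = 13 members.
print(math.ceil(256 / 20) * 90 / 60)   # ~19.5 min, close to the ~19 min EnKF total
```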
5. Conclusions
In the context of the currently operational prediction systems at ECCC, a dramatic increase in the volume of assimilated observations along with other benefits could be realized by using the four-dimensional version of EnVar for computing the ensemble mean analysis increment within the ensemble data assimilation system. In other words, the EnKF analysis ensemble would be recentered on an EnVar analysis produced using the background ensemble mean as the background state. The results from this study indicate that this relatively easy modification would lead to significantly improved ensemble forecasts. This would also increase the consistency between the ensemble mean analysis and the deterministic analysis, since the EnVar configurations for the corresponding systems would be nearly identical. Alternatively, it may be possible to obtain a similar improvement by using the deterministic analysis state itself to recenter the EnKF analysis ensemble. This would be more similar to how most other NWP centers construct their analysis ensembles. Such a change would also significantly reduce the impact of further improvements to the serial batch EnKF assimilation approach used in the current system (e.g., from assimilating additional types of observations), since it would only be used to modify the ensemble spread and not the ensemble mean. Consequently, more research resources could be dedicated to improving the quality of the EnVar analysis technique, having an automatic benefit for both deterministic and ensemble systems. Alternatively, by also using a simplified variational approach for computing the ensemble perturbation analysis increments, that is, the complete VarEnKF approach, the need to maintain two different data assimilation algorithms would be avoided.
The conclusion that the analysis increments for the ensemble perturbations can be computed using a simpler data assimilation approach without significant impact on the accuracy of the ensemble forecasts could possibly be used to adapt other ensemble data assimilation schemes. For example, it suggests that the EDA approach used at ECMWF and Météo-France could be made much more efficient without a significant degradation to forecast skill. This could be accomplished by using the full 4DVar only for updating the ensemble mean and a much cheaper 3DVar (possibly assimilating fewer observations) for the ensemble perturbations, instead of the current approach of using 4DVar for all members. The resulting reduction in computational cost would facilitate an increase in the ensemble size and thus improve the usefulness of the resulting ensembles for application in both the ensemble and deterministic prediction systems.
Additional research is required to determine whether simplifications similar to those employed in this study for computing the ensemble perturbation increments would also have little impact on ensemble forecast accuracy in other types of systems. For example, the VarEnKF approach could be tested in a higher-resolution NWP system with a more rapid cycling strategy (e.g., 3-hourly or hourly cycling). More numerical experiments are also needed to confirm the results presented in this study for a global system over longer test periods and during different seasons before proposing any changes to the operational system.
Acknowledgments
The authors thank Martin Charron and two anonymous reviewers whose comments helped to improve an earlier version of the paper.
REFERENCES
Anderson, E., and H. Järvinen, 1999: Variational quality control. Quart. J. Roy. Meteor. Soc., 125, 697–722, doi:10.1002/qj.49712555416.
Berre, L., H. Varella, and G. Desroziers, 2015: Modelling of flow-dependent ensemble-based background-error correlations using a wavelet formulation in 4D-Var at Météo-France. Quart. J. Roy. Meteor. Soc., 141, 2803–2812, doi:10.1002/qj.2565.
Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background-error covariances: Evaluation in a quasi-operational setting. Quart. J. Roy. Meteor. Soc., 131, 1013–1043, doi:10.1256/qj.04.15.
Buehner, M., and A. Shlyaeva, 2015: Scale-dependent background-error covariance localisation. Tellus, 67A, 28027, doi:10.3402/tellusa.v67.28027.
Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. Mon. Wea. Rev., 138, 1550–1566, doi:10.1175/2009MWR3157.1.
Buehner, M., J. Morneau, and C. Charette, 2013: Four-dimensional ensemble-variational data assimilation for global deterministic weather prediction. Nonlinear Processes Geophys., 20, 669–682, doi:10.5194/npg-20-669-2013.
Buehner, M., and Coauthors, 2015: Implementation of deterministic weather forecasting systems based on ensemble-variational data assimilation at Environment Canada. Part I: The global system. Mon. Wea. Rev., 143, 2532–2559, doi:10.1175/MWR-D-14-00354.1.
Buizza, R., P. L. Houtekamer, G. Pellerin, Z. Toth, Y. Zhu, and M. Wei, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097, doi:10.1175/MWR2905.1.
Campbell, W. F., C. H. Bishop, and D. Hodyss, 2010: Vertical covariance localization for satellite radiances in ensemble Kalman filters. Mon. Wea. Rev., 138, 282–290, doi:10.1175/2009MWR3017.1.
Candille, G., C. Côté, P. L. Houtekamer, and G. Pellerin, 2007: Verification of an ensemble prediction system against observations. Mon. Wea. Rev., 135, 2688–2699, doi:10.1175/MWR3414.1.
Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2013: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., 139, 1445–1461, doi:10.1002/qj.2054.
Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597, doi:10.1002/qj.828.
Desroziers, G., L. Berre, B. Chapnik, and P. Poli, 2005: Diagnosis of observation, background and analysis-error statistics in observation space. Quart. J. Roy. Meteor. Soc., 131, 3385–3396, doi:10.1256/qj.05.108.
Fairbairn, D., S. R. Pring, A. C. Lorenc, and I. Roulstone, 2014: A comparison of 4DVar with ensemble data assimilation methods. Quart. J. Roy. Meteor. Soc., 140, 281–294, doi:10.1002/qj.2135.
Gagnon, N., X.-X. Deng, P. L. Houtekamer, S. Beauregard, A. Erfani, M. Charron, R. Lahlou, and J. Marcoux, 2014: Improvements to the Global Ensemble Prediction System (GEPS) from version 3.1.0 to 4.0.0. Canadian Meteorological Centre Tech. Note, 49 pp. [Available online at http://collaboration.cmc.ec.gc.ca/cmc/CMOI/product_guide/docs/lib/technote_geps-400_20141118_e.pdf.]
Gagnon, N., and Coauthors, 2015: Improvements to the Global Ensemble Prediction System (GEPS) from version 4.0.1 to version 4.1.1. Canadian Meteorological Centre Tech. Note, 36 pp. [Available online at http://collaboration.cmc.ec.gc.ca/cmc/CMOI/product_guide/docs/lib/technote_geps-411_20151215_e.pdf.]
Girard, C., and Coauthors, 2014: Staggered vertical discretization of the Canadian Environmental Multiscale (GEM) model using a coordinate of the log-hydrostatic-pressure type. Mon. Wea. Rev., 142, 1183–1196, doi:10.1175/MWR-D-13-00255.1.
Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919, doi:10.1175/1520-0493(2000)128<2905:AHEKFV>2.0.CO;2.
Heilliette, S., and L. Garand, 2015: Impact of accounting for inter-channel error covariances at the Canadian Meteorological Center. Oral Proceedings of the 2015 EUMETSAT Meteorological Satellite Conference, Session 1, Toulouse, France, EUMETSAT. [Available online at www.eumetsat.int/website/home/News/ConferencesandEvents/PreviousEvents/DAT_2305526.html.]
Hersbach, H., 2000: Decomposition of the continuous ranked probability score for ensemble prediction systems. Wea. Forecasting, 15, 559–570, doi:10.1175/1520-0434(2000)015<0559:DOTCRP>2.0.CO;2.
Houtekamer, P. L., L. Lefaivre, J. Derome, H. Ritchie, and H. L. Mitchell, 1996: A system simulation approach to ensemble prediction. Mon. Wea. Rev., 124, 1225–1242, doi:10.1175/1520-0493(1996)124<1225:ASSATE>2.0.CO;2.
Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with the ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620, doi:10.1175/MWR-2864.1.
Houtekamer, P. L., H. L. Mitchell, and X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. Mon. Wea. Rev., 137, 2126–2143, doi:10.1175/2008MWR2737.1.
Houtekamer, P. L., X. Deng, H. L. Mitchell, S.-J. Baek, and N. Gagnon, 2014a: Higher resolution in an operational ensemble Kalman filter. Mon. Wea. Rev., 142, 1143–1162, doi:10.1175/MWR-D-13-00138.1.
Houtekamer, P. L., B. He, and H. L. Mitchell, 2014b: Parallel implementation of an ensemble Kalman filter. Mon. Wea. Rev., 142, 1163–1182, doi:10.1175/MWR-D-13-00011.1.
Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Physica D, 230, 112–126, doi:10.1016/j.physd.2006.11.008.
Isaksen, L., M. Bonavita, R. Buizza, M. Fisher, J. Haseler, M. Leutbecher, and L. Raynaud, 2010: Ensemble of data assimilations at ECMWF. ECMWF Tech. Memo. 636, 48 pp. [Available online at http://www.ecmwf.int/sites/default/files/elibrary/2010/10125-ensemble-data-assimilations-ecmwf.pdf.]
Kleist, D., and K. Ide, 2015: An OSSE-based evaluation of hybrid variational-ensemble data assimilation for the NCEP GFS. Part II: 4DEnVar and hybrid variants. Mon. Wea. Rev., 143, 452–470, doi:10.1175/MWR-D-13-00350.1.
Lei, L., and J. S. Whitaker, 2015: Model space localization is not always better than observation space localization for assimilation of satellite radiances. Mon. Wea. Rev., 143, 3948–3955, doi:10.1175/MWR-D-14-00413.1.
Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129, 3183–3203, doi:10.1256/qj.02.132.
Lorenc, A. C., N. Bowler, A. Clayton, and S. Pring, 2014: Development of the Met Office’s 4DEnVar System. Sixth EnKF Data Assimilation Workshop, Buffalo, NY, Pennsylvania State University. [Available online at http://hfip.psu.edu/fuz4/EnKF2014/EnKF-Day1/Lorenc_4DEnVar.pptx.]
Magnusson, L., M. Leutbecher, and E. Källén, 2008: Comparison between singular vectors and breeding vectors as initial perturbations for the ECMWF Ensemble Prediction System. Mon. Wea. Rev., 136, 4092–4104, doi:10.1175/2008MWR2498.1.
Mitchell, H. L., and P. L. Houtekamer, 2009: Ensemble Kalman filter configurations and their performance with the logistic map. Mon. Wea. Rev., 137, 4325–4343, doi:10.1175/2009MWR2823.1.
National Weather Service, 2015: Technical implementation notice 15-43: Global Ensemble Forecast System (GEFS). (Changes effective 2 December 2015), accessed 8 January 2016. [Available online at http://www.nws.noaa.gov/os/notification/tin15-43gefsaad.htm.]
Parrish, D., and J. Derber, 1992: The National Meteorological Center’s spectral statistical interpolation analysis scheme. Mon. Wea. Rev., 120, 1747–1763, doi:10.1175/1520-0493(1992)120<1747:TNMCSS>2.0.CO;2.
Qaddouri, A., C. Girard, L. Garand, A. Plante, and D. Anselmo, 2015: Changes to the Global Deterministic Prediction System (GDPS) from version 4.0.1 to version 5.0.0—Yin–Yang grid configuration. Canadian Meteorological Centre Tech. Note, 59 pp. [Available online at http://collaboration.cmc.ec.gc.ca/cmc/CMOI/product_guide/docs/lib/technote_gdps-500_20151215_e.pdf.]
Raynaud, L., and F. Bouttier, 2016: Comparison of initial perturbation methods for ensemble prediction at convective scale. Quart. J. Roy. Meteor. Soc., 142, 854–866, doi:10.1002/qj.2686.
Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. Mon. Wea. Rev., 131, 1485–1490, doi:10.1175/1520-0493(2003)131<1485:ESRF>2.0.CO;2.
Wang, X., and T. Lei, 2014: GSI-based four dimensional ensemble-variational (4DEnsVar) data assimilation: Formulation and single resolution experiments with real data for NCEP Global Forecast System. Mon. Wea. Rev., 142, 3303–3325, doi:10.1175/MWR-D-13-00303.1.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.
Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089, doi:10.1175/MWR-D-11-00276.1.
Statistical significance is computed using a permutation test; differences are reported as significant when the confidence level is at least 0.9.