## 1. Introduction

Four-dimensional variational data assimilation (4DVar) techniques that use tangent-linear (Lewis and Derber 1985; Courtier et al. 1994) or linear perturbation models (Rawlins et al. 2007) and their corresponding adjoints have been shown to be powerful natural extensions to the 3DVar technique. In fact, 4DVar is the method of choice for initialization of single deterministic numerical weather prediction (NWP) applications at many operational centers (Rabier et al. 2000; Rosmond and Xu 2006; Gauthier et al. 2007; Rawlins et al. 2007). One attractive feature of 4DVar is that a dynamic model is used to help impose temporal smoothness and physical constraints on the analysis. Additionally, 4DVar allows for the simultaneous assimilation of asynchronous observations throughout a window at their appropriate times by producing a 4D analysis trajectory (Lorenc and Rawlins 2005). This is in contrast with the three-dimensional variational data assimilation-First Guess at Appropriate Time (3DVar-FGAT) method (Rabier et al. 1998; Lawless 2010), which employs 4D model states at the appropriate time to compute innovations but only solves for a solution at a single time, typically at the center of a window. The major drawbacks to the 4DVar technique are the computational cost, complications related to developing and maintaining linearized forecast models and their corresponding adjoints, and the basic assumption of linearity for the incremental formulation, which may be particularly problematic for high resolution.

Much like 3DVar, 4DVar typically assumes a static background error covariance, though one that is valid at the beginning of the assimilation window. The tangent-linear (TL) and adjoint (AD) models then implicitly evolve this background error covariance as part of the variational solver. This procedure does allow for some flow dependence, though the quality of the initial static background error covariance can still play a crucial role for short assimilation windows. Further, the use of a dynamic model to constrain the solution does help improve the multivariate aspects. Much like for the 3D case, the development and application of a hybrid 4DVar technique, which utilizes ensemble-based covariances to help prescribe the background error covariance at the beginning of the assimilation window, has been shown to be beneficial (Buehner et al. 2010a,b; Zhang and Zhang 2012; Clayton et al. 2013; Kuhl et al. 2013). The drawbacks of such a method are the same as those for 4DVar, namely, computational cost and the need for TL and AD models. As in Lorenc (2013), hybrid is used to describe a blended covariance and 4DVar implies the use of TL and AD models.

Along the lines of the 4D ensemble-based techniques (Hunt et al. 2004) such as the 4D-local ensemble transform Kalman filter (LETKF; Hunt et al. 2007), several methods expanding on the idea introduced by Lorenc (2003) have recently been proposed to utilize 4D ensemble perturbations within a variational framework (EnVar; Lorenc 2013) to help solve for a 4D-analysis increment without the need for TL and AD models (Liu et al. 2008; Tian et al. 2008; Liu et al. 2009; Tian et al. 2011; Buehner et al. 2010a,b; Liu and Xiao 2013). While most of the methods in these previous studies rely exclusively on an ensemble-based error covariance, several recent studies did combine the ensemble covariances with time-invariant static covariances in their 4DEnVar (i.e., hybrid 4DEnVar; Buehner et al. 2013; Desroziers et al. 2014; Lorenc et al. 2015; Wang and Lei 2014). Formulating the problem in the variational framework allows one to take full advantage of the many developments that have taken place over the years, such as dynamic constraints (Gauthier and Thépaut 2001; Kleist et al. 2009) and variational bias correction (Derber and Wu 1998; Dee 2005; Zhu et al. 2014).

In Kleist and Ide (2015, hereafter Part I) it was demonstrated that including ensemble covariances in a variational-based hybrid algorithm yielded improvements in the quality of analyses and subsequent forecasts for the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) model in the context of an observing system simulation experiment (OSSE). The experiments were performed using 3DVar and hybrid 3DEnVar, leaving significant room for improvement. Without access to the TL and AD models, extending the hybrid 3DEnVar to include 4D ensemble perturbations (hybrid 4DEnVar) is a logical next step for improving upon the previous work.

Several previous studies have investigated the use of 4DEnVar for use with global deterministic NWP. A 4DEnVar algorithm was implemented into a prototype observation space global data assimilation system for the U.S. Navy in Bishop and Hodyss (2011). There, they focused on an adaptive localization algorithm within the context of a single case study. Buehner et al. (2010b) performed an intercomparison study for the Canadian operational global NWP model, and found that the 4DEnVar improved upon their operational, nonhybrid 4DVar in the tropics and Southern Hemisphere, but not in the Northern Hemisphere. It was also found that 4DEnVar performed slightly worse than a hybrid 4DVar. Buehner et al. (2013), again using the operational Canadian system, found that while the use of 4D instead of 3D ensemble covariances did result in small, consistent improvements in their EnVar for deterministic NWP, the gains were not as large as found when going from 3DVar to 4DVar. They also found that the performance of their 4DEnVar was comparable to or better than their 4DVar, except for the extratropical summer regions. Using the Met Office global system, Lorenc et al. (2015) found that both hybrid 4DVar and 4DEnVar beat their 3D counterparts in NWP trial runs. However, it was also found that their hybrid 4DEnVar did not perform as well as their hybrid 4DVar because of the heavy reliance on the climatological covariance that the 4DVar algorithm can propagate through the assimilation window whereas the EnVar algorithm cannot. Finally, Wang and Lei (2014) performed a comparison study of hybrid 4DEnVar with 3DEnVar using the NCEP GFS (model) at low resolution. They found that 4DEnVar was better than 3DEnVar, with a larger impact in the extratropical troposphere than in the tropics. They also found that analysis increments from the 4DEnVar algorithm were more balanced than 3DEnVar, and as in a companion 3DEnVar study (Wang et al. 2013), the use of a dynamic constraint was valuable for improving the forecast skill of the 4DEnVar initialized forecasts in the extratropics.

This work is similar to the aforementioned studies that investigated the use of 4DEnVar for initializing global deterministic NWP. Here, the NCEP GFS is utilized as in Wang and Lei (2014); however, as in Part I, an OSSE is used instead of real observations. OSSEs have the distinction of allowing for the calculation of actual analysis error since the “truth” is known. Furthermore, the experiments utilize a dual-resolution paradigm as in Part I. Given the importance of initialization within the context of 4DEnVar for the NCEP GFS system (Wang and Lei 2014) and the Met Office system (Lorenc et al. 2015), we also aim to corroborate previous findings as well as explore alternate, computationally efficient initialization options. The remainder of the manuscript is organized as follows. Section 2 describes the implementation of the 4D extension to the hybrid including a time-invariant static error covariance supplement. Section 3 then follows with a description of various OSSE-based experiments that demonstrate the impact of utilizing 4DEnVar and hybrid variants relative to the 3D hybrid experiments that were carried out in Part I. Several experiments are carried out to demonstrate the impact of including a static error covariance in the dual-resolution 4DEnVar paradigm, as well as to show the impact of various dynamic constraints. A summary and motivation for future work then follows.

## 2. 4D extensions of 3D hybrid

### a. GSI-based hybrid

[…] $\mathbf{B}_f$ is the static background error covariance, […] $K$-time levels and the TL model ($\mathbf{M}_k$) for each time-level index ($k$). The AD model (transpose of $\mathbf{M}_k$) is necessary in order to obtain the gradient for the minimization. This cost function can then be extended to a hybrid 4DVar cost function by including an ensemble control variable ($\boldsymbol{\alpha}^n$) for the ensemble contribution to the analysis increment at the beginning of the window:

[…] ($\boldsymbol{\alpha}^n$) for an ensemble of size $N$:

[…] $t = 0$ and the hybrid analysis increment is propagated to each time level ($k$) by the TL as in Eq. (2). For a dual-resolution configuration, […]

[…] $\mathbf{M}_k$ is simply chosen to be the identity model, where a single static contribution is valid through the whole window as in hybrid 3DEnVar. The ensemble contribution to the increment now utilizes 4D, nonlinear perturbations associated with each of the observation bins. In this particular formulation, $\boldsymbol{\alpha}^n$ is assumed to be the same throughout the assimilation window, analogous to the weights in a 4D-LETKF without temporal localization. More details regarding the GSI 4DEnVar formulation can be found in Wang and Lei (2014). Alternate formulations and practical implementations of 4DEnVar are provided in Desroziers et al. (2014).
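For orientation, the standard incremental forms that this description follows can be sketched as below. This is a reconstruction in conventional notation, not a transcription of the numbered equations. Beyond $\mathbf{B}_f$ (static background error covariance), $\mathbf{M}_k$ (TL model), and $\boldsymbol{\alpha}^n$ (ensemble control variable), the symbols are assumed here: $\mathbf{x}'_f$ (static part of the increment), $\mathbf{x}^n_e$ (ensemble perturbations), $\beta_f$ and $\beta_e$ (covariance weights), $\mathbf{L}$ (localization), and $\mathbf{H}_k$, $\mathbf{R}_k$, $\mathbf{y}'_k$ (observation operator, observation error covariance, and innovation at time level $k$).

$$
\begin{aligned}
\text{4DVar:}\quad & J(\mathbf{x}'_0) = \tfrac{1}{2}(\mathbf{x}'_0)^{\mathrm{T}}\mathbf{B}_f^{-1}\mathbf{x}'_0 + J_o, \qquad \mathbf{x}'_k = \mathbf{M}_k\mathbf{x}'_0,\\
& J_o = \tfrac{1}{2}\sum_{k=1}^{K}(\mathbf{H}_k\mathbf{x}'_k-\mathbf{y}'_k)^{\mathrm{T}}\mathbf{R}_k^{-1}(\mathbf{H}_k\mathbf{x}'_k-\mathbf{y}'_k),\\
\text{hybrid:}\quad & J(\mathbf{x}'_f,\boldsymbol{\alpha}) = \beta_f\,\tfrac{1}{2}(\mathbf{x}'_f)^{\mathrm{T}}\mathbf{B}_f^{-1}\mathbf{x}'_f + \beta_e\,\tfrac{1}{2}\sum_{n=1}^{N}(\boldsymbol{\alpha}^n)^{\mathrm{T}}\mathbf{L}^{-1}\boldsymbol{\alpha}^n + J_o,\\
\text{hybrid 4DVar:}\quad & \mathbf{x}'_k = \mathbf{M}_k\Big[\mathbf{x}'_f + \sum_{n=1}^{N}\big(\boldsymbol{\alpha}^n\circ\mathbf{x}^n_e\big)\Big],\\
\text{hybrid 4DEnVar:}\quad & \mathbf{x}'_k = \mathbf{x}'_f + \sum_{n=1}^{N}\big(\boldsymbol{\alpha}^n\circ(\mathbf{x}^n_e)_k\big).
\end{aligned}
$$

In the last line the TL model acting on the static part is the identity, and the time dependence comes only from the ensemble perturbations $(\mathbf{x}^n_e)_k$ valid at each time level.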

To allow for a single time level, the time-invariant contribution, $\mathbf{M}_k$ is set to be an identity model. One could formulate an alternate hybrid and utilize the TL and AD models to evolve the static contribution *only*, prescribe

The necessary components to solve the GSI using the formulation that is described by Eqs. (3) and (5), 4DEnVar, are already in place. The ensemble control variable for the hybrid is developed and implemented as part of the 3D-hybrid work described in Part I. The 4D ensemble perturbations are derived from a serial square root filter form of an ensemble Kalman filter (EnKF; Whitaker and Hamill 2002; Whitaker et al. 2008). A 4DVar capability within the GSI was previously developed through collaboration with colleagues at the National Aeronautics and Space Administration (NASA) Global Modeling and Assimilation Office (GMAO; R. Todling and Y. Trémolet 2010, personal communication). A key component of the extension of 3DVar to 4DVar within the GSI was the addition of new observation handling features and the inclusion of time binning. All configurations (3D and 4D) utilize similar default observation selection. With these pieces in place, the 3D hybrid capability is extended to allow for 4D ensemble perturbations and a 4DEnVar option.
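The way the ensemble contribution enters at each time level can be illustrated with a toy calculation. The sketch below assumes the identity model for the static contribution and control-variable weights that are constant through the window, as described earlier in this subsection. The function name, the list-of-lists layout, and the numbers are hypothetical and are not taken from the GSI code.

```python
def envar_increment(static_increment, alpha, perturbations_by_time):
    """Return x'_k = x'_f + sum_n alpha^n * (x_e^n)_k for every time level k.

    static_increment      : list of floats, x'_f (same at every time level)
    alpha                 : alpha[n][i], weight field for member n (time invariant)
    perturbations_by_time : perturbations_by_time[k][n][i], member n at time level k
    """
    increments = []
    for members_at_k in perturbations_by_time:
        x_k = list(static_increment)  # identity model: static part is unchanged
        for alpha_n, pert_n in zip(alpha, members_at_k):
            for i, (weight, pert) in enumerate(zip(alpha_n, pert_n)):
                x_k[i] += weight * pert  # element-wise (Schur) product, summed over members
        increments.append(x_k)
    return increments


# Two grid points, two members, two time levels (illustrative values only).
x_f = [1.0, 0.0]
alpha = [[0.5, 0.5], [1.0, -1.0]]
perts = [
    [[2.0, 4.0], [1.0, 1.0]],  # time level 0
    [[0.0, 2.0], [3.0, 1.0]],  # time level 1
]
print(envar_increment(x_f, alpha, perts))  # [[3.0, 1.0], [4.0, 0.0]]
```

The same weights produce a different increment at each time level only because the perturbations differ between time levels, which is how the 4D information enters without a TL model.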

### b. Single-observation example

To demonstrate the impact of the 4DEnVar algorithms, a set of experiments that assimilate only a single simulated observation is performed. A summary and description of the experiments is available in Table 1. The two hybrid cases that include a static and ensemble contribution to the increment are done so with

Table 1. Description of various 4D (and hybrid) single-observation experiments and relevant equations for Figs. 1 and 2.

The resultant analysis increment at the middle of the assimilation window, three hours after the observation is taken, for the various 4D configurations is shown in Fig. 1. All four experiments show the maximum increment downstream from where the observation was taken, consistent with the northwesterly background flow. The 4DVar experiment results in a spatially broad, quasi-Gaussian temperature increment (Fig. 1, top left). This is not terribly surprising given that only three hours elapse between the time that the observation was taken at the beginning of the window and the analysis time at the center of the window. The three experiments that utilized ensemble covariances exhibit a temperature increment that is stretched along the height gradient as would be expected (Fig. 1). All of the experiments show a cyclonic wind response to the cold temperature observation and increment, with the ensemble- and hybrid-based experiments showing a stronger wind response than the 4DVar case. It is clear that the 4DEnVar case suffers from sampling error (spurious correlations) more so than the two hybrid variants. This is not surprising either, in this case, given that a 40-member ensemble is utilized in combination with quite broad localization. However, adding a time-invariant, static contribution to the increment helps to reduce the impact of these apparent problems substantially, without hurting the 4D nature of the increment. In fact, the H-4DEnVar increment is qualitatively and quantitatively very similar to that from the H-4DVar experiment, despite the fact that one uses a fairly simple dynamic model while the other utilizes 4D, nonlinearly evolved ensemble perturbations. This is in contrast to the single-observation experiments carried out in Lorenc et al. (2015), where the heavier reliance on a static, time-invariant contribution to the increment was shown to be problematic in the H-4DEnVar paradigm.

The same single-observation test can be utilized to further investigate how the algorithms handle the propagation of information through the assimilation window, by visualizing the analysis increment at various *k*-time levels. Recall that for the 4DVar cases, this involves an explicit propagation of the increment through the use of a linear, dynamic model [i.e., Eq. (2)]. For the 4DEnVar variants, the propagation of information is achieved implicitly through correlations contained within the 4D ensemble perturbations [i.e., Eq. (5) with _{k} =

## 3. Constraints on high-frequency noise

The 4DEnVar (and hybrid) option is implemented in such a way as to allow for the application of many of the standard features included in the 3D GSI such as variational quality control, variational satellite bias correction, the tangent-linear normal mode constraint (TLNMC; Kleist et al. 2009), and various weak constraints such as digital filter (Gustafsson 1993; Polavarapu et al. 2000; Gauthier and Thépaut 2001). However, the 4DEnVar solution to the analysis problem requires special attention when it comes to the application of such constraints given that the 4D aspect of the problem is obtained through a dot product of the weights and the ensemble perturbations in a discrete manner without the explicit use of a dynamic model. The TLNMC and the weak constraint digital filter, as well as the combination of the two, are explored within the context of 4DEnVar. In this work, the constraints are imposed on the high-resolution deterministic analysis. While the EnKF analysis is recentered about the filtered, deterministic analysis, the individual members are only explicitly filtered using the full-field digital filter of the GFS model.

### a. Tangent-linear normal mode constraint

For a dynamic constraint such as the TLNMC, the minimization procedure of the cost function within the 4D context can be prohibitively expensive if one were to use hourly time levels and a 6-h assimilation window, as an example. Furthermore, the original implementation of the normal mode constraint for application within hybrid 3DEnVar requires the filter to be applied to the total analysis increment as a sum of static and ensemble contributions without much flexibility, as in Eq. (2) from Part I. For these reasons, several options related to the application of various constraints are explored and implemented into the analysis code.

[…] $k$-time levels:

[…] $\mathbf{C}_k$ is also denoted with a time level index, since the possibility exists for linearization about the background for each time level, as the constraint is in fact tangent linear. However, for practical reasons such as computational cost and memory, the background state used for linearization in […] $k$ time levels. Wang and Lei (2014) utilized the constraint in this manner for their 4DEnVar experiments and found improvements in forecast skill in the extratropical troposphere. Even so, the application of the constraint in this manner can be prohibitively expensive. For a 6-h window with hourly time levels and hourly ensemble perturbations, the computational cost of the analysis goes up substantially. For this reason, another possibility, to apply the full constraint only to the solution in the middle of the window, is considered. The advantage of such a method is that the increment that is applied to the background and used to restart the model can be filtered explicitly, reducing spinup and spindown, without the considerable cost of having to filter all time levels. This of course introduces inconsistencies between the incremental solution in the center of the window and the other time levels, which could have undesirable consequences. All applications of the TLNMC referred to hereafter utilize the default eight vertical modes and a single iteration as in Kleist et al. (2009) and Part I.

### b. Weak constraint digital filter

We denote this digital filter initialization as the JCDFI (weak constraint penalty based on digital filter initialization). The formulation of the JCDFI involves the addition of a new penalty term [see Gustafsson (1993) or Polavarapu et al. (2000) for a more detailed derivation]:

[…] $m$ (in the case of 4DEnVar, assumed to be the center of the assimilation window), and $\chi$ is a general weighting factor. This weighting parameter is typically denoted $\alpha$ in the literature but avoided in this context to remove confusion with the ensemble control variable. The filtered state is constructed from the 4D increment using the same filter coefficients ($h$) as in the standard formulation (Lynch and Huang 1992):

[…] $k$. The norm for the penalty function in Eq. (8) is chosen to be a dry energy norm, though the capability does exist to use a moist energy norm. Such a constraint in the 4DEnVar context, on one hand, potentially allows for noise control within the 4D increment without adding much computational cost. On the other hand, high-frequency noise cannot be described by the hourly incremental states in this context. Furthermore, there are no means to correct the origin of the high-frequency noise with links to the actual model equations as would be the case for a 4DVar-based formulation. Despite these limitations, it does allow for a cost-effective alternative to employ some filtering of the 4D increment. Additionally, it is possible to explore the use of a combination of noise constraints, weak constraint digital filter and TLNMC, to improve the quality of analysis.
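A toy version of such a penalty is sketched below. It assumes a Lanczos-windowed low-pass filter of the kind described by Lynch and Huang (1992), seven hourly time levels, and a plain Euclidean norm standing in for the dry energy norm. The function names, the 6-h cutoff period, and the test signals are hypothetical; this is not the GSI implementation.

```python
import math


def lanczos_dfi_weights(half_width, dt_hours, cutoff_period_hours):
    """Low-pass weights h_k for k = -M..M with a Lanczos window, normalized to sum to 1."""
    theta_c = 2.0 * math.pi * dt_hours / cutoff_period_hours  # cutoff frequency times dt
    weights = []
    for k in range(-half_width, half_width + 1):
        if k == 0:
            w = theta_c / math.pi
        else:
            arg = k * math.pi / (half_width + 1)
            window = math.sin(arg) / arg  # Lanczos window factor
            w = window * math.sin(k * theta_c) / (k * math.pi)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def filtered_center_state(increments, weights):
    """Weighted sum over time levels of the increment at each grid point."""
    n_state = len(increments[0])
    return [sum(h * x[i] for h, x in zip(weights, increments)) for i in range(n_state)]


def jcdfi_penalty(increments, weights, chi):
    """chi * ||x_center - x_filtered||^2 using a Euclidean norm (assumption)."""
    center = increments[len(increments) // 2]
    filtered = filtered_center_state(increments, weights)
    return chi * sum((c - f) ** 2 for c, f in zip(center, filtered))


h = lanczos_dfi_weights(half_width=3, dt_hours=1.0, cutoff_period_hours=6.0)
steady = [[2.0, -1.0] for _ in range(7)]           # no time variation
noisy = [[(-1.0) ** k, 0.0] for k in range(7)]     # 2-h oscillation at one point
print(jcdfi_penalty(steady, h, chi=10.0), jcdfi_penalty(noisy, h, chi=10.0))
```

Because the weights sum to one, an increment that does not vary in time is left essentially unpenalized, whereas an oscillation with a period of two time levels is penalized. Anything faster than that cannot be represented by the hourly states at all, which is the limitation noted in the text.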

### c. Single analysis impact

To ensure that the constraints are all properly functioning for use with the 4DEnVar paradigm within GSI, a single analysis test case that assimilates real observations including satellite radiances within a 6-h window valid at 0600 UTC 15 July 2010 is performed. The background and ensemble members are generated from an offline experiment that utilizes the dual-resolution hybrid configuration, but at higher resolution than the OSSE-based cycled experiments described in Part I, with T574 for the high-resolution deterministic component and T254 for the 80-member ensemble. Double inner loops of 100 iterations each are utilized, whereby the quality control, observation selection, and relinearization are performed and the nonlinear model is not rerun. The analysis is run with four separate configurations: 1) 4DEnVar, 2) 4DEnVar with TLNMC on the increment over all time levels, 3) 4DEnVar with JCDFI, and 4) 4DEnVar with the TLNMC imposed at the center of the window only combined with the JCDFI. The fourth configuration that utilizes both constraints is done with computational considerations in mind. In all four configurations, there is no static contribution to the solution (i.e.,

The divergence increment for the experiments that ran with the JCDFI is damped across the entire spectrum, including at large scales (Fig. 3). Some interesting behavior is observed on the high-spatial-frequency part of the spectrum, most of which is an artifact of the dual-resolution aspect of the configuration. The aliasing that results from the interpolation within the current dual-resolution configuration is an area of active research and the subject of a future manuscript. As expected, the more the solution is constrained, the more difficult it becomes to draw to the observations (Fig. 4). At the end of the first inner loop minimization (100 iterations), the penalty reduction is greatest for the 4DEnVar case with no extra constraints, followed by the experiments that utilized a single constraint (TLNMC and JCDFI), followed by the configuration that used both constraints. At the end of the second inner loop minimization, the order is slightly changed, but the final observation penalty is within 1% for all four configurations. The minimization appears well behaved with a steady reduction of the gradient norm in all four experiments. By design, the TLNMC has a significant impact on the incremental tendencies (Kleist et al. 2009), particularly the gravity mode tendencies (Table 2). On the other hand, while the JCDFI reduces the total tendencies by a factor of 2, it has very little impact on the amount of tendencies that project onto undesirable gravity modes due to the aforementioned issues regarding the inability to filter the highest-frequency noise using only hourly states. The combined (COMB) configuration that utilizes JCDFI and TLNMC in the center of the window appears to be the best compromise, by significantly reducing the total tendencies while also reducing the percentage that project onto gravity modes, all at a significantly cheaper computational cost than running the normal mode constraint on all time levels. 
The inclusion of a static error covariance (H-4DEnVar) also acts to decrease the incremental tendencies slightly, while improving the ratio of gravity mode to total tendencies.

Table 2. The root-mean-square sum of the incremental (spectral) tendencies (total and gravity mode) as well as the ratio (gravity mode/total) for the eight vertical modes kept as part of the TLNMC for the single analysis valid at 0600 UTC 15 Jul 2010 using various 4DEnVar configurations.

## 4. Cycled experiment results

### a. Analysis error comparison

A series of experiments is designed to study the impact of the 4D-ensemble and hybrid components as an extension to the 3D experiments that were described in Part I. All of these new experiments utilize the same cold start initial condition for the control and ensemble, simulated observations from the Joint OSSE (Andersson and Masutani 2010), GFS model version and spatial resolutions (dual-resolution mode, T382 deterministic with an 80-member T190 ensemble), comparable versions for the GSI and EnKF codes, identical inflation parameters (Whitaker and Hamill 2012), and cycling configuration for the data assimilation including the recentering procedure. The experiments are designed in such a way as to infer which components of the hybrid and 4D extensions yield improvements (or degradations) to the quality of analyses. A listing and description of the experiments can be found in Table 3. Note that the two 3D experiments were carried out for Part I, but are included here as a reference point. To reiterate, the assimilation and nature run models are not the same, and it is important to keep in mind their different handling of various components such as parameterized convection when interpreting the results.

Table 3. Description of various (hybrid) OSSE-based experiments.

The first experiment, H-4DEnVar_NMI, is designed to build upon the successes of the 3DHYB experiment in Part I and to test the impact of going from 3D to 4D ensemble perturbations. The assimilation window is kept fixed at 6 h, though the H-4DEnVar_NMI experiment utilizes hourly time levels and, therefore, hourly background and ensemble perturbations [e.g., Eqs. (3), (5), and (7)]. The observation selection procedure is identical in both configurations. This hybrid configuration also utilizes a 25% time-invariant static contribution to the analysis increment as well as the TLNMC over all seven time levels. The change in the analysis error for zonal wind, temperature, and specific humidity is shown in Fig. 5. Generally speaking, the analysis error is smaller in the H-4DEnVar_NMI experiment, especially for upper-tropospheric extratropical winds and temperature, and lower-tropospheric water vapor. It appears that by going from 3D to 4D, the temperature analysis error has actually increased over the southern polar cap in the lower troposphere. Also of note is the increase in analysis error for specific humidity in the midtropospheric tropics, temperature in the upper-tropospheric tropics, as well as temperature in the upper stratosphere. The increased analysis error is probably related to differences in topography or model physics, as well as the use of 3-hourly output of the nature run to generate the continuous simulated observation set (Errico et al. 2013). In particular, differences between the nature run model and assimilation model in terms of the handling of vertical mixing in very stable regimes, deep convection in the tropics, and damping in the upper stratosphere are likely being exposed. Other 4D experiments exhibit similar behavior. By and large, however, the impact in going from 3D to 4D is positive, consistent with the better use of observations distributed through the assimilation window as was demonstrated with the single-observation test. The impact is nonetheless smaller than what was found in going from 3DVar to hybrid 3DEnVar (Part I), consistent with the findings of Buehner et al. (2013) and not as impressive as the findings in Wang and Lei (2014).
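Because the nature run supplies the truth, the analysis error and its change between two experiments reduce to a simple calculation. The sketch below is only illustrative: it assumes states already interpolated to a common set of points, uses an unweighted root-mean-square error, and omits the time and zonal averaging that a figure such as Fig. 5 would involve. The function names and values are hypothetical.

```python
import math


def rms_error(analysis, truth):
    """Unweighted root-mean-square difference between an analysis and the truth."""
    squared = [(a - t) ** 2 for a, t in zip(analysis, truth)]
    return math.sqrt(sum(squared) / len(squared))


def error_change(analysis_new, analysis_ref, truth):
    """Negative values mean the new experiment is closer to the truth than the reference."""
    return rms_error(analysis_new, truth) - rms_error(analysis_ref, truth)


truth = [0.0, 0.0, 0.0, 0.0]
ref_3d = [2.0, 2.0, 2.0, 2.0]   # stand-in for a 3D hybrid analysis
new_4d = [1.0, 1.0, 1.0, 1.0]   # stand-in for a 4D hybrid analysis
print(error_change(new_4d, ref_3d, truth))  # -1.0
```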

Although the H-4DEnVar_NMI yielded improved analyses relative to the original 3D hybrid experiment, the application of the TLNMC over all seven observation bins is computationally expensive, given the necessity to calculate incremental tendencies as well as grid-to-spectral transforms, all within the iterative scheme. An experiment is therefore carried out to test the impact of replacing the TLNMC with a more computationally efficient JCDFI (H-4DEnVar_DFI). The digital filter term is based on a dry energy norm and utilizes a weighting parameter ($\chi$) of 10 in Eq. (8), based on previous findings (Polavarapu et al. 2000; Gauthier and Thépaut 2001). Relative to an experiment that does not utilize any of the constraint options (H-4DEnVar), the H-4DEnVar_DFI is found to have a small, consistently negative impact (Fig. 6, purple). Sensitivity experiments that varied the amplitude of the weak constraint cost function parameter did not yield significant improvements, as a large increase in the parameter amplitude results in a significantly worse fit to observations.

Similar to the findings of Kleist et al. (2009), the TLNMC does have an impact reducing the background and analysis errors for surface pressure whereas the JCDFI does not (Fig. 7). In fact, the H-4DEnVar_DFI results in increased background error for surface pressure relative to no constraint for many cycles (Fig. 7, top). The JCDFI as designed is not able to filter out higher-frequency noise, given that it cannot be described with a time series of hourly states (section 3c). Furthermore, there is no direct link to the model equations themselves as there is in the 4DVar variant of the digital filter. In contrast, the H-4DEnVar_NMI results in a consistent reduction of analysis error relative to the experiment with no constraint (Fig. 6, blue), consistent with the findings of Wang et al. (2013), Part I, and Wang and Lei (2014).

Despite these shortcomings, an experiment is carried out that utilizes a combined constraint similar to that which was utilized in section 3. In this new experiment (H-4DEnVar_COMB), the same digital filter parameters from H-4DEnVar_DFI are utilized while the TLNMC is only applied to the center of the assimilation window, or in other words, to the increment that is added to the 6-h forecast which is then passed on as the initial condition for the forecast model. The JCDFI uses information from all time levels, and therefore aids to remove some gravity mode tendencies for the time levels away from the center according to the filter weights without having to explicitly apply the normal mode initialization over all time levels. This is demonstrated by looking at the total and gravity mode tendencies from the single analysis case used in section 3. The largest incremental tendencies are generated for the analysis that utilized no constraint whereas the TLNMC is most efficient at reducing them (Fig. 8, top). While the JCDFI does act to reduce the incremental tendencies, it also acts to increase the amount of gravity mode tendencies that project onto the total tendencies (Fig. 8, bottom). What is especially troubling is the fact that the JCDFI increases the ratio of gravity mode to total tendencies to be larger than with no constraint at all. The combined filter is most efficient at reducing the incremental tendencies at the center of the window as expected, but the ratio of gravity mode to total tendencies is not as small as the TLNMC-only over all time levels. Like the JCDFI itself, the combined constraint acts to degrade the quality of analyses for most levels, the exception being the layer between 300 and 20 hPa (Fig. 6). Additionally, the combined constraint is worse than the multi-time-level TLNMC for all layers except for a slight improvement found between 300 and 150 hPa. 
It is clear from this set of experiments that of the constraint formulations considered, the TLNMC is the only one that consistently yields improved analyses.

The suite of constraint experiments all included a 25% contribution from the static error covariance. It was found in Part I that including a static

### b. Forecast impact

Similar to Part I, a forecast impact experiment is carried out utilizing 0000 UTC analyses from the H-4DEnVar_NMI configuration to initialize the GFS model for integration out to 7.5 days. The first two weeks of the experiment are ignored to account for spinup. All verification is done relative to the ECMWF nature run on a common grid. It was shown in Part I that forecasts initialized from the 3DHYB analyses were generally superior to those that were initialized from 3DVar analyses. The analyses from the H-4DEnVar_NMI experiment are chosen in order to directly compare to the 3D hybrid experiments carried out in Part I, where the TLNMC was also utilized, and because the H-4DEnVar_NMI experiment resulted in the smallest analysis errors relative to the other 4D hybrid experiments.

A comparison of the 500-hPa geopotential height anomaly correlation die-off curves reveals that the 4D hybrid experiment yields improved forecasts at this level for all lead times in both extratropical hemispheres (Fig. 10). The improvement over the 3D hybrid experiment is not as large in amplitude as was found in going from 3DVar to hybrid 3DEnVar. The 4D hybrid configuration results in improved height forecasts at other levels beyond 72 h, except for the upper stratosphere (Fig. 11). This exception is consistent with the increased analysis temperature errors that were found for these upper levels (Fig. 5, middle), related to the ensemble underdispersion in terms of temperature in the upper stratosphere (noted in Part I).
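For readers less familiar with the metric, the anomaly correlation behind such die-off curves can be sketched as follows. The version below is the centered form without latitude weighting, which is a simplifying assumption rather than the verification code used here. The function name and the sample height values are hypothetical.

```python
import math


def anomaly_correlation(forecast, verification, climatology):
    """Centered anomaly correlation between forecast and verifying anomalies."""
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    v_anom = [v - c for v, c in zip(verification, climatology)]
    f_mean = sum(f_anom) / len(f_anom)
    v_mean = sum(v_anom) / len(v_anom)
    covariance = sum((f - f_mean) * (v - v_mean) for f, v in zip(f_anom, v_anom))
    f_var = sum((f - f_mean) ** 2 for f in f_anom)
    v_var = sum((v - v_mean) ** 2 for v in v_anom)
    return covariance / math.sqrt(f_var * v_var)


# Illustrative 500-hPa heights (m) at four points; verification plays the role of the nature run.
climatology = [5500.0, 5520.0, 5540.0, 5560.0]
verification = [5510.0, 5500.0, 5560.0, 5590.0]
forecast = [5505.0, 5510.0, 5555.0, 5580.0]
print(anomaly_correlation(forecast, verification, climatology))
```

Evaluating this score at successive lead times and averaging over cases gives a die-off curve; a perfect forecast scores 1.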

The forecast impact on other variables and levels is more mixed (Fig. 12), similar to the analysis error findings (Fig. 5). In general, it can be seen that the only region that resulted in a consistent improvement in going from 3D to 4D hybrid was in the Northern Hemisphere troposphere (Fig. 12, left column). While there are some improvements in the wind and height forecasts in the Southern Hemisphere beyond 48 h, the differences are generally not statistically significant. The impact in the tropics is generally found to be neutral to worse. This is counter to the findings in Part I in going from 3DVar to 3D hybrid, where significant improvements were found in the tropics and Southern Hemisphere troposphere. The use of 4D versus 3D ensembles results in only modest improvements, consistent with the findings in Buehner et al. (2013). This is somewhat counter to the findings of Wang and Lei (2014), where large improvements were found in their 4DEnVar versus 3DEnVar in the extratropics, particularly in the Southern Hemisphere. However, they too found only very small impacts in the tropics.

## 5. Summary and conclusions

An extension of the GSI-based hybrid variational–ensemble algorithm to include 4D ensemble perturbations is proposed and implemented. The 4DEnVar algorithm has several advantages relative to 4D-EnKF and 4DVar algorithms that make it attractive for an operational center such as NCEP. Since 4DEnVar does not require the use of additional dynamic models (i.e., the TL and AD), it requires less in terms of computational resources than a traditional 4DVar algorithm. Since it is based on a variational algorithm, it is straightforward to supplement the ensemble-based (4D) covariance with a static estimate that can help ameliorate potential sampling issues, particularly for small ensemble sizes. Furthermore, such a configuration allows for easy implementation of a dual-resolution algorithm, solving for a high-resolution increment using a low-resolution ensemble and control variable [or vice versa, analogous to the interpolation of ensemble weights algorithm; Yang et al. (2009)]. Last, variational-based methods have the advantage of physical space localization, implicit through the localization of the ensemble control variable, which can be important for observations such as satellite radiances.
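The idea of supplementing a localized ensemble covariance with a static estimate can be illustrated with a toy calculation. The sketch below forms the effective hybrid covariance explicitly for a three-variable state, which is feasible only at toy scale; the GSI never builds these matrices and instead applies them implicitly through the control variables. The weights `w_static` and `w_ens`, the Gaussian localization on grid-index distance, and all function names are assumptions made for illustration.

```python
import math


def ensemble_covariance(members):
    """Sample covariance from a list of ensemble member state vectors."""
    n_mem = len(members)
    n_state = len(members[0])
    mean = [sum(m[i] for m in members) / n_mem for i in range(n_state)]
    perts = [[m[i] - mean[i] for i in range(n_state)] for m in members]
    return [[sum(p[i] * p[j] for p in perts) / (n_mem - 1)
             for j in range(n_state)] for i in range(n_state)]


def gaussian_localization(n_state, length_scale):
    """Localization matrix that decays with grid-index separation (assumed form)."""
    return [[math.exp(-0.5 * ((i - j) / length_scale) ** 2)
             for j in range(n_state)] for i in range(n_state)]


def hybrid_covariance(b_static, members, w_static, w_ens, length_scale):
    """Weighted sum of a static covariance and a localized ensemble covariance."""
    n_state = len(b_static)
    p_ens = ensemble_covariance(members)
    loc = gaussian_localization(n_state, length_scale)
    # The element-wise (Schur) product damps spurious long-range ensemble
    # covariances that arise from sampling error.
    return [[w_static * b_static[i][j] + w_ens * loc[i][j] * p_ens[i][j]
             for j in range(n_state)] for i in range(n_state)]


# Hypothetical two-member ensemble on a three-point grid.
members = [[1.0, 0.0, 2.0], [-1.0, 0.0, -2.0]]
b_static = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

b_hybrid = hybrid_covariance(b_static, members, w_static=0.25, w_ens=0.75,
                             length_scale=1.0)
```

In this example the middle grid point has no ensemble spread at all, so its variance in `b_hybrid` comes entirely from the static term. That is the sampling-deficiency case the static contribution is meant to guard against.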

Several OSSE-based experiments are carried out to demonstrate that the 4DEnVar algorithm contributes positively to the quality of analysis relative to the 3DEnVar and 3DHYB experiments that were carried out in Part I. It is found that going from 3D to 4D ensemble perturbations reduces the analysis error for most variables and levels, though the impact is generally small as in Buehner et al. (2013), in contrast to the findings for the extratropics in Wang and Lei (2014). The addition of a time-invariant static contribution to the analysis increment (i.e., hybridization) does improve the quality of analyses, but only in the absence of the TLNMC. Experiments that utilized the JCDFI, either alone or in combination with the TLNMC, resulted in increased analysis error due to the use of hourly states in prescribing the filtered state. Despite the computational cost associated with it, the use of the TLNMC over all 4D time levels produced the highest-quality analyses. Wang and Lei (2014) also found that the use of the TLNMC improved forecast quality within their 4DEnVar experiments. It is possible that the JCDFI may still have utility if one were to utilize higher temporal frequency for the observation bins and ensemble time levels. This will prove to be computationally expensive, but may be worth pursuing as computing power continues to increase.

The H-4DEnVar_NMI experiment had the largest impact in the extratropical troposphere. This corresponds to a more dynamically active region, where the propagation of information and the usage of the appropriate time levels within the observations are more important. The amplitude of the error reduction in going from the 3DHYB to H-4DEnVar_NMI is found to be smaller than that when going from the 3DVar control to 3DHYB. Going from 3D to 4D in this context resulted in increased analysis error for a few variables in certain regions. This could be the result of the use of discrete (3 hourly) time levels from the nature run to generate the 4D observations as well as differences between the assimilation and nature run models. Perhaps the use of stochastic physics in place of additive inflation (and a single deterministic model) within the ensemble could reduce the negative impact in going to 4D. There is work under way to generate new high-resolution nature runs with more frequent temporal output. It will be interesting to see whether some of these issues can be reduced with the new datasets. Finally, it is found that the improved analyses generally result in improved forecast skill, especially for extratropical tropospheric height and wind forecasts.

Although the GSI has a 4DVar capability within it, the inefficiencies of the inner loop dynamic model make it unaffordable for running fully cycled experiments. Work is ongoing to make the dynamic model more efficient. Once it is ready and can be utilized for such an experiment, it will be interesting to compare the results of the H-4DEnVar experiments with actual 4DVar, including H-4DVar.

There is also additional work that can be done on the 4DEnVar algorithm itself, such as treating the outer loop more appropriately, along the lines of the ideas proposed in Yang et al. (2012). To this point, the 4DEnVar algorithm has been executed more like a 3DVar implementation at least in terms of how the multiple inner loops operate. Perhaps some of the noted issues can be improved through a proper outer loop where the nonlinear control model is rerun but the perturbations are held fixed. An alternative would be to allow for the full reintegration of the control and the ensemble, much like running in place (Yang et al. 2012) or an iterative ensemble Kalman smoother (Bocquet and Sakov 2014). Last, work is under way to explore the use of a 4D incremental analysis update (IAU; Bloom et al. 1996) within the hybrid 4DEnVar context. The 4DIAU has been demonstrated to be a suitable alternative to a weak constraint digital filter for 4DEnVar (Lorenc et al. 2015). The 4DIAU is especially attractive as it provides a means for passing the 4D incremental state to the forecast model, instead of the single incremental state in the center of the assimilation window as is typically done.

In addition to the aforementioned work related specifically to 4DEnVar, there is related work that could be performed to improve the system. The TL and AD models that are used as part of the TLNMC and 4DVar applications are still based on a dry, adiabatic tendency model. This has been shown to have a negative impact in the tropics (Kleist et al. 2009; Wang et al. 2013; Wang and Lei 2014). Work is ongoing to add various moist physics to these models to help improve the use of the TLNMC in the tropics. It should also be pointed out that the selection of weighting parameters between the ensemble and static contribution has been simplistic to this point, and could certainly be investigated further as in Bishop and Satterfield (2013) and Bishop et al. (2013). One proposed modification to the application of the weighting already under way would be to consider scale-dependent parameters [chapter 4 of Kleist (2012)].

## Acknowledgments

This work was completed as part of the first author’s Ph.D. thesis at the University of Maryland, College Park, made possible through the support of EMC management, especially Steve Lord, John Derber, and William Lapenta. The ECMWF nature run was provided by Erik Andersson through arrangements made by Michiko Masutani. We are grateful to the Joint Center for Satellite Data Assimilation for providing access to their supercomputers to perform the experiments. The support of ONR Grants N000140910418 and N0001410557 is gratefully acknowledged. The authors wish to thank Nikke Privé and Ron Errico for kindly providing access to the calibrated, simulated observations. The authors wish to thank Dave Parrish and Jeff Whitaker for contributing to various aspects of the hybrid, EnVar, and EnKF developments. Last, the authors acknowledge Ting Lei and Xuguang Wang for good discussion and collaboration as they worked on an implementation of 4DEnVar in parallel to this work. Two anonymous reviewers helped to significantly improve the manuscript.

## REFERENCES

Andersson, E., and M. Masutani, 2010: Collaboration on observing system simulation experiments (Joint OSSE). *ECMWF Newsletter*, No. 123, ECMWF, Reading, United Kingdom, 14–16.

Bishop, C. H., and D. Hodyss, 2011: Adaptive ensemble covariance localization in ensemble 4D-Var state estimation. *Mon. Wea. Rev.*, **139**, 1241–1255, doi:10.1175/2010MWR3403.1.

Bishop, C. H., and E. A. Satterfield, 2013: Hidden error variance theory. Part I: Exposition and analytic model. *Mon. Wea. Rev.*, **141**, 1454–1468, doi:10.1175/MWR-D-12-00118.1.

Bishop, C. H., E. A. Satterfield, and K. T. Shanley, 2013: Hidden error variance theory. Part II: An instrument that reveals hidden error variance distributions from ensemble forecasts and observations. *Mon. Wea. Rev.*, **141**, 1469–1483, doi:10.1175/MWR-D-12-00119.1.

Bloom, S. C., L. L. Takacs, A. M. da Silva, and D. Ledvina, 1996: Data assimilation using incremental analysis updates. *Mon. Wea. Rev.*, **124**, 1256–1271, doi:10.1175/1520-0493(1996)124<1256:DAUIAU>2.0.CO;2.

Bocquet, M., and P. Sakov, 2014: An iterative ensemble Kalman smoother. *Quart. J. Roy. Meteor. Soc.*, **140**, 1521–1534, doi:10.1002/qj.2236.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010a: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. *Mon. Wea. Rev.*, **138**, 1550–1566, doi:10.1175/2009MWR3157.1.

Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010b: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. *Mon. Wea. Rev.*, **138**, 1567–1586, doi:10.1175/2009MWR3158.1.

Buehner, M., J. Morneau, and C. Charette, 2013: Four-dimensional ensemble-variational data assimilation for global deterministic weather prediction. *Nonlinear Processes Geophys.*, **20**, 669–682, doi:10.5194/npg-20-669-2013.

Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2013: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. *Quart. J. Roy. Meteor. Soc.*, **139**, 1445–1461, doi:10.1002/qj.2054.

Courtier, P., J.-N. Thépaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. *Quart. J. Roy. Meteor. Soc.*, **120**, 1367–1388, doi:10.1002/qj.49712051912.

Dee, D. P., 2005: Bias and data assimilation. *Quart. J. Roy. Meteor. Soc.*, **131**, 3323–3343, doi:10.1256/qj.05.137.

Derber, J. C., and W.-S. Wu, 1998: The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. *Mon. Wea. Rev.*, **126**, 2287–2299, doi:10.1175/1520-0493(1998)126<2287:TUOTCC>2.0.CO;2.

Desroziers, G., J.-T. Camino, and L. Berre, 2014: 4DEnVar: Link with 4D state formulation of variational assimilation and different possible implementations. *Quart. J. Roy. Meteor. Soc.*, **140**, 2097–2110, doi:10.1002/qj.2325.

Errico, R. M., R. Yang, N. Privé, K.-S. Tai, R. Todling, M. E. Sienkiewicz, and J. Guo, 2013: Development and validation of observing-system simulation experiments at NASA’s Global Modeling and Assimilation Office. *Quart. J. Roy. Meteor. Soc.*, **139**, 1162–1178, doi:10.1002/qj.2027.

Fairbairn, D., S. R. Pring, A. C. Lorenc, and I. Roulstone, 2014: A comparison of 4DVar with ensemble data assimilation methods. *Quart. J. Roy. Meteor. Soc.*, **140**, 281–294, doi:10.1002/qj.2135.

Gauthier, P., and J.-N. Thépaut, 2001: Impact of the digital filter as a weak constraint in the preoperational 4DVAR assimilation system of Météo-France. *Mon. Wea. Rev.*, **129**, 2089–2102, doi:10.1175/1520-0493(2001)129<2089:IOTDFA>2.0.CO;2.

Gauthier, P., M. Tanguay, S. Laroche, and S. Pellerin, 2007: Extension of 3DVar to 4DVar: Implementation of 4DVar at the Meteorological Service of Canada. *Mon. Wea. Rev.*, **135**, 2339–2364, doi:10.1175/MWR3394.1.

Gustafsson, N., 1993: Use of a digital filter as weak constraint in variational data assimilation. *Proc. Workshop on Variational Assimilation, with Special Emphasis on Three-Dimensional Aspects*, Reading, United Kingdom, ECMWF, 327–338.

Hunt, B. R., and Coauthors, 2004: Four-dimensional ensemble Kalman filtering. *Tellus*, **56A**, 273–277, doi:10.1111/j.1600-0870.2004.00066.x.

Hunt, B. R., E. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. *Physica D*, **230**, 112–126, doi:10.1016/j.physd.2006.11.008.

Kleist, D. T., 2012: An evaluation of hybrid variational-ensemble data assimilation for the NCEP GFS. Ph.D. thesis, Dept. of Atmospheric and Oceanic Science, University of Maryland, College Park, College Park, MD, 149 pp. [Available online at http://drum.lib.umd.edu/handle/1903/13135.]

Kleist, D. T., and K. Ide, 2015: An OSSE-based evaluation of hybrid variational–ensemble data assimilation for the NCEP GFS. Part I: System description and 3D-Hybrid results. *Mon. Wea. Rev.*, **143**, 433–451, doi:10.1175/MWR-D-13-00351.1.

Kleist, D. T., D. F. Parrish, J. C. Derber, R. Treadon, R. M. Errico, and R. Yang, 2009: Improving incremental balance in the GSI 3DVar analysis system. *Mon. Wea. Rev.*, **137**, 1046–1060, doi:10.1175/2008MWR2623.1.

Kuhl, D. D., T. E. Rosmond, C. H. Bishop, J. McClay, and N. L. Baker, 2013: Comparison of hybrid ensemble/4DVar and 4DVar within the NAVDAS-AR data assimilation framework. *Mon. Wea. Rev.*, **141**, 2740–2758, doi:10.1175/MWR-D-12-00182.1.

Lawless, A. S., 2010: A note on the analysis error associated with 3D-FGAT. *Quart. J. Roy. Meteor. Soc.*, **136**, 1094–1098, doi:10.1002/qj.619.

Lewis, J. M., and J. C. Derber, 1985: The use of adjoint equations to solve a variational adjustment problem with advective constraints. *Tellus*, **37A**, 309–322, doi:10.1111/j.1600-0870.1985.tb00430.x.

Liu, C., and Q. Xiao, 2013: An ensemble-based four-dimensional variational data assimilation scheme. Part III: Antarctic applications with Advanced Research WRF using real data. *Mon. Wea. Rev.*, **141**, 2721–2739, doi:10.1175/MWR-D-12-00130.1.

Liu, C., Q. Xiao, and B. Wang, 2008: An ensemble-based four-dimensional variational data assimilation scheme. Part I: Technical formulation and preliminary test. *Mon. Wea. Rev.*, **136**, 3363–3373, doi:10.1175/2008MWR2312.1.

Liu, C., Q. Xiao, and B. Wang, 2009: An ensemble-based four-dimensional variational data assimilation scheme. Part II: Observing system simulation experiments with Advanced Research WRF. *Mon. Wea. Rev.*, **137**, 1687–1704, doi:10.1175/2008MWR2699.1.

Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-VAR. *Quart. J. Roy. Meteor. Soc.*, **129**, 3183–3203, doi:10.1256/qj.02.132.

Lorenc, A. C., 2013: Recommended nomenclature for EnVar data assimilation methods. *Research Activities in Atmospheric and Oceanic Modeling*, WGNE, 2 pp. [Available online at http://www.wcrp-climate.org/WGNE/BlueBook/2013/individual-articles/01_Lorenc_Andrew_EnVar_nomenclature.pdf.]

Lorenc, A. C., and F. Rawlins, 2005: Why does 4D-Var beat 3D-Var? *Quart. J. Roy. Meteor. Soc.*, **131**, 3247–3257, doi:10.1256/qj.05.85.

Lorenc, A. C., N. E. Bowler, A. M. Clayton, S. R. Pring, and D. Fairbairn, 2015: Comparison of hybrid-4DEnVar and hybrid-4DVar data assimilation methods for global NWP. *Mon. Wea. Rev.*, **143**, 212–229, doi:10.1175/MWR-D-14-00195.1.

Lynch, P., and X. Y. Huang, 1992: Initialization of the HIRLAM model using a digital filter. *Mon. Wea. Rev.*, **120**, 1019–1034, doi:10.1175/1520-0493(1992)120<1019:IOTHMU>2.0.CO;2.

Polavarapu, S., M. Tanguay, and L. Fillion, 2000: Four-dimensional variational data assimilation with digital filter initialization. *Mon. Wea. Rev.*, **128**, 2491–2510, doi:10.1175/1520-0493(2000)128<2491:FDVDAW>2.0.CO;2.

Rabier, F., J.-N. Thépaut, and P. Courtier, 1998: Extended assimilation and forecast experiments with a four-dimensional variational assimilation system. *Quart. J. Roy. Meteor. Soc.*, **124**, 1861–1887, doi:10.1002/qj.49712455005.

Rabier, F., H. Järvinen, E. Klinker, J.-F. Mahfouf, and A. Simmons, 2000: The ECMWF operational implementation of four-dimensional variational assimilation. I: Experimental results with simplified physics. *Quart. J. Roy. Meteor. Soc.*, **126**, 1143–1170, doi:10.1002/qj.49712656415.

Rawlins, F., S. P. Ballard, K. J. Bovis, A. M. Clayton, D. Li, G. W. Inverarity, A. C. Lorenc, and T. J. Payne, 2007: The Met Office global 4-dimensional data assimilation system. *Quart. J. Roy. Meteor. Soc.*, **133**, 347–362, doi:10.1002/qj.32.

Rosmond, T., and L. Xu, 2006: Development of NAVDAS-AR: Nonlinear formulation and outer loop tests. *Tellus*, **58A**, 45–59, doi:10.1111/j.1600-0870.2006.00148.x.

Tian, X., Z. Xie, and A. Dai, 2008: An ensemble-based explicit four-dimensional variational assimilation method. *J. Geophys. Res.*, **113**, D21124, doi:10.1029/2008JD010358.

Tian, X., Z. Xie, and Q. Sun, 2011: A POD-based ensemble four-dimensional variational assimilation method. *Tellus*, **63A**, 805–816, doi:10.1111/j.1600-0870.2011.00529.x.

Wang, X., and T. Lei, 2014: GSI-based four-dimensional ensemble–variational (4DEnsVar) data assimilation: Formulation and single-resolution experiments with real data for the NCEP Global Forecast System. *Mon. Wea. Rev.*, **142**, 3303–3325, doi:10.1175/MWR-D-13-00303.1.

Wang, X., D. Parrish, D. Kleist, and J. Whitaker, 2013: GSI 3DVar-based ensemble–variational hybrid data assimilation for NCEP Global Forecast System: Single-resolution experiments. *Mon. Wea. Rev.*, **141**, 4098–4117, doi:10.1175/MWR-D-12-00141.1.

Wee, T.-K., and Y.-H. Kuo, 2004: Impact of a digital filter as a weak constraint in MM5 4DVar: An observing system simulation experiment. *Mon. Wea. Rev.*, **132**, 543–559, doi:10.1175/1520-0493(2004)132<0543:IOADFA>2.0.CO;2.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924, doi:10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.

Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. *Mon. Wea. Rev.*, **140**, 3078–3089, doi:10.1175/MWR-D-11-00276.1.

Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2008: Ensemble data assimilation with the NCEP Global Forecast System. *Mon. Wea. Rev.*, **136**, 463–482, doi:10.1175/2007MWR2018.1.

Yang, S.-C., E. Kalnay, B. Hunt, and N. E. Bowler, 2009: Weight interpolation for efficient data assimilation with the Local Ensemble Transform Kalman Filter. *Quart. J. Roy. Meteor. Soc.*, **135**, 251–262, doi:10.1002/qj.353.

Yang, S.-C., E. Kalnay, and B. Hunt, 2012: Handling nonlinearity in an ensemble Kalman filter: Experiments with the three-variable Lorenz model. *Mon. Wea. Rev.*, **140**, 2628–2646, doi:10.1175/MWR-D-11-00313.1.

Zhang, M., and F. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. *Mon. Wea. Rev.*, **140**, 587–600, doi:10.1175/MWR-D-11-00023.1.

Zhu, Y., J. Derber, A. Collard, D. Dee, R. Treadon, G. Gayno, and J. A. Jung, 2014: Enhanced radiance bias correction in the National Centers for Environmental Prediction’s Gridpoint Statistical Interpolation data assimilation system. *Quart. J. Roy. Meteor. Soc.*, **140**, 1479–1492, doi:10.1002/qj.2233.

^{1} The JCDFI that was originally developed for 4DVar applications was not implemented for the standard double-conjugate gradient minimization algorithm that is utilized for the hybrid applications. Minor modifications were necessary to adapt the code for use in this context.