• Bishop, C. H., and Z. Toth, 1999: Ensemble transformation and adaptive observations. J. Atmos. Sci., 56, 1748–1765.

• Bishop, C. H., D. Hodyss, P. Steinle, H. Sims, A. M. Clayton, A. C. Lorenc, D. M. Barker, and M. Buehner, 2011: Efficient ensemble covariance localization in variational data assimilation. Mon. Wea. Rev., 139, 573–580.

• Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010a: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. Mon. Wea. Rev., 138, 1550–1566.

• Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010b: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. Mon. Wea. Rev., 138, 1567–1586.

• Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2013: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., doi:10.1002/qj.2054, in press.

• Courtier, P., 1997: Dual formulation of four-dimensional variational assimilation. Quart. J. Roy. Meteor. Soc., 123, 2449–2461.

• Courtier, P., J. N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1387.

• Daley, R., and E. Barker, 2001: The NAVDAS sourcebook. Naval Research Laboratory NRL/PU/7530–01-441, 161 pp. [Available online at http://www.dtic.mil/dtic/tr/fulltext/u2/a396883.pdf.]

• Dee, D., 2004: Variational bias correction of radiance data in the ECMWF system. Proc. Workshop on Assimilation of High Spectral Resolution Sounders in NWP, Reading, United Kingdom, ECMWF, 97–112.

• Derber, J. C., and W. S. Wu, 1998: The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. Mon. Wea. Rev., 126, 2287–2299.

• Etherton, B. J., and C. H. Bishop, 2004: Resilience of hybrid ensemble/3DVAR analysis schemes to model error and ensemble covariance error. Mon. Wea. Rev., 132, 1065–1080.

• Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.

• Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.

• Hamill, T. M., J. S. Whitaker, D. T. Kleist, M. Fiorino, and S. G. Benjamin, 2011: Predictions of 2010's tropical cyclones using the GFS and ensemble-based data assimilation methods. Mon. Wea. Rev., 139, 3243–3247.

• Harris, B., and G. Kelly, 2001: A satellite radiance-bias correction scheme for data assimilation. Quart. J. Roy. Meteor. Soc., 127, 1453–1468.

• Hogan, T. F., T. Rosmond, and R. Gelaro, 1991: The NOGAPS forecast model: A technical description. Naval Research Laboratory AD–A247 216, 218 pp. [Available online at http://handle.dtic.mil/100.2/ADA247216.]

• Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and M. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.

• Houtekamer, P. L., H. L. Mitchell, and X. X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. Mon. Wea. Rev., 137, 2126–2143.

• Kepert, J. D., 2011: Balance-aware covariance localisation for atmospheric and oceanic ensemble Kalman filters. Comput. Geosci., 15, 239–250.

• Kleist, D. T., D. F. Parrish, J. C. Derber, R. Treadon, W. S. Wu, and S. Lord, 2009: Introduction of the GSI into the NCEP Global Data Assimilation System. Wea. Forecasting, 24, 1691–1705.

• Lewis, J. M., and J. Derber, 1985: The use of adjoint equations to solve a variational adjustment problem with advective constraints. Tellus, 37A, 309–322.

• Lorenc, A. C., 1981: A global three-dimensional multivariate statistical interpolation scheme. Mon. Wea. Rev., 109, 701–721.

• Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129, 3183–3203.

• McLay, J. G., C. H. Bishop, and C. A. Reynolds, 2008: Evaluation of the ensemble transform analysis perturbation scheme at NRL. Mon. Wea. Rev., 136, 1093–1108.

• McLay, J. G., C. H. Bishop, and C. A. Reynolds, 2010: A local formulation of the ensemble transform (ET) analysis perturbation scheme. Wea. Forecasting, 25, 985–993.

• Rosmond, T., and L. Xu, 2006: Development of NAVDAS-AR: Non-linear formulation and outer loop tests. Tellus, 58A, 45–58.

• Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method. Mon. Wea. Rev., 125, 3297–3319.

• Wang, X. G., C. Snyder, and T. M. Hamill, 2007: On the theoretical equivalence of differently proposed ensemble–3DVAR hybrid analysis schemes. Mon. Wea. Rev., 135, 222–227.

• Wang, X. G., D. Parrish, D. Kleist, and J. Whitaker, 2013: GSI 3DVar-based ensemble-variational hybrid data assimilation for NCEP Global Forecast System: Single resolution experiments. Mon. Wea. Rev., in press.

• Wilks, D. S., 2011: Statistical Methods in the Atmospheric Sciences. 3rd ed. Elsevier Academic Press, 704 pp.

• Xu, L., T. Rosmond, and R. Daley, 2005: Development of NAVDAS-AR: Formulation and initial tests of the linear problem. Tellus, 57A, 546–559.

• Zhang, F. Q., M. Zhang, and J. A. Hansen, 2009: Coupling ensemble Kalman filter with four-dimensional variational data assimilation. Adv. Atmos. Sci., 26, 1–8.

• Zhang, F. Q., M. Zhang, and J. Poterjoy, 2013: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited-area weather prediction model and comparison to E4DVar. Mon. Wea. Rev., 141, 900–917.

• Zhang, M., and F. Q. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. Mon. Wea. Rev., 140, 587–600.
Fig. 15. Truncated table of Met Office scorecard metrics. Bars represent percentage changes to RMS error comparing the static mode (α = 0) and hybrid mode (α = 0.5) experiments. Colors represent region averages: green is for the Northern Hemisphere (NH), red is for the tropics (TR), and blue is for the Southern Hemisphere (SH). Labels are formatted as region (NH, TR, or SH); variable (H = geopotential height, W = vector wind); level (500 = 500 hPa, 200 = 200 hPa, and 850 = 850 hPa); and forecast lead time in hours (T + 24 = 24 h, T + 48 = 48 h, and T + 72 = 72 h). (a),(b) Verification with radiosondes. (c),(d) Verification with self-analysis. (a),(c) are experiments from the July–August 2010 time period and (b),(d) are experiments from the February–March 2011 time period.


Comparison of Hybrid Ensemble/4DVar and 4DVar within the NAVDAS-AR Data Assimilation Framework

David D. Kuhl, Naval Research Laboratory, Washington, D.C.
Thomas E. Rosmond, SAIC, Forks, Washington
Craig H. Bishop, Naval Research Laboratory, Monterey, California
Justin McLay, Naval Research Laboratory, Monterey, California
Nancy L. Baker, Naval Research Laboratory, Monterey, California

Abstract

The effect on weather forecast performance of incorporating ensemble covariances into the initial covariance model of the four-dimensional variational data assimilation (4D-Var) Naval Research Laboratory Atmospheric Variational Data Assimilation System-Accelerated Representer (NAVDAS-AR) is investigated. This NAVDAS-AR-hybrid scheme linearly combines the static NAVDAS-AR initial background error covariance with a covariance derived from an 80-member flow-dependent ensemble. The ensemble members are generated using the ensemble transform technique with a (three-dimensional variational data assimilation) 3D-Var-based estimate of analysis error variance. The ensemble covariances are localized using an efficient algorithm enabled via a separable formulation of the localization matrix. The authors describe the development and testing of this scheme, which allows for assimilation experiments using differing linear combinations of the static and flow-dependent background error covariances. The tests are performed for two months of summer and two months of winter using operational model resolution and the operational observational dataset, which is dominated by satellite observations. Results show that the hybrid mode data assimilation scheme significantly reduces the forecast error across a wide range of variables and regions. The improvements were particularly pronounced for tropical winds. The verification against radiosondes showed a greater than 0.5% reduction in vector wind RMS differences in areas of statistical significance. The verification against self-analyses showed a greater than 1% reduction between 2- and 5-day lead times at all eight vertical levels examined in areas of statistical significance. Using the Navy's summary of verification results, the Navy Operational Global Atmospheric Prediction System (NOGAPS) scorecard, the improvements resulted in a score (+1) that justifies a major system upgrade.

Corresponding author address: David Kuhl, Naval Research Laboratory, 4555 Overlook Ave. SW, Washington, DC 20375. E-mail: david.kuhl@nrl.navy.mil


1. Introduction

To obtain the minimum error variance or maximum likelihood state of the atmosphere given a forecast and observations, one requires the true flow-dependent forecast error covariance matrix. For a variety of reasons, this is a very difficult quantity to estimate accurately. Until recently, a common approach to estimating this quantity at operational centers has been to use a static covariance matrix that approximates a climatological average of flow-dependent error covariances (Lewis and Derber 1985; Courtier et al. 1994). In contrast, ensemble Kalman filters (Evensen 1994) have received considerable attention from researchers as a means to generate flow-dependent forecast error covariances. Both approaches are imperfect, so it is of interest to examine the performance of data assimilation schemes that linearly combine these two different types of error covariance models, which we term hybrid ensemble/four-dimensional variational data assimilation (hybrid ensemble/4D-Var), or simply hybrid, systems. Here, we report on the performance of a hybrid system that we have built within the framework of the Navy's operational Naval Research Laboratory Atmospheric Variational Data Assimilation System-Accelerated Representer (NAVDAS-AR) four-dimensional data assimilation scheme. As far as the authors are aware, the results we present here are the first results from an observation-space hybrid ensemble/4D-Var system. The results add to a growing body of knowledge about the performance of these hybridizations of static and ensemble covariances within operational data assimilation schemes.

Hybrid error covariance models were first discussed in a three-dimensional variational data assimilation (3D-Var) context by Hamill and Snyder (2000) and Lorenc (2003). In experiments with a barotropic model, Etherton and Bishop (2004) found that the hybrid ensemble/3D-Var formulation was particularly useful when the forecast model was imperfect. Recently, scientists from the National Centers for Environmental Prediction (NCEP) have shown that the introduction of a hybrid error covariance formulation significantly improved the performance of NCEP's 3D-Var–GSI (gridpoint statistical interpolation) system (Kleist et al. 2009) at reduced operational resolution globally (Wang et al. 2013) and for tropical cyclone tracks (Hamill et al. 2011). Using a simple model system, Zhang et al. (2009) developed the first truly coupled hybrid ensemble/4D-Var system [coupled, as used here and by Zhang et al. (2009), is not to be confused with ocean–atmosphere coupling]. Both NCEP and the Met Office now use hybrid background error covariance models in their operational systems.

Experiments by Buehner et al. (2010a,b) with versions of the Canadian operational system found that replacing the static covariances with localized ensemble covariances led to large forecast improvements in the southern extratropics. While their work showed the potential superiority of ensemble-based covariances over static error covariances, it did not address the question of whether linear combinations of flow-dependent and static error covariances would provide superior forecast skill. This question was, however, addressed by Clayton et al. (2013), who found that initial covariances based on a linear combination of static and flow-dependent covariances were able to reduce overall forecast RMS errors, relative to the normal 4D-Var system, by a significant amount even with a relatively small ensemble (24 ensemble members). Zhang and Zhang (2012) corroborated these results [extending the work of Zhang et al. (2009) to real-data experiments] using a limited-area weather prediction model. Zhang and Zhang (2012) showed that their hybrid ensemble/4D-Var system outperformed both the ensemble Kalman filter and the 4D-Var system run separately, and in Zhang et al. (2013) they demonstrated that additional advantages may come from using the adjoint in hybrid ensemble/4D-Var over hybrid ensemble/3D-Var systems.

The hybrid ensemble/4D-Var data assimilation system we have developed is designed to be a component of the existing operational NAVDAS-AR data assimilation system (Rosmond and Xu 2006; Xu et al. 2005) and the operational ensemble forecasting system (McLay et al. 2008, 2010). The operational ensemble is based on a local formulation of the ensemble transform (ET) technique of Bishop and Toth (1999) and features a short-term cycling ensemble of 80 members. Our implementation enables a range of configurations to be tested using the same code. The configurations range from using only the static covariances (which is our baseline experiment representing the operational system) to using only the flow-dependent ensemble covariances, and all fractional combinations in between. This flexibility allows for experiments where the relative impact of various combinations of the static and ensemble covariances can be readily measured without changing other components of the DA or forecast system. The 5-day verification forecasts were produced using the same forecast model as the NAVDAS-AR system, the Navy Operational Global Atmospheric Prediction System (NOGAPS; Hogan et al. 1991).

The NAVDAS-AR data assimilation system differs from other major operational 4D-Var implementations in that it has been formulated in observation space (the dual form) rather than model space (the primal form; Courtier 1997). As we will explain in section 2, because the formulation is in the dual form, the system does not require the use of the extended control variable technique. A description of the experimental setup is presented in section 3. In section 4, results from a series of 2-month data assimilation experiments and the resulting validation with 5-day deterministic forecasts are presented. The various experiment configurations are compared to highlight the impact of the static (4D-Var) covariances, the flow-dependent (ensemble) covariances, and linear mixtures of the two initial covariances. Finally, some conclusions are given in section 5.

2. Formulation of NAVDAS-AR hybrid

NAVDAS-AR is a weak-constraint 4D-Var system that uses a variational algorithm to first solve for the vector β in
\[
\left( \mathbf{H} \mathbf{P}^{f} \mathbf{H}^{\mathrm{T}} + \mathbf{R} \right) \boldsymbol{\beta} = \mathbf{y} - H(\mathbf{x}^{f}), \qquad (1)
\]
where y is a vector of observations distributed over a 6-h time window, x^f is the high-resolution four-dimensional model forecast, H(·) is the nonlinear observation operator that maps the forecast into observation space, H is its Jacobian, and H^T is its adjoint. The observation error covariance matrix is denoted by R, and P^f is a four-dimensional forecast error covariance matrix. Once (1) has been solved for β, the time-evolving analysis x^a is obtained from
\[
\mathbf{x}^{a} = \mathbf{x}^{f} + \mathbf{P}^{f} \mathbf{H}^{\mathrm{T}} \boldsymbol{\beta}. \qquad (2)
\]
It can be shown that if P^f represents the true four-dimensional forecast error covariance matrix then x^a is the best linear unbiased estimate (BLUE) of the four-dimensional state.
While the NAVDAS-AR system can readily accommodate a range of representations of model error covariance, in operations and in our tests the model error covariance is set to zero and P^f takes the following form:
\[
\mathbf{P}^{f} = \mathbf{M} \mathbf{P}_{0}^{f} \mathbf{M}^{\mathrm{T}}, \qquad (3)
\]
where M is the linear rectangular matrix operator that maps an initial perturbation to all of the discrete times within the data assimilation window, M^T is its adjoint, and P_0^f is the background error covariance at the beginning of the window. In implementing (3), the time dimension was discretized into hourly intervals (−3, −2, −1, 0, 1, 2, 3) so that M actually maps initial perturbations into these seven discrete times, with the analysis at the middle of the window. In other words, if M were constructed as an explicit matrix, it would have 7 times as many rows as columns. In practice, M is implicitly formed by successive application of an approximation to the Jacobian, or tangent linear model (TLM), of the nonlinear NOGAPS model. The large matrix H P^f H^T never needs to be held in memory; all that is required by the conjugate gradient solver is the result of the matrix–vector multiply (H P^f H^T + R)v, where v is the vector of interest. This matrix–vector multiply requires a sequence of operations involving the adjoint and the TLM (Xu et al. 2005).
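The structure of this matrix-free solve can be illustrated with a minimal Python/NumPy sketch (illustrative only; the operators here are small dense arrays with hypothetical names, not the operational TLM, adjoint, or observation-operator codes):

```python
import numpy as np

def solve_dual_4dvar(y, x_f, H_nl, H_jac, P0, M, R, n_iter=100, tol=1e-10):
    """Toy illustration of the observation-space solve in (1)-(2).

    y     : (m,) observation vector over the window
    x_f   : (7n,) four-dimensional forecast (seven hourly times stacked)
    H_nl  : callable, nonlinear observation operator mapping a 4D state to obs space
    H_jac : (m, 7n) Jacobian of H_nl
    P0    : (n, n) initial background error covariance
    M     : (7n, n) maps an initial perturbation to all times in the window
    R     : (m, m) observation error covariance
    """
    d = y - H_nl(x_f)                       # innovation, right-hand side of (1)

    def apply_A(v):
        # (H_jac M P0 M^T H_jac^T + R) v without forming the matrix
        w = M.T @ (H_jac.T @ v)             # adjoint operations
        w = M @ (P0 @ w)                    # initial covariance, then TLM
        return H_jac @ w + R @ v

    # Plain conjugate gradient for the symmetric positive definite system A beta = d
    beta = np.zeros_like(d)
    r = d - apply_A(beta)
    p = r.copy()
    rs = r @ r
    for _ in range(n_iter):
        Ap = apply_A(p)
        step = rs / (p @ Ap)
        beta += step * p
        r -= step * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new

    # Eq. (2): map beta back to a four-dimensional analysis increment
    return x_f + M @ (P0 @ (M.T @ (H_jac.T @ beta)))
```

In NAVDAS-AR the products with M, M^T, H, and H^T are supplied by the TLM, adjoint, and observation-operator codes rather than by stored matrices.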
Most hybrid operational systems use the extended control variable form of hybrid ensemble assimilation (originally proposed by Lorenc 2003). Because our data assimilation (DA) system is formulated in observation space, we can use a different form proposed by Hamill and Snyder (2000). These two forms were proven to be mathematically equivalent in Wang et al. (2007). In our implementation P_0^f takes the following form:
\[
\mathbf{P}_{0}^{f} = (1 - \alpha) \mathbf{P}_{s} + \alpha \mathbf{P}_{e}, \qquad (4)
\]
where P_s is the static background error covariance, P_e is the ensemble-based covariance, and α is a scalar between 0 and 1. A value of α = 0 produces P_0^f identical to P_s; in this paper we refer to this as the "static mode." A value of α = 1 produces P_0^f identical to P_e; we refer to this as the "flow-dependent mode." A value of α = 0.5 mixes P_s and P_e in equal measure; we refer to this case as the "hybrid mode." In considering the hybrid mode, bear in mind that the global and seasonal averages of the variances of P_s and P_e are not necessarily equivalent.
The static background error covariance P_s was originally formulated for the NAVDAS (3D-Var) system (Daley and Barker 2001) and later upgraded for NAVDAS-AR (Xu et al. 2005). For P_s the background error covariances are specified and partitioned into the background error variances and the background error correlations as
\[
\mathbf{P}_{s} = \mathbf{D}^{1/2} \mathbf{C} \mathbf{D}^{1/2}, \qquad (5)
\]
where D is the diagonal matrix of background error variances and C is the matrix of background error correlations. The background geopotential and temperature error variances are specified to be in exact hydrostatic balance. This is done by specifying the temperature variances and forcing the geopotential variances to be related hydrostatically to the temperature variances by integrating the hydrostatic relation up from the surface. The wind and geopotential background error variances are set to be approximately geostrophically coupled in the extratropics and are uncoupled in the tropics by way of the Lorenc (1981) geostrophic coupling parameter, which varies between 1 at high latitudes and 0 (no coupling) in the tropics. The background error correlations are specified using a separable formulation with vertical and horizontal components. The horizontal and vertical correlations are based upon a second-order autoregressive (SOAR; Daley and Barker 2001) function dependent on the distance between points, as well as interactions between velocity potential, streamfunction, hydrostatic, and geostrophic balances. Errors in the relative humidity field are assumed to be uncorrelated with all other variables (i.e., they are univariate) and to have shorter vertical and horizontal correlation length scales. Details of the formulation are given in Daley and Barker (2001).
The flow-dependent ensemble covariance P_e is obtained from
\[
\mathbf{P}_{e} = \boldsymbol{\rho} \circ \left[ \frac{1}{K-1} \sum_{i=1}^{K} \mathbf{x}_{i}^{\prime} \left( \mathbf{x}_{i}^{\prime} \right)^{\mathrm{T}} \right], \qquad (6)
\]
where K is the number of ensemble members, x'_i is the ith ensemble perturbation at the beginning of the data assimilation window, ρ is the localization matrix, and ∘ is the Schur (elementwise) product. As with (3), the NAVDAS-AR algorithm does not require the covariance matrix defined by (6) to be formed explicitly. Unlike P_s, in which the relative humidity errors are univariate, P_e is a fully multivariate matrix.
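A minimal sketch of how (4) and (6) combine, assuming a toy state small enough for the matrices to be formed explicitly (in NAVDAS-AR they never are), might look as follows; the function and variable names are hypothetical:

```python
import numpy as np

def hybrid_initial_covariance(perts, rho, P_static, alpha):
    """Sketch of Eqs. (4) and (6) for a toy state of dimension n.

    perts    : (K, n) ensemble perturbations x'_i at the start of the window
    rho      : (n, n) localization correlation matrix
    P_static : (n, n) static background error covariance P_s
    alpha    : 0 = static mode, 0.5 = hybrid mode, 1 = flow-dependent mode
    """
    K = perts.shape[0]
    P_ens = rho * (perts.T @ perts) / (K - 1)   # Schur (elementwise) product with rho
    return (1.0 - alpha) * P_static + alpha * P_ens
```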
The localization employed in this study is a nonadaptive localization in physical space, and the correlation functions associated with ρ are a function of horizontal and vertical position. Following Bishop et al. (2011), computational gains were achieved by performing the localization on a reduced-resolution grid and by using specific properties of separable correlation matrices. The reduced-resolution horizontal grid used for localization was a 180 × 90 Gaussian grid. The vertical localization was based on a Gaussian log-sigma correlation function given by
\[
\rho_{v}(\sigma_{i}, \sigma_{j}) = \exp\left[ -L_{v} \left( \ln \sigma_{i} - \ln \sigma_{j} \right)^{2} \right], \qquad (7)
\]
where ρ_v(σ_i, σ_j) gives the attenuation applied to covariances between variables on levels σ_i and σ_j, σ_i is the value of σ on the ith full model level, and L_v is the vertical localization parameter (which we set to 10 after a small number of short experiments with other values such as 0.1, 1, and 100). The NAVDAS-AR system uses a hybrid vertical coordinate σ that increases from its smallest value at the top full model level to its largest value at the bottom full model level. Note that full levels are defined at the approximate midpoints of coordinate layers. State variables are defined at full levels and are mean values for the corresponding layer. There are half model levels at the top (0.04 hPa) and the bottom (surface) that are not included in the calculation of (7). The coordinate system changes from a sigma system below 150 hPa in the lower atmosphere to pure pressure above 150 hPa (Hogan et al. 1991, chapter 2). This correlation function creates a vertical localization that has a shorter vertical scale in the stratosphere (small values of σ) than in the troposphere (values of σ near 1); see Fig. 1 for a plot of the vertical localization at several specified levels.
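A short sketch of the vertical localization function (7), as reconstructed above (the exact constant arrangement is our reading of the text), is:

```python
import numpy as np

def vertical_localization(sigma_levels, L_v=10.0):
    """Gaussian log-sigma vertical localization, Eq. (7) as reconstructed above.

    sigma_levels : values of the hybrid vertical coordinate on the full model levels
    L_v          : vertical localization parameter (10 in the experiments)
    Returns the matrix of attenuation factors between all pairs of levels.
    """
    log_sig = np.log(np.asarray(sigma_levels, dtype=float))
    d = log_sig[:, None] - log_sig[None, :]      # pairwise ln(sigma) differences
    return np.exp(-L_v * d ** 2)
```

With L_v = 10, the attenuation falls to about 0.5 when two levels differ by a factor of roughly 1.3 in σ, broadly consistent with the 50% localization depth quoted later for an observation near 500 hPa.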
Fig. 1. Vertical localization on σ surfaces. Included in the figure on the right-hand side are the approximate pressure levels for the σ surfaces.

Defining the vertical localization in terms of σ instead of pressure levels forces the vertical localization length to be shorter over regions of elevated terrain. We did this because it was computationally convenient, and we surmised that the performance of the system would be insensitive to this effect provided the ensemble size was large enough to permit a fairly broad vertical localization length scale.

The horizontal localization correlation matrix C_h was defined in terms of spherical harmonics and was given the following form:
\[
\mathbf{C}_{h} = \mathbf{S}^{-1} \boldsymbol{\Lambda} \mathbf{S}, \qquad (8)
\]
where S^{-1} is the inverse spherical harmonic transform, S is the forward spherical harmonic transform, and Λ is a diagonal matrix. The diagonal values of Λ are obtained from a function of the total wavenumber given by
\[
\Lambda_{ii} = c \exp\left[ -\frac{n_{i}\left( n_{i} + 1 \right)}{L_{h}} \right], \qquad (9)
\]
where n_i is the total wavenumber associated with the ith element of the spectral representation of the horizontal state, and L_h is a spectral length scale (which we set to 100 after a small number of short experiments with other values such as 1, 10, and 1000). The scalar coefficient c is chosen to ensure that (8) gives a correlation matrix. Equation (9) is a Gaussian function in spectral space, which is also a Gaussian function in gridpoint space.
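A sketch of the diagonal spectral weights in (9), under the assumption that the Gaussian acts on the total wavenumber through the spherical-Laplacian factor n(n + 1) (one plausible reading of the text; the normalization c that makes (8) a correlation matrix is not computed here):

```python
import numpy as np

def spectral_localization_weights(n_max, L_h=100.0):
    """Diagonal weights Lambda_ii as a Gaussian of total wavenumber (assumed form).

    n_max : spectral truncation; weights are returned for n = 0, ..., n_max
    L_h   : spectral length scale (100 in the experiments)
    """
    n = np.arange(n_max + 1, dtype=float)
    return np.exp(-n * (n + 1.0) / L_h)   # unnormalized; the coefficient c is not applied

# In Eq. (8) these weights multiply the spherical harmonic coefficients of a field
# between a forward and an inverse spherical harmonic transform.
```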

The horizontal and vertical localization covariance can be seen in the right-hand column plots of Figs. 2g–i for an observation assimilated at 500 hPa. The 50% covariance localization in the horizontal direction is approximately 20° in both latitude and longitude, which corresponds to about 2000 km. The 50% covariance localization in the vertical ranges from about 425 to 650 hPa or approximately a 225-hPa-thick layer.

Fig. 2. Meridional wind response (filled thick contours) to a single meridional wind observation at (a)–(c) the start and (d)–(f) the end of the 6-h 4D-Var window. The time of the observation is at the start of the 6-h 4D-Var window. The ensemble localization in the (g) horizontal and (h),(i) vertical planes, with (h) the latitude–pressure plane and (i) the longitude–pressure plane. The plots in (a)–(g) are all for the same model level (≈500 hPa) as the observation. In (a),(d) the plots are for the static mode (α = 0) case; in (b),(e) the plots are for the hybrid mode (α = 0.5); and in (c),(f) the plots are for the flow-dependent case (α = 1). The observation location is marked with a black cross at the center of each plot. The unfilled contours of (a)–(f) show the background temperature field and the gray arrows show the background wind vectors.

Along with the localization, Fig. 2 illustrates the resulting meridional wind analysis increment fields through a 6-h 4D-Var assimilation window. A single meridional wind observation with an innovation of 1 m s−1 and an observation error of 1 m s−1, located at the middle of the figure, is assimilated at the beginning of the window. The figure presents the wind increments at the beginning of the assimilation window (τ = −3 h; Figs. 2a–c) and at the end of the assimilation window (τ = +3 h; Figs. 2d–f). The plots present the increments generated from three α cases: α = 0 (static mode), α = 0.5 (hybrid mode), and α = 1 (flow-dependent mode). Note that in all plots the increment at the end of the time window shows a tilted hourglass shape. This shape is created by the advection of the winds through the TLM (i.e., the flow of the day). Only the hybrid mode and flow-dependent mode cases retain this flow-produced hourglass shape at the beginning of the window as well. This figure illustrates how the ensemble informs the initial covariance of the flow-dependent structures at the beginning of the window. It is hoped that the flow dependence imparted to the initial covariances by the hybrid and flow-dependent modes will improve the data assimilation performance.

3. Experimental setup

Two series of experiments were performed, using a configuration similar to the operational system. The resolution of the control forecast model [the forecast x^f used to predict the observations in (1)], and of the outer loop of NAVDAS-AR, is T319/L42 (960 × 480 Gaussian grid with 42 levels in the vertical). The inner-loop resolution, and the ensemble member resolution [the perturbations x'_i in (6)], is T119/L42 (360 × 180 Gaussian grid with 42 levels in the vertical). The current operational ensemble member resolution is T159; however, for these experiments the lower resolution of T119 was chosen because it avoids any need to interpolate the ensemble perturbations to the resolution of the TLM and adjoint. Both series of experiments assimilated the suite of observations available to the operational data assimilation system. We used a 6-h data assimilation cycle for both a boreal summer and a boreal winter period. The first series of experiments, the boreal summer period, extended from 0000 UTC 1 June 2010 until 0000 UTC 1 September 2010, while the second series, the boreal winter period, extended from 0000 UTC 1 January 2011 until 0000 UTC 1 April 2011. Each series of experiments included three different α values: 0 (static mode), 0.5 (hybrid mode), and 1 (flow-dependent mode).

The assimilated observations include conventional observations from land surface stations, radiosondes, dropsondes and pilot balloons, aircraft, and buoys. Satellite remotely sensed observations/retrievals include global positioning system (GPS) radio occultation bending angle observations for temperature and water vapor, atmospheric motion vectors (AMVs) derived from both polar-orbiting and geostationary satellites, ocean surface winds from scatterometers and microwave imagers, and integrated water vapor from microwave imagers.

The largest set of observations comes from a wide range of satellite sensors. The satellite observations include the microwave sounders [the Advanced Microwave Sounding Unit-A (AMSU-A), the Microwave Humidity Sounder (MHS), and the Defense Meteorological Satellite Program (DMSP) Special Sensor Microwave Imager/Sounder (SSM/IS)] and the advanced infrared sounders [the Atmospheric Infrared Sounder (AIRS) on Aqua and the Infrared Atmospheric Sounding Interferometer (IASI) on the Meteorological Operation (MetOp-A) satellite]. The moisture analysis is mainly affected by radiosondes, surface observations, and the microwave imagers/sounders such as the MHS, SSM/IS, the Special Sensor Microwave Imager (SSM/I), and the Naval Research Laboratory's (NRL) WindSat. Because of technical difficulties with the decoders, MHS and Aqua AIRS/AMSU-A were not included in the suite of assimilated observations.

An 80-member ensemble centered on the analysis was generated using a configuration similar to the operational ET ensemble generation scheme (McLay et al. 2008, 2010). The ET ensemble generation scheme transforms 6-h forecast perturbations into analysis perturbations such that (i) each analysis perturbation is a linear combination of forecast perturbations and (ii) the covariance of the analysis perturbations is, on a globally averaged basis, consistent with an estimate of the analysis error covariance matrix obtained from NAVDAS-AR. The ET ensemble generation cycle selects growing perturbations in much the same way as the breeding method of Toth and Kalnay (1997). Unlike the breeding scheme, however, the ET ensemble generation scheme ensures that variance is maintained in the entire vector subspace spanned by the ensemble. Although the NAVDAS-AR estimate of analysis error covariance depends only on the distribution and accuracy of observations and not on the flow or season, the ET ensemble analysis covariance is flow dependent because its perturbations are composed of dynamical modes that have recently amplified in response to the flow and stability properties of the recent flow-dependent state of the atmosphere.
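One common way to write an ET-style transform is sketched below (a hedged illustration of the general idea only; the operational scheme of McLay et al. (2008, 2010) is a local formulation with additional constraints, and the variable names here are hypothetical):

```python
import numpy as np

def ensemble_transform(forecast_perts, analysis_err_var):
    """Sketch of an ET-style update: X_a = X_f T, with T chosen so the analysis
    perturbations are consistent, in the ensemble subspace, with a prescribed
    (here diagonal) analysis error covariance.

    forecast_perts   : (n, K) matrix whose columns are 6-h forecast perturbations
    analysis_err_var : (n,) prescribed analysis error variances (diagonal of A)
    """
    Xf = forecast_perts
    K = Xf.shape[1]
    Ainv = 1.0 / np.asarray(analysis_err_var, dtype=float)
    S = (Xf.T * Ainv) @ Xf / (K - 1)             # X_f^T A^{-1} X_f / (K - 1)
    evals, C = np.linalg.eigh(S)
    evals = np.clip(evals, 0.0, None)
    inv_sqrt = np.zeros_like(evals)
    keep = evals > 1e-8 * evals.max()            # drop the (near) zero mode
    inv_sqrt[keep] = evals[keep] ** -0.5
    T = C @ np.diag(inv_sqrt) @ C.T              # symmetric square-root form
    Xa = Xf @ T                                  # analysis perturbations
    return Xa - Xa.mean(axis=1, keepdims=True)   # recenter on the analysis (zero mean)
```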

The ET ensemble generation scheme used in our experiments has a climatological 3D-Var-based estimate of analysis error variance for each of the 0000, 0600, 1200, and 1800 UTC analysis times. Though there is a strong diurnal dependence in the variances due to the variation of the observation network (as discussed in McLay et al. 2008), through experimentation we detected little seasonal dependence. This is understandable because there is little seasonal dependence in the observation network. Thus we used diurnally varying climatological averages for all of our experiments. The climatological variances are averaged from 20 November 2008 to 31 December 2008. Recall that the ET analysis error covariance is determined by a combination of the instability of the recent and current flow regime and the prescribed NAVDAS estimate of analysis error variances. The ET analysis ensemble variances therefore depend on the flow of the day and the season even though the NAVDAS estimate of analysis error variance does not. Low-resolution experiments comparing climatological and day-to-day varying analysis error variance files showed no statistically significant differences between the results. The initial ensemble, used only at the startup time of the experiment cycle, comprises 80 randomly chosen states of the atmosphere from a cycled system with 32 flow-dependent ensemble members run in December 2008. The same initial ensemble was used to start both the winter and summer experiments, which is one reason why a spinup period of 30 days was used.

The second reason for the 30-day spinup period was to train the coefficients of the variational radiance bias correction system (VarBC) used for all radiance observations (Derber and Wu 1998; Dee 2004; B. Chua 2012, personal communication). With VarBC, the coefficients of the model-based bias predictors become part of the state vector and are updated each data assimilation cycle. Although satellite VarBC has not yet been put into operations at the Fleet Numerical Meteorology and Oceanography Center (FNMOC), tests have shown that it performs as well as or better than the scheme used in operations at the time of writing, and it is planned to be transitioned to operations in the very near future. The current operational system uses the offline two-predictor bias correction approach of Harris and Kelly (2001).

4. Results

We use the static mode (α = 0) as the control experiment. This experiment is essentially the same as the operational system. The most significant difference between the operational NAVDAS-AR and the static mode system is the discretization of the time stamp assigned to observations. In the operational system, the time window is discretized into contiguous 0.5-h intervals and, for the purposes of assimilating the observations, those observations are assumed to have occurred at the center of the 0.5-h time interval. In our system, the 0.5-h intervals were replaced by 1-h intervals for all experiments. Another difference with the operational system is that the digital filter was not used in the static mode experiment. We tested the impact of these differences with a full-resolution experiment in operational mode, with the 0.5-h discretization and the digital filter turned on, and we saw minimal differences in the forecast verification results. Another difference with the operational system, pointed out above, is that our system uses a bias correction system (VarBC) that is believed to produce superior performance to the system currently used in operations.

The first set of comparisons is between the static mode and the hybrid mode (α = 0.5) results (sections 4a–c). The second set of comparisons is between the static mode and the flow-dependent mode (α = 1) results (section 4d). Then both the hybrid mode and the flow-dependent mode, compared to the static mode, are investigated with the aggregate NOGAPS scorecard (section 4e). Finally, the results are investigated with a truncated Met Office scorecard (section 4f) to facilitate comparison of our results with those of Clayton et al. (2013). For all experiments, we use exactly the same forecast model with the same suite of observations (small differences in the final set of assimilated observations can exist as a result of quality control decisions that are dependent upon the background or computed background error variances). All experiments started with the initial ensemble, discussed above, and ran coupled with the ensemble for the initial 1-month spinup time, which was not used for the forecast verification analysis [coupled, as used here and by Zhang et al. (2009), should not be confused with ocean–atmosphere coupling]. After the 1-month spinup, the coupled cycling system analyses from each experiment at 0000 and 1200 UTC each day were used to initialize 5-day deterministic forecasts. The first test period, 0000 UTC 1 July to 0000 UTC 1 September 2010 (referred to as July–August 2010), produced 124 of the 5-day forecasts, while the second period, 0000 UTC 1 February to 0000 UTC 1 April 2011 (referred to as February–March 2011), produced 120 of the 5-day forecasts.

The forecast quality evaluations were made with single 5-day deterministic forecasts, launched every 12 h (which is what is done operationally), comparing mass and wind fields verified against self-analyses and/or radiosonde observations. The verification scores are computed either globally or separately for two extratropical regions (NH = 20°–80°N and SH = 20°–80°S) and the tropical region (TR = 20°S–20°N). The self-analysis verification is performed at every grid point on a 360 × 180 Gaussian grid and weighted by the cosine of latitude to account for the different areas of the grid boxes. The radiosonde verification is performed on a subset of 400 high-quality radiosonde stations scattered around the globe, with 80.5% in the NH, 8.5% in the TR, and 10.8% in the SH (J. Goerss 2012, personal communication).
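A sketch of the cosine-of-latitude weighting used in the self-analysis verification (a hypothetical helper; any equivalent area weighting would do):

```python
import numpy as np

def area_weighted_rms(diff, lats_deg):
    """RMS of a forecast-minus-analysis field weighted by cos(latitude).

    diff     : (nlat, nlon) difference field on the verification grid
    lats_deg : (nlat,) latitudes of the grid rows in degrees
    """
    w = np.cos(np.deg2rad(np.asarray(lats_deg)))[:, None]   # weight per latitude row
    return float(np.sqrt(np.sum(w * np.asarray(diff) ** 2) / (np.sum(w) * diff.shape[1])))
```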

There are advantages and disadvantages to using both radiosonde and self-analysis data for validation. A pitfall of verifying against self-analyses is that one would obtain perfect forecast scores if no observations were assimilated because the analysis would be the same as the forecast. It is generally believed that changes to the data assimilation scheme that reduce the magnitude and/or spatial area of analysis corrections can increase the similarity of forecasts to self-analyses without changing or even increasing the distance of forecasts from the truth. The issues with self-analysis verification are thus partially mitigated by longer forecasts, which is why we chose to only report self-analysis verifications with forecasts longer than 2 days. The primary limitation of verifying against just radiosondes is that radiosondes cover a relatively small area of the globe and that such an approach ignores information from other observation types such as aircraft and satellite observations. For these reasons, one must be cautious about drawing conclusions from either verification in isolation.

To estimate the statistical confidence of observed performance differences between the control and the comparison experiments, the following test statistic was used:
\[
z = \frac{\left( \bar{x}_{1} - \bar{x}_{2} \right) - \mu_{0}}{\sqrt{s_{d}^{2} / n_{\mathrm{eff}}}}, \qquad (10)
\]
where x̄_1 and x̄_2 are the means of the paired verification statistics from the two experiments over n paired samples (n = 124 or 120 for our experiments), s_d^2 is the sample variance of the paired differences, and μ_0 is the hypothesized population mean difference. Here, we test the null hypothesis that there is no statistical difference between the two means (μ_0 = 0). Because our datasets are correlated in time, we approximate an effective sample size with
\[
n_{\mathrm{eff}} = n \frac{1 - r_{1}}{1 + r_{1}}, \qquad (11)
\]
where r_1 is the lag-1 autocorrelation coefficient. The lag-1 assumption is that the theoretical autocorrelation for higher terms is r_k = r_1^k, where k is the order of the correlation (Wilks 2011). We checked this assumption for several of our experiments and found it to be adequate (results not shown). Finally, to convert the test statistic z into a confidence probability SL, we used the following equation:
\[
\mathrm{SL} = 2 \left[ \Phi(z) - 0.5 \right] \times 100\%, \qquad (12)
\]
where Φ(z) is found from a table of left-tail cumulative probabilities for the standard Gaussian distribution [Table B.1 in Wilks (2011)]. The multiplier of 2 is included because we assume a two-tailed probability, since we have no prior knowledge of the relative means of the two experiments. Therefore, a significance level with a magnitude of 100% (Φ(z) near 0 or 1) means the verification statistic for one experiment is always lower than that for the other in all samples. Similarly, a significance level of 0% (Φ(z) = 0.5) corresponds to an equal frequency of either experiment having the lower verification statistic (i.e., a 0% probability that the verification scores are different).
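The test can be summarized in a few lines of Python (a sketch assuming the reconstructed forms of (10)–(12) above; requires NumPy and SciPy):

```python
import numpy as np
from scipy.stats import norm

def paired_significance(scores_a, scores_b):
    """Two-tailed Gaussian confidence (in percent) that two sets of paired
    verification scores differ, following Eqs. (10)-(12) as reconstructed above."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    n = d.size
    r1 = np.corrcoef(d[:-1], d[1:])[0, 1]             # lag-1 autocorrelation
    n_eff = n * (1.0 - r1) / (1.0 + r1)               # Eq. (11)
    z = d.mean() / np.sqrt(d.var(ddof=1) / n_eff)     # Eq. (10) with mu_0 = 0
    return 2.0 * abs(norm.cdf(z) - 0.5) * 100.0       # Eq. (12)
```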

a. RMS vector wind error for the hybrid mode experiment

The root-mean-square (RMS) vector wind error is shown in Figs. 3–6. The RMS vector wind error is calculated by
\[
\mathrm{RMS} = \sqrt{ \overline{ \left( u_{f} - u_{v} \right)^{2} + \left( v_{f} - v_{v} \right)^{2} } }, \qquad (13)
\]
where u_f and v_f are the forecast u and υ components of wind, u_v and v_v are the verification u and υ components of wind, and the overbar denotes an average over all verification points. Figures 3 and 4 show the verification against radiosondes and Figs. 5 and 6 show the verification against self-analysis, both averaged over three different regions: NH, TR, and SH. The RMS vector wind error of the hybrid mode (α = 0.5) experiment is compared with that of the static mode (α = 0) control. The results are presented for July–August 2010 in Figs. 3 and 5 and for February–March 2011 in Figs. 4 and 6. The top set of plots of all four figures displays the percentage change of the RMS vector wind error averaged over the presented time period. Red shading on the plots corresponds to RMS vector wind error closer to 0.0 for the hybrid mode (α = 0.5) experiment. Blue shading on the plots corresponds to RMS vector wind error closer to 0.0 for the static mode (α = 0) control experiment. The bottom set of plots in Figs. 3–6 presents the level of statistical significance of these differences in RMS vector wind error over the presented time period. In the figures, the Gaussian confidence probabilities are binned at the 95%, 97.5%, 99%, and 99.5% levels.
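As a concrete reference, (13) amounts to the following (a trivial sketch; the array names are hypothetical):

```python
import numpy as np

def rms_vector_wind_error(u_f, v_f, u_v, v_v):
    """Eq. (13): RMS vector wind difference between forecast (u_f, v_f) and
    verification (u_v, v_v) winds, averaged over all verification points."""
    du = np.asarray(u_f) - np.asarray(u_v)
    dv = np.asarray(v_f) - np.asarray(v_v)
    return float(np.sqrt(np.mean(du ** 2 + dv ** 2)))
```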
Fig. 3. (top) Impact and (bottom) significance level of that impact of the verification scores of 5-day forecasts of vector wind from the hybrid mode (α = 0.5) experiment with respect to the static mode (α = 0) experiment as a function of lead time and pressure. The plotted quantity in the top plots is the percentage difference of the hybrid-mode RMSD relative to the static-mode RMSD, where the RMSD is the RMS distance of a forecast from a subset of 400 radiosonde observations for the vector wind quantity. The shading in the bottom plots gives the level of statistical significance of the difference between the hybrid-mode and static-mode RMSD. In both the (top) and (bottom), red (blue) shading indicates that the hybrid mode is better (worse) than the static mode. The level of statistical significance is binned by Gaussian confidence probabilities at the 95%, 97.5%, 99%, and 99.5% levels. For both the (top) and (bottom) the results are regionally averaged: (left) Northern Hemisphere (NH) extratropics, (middle) tropics (TR), and (right) Southern Hemisphere (SH) extratropics; and temporally averaged over forecasts initiated every 12 h from 0000 UTC 1 Jul to 0000 UTC 1 Sep 2010.

Fig. 4. As in Fig. 3, but displaying 0000 UTC 1 Feb–0000 UTC 1 Apr 2011.

Fig. 5. As in Fig. 3, but given for the RMSD from verifying analyses rather than radiosondes. NOGAPS scorecard metrics are highlighted in black. The Met Office scorecard metrics are highlighted in green.

Fig. 6. As in Fig. 5, but displaying 0000 UTC 1 Feb–0000 UTC 1 Apr 2011.

The numerous red boxes in Figs. 3–6 indicate where the improvement of the hybrid over the control is statistically significant. The robustness of the improvement is particularly marked in the stratosphere and at 1–3-day forecast lead times in the troposphere.

It is of interest to note that in Figs. 3 and 4 at pressure levels 30 and 50 hPa in all regions, the static mode experiment is closer to the observations at lead time equal to zero than the hybrid mode. However, for lead time greater than zero the hybrid mode is closer to the verifying observations than the static mode. The fact that the hybrid mode forecasts are better than the static mode forecasts suggests that the hybrid mode analyses were better than the static mode analyses even though the hybrid mode analyses are further away from the observations at lead time equal to zero. This is possible because observations themselves are imperfect and hence, an analysis that lies closer to the observations than another analysis is not necessarily better.

The magnitude of the percentage improvements for radiosondes (Figs. 3 and 4) is smaller than that from the self-analysis improvements (Figs. 5 and 6). We have not yet been able to diagnose the cause of this inconsistency, but there are a variety of possibilities. First, radiosonde observations and analyses have independent errors that no amount of improvement of the forecast can remove. Consequently, if observation error variances were larger than analysis error variances, then the percentage reduction in the RMS difference between forecast and verification due to a fixed reduction in forecast error variance would be smaller when the verification was based on radiosonde observations than it would be if the verification was based on analyses. Second, even though comparisons against self-analyses do allow all observation types to contribute to the verification metric, false positives/negatives are possible. If, for example, no observations were assimilated for a period that extended beyond 5 days, the verifying analysis would become identical to the 5-day forecast because, in the absence of any observations, our current system sets the analysis equal to the forecast. In future work, we hope to include an independent analysis (from a different operational center) among our verification tools so that we can avoid the possibility of false positives due to comparison against self-analyses.

b. Globally averaged radiosonde verifications for the hybrid mode experiment

The globally averaged analysis and forecast verifications versus radiosonde observations are presented in Figs. 7 and 8 for geopotential height (left-hand column), temperature (center column), and vector winds (right-hand column). The results are presented in a style similar to the regionally averaged RMS vector wind error plots in section 4a, but for globally averaged values. The radiosonde verification vector wind plots (right-hand column) of Figs. 7 and 8 are an average over all of the regions of Figs. 3 and 4 with the addition of radiosondes near the poles. These plots highlight the similarities and differences between the results for the different forecast metrics.

Fig. 7. As in Fig. 3, but the RMSD from radiosondes is averaged over the whole globe and is for (left) geopotential height, (middle) temperature, and (right) vector wind. NOGAPS scorecard metrics are highlighted in black.

Fig. 8. As in Fig. 7, but displaying 0000 UTC 1 Feb–0000 UTC 1 Apr 2011.

In general, the results with the different metrics (geopotential height, temperature, and vector wind) are similar (i.e., the hybrid experiment resulted in improvement of the different metrics at most levels and at most forecast lead times greater than zero). One notable exception is for temperature at 30 hPa in Fig. 7 (July–August 2010), where the static mode experiment was statistically significantly better than the hybrid mode at lead times >3 days (although the hybrid mode was better in vector winds and geopotential height). We have not yet been able to diagnose the cause of this inconsistency, but there are a variety of possibilities, such as the improper specification of the ensemble localization at this level (which is very tight). In future work, we hope to address the localization issue and try to improve temperature forecasts in the stratosphere without removing the improvements seen in the geopotential heights or vector winds.

c. Geopotential height anomaly correlation for the hybrid mode experiment

The geopotential height anomaly correlation has been a widely used forecast verification metric and is computed using the following formula:
\[
\mathrm{AC} = \frac{\overline{\left( f - c \right)\left( a - c \right)}}{\sqrt{\overline{\left( f - c \right)^{2}}\;\overline{\left( a - c \right)^{2}}}}, \qquad (14)
\]
where f is the single deterministic forecast, a is the verifying analysis, c is the climate geopotential height field at the same verifying time, and the overbars denote averages over the verification region. The climate geopotential height fields come from archives from the National Center for Atmospheric Research (NCAR; T. Hogan 2012, personal communication). The plots are similar to the RMS vector wind error plots of section 4a; however, the shading in the contour plots indicates which experiment is closer to 1.0 (rather than which experiment is closer to 0.0). Thus, red shading on the plots indicates that the hybrid mode experiment (α = 0.5) has a greater geopotential height anomaly correlation, and blue shading on the plots indicates that the static mode (α = 0) control experiment has a greater geopotential height anomaly correlation. The results are presented averaged over two different regions for July–August 2010 in Fig. 9 and for February–March 2011 in Fig. 10. Only the extratropical regions are shown in these figures because of the lack of geostrophic balance in the tropics.
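Equivalently, in code form (a sketch consistent with the reconstruction of (14) above; area weighting is omitted for brevity):

```python
import numpy as np

def height_anomaly_correlation(forecast, analysis, climate):
    """Eq. (14): anomaly correlation of geopotential height against the verifying
    analysis, with anomalies taken relative to the climate field."""
    f = np.asarray(forecast) - np.asarray(climate)    # forecast anomaly
    a = np.asarray(analysis) - np.asarray(climate)    # analysis anomaly
    return float(np.sum(f * a) / np.sqrt(np.sum(f ** 2) * np.sum(a ** 2)))
```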
Fig. 9.
Fig. 9.

(top) Impact and (bottom) significance level of that impact of the verification scores of 5-day forecasts of the geopotential height anomaly correlation from the hybrid mode () experiment with respect to the static mode () experiment as a function of lead time and pressure. The plotted quantity in the top plots is the percentage change in geopotential height anomaly correlation . The shading in each of the bottom plots gives the level of statistical significance if there is a difference in the hybrid mode and static mode . In both the (top) and (bottom) red (blue) shading indicates that the hybrid mode is better (worse) than the static mode. The level of statistical significance is binned by Gaussian confidence probabilities as 95%:, 97.5%: , 99%: , and 99.5%: . For both (top) and (bottom) the results are regionally averaged: (left) Northern Hemisphere (NH) extratropics and (right) Southern Hemisphere (SH) extratropics; and temporally averaged over forecasts initiated every 12 h from 0000 UTC 1 Jul to 0000 UTC 1 Sep 2010. NOGAPS scorecard metrics are highlighted in black.


Fig. 10.

As in Fig. 9, but displaying 0000 UTC 1 Feb–0000 UTC 1 Apr 2011.


Figures 9 and 10 of the geopotential height anomaly correlation demonstrate the importance of using spatially localized and flow-dependent background-error covariances rather than stationary and highly parameterized covariances. For both time periods, the hybrid mode experiment improved the geopotential height anomaly correlations of the 2–5-day forecasts at all vertical levels in the Southern Hemisphere and below 100 hPa in the Northern Hemisphere. In many of these areas the hybrid mode was statistically significantly closer to 1.0 than the static mode experiment. However, the improvements were slight, and there were no lead times or vertical levels where the increase exceeded 5%. There is one small area in the Northern Hemisphere, at 100 hPa and the 5-day lead time, where the static mode was closer to 1.0 than the hybrid mode, but this difference was not statistically significant.

d. Flow-dependent mode verifications

Figures 11 and 12 present the radiosonde verification, and Figs. 13 and 14 the self-analysis verification, of the RMS vector wind error results comparing the static mode control experiment with the flow-dependent mode experiment. In these figures, red shading indicates improvement for the flow-dependent mode relative to the static mode.

Fig. 11.

As in Fig. 3, but comparing between the static mode (blue) and flow-dependent mode (red) experiments.


Fig. 12.

As in Fig. 4, but comparing between the static mode (blue) and flow-dependent mode (red) experiments.


Fig. 13.

As in Fig. 5, but comparing between the static mode (blue) and flow-dependent mode (red) experiments.


Fig. 14.

As in Fig. 6, but comparing between the static mode (blue) and flow-dependent mode (red) experiments.


Comparing Figs. 11–14 with Figs. 3–6, we see very good agreement above 100 hPa in all of the plots. The discrepancy between the hybrid mode results and the flow-dependent mode results lies below 100 hPa. Both the radiosonde and self-analysis verifications agree that the 100% flow-dependent ensemble is inferior to the static background error covariance matrix in the extratropical troposphere. The radiosonde verification (Figs. 11 and 12) disagrees with the self-analysis verification (Figs. 13 and 14) in the tropics, where the radiosonde verification favors the static mode and the self-analysis verification favors the flow-dependent mode. When the two verification statistics disagree, it is hard to identify which is closer to the truth. On one hand, the self-analysis verification is flawed because one would obtain perfect forecast scores if no observations were assimilated, since the analysis would then equal the forecast. On the other hand, the radiosondes cover a relatively small area of the globe and ignore information from other observation types such as aircraft and satellite observations. In future work, we hope to include an independent analysis (from a different operational center) among our verification tools because there is less chance that the analysis error of an independent analysis will be correlated with the forecast error.

Our flow-dependent results are contrary to the findings of Buehner et al. (2010b, see their Figs. 6 and 7), who found a positive impact of the 100% flow-dependent ensemble in all regions and at all levels. This suggests that the ratio of the accuracy of the Canadian ensemble covariance model to that of the Canadian static covariance model is greater than the corresponding ratio for our system. One significant difference between our ET ensemble and the Canadian ensemble is that the Canadian ensemble incorporates samples from a static covariance matrix (Houtekamer et al. 2005) whereas our ET ensemble does not. As a consequence, the Canadian ensemble covariance model may not benefit as much from being linearly combined with a static covariance model as ours does, because it already partially incorporates information from the 3D-Var static covariance model. There are other differences between the two systems as well, such as ensemble size: Buehner et al. used an ensemble of 96 members while we used an ensemble of 80 members. It is also likely that the Canadian EnKF provides a more accurate estimate of the effect of observations on analysis error covariance than the very approximate ET approach. Finally, Buehner et al. simulate the effect of model error with perturbations to the forecast initial conditions and different configurations of the model physical parameterizations (Houtekamer et al. 2009), neither of which is included in our ensemble system.

We found, in low-resolution experiments, that the ensemble correlations, and not the ensemble variances, were the major contributor to the improvements of the hybrid and flow-dependent modes over the static mode. For computational reasons this investigation was performed using a low-resolution version of the system (outer-loop resolution T119/L42 and 32-member ensemble/inner-loop resolution T47/L42) and only conventional observations (no satellite sensors). The computational cost of running experiments at operational resolution and with operational observations is two months of computer time per experiment, whereas low-resolution experiments can be completed in a matter of days. The covariance experiments were performed with mixed covariances in which the variances (diagonal terms) were taken from one covariance model and the correlations (off-diagonal structure) from the other; the results obtained with the two possible combinations of the static and ensemble covariances were then compared. We do not show these results here because the geographic distribution of the errors was qualitatively different from the high-resolution results; however, these experiments suggested that the improvements to the system from the ensemble come primarily from the correlations and not the variances. These results indicate that using flow-dependent variances alone would not yield the improvements seen in our system.
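The construction of such a mixed covariance can be sketched as follows: the correlation structure of one covariance model is rescaled by the standard deviations of the other. The explicit small-matrix form and the function name below are illustrative assumptions; in practice the experiments were performed within the assimilation system rather than with explicit matrices.

```python
import numpy as np

def mix_covariances(var_source, corr_source):
    """Covariance with the variances of `var_source` (diagonal terms) and
    the correlations of `corr_source` (off-diagonal structure)."""
    # Correlation matrix of the covariance supplying the off-diagonal structure.
    sd_corr = np.sqrt(np.diag(corr_source))
    correlations = corr_source / np.outer(sd_corr, sd_corr)
    # Rescale by the standard deviations of the covariance supplying the variances.
    sd_var = np.sqrt(np.diag(var_source))
    return correlations * np.outer(sd_var, sd_var)
```

Calling this with the static covariance as the variance source and the ensemble covariance as the correlation source, and then with the arguments swapped, gives the two combinations that were compared.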

e. Aggregate NOGAPS scorecard

The tool used at the U.S. Navy's operational center (FNMOC) to summarize verification results between a control experiment (normally the operational code) and a comparison experiment is the so-called "NOGAPS scorecard" (R. Pauley 2012, personal communication). The scorecard verification metrics for the deterministic forecasts are listed in Table 1 (a different scorecard is used for the ensemble forecasts). The scorecard awards positive (or negative) points when forecasts of a specified field are improved (or degraded); the more positive points awarded, the greater the perceived value of the system change.

Table 1.

NOGAPS scorecard. Areas are global (GL), Northern Hemisphere (NH), tropics (TR), and Southern Hemisphere (SH). Tau is the forecast lead time in days. For anomaly correlations, the criterion for awarding a weight is that the scores must be statistically different at the 95% confidence level. For all other score types, the criteria are that the comparison scores must be at least 5% smaller and statistically different at the 95% confidence level. Weights for the control experiment are negative and weights for the comparison experiment are positive. The total score for a control vs comparison experiment is the sum of all category weights, with a maximum of 24 points at stake. An aggregate score of −1 or better is considered a neutral (or better) overall result and is the minimum requirement for a major system change to be considered for operational promotion.


The Northern Hemisphere 500-hPa geopotential height anomaly correlation for the 96-h forecast is given 4 times the weight of the other geopotential height anomaly correlation metrics in the table. The anomaly correlation scores must be statistically higher at the 95% confidence level. For wind speed and vector wind RMS errors, the errors of the comparison experiment must be at least 5% less than those of the control experiment, with a confidence level of 95%. The aggregate verification score is the sum of all categories, with a maximum of 24 points. An aggregate score of −1 or better is considered neutral (or better) overall and is the minimum requirement to promote a major system upgrade. The score of −1 allows for the small amount of degradation in the verification results that sometimes occurs with major system changes.
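To make the aggregation rule explicit, the sketch below sums the per-category weights subject to the 95% significance requirement and, for RMS-error categories, the additional 5% threshold. The data structure and function name are assumptions for exposition; they do not reproduce the FNMOC scorecard software.

```python
def aggregate_scorecard(categories):
    """Sum NOGAPS scorecard weights over all categories (illustrative).

    Each category is a dict with:
      'weight'      -> points at stake (e.g., 1, or 4 for the NH 500-hPa
                       96-h height anomaly correlation),
      'score_type'  -> 'anomaly_correlation' or 'rms_error',
      'pct_change'  -> percentage improvement of the comparison experiment
                       relative to the control (positive = better),
      'significant' -> True if the difference passes the 95% test.
    Returns the aggregate score; -1 or better is treated as neutral.
    """
    total = 0
    for cat in categories:
        if not cat['significant']:
            continue  # no points either way without 95% significance
        if cat['score_type'] == 'rms_error' and abs(cat['pct_change']) < 5.0:
            continue  # RMS-error categories also require at least a 5% change
        total += cat['weight'] if cat['pct_change'] > 0 else -cat['weight']
    return total
```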

Figures 3–10 include black outlines around the boxes where the NOGAPS scorecard metrics are measured. For both experiments, the buoy verification and the tropical cyclone forecast track verification yielded a neutral (+0) scorecard value (not shown).

The aggregate NOGAPS scorecard for the hybrid mode experiments has a score of +1 for the July–August 2010 experiment and +1 for the February–March 2011 experiment. If the 5% improvement constraint is relaxed and only statistical significance is required, then the hybrid system has a total aggregate score of +7 for July–August 2010 and +5 for February–March 2011. Either way, the scores are sufficiently high for the system to be considered for extended testing, tuning, and preoperational trials. For reference, the NAVDAS (3D-Var) preoperational trials had a scorecard value of +1, while the NAVDAS-AR (4D-Var) preoperational trials had a scorecard value of +4.

The NOGAPS scorecard value for the flow-dependent mode experiment is considerably different: both the July–August 2010 and the February–March 2011 experiments had a total aggregate score of −10. Clearly, running the system in the flow-dependent mode would not warrant operational implementation at this time. However, as can be seen in Figs. 13 and 14, some areas performed better in the flow-dependent mode experiment than in the hybrid mode experiment. For example, the flow-dependent mode experiment had greater than a 30% reduction in the tropical RMS vector wind errors against self-analysis at all forecast lead times and at all levels at and above 100 hPa (Figs. 13 and 14). In contrast, the hybrid mode experiment (Figs. 5 and 6) had between a 5% and 20% improvement for most of those levels and forecast lead times. For this reason, we hypothesize that a spatially varying value of α may yield the best results.

f. Aggregate Met Office scorecard

Figure 15 was created to facilitate intercomparison with the work of Clayton et al. (2013) (see Fig. 10 of their paper). This figure does not include all of the Met Office scorecard metrics [listed in Table 1 of Clayton et al. (2013)]. Some of the Met Office scorecard metrics presented in this paper are highlighted in green in Figs. 5–6, 9–10, and 13–14. Comparison with Clayton et al.'s results shows that our results against self-analysis in the tropics are better than those obtained by Clayton et al. but are about the same in the extratropics. Presumably, the differences are due to some combination of differences in the ratio of the accuracy of the ensemble covariances to that of the static covariances in each system. Notable differences between the two systems include (i) our ensemble has 80 members whereas theirs had 24 members; and (ii) Clayton et al. localized ensemble covariances of geopotential, streamfunction, and velocity potential whereas we localized ensemble covariances of wind and temperature. As shown by Kepert (2011), their localization approach is better at preserving quasigeostrophic balance. While quasigeostrophic balance is a fair assumption in the extratropics, its value in the tropics is unclear. Our approach to localization ensures that the variance of the wind field is preserved, whereas the approach of Kepert (2011) does not. As noted by Kepert, localizing streamfunction and velocity potential causes the modeled wind variance to be larger than the wind variance in the unlocalized ensemble. In the extratropics, the potentially damaging effects of spuriously increasing the wind variances are compensated for by improved quasigeostrophic balance. However, in the tropics, there is no compensating "balance benefit" and, hence, it is possible that localization of streamfunction, velocity potential, and geopotential is worse than localization of wind and temperature. More careful experimentation will be required to assess the validity of this hypothesis.
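To make the variance-preservation point concrete, consider the generic elementwise (Schur product) form of covariance localization; this schematic form is assumed here for illustration and is not claimed to reproduce the exact operational formulation:

\left(\mathbf{P}^{\mathrm{loc}}\right)_{ij} = C_{ij}\,\left(\mathbf{P}^{\mathrm{ens}}\right)_{ij}, \qquad C_{ii} = 1 \;\Longrightarrow\; \left(\mathbf{P}^{\mathrm{loc}}\right)_{ii} = \left(\mathbf{P}^{\mathrm{ens}}\right)_{ii}.

When the localized variables are the winds themselves, the diagonal, and hence the wind variances, is unchanged; when streamfunction and velocity potential are localized and the winds are then obtained by differentiation, the implied wind variances are no longer constrained to match those of the unlocalized ensemble.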

Fig. 15.

Truncated table of Met Office scorecard metrics. Bars represent the percentage change in RMS error between the static mode and hybrid mode experiments. Colors represent regional averages: green for the Northern Hemisphere (NH), red for the tropics (TR), and blue for the Southern Hemisphere (SH). Labels are formatted as region (NH, TR, or SH); variable (H = geopotential height, W = vector wind); level (500 = 500 hPa, 200 = 200 hPa, and 850 = 850 hPa); and forecast lead time in hours (T + 24 = 24 h, T + 48 = 48 h, and T + 72 = 72 h). (a),(b) Verification against radiosondes. (c),(d) Verification against self-analysis. (a),(c) are from the July–August 2010 period and (b),(d) are from the February–March 2011 period.


To check the dynamic balance of the initial conditions obtained in our experiments, we examined the forecast model's global RMS surface pressure tendency (SPT) averaged over 0000 UTC 1 July–0000 UTC 1 September 2010 and 0000 UTC 1 February–0000 UTC 1 March 2011. We found that in all experiments the SPT is highest in the first 3 h of the forecast integration. We take these elevated values to be a sign of imbalance and use the average SPT over the first 3 h as a measure of imbalance. The flow-dependent mode SPT values were lower than the static mode SPT values, a result that was statistically significant with greater than 99% confidence. This suggests that the states produced by our localized ensemble covariances propagated by the TLM and adjoint were at least as balanced as those produced by our static initial covariance model.1 In considering this result, recall that the balances used in this static covariance are currently fairly simplistic (Daley and Barker 2001). Further improvements to the balance of initial conditions from localized ensemble covariances could perhaps be obtained by using localization methods better designed to preserve balance, such as those suggested by Kepert (2011). The SPT values of the hybrid mode were also lower than the static mode values, but this difference was not statistically significant at the 95% level. Last, we checked that the SPT values from the static mode were not statistically significantly different from those of the operational mode experiment, in which the time window is discretized into 0.5-h intervals and the digital filter is turned on.
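A minimal sketch of this imbalance measure is given below. The assumption of a regular output grid, the absence of area weighting, and the variable names are simplifications made for illustration; they do not reproduce the diagnostic code used in the experiments.

```python
import numpy as np

def mean_rms_surface_pressure_tendency(ps_series, dt_hours, window_hours=3):
    """Average global RMS surface pressure tendency over the first
    `window_hours` of a forecast, used as a simple measure of imbalance.

    ps_series: array of shape (ntimes, npoints) with surface pressure at
               successive output times spaced `dt_hours` apart.
    """
    n_steps = int(window_hours / dt_hours)
    # Tendencies between consecutive outputs within the window.
    tendencies = np.diff(ps_series[: n_steps + 1], axis=0) / dt_hours
    rms_per_step = np.sqrt(np.mean(tendencies**2, axis=1))  # global RMS each step
    return rms_per_step.mean()
```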

5. Conclusions

We have tested the performance of a new form of the Navy's NAVDAS-AR observation space data assimilation scheme. This new form replaces a purely static initial covariance matrix with a hybrid linear blend of static and ensemble-based covariance matrices. Our results show that the hybrid mode data assimilation scheme significantly reduces forecast error across a wide range of variables and regions compared to the static mode system. The improvements were particularly pronounced for tropical winds. In the verification against radiosondes, the hybrid mode was statistically significantly better than the static mode, with greater than a 0.5% reduction in RMS vector wind error. In the verification against self-analysis, the reduction exceeded 1% at 2–5-day lead times at all eight vertical levels examined in the areas of statistical significance. The average improvement across geopotential height and vector winds at multiple levels and lead times, using a truncated Met Office scorecard, was a 2.6% reduction in RMS error. These improvements are similar to results found at other operational centers, such as Environment Canada (Buehner et al. 2010a,b), the Met Office (Clayton et al. 2013), and NCEP. In contrast to Buehner et al. (2010b), we found that using only the flow-dependent ensemble led to an overall degradation in data assimilation performance for our system. We speculate that improvements to our ensemble generation scheme, an increase in the number of ensemble members, and improvements to our localization scheme would improve the relative performance of this flow-dependent case.

Acknowledgments

We thank all those responsible for the development of NAVDAS-AR; in particular, we thank the late Roger Daley, who first formulated and initiated the development of NAVDAS-AR. We would also like to thank Liang Xu, who was the PI of the project that ultimately led to the transition of NAVDAS-AR into operations. Boon Chua, Tim Hogan, Ben Ruston, James Goerss, and Pat Pauley also made major contributions to NAVDAS-AR. This research was started while D. D. Kuhl held a National Research Council postdoctoral fellowship at the Naval Research Laboratory, Washington, D.C., and continued during his NRL Jerome Karle Fellowship award. T. Rosmond acknowledges support from PMW-120 under Program Element 0603207N. C. H. Bishop acknowledges support from Office of Naval Research base funding via Program Element 0601153N, Task BE033-03-45, and NOPP funding via Program Element 0602435N. NAVDAS-AR was originally developed with ONR and PMW-120 funding under NRL base Program Elements 0601153N and 0602435N.

REFERENCES

• Bishop, C. H., and Z. Toth, 1999: Ensemble transformation and adaptive observations. J. Atmos. Sci., 56, 1748–1765.

• Bishop, C. H., D. Hodyss, P. Steinle, H. Sims, A. M. Clayton, A. C. Lorenc, D. M. Barker, and M. Buehner, 2011: Efficient ensemble covariance localization in variational data assimilation. Mon. Wea. Rev., 139, 573–580.

• Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010a: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. Mon. Wea. Rev., 138, 1550–1566.

• Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010b: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. Mon. Wea. Rev., 138, 1567–1586.

• Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2013: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., doi:10.1002/qj.2054, in press.

• Courtier, P., 1997: Dual formulation of four-dimensional variational assimilation. Quart. J. Roy. Meteor. Soc., 123, 2449–2461.

• Courtier, P., J. N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1387.

• Daley, R., and E. Barker, 2001: The NAVDAS sourcebook. Naval Research Laboratory NRL/PU/7530–01-441, 161 pp. [Available online at http://www.dtic.mil/dtic/tr/fulltext/u2/a396883.pdf.]

• Dee, D., 2004: Variational bias correction of radiance data in the ECMWF system. Proc. Workshop on Assimilation of High Spectral Resolution Sounders in NWP, Reading, United Kingdom, ECMWF, 97–112.

• Derber, J. C., and W. S. Wu, 1998: The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. Mon. Wea. Rev., 126, 2287–2299.

• Etherton, B. J., and C. H. Bishop, 2004: Resilience of hybrid ensemble/3DVAR analysis schemes to model error and ensemble covariance error. Mon. Wea. Rev., 132, 1065–1080.

• Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.

• Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.

• Hamill, T. M., J. S. Whitaker, D. T. Kleist, M. Fiorino, and S. G. Benjamin, 2011: Predictions of 2010's tropical cyclones using the GFS and ensemble-based data assimilation methods. Mon. Wea. Rev., 139, 3243–3247.

• Harris, B., and G. Kelly, 2001: A satellite radiance-bias correction scheme for data assimilation. Quart. J. Roy. Meteor. Soc., 127, 1453–1468.

• Hogan, T. F., T. Rosmond, and R. Gelaro, 1991: The NOGAPS forecast model: A technical description. Naval Research Laboratory AD–A247 216, 218 pp. [Available online at http://handle.dtic.mil/100.2/ADA247216.]

• Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and M. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.

• Houtekamer, P. L., H. L. Mitchell, and X. X. Deng, 2009: Model error representation in an operational ensemble Kalman filter. Mon. Wea. Rev., 137, 2126–2143.

• Kepert, J. D., 2011: Balance-aware covariance localisation for atmospheric and oceanic ensemble Kalman filters. Comput. Geosci., 15, 239–250.

• Kleist, D. T., D. F. Parrish, J. C. Derber, R. Treadon, W. S. Wu, and S. Lord, 2009: Introduction of the GSI into the NCEP Global Data Assimilation System. Wea. Forecasting, 24, 1691–1705.

• Lewis, J. M., and J. Derber, 1985: The use of adjoint equations to solve a variational adjustment problem with advective constraints. Tellus, 37A, 309–322.

• Lorenc, A. C., 1981: A global three-dimensional multivariate statistical interpolation scheme. Mon. Wea. Rev., 109, 701–721.

• Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-Var. Quart. J. Roy. Meteor. Soc., 129, 3183–3203.

• McLay, J. G., C. H. Bishop, and C. A. Reynolds, 2008: Evaluation of the ensemble transform analysis perturbation scheme at NRL. Mon. Wea. Rev., 136, 1093–1108.

• McLay, J. G., C. H. Bishop, and C. A. Reynolds, 2010: A local formulation of the ensemble transform (ET) analysis perturbation scheme. Wea. Forecasting, 25, 985–993.

• Rosmond, T., and L. Xu, 2006: Development of NAVDAS-AR: Non-linear formulation and outer loop tests. Tellus, 58A, 45–58.

• Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method. Mon. Wea. Rev., 125, 3297–3319.

• Wang, X. G., C. Snyder, and T. M. Hamill, 2007: On the theoretical equivalence of differently proposed ensemble–3DVAR hybrid analysis schemes. Mon. Wea. Rev., 135, 222–227.

• Wang, X. G., D. Parrish, D. Kleist, and J. Whitaker, 2013: GSI 3DVar-based ensemble-variational hybrid data assimilation for NCEP Global Forecast System: Single resolution experiments. Mon. Wea. Rev., in press.

• Wilks, D. S., 2011: Statistical Methods in the Atmospheric Sciences. 3rd ed. Elsevier Academic Press, 704 pp.

• Xu, L., T. Rosmond, and R. Daley, 2005: Development of NAVDAS-AR: Formulation and initial tests of the linear problem. Tellus, 57A, 546–559.

• Zhang, F. Q., M. Zhang, and J. A. Hansen, 2009: Coupling ensemble Kalman filter with four-dimensional variational data assimilation. Adv. Atmos. Sci., 26, 1–8.

• Zhang, F. Q., M. Zhang, and J. Poterjoy, 2013: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited-area weather prediction model and comparison to E4DVar. Mon. Wea. Rev., 141, 900–917.

• Zhang, M., and F. Q. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. Mon. Wea. Rev., 140, 587–600.
1 Experiments not reported on here suggest that the use of the TLM and adjoint in the 4D covariance model improves the balance of the analyzed states.
