1. Introduction
Data assimilation (DA) methods incorporating ensemble-derived background error covariances (BECs), such as the ensemble Kalman filter (EnKF; Evensen 1994; Burgers et al. 1998; Houtekamer and Mitchell 1998), have become popular alternatives to variational DA approaches. In contrast to the static, isotropic BECs employed in three-dimensional variational (3DVAR; e.g., Parrish and Derber 1992; Lorenc et al. 2000; Barker et al. 2004) DA, EnKFs compute multivariate flow-dependent BECs from an ensemble of short-term forecasts that represent “errors of the day.” While four-dimensional variational (4DVAR; e.g., Huang et al. 2009) techniques allow BECs to evolve implicitly through tangent linear and adjoint models, use of fixed BECs at the start of each 4DVAR DA cycle represents a significant limitation.
There are many EnKF flavors, including the ensemble adjustment Kalman filter (EAKF; Anderson 2001, 2003), ensemble square root Kalman filter (EnSRF; Whitaker and Hamill 2002), and ensemble transform Kalman filter (ETKF; Bishop et al. 2001). While the precise formulations of these filters differ, they all use short-term ensemble forecasts to compute flow-dependent BECs. Numerous experiments have shown that EnKF-initialized forecasts can produce comparable or better forecasts than 3DVAR-initialized forecasts for a variety of meteorological applications (e.g., Meng and Zhang 2008a,b; Whitaker et al. 2008; Zhang et al. 2011; Zhang et al. 2013), including tropical cyclone (TC) forecasting (e.g., Torn and Hakim 2009; Zhang et al. 2009a; Torn 2010; Hamill et al. 2011a,b). Additionally, the EnKF can initialize forecasts that are competitive with 4DVAR-initialized forecasts (e.g., Buehner et al. 2010a,b; Miyoshi et al. 2010; Zhang et al. 2011; Zhang et al. 2013).
Ensemble-produced BECs can also be incorporated within a variational framework in a “hybrid” variational-ensemble DA approach. Hamill and Snyder (2000) initially proposed a hybrid method that expressed the full BECs as a linear combination of static and ensemble-based contributions in the 3DVAR cost function. Wang et al. (2007b) proved this initial formulation was equivalent to a later hybrid approach that incorporated ensemble BECs into the 3DVAR cost function using extended control variables (Lorenc 2003; Buehner 2005). The hybrid algorithm has also been introduced into 4DVAR systems (e.g., Zhang et al. 2009b; Clayton et al. 2012; Zhang and Zhang 2012). In all the various hybrid paradigms, adjustable parameters determine how much the BECs are weighted toward the static and ensemble contributions.
Hybrid techniques are attractive for several reasons. For example, the hybrid formulation can be easily implemented in pre-existing variational DA systems. Additionally, Zhang et al. (2013) and Wang et al. (2007a) suggested that hybrid techniques may produce similar results as an EnKF but with a substantially smaller ensemble. Moreover, the ensemble component of the hybrid can be at coarser resolution than the deterministic hybrid analysis, permitting higher-resolution analyses without the steep computational cost of a high-resolution ensemble [dual-resolution EnKFs are also possible (e.g., Rainwater and Hunt 2013)]. Furthermore, as the hybrid employs model-space covariance localization, assimilation of nonlocal observations, such as satellite radiances, may be more effective in hybrid frameworks than in EnKFs that use observation-space localization (Campbell et al. 2010).
Several studies have examined hybrid DA schemes in both simple (Hamill and Snyder 2000; Etherton and Bishop 2004; Wang et al. 2007a, 2008a, 2009; Zhang et al. 2009b) and numerical weather prediction (NWP) model settings (Buehner 2005; Buehner et al. 2010a,b; Wang et al. 2008b; Hamill et al. 2011b; Wang 2011, hereafter W11; Zhang and Zhang 2012; Wang et al. 2013; Zhang et al. 2013). Most of these studies have shown that hybrid approaches yield comparable or better forecasts than purely variational methods that do not incorporate ensemble BECs.
Recently, the hybrid has been used to study TCs in global (Hamill et al. 2011b) and regional (W11) modeling systems assimilating real observations. Hamill et al. (2011b) found that the hybrid produced statistically significantly better TC track forecasts than those initialized from 3DVAR analyses and comparable track forecasts to those initialized from mean EnSRF analyses. Similarly, W11 found that the flow-dependent BECs utilized in a limited-area hybrid DA setting were responsible for improved TC track forecasts compared to experiments initialized with 3DVAR DA.
W11 did not employ a TC relocation approach (e.g., Kurihara et al. 1995; Liu et al. 2000; Hsiao et al. 2010) to improve the TC initialization in their regional hybrid configuration. However, Hsiao et al. (2010) found that performing TC relocation before 3DVAR analyses improved limited-area Weather Research and Forecasting Model (WRF; Skamarock et al. 2008) TC track forecasts. The usefulness of TC relocation within a hybrid DA framework has not been previously examined. Furthermore, Hsiao et al. (2012) showed that using three outer loops (OLs; Courtier et al. 1994) in 3DVAR minimization dramatically improved TC track forecasts and lessened model biases, but the impact of employing multiple OLs has also not been examined in regional hybrid configurations [Wang et al. (2013) noted little difference between using one and two OLs in a global hybrid system].
This study again investigates TC track forecasts within a limited-area hybrid DA system, similar to W11. However, this work differs from W11 in several important ways. First, it investigates TC relocation and the use of multiple OLs in a limited-area hybrid framework, topics not previously explored. Second, while W11 cycled continuously for two separate one-week periods, here the hybrid and 3DVAR systems were continuously cycled for a ~3.5-week period and produced 78 total 72-h forecasts examining Typhoons Jangmi, Sinlaku, and Hagupit (2008). Moreover, here an EAKF (Anderson 2001, 2003; Liu et al. 2012) from the Data Assimilation Research Testbed (DART; Anderson et al. 2009) software was employed to update the ensemble, whereas W11 used an ETKF to update ensemble perturbations.
Section 2 details the forecast model and DA configurations while section 3 describes the experimental design. Results are presented in section 4, and a discussion of the impact of multiple OLs in hybrid and 3DVAR systems is provided in section 5. Section 6 assesses the impact of TC relocation on the forecasts before conclusions are presented in section 7.
2. Model and DA configurations
a. Forecast model
Weather forecasts were produced by version 3.3.1 of the nonhydrostatic Advanced Research WRF (ARW-WRF, hereafter WRF; Skamarock et al. 2008). All experiments ran over a computational domain encompassing the western Pacific Ocean and eastern Asia (Fig. 1). The horizontal grid spacing was 45 km, the time step was 180 s, and the domain spanned 222 × 128 grid points in the east–west and north–south directions, respectively. There were 45 vertical levels with a 30-hPa top. The following physical parameterizations were used: the Goddard microphysics scheme (Tao and Simpson 1993; Tao et al. 2003); the Rapid Radiative Transfer Model (RRTM) longwave (Mlawer et al. 1997) and Goddard shortwave (Chou and Suarez 1994) radiation schemes; the Yonsei University (YSU) boundary layer scheme (Hong et al. 2006); the Noah land surface model (Chen and Dudhia 2001); and Kain–Fritsch cumulus parameterization (Kain and Fritsch 1990, 1993; Kain 2004) with a modified trigger function (Ma and Tan 2009). Lateral boundary conditions (LBCs) were supplied from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) forecasts.

Fig. 1. Snapshot of observations available for assimilation during the 0000 UTC 12 Sep analysis.
Citation: Monthly Weather Review 141, 12; 10.1175/MWR-D-13-00028.1
b. Data assimilation systems
The hybrid and 3DVAR algorithms in version 3.3.1 of the WRF data assimilation (WRFDA; Barker et al. 2012) software were used. WRFDA employs five control variables: streamfunction, pseudo relative humidity, and unbalanced velocity potential, temperature, and surface pressure. The WRFDA-3DVAR algorithm is described in Barker et al. (2004) and the hybrid implementation in WRFDA follows the extended control variable approach (Wang et al. 2008a). In the following paragraphs we provide descriptions of the WRFDA 3DVAR and hybrid formulations for multiple OL applications.
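As a point of reference, the hybrid cost function underlying the extended-control-variable implementation that WRFDA follows (Wang et al. 2008a) can be sketched, for a single OL, as below; this is the standard form in that paper's notation, not a verbatim reproduction of the WRFDA documentation:

```latex
J(\mathbf{x}'_1,\mathbf{a}) =
  \frac{\beta_1}{2}\,\mathbf{x}'^{\mathrm{T}}_1 \mathbf{B}^{-1} \mathbf{x}'_1
+ \frac{\beta_2}{2}\sum_{k=1}^{K} \mathbf{a}_k^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{a}_k
+ \frac{1}{2}\left(\mathbf{y}' - \mathbf{H}\mathbf{x}'\right)^{\mathrm{T}}
  \mathbf{R}^{-1}\left(\mathbf{y}' - \mathbf{H}\mathbf{x}'\right),
\qquad
\mathbf{x}' = \mathbf{x}'_1 + \sum_{k=1}^{K}\mathbf{a}_k \circ \mathbf{x}^{e}_k .
```

Here x′ is the total analysis increment, x′1 the increment associated with the static BEC matrix B, a_k the extended control variables for the K ensemble members, C the localization correlation matrix, x_k^e the kth ensemble perturbation, y′ the innovation vector, H the linearized observation operator, R the observation error covariance, and ∘ the elementwise product. The weights satisfy 1/β1 + 1/β2 = 1; under this convention a 75% ensemble weighting corresponds to 1/β2 = 0.75.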
Static BECs used in the 3DVAR and hybrid algorithms were constructed using the National Meteorological Center (NMC) method (Parrish and Derber 1992) from WRF forecasts produced over this domain for multiple months and used operationally by the Taiwan Central Weather Bureau (CWB). Empirically determined multiplicative weightings (Guo et al. 2006; Table 1) used by the CWB modified the initially computed static BEC standard deviations and length scales. As the magnitudes of the multiplicative weightings generally decreased with each OL, the effective BEC standard deviations and length scales were reduced in each successive OL. Thus, more emphasis was placed on the guess field and the observational content was spread over a smaller area in successive OLs. These identical static BEC tunings were used in all 3DVAR and hybrid experiments (section 3). Zhang et al. (2013) also inflated their background error standard deviations and achieved better results compared to using uninflated default statistics.
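The effect of the per-OL weightings can be sketched as follows; the factor values below are hypothetical placeholders, not the actual CWB values of Table 1:

```python
# Sketch of per-OL multiplicative tuning of the static BECs: each outer loop
# applies one (std dev, length scale) factor pair, shrinking both so that
# later OLs trust the guess more and spread observations over a smaller area.

def tune_static_bec(sigma_b, length_scale, factors):
    """Return the effective (sigma_b, length_scale) used in each OL."""
    tuned = []
    for std_factor, len_factor in factors:
        tuned.append((sigma_b * std_factor, length_scale * len_factor))
    return tuned

# Hypothetical factors that decrease with each OL (three OLs).
factors = [(1.0, 1.0), (0.7, 0.8), (0.5, 0.6)]
per_ol = tune_static_bec(2.0, 500.0, factors)  # e.g., 2 m/s std dev, 500-km scale
```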
Table 1. Empirical multiplicative tuning factors used to modify the static background error standard deviations and length scales in each OL for the different control variables, from Guo et al. (2006) and used operationally by the CWB.
The hybrid uses an ensemble of short-term forecasts to incorporate flow-dependent BECs in the variational cost function and it is necessary to update the ensemble when new observations are available. Here, the EAKF from the DART was used to update a 32-member WRF-based ensemble. To reduce spurious correlations due to sampling error, localization forced EAKF analysis increments to zero 1280 km from an observation in the horizontal and ~10 km in the vertical. Comparable values were used in DART to study TCs in previous work (Liu et al. 2012). Adaptive inflation (Anderson 2009) was employed to maintain ensemble spread, with the inflation applied immediately before computation of prior model-simulated observations. A stochastic kinetic-energy backscatter scheme (SKEBS; Shutts 2005; Berner et al. 2009) was applied during WRF advances between each EAKF analysis to further preserve spread.
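The distance-dependent tapering described above can be illustrated with the fifth-order piecewise rational function of Gaspari and Cohn (1999) that DART commonly uses for localization (a sketch, not the actual DART code); with a half-width of 640 km the weight reaches exactly zero at 1280 km:

```python
# Gaspari-Cohn (1999) fifth-order piecewise rational localization function.
# The weight is 1 at the observation, decays smoothly, and is exactly zero
# beyond twice the half-width c, so c = 640 km forces increments to zero
# 1280 km from an observation.

def gaspari_cohn(r, c):
    """Localization weight for separation r given half-width c (same units)."""
    x = r / c
    if x <= 1.0:
        return -0.25 * x**5 + 0.5 * x**4 + 0.625 * x**3 - (5.0 / 3.0) * x**2 + 1.0
    elif x <= 2.0:
        return (x**5 / 12.0 - 0.5 * x**4 + 0.625 * x**3
                + (5.0 / 3.0) * x**2 - 5.0 * x + 4.0 - (2.0 / 3.0) / x)
    return 0.0
```

An ensemble covariance element for a point at distance r from the observation is multiplied by this weight before the increment is computed.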
Localization was also applied in the hybrid to limit the spatial extent of the ensemble contribution to the analysis increments. Horizontal localization with approximately the same length scale as in DART was applied in the hybrid. Vertical localization in the hybrid varied with height (Fig. 2) and was implemented using EOFs configured such that the vertical length scales increased with height.

Fig. 2. Vertical correlation of model levels 1, 10, 20, 30, and 40 with all model levels (y axis) used to localize the ensemble contribution to the BECs.
The hybrid BECs were weighted 75% toward the ensemble contribution and 25% toward the static (i.e., 3DVAR) component. We also tested 50%–50% and 25%–75% ensemble–static weightings and achieved similar results. Zhang et al. (2013), Wang et al. (2013), and W11 also noticed limited sensitivity to the BEC weightings in their real-data hybrid studies.
c. Observations
Observations taken within ±3 h of each analysis were assimilated and all observations were assumed to be valid at the analysis time. Surface mass and wind observations from synoptic observation (SYNOP), aviation routine weather report (METAR), ship, and buoy platforms were assimilated. Upper-air, radiosonde, aircraft, infrared and water vapor channel satellite-tracked wind, Quick Scatterometer (QuikSCAT) wind over water, and Global Positioning System Radio Occultation (GPSRO) refractivity observations were also assimilated. Furthermore, “bogus” TC observations (Hsiao et al. 2010) at several vertical levels and ~30 positions surrounding each TC were assimilated.
A typical distribution of observations available for assimilation is shown in Fig. 1. Over the ocean, satellite-derived wind observations provided the bulk of available data. Aircraft observations were primarily taken over eastern China. Many surface and radiosonde observations were available for assimilation over land. GPSRO refractivity observations were scattered throughout the domain. Bogus TC observations were distributed around Typhoon Sinlaku, and a similar spatial distribution of TC bogus observations was used at other times.
Observations were subject to various forms of quality control. These checks excluded observations outside the model domain and above the model top and chose the observation nearest the analysis time at stations where multiple observations were received during the ±3-h time window. Observational availability and preprocessing was identical for the hybrid and 3DVAR experiments. Additionally, an “outlier check” was applied before each OL of the variational minimization procedure in the hybrid and 3DVAR experiments. Specifically, an observation was not assimilated in the 3DVAR and hybrid experiments if its innovation exceeded 5σo, where σo is the observation error standard deviation.
The DART system assimilated a different observational set and performed preprocessing procedures following Torn (2010) and Liu et al. (2012), who both used DART to successfully simulate TCs. Satellite wind and aircraft observations were "superobbed" in 100 km × 100 km × 25 hPa boxes, only pressure observations from surface platforms were assimilated, and satellite winds were not assimilated over land. Furthermore, no QuikSCAT observations were assimilated and only 700-hPa TC bogus observations of wind and relative humidity were assimilated. A different outlier check was applied in DART compared to that in the 3DVAR and hybrid experiments to account for ensemble spread. As in Liu et al. (2012), an outlier check was applied in DART that rejected an observation if the ensemble mean innovation was greater than 3 times the square root of the sum of the prior ensemble variance and the observation error variance.
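The two outlier checks can be sketched as follows; this is a minimal illustration, and the function names are hypothetical rather than actual WRFDA or DART routines:

```python
import math

def variational_outlier_check(innovation, sigma_o):
    """3DVAR/hybrid check: accept only if |y - H(xb)| <= 5 * sigma_o,
    where sigma_o is the observation error standard deviation."""
    return abs(innovation) <= 5.0 * sigma_o

def eakf_outlier_check(mean_innovation, prior_var, obs_err_var):
    """DART-style check (as in Liu et al. 2012): accept only if the ensemble
    mean innovation is within 3 * sqrt(prior ensemble variance +
    observation error variance), so larger spread tolerates larger misfits."""
    return abs(mean_innovation) <= 3.0 * math.sqrt(prior_var + obs_err_var)
```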
3. Experimental design
Four initial experiments were performed. Two utilized the 3DVAR DA method and were configured identically, except one used 3 OLs (3DVAR-3OL) and the other 1 OL (3DVAR-1OL) during minimization. The other experiments employed the hybrid DA approach and corresponded to the 3DVAR experiments, with one using 3 OLs (HY-3OL) and the other 1 OL (HY-1OL).
All experiments began at 0000 UTC 4 September by interpolating the deterministic 0.5° × 0.5° GFS analysis onto the same computational domain (Fig. 1). The initial ensemble was constructed at this time by taking Gaussian random draws with zero mean and static BECs (Torn et al. 2006) and adding them to the GFS analysis. LBCs for the ensemble system were perturbed similarly.
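The random-draw initialization can be sketched as below. This is a toy stand-in that samples N(0, B) with an explicit Cholesky factor of a tiny dense B; the real system (Torn et al. 2006) draws perturbations through the control-variable transform rather than forming B explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_initial_ensemble(analysis, B, n_members):
    """Add zero-mean Gaussian perturbations with covariance B (static BECs)
    to the deterministic analysis to build the initial ensemble."""
    L = np.linalg.cholesky(B)               # B = L L^T
    perts = rng.standard_normal((n_members, analysis.size)) @ L.T
    return analysis[None, :] + perts

# Toy example: 3 state variables, 32 members, weakly correlated B.
B = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])
ens = make_initial_ensemble(np.array([10.0, 20.0, 30.0]), B, 32)
```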
The deterministic and ensemble fields produced at 0000 UTC 4 September initialized 6-h WRF forecasts, which served as backgrounds for the first 3DVAR, hybrid, and EAKF analyses at 0600 UTC 4 September. Thereafter, the EAKF, 3DVAR, and hybrid configurations cycled continuously until 0000 UTC 28 September, with a new analysis every 6 h. The background for DA was always the previous cycle's 6-h forecast. New 72-h WRF forecasts were initialized every 6 h starting at 1800 UTC 8 September and ending at 0000 UTC 28 September (78 forecasts). Digital filter initialization (DFI; Lynch and Huang 1992; Huang and Lynch 1993) using a twice-DFI scheme and the Dolph filter (Lynch 1997) with a 2-h backward integration was applied to all 72-h forecasts, but not during the 6-h cycling between analyses. Ancell (2012) showed that DFI, when applied to model advances between EnKF analysis cycles, yielded no improvement compared to when DFI was not applied.
The procedure for the cycling hybrid system was similar, but not identical to, that in Zhang et al. (2013) and is illustrated in Fig. 3. Given a background at an analysis time T, separate EAKF and hybrid analyses were performed. The hybrid utilized the ensemble valid at T to incorporate flow-dependent BECs into the variational framework. After each EAKF analysis, a 6-h ensemble forecast produced the background ensemble at T + 6. Similarly, the deterministic hybrid analysis was integrated to T + 6 and served as the background for the next hybrid analysis. There was no interaction between the hybrid and ensemble systems in this uncoupled configuration. Alternatively, as in Zhang et al. (2013), the analysis ensemble could be recentered about the hybrid analysis, which, while shifting the ensemble mean, preserves the perturbations about the mean. We tested the hybrid in this coupled approach but did not obtain substantially different results, similar to findings of Wang et al. (2013). In a dual-resolution hybrid configuration, where the ensemble is at coarser resolution than the deterministic hybrid, a coupled system is likely more essential since model errors and biases due to different resolutions may be quite different.
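The uncoupled cycling in Fig. 3 reduces to the following loop, where the ensemble feeds flow-dependent BECs into each hybrid analysis with no feedback in the other direction; the function arguments are placeholders for the EAKF, hybrid, and WRF forecast steps, not real WRFDA/DART calls:

```python
# Uncoupled cycling: at each analysis time T, the EAKF updates the ensemble
# and the hybrid updates the deterministic state using the prior ensemble
# valid at T. Both are then advanced 6 h to form the next backgrounds.

def cycle(background_det, background_ens, obs_by_time, eakf, hybrid, forecast):
    for t, obs in obs_by_time:
        analysis_ens = eakf(background_ens, obs)
        analysis_det = hybrid(background_det, obs, background_ens)  # ens BECs
        background_ens = [forecast(m, hours=6) for m in analysis_ens]
        background_det = forecast(analysis_det, hours=6)
    return background_det, background_ens
```

Recentering the analysis ensemble about the hybrid analysis (the coupled variant) would add one extra step after the two analyses in each iteration.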

Fig. 3. Flowchart describing the cycling EAKF and hybrid systems.
The cycling 3DVAR procedure was identical to the deterministic hybrid cycling (bottom circuit in Fig. 3), except that the BECs were purely static.
4. Results
Model output was assessed using a variety of metrics. TC track forecasts were verified and model fields were compared to observations, including dropwindsondes. Aspects of the ensemble forecasts were also examined since they are important inputs to the hybrid. The first ~5 days of the simulations were excluded from all verification statistics to allow ample time for the ensemble to “spin up” from the initial, randomly generated ensemble and to coincide with formation of Typhoon Sinlaku (Table 2).
a. Ensemble performance
Since the hybrid algorithm incorporated flow-dependent BECs produced by the cycling WRF–EAKF ensemble system, it is important to assess the ensemble performance. In a well-calibrated system, the ensemble mean root-mean-square error (RMSE) with respect to observations will equal the total spread, defined as the square root of the sum of the observation error variance and the ensemble variance of the simulated observations (Houtekamer et al. 2005).
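The spread-skill diagnostic can be sketched as a minimal implementation under the definition above:

```python
import numpy as np

def total_spread_and_rmse(ens_sim_obs, obs, obs_err_var):
    """Spread-skill consistency check (Houtekamer et al. 2005): in a
    well-calibrated ensemble, the RMSE of the ensemble mean matches the
    total spread sqrt(obs error variance + ensemble variance), averaged
    over the observations. ens_sim_obs has shape (members, n_obs)."""
    mean = ens_sim_obs.mean(axis=0)
    rmse = np.sqrt(np.mean((mean - obs) ** 2))
    total_spread = np.sqrt(np.mean(obs_err_var + ens_sim_obs.var(axis=0, ddof=1)))
    return rmse, total_spread
```

A total spread below the RMSE (as found here outside ~250–500 hPa) indicates an underdispersive ensemble.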
The 6-h forecast bias, RMSE, and total spread aggregated between 1800 UTC 8 September and 0000 UTC 28 September are shown in Fig. 4 for radiosonde observations. These 6-h forecast ensembles were directly used as input to the hybrid. The ensemble mean biases versus wind (temperature) observations were less than 0.5 m s−1 (K) at most levels (Figs. 4a,b). A slight dry bias was noted for 700-hPa specific humidity (Fig. 4c). Other than for wind and temperature between ~250 and 500 hPa, the spread was insufficient, suggesting that additional model error parameterization beyond SKEBS may be needed.

Fig. 4. Average 6-h bias, total spread, and RMSE of radiosonde (a) temperature (K), (b) horizontal wind (m s−1), and (c) specific humidity (g kg−1) between 1800 UTC 8 Sep and 0000 UTC 28 Sep for selected pressure levels. The sample size at each pressure level is shown at the right of each panel.
Observations in areas with large ensemble standard deviations (spread) were most likely to have an impact in the EAKF and hybrid. There was less forecast uncertainty where spread was small, and observations were less likely to influence the analyses in these locales compared to regions with greater spread.
Ensemble spread of 500-hPa potential temperature and wind speed, aggregated over all 6-h forecasts valid at 0000 (Figs. 5a,b) and 1200 (Figs. 5c,d) UTC, reveals patterns that reflect aspects of the meteorological conditions and observation locations. Spread was greatest over the Tibetan Plateau, where few observations were available to constrain the model. Conversely, spread was smallest over eastern China, where observations were plentiful. The spread was also small over the Pacific Ocean where the subtropical high was located and predicted confidently. A local spread maximum was evident in both 500-hPa wind and potential temperature southeast of Taiwan, where all three TCs moved, reflecting the uncertainty of TC prediction. Patterns were similar at 0000 and 1200 UTC and at other pressure levels (not shown).

Fig. 5. Mean 6-h forecast ensemble standard deviation (spread) of 500-hPa (a),(c) potential temperature (K) and (b),(d) wind speed (m s−1) valid at (a),(b) 0000 and (c),(d) 1200 UTC.
b. TC track forecasts
Best track positions from the Taiwan CWB were used to verify TC forecasts. The DART forward operator was used to diagnose TC position using 800-hPa circulation (Cavallo et al. 2013). Track error statistics were computed from WRF forecasts initialized during the lifespan of each TC (Table 2). Sometimes the experiments failed to predict a TC, and different experiments missed different storms. Performing homogeneous comparisons based solely on storms that all experiments successfully predicted markedly decreased sample sizes. Thus, inhomogeneous comparisons among the experiments were employed to compare TC track forecasts.
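Absolute track error is the great-circle distance between the forecast and best-track positions; a minimal haversine implementation, assuming a spherical Earth of radius 6371 km:

```python
import math

EARTH_RADIUS_KM = 6371.0

def track_error_km(lat_f, lon_f, lat_b, lon_b):
    """Great-circle (haversine) distance in km between a forecast TC
    position and the corresponding best-track position."""
    phi_f, phi_b = math.radians(lat_f), math.radians(lat_b)
    dphi = math.radians(lat_b - lat_f)
    dlmb = math.radians(lon_b - lon_f)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi_f) * math.cos(phi_b) * math.sin(dlmb / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Mean absolute track error at each forecast hour is then the average of this distance over all verifying forecasts.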
Table 2. Life spans of each TC.
The best-track positions of each storm are shown in Fig. 6a, and the initial positions analyzed by each experiment every 6 h between 1800 UTC 8 September and 0000 UTC 28 September are overlaid in Fig. 6b. The HY-1OL, HY-3OL, and 3DVAR-3OL initial TC positions agreed well with observations throughout the period, and, in fact, it is difficult to distinguish these three experiments from each other and observations in Fig. 6b. However, 3DVAR-1OL initial TC positions were often inconsistent with the observations, despite assimilation of bogus data.

Fig. 6. (a) Best-track positions of the three TCs. Locations are plotted every 6 h. See Table 2 for the starting and ending times of each storm. (b) Analyzed TC positions for each experiment and the corresponding observations every 6 h between 1800 UTC 8 Sep and 0000 UTC 28 Sep.
Figure 7 shows the mean absolute track errors for each TC and the sample size at each forecast hour for each experiment. The hybrid configurations produced nearly identical track errors for Sinlaku (Fig. 7a) throughout the forecast period that were similar to 3DVAR-3OL errors. The 3DVAR-1OL had the largest track errors for Sinlaku at all times and missed the most storms (Fig. 7b). For Hagupit, 3DVAR-1OL again missed the most storms and produced the worst track errors for most times (Figs. 7c,d). Before ~24 h, the two hybrid experiments' errors were similar; thereafter, HY-1OL yielded lower errors than HY-3OL. The 3DVAR-3OL produced lower track errors than 3DVAR-1OL for most times but larger errors than both hybrid experiments. For Jangmi, the 3DVAR-1OL again yielded the worst track errors for all times and missed the most storms (Figs. 7e,f). The hybrid experiments performed similarly and 3DVAR-3OL was best after ~24 h.

Fig. 7. Mean 0–72-h (left) absolute track errors (km) and (right) sample sizes for (a),(b) Sinlaku; (c),(d) Hagupit; and (e),(f) Jangmi.
The track errors averaged over the three TCs (Fig. 8) reflect error statistics of the individual storms. The hybrid configuration with one OL was superior to 3DVAR-1OL, corroborating W11. But, when the 3DVAR experiment was configured with three OLs, forecasts were dramatically better, confirming Hsiao et al. (2012), who obtained similar results while examining these same TCs with cycling one and three OL 3DVAR configurations. In fact, the 3DVAR-3OL mean errors averaged over all TCs were comparable to those of the hybrid experiments. The hybrid TC track errors were insensitive to the number of OLs, as the two hybrid experiments performed similarly on average. A discussion regarding the impact of multiple OLs in the 3DVAR and hybrid experiments is provided in section 5.

Fig. 8. As in Fig. 7, but for track errors averaged over the three TCs and the total sample size.
c. Verification versus conventional observations
To assess large-scale performance, model output was also verified against various observations at several forecast times. A common observational set was used to verify all experiments. To reduce data volume, verification versus satellite-tracked winds was performed against superobs produced as described in section 2c. Rather than verifying wind components separately, the total horizontal wind, defined as the square root of the sum of the squares of the zonal and meridional wind components, was verified. Statistics were aggregated over 78 forecasts initialized every 6 h between 1800 UTC 8 September and 0000 UTC 28 September.
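A minimal sketch of the verification quantities, with bias taken as the mean error and RMSE as the root-mean-square error over matched forecast-observation pairs:

```python
import numpy as np

def wind_speed(u, v):
    """Total horizontal wind from zonal and meridional components."""
    return np.sqrt(u ** 2 + v ** 2)

def bias_and_rmse(forecast, observed):
    """Mean error (bias) and root-mean-square error over matched pairs."""
    err = np.asarray(forecast) - np.asarray(observed)
    return err.mean(), np.sqrt((err ** 2).mean())
```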
We first examine the average RMSE and bias for 6-h forecasts to measure the quality of the background fields (Fig. 9). At 6 h, RMSEs for specific humidity were similar among all four experiments, and 3DVAR-1OL had the worst RMSEs compared to satellite-tracked and radiosonde winds, radiosonde temperature, and GPSRO refractivity observations below ~7 km. HY-1OL always had better or similar RMSEs compared to 3DVAR-1OL. There were no major differences between the two hybrid experiments' biases and RMSEs for all variables, although RMSEs were slightly lower with respect to radiosonde and satellite-tracked winds in HY-3OL than HY-1OL. The 3DVAR-3OL performed similarly to the hybrid configurations.

Fig. 9. RMSE (solid lines) and bias (dashed lines) for verification vs (a) radiosonde temperature (K), (b) radiosonde horizontal wind (m s−1), (c) radiosonde specific humidity (g kg−1), (d) GPSRO refractivity (N units), and (e) satellite-tracked wind (m s−1) observations aggregated over all 6-h forecasts initialized between 1800 UTC 8 Sep and 0000 UTC 28 Sep. The sample size at each level is listed to the right of each panel.
Figure 10 shows the mean statistics at the analysis time to measure how closely the analyses fit the observations. Closer fits to wind and GPSRO refractivity observations in both the 3DVAR and hybrid configurations were realized when three OLs were employed, as evidenced by the mean RMSEs. Differences were larger between the 3DVAR experiments. The differences between the hybrid configurations at the analysis were notably larger than those at 6 h (Fig. 9), suggesting only a slight gain in value from fitting the observations more closely in HY-3OL. In contrast, the closer fit to observations in 3DVAR-3OL yielded a large benefit at 6 h compared to 3DVAR-1OL. Analysis fits to temperature and specific humidity were relatively insensitive to the OLs. Differences between HY-1OL and HY-3OL fits to temperature were nearly indistinguishable, and at most ~0.1 K (g kg−1) difference separated the 3DVAR-1OL and 3DVAR-3OL temperature (specific humidity) RMSEs. Bias characteristics were more similar among the experiments than RMSEs.

As in Fig. 9, but for the mean analysis fits to observations.
At 36 h (Figs. 11a–c), 3DVAR-1OL produced slightly worse specific humidity forecasts between 850 and 400 hPa than the other three experiments. Additionally, 3DVAR-3OL still produced lower RMSEs than 3DVAR-1OL for radiosonde temperature and wind, and HY-1OL also yielded better wind and temperature forecasts than 3DVAR-1OL at this time. RMSEs and biases of the two hybrid configurations were similar and comparable to those of 3DVAR-3OL. Biases and RMSEs compared to satellite wind superobs were similar to those at 6 h, with 3DVAR-1OL performing worst and the other three experiments performing similarly (not shown).

As in Fig. 9, but for (a)–(c) 36-h and (d)–(f) 72-h forecasts of (top to bottom) radiosonde temperature (K), horizontal wind (m s−1), and specific humidity (g kg−1).
By 72 h (Figs. 11d–f), the relationships between the experiments were the same as at 36 h, except that the performance of 3DVAR-1OL worsened for radiosonde temperature and specific humidity relative to the other three configurations. Additionally, while there were few differences between the four experiments' RMSEs and biases versus GPSRO observations at 36 h, at 72 h 3DVAR-1OL performed worse than the other three configurations, as at 6 h (not shown).
d. Verification versus dropwindsonde observations
The forecasts were also compared to dropwindsondes released from aircraft during The Observing System Research and Predictability Experiment (THORPEX) Pacific Asian Regional Campaign (T-PARC; Elsberry and Harr 2008; Wang et al. 2010). These observations were not assimilated and provide an independent dataset for model validation. Many of these dropwindsondes sampled near-storm environments and represent mesoscale conditions surrounding the TCs. A total of 664 dropwindsonde profiles contained quality-controlled observations suitable for verification over the domain between 1800 UTC 8 September and 0000 UTC 28 September (Fig. 12), and we focus on bias and RMSE aggregated over this period.

Locations of each dropwindsonde sounding between 1800 UTC 8 Sep and 0000 UTC 28 Sep (inclusive) that was used for verification.
Large wind errors were evident throughout the 72-h forecast (Fig. 13). Initially, 3DVAR-1OL had the worst RMSEs for wind, temperature, and specific humidity (Figs. 13a–c). HY-1OL had lower RMSEs than 3DVAR-1OL, and the 3DVAR-3OL RMSEs were similar to those of the hybrid experiments. Biases of the 3DVAR-3OL and two hybrid experiments were generally similar. For some variables and levels, 3DVAR-1OL had the smallest biases (e.g., wind between ~400 and 200 hPa) but for others, the largest (e.g., wind below ~400 hPa, temperature below ~700 hPa, midtropospheric moisture). Biases and RMSEs at 6 h were quite similar to those at 0 h, though the magnitudes of the RMSEs increased (not shown).

Aggregate RMSE (solid lines) and bias (dashed lines) for forecast verification vs dropwindsonde (a),(d),(g) temperature (K); (b),(e),(h) horizontal wind (m s−1); and (c),(f),(i) specific humidity (g kg−1) for (top to bottom) 0-, 36-, and 72-h forecasts initialized between 1800 UTC 8 Sep and 0000 UTC 28 Sep for selected pressure levels. The sample size at each level is listed to the right of each panel.
At 36 h (Figs. 13d–f), 3DVAR-1OL still had the highest RMSEs for most variables but the smallest low-level specific humidity bias. Temperature RMSEs and biases of HY-1OL, HY-3OL, and 3DVAR-3OL were similar. HY-1OL had the lowest wind RMSEs below 700 hPa, whereas 3DVAR-3OL had the smallest bias over that layer. By 72 h (Figs. 13g–i), the hybrid experiments had the smallest wind biases at most levels, and the HY-3OL and 3DVAR-3OL wind RMSEs were similar and smallest. HY-1OL continued to have lower RMSEs than 3DVAR-1OL.
Overall, verification versus dropwindsondes was consistent with verification against soundings and satellite winds. HY-1OL was better than 3DVAR-1OL, there were no consistent differences between the two hybrid configurations, and 3DVAR-3OL usually performed similarly to HY-1OL and HY-3OL. These findings suggest 3DVAR-1OL had the poorest representation of the mesoscale environment, consistent with its comparatively poor TC track forecasts.
5. Discussion of multiple outer loops
OLs are part of the variational DA procedure when the cost function is written in incremental form (Courtier et al. 1994). The model guess at the start of each OL is the previous OL's analysis that was computed by iterative minimization in an inner loop. Use of multiple OLs permits assimilation of greater observation numbers, since an observation that failed an outlier check (see section 2c) and was rejected in an earlier OL may be close enough to an updated guess field to be assimilated in a subsequent OL.
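In the incremental formulation (Courtier et al. 1994), the guess is relinearized at the start of each OL. A standard sketch of the inner-loop cost function for OL k, with notation following Ide et al. (1997) rather than the exact WRFDA formulation, is

```latex
J^{k}(\delta\mathbf{x}) =
  \tfrac{1}{2}\left(\delta\mathbf{x} + \mathbf{x}^{k-1} - \mathbf{x}_{b}\right)^{\mathrm{T}}
  \mathbf{B}^{-1}
  \left(\delta\mathbf{x} + \mathbf{x}^{k-1} - \mathbf{x}_{b}\right)
  + \tfrac{1}{2}\left(\mathbf{H}^{k}\,\delta\mathbf{x} - \mathbf{d}^{k}\right)^{\mathrm{T}}
  \mathbf{R}^{-1}
  \left(\mathbf{H}^{k}\,\delta\mathbf{x} - \mathbf{d}^{k}\right),
\qquad
\mathbf{d}^{k} = \mathbf{y} - H\!\left(\mathbf{x}^{k-1}\right),
```

where x^(k−1) is the analysis from the previous OL (with x^0 = x_b), H^k is the observation operator linearized about x^(k−1), and B and R are the background and observation error covariances. Because the innovation d^k is recomputed against the updated guess at the start of each OL, an observation rejected by the outlier check in an earlier OL may pass it in a later one.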
Results showed that 3DVAR was improved by using multiple OLs but the hybrid was not. To investigate this difference, we first examine the variational cost function reduction throughout the OLs and then describe results from several auxiliary experiments designed to help interpret the results.
a. Cost function reduction


Figure 14 shows the cost function reduction during minimization of the initial HY-3OL and 3DVAR-3OL analyses (0600 UTC 4 September) when the backgrounds were identical. Thus, the same number of observations passed the outlier check in the first OLs of both analyses, and differences in the cost function reduction can be attributed entirely to differences between the hybrid and 3DVAR algorithms.

Cost function reduction for the three OL 0600 UTC 4 Sep (a) 3DVAR and (b) hybrid analyses.
The cost function J was decomposed into Jb, Jo, and Je (in the hybrid) to show the contribution of each term to the analyses. The beginning of each OL is clearly demarcated by jumps in the J and Jo curves. Initially, by definition, J = Jo. As minimization progressed, Jo decreased while Jb and Je increased. By the end of the first OL, Jo was reduced more in the hybrid than in 3DVAR, and this behavior was noted in every analysis. However, during the second and third OLs, the hybrid J and Jo were reduced little, while substantial reduction occurred in 3DVAR; these behaviors were also observed in every analysis. In fact, by the end of the third OL, 3DVAR had lower Jo values than the hybrid ~63% of the time.
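For reference, in the extended-control-variable hybrid formulation (Lorenc 2003; Wang et al. 2007b) these three terms take the schematic form (a sketch, not the exact WRFDA implementation):

```latex
J(\delta\mathbf{x}_{1}, \boldsymbol{\alpha})
  = \underbrace{\tfrac{\beta_{1}}{2}\,
      \delta\mathbf{x}_{1}^{\mathrm{T}}\mathbf{B}^{-1}\delta\mathbf{x}_{1}}_{J_{b}}
  + \underbrace{\tfrac{\beta_{2}}{2}\,
      \boldsymbol{\alpha}^{\mathrm{T}}\mathbf{A}^{-1}\boldsymbol{\alpha}}_{J_{e}}
  + \underbrace{\tfrac{1}{2}\left(\mathbf{H}\,\delta\mathbf{x} - \mathbf{d}\right)^{\mathrm{T}}
      \mathbf{R}^{-1}\left(\mathbf{H}\,\delta\mathbf{x} - \mathbf{d}\right)}_{J_{o}},
\qquad
\delta\mathbf{x} = \delta\mathbf{x}_{1}
  + \sum_{n=1}^{N}\boldsymbol{\alpha}_{n}\circ\mathbf{x}_{n}^{e},
```

where δx₁ is the increment associated with the static BECs B, the α_n are the extended control variables constrained by the localization correlation matrix A, the x_n^e are the N ensemble perturbations, ∘ denotes the Schur (elementwise) product, and β₁ and β₂ weight the static and ensemble contributions. At the start of minimization δx₁ = 0 and α = 0, so Jb = Je = 0 and J = Jo, consistent with the curves in Fig. 14.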
As expected, more observations were assimilated in successive OLs of each analysis as fewer observations failed the corresponding outlier checks. Simply assimilating more observations increases Jo. Nonetheless, despite the increased number of observations, Jo was reduced slightly in the hybrid and substantially in 3DVAR during the second and third OLs, indicating the observations were, overall, fit more closely (which decreases Jo) in later OLs. However, the hybrid verification statistics at 6 h (Fig. 9) suggest little value was gained in the hybrid by fitting the observations more closely in the second and third OLs.
The Jo values shown in Fig. 14 incorporated all assimilated observations. However, Jo can be decomposed into contributions from each observing platform or observation type, and these individual contributions were determined each analysis. Differences of Jo between the end of the first and third OLs (first OL minus third OL; denoted ΔJo) and differences of the number of rejected observations in the first and third OLs (first OL minus third OL) are shown in Fig. 15 for 0000 and 1200 UTC 3DVAR-3OL and HY-3OL analyses between 9 and 28 September for selected observation types. Positive values in Figs. 15a–d indicate more observations were rejected in the first OL while ΔJo > 0 (Figs. 15e–h) means Jo was smaller at the end of the third OL. Note that Fig. 10 shows the analysis fits for the different cycling experiments while Fig. 15 depicts differences between the first and third OLs for the two experiments that employed three OLs.

(a)–(d) Differences of the number of rejected observations in the first and third OLs (first OL minus third OL) and (e)–(h) differences of Jo between the end of the first and third OLs (first OL minus third OL; denoted ΔJo) each 0000 and 1200 UTC 3DVAR-3OL and HY-3OL analysis between 9 and 28 Sep for (a),(e) radiosonde temperature; (b),(f) radiosonde specific humidity; (c),(g) radiosonde horizontal wind; and (d),(h) satellite-tracked horizontal wind observations.
Both 3DVAR-3OL and HY-3OL rejected fewer observations in the third OL compared to the first OL, and the rejection rates were similar between the two experiments (Figs. 15a–d). Thus, differences in ΔJo between HY-3OL and 3DVAR-3OL can be primarily attributed to how the assimilated observations were used, rather than differences in the number of observations that were assimilated.
Radiosonde temperature and moisture Jo values (Figs. 15e,f) were smaller at the end of the first OL than at the end of the third OL for both 3DVAR-3OL and HY-3OL (ΔJo < 0). This behavior reflects the increased number of assimilated observations in the third OL and the BEC tuning factors (section 2b), which increased the emphasis on the guess in later OLs. Similar patterns prevailed for aircraft and surface temperature observations (not shown). However, the magnitudes of ΔJo for radiosonde temperature and moisture observations were small compared to those for wind observations from soundings and satellite-tracked winds (Figs. 15g,h), which were clearly fit more closely with three OLs, especially in 3DVAR. Similar patterns were also noted for wind observations from other observing platforms (not shown). Because far more wind observations were assimilated than mass measurements, the marked reduction of the 3DVAR total Jo (Fig. 14a) in the second and third OLs was due to progressively closer fits to wind observations. Likewise, the total HY-3OL Jo (Fig. 14b) was smaller than the total 3DVAR-3OL Jo at the end of the first OL because the hybrid fit wind observations more closely in the first OL.
b. Auxiliary experiments
Several additional experiments configured similarly to those described in section 3 were performed to more cleanly interpret the impacts of using multiple OLs. First, an additional 3DVAR experiment with one OL was performed, but the background error standard deviations were inflated by a factor of 3, rather than the factor of 1.5 used in the original 3DVAR-1OL experiment (Table 1). As expected, increasing the background error standard deviations led to closer analysis fits to observations, but the wind observations were still not fit as closely as in the 3DVAR-3OL experiment, and TC track forecasts remained substantially worse compared to 3DVAR-3OL (not shown). These findings suggest the poor 3DVAR-1OL TC track forecasts may be related to its underutilization of wind observations. Since there was room for substantial additional adjustment toward wind observations in subsequent 3DVAR OLs, better use of wind observations in 3DVAR-3OL may have contributed to the improved TC track forecasts.
Furthermore, additional three OL hybrid and 3DVAR experiments were performed that were configured identically to the corresponding three OL experiments described in section 3, except that the multiplicative static background error standard deviation and length scale factors used in the first OL were applied in all OLs. Recall that the three OL experiments introduced in section 3 employed static BECs with varying standard deviations and length scales for each OL (Table 1) to place more emphasis on the guess and restrict correlation scales in successive OLs.
Regarding mean TC track errors, the newly introduced 3DVAR experiment with constant BECs each OL (3DVAR-3OL_FIXED_BECs) performed considerably worse than the original 3DVAR-3OL experiment (Fig. 16), with errors closer to those from 3DVAR-1OL (cf. Fig. 8). While 3DVAR-3OL_FIXED_BECs produced tighter analysis fits to mass observations compared to when the BECs varied each OL, wind observations were fit less closely than in the original 3DVAR-3OL experiment (Fig. 17). Because of the complexities of variational minimization, it is difficult to explain why the wind observations were not fit more closely in 3DVAR-3OL_FIXED_BECs than in 3DVAR-3OL, but this behavior may be related to the longer length scales in 3DVAR-3OL_FIXED_BECs. Nonetheless, this finding indicates that tuning the background error standard deviations and length scales each OL is important in 3DVAR and further suggests proper use of wind observations was crucial to successful TC forecasts over this period.

As in Fig. 8, but for comparisons of three OL hybrid and 3DVAR experiments with and without fixed BECs each OL.

RMSE (solid lines) and bias (dashed lines) for verification vs (a) radiosonde temperature (K), (b) radiosonde horizontal wind (m s−1), and (c) radiosonde specific humidity (g kg−1) observations aggregated over all analyses between 1800 UTC 8 Sep and 0000 UTC 28 Sep for three OL hybrid and 3DVAR experiments with and without fixed BECs each OL. The sample size at each level is listed to the right of each panel.
Similarly, the newly introduced three OL hybrid experiment with constant BECs each OL (HY-3OL_FIXED_BECs) did not fit wind observations as closely as the original HY-3OL experiment, but the difference was not as large as that between the three OL 3DVAR configurations (Fig. 17). Regarding mean TC track errors, HY-3OL_FIXED_BECs performed comparably to the original HY-3OL experiment before ~48 h but slightly worse thereafter (Fig. 16). Again, the disparities were not as large as those between the three OL 3DVAR experiments. It seems logical that fixing the static background error standard deviations and length scales each OL yielded smaller differences in the hybrid than in 3DVAR, since only 25% of the hybrid's total BECs came from the static component.
Overall, results suggest the hybrid was less sensitive to multiple OL configurations than 3DVAR. Currently in WRFDA, the localization length scales for the hybrid component of the BECs are constant across multiple OLs. Adjustment of the localization cutoffs for each OL, similar to the adjustment of the static BEC length scales and standard deviations, may lead to a greater impact of using multiple OLs in the hybrid.
6. Impact of TC relocation
TC relocation schemes have been shown to benefit WRF forecasts of TCs when used in conjunction with a cycling 3DVAR system (Hsiao et al. 2010). However, it is less clear whether relocation within a hybrid framework is wise or necessary, given the hybrid's dependence on the ensemble.
To assess the impact of relocation, an additional set of four experiments configured identically to those described in section 3 were performed, except TC relocation was applied immediately before each hybrid or 3DVAR analysis. The relocation scheme was described in detail by Hsiao et al. (2010). Briefly, the scheme removes the modeled TC vortex from the environmental flow and moves the model TC to the observed TC position. However, the scheme does not perform relocation under several scenarios, including if high terrain is near the TC center (Hsiao et al. 2010).
TC relocation clearly improved the 3DVAR TC forecasts (Fig. 18a). The impact was largest when applied to 3DVAR-1OL, which had very large initial track errors without relocation. However, even though 3DVAR-3OL had small initial track errors without relocation, its forecasts after ~12 h were still improved by relocation. Conversely, the hybrid did not benefit from TC relocation on average (Fig. 18b); if anything, a slight degradation occurred with relocation. When the mean track errors of the four relocation experiments were compared, 3DVAR-3OL performed best after ~36 h (Fig. 18c). Even with relocation, HY-1OL still outperformed 3DVAR-1OL.

Mean 0–72-h absolute track errors (km) averaged over the three TCs for the (a) 3DVAR experiments with and without TC relocation, (b) hybrid experiments with and without TC relocation, and (c) 3DVAR and hybrid experiments that all employed relocation. Relocation was performed in those experiments whose legend labels end in RELO.
The causes of this behavior in the hybrid are unclear. No TC relocation was performed on the prior ensemble, so it is possible that relocating TCs in the hybrid immediately before DA created imbalances between the deterministic background and the prior ensemble in the vicinity of the storms. However, there was no evidence that TC structure was degraded in the hybrid analyses after relocation. Moreover, an additional hybrid experiment was performed with TC relocation where the WRF ensemble was recentered about the hybrid analyses (see section 3 and Zhang et al. 2013). The combination of relocation and recentering effectively performed TC relocation on the ensemble. However, the mean hybrid TC track errors were nearly unchanged with this coupled hybrid–ensemble system that employed relocation (not shown). More work is needed to better understand TC relocation in ensemble-based DA systems.
7. Summary and conclusions
This study evaluated TC forecasts initialized with limited-area hybrid and 3DVAR DA systems. Three typhoons in September 2008 were examined. Parallel experiments employing WRFDA's 3DVAR and hybrid algorithms cycled continuously for ~3.5 weeks and produced new analyses every 6 h that initialized 72-h WRF forecasts. An EAKF was used to update a 32-member ensemble system that provided flow-dependent BECs for the hybrid. The experiments were configured to cleanly assess the differences between cyclic 3DVAR- and hybrid-initialized forecasts and explore the impact of applying multiple OLs and TC relocation.
Our findings support the following conclusions:
When just one OL was employed during variational minimization, the hybrid DA system unequivocally produced better TC track forecasts than a similarly configured 3DVAR system. These findings echo W11. Moreover, the general synoptic conditions and the mesoscale environment surrounding the TCs were more accurately depicted in the one OL hybrid system than in the one OL 3DVAR experiment.
Using multiple OLs greatly benefitted 3DVAR, but, on average, the hybrid was less sensitive to the number of OLs. In fact, even without TC relocation, 3DVAR configured with three OLs produced TC track forecasts comparable to those of the hybrid experiments. The dramatic improvement of TC track forecasts due to multiple OLs in 3DVAR was also noted by Hsiao et al. (2012), who studied this same period.
TC relocation improved TC track forecasts in the 3DVAR systems, corroborating Hsiao et al. (2010). However, the hybrid-initialized TC track forecasts were not improved by TC relocation.
The results show that hybrid DA can initialize higher quality forecasts than 3DVAR systems. However, 3DVAR algorithms may yield performance similar to hybrid configurations by carefully tuning static BEC parameters (e.g., standard deviation and length scale) and employing multiple OLs. It is likely that these tunings will vary geographically and with model resolution, and in some regions, 3DVAR tunings and multiple OLs may not significantly improve forecasts. We suggest additional testing of multiple OLs in 3DVAR frameworks for different weather regimes and domains to assess the general importance of multiple OLs in 3DVAR.
Moreover, using multiple OLs increases the computational cost of 3DVAR. There seems little need to run the hybrid with multiple OLs, which saves computational expense. While hybrid analyses are more expensive than 3DVAR because of the additional control variables, over this domain, the cost of producing a 3DVAR analysis with three OLs was similar to that of generating a one OL hybrid analysis. The primary additional expense of the hybrid is advancing the ensemble of forecasts between analyses and performing the EnKF analysis. Note, however, that ensemble forecasts can be run in parallel and the hybrid and EnKF analyses can be run simultaneously, even in systems where the ensemble and variational components are coupled (e.g., Zhang et al. 2013). Furthermore, the ensemble can be used to produce probabilistic forecasts.
The NCEP GFS has been initialized with hybrid-3DVAR DA since May 2012, and a hybrid-4DVAR system initializes the Met Office global model (Clayton et al. 2012). However, limited-area hybrid systems have not yet been implemented operationally, and regional hybrid DA warrants further development. Future work will likely address the impact of assimilating satellite radiances within limited-area hybrid configurations. Both Schwartz et al. (2012) and Liu et al. (2012) illustrated that assimilating microwave radiances with a limited-area EAKF benefitted TC forecasts. The hybrid may better use radiance observations than EnKFs with observation-space localization due to hybrid covariance localization in model space (Campbell et al. 2010). Additionally, the hybrid should be tested and evaluated using higher-resolution configurations. Finally, the impact of using a larger ensemble in the hybrid should be explored, which may necessitate retuning localization parameters and the relative contributions of the static and ensemble BECs.
Acknowledgments
The Taiwan Central Weather Bureau (CWB) partially funded this work. Der-Song Chen and Ling-Feng Hsiao (CWB) provided the observational data. Hui Liu (NCAR) helped with the DART configurations. Comments from three anonymous reviewers improved this paper.
REFERENCES
Ancell, B. C., 2012: Examination of analysis and forecast errors of high-resolution assimilation, bias removal, and digital filter initialization with an ensemble Kalman filter. Mon. Wea. Rev., 140, 3992–4004.
Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. Mon. Wea. Rev., 129, 2884–2903.
Anderson, J. L., 2003: A local least squares framework for ensemble filtering. Mon. Wea. Rev., 131, 634–642.
Anderson, J. L., 2009: Spatially and temporally varying adaptive covariance inflation for ensemble filters. Tellus, 61A, 72–83.
Anderson, J. L., T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Arellano, 2009: The Data Assimilation Research Testbed: A community facility. Bull. Amer. Meteor. Soc., 90, 1283–1296.
Barker, D. M., W. Huang, Y.-R. Guo, A. J. Bourgeois, and Q. N. Xiao, 2004: A three-dimensional variational data assimilation system for MM5: Implementation and initial results. Mon. Wea. Rev., 132, 897–914.
Barker, D. M., and Coauthors, 2012: The Weather Research and Forecasting Model's Community Variational/Ensemble Data Assimilation System: WRFDA. Bull. Amer. Meteor. Soc., 93, 831–843.
Berner, J., G. J. Shutts, M. Leutbecher, and T. N. Palmer, 2009: A spectral stochastic kinetic energy backscatter scheme and its impact on flow-dependent predictability in the ECMWF ensemble prediction system. J. Atmos. Sci., 66, 603–626.
Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. Mon. Wea. Rev., 129, 420–436.
Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background error covariances: Evaluation in a quasi-operational NWP setting. Quart. J. Roy. Meteor. Soc., 131, 1013–1043.
Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010a: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part I: Description and single-observation experiments. Mon. Wea. Rev., 138, 1550–1566.
Buehner, M., P. L. Houtekamer, C. Charette, H. L. Mitchell, and B. He, 2010b: Intercomparison of variational data assimilation and the ensemble Kalman filter for global deterministic NWP. Part II: One-month experiments with real observations. Mon. Wea. Rev., 138, 1567–1586.
Burgers, G., P. J. van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Wea. Rev., 126, 1719–1724.
Campbell, W. F., C. H. Bishop, and D. Hodyss, 2010: Vertical covariance localization for satellite radiances in ensemble Kalman filters. Mon. Wea. Rev., 138, 282–290.
Cavallo, S. M., R. D. Torn, C. Snyder, C. Davis, W. Wang, and J. Done, 2013: Evaluation of the Advanced Hurricane WRF data assimilation system for the 2009 Atlantic hurricane season. Mon. Wea. Rev., 141, 523–541.
Chen, F., and J. Dudhia, 2001: Coupling an advanced land-surface/hydrology model with the Penn State/NCAR MM5 modeling system. Part I: Model description and implementation. Mon. Wea. Rev., 129, 569–585.
Chou, M.-D., and M. J. Suarez, 1994: An efficient thermal infrared radiation parameterization for use in general circulation models. NASA Tech. Memo. 104606, Vol. 3, 85 pp.
Clayton, A. M., A. C. Lorenc, and D. M. Barker, 2012: Operational implementation of a hybrid ensemble/4D-Var global data assimilation system at the Met Office. Quart. J. Roy. Meteor. Soc., 139, 1445–1461, doi:10.1002/qj.2054.
Courtier, P., J.-N. Thépaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1387.
Elsberry, R. L., and P. A. Harr, 2008: Tropical cyclone structure (TCS08) field experiment science basis, observational platforms, and strategy. Asia-Pac. J. Atmos. Sci., 44 (3), 209–231.
Etherton, B. J., and C. H. Bishop, 2004: Resilience of hybrid ensemble/3DVAR analysis schemes to model error and ensemble covariance error. Mon. Wea. Rev., 132, 1065–1080.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99 (C5), 10 143–10 162.
Guo, Y.-R., H.-C. Lin, X. X. Ma, X.-Y. Huang, C. T. Terng, and Y.-H. Kuo, 2006: Impact of WRF-Var (3DVar) background error statistics on typhoon analysis and forecast. Extended Abstracts, WRF Users' Workshop, Boulder, CO, NCAR, P4.2. [Available online at http://www.mmm.ucar.edu/wrf/users/workshops/WS2006/abstracts/PSession04/P4_2_Guo.pdf.]
Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter–3D variational analysis scheme. Mon. Wea. Rev., 128, 2905–2919.
Hamill, T. M., J. S. Whitaker, M. Fiorino, and S. G. Benjamin, 2011a: Global ensemble predictions of 2009's tropical cyclones initialized with an ensemble Kalman filter. Mon. Wea. Rev., 139, 668–688.
Hamill, T. M., J. S. Whitaker, D. T. Kleist, M. Fiorino, and S. G. Benjamin, 2011b: Predictions of 2010's tropical cyclones using the GFS and ensemble-based data assimilation methods. Mon. Wea. Rev., 139, 3243–3247.
Hong, S.-Y., Y. Noh, and J. Dudhia, 2006: A new vertical diffusion package with an explicit treatment of entrainment processes. Mon. Wea. Rev., 134, 2318–2341.
Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. Mon. Wea. Rev., 126, 796–811.
Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. Mon. Wea. Rev., 133, 604–620.
Hsiao, L.-F., C. S. Liou, T. C. Yeh, Y. R. Guo, D. S. Chen, K. N. Huang, C. T. Terng, and J. H. Chen, 2010: A vortex relocation scheme for tropical cyclone initialization in Advanced Research WRF. Mon. Wea. Rev., 138, 3298–3315.
Hsiao, L.-F., D.-S. Chen, Y.-H. Kuo, Y.-R. Guo, T.-C. Yeh, J.-S. Hong, C.-T. Fong, and C.-S. Lee, 2012: Application of WRF 3DVAR to operational typhoon prediction in Taiwan: Impact of outer loop and partial cycling approaches. Wea. Forecasting, 27, 1249–1263.
Huang, X.-Y., and P. Lynch, 1993: Diabatic digital filter initialization: Application to the HIRLAM model. Mon. Wea. Rev., 121, 589–603.
Huang, X.-Y., and Coauthors, 2009: Four-dimensional variational data assimilation for WRF: Formulation and preliminary results. Mon. Wea. Rev., 137, 299–314.
Ide, K., P. Courtier, M. Ghil, and A. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75, 181–189.
Kain, J. S., 2004: The Kain–Fritsch convective parameterization: An update. J. Appl. Meteor., 43, 170–181.
Kain, J. S., and J. M. Fritsch, 1990: A one-dimensional entraining/detraining plume model and its application in convective parameterization. J. Atmos. Sci., 47, 2784–2802.
Kain, J. S., and J. M. Fritsch, 1993: Convective parameterization for mesoscale models: The Kain–Fritsch scheme. The Representation of Cumulus Convection in Numerical Models, Meteor. Monogr., No. 24, Amer. Meteor. Soc., 165–170.
Kurihara, Y., M. A. Bender, R. E. Tuleya, and R. J. Ross, 1995: Improvements in the GFDL hurricane prediction system. Mon. Wea. Rev., 123, 2791–2801.
Liu, Q., T. Marchok, H. Pan, M. Bender, and S. Lord, 2000: Improvements in hurricane initialization and forecasting at NCEP with global and regional (GFDL) models. NCEP/EMC Tech. Procedures Bull. 472, 7 pp. [Available online at http://205.156.64.206/om/tpb/472.htm.]
Liu, Z., C. S. Schwartz, C. Snyder, and S.-Y. Ha, 2012: Impact of assimilating AMSU-A radiances on forecasts of 2008 Atlantic tropical cyclones initialized with a limited-area ensemble Kalman filter. Mon. Wea. Rev., 140, 4017–4034.
Lorenc, A. C., 2003: The potential of the ensemble Kalman filter for NWP—A comparison with 4D-VAR. Quart. J. Roy. Meteor. Soc., 129, 3183–3203.
Lorenc, A. C., and Coauthors, 2000: The Met. Office global three-dimensional variational data assimilation scheme. Quart. J. Roy. Meteor. Soc., 126, 2991–3012.
Lynch, P., 1997: The Dolph–Chebyshev window: A simple optimal filter. Mon. Wea. Rev., 125, 655–660.
Lynch, P., and X.-Y. Huang, 1992: Initialization of the HIRLAM model using a digital filter. Mon. Wea. Rev., 120, 1019–1034.
Ma, L.-M., and Z.-M. Tan, 2009: Improving the behavior of the cumulus parameterization for tropical cyclone prediction: Convection trigger. Atmos. Res., 92 (2), 190–211.
Meng, Z., and F. Zhang, 2008a: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part III: Comparison with 3DVAR in a real-data case study. Mon. Wea. Rev., 136, 522–540.
Meng, Z., and F. Zhang, 2008b: Tests of an ensemble Kalman filter for mesoscale and regional-scale data assimilation. Part IV: Comparison with 3DVAR in a month-long experiment. Mon. Wea. Rev., 136, 3671–3682.
Miyoshi, T., Y. Sato, and T. Kadowaki, 2010: Ensemble Kalman filter and 4D-Var intercomparison with the Japanese Operational Global Analysis and Prediction System. Mon. Wea. Rev., 138, 2846–2866.
Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res., 102 (D14), 16 663–16 682.
Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center's spectral statistical interpolation analysis system. Mon. Wea. Rev., 120, 1747–1763.
Rainwater, S., and B. Hunt, 2013: Mixed resolution ensemble data assimilation. Mon. Wea. Rev., 141, 3007–3021.
Schwartz, C. S., Z. Liu, Y. Chen, and X.-Y. Huang, 2012: Impact of assimilating microwave radiances with a limited-area ensemble data assimilation system on forecasts of Typhoon Morakot. Wea. Forecasting, 27, 424–437.
Shutts, G., 2005: A kinetic energy backscatter algorithm for use in ensemble prediction systems. Quart. J. Roy. Meteor. Soc., 131, 3079–3102.
Skamarock, W. C., and Coauthors, 2008: A description of the Advanced Research WRF version 3. NCAR Tech. Note NCAR/TN-475+STR, 113 pp. [Available from UCAR Communications, P. O. Box 3000, Boulder, CO 80307.]
Tao, W.-K., and J. Simpson, 1993: The Goddard cumulus ensemble model. Part I: Model description. Terr. Atmos. Oceanic Sci., 4, 35–72.
Tao, W.-K., and Coauthors, 2003: Microphysics, radiation and surface processes in the Goddard Cumulus Ensemble (GCE) model. Meteor. Atmos. Phys., 82, 97–137.
Torn, R. D., 2010: Performance of a mesoscale ensemble Kalman filter (EnKF) during the NOAA high-resolution hurricane test. Mon. Wea. Rev., 138, 4375–4392.
Torn, R. D., and G. J. Hakim, 2009: Ensemble data assimilation applied to RAINEX observations of Hurricane Katrina (2005). Mon. Wea. Rev., 137, 2817–2829.
Torn, R. D., G. J. Hakim, and C. Snyder, 2006: Boundary conditions for limited-area ensemble Kalman filters. Mon. Wea. Rev., 134, 2490–2502.
Wang, J., and Coauthors, 2010: Water vapor variability and comparisons in the subtropical Pacific from The Observing System Research and Predictability Experiment-Pacific Asian Regional Campaign (T-PARC) driftsonde, Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC), and reanalyses. J. Geophys. Res., 115, D21108, doi:10.1029/2010JD014494.
Wang, X., 2011: Application of the WRF hybrid ETKF–3DVAR data assimilation system for hurricane track forecasts. Wea. Forecasting, 26, 868–884.
Wang, X., T. M. Hamill, J. S. Whitaker, and C. H. Bishop, 2007a: A comparison of hybrid ensemble transform Kalman filter–OI and ensemble square-root filter analysis schemes. Mon. Wea. Rev., 135, 1055–1076.
Wang, X., C. Snyder, and T. M. Hamill, 2007b: On the theoretical equivalence of differently proposed ensemble/3D–Var hybrid analysis schemes. Mon. Wea. Rev., 135, 222–227.
Wang, X., D. Barker, C. Snyder, and T. M. Hamill, 2008a: A hybrid ETKF–3DVAR data assimilation scheme for the WRF model. Part I: Observing system simulation experiment. Mon. Wea. Rev., 136, 5116–5131.
Wang, X., D. Barker, C. Snyder, and T. M. Hamill, 2008b: A hybrid ETKF–3DVAR data assimilation scheme for the WRF model. Part II: Real observation experiments. Mon. Wea. Rev., 136, 5132–5147.
Wang, X., T. M. Hamill, J. S. Whitaker, and C. H. Bishop, 2009: A comparison of the hybrid and EnSRF analysis schemes in the presence of model error due to unresolved scales. Mon. Wea. Rev., 137, 3219–3232.
Wang, X., D. F. Parrish, D. T. Kleist, and J. S. Whitaker, 2013: GSI 3DVAR-based ensemble-variational hybrid data assimilation for NCEP Global Forecast System: Single-resolution experiments. Mon. Wea. Rev., 141, 4098–4117.
Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. Mon. Wea. Rev., 130, 1913–1924.
Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2008: Ensemble data assimilation with the NCEP Global Forecast System. Mon. Wea. Rev., 136, 463–482.
Zhang, F., Y. Weng, J. A. Sippel, Z. Meng, and C. H. Bishop, 2009a: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 137, 2105–2125.
Zhang, F., M. Zhang, and J. A. Hansen, 2009b: Coupling ensemble Kalman filter with four-dimensional variational data assimilation. Adv. Atmos. Sci., 26, 1–8.
Zhang, F., M. Zhang, and J. Poterjoy, 2013: E3DVar: Coupling an ensemble Kalman filter with three-dimensional variational data assimilation in a limited-area weather prediction model and comparison to E4DVar. Mon. Wea. Rev., 141, 900–917.
Zhang, M., and F. Zhang, 2012: E4DVar: Coupling an ensemble Kalman filter with four-dimensional variational data assimilation in a limited-area weather prediction model. Mon. Wea. Rev., 140, 587–600.
Zhang, M., F. Zhang, X.-Y. Huang, and X. Zhang, 2011: Intercomparison of an ensemble Kalman filter with three- and four-dimensional variational data assimilation methods in a limited-area model over the month of June 2003. Mon. Wea. Rev., 139, 566–572.