## 1. Introduction

Since 1997, the National Oceanic and Atmospheric Administration (NOAA) has conducted operational synoptic surveillance missions with its Gulfstream-IV (G-IV) jet aircraft around storms expected to impact the United States (Aberson and Franklin 1999). During the first two years of such missions, the impact on operational dynamical track forecast models was less than 10% (Aberson 2002), and this was attributed to suboptimal sampling and data assimilation procedures. When targeting and sampling strategies were implemented in the same sample, improvements to the track guidance increased to 18% during the critical period in which hurricane watches and warnings are issued. These strategies involved identifying regions of likely error growth in the models by examining the National Centers for Environmental Prediction (NCEP) global ensemble forecasts for regions of high ensemble spread in the 850 to 200 hPa deep-layer-mean (DLM) winds; these entire regions (within and completely around variance maxima) are sampled with dropwindsondes at approximately the same resolution as the North American rawinsonde network. The DLM winds are used because, to a first approximation, tropical cyclones are steered by the DLM flow. Assimilating only this subset of data in each case led to larger forecast improvements than using all the data from the surveillance flights. As a result, surveillance missions have since been designed with this subjective targeting method.
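The targeting diagnostic described above, ensemble spread in the 850 to 200 hPa DLM winds, can be sketched as follows. This is a minimal illustration and not the operational NCEP procedure: the function names, the array layout, the trapezoidal pressure weighting, and the choice to sum the variances of the two wind components are all assumptions made here.

```python
import numpy as np

def deep_layer_mean(field, p_hpa):
    """Pressure-weighted vertical mean of `field` (levels on axis 0).

    Uses trapezoidal layers between the supplied pressure levels,
    e.g. levels spanning 850 to 200 hPa (an assumed discretization).
    """
    p = np.asarray(p_hpa, dtype=float)
    f = np.asarray(field, dtype=float)
    dp = np.abs(np.diff(p))                # layer thicknesses (hPa)
    layer_avg = 0.5 * (f[1:] + f[:-1])     # mean value within each layer
    return np.tensordot(dp, layer_avg, axes=(0, 0)) / dp.sum()

def dlm_ensemble_variance(u, v, p_hpa):
    """Ensemble variance of the DLM wind.

    u, v: arrays shaped (member, level, lat, lon).
    Returns var(u_dlm) + var(v_dlm) at each grid point (assumed metric).
    """
    u_dlm = np.stack([deep_layer_mean(m, p_hpa) for m in u])
    v_dlm = np.stack([deep_layer_mean(m, p_hpa) for m in v])
    return u_dlm.var(axis=0, ddof=1) + v_dlm.var(axis=0, ddof=1)
```

Maxima of the returned field would mark candidate regions for sampling under these assumptions.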

During 2001, NOAA and the National Aeronautics and Space Administration (NASA) conducted the joint Hurricanes at Landfall (HAL) and Fourth Convection and Moisture Experiment (CAMEX-4). The most comprehensive datasets were collected during two consecutive days during Hurricane Humberto. During these two sets of missions centered around nominal times 0000 UTC 24 and 25 September, operational synoptic surveillance missions were conducted with the G-IV. The impact of these missions on the operational model guidance is presented in section 2. Section 3 provides descriptions and results of conventional and ensemble-based data assimilation techniques on forecasts from a barotropic hurricane track forecast model. Implications and conclusions are presented in section 4.

## 2. Operational surveillance results

### a. Techniques

To assess the impact of the surveillance missions on operational numerical forecasts, the version of the NCEP Global Data Assimilation System (GDAS) operational at the time of the missions was used. The GDAS is composed of a quality control algorithm, a tropical cyclone relocation procedure, an analysis procedure, and the Global Forecast System (GFS, previously the AVN). The quality control scheme evaluates the observations by optimal interpolation and hierarchical decision-making before they are input to the analysis (Woollen 1991). The vortex relocation scheme (Qingfu Liu 1999, personal communication) removes the symmetric cyclonic vortex from the initial condition, and relocates it using the operationally estimated position. The analysis scheme is the spectral statistical interpolation (Parrish and Derber 1992); the background field (the previous 6-h forecast) is combined with observations in a three-dimensional variational data assimilation (3DVAR) multivariate formalism. The GDAS forecast model has a horizontal resolution of spectral triangular 170 (T170) with 42 unequally spaced sigma levels on a Lorenz grid (Caplan et al. 1997; Surgi et al. 1998).

The impact of the surveillance missions is quantified in two operational dynamical models using the GDAS output for initial conditions: the Geophysical Fluid Dynamics Laboratory (GFDL; Kurihara et al. 1998) and the AVN (Caplan et al. 1997) models. The versions of these models operational in September 2001 are used in the present study. The GDAS analysis is the direct input to AVN, whereas GFDL modifies the near-storm analysis with its own vortex specification scheme. GFDL uses AVN forecast fields as boundary conditions throughout the forecast duration and does not directly ingest the dropwindsonde data. In addition to the operational AVN and GFDL model forecasts, a parallel 6-hourly data assimilation cycle was run that was identical to the operational cycle except that dropwindsonde data were not assimilated. Because of this, dropwindsonde data from previous synoptic surveillance missions in Hurricane Gabrielle are included in the operational data assimilation cycle, but not in the parallel cycle.

### b. Results

#### 1) Nominal times

Absolute track forecast errors are defined as the great-circle distance between the forecast location and the concurrent postprocessed best-track position determined by the National Hurricane Center (NHC) after all available observations are investigated poststorm. Figure 1 shows the forecast tracks and verifying best tracks and the average track forecast errors of the all-dropwindsonde and no-dropwindsonde samples for the two nominal surveillance times (0000 UTC 24 and 25 September 2001), and the percentage improvement of the all- over the no-dropwindsonde runs. The comparisons within each model are homogeneous, but the intermodel comparisons are not since the storm dissipated in the AVN before 60 h in the second model integration. Therefore, the GFDL results are for both cases, whereas the AVN results from 60 h onward are from the first case only.
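The two quantities defined above, great-circle track error and percentage improvement, can be sketched as follows. The haversine formula on a spherical Earth and the mean radius of 6371 km are assumptions for illustration; the function names are hypothetical, and the source does not state which great-circle formulation was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def percent_improvement(err_with, err_without):
    """Improvement of the all-dropwindsonde run over the no-dropwindsonde run.

    Positive values mean the dropwindsonde data reduced the track error.
    """
    return 100.0 * (err_without - err_with) / err_without
```

Because the no-dropwindsonde error is the denominator, a small baseline error turns a modest absolute change into a large percentage: under this definition, an error of 30 km against a baseline of 20 km is a 50% degradation.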

The AVN forecasts were improved by between 40% and 80% from 24 to 48 h, and the GFDL forecasts were improved by up to 30% during the same forecast times. Large degradations to the AVN forecasts are seen at 12, 72, and 84 h, but the forecast errors for the no-dropwindsonde runs were exceptionally small at these forecast times, making improvements difficult to obtain. The change in sign of the improvements is due to the forecast tracks crossing each other and the verifying track. The different results from the two models are likely due to the impact of the GFDL synthetic vortex. The GFDL system removes the cyclonic circulation near the tropical cyclone (TC) center from the AVN initial condition and replaces it with a vortex spun up by an axisymmetric version of the model. As a result, some of the near-storm dropwindsonde data are removed from the GFDL initial condition. The improvements in this very small sample are much larger than those seen in larger samples (Aberson 2002, 2003) and in recent seasons.

#### 2) All forecast times

The operational global model is initialized every 6 h. Though the surveillance missions were conducted only once per day, the impact of the additional dropwindsonde data may be seen at other initial times. Since surveillance flights are generally 8 h long, the early dropwindsonde data are assimilated into the previous day’s 1800 UTC model run. Figure 2 shows the average improvements to the track forecasts at each of the four initial times at all synoptic times from the time of the assimilation of the first dropwindsonde data (1800 UTC 23 September) to 12 h after the assimilation of the last dropwindsonde data (1200 UTC 25 September). As in the above sample, each model comparison is homogeneous, but the intermodel comparisons are not.

For both models, the best results are at the nominal mission time when all the observations are assimilated. All other initial times show degradations for both models except the 0600 UTC GFDL runs. The percentage degradations are very large in the AVN because all the forecasts without the dropwindsonde data were exceptionally good, so even small absolute degradations correspond to large percentage degradations. These results suggest that the very reliable dropwindsonde data are able to improve model forecasts at the time of their assimilation, but that the memory of the model is short, and that these data may not improve, and in fact may degrade, forecasts later in the cycle. This degradation may occur because the first guess, produced by a model, will necessarily have errors, and no new data may be available to correct these errors in sensitive regions. This general result was confirmed with a larger sample during the 2001 hurricane season and led to the implementation of twice-daily synoptic surveillance missions during the 2002 hurricane season.

### c. Model error growth

The ambiguous results at nonmission nominal times may be the result of unintended consequences of model error growth. Figure 3 shows the differences in the model initial condition DLM winds solely due to the assimilation of the dropwindsonde data. On the first day, signal maxima that are not collocated with dropwindsonde observations are seen to the east and north of the center of Humberto between the NOAA P-3 and G-IV flight patterns, and over the southwestern Caribbean Sea. On the second day, however, more signal maxima are seen, extending from south of Nova Scotia to the tropical eastern Pacific Ocean. These are likely due to a combination of signals left over from the assimilation of dropwindsonde data at earlier times (such as missions around Hurricane Gabrielle), and the way the data assimilation technique interpolates between observation locations.

Figure 4 shows the 24-h forecast of the variance of the DLM winds in the NCEP ensemble forecasting system. The largest signals not collocated with dropwindsonde observations are collocated with maxima in the ensemble variance, such as the region south of Nova Scotia, the southwestern Caribbean Sea, the central Gulf of Mexico, and the tropical eastern Pacific. The size of the variance of the NCEP ensemble represents the size of bred vectors that are closely related to the Lyapunov vectors, the fastest growing modes of the system (Toth and Kalnay 1993; Lorenz and Emanuel 1998). This result confirms that increments (the differences between the initial conditions with and without the assimilation of the data) that are both very small and very remote from observations can, but may not always, grow in regions of large ensemble spread. Such remote increments are likely the result of the isotropic spread of information from the dropwindsondes, which decays away from the observation location. Since these features may be propagated through the 6-h data assimilation cycle, they may lead to negative forecast impacts at other forecast times or in other regions.

### d. Global impacts

Szunyogh et al. (2002) showed that downstream baroclinic development can play a major role in propagating the influence of data around the globe. Figure 5 shows that the main region of impact remains with Humberto through 24 h but becomes less coherently situated around the tropical cyclone by 36 h, the time that Humberto began weakening rapidly over the cold waters of the North Atlantic Ocean. By 48 h, forecast differences due to the dropwindsonde data are seen throughout the western North Atlantic Ocean, with no strong maximum where the weakening Humberto was located.

The surveillance missions are part of a program to place observation platforms where they will hopefully allow for the largest positive impact on a particular, usually high-impact, forecast. These so-called targeted observations also have been studied in a number of field experiments and programs in the wintertime extratropics [Fronts and Atlantic Storm-Track Experiment (FASTEX), North Pacific Experiment (NORPEX), Winter Storm Reconnaissance Program (WSR)], and are a focus of the upcoming The Observing System Research and Predictability Experiment (THORPEX). However, given the rapid spread of the data impact around the globe (Fig. 6), one must question whether targeting observations to improve specific local forecasts beyond three days in the future is a possibility given current models and data assimilation techniques.

## 3. Assimilation of DLM wind data into a barotropic model

Bishop et al. (2001) showed that the method used to assimilate just one observation in the environment of an idealized tropical cyclone in a simple model can have a profound impact on the initial conditions and subsequent track forecast of the cyclone. The following describes a first attempt to apply the techniques described in that manuscript to the assimilation of data for forecasting a real tropical cyclone.

### a. Theory

Data assimilation is the process of blending observational data with a model forecast, a “first guess” field, to produce an analysis that is more accurate than either data source individually. Given that both observations and model fields are imperfect, data assimilation schemes require error statistics for both. These error statistics are represented by an observation error covariance matrix for observational data, and a prediction error covariance matrix for the first guess field. These matrices describe how errors of one variable at one location correlate with errors of other variables at other locations. The observational error covariance matrix is often diagonal, as observation errors (such as instrument error) associated with one observation are assumed to be uncorrelated with observation errors of another observation. The prediction error covariance matrix contains information about how errors of one variable at one model grid point of a model first guess correlate to errors of all other variables at all other grid points within the model domain.

The analysis is obtained from

$$
\mathbf{x}^{a} = \mathbf{x}^{f} + \mathsf{P}^{f}\mathsf{H}^{T}\left(\mathsf{H}\mathsf{P}^{f}\mathsf{H}^{T} + \mathsf{R}\right)^{-1}\left(\mathbf{y} - \mathsf{H}\mathbf{x}^{f}\right), \tag{1}
$$

where **y** are the observations, **x**^{f} is the first guess field, **x**^{a} is the new analysis, 𝗛 is the observation operator translating variables from model to observation space, 𝗥 is the observation error covariance matrix providing the error statistics of the observational data, and 𝗣^{f} is the prediction error covariance matrix. The innovation vector, (**y** − 𝗛**x**^{f}), lists the differences between the observation and the first guess field values at the observation locations. Equation (1) is solved so that the analysis error variance is minimized (Cohn 1997).

Consider the case in which Eq. (1) is used to assimilate one observation. In this case, the vectors **y**, **x**^{a}, and **x**^{f} have only one element each. The matrices 𝗥 and 𝗣^{f} have only one element as well, each consisting of the error variance associated with the observation or forecast, respectively. Consider this special single-value case in which observations are always perfect. In this case, the error variance of the observations is zero, and, therefore, the ratio 𝗣^{f}𝗛^{T} (𝗛𝗣^{f}𝗛^{T} + 𝗥)^{−1} is one, as the value of 𝗥 is zero. Thus, the analysis value **x**^{a} is exactly the observed value **y**. If instead the first guess is perfect, then this ratio would be zero, and the analysis would equal the first guess.
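The single-observation limits just described can be checked numerically. The sketch below applies the gain 𝗣^{f}𝗛^{T}(𝗛𝗣^{f}𝗛^{T} + 𝗥)^{−1} to the innovation; the function name and the sample values are illustrative assumptions and are not taken from the Humberto cases.

```python
import numpy as np

def analysis_update(x_f, y, H, P_f, R):
    """Return x_a = x_f + P_f H^T (H P_f H^T + R)^{-1} (y - H x_f)."""
    x_f = np.atleast_1d(np.asarray(x_f, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    H = np.atleast_2d(np.asarray(H, dtype=float))
    P_f = np.atleast_2d(np.asarray(P_f, dtype=float))
    R = np.atleast_2d(np.asarray(R, dtype=float))
    innovation = y - H @ x_f
    S = H @ P_f @ H.T + R                  # innovation covariance
    # Solve the linear system instead of forming the inverse explicitly.
    return x_f + P_f @ H.T @ np.linalg.solve(S, innovation)

# Scalar case: first guess 10 m/s, observation 12 m/s (illustrative values).
perfect_obs = analysis_update(10.0, 12.0, 1.0, P_f=4.0, R=0.0)    # ratio is one
perfect_guess = analysis_update(10.0, 12.0, 1.0, P_f=0.0, R=0.25)  # ratio is zero
```

With a perfect observation the analysis equals the observation, and with a perfect first guess it equals the first guess, as stated above. Intermediate error variances give an analysis between the two.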

One approach to specifying 𝗣^{f}, commonly used in 3DVAR schemes at operational centers, is to produce a parameterized, time-invariant matrix. One option is to have the impact of each observation decay isotropically away from the observation location. An example of such a time-invariant, spatially isotropic covariance matrix is given in Daley [1991, their Eq. (4.3.21)], referred to here as Eq. (2), in which *r* is the distance between the analysis and observation locations, and *D* is a correlation length scale at which the impact of the observation is 0.1 of the impact at the observation location. This equation has been used in Bishop et al. (2001) and Etherton and Bishop (2004).

An alternative is to use flow-dependent error statistics from an ensemble of forecasts, as in the ensemble transform Kalman filter (ETKF; Bishop et al. 2001). In this case, 𝗣^{f} is represented as the outer product of a matrix 𝗭, where each column of 𝗭 is a perturbation from the ensemble mean,

$$
\mathsf{P}^{f} = \frac{\mathsf{Z}\mathsf{Z}^{T}}{K - 1}, \tag{3}
$$

where *K* is the number of ensemble members. Another alternative is a hybrid analysis scheme that approximates the forecast error covariance matrix 𝗣^{f} with a mix of parameterized covariances, 𝗕^{f}, such as given in Eq. (2), and flow-dependent, ensemble-based covariances, 𝗙^{f}, such as that in Eq. (3). The forecast error covariance matrix 𝗣^{f} from Hamill and Snyder (2000) is given by

$$
\mathsf{P}^{f} = \alpha\,\mathsf{B}^{f} + (1 - \alpha)\,\mathsf{F}^{f}, \tag{4}
$$

where *α* is a constant between 0 and 1. If *α* = 1, then 𝗣^{f} = 𝗕^{f}, a 3DVAR correlation matrix. If *α* = 0, then 𝗣^{f} = 𝗙^{f}, an ETKF correlation matrix. If *α* is between zero and one, 𝗣^{f} is a mixture of flow-dependent (ensemble based) and time-invariant (3DVAR) error statistics.
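The three covariance choices described above can be sketched on a one-dimensional grid. Several pieces here are assumptions rather than the formulation used in the study: the Gaussian correlation shape (chosen only so that the correlation equals 0.1 at distance *D*, and not claimed to be the Daley expression), the *K* − 1 normalization of the ensemble covariance, and all function names.

```python
import numpy as np

def isotropic_covariance(coords, D, variance):
    """Time-invariant covariance that decays isotropically with distance.

    Assumed Gaussian shape, scaled so the correlation is 0.1 at r = D.
    """
    coords = np.asarray(coords, dtype=float)
    r = np.abs(coords[:, None] - coords[None, :])
    return variance * np.exp(-np.log(10.0) * (r / D) ** 2)

def ensemble_covariance(members):
    """Flow-dependent covariance from an (n_grid, K) array of members."""
    K = members.shape[1]
    Z = members - members.mean(axis=1, keepdims=True)  # perturbations from the mean
    return Z @ Z.T / (K - 1)                           # assumed normalization

def hybrid_covariance(B, F, alpha):
    """alpha = 1 returns B (3DVAR-like); alpha = 0 returns F (ensemble-based)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * B + (1.0 - alpha) * F
```

With a small ensemble, `ensemble_covariance` can contain sizeable entries between distant grid points; blending with the isotropic matrix damps such entries without setting them explicitly to zero.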

### b. Experimental procedure

The focus of this experiment is to use DLM dropwindsonde data from the NOAA G-IV aircraft surveillance flights in Hurricane Humberto to initialize the barotropic model VICBAR (Aberson and DeMaria 1994) and to test the effectiveness of various data assimilation schemes. A 41-member VICBAR ensemble is used to generate the error statistics in the prediction error covariance matrix (appendix).

The first guess field **x**^{f} [Eq. (1)] is the deep-layer-mean wind field from the VICBAR ensemble mean forecast at the nominal observation time. The field has a resolution of 1° latitude–longitude within a domain spanning from 5° to 45°N, 105° to 45°W. The value of D in Eq. (2) is set such that the distance at which correlations decay to 0.1 is 5° in all directions.

To produce an increment to the first guess field, covariance matrices 𝗣^{f} and 𝗥 are computed. Since 0.5 m s^{−1} is the expected error of a dropwindsonde wind observation (Hock and Franklin 1999), the diagonal elements of 𝗥 are set equal to (0.5)^{2}. The error of the first guess field is assumed to be 2 m s^{−1} at every point; thus the diagonal elements of 𝗣^{f} for the 3DVAR technique are set to 4. The traces of the covariance matrices 𝗕^{f} and 𝗙^{f} are rescaled in a fashion consistent with Dee (1995), such that when mapped onto observation sites (𝗛𝗕𝗛^{T} and 𝗛𝗙𝗛^{T}), their traces are equal to the most likely magnitude of these matrices, the square of the innovation vector (**y** − 𝗛**x**^{f}).
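The trace rescaling described above can be sketched as follows. The reading adopted here, that "the square of the innovation vector" is the inner product of the innovation with itself, is an interpretation and may differ from the procedure actually used; the function name and the sample numbers are likewise hypothetical.

```python
import numpy as np

def rescale_to_innovation(P, H, innovation):
    """Scale P so that trace(H P H^T) equals innovation . innovation."""
    innovation = np.asarray(innovation, dtype=float)
    mapped = H @ P @ H.T                    # covariance mapped onto observation sites
    factor = float(innovation @ innovation) / np.trace(mapped)
    return factor * P

# Illustrative setup: 5 grid points, observations at grid points 0 and 3.
H = np.zeros((2, 5))
H[0, 0] = 1.0
H[1, 3] = 1.0
B = 4.0 * np.eye(5)          # (2 m/s)^2 first guess error variance
R = 0.5 ** 2 * np.eye(2)     # (0.5 m/s)^2 dropwindsonde error variance
d = np.array([3.0, -1.0])    # made-up innovation values (m/s)
B_scaled = rescale_to_innovation(B, H, d)
```

Applying the same factor to every element keeps the correlation structure of the matrix and changes only its overall magnitude.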

Increments to the ensemble mean are calculated for *α* values of 0.0, 0.5, and 1.0, which are ETKF, hybrid, and 3DVAR increments, respectively. Adding each of these increments to the ensemble mean forecast produces three distinct initial condition fields for the VICBAR model. These fields are truncated to a 2.5° resolution for use by the VICBAR model.

### c. Results

The assimilation of the dropwindsonde data at nominal time 0000 UTC 24 September illustrates the differences between ensemble-based and conventional data assimilation. Figure 7a shows the ensemble mean 12-h forecast VICBAR DLM wind field valid at the nominal time. This wind field serves as the first guess state of the atmosphere for the data assimilation schemes. Figure 7b shows the vector difference between the DLM dropwindsonde observations and the first guess field at the observation sites. The largest differences (innovations) are to the north and east of Humberto. Figure 8 shows that the largest increments to the first guess are also in this region. However, the use of different error statistics results in very different increments.

A circulation center in the subtropical ridge is evident in the first guess field (Fig. 7a) near 32°N, 59°W. The 3DVAR data assimilation does not significantly modify this feature since it is far from observation locations (Figs. 8a, 9a). The ensemble-based scheme weakens this feature and moves it eastward (Figs. 8b, 9b). Other important differences are seen in the anticyclonic–cyclonic pair to the southwest of Humberto, and over Florida. The more accurate representation of these features in initial conditions generated using ensemble based error statistics resulted in an improved VICBAR forecast over one initialized by using more conventional data assimilation (Fig. 10). Results of the second case are similar.

One potential problem with ensemble-based data assimilation is that the relatively small number of ensemble members may lead to spurious long-distance error correlations. In this particular case, a shortwave in the flow well to the northeast of Humberto and the observations may be one such potentially spurious feature. At relatively long forecast times, the potentially spurious features may negatively impact the subsequent forecasts. A hybrid data assimilation technique may help to limit the negative effects of these potentially spurious features. In regions of low ensemble spread, the error statistics of the hybrid prediction error covariance matrix will be dominated by the isotropic 3DVAR error statistics. Though the error correlations are not set explicitly to zero at a prescribed distance from the observations, the impact of such long distance correlations is reduced when using the hybrid scheme.

## 4. Conclusions

NOAA conducted two operational synoptic surveillance missions in Hurricane Humberto with the G-IV jet aircraft during the CAMEX-4 field experiment. The dropwindsonde data gathered during these missions led to substantial track forecast improvements in both of the leading operational dynamical forecast models (AVN and GFDL) at the nominal times of these two missions. However, lesser improvements, and even some forecast degradations, are seen in those forecasts initiated at the nonnominal initial times. This may be because the model can “remember” the good dropwindsonde data for only a short time (up to 12 h) before they no longer improve the forecasts. The reasons for this are beyond the scope of the current study. Twice-daily missions initiated in 2002 are designed to limit this problem.

Examination of the increments and signals provided by the dropwindsonde data confirms previous results that the impact grows in areas of large spread in the NCEP ensemble forecasting system (Aberson 2003). In these cases, the impact has been shown to grow rapidly in regions distant from the observation sites just a couple of days into the forecast. This finding raises questions about the feasibility of conducting targeting missions to improve specific forecasts in the medium range.

The data from the surveillance missions provided an opportunity to test an ensemble-based Kalman filter data assimilation technique in a barotropic hurricane track forecast model, VICBAR, using a 41-member modified bred mode ensemble forecasting system. An isotropic assimilation scheme such as 3DVAR is only able to impact the initial conditions in a region surrounding each data point. However, the ETKF is able to make modifications based upon the unique “flow of the day,” such that meaningful initial increments to the first guess field are created. This has led to larger improvements to track forecasts in this very small sample than are possible with currently operational data assimilation schemes.

Clearly, further tests of both these targeting and data assimilation schemes on a larger sample must be conducted before any meaningful and statistically significant results may be found. Such tests are currently being conducted. The current results do suggest that the ETKF has promise as a targeting and data assimilation technique for tropical cyclones.

## REFERENCES

Aberson, S. D., 2002: Two years of operational hurricane synoptic surveillance. *Wea. Forecasting*, **17**, 1101–1110.

Aberson, S. D., 2003: Targeted observations to improve operational tropical cyclone track forecast guidance. *Mon. Wea. Rev.*, **131**, 1613–1628.

Aberson, S. D., and M. DeMaria, 1994: Verification of a nested barotropic hurricane track forecast model (VICBAR). *Mon. Wea. Rev.*, **122**, 2804–2815.

Aberson, S. D., and J. L. Franklin, 1999: Impact on hurricane track and intensity forecasts of GPS dropwindsonde observations from the first-season flights of the NOAA Gulfstream-IV jet aircraft. *Bull. Amer. Meteor. Soc.*, **80**, 421–427.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–435.

Caplan, P., J. Derber, W. Gemmill, S-Y. Hong, H-L. Pan, and D. Parrish, 1997: Changes to the 1995 NCEP operational medium-range forecast model analysis-forecast system. *Wea. Forecasting*, **12**, 581–594.

Cohn, S. E., 1997: An introduction to estimation theory. *J. Meteor. Soc. Japan*, **75**, 257–288.

Daley, R., 1991: *Atmospheric Data Analysis*. Cambridge University Press, 457 pp.

Dee, D., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. *Mon. Wea. Rev.*, **123**, 1128–1145.

Etherton, B. J., and C. H. Bishop, 2004: Resilience of hybrid ensemble/3DVAR analysis schemes to model error and ensemble covariance error. *Mon. Wea. Rev.*, **132**, 1065–1080.

Hamill, T. M., and C. Snyder, 2000: A hybrid ensemble Kalman filter/3D-variational analysis scheme. *Mon. Wea. Rev.*, **128**, 2905–2919.

Hock, T. F., and J. L. Franklin, 1999: The NCAR GPS dropwindsonde. *Bull. Amer. Meteor. Soc.*, **80**, 407–420.

Kalman, R., and R. Bucy, 1961: New results in linear filtering and prediction theory. *Trans. ASME J. Basic Eng.*, **82D**, 35–45.

Kurihara, Y., R. E. Tuleya, and M. A. Bender, 1998: The GFDL hurricane prediction system and its performance in the 1995 hurricane season. *Mon. Wea. Rev.*, **126**, 1306–1322.

Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model. *J. Atmos. Sci.*, **55**, 399–414.

Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center’s spectral statistical interpolation analysis system. *Mon. Wea. Rev.*, **120**, 1747–1763.

Surgi, N., H-L. Pan, and S. J. Lord, 1998: Improvement of the NCEP global model over the Tropics: An evaluation of model performance during the 1995 hurricane season. *Mon. Wea. Rev.*, **126**, 1287–1305.

Szunyogh, I., Z. Toth, A. V. Zimin, S. J. Majumdar, and A. Persson, 2002: Propagation of the effect of targeted observations: The 2000 Winter Storm Reconnaissance Program. *Mon. Wea. Rev.*, **130**, 1144–1165.

Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. *Bull. Amer. Meteor. Soc.*, **74**, 2317–2330.

Woollen, J. R., 1991: New NMC operational OI quality control. Preprints, *Ninth Conf. on Numerical Weather Prediction*, Denver, CO, Amer. Meteor. Soc., 24–27.

## APPENDIX

### Generation of VICBAR Ensemble Perturbations

The *K* (where *K* = 41) initial perturbations 𝗫^{a} of the VICBAR ensemble about the analysis are formed by linearly transforming the forecast perturbations bred in a 12-h cycle, 𝗫^{f}, such that they are orthonormal with respect to a simple diagonal estimate of the DLM wind analysis error covariance 𝗔. Each column of 𝗫^{a} and 𝗫^{f} represents an individual analysis or forecast perturbation, respectively. The transformation is performed by first solving the simple *K* × *K* eigenvalue problem 𝗫^{fT}𝗔^{−1}𝗫^{f}𝗖 = 𝗖**g**, where the diagonal matrix **g** lists the *K* eigenvalues, and each column of 𝗖 is the corresponding eigenvector. The perturbations are then given by 𝗫^{a} = 𝗫^{f}𝗖**g**^{−1/2}; these are added to the control analysis to give the initial ensemble.
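The transformation above can be sketched with a symmetric eigensolver. This is an illustration under stated assumptions and not the VICBAR code: the function name is hypothetical, 𝗔 is passed as a vector holding its diagonal, and the eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def transform_perturbations(X_f, a_diag):
    """Return X_a = X_f C g^{-1/2}, so that X_a^T A^{-1} X_a = I.

    X_f:    (n, K) forecast perturbations, one per column.
    a_diag: (n,) diagonal of the analysis error covariance estimate A.
    """
    X_f = np.asarray(X_f, dtype=float)
    a_diag = np.asarray(a_diag, dtype=float)
    M = X_f.T @ (X_f / a_diag[:, None])     # K x K matrix X_f^T A^{-1} X_f
    g, C = np.linalg.eigh(M)                # M C = C diag(g), g ascending
    if np.any(g <= 1e-12 * g.max()):
        raise ValueError("perturbations are rank deficient in this norm")
    return (X_f @ C) / np.sqrt(g)           # scale column j by g_j^{-1/2}
```

Working with the *K* × *K* matrix keeps the eigenvalue problem small even when the number of grid points *n* is much larger than the number of perturbations.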