## Abstract

An ensemble Kalman filter data assimilation system for the Weather Research and Forecasting Model is used with ensemble-based sensitivity analysis to explore observing strategies and observation targeting for tropical cyclones. The case selected for this study is Typhoon Morakot (2009), a western Pacific storm that brought record-breaking rainfall to Taiwan. Forty-eight hours prior to making landfall, ensemble sensitivity analysis using a 50-member convection-permitting ensemble predicts that dropsonde observations located in the southwest quadrant of the typhoon will have the highest impact on reducing the forecast uncertainty of the track, intensity, and rainfall of Morakot. A series of observing system simulation experiments (OSSEs) demonstrate that assimilating synthetic dropsonde observations located in regions with higher predicted observation impacts will, on average, lead to a better rainfall forecast than in regions with smaller predicted impacts. However, these OSSEs also suggest that the effectiveness of the current-generation ensemble-based tropical cyclone targeting strategies may be limited. The limitations may be due to strong nonlinearity in the governing dynamics of the typhoon (e.g., moist convection), the accuracy of the ensemble background covariance, and the projection of individual dropsonde observations to the complicated targeted sensitivity vectors from the ensemble.

## 1. Introduction

Typhoons cause great loss of life and property each year for coastal regions, making them some of the most costly natural disasters. The accuracy of typhoon track and intensity forecasts is impaired in part by the lack of observations over the ocean, where tropical cyclones (TCs) form and intensify. One strategy that is used to alleviate the deficiencies in tropical cyclone forecasts is observation targeting. This process seeks to determine when and where supplemental observations (in addition to those from preexisting networks) should be taken in order to maximize the reduction in the forecast uncertainty of a numerical weather prediction model. Strategies for identifying the targeted locations depend on the flow-dependent dynamics, forecast model accuracy, background and observation errors, and the data assimilation technique (Berliner et al. 1999).

One targeting strategy, called sensitivity analysis, tries to determine how a numerical weather model behaves in the presence of small changes in initial conditions. In the context of predictability, sensitivity analysis relies on the underlying error growth dynamics to determine where additional observations are needed to reduce forecast uncertainty following data assimilation. Several methods that are based on linear assumptions regarding how a forecast changes with initial conditions have been developed. These approaches require that the linear regression between a scalar forecast metric and an initial condition variable be calculated to quantify the effects of assimilating observations. Both adjoint sensitivity and singular vector analyses have been used in past studies for observation targeting of extratropical and tropical cyclones (Errico and Vukicevic 1992; Langland et al. 1995; Rabier et al. 1996; Zou et al. 1998; Gelaro et al. 1999; Langland et al. 1999; Wu et al. 2007, 2009).

Anderson (2001) used an ensemble-based targeting method in which sample statistics are used to estimate relationships between the initial observable quantities and a forecast variable or function of forecast variables. A similar method was later adopted in Bishop et al. (2001) and Hamill and Snyder (2002) to estimate the effects of observations on analysis variance. The forecast impact associated with observations was also explored by Majumdar et al. (2001). The ensemble-based analysis method was recently demonstrated to be nearly equivalent to the adjoint approach in Liu and Kalnay (2008). However, in a study of a winter cyclone near the west coast of North America, Ancell and Hakim (2007) showed that adjoint and ensemble sensitivities may differ greatly in location, scale, and magnitude. They also showed that ensemble sensitivity analysis accurately estimated the changes of a forecast metric given the initial conditions. This method can identify a target region where additional observations are expected to yield the greatest reduction in forecast variance. Torn and Hakim (2008) evaluated the sensitivity of forecasts to observations using an ensemble approach with data from a pseudo-operational ensemble Kalman filter (EnKF; Evensen 2003). They found good agreement between forecast changes and ensemble predictions measured by the forecast metric, but only one observation per update time was used in their experiments to examine the effectiveness of ensemble sensitivity. Aberson et al. (2011) also found forecast improvements using ensemble variance, ensemble transform Kalman filter, and total energy singular vector methods with dropwindsonde data.

More recently, Kunii et al. (2012) used ensemble sensitivity analysis with the local ensemble transform Kalman filter (LETKF) and the Weather Research and Forecasting Model (WRF) to examine the impact of real dropsonde observations for the prediction of tropical cyclones. Jung et al. (2012) used an EnKF to assimilate different sets of targeted observations as part of observing system experiments to examine the impacts of real observations at different locations of tropical cyclones. Unlike Kunii et al. (2012) and Jung et al. (2012), this study pulls observations from a “truth” simulation in order to remove the impacts of random error or systematic bias in the observations and forecast model.

There are currently two programs using the ensemble-based targeting strategy to identify sensitivity regions for airborne dropsondes in order to improve TC forecasting: one is operated by the National Oceanic and Atmospheric Administration (NOAA; Aberson and Franklin 1999) and the other is the Dropsonde Observations for Typhoon Surveillance near the Taiwan Region mission (DOTSTAR; Wu et al. 2007). From 1982 to 1996, there was an increase in airborne dropsonde observations collected by NOAA in TC environments, which helped the National Centers for Environmental Prediction (NCEP) reduce track forecast errors by 16%–30% for 12–60-h forecasts (Burpee et al. 1996; Tuleya and Lord 1997). Since 1997, there has been some indication that the global positioning system (GPS) dropsondes have helped the Geophysical Fluid Dynamics Laboratory (GFDL) improve track and intensity forecasts by as much as 32% and 20%, respectively, when storms were within 48 h of the projected landfall (Aberson and Franklin 1999). However, statistically significant improvements in track forecasts were not obtained for dropsonde missions conducted during 1997 and 1998 (Aberson 2002). A strategy for identifying potential dropsonde release locations to improve TC track forecasts has been developed based on the NCEP operational global ensemble forecasting system (Aberson 2003). The dropsonde data improved the 24- and 48-h NCEP global model TC track forecasts during 2003 by an average of 18%–32%.

Starting from 2003, DOTSTAR has marked the beginning of a new era of TC surveillance and targeted observations in the western North Pacific using GPS dropsondes (Wu et al. 2007). Wu et al. (2007) evaluated the impact of dropsonde data on TC forecasts for 10 missions during 2004 using five models (four operational models and one research model). While the impact of the dropsonde data is not statistically significant because of the limited number of DOTSTAR cases in 2004, the overall impact over the western North Pacific is encouraging. The average 72-h track error reduction of the three global models was 22%, which is consistent with the track forecast improvement in Atlantic tropical cyclones from surveillance missions.

Complementary to the aforementioned studies, an ensemble-based sensitivity method is applied to Typhoon Morakot (2009) to evaluate the sensitivity of rainfall forecasts to synthetic dropsonde observations. Numerous observing system simulation experiments (OSSEs) using an EnKF data assimilation technique are performed for observations at optimized locations selected by the sensitivity analysis. These OSSEs are used to test the effectiveness of ensemble sensitivity analysis as an observation-targeting strategy for typhoons. The use of synthetic observations in the perfect-model OSSEs allows us to remove error sources posed by an imperfect model and examine the true observation impact given by the algorithm.

Typhoon Morakot hit the southeast coast of Taiwan at 1800 UTC 7 August 2009, bringing catastrophic damage to Taiwan that totaled $3.3 billion (U.S. dollars) in economic losses. Despite being only a weak typhoon, Morakot produced record-breaking rainfall that totaled 2777 mm in 72 h, as measured by the Central Weather Bureau of Taiwan. Zhang et al. (2010, hereafter Z10) explored the predictability of the high rainfall event from an ensemble of high-resolution model simulations. They found that most of the difficulty in predicting this record-breaking rainfall was due to uncertainty in Morakot’s track forecast, as nearly all members with good track forecasts yielded reasonable precipitation estimates. These results raise several questions regarding the predictability of this event: What initial condition differences are causing the large ensemble spread in track and rainfall amount? If additional dropsondes were to be added to reduce the initial condition and forecast uncertainties, which observations would yield the largest impacts to the forecast metrics? Is the ensemble sensitivity analysis effective at identifying the correct observations during the targeting? How should the effectiveness of a targeting method be evaluated? To answer these questions, the magnitude and distribution of an ensemble-sensitivity metric (see section 2) are used in this study to examine the sensitivity of forecast track and precipitation to initial conditions at various forecast lead times. The effectiveness of the ensemble-sensitivity method is verified by assimilating the observations of interest and running deterministic and ensemble forecasts from the corresponding EnKF analyses.

The outline of this paper is as follows. Section 2 introduces the methodology for ensemble sensitivity analysis. Details regarding the methodology and data used in this study are provided in section 3. Predicted observation impacts based on ensemble sensitivity analysis are shown in section 4, and compared with the simulated impacts from assimilating synthetic dropsonde observations in section 5. Conclusions are given in section 6.

## 2. Ensemble-based sensitivity and observation targeting

The ensemble sensitivity analysis used in this study for determining adaptive observation locations is the same as in Bei et al. (2012). The algorithm originated from Berliner et al. (1999) and Hamill and Snyder (2002), and is mathematically identical to the formulation in Bishop et al. (2001). We denote the model analysis state after data assimilation as **x**^{a}, the background state as **x**^{b}, and the truth as **x**^{t}, all of which are vectors of length *m*. Here **P**^{a} and **P**^{b} are the analysis and background error covariance matrices, respectively. For this method, the metric used to determine the total decrease in model uncertainty is the reduction in forecast error covariance corresponding to a model state variable defined by the user. As in Hamill and Snyder (2002), this expression is written as

$$\mathbf{P}^{a} = \mathbf{P}^{b} - \mathbf{P}^{b}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{P}^{b}\mathbf{H}^{T} + \mathbf{R}\right)^{-1}\mathbf{H}\mathbf{P}^{b}, \tag{1}$$

where **H** is a matrix operator that projects the model space to observational space, **H**^{T} is its transpose, and **R** is the observation error covariance matrix, which includes both the instrument error and the representation error.

The true analysis and background error covariance are given by

$$\mathbf{P}^{a} = \left\langle \left(\mathbf{x}^{a} - \mathbf{x}^{t}\right)\left(\mathbf{x}^{a} - \mathbf{x}^{t}\right)^{T} \right\rangle, \tag{2}$$

$$\mathbf{P}^{b} = \left\langle \left(\mathbf{x}^{b} - \mathbf{x}^{t}\right)\left(\mathbf{x}^{b} - \mathbf{x}^{t}\right)^{T} \right\rangle. \tag{3}$$

For numerical weather prediction, the true state vector is always unknown, but **P**^{b} can be estimated using (3) with **x**^{t} replaced by the ensemble mean (denoted by an overbar):

$$\mathbf{P}^{b} \approx \frac{1}{N-1}\sum_{i=1}^{N}\left(\mathbf{x}^{b}_{i} - \overline{\mathbf{x}}^{b}\right)\left(\mathbf{x}^{b}_{i} - \overline{\mathbf{x}}^{b}\right)^{T} = \mathbf{X}^{b}\mathbf{X}^{bT}. \tag{4}$$

Here *N* is the ensemble size and **X**^{b} is a matrix of background ensemble perturbations, where each column corresponds to a member normalized by the square root of *N* − 1. Combining (4) and (1) gives

$$\mathbf{P}^{b} - \mathbf{P}^{a} = \mathbf{X}^{b}\left(\mathbf{H}\mathbf{X}^{b}\right)^{T}\left[\mathbf{H}\mathbf{X}^{b}\left(\mathbf{H}\mathbf{X}^{b}\right)^{T} + \mathbf{R}\right]^{-1}\mathbf{H}\mathbf{X}^{b}\mathbf{X}^{bT}. \tag{5}$$

If the goal of the targeting is to improve the likelihood of the analysis, then locations that maximize the reduction in analysis error variance, tr(**P**^{b} − **P**^{a}), will be identified as the observation regions to be targeted. The targeted observations should result in the smallest analysis uncertainty from the set of other possible observations. Likewise, if the objective is to improve the likelihood of the forecast, then locations that maximize the reduction in ensemble forecast error variance will be identified as the observation regions to be targeted. Provided that the initial perturbations are small, Bishop et al. (2001) and Majumdar et al. (2001) showed that the reduction in forecast error variance can be estimated by applying the linear model operator **M** and its transpose **M**^{T} to both sides of (5) to get

$$\mathbf{P}^{f|b} - \mathbf{P}^{f|a} = \mathbf{M}\mathbf{X}^{b}\left(\mathbf{H}\mathbf{X}^{b}\right)^{T}\left[\mathbf{H}\mathbf{X}^{b}\left(\mathbf{H}\mathbf{X}^{b}\right)^{T} + \mathbf{R}\right]^{-1}\mathbf{H}\mathbf{X}^{b}\left(\mathbf{M}\mathbf{X}^{b}\right)^{T}, \tag{6}$$

where **P**^{f|a} and **P**^{f|b} are used on the left-hand side of (6) to denote the reduction of forecast variance, given by **M**(**P**^{b} − **P**^{a})**M**^{T}. The ensemble forecast members, with mean **x**^{b} and perturbations **MX**^{b}, can be replaced by a user-specified forecast metric with perturbations **X**^{j} and mean **x**^{j}. The forecast metric can be a function of one or more variables, and may have a lower dimension than the original state vector (e.g., accumulated rainfall or surface pressure). By letting **Y** denote **HX**^{b} and replacing **MX**^{b} with **X**^{j}, the projection of background perturbations to observation space in (6) can be rewritten as

$$\mathbf{P}^{j|b} - \mathbf{P}^{j|a} = \mathbf{X}^{j}\mathbf{Y}^{T}\left(\mathbf{Y}\mathbf{Y}^{T} + \mathbf{R}\right)^{-1}\mathbf{Y}\mathbf{X}^{jT}, \tag{7}$$

where **P**^{j|a} and **P**^{j|b} denote the covariance matrix for the forecast metric with and without the assimilation of the targeted observation, respectively. It follows from (7) that the reduction in forecast error variance is determined by the square of covariance between the background transformed to observation space and future state variables in model space, divided by the sum of observation and background error variances. Because (7) is positive definite, all assimilated observations will in principle reduce the error variance and improve the forecast accuracy, which is also constrained by the Kalman filter assumptions.
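For a single scalar observation and a scalar forecast metric, the variance reduction described above collapses to the squared ensemble covariance between the metric and the model-predicted observation, divided by the sum of the background and observation error variances. The following Python sketch illustrates that scalar case only; the function names and the toy ensemble values are assumptions made for illustration and are not taken from the WRF-EnKF system.

```python
from statistics import fmean


def perturbations(values):
    """Deviations from the ensemble mean, scaled by 1/sqrt(N - 1)."""
    n = len(values)
    mean = fmean(values)
    scale = (n - 1) ** 0.5
    return [(v - mean) / scale for v in values]


def variance_reduction(metric, obs_prior, obs_err_var):
    """Scalar form of the predicted variance reduction.

    metric      -- forecast metric value from each ensemble member
    obs_prior   -- model-predicted observation from each member
    obs_err_var -- observation error variance (scalar R)
    """
    xj = perturbations(metric)
    y = perturbations(obs_prior)
    cov_jy = sum(a * b for a, b in zip(xj, y))  # covariance(metric, obs)
    var_y = sum(b * b for b in y)               # background variance in obs space
    return cov_jy ** 2 / (var_y + obs_err_var)


# Toy five-member ensemble (made-up numbers).
metric = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_prior = [2.0, 4.0, 6.0, 8.0, 10.0]
print(variance_reduction(metric, obs_prior, obs_err_var=10.0))
```

In this toy case the metric and the predicted observation are perfectly correlated, so a zero observation error variance would remove the entire metric variance, while a larger error variance removes proportionally less.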

Two forecast metrics are used in our study: the area-averaged 72-h accumulated precipitation and the area-averaged sea level pressure (SLP) in southern Taiwan. Here **Y** is chosen as a vector of observations located at each grid point of the background state. The observation error variances used in this study are consistent with those provided by the WRF variational data assimilation system.

The reduction in the forecast-metric error covariance, **P**^{j|b} − **P**^{j|a}, can be approximated by (7) for each candidate observation location. Here we define the impact factor as the trace of (**P**^{j|b} − **P**^{j|a}) normalized by the maximum and minimum values at the respective time as in Stuart et al. (2007). Locations with the maximum impact factor values are expected to produce the largest reduction in uncertainty for the forecast metric and will therefore be selected for additional targeted observations. While ensemble sensitivity indicates how a forecast metric is expected to change due to modifications in the initial conditions, the impact factor estimates the benefit of additional observations for a targeting region. The two terms share a similar meaning, and are used interchangeably throughout this paper to discuss how observations impact the chosen forecast metrics.
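The normalization and target selection can be sketched in a few lines of Python. This is a minimal illustration that assumes a min-max rescaling of the raw variance reductions at one time; the site labels and raw values are made up, and the exact normalization used in the study may differ.

```python
def normalize_impact(raw):
    """Rescale raw variance reductions to the range [0, 1] (min-max, assumed)."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        # A flat field carries no targeting information.
        return {site: 0.0 for site in raw}
    return {site: (value - lo) / (hi - lo) for site, value in raw.items()}


def select_target(raw):
    """Return the candidate site with the largest normalized impact factor."""
    impact = normalize_impact(raw)
    return max(impact, key=impact.get)


# Hypothetical raw variance reductions at three candidate sites.
raw_reduction = {"A": 4.0, "B": 3.0, "C": 1.0}
print(normalize_impact(raw_reduction))
print(select_target(raw_reduction))
```

Because only the ranking matters for choosing a target, any monotonic rescaling would select the same site; the normalization mainly makes maps from different times and metrics comparable.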

It is common practice to localize the ensemble-estimated covariance to account for sampling error (Hamill et al. 2001), but we chose to omit the localization step in our study despite the known limitations in using unmodified ensemble error statistics. Localizing the matrix **X**^{j}**Y**^{T} (the covariance between the forecast metric and the observed quantities) would require a time-dependent localization scheme, and pose some difficulty in ensuring that the localization radius of influence is consistent with what is used in our implementation of the EnKF. By not applying the localization, we can ensure that the EnKF and ensemble-sensitivity analysis schemes are using consistent ensemble statistics.

One advantage of ensemble sensitivity analysis is that it does not require actual observations to calculate the impact factor, only the observation operator and observation error covariance. All data come from the ensemble mean background state, and the forecast metric considers variables diagnosed from the ensemble forecast. Observation targeting for real-time applications requires enough lead time to prepare the observing platforms and for mission planning based on the targeted area. Aircraft-based observations such as dropsondes for example require the identification of targeting areas 24–48 h ahead of the proposed targeting time for forecast metrics such as track, intensity, and precipitation.

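To evaluate the performance of ensemble sensitivity analysis, the predicted change in the ensemble mean of the forecast metric, given the known observation value, should be calculated. This can be achieved using the method proposed by Ancell and Hakim (2007) in which the forecast metric is estimated by modifying the covariance in the Kalman gain matrix. In the Kalman filter equations, the analysis increment for a single observation **y** is given by

$$\mathbf{x}^{a} - \mathbf{x}^{b} = \mathbf{K}\left(\mathbf{y} - \mathbf{x}^{o}\right), \tag{8}$$

where **x**^{o} is the background variable at the same location of **y**, and **K** is the Kalman gain for the considered observation. **K** can be expressed as

$$\mathbf{K} = \mathbf{P}^{b}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{P}^{b}\mathbf{H}^{T} + \mathbf{R}\right)^{-1}, \tag{9}$$

where **R** is the observation error covariance matrix. For a set of initial perturbations given by **X**^{b}, a linear prediction for the future perturbations is given by **MX**^{b}. From (8) and (9), the expected change in the forecast is given by

$$\mathbf{M}\left(\mathbf{x}^{a} - \mathbf{x}^{b}\right) = \mathbf{M}\mathbf{X}^{b}\left(\mathbf{H}\mathbf{X}^{b}\right)^{T}\left[\mathbf{H}\mathbf{X}^{b}\left(\mathbf{H}\mathbf{X}^{b}\right)^{T} + \mathbf{R}\right]^{-1}\left(\mathbf{y} - \mathbf{x}^{o}\right), \tag{10}$$

which, written in terms of the forecast metric perturbations **X**^{j} and **Y** = **HX**^{b}, becomes

$$\delta\mathbf{x}^{j} = \mathbf{X}^{j}\mathbf{Y}^{T}\left(\mathbf{Y}\mathbf{Y}^{T} + \mathbf{R}\right)^{-1}\left(\mathbf{y} - \mathbf{x}^{o}\right), \tag{11}$$

where δ**x**^{j} denotes the expected change in the ensemble mean of the forecast metric.

Now the change in the forecast metric is expressed as a function of the covariance between the forecast metric itself and the predicted observation, the sum of the model and observation space variances, and the observation innovation. Note that (11) is similar to the standard Kalman filter update equation, except that the covariance term for estimating the relationship between the observed variable and updated variable is replaced by the covariance between a forecast metric and the observed variable. Because both Ancell and Hakim (2007) and Liu and Kalnay (2008) showed that ensemble- and adjoint-based approaches make similar assumptions in how forecast errors grow with time, we believe that conclusions drawn here for ensemble-based targeting strategies may also be relevant to adjoint-based methods.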
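As a concrete illustration of the update described in words above, the following Python sketch computes the expected change in a scalar forecast metric for one scalar observation: the metric-observation covariance, divided by the sum of the observation-space background variance and the observation error variance, multiplied by the innovation. The function name and the toy ensemble are illustrative assumptions, and the innovation is taken relative to the ensemble-mean predicted observation.

```python
from statistics import fmean


def expected_metric_change(metric, obs_prior, obs_value, obs_err_var):
    """Linear estimate of the change in the ensemble-mean forecast metric.

    metric      -- forecast metric value from each ensemble member
    obs_prior   -- model-predicted observation from each member
    obs_value   -- the (real or synthetic) observed value
    obs_err_var -- observation error variance
    """
    n = len(metric)
    mean_j = fmean(metric)
    mean_y = fmean(obs_prior)
    # Sample covariance and variance with the usual 1/(N - 1) factor.
    cov_jy = sum((j - mean_j) * (y - mean_y)
                 for j, y in zip(metric, obs_prior)) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in obs_prior) / (n - 1)
    innovation = obs_value - mean_y
    return cov_jy / (var_y + obs_err_var) * innovation


# Toy five-member ensemble (made-up numbers).
metric = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_prior = [2.0, 4.0, 6.0, 8.0, 10.0]
print(expected_metric_change(metric, obs_prior, obs_value=8.0, obs_err_var=10.0))
```

The only difference from a standard scalar Kalman update is that the numerator uses the covariance with the forecast metric rather than with the analysis variable being updated.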

## 3. Model and targeting strategy for OSSEs

### a. Model configuration

This study uses version 3.1.0 of the WRF (Skamarock et al. 2007), with a horizontal grid spacing of 13.5 km (D1) for the outer domain and 4.5 km (D2) for the two-way nested inner domain [refer to Fig. 1 of Xie and Zhang (2012)]. Both domains use 34 vertical levels with coordinates that follow hydrostatic pressure. Each domain has 603 × 540 horizontal grid points, with D1 covering an area ranging from 7°S–44°N and 98°–177°E and D2 covering 13°–35°N and 113°–135°E (results are shown from the nesting domain only). Initial conditions are provided from an experimental real-time global ensemble data assimilation system that is based on the GFS model. This system assimilates the same observations as the NCEP operational global analysis, but features an ensemble data assimilation technique that uses 60 members to approximate flow-dependent background error statistics and a probabilistic analysis (Whitaker et al. 2008).

### b. Simulated observations

Z10 showed that the 60-member WRF ensemble forecast (EN60_GOOD) initialized from the above-mentioned GFS ensemble analysis performed well in predicting the severe flooding in Taiwan associated with Typhoon Morakot. A deterministic forecast from the ensemble mean predicted a maximum 72-h rainfall forecast of 2762 mm, which is close to observations (Fig. 1b). We use the forecast from the ensemble mean of EN60_GOOD as the true state of the atmosphere for this event (hereafter referred to as the truth simulation). Synthetic dropsonde observations of zonal and meridional winds, temperature, dewpoint temperature, and geopotential height are extracted from the truth simulation at potential targeted locations. These values are interpolated to 12 standard pressure levels between 1000 and 50 hPa. Unless otherwise specified, no observation error was added for any of the synthetic observations derived from the truth run. Sensitivity to random observation errors will be presented in section 5c. A total of 90 equally spaced dropsondes (every 270 km) are derived from the nested domain of the truth simulation.
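The count of 90 dropsondes is consistent with sampling the 4.5-km nested grid every 270 km, that is, every 60 grid points. The Python sketch below reproduces that count under two assumptions that are not stated in the text: that the 603 points run west-east and the 540 points south-north, and that the first sounding sits half a stride in from the domain edge.

```python
GRID_SPACING_KM = 4.5     # nested-domain grid spacing
SONDE_SPACING_KM = 270.0  # spacing between synthetic dropsondes
NX, NY = 603, 540         # grid points (orientation is an assumption)


def sonde_grid(nx, ny, stride, offset):
    """Return (i, j) grid indices of equally spaced sounding locations."""
    xs = range(offset, nx, stride)
    ys = range(offset, ny, stride)
    return [(i, j) for j in ys for i in xs]


stride = round(SONDE_SPACING_KM / GRID_SPACING_KM)  # 60 grid points
sondes = sonde_grid(NX, NY, stride, offset=stride // 2)
print(len(sondes))
```

With a different edge offset the same stride can give a slightly different count, so the offset here should be read as one plausible layout rather than the layout used in the experiments.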

### c. Targeting strategy

A set of numerical experiments aimed at targeting observations to improve ensemble forecasts is carried out in this study using ensemble sensitivity analysis. Among the ensemble members that have poor precipitation forecasts, most have large position errors that track Morakot far north of Taiwan. A member with considerable track error (member 54; Fig. 2a) and rainfall error (Fig. 1c) is selected from EN60_GOOD and used as the initial mean for a new ensemble. Perturbations from members 10 to 60 of EN60_GOOD are used to produce the new ensemble (EN50_POOR) at 0000 UTC 5 August, but with member 54 omitted. A 96-h ensemble forecast is generated from EN50_POOR to test the ensemble-sensitivity targeting technique with an EnKF data assimilation system. The deterministic forecast from the ensemble mean of EN50_POOR at 0000 UTC 5 August will be referenced frequently throughout the paper and denoted by NoDA. Figure 2a shows Morakot’s track spread produced from the EN50_POOR forecast. Similar to the forecast from the ensemble mean, most members fail to make landfall in Taiwan, causing them to produce rainfall forecasts that are far from the truth. The ensemble forecast exhibits large track spread by the end of the simulation, due to significant differences in the initial conditions. The large analysis uncertainty motivates an exploration into targeted observations to reduce the track spread and to improve the rainfall and intensity forecast of Morakot.
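The construction of EN50_POOR can be sketched as a recentering step: perturbations from the donor members are added to the state of the member chosen as the new mean. The Python sketch below is a minimal illustration with a toy two-variable state; it assumes the perturbations are taken about the mean of the 50 donor members so that the new ensemble mean equals the chosen member exactly, which is an assumption about a detail the text does not specify.

```python
from statistics import fmean


def recenter_ensemble(members, center_id, first_id, last_id):
    """Build a new ensemble centered on members[center_id].

    members  -- dict mapping member id to a list of state values
    Donors are ids first_id..last_id inclusive, excluding center_id.
    """
    donor_ids = [i for i in range(first_id, last_id + 1) if i != center_id]
    n_state = len(members[center_id])
    # Mean of the donor members for each state element.
    donor_mean = [fmean(members[i][k] for i in donor_ids) for k in range(n_state)]
    center = members[center_id]
    return [
        [center[k] + members[i][k] - donor_mean[k] for k in range(n_state)]
        for i in donor_ids
    ]


# Toy 60-member ensemble with a two-element state (made-up values).
toy = {i: [float(i), float(i * i)] for i in range(1, 61)}
new_ensemble = recenter_ensemble(toy, center_id=54, first_id=10, last_id=60)
print(len(new_ensemble))
```

Using members 10 to 60 with member 54 removed yields 50 donors, matching the size of the new ensemble.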

The accumulated rainfall and SLP averaged over southern Taiwan (indicated by the rectangular box in Fig. 1b) are chosen as the forecast metrics or response functions to be examined. A set of analyses is generated by assimilating each dropsonde individually at 0000 UTC 6 August (24 h after the original start of the EN50_POOR ensemble) with the WRF-EnKF. A 72-h deterministic forecast is then initialized from the EnKF mean analysis of each of the 90 single-dropsonde experiments, and the results are compared with the truth simulation to calculate the simulated impacts of each dropsonde in terms of rainfall and SLP.

## 4. Predicted observation impacts based on ensemble-sensitivity analysis

The impact factor as defined by (7) estimates the degree to which additional observations will positively impact the forecast metric at hypothetical observation locations. Figure 3 shows the normalized impact factor for individual dropsondes taken at either 0000 UTC 5 or 0000 UTC 6 August with respect to the forecast of area-averaged 72-h accumulated rainfall valid at 0000 UTC 9 August and the area-averaged SLP over Taiwan at 0000 UTC 8 August. Because the impact factor is meant to show the relative importance of the hypothetical observations at different locations, we normalize the original value estimated from (7) at each dropsonde location by the maximum value of the field, which forces the values shown in Fig. 3 to range from 0 to 1. As anticipated, impact factors calculated from the two separate forecast metrics produce qualitatively similar patterns; a lower area-averaged SLP value over southern Taiwan is usually associated with a track that directly crosses over the island, thus increasing the possibility of large precipitation associated with the typhoon interaction with the Taiwan topography (e.g., Xie and Zhang 2012).

At 0600 UTC 5 August, 6 h after the start of the EN50_POOR ensemble, dropsonde observations with the largest observation impact on the rainfall metric are located either close to the center of the initial typhoon, or in its northeast quadrant (Fig. 3a). The distribution of the largest impact factor on the SLP metric is qualitatively similar to that of the 72-h accumulated rainfall but more concentrated in a narrow banded structure that extends from the storm center to the northeast (Fig. 3c). Nevertheless, the ensemble sensitivity analysis suggests that individual dropsondes may have considerable impacts on the SLP and rainfall forecasts over Taiwan, especially for the SLP forecasts. Twelve hours into the simulation, the magnitude of impact factors increases in the western and northwestern regions of the typhoon inner core (not shown), forming a more isolated area of relatively large forecast sensitivity. By this time, the ensemble spread in storm positions has increased, and members have developed stronger, more organized typhoons (Z10); since the position uncertainty usually dominates the error covariance (Poterjoy and Zhang 2011), the increased track uncertainty in the ensemble may have contributed to the larger concentration of impact factors at this time. At the targeting time of 0000 UTC 6 August (Figs. 3b,d), anomalously high impact factors are concentrated in an even smaller area and the maximum has shifted from the northwest to southwest quadrant of the typhoon. A large difference in impact factor should exist between the location to be targeted and the neighboring regions for observation targeting to be effective. Here we selected 0000 UTC 6 August as the target time, which yields a clear local maximum of impact factors in the southwest quadrant of Morakot. This time also corresponds to the 24-h forecast lead time for the initial ensemble (EN50_POOR), which is a more realistic targeting time in practice because sufficient time is needed to allow for flight planning and the target operation. The following section will show results from assimilation and forecast experiments to examine the effectiveness of the targeting strategy. Observations used to estimate the impact factors in Figs. 3b,d are assimilated to calculate the actual effect of the targeted observations on the forecasts.

## 5. Simulated impacts from forecasts with EnKF assimilation of each dropsonde

### a. Simulated impacts through single deterministic forecasts

Single deterministic forecasts are integrated for 72 h from the 90 different EnKF analyses (section 3). The analyses are produced at 0000 UTC 6 August by assimilating each of the 90 synthetic dropsonde observations that were created from the truth simulation. For consistency, covariance localization is not used in the EnKF data assimilation because it is not used in the calculation of the predicted observation impact. Figure 3b shows several of the dropsonde locations near the typhoon. Dropsondes located in the inner core (points S0 and S5) and outside the typhoon (points S1, S2, S3, S4, and S6) are selected to show the results of data assimilation. Although S5 is located closer to the typhoon center, it has a smaller impact factor value than S0, which has the largest value for the entire domain (i.e., a normalized value of 1). Both S1 and S4 are located in a region of southwesterly flow, and are chosen to assess the potential impact of the southwest monsoon on the typhoon track, intensity, and rainfall forecast. Points S2, S3, and S6 are three locations at different sides of the storm that have relatively small impact factors. The maximum (minimum) SLP increments for soundings S0, S1, S2, S3, S4, S5, and S6 are 2.5 (−6.8), 1.7 (−8.1), 1.8 (−9.2), 1.7 (−7.3), 2.2 (−6.7), 3.7 (−8.3), and 2.2 (−6.7) hPa (Fig. 4), respectively.

With the exception of S6 (Fig. 4), the spatial pattern of the SLP increments from each of the EnKF analyses is qualitatively similar to the difference between the truth and NoDA (Fig. 2b), implying that each of the dropsondes extracted from the truth simulation is assimilated properly by the EnKF to improve the initial analysis. For example, the north–south-oriented dipole pattern of the SLP increments acts to move the storm from the NoDA location to the true location for all dropsondes (Fig. 2b). Nevertheless, significant differences also exist between these EnKF updates. Dropsondes closer to the typhoon inner-core area generally produce larger updates to the pressure field surrounding the minimum SLP of the background field. For example, assimilating observations at S0 and S5, located near the inner-core area of the typhoon, produces slightly stronger dipole patterns in the analysis increments than dropsondes at S4 and S6 (Figs. 4a,f vs 4e,g). The method used in this study is mathematically identical to the ETKF-based method of Majumdar et al. (2002).

For all dropsondes, including those located outside the inner core, information failed to propagate to synoptic-scale features of the environment, such as the subtropical high and southwest monsoon. All updates made by the EnKF assimilation of dropsonde observations must come from the ensemble-estimated covariance between the model-predicted value of the observed quantity and the remaining state vector. Figure 5 shows the correlations between 850-hPa height at six dropsonde locations (S1–S6) and the sea level pressure over the nested domain. The spatial distribution of correlations is similar for all of these dropsondes, except that the maximum correlation value for a dropsonde outside of the inner core is much smaller than those estimated for dropsondes near the inner core. In general, dropsonde variables are strongly correlated with sea level pressure over broad areas, especially in the vicinity of the observations. The absence of large pressure increments outside the inner core in Fig. 4 is likely due to the lack of background correlations in the environment.

To verify the impact of assimilating individual dropsondes on the model forecast, the 72-h track (Fig. 6) and rainfall (Fig. 7) forecasts are compared. The assimilation of individual dropsondes leads to track forecasts that are closer to the truth simulation than NoDA. The mean track errors are averaged every 6 h from 0000 UTC 6 August to 0000 UTC 9 August for each of the seven deterministic forecasts, yielding values of 79 km (S0), 113 km (S1), 114 km (S2), 85 km (S3), 119 km (S4), 124 km (S5), and 139 km (S6). The proximity of the track forecasts to central Taiwan at this time leads to maximum 72-h accumulated rainfall amounts in excess of 2000 mm for the seven selected cases (2480 mm for S0, 2368 mm for S1, 2407 mm for S2, 246 mm for S3, 2021 mm for S4, 2197 mm for S5, and 1588 mm for S6), which are also closer to the truth simulation. The experiment for S0, a dropsonde located in the maximum impact factor region, has the largest improvement in forecast track and rainfall. However, the dropsonde with the second highest impact factor (S5) does not improve the forecast any more than the S1 and S3 observations during these experiments. Although S1 and S3 are relatively far from the maximum impact factor area in the inner core, they make reasonable improvements in the rainfall and track forecasts. Therefore, the typhoon case presents a scenario in which the sensitivity algorithm fails to provide an accurate measure of future error reduction.
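The mean track error quoted for each experiment is an average of the position error at 6-hourly verification times. A minimal Python sketch of such a calculation follows; it assumes great-circle (haversine) distances on a sphere of radius 6371 km and uses hypothetical storm positions, neither of which is specified in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_track_error_km(forecast, truth):
    """Average position error over matching verification times."""
    if len(forecast) != len(truth):
        raise ValueError("tracks must have the same number of positions")
    errors = [great_circle_km(f[0], f[1], t[0], t[1]) for f, t in zip(forecast, truth)]
    return sum(errors) / len(errors)


# Hypothetical (lat, lon) storm centers at three consecutive 6-h times.
truth_track = [(23.0, 125.0), (23.3, 124.0), (23.6, 123.0)]
forecast_track = [(23.5, 125.2), (24.0, 124.3), (24.6, 123.5)]
print(round(mean_track_error_km(forecast_track, truth_track), 1))
```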

Ensemble sensitivity analysis is based on a linear theory that connects the response of a forecast metric to changes in initial conditions, so it is necessary to verify the predicted observation impacts with the actual change in forecast errors found after integrating the full nonlinear model with assimilation of observations at different locations. Several indicators are used to quantify differences between the predicted and actual changes in the forecast after assimilating adaptive observations. The first is to examine the predicted and actual reduction in root-mean-square error (RMSE) of the forecast metric (or the response function that is used to calculate the impact factor). As in Bei et al. (2012), the square root of the nonnormalized impact factor is defined as the predicted (or expected) RMSE reduction in the forecast metric using (7). To perform the comparison, the actual reduction in RMSE for the 72-h accumulated rainfall and SLP averaged in the verification region is calculated from forecasts initialized from the mean EnKF analysis of each OSSE. The scatterplots in Figs. 8a,c show that assimilating synthetic dropsonde observations located in regions with higher predicted observation impacts will, on average, lead to a better rainfall forecast than in regions with smaller predicted impacts. However, the correlation between the expected and the actual RMS error reduction is rather weak for both rainfall (0.38) and SLP (0.42). This suggests that there are strong limitations in the effectiveness of using ensemble-based impact factors for observation targeting to improve tropical cyclone forecasting that are deeply rooted in the assumptions used in deriving (7). In particular, the background error valid at the targeting time for this case may be too large for the linear assumption to be valid.
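The strength of the relationship between predicted and actual error reduction across the single-dropsonde experiments can be summarized with a Pearson correlation coefficient. The Python sketch below shows the calculation; the function name and the sample values are illustrative assumptions and are not the experiment data.

```python
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Made-up predicted and actual RMSE reductions for five experiments.
predicted = [0.2, 0.5, 0.9, 1.4, 2.0]
actual = [0.6, 0.1, 1.1, 0.4, 1.5]
print(round(pearson_r(predicted, actual), 2))
```

A value near 1 would indicate that the linear prediction ranks and scales the observation impacts well, while values closer to 0 indicate that the predicted impact is a poor guide to the impact realized in the nonlinear forecasts.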

Another way of verifying the validity of the linear assumption is to compare the expected with the actual changes in the forecast metric. The expected change is computed using (11) for each of the selected grid points over all single-dropsonde assimilation cases. The actual results are given by the average changes compared with NoDA in the 72-h accumulated rainfall and SLP at the landfall time in the verification region. The scatterplots in Figs. 8b,d again show that the linear relationship between the expected and actual changes is weaker than the linear theory predicts; the regression coefficients are 0.59 for average rainfall and 0.34 for average SLP. This result is consistent with Bei et al. (2012), though the correlation coefficients are even smaller in the current study. The discrepancy between predicted and actual impacts may be due to the assumption in ensemble sensitivity theory that forecast errors grow linearly over the forecast period during which targeting is applied. This assumption may not be valid in typhoons because of large initial condition errors caused by the location of the vortex in numerical weather models, as well as errors originating from scales equal to or smaller than the mesoscale that may have grown nonlinearly during the 72-h forecast. The ensemble sensitivity method can also be limited by error/bias in the forecast model, the ensemble, and/or the dropsonde observations, the last of which will be examined in section 5c.
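As a hedged illustration of the regression check, the sketch below fits an ordinary least-squares line to actual versus expected changes. Interpreting the "regression coefficient" as the least-squares slope is an assumption made here, and the function name is hypothetical. If the linear theory held exactly, the actual change would equal the expected change and the slope would be 1.

```python
def least_squares_slope(expected, actual):
    """Slope of the least-squares line actual = slope * expected + intercept."""
    if len(expected) != len(actual) or len(expected) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(expected)
    mean_e = sum(expected) / n
    mean_a = sum(actual) / n
    sxy = sum((e - mean_e) * (a - mean_a) for e, a in zip(expected, actual))
    sxx = sum((e - mean_e) ** 2 for e in expected)
    if sxx == 0.0:
        raise ValueError("expected changes must not all be identical")
    return sxy / sxx
```

A slope well below 1 means the assimilation changed the forecast metric by less than the linear estimate anticipated, on average.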

### b. Simulated impacts through ensemble forecasts initialized with EnKF members

For a more direct comparison, the theoretical results should be verified using the actual reduction in forecast error variance from each dropsonde experiment. The large computational cost of running ensemble forecasts for all the experiments limits this type of verification to seven individual dropsondes (S0 to S6, with locations marked in Fig. 3b). Here the analysis ensembles produced from the assimilation of single-dropsonde experiments are integrated forward to estimate the actual forecast variance at the verification time. One of the dropsondes is located in the maximum impact location, and the others are selected from locations that are displaced in four different directions with respect to the center position of the typhoon. Figure 9 shows the ensemble spreads and mean forecast errors of track, SLP, and rainfall for each dropsonde experiment verified against the truth. In general, dropsondes located in the larger impact regions have more accurate track forecasts and smaller ensemble spread. Nevertheless, the forecast error variance of average rainfall over Taiwan for each of the single-dropsonde experiments is larger than the case without data assimilation, which is inconsistent with the track forecast and sensitivity analysis. The ensemble reveals complex relationships between storm track and the amount of total rainfall. For example, a heavy rainfall forecast for Morakot depends on both the accuracy of the simulated track and physical processes that lead to the topographically forced rainfall over Taiwan (Xie and Zhang 2012). This causes the ensemble variance in 72-h accumulated rainfall to be larger for an ensemble of reasonably accurate track forecasts, as opposed to an ensemble in which most members fail to track over Taiwan. This case provides a scenario in which the ensemble has a higher-than-average probability of underestimating the true forecast error near the time of landfall, which may decrease the effectiveness of the targeting system.
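The increase in rainfall variance when more members make landfall can be illustrated with a toy two-outcome ensemble. The rainfall amounts, the member counts, and the function name below are hypothetical and are not taken from the experiments; the sketch assumes each member produces either a "heavy" amount (track crosses Taiwan) or a "light" amount (track misses).

```python
def toy_rainfall_variance(n_landfall, n_members, heavy_mm=2000.0, light_mm=200.0):
    """Population variance of a two-outcome rainfall ensemble.

    n_landfall members receive heavy_mm, the remainder receive light_mm.
    All values are illustrative assumptions.
    """
    if not 0 <= n_landfall <= n_members or n_members == 0:
        raise ValueError("need 0 <= n_landfall <= n_members and n_members > 0")
    rain = [heavy_mm] * n_landfall + [light_mm] * (n_members - n_landfall)
    mean = sum(rain) / n_members
    return sum((r - mean) ** 2 for r in rain) / n_members


# Few members capture landfall: most forecasts agree on light rain.
var_few = toy_rainfall_variance(5, 50)
# Half the members capture landfall: the ensemble splits between outcomes.
var_half = toy_rainfall_variance(25, 50)
```

In this toy setting the variance is proportional to p(1 - p), where p is the fraction of members making landfall, so moving from a small p toward 0.5 raises the variance even though the ensemble mean track has improved.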

### c. Impact of random error in the dropsonde observations

The single-dropsonde assimilation experiments show that neighboring dropsondes with different impact factors have similar effects on the forecast metric. One issue is the representativeness of the observations, which is beyond the scope of this study. Another issue that should not be ignored is errors in the observations. To explore the impact of observation errors on forecasts, random errors (RE) proportional to the observation error assigned by the WRF variational data assimilation system are added to dropsonde observations S4 and S5, and the experiments are redone using the two new dropsondes: S4-RE and S5-RE. The resulting SLP differences with and without the random observation errors are plotted at the initial (Figs. 10a,b) and 72-h forecast lead times (Figs. 10c,d). Figure 11 shows the corresponding difference in the track predictions of Morakot initialized from the EnKF analyses with and without the random observation errors. These calculations examine the sensitivity of the targeting to observation errors and further demonstrate how nonlinear forecast error growth violates the linear assumption made in deriving the sensitivity analysis equations.
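A minimal sketch of the perturbation step is given below. It assumes the random errors are zero-mean Gaussian with a standard deviation equal to a scale factor times the assigned observation error; the distribution, the scale factor, the seed handling, and the function name are assumptions made for illustration and are not specified in the text.

```python
import random


def add_random_error(values, obs_errors, scale=1.0, seed=0):
    """Return observations perturbed by zero-mean Gaussian noise.

    values      observed quantities at each dropsonde level
    obs_errors  assigned observation error (standard deviation) per level
    scale       proportionality factor applied to obs_errors (assumed)
    seed        seed for reproducible perturbations
    """
    if len(values) != len(obs_errors):
        raise ValueError("values and obs_errors must have equal length")
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, scale * e) for v, e in zip(values, obs_errors)]
```

Applying the function to each variable of a dropsonde profile would produce a perturbed sounding analogous to S4-RE or S5-RE, which can then be assimilated in place of the original.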

In the case of S4, which is a low-impact factor observation that is located south of the storm center in a region of southwesterly monsoon flow (Fig. 3e), the addition of random noise to the observations produces small but noticeable differences. For example, SLP differences of less than 1 hPa are observed over the northeastern and western sides of the storm center in the analysis (Fig. 10a). These small initial differences grow much larger over the 72-h forecast (Fig. 10c), but cause only marginal changes to the track (Fig. 11), intensity (Fig. 10c), and rainfall (Figs. 7e,h) forecasts.

The impact of random error in dropsonde observation S5, on the other hand, is rather dramatic, despite an equally small initial difference in SLP (Fig. 10b). The track forecast of Morakot differs greatly between S5 (Fig. 10d) and S5-RE (Fig. 11). Consequently, the difference in rainfall prediction over Taiwan is equally dramatic; the maximum 72-h accumulated rainfall (Figs. 7f,i) is reduced from 2197 (S5) to 1459 mm (S5-RE).

The sensitivity experiments with and without random observation errors in the dropsondes further suggest the error growth is strongly nonlinear and the predictability of tropical cyclones can be limited, which is consistent with the ensemble sensitivity experiments shown in subsection 5b, as well as past tropical cyclone predictability studies (Zhang and Sippel 2008, 2009; Sippel and Zhang 2010). These limitations reduce the effectiveness of using ensemble sensitivity analysis, particularly for observation targeting.

## 6. Conclusions

This study examines the effectiveness of using an ensemble-based sensitivity analysis method with the WRF-EnKF for observing system design and observation targeting, in order to improve tropical cyclone forecasts. Typhoon Morakot (2009), a western Pacific storm that brought record-breaking rainfall to Taiwan, is selected for this study. The ensemble sensitivity analysis using a 50-member convection-permitting ensemble predicts that observations located in the southwest quadrant of the typhoon (48 h prior to making landfall) will have the highest impact on reducing the forecast uncertainty of track, intensity, and rainfall for Morakot.

A series of OSSEs are performed by assimilating targeted dropsonde observations located in the environment and inner core of Morakot. After assimilating the synthetic sounding observations with the WRF-EnKF, noticeable improvements can be seen in the initial conditions of sea level pressure (and other model state variables), with the maximum SLP increment in the inner core of the typhoon being greater than 5 hPa for some dropsonde locations. The 72-h deterministic forecasts initialized from the EnKF analyses show that the selected dropsondes are capable of improving the track and precipitation forecasts, but with varying impacts. Generally, dropsondes near the typhoon center have a greater impact than dropsondes in the environment. However, regressions suggest that the relationship between the expected and actual changes in forecast metrics is nonlinear, which is not consistent with the linear theory of ensemble sensitivity.

Several sets of ensemble forecasts initialized from the EnKF posterior members are performed to examine the impacts of individual dropsondes on the actual and predicted uncertainty reductions in the forecast metrics. These additional ensemble OSSEs show that in cases with a more accurate mean track forecast, the ensemble spread of typhoon tracks is smaller, which is expected with larger observation impact factors. However, the ensemble spread of precipitation is larger, which contradicts the predicted impact of the ensemble sensitivity analysis. One reason for this result is that a larger number of members captures the landfall of Morakot after the targeted observations are assimilated, thus increasing the forecast variance. The dynamics governing the precipitation associated with Morakot were also shown to be highly nonlinear and sensitive to topography. Because ensemble sensitivity cannot resolve errors that grow nonlinearly, the actual simulated error variance from the EnKF ensemble forecasts differs from the predicted forecast error variance.

In summary, the current study demonstrates serious limitations in using the current-generation ensemble-based linear-sensitivity targeting strategies for tropical cyclones. The effectiveness of these linear methods can be adversely impacted by the strong nonlinearity in the governing dynamics of tropical cyclones, the accuracy of the sample ensemble background covariance, and the projection of individual dropsonde observations to the complex targeted sensitivity vectors from the ensemble. We speculate that other types of observation errors not examined in the current study, including measurement bias, representativeness error, and/or error in the observation operators, may lead to further degradation of the overall effectiveness of this method. Given the common assumptions made between the ensemble and adjoint sensitivity analysis, as demonstrated in Ancell and Hakim (2007) and Liu and Kalnay (2008), future research will explore whether the limitations examined in this study for ensemble-based sensitivity analysis also apply to adjoint-based linear sensitivity analysis methods that are widely used for observation targeting.

## Acknowledgments

The first author is sponsored by the Graduate Exchange Funds provided by China's Ministry of Education. Proofreading by Ben Green is greatly appreciated. This research is partially supported by CNSF (Grants 40975059, 41275048, and 40921160380), ONR (Grants N000140410471 and N000140910526), NOAA (Grant under HFIP), and NSF (Grant 0840651). The Texas Advanced Computing Center (TACC) provided computing resources and support.