The impact of assimilating data from the 2011 Winter Storm Reconnaissance (WSR) program on numerical weather forecasts was assessed. Parallel sets of analyses and deterministic 120-h numerical forecasts were generated using the ECMWF four-dimensional variational data assimilation (4D-Var) and Integrated Forecast System. One set of analyses was generated with all of the normally assimilated data plus WSR targeted dropwindsonde data, the other with only the normally assimilated data. Forecasts were then generated from the two analyses. The comparison covered the period from 10 January to 28 March 2011, during which 98 flights and 776 total dropwindsondes were deployed from four different air bases in the Pacific basin and the United States. The dropwindsondes were deployed in situations where guidance indicated the potential for high-impact weather and/or the potential for large subsequent forecast errors. Downstream target verification regions where the high-impact weather was expected were identified for each case. Forecast errors around the target verification regions were evaluated using an approximation to the total-energy norm. Precipitation forecasts were also evaluated over the contiguous United States using the equitable threat score and bias.
Forecast impacts were generally neutral and thus smaller than reported in previous studies, most from over a decade ago, perhaps because of the improved forecast and assimilation system and the somewhat denser observation network. Target areas may also have been undersampled in this study. The neutral results from 2011 suggest that it may be more beneficial to explore other targeted observation concepts for the midlatitudes, such as assimilation of a denser set of cloud-drift winds and radiance data in dynamically sensitive regions.
Since the mid-1990s, supplementary “targeted” atmospheric observations have been deployed in relative data voids in the extratropics, such as the open ocean under cloud shields. The additional data were collected in an attempt to improve the operational numerical weather prediction (NWP) of potential high-impact weather events through assimilation of these extra data. The most extensive use of targeted observations in the extratropics has been through the annual National Oceanic and Atmospheric Administration (NOAA) Winter Storm Reconnaissance (WSR) program, which has been operational since 2001. During each day of WSR, NOAA forecasters identify weather systems that may impact the contiguous United States and Alaska up to a week in advance and estimate the uncertainty associated with the forecast of each system. They pick a “target verification location” where the high-impact weather is centered and then subjectively assign a low, medium, or high priority to each case depending on the severity of the event and the potential impact to society. The ensemble transform Kalman filter technique (ETKF; Bishop et al. 2001) is then used to identify potential upstream “sensitive areas,” primarily over the northern Pacific Ocean, in which the assimilation of targeted observations is expected to maximally improve the subsequent forecast of the weather event in question. More specifically, the ETKF uses wind and temperature output at the 200-, 500-, and 850-hPa pressure levels from operational ensemble forecasts generated at the National Centers for Environmental Prediction (NCEP), the European Centre for Medium-Range Weather Forecasts (ECMWF), and the Canadian Meteorological Centre (CMC). Perturbations from these ensemble forecasts about their respective center’s ensemble means are used to predict error covariance matrices, and thereby the reduction in forecast error variance resulting from any potential deployment of targeted observations (e.g., a flight track). 
In other words, the variance of the “signal,” meaning the impact of the targeted observations using a difference total-energy metric, is predicted and mapped as a composite “summary map” that depicts sensitive areas for sampling, and also as a function of a predefined series of flight tracks (Majumdar et al. 2002a). Once the optimal flight tracks have been determined by the ETKF for the aircraft that release the global positioning system (GPS) dropwindsondes, a flight request is submitted two days prior to the actual flight deployment. The dropwindsonde data are then assimilated into operational global NWP systems. For more comprehensive details of the field of targeted observations, the interested reader is referred to review articles by Langland (2005) and Majumdar et al. (2011).
The decision to implement WSR in NOAA’s operations was based on the promising results of the North Pacific Experiment-98 (NORPEX-98) and experimental WSR field campaigns in 1999 and 2000, in which verification studies found that the majority of lower-resolution targeted forecasts were significantly improved (Langland et al. 1999; Szunyogh et al. 2000, 2002). Additionally, evaluations of the ETKF had demonstrated that it can efficiently and accurately predict the reduction in the error variance of 1–3-day forecasts as a result of targeted observations, prior to each deployment (Majumdar et al. 2001, 2002a). The broader-scale aspects of ETKF targets were largely found to agree with those of adjoint-based techniques such as singular vectors (Majumdar et al. 2002b). Recent studies have demonstrated the utility of the ETKF out to 7 days, with sensitive areas traceable as far upstream as Japan (Sellwood et al. 2008; Majumdar et al. 2010). Consequently, WSR aircraft have been stationed in Japan since 2009 to collect targeted observations, in an attempt to improve medium-range forecasts.
Since the advent of WSR, much has changed in numerical weather prediction (NWP), and there are concerns in the community that previous optimistic results from over a decade ago may not be replicable today. Forecast models are now much higher in resolution and incorporate better physical parameterizations, thus producing better prior forecasts for the data assimilation. Additionally, advanced data assimilation methods such as the four-dimensional variational data assimilation (4D-Var) are now operational at almost all NWP centers, further reducing analysis errors. The observing network is also more extensive than it was a decade ago, as is the assimilation of satellite data in operational NWP systems. Finally, there is concern that the areas that need to be sampled may be so prohibitively large that ~(10–20) additional dropwindsondes per flight may be inadequate (Langland 2005).
WSR has not recently performed careful data denial experiments with a modern data assimilation and forecast system, testing the forecast impact with and without the targeted observations. This paper reports on an attempt to perform such an experiment using 2011 WSR data and the ECMWF assimilation and forecast system. The hypothesis to be tested is as follows: given a reasonably selected set of targeted observations, forecasts that incorporate the assimilation of these additional observations will be significantly more skillful than forecasts that do not, and the extra observations will be especially important for cases with anticipated high-impact weather, often associated with rapidly developing cyclones and the rapid growth of forecast error. Examples of cases in which large forecast errors are associated with deepening cyclones are presented in Colle and Charles (2011). Further, we hypothesize that the impact of the targeted observations will be larger in specific downstream “verification regions” focused on the expected area with high-impact weather and that the impact will be smaller when evaluated over continental-sized areas.
2. Targeted data, model, and data assimilation system
The WSR program is coordinated each year by NOAA/NCEP, which has kept a log of daily flight requests, forecast lead times, verification times, target verification locations, and the priority of each forecast case (http://www.nco.ncep.noaa.gov/pmb/sdm_wsr/) from 2003 to the present. In 2011, a total of 776 dropwindsondes were deployed by the NOAA and U.S. Air Force (USAF) aircraft, which took off from four different air bases (Anchorage, Alaska; Biloxi, Mississippi; Yokota, Japan; and Honolulu, Hawaii). During the 2011 WSR period there were 22 high-priority cases, 62 medium-priority cases, and 14 low-priority cases. The forecast lead time associated with a given target verification for an event ranged from +12 to +120 h post-assimilation. The lead time was calculated as the difference between the forecast target verification time and the initialization time. A plot of the target verification locations during the 2011 WSR campaign from 10 January through 26 March 2011 is shown in Fig. 1, including the assigned priority for each target and the forecast lead time.
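As a concrete illustration of the lead-time definition above, the following minimal sketch computes the lead time for a single case. The function name and the example dates are hypothetical and are not taken from the WSR log.

```python
from datetime import datetime


def lead_time_hours(init_time: datetime, verif_time: datetime) -> float:
    """Forecast lead time in hours: target verification time minus initialization time."""
    return (verif_time - init_time).total_seconds() / 3600.0


# Hypothetical case: forecast initialized at 0000 UTC 15 Jan 2011,
# target verification time at 0000 UTC 18 Jan 2011.
init = datetime(2011, 1, 15, 0)
verif = datetime(2011, 1, 18, 0)
lead = lead_time_hours(init, verif)

# The 2011 cases had lead times between +12 and +120 h.
in_range = 12 <= lead <= 120
```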
Two parallel forecast experiments were carried out using the ECMWF’s 4D-Var data assimilation system and global weather forecast model for the period from 9 January through 28 March 2011. The first set included the 2011 WSR dropwindsonde data (“CONTROL”) and the second set excluded the dropwindsonde data (“NODROP”). In both the CONTROL and NODROP assimilation cycles, ~10^7 other observations were assimilated (i.e., the full data stream normally assimilated at ECMWF). In particular, the surface-based observations were SYNOP (measuring surface pressure, 10-m winds, and 2-m relative humidity), DRIBU (buoys measuring surface pressure and 10-m winds), radiosonde (measuring temperature, wind, and humidity profiles), aircraft (measuring temperature and wind profiles), profilers, and PIBAL (measuring wind profiles). From the geostationary platforms [i.e., the Meteorological Satellite (Meteosat), the Geostationary Operational Environmental Satellite (GOES), the Multifunctional Transport Satellite (MTSAT), and the Moderate Resolution Imaging Spectroradiometer (MODIS)], two different observation types were assimilated: atmospheric motion vectors (retrieved wind profiles) and infrared sounder radiances.
From the polar-orbiting platforms, the following were assimilated: Advanced Microwave Sounding Units A (AMSU-A) and B (AMSU-B), Microwave Humidity Sounder (MHS), and Meteosat Second Generation (MSG) (all measuring microwave-sounder radiance); Infrared Atmospheric Sounding Interferometer (IASI), Atmospheric Infrared Sounder (AIRS), and High Resolution Infrared Radiation Sounder (HIRS) (measuring infrared-sounder radiance); Special Sensor Microwave Imager (SSM/I), Special Sensor Microwave Imager/Sounder (SSM/IS), Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI), and Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) (microwave-imager radiance); Advanced Scatterometer (ASCAT) and European Remote Sensing Satellite (ERS) (retrieved wind product from microwave scatterometer backscatter coefficients); and GPS-Radio Occultation (measuring radio occultation bending angle). Both CONTROL and NODROP were cycled continuously for the entire campaign period, whether the targeted dropwindsonde data were available or not. When targeted observations were taken, subsequent deterministic forecasts were produced to +120-h lead. In all cases, the CONTROL analysis was used for verification, which may bias the results at the early leads slightly to favor the CONTROL forecasts. For both cycles, ECMWF used version 37r2 of their Integrated Forecast System (IFS; www.ecmwf.int/products/data/operational_system/evolution/evolution_2011.html). The resolution of the forecast model was T511 (~0.35° grid spacing on reduced linear Gaussian grid), with 91 vertical levels. The data assimilation, ECMWF’s 4D-Var system, uses a full nonlinear trajectory at T511 L91 (outer loop) and a linearized model (Janiskova and Lopez 2012) at the resolutions T159, T159, and T255 for the three minimization inner loops, respectively.
The ECMWF 4D-Var system also used background error variances “of the day” as estimated from the low resolution (T399 L91 outer loop, linearized T159 inner loops) ensemble data assimilation (Bonavita et al. 2011).
3. Description of norms used to evaluate forecast impact
The impact of assimilating the dropwindsonde data on ECMWF forecast skill was calculated using a crude approximation to the commonly used dry total-energy norm. This norm is similar to the total-energy metric used in the ETKF computations of signal variance. Let u represent a gridded state vector of forecast minus analysis differences for the u-wind component. Similarly, v, t, and p represent fields of differences in v-wind, temperature, and surface pressure, respectively. Then the error E for a domain A was
where the state vector subscripts denote the constant pressure level (250, 500, and 850 hPa) or the height above ground (10 and 2 m). Here cp represents the specific heat content of dry air at constant pressure (=1004 J K−1 kg−1), Tr is the reference temperature (=300 K), Rd is the gas constant for dry air (=287 J K−1 kg−1), and Pr is the reference pressure (=1000 hPa). The integral sign indicates that the error was integrated and averaged over the domain A, accounting for latitudinal variations in grid spacing. The domain A will differ with different tests. This approximation to the total-energy norm provides a little extra weight to near-surface fields, which may be desirable given their greater societal relevance. The impact was first evaluated in relatively confined verification regions, ±10° latitude and longitude around the verification location of interest at the specific lead time of the target forecast, which may change from +12 to +120 h depending on the case day. This size of verification region was chosen to closely resemble the 10° radius region used in previous WSR evaluations. Next, similar statistics were computed within a larger Pacific–North American (PNA) region covering North America and adjacent coastal waters (20°–75°N, 180°–320°E). Equitable threat scores and bias [Wilks 2006, Eqs. (7.18) and (7.10), respectively] were also computed over the contiguous United States (CONUS). Precipitation forecasts were evaluated at stations, bilinearly interpolating the forecast data to gauges within the CONUS that report 24-h accumulated amounts.
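The area-averaged error described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation used in the study: the factor of 1/2, the absence of a square root, the cos(latitude) weights for a regular latitude-longitude grid, and all function and field names are assumptions. Only the constants, the levels, and the ±10° box come from the text.

```python
import numpy as np

CP = 1004.0  # specific heat of dry air at constant pressure (J K^-1 kg^-1)
TR = 300.0   # reference temperature (K)
RD = 287.0   # gas constant for dry air (J K^-1 kg^-1)
PR = 1000.0  # reference pressure (hPa)


def box_mask(lat, lon, lat0, lon0, half_width=10.0):
    """Boolean (nlat, nlon) mask for a +/- half_width degree box around (lat0, lon0)."""
    # Wrap longitude differences into [-180, 180) so boxes may straddle 0 or 360.
    dlon = (lon - lon0 + 180.0) % 360.0 - 180.0
    return (np.abs(lat - lat0)[:, None] <= half_width) & (np.abs(dlon)[None, :] <= half_width)


def energy_error(diff, lat, mask):
    """Area-weighted mean of an approximate dry total-energy error (J kg^-1).

    diff maps hypothetical field names (e.g. "u250", "t2m", "p") to 2D
    forecast-minus-analysis arrays; "p" is surface pressure in hPa.
    """
    wind = sum(diff[f"u{k}"] ** 2 + diff[f"v{k}"] ** 2 for k in ("250", "500", "850", "10m"))
    temp = sum(diff[f"t{k}"] ** 2 for k in ("250", "500", "850", "2m"))
    # Assumed form: 1/2 [wind + (cp/Tr) temperature + Rd Tr (p/Pr)^2].
    energy = 0.5 * (wind + (CP / TR) * temp + RD * TR * (diff["p"] / PR) ** 2)
    # cos(latitude) accounts for the shrinking grid-cell area toward the pole.
    weights = np.cos(np.deg2rad(lat))[:, None] * mask
    return float((energy * weights).sum() / weights.sum())


# Synthetic example on a 1-degree grid covering the PNA region:
# unit differences everywhere (1 m/s, 1 K, 1 hPa).
lat = np.arange(20.0, 76.0, 1.0)
lon = np.arange(180.0, 321.0, 1.0)
names = [f"{var}{lev}" for var in ("u", "v") for lev in ("250", "500", "850", "10m")]
names += [f"t{lev}" for lev in ("250", "500", "850", "2m")] + ["p"]
diff = {name: np.ones((lat.size, lon.size)) for name in names}

mask = box_mask(lat, lon, lat0=45.0, lon0=250.0)  # hypothetical verification location
E = energy_error(diff, lat, mask)
```

Passing a mask of all `True` values gives the corresponding average over the whole grid, which is how a PNA-wide statistic could be obtained with the same function.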
4. Forecast impact
Figure 2 provides a comparison of the forecast errors for NODROP versus CONTROL. Figure 2a provides a scatterplot of the data, with the CONTROL errors on the abscissa and NODROP errors on the ordinate. There is a symbol associated with each case, with different symbols for the different lead times. Symbols above the diagonal line indicate cases with some improvement from the assimilation of dropwindsonde data. Figure 2b provides another way of viewing the differences, this time as a scatterplot as a function of the forecast lead time. Different symbols indicate the different priorities assigned to the cases. The solid line provides the mean difference for each lead time, and the dashed line indicates one standard deviation. While there are some slight positive differences, there are about as many negative differences. These 2011 data do not support the hypothesis that the differences with versus without targeted observations are statistically significant in the localized verification region. From visual inspection, there is no obvious relationship between the priority of the case and the impact; in fact, the forecast impact of high-priority cases appears well mixed with that of medium- and low-priority cases. Objective statistics as a function of the priority were not calculated because of the small sample sizes. Figure 3 provides the same type of information, but here over the PNA region. The forecast errors averaged over this larger area are also very similar between CONTROL and NODROP. A different forecast skill index such as the anomaly correlation (e.g., 500-hPa time series, not shown) also showed a similar lack of impact.
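The per-lead-time summary described for Fig. 2b (mean difference and one standard deviation) can be sketched as below. This is an illustrative sketch only: the function name, the sign convention (NODROP minus CONTROL, so that positive values indicate improvement from the dropwindsondes), the use of the sample standard deviation, and the numbers in the example are assumptions.

```python
import numpy as np


def impact_by_lead(lead_h, err_control, err_nodrop):
    """Mean, sample standard deviation, and count of (NODROP - CONTROL) error per lead time."""
    lead_h = np.asarray(lead_h)
    diff = np.asarray(err_nodrop, dtype=float) - np.asarray(err_control, dtype=float)
    summary = {}
    for lead in np.unique(lead_h):
        d = diff[lead_h == lead]
        # A standard deviation is undefined for a single case.
        std = d.std(ddof=1) if d.size > 1 else float("nan")
        summary[int(lead)] = (float(d.mean()), float(std), int(d.size))
    return summary


# Hypothetical errors for four cases at two lead times.
result = impact_by_lead(
    lead_h=[24, 24, 48, 48],
    err_control=[10.0, 12.0, 20.0, 22.0],
    err_nodrop=[11.0, 11.0, 21.0, 21.0],
)
```

In this made-up example the positive and negative differences cancel at each lead time, which mirrors the kind of neutral signal described above.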
We also examined the precipitation equitable threat scores and biases for both +24- to +48-h accumulations (Fig. 4) and for +48- to +72-h accumulations (Fig. 5) over the CONUS. The differences are not statistically significant.
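For reference, the equitable threat score and bias can be computed from a 2 × 2 contingency table of forecast and observed threshold exceedances, following the standard definitions in Wilks (2006). The sketch below is illustrative: the function name, the threshold, and the sample values are hypothetical, and the thresholds used in the study are not specified here.

```python
import numpy as np


def ets_and_bias(forecast, observed, threshold):
    """Equitable threat score and frequency bias for exceedance of a threshold.

    Both scores are undefined (division by zero) when the event is never
    forecast or never observed in the sample.
    """
    f = np.asarray(forecast) >= threshold
    o = np.asarray(observed) >= threshold
    hits = np.sum(f & o)
    false_alarms = np.sum(f & ~o)
    misses = np.sum(~f & o)
    n = f.size
    # Number of hits expected by chance given the forecast and observed frequencies.
    hits_random = (hits + false_alarms) * (hits + misses) / n
    ets = (hits - hits_random) / (hits + false_alarms + misses - hits_random)
    bias = (hits + false_alarms) / (hits + misses)
    return float(ets), float(bias)


# Hypothetical 24-h accumulations (mm) at six gauges, with a 5-mm threshold.
fcst = [0.0, 6.0, 12.0, 3.0, 8.0, 0.0]
obs = [0.0, 7.0, 2.0, 9.0, 10.0, 1.0]
ets, bias = ets_and_bias(fcst, obs, threshold=5.0)
```

Here there are two hits, one false alarm, and one miss, so the bias is 1 (the event is forecast as often as it is observed) while the equitable threat score is well below its perfect value of 1.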
5. Discussion and conclusions
This study has briefly summarized the impact from the assimilation of targeted observations from the 2011 Winter Storm Reconnaissance program. Parallel cycles of ECMWF’s data assimilation and deterministic forecasts were conducted, including and excluding the targeted observations with the rest of the regularly assimilated data. Differences were not statistically significant. The 2011 results do not support the hypothesis that forecasts are statistically significantly improved in the localized verification region by the assimilation of these dropwindsondes. There may be several reasons for the lack of impact noted here. Observing systems have become denser in the ~10 years since the last systematic, peer-reviewed studies including the Pacific basin, with more cloud-track winds, aircraft, satellite radiance, and radio occultation data from global positioning satellites. Many other observing systems might also show relatively limited impact if they were evaluated in a similar observing system experiment. Data assimilation and forecast systems have improved as well. Additionally, it is recognized that a handful of dropwindsondes will incompletely sample the initial sensitive area because of limitations on how far and where the plane deploying them can fly. It is also worth recognizing that while the ETKF targeting technique has quantitatively proven to be skillful in predicting signal variance for short-range forecasts of winter weather, it is imperfect and also inconsistent with the operational data assimilation scheme used in this study. One might expect the ETKF to be more effective if an ensemble-based data assimilation scheme is used to assimilate the targeted data. However, it is generally accepted (e.g., Majumdar et al. 2011) that the targeting method is not the first-order problem.
It is possible that data from different years or seasons have a different impact. Recently, R. Gelaro (2012, personal communication) found that using National Aeronautics and Space Administration’s (NASA’s) adjoint sensitivity method and their assimilation system (Gelaro et al. 2010), the assimilated dropwindsonde data had a large positive impact on a global measure of 24-h forecast error in several cases during WSR 2012. However, these impact results have not yet been measured with an observing system experiment such as the one conducted here.
For the foreseeable future, the global observing network will continue to have regions with relatively sparse in situ data. The challenge will be to supplement the existing network in the most cost-effective manner. WSR plane flights into the central Pacific are quite expensive, with fuel costs alone typically in the tens of thousands of U.S. dollars. In a comparison study of observation impacts in three forecast systems, Gelaro et al. (2010) showed that only a small majority of the total number of assimilated observations actually improve the 24-h forecast, with much of the improvement coming from a large number of observations having relatively small individual impacts. Those authors argue that accounting for this behavior may be especially important when considering strategies for deploying adaptive components of the observing system. Given this and the results of the present study, we suggest refocusing the targeting concept to use available resources such as high-resolution satellite data. Sensitive areas, whether they are determined by forecasters or by objective algorithms, can potentially be monitored more closely by turning on the rapid-scan feature on geostationary satellites and then assimilating a denser network of motion vectors, such as in Berger et al. (2011). A denser network of radiance data could perhaps also be assimilated in sensitive regions (Bauer et al. 2011).
Members of the THORPEX Data Assimilation and Observing Systems Committee are thanked for providing guidance on the experimental design and the methods for verification and for informal reviews of this manuscript. Publication of this article was supported with a grant from the NOAA/THORPEX program, managed by John Cortinas, director of the Office of Weather and Air Quality.