Performance in the prediction of hurricane intensity and associated hazards has been evaluated for a newly developed convection-permitting forecast system that uses ensemble data assimilation techniques to ingest high-resolution airborne radar observations from the inner core. This system performed well for three of the ten costliest Atlantic hurricanes: Ike (2008), Irene (2011), and Sandy (2012). Four to five days before these storms made landfall, the system produced good deterministic and probabilistic forecasts of not only track and intensity, but also of the spatial distributions of surface wind and rainfall. Averaged over all 102 applicable cases that have inner-core airborne Doppler radar observations during 2008–2012, the system reduced the day-2-to-day-4 intensity forecast errors by 25%–28% compared to the corresponding National Hurricane Center’s official forecasts (which have seen little or no decrease in intensity forecast errors over the past two decades). Empowered by sufficient computing resources, advances in both deterministic and probabilistic hurricane prediction will enable emergency management officials, the private sector, and the general public to make more informed decisions that minimize the losses of life and property.
BACKGROUND AND INTRODUCTION.
Hurricane Sandy (2012) is a fresh reminder that hurricanes are among the worst natural disasters, with the potential for tremendous losses of life and property; accurate forecasts of these storms thus have significant socioeconomic value. Over the last few decades, operational track forecasts by the National Hurricane Center (NHC) have improved substantially: the average position error for 72-h forecasts is currently less than half of that 20 years ago (Fig. 1a). In contrast, there has been little or no decrease in intensity forecast errors [in terms of maximum surface wind speed (Fig. 1b)], except at long lead times (days 4–5). The reason for this discrepancy is straightforward. Hurricane track is determined mainly by large-scale environmental flows that have become much better analyzed and forecast by global Numerical Weather Prediction (NWP) models thanks to improvements in model spatial resolution and physics, advanced data assimilation techniques capable of ingesting observations from an enhanced global network, and an exponential growth in computing resources. Hurricane intensity and structure are regulated somewhat by the large-scale environment, but are also strongly dependent on smaller-scale processes that are nonlinear and chaotic in nature (such as moist convection and inner-core dynamics), and thus harder to observe, resolve, and predict.
Factors contributing to the slow progress in intensity forecasting include insufficient model spatial resolution, a lack of adequate routine observations to resolve the inner-core structure, and the absence of an efficient data assimilation technique. Under the auspices of the Hurricane Forecast Improvement Project (HFIP; www.hfip.org) of the National Oceanic and Atmospheric Administration (NOAA), this study presents a prototype future hurricane prediction system that performs cloud-permitting ensemble analysis and forecasting with an advanced data assimilation technique (the ensemble Kalman filter, hereafter EnKF) that ingests high-resolution airborne radar observations of the inner core. Since 2008, in collaboration with NOAA and the Texas Advanced Computing Center (TACC), this experimental system, primarily developed at The Pennsylvania State University (PSU), has been running on high-performance computing facilities in real time for all Atlantic storms with successful reconnaissance missions conducted by Doppler-equipped NOAA P3 aircraft (www.aoml.noaa.gov/hrd/aircraft.html); since 2011 it has been designated by NOAA as a pseudo-operational model product. The PSU real-time experimental predictions of Atlantic hurricanes are also freely available online (http://hfip.psu.edu). Following the success of the real-time application of the PSU airborne radar data assimilation for hurricane intensity prediction, along with extensive experimental tests at NOAA, the regional dynamical NWP model used operationally by NHC for hurricane intensity guidance, the Hurricane Weather Research and Forecasting Model (HWRF), started for the first time to ingest the airborne radar observations in real time to initialize the forecast (refer to www.nhc.noaa.gov/archive/2013/al12/al122013.discus.007.shtml?).
AIRBORNE DOPPLER RADAR DATA AND HURRICANE PREDICTION SYSTEM.
Our experimental forecasts span a total of 102 applicable airborne Doppler missions for 22 Atlantic storms from 2008 through 2012. These include the PSU experimental real-time prediction for 2012 as part of the HFIP demonstration project (independent verification made by HFIP is available online at www.ral.ucar.edu/projects/hfip/d2012/verify/) and the retrospective runs by the same system for 2008–2011 (independent verification made by HFIP available at www.ral.ucar.edu/projects/hfip/includes/h2012/2012-Stream15-PSU.pdf; also in Gall et al. 2013). There are 6 storms (Dolly, Fay, Gustav, Ike, Kyle, Paloma) in 2008, 3 storms (Ana, Bill, Danny) in 2009, 6 storms (Alex, Two, Earl, Karl, Richard, Tomas) in 2010, 4 storms (Irene, Lee, Ophelia, Rina) in 2011, and 3 storms (Isaac, Leslie, Sandy) in 2012. Three of the top ten costliest and deadliest hurricanes occurred during this period: Ike (2008), Irene (2011), and Sandy (2012). Fig. ES2 shows the intensity-coded tracks of the observed storms, while Tables ES1 and ES2 provide an extensive list of the NOAA P3-sampled storms as well as the time, duration, and number of super-observations (SOs) that were assimilated for each mission. The SO procedure documented in Weng and Zhang (2012) provides quality control and thins the voluminous airborne Doppler velocity observations to a spatial resolution comparable to that of the assimilation and forecasting system; the significantly thinned data can then be transmitted in real time from the aircraft to the NOAA Hurricane Research Division (HRD) to allow for timely assimilation into the forecast model. The SO procedure has since been implemented operationally in all P3 and G-IV reconnaissance missions, and the SOs are archived at the NOAA website: ftp://ftp.aoml.noaa.gov/pub/hrd/gamache/FuqingSO.
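The thinning idea behind super-observations can be illustrated with a deliberately simplified sketch. It is not the Weng and Zhang (2012) procedure, whose quality-control steps are not reproduced here; it only averages raw radial-velocity observations that fall in the same horizontal box. The function name, the 3-km box size, the minimum count, and the sample values are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def make_super_obs(obs, box_km=3.0, min_count=2):
    """Average (x_km, y_km, velocity) tuples that share a horizontal box.

    Hypothetical simplification: box_km and min_count are assumed values,
    not parameters taken from the SO procedure cited in the text.
    """
    boxes = defaultdict(list)
    for x, y, v in obs:
        key = (int(x // box_km), int(y // box_km))
        boxes[key].append((x, y, v))

    super_obs = []
    for key in sorted(boxes):
        members = boxes[key]
        if len(members) < min_count:
            continue  # crude quality control: skip sparsely sampled boxes
        super_obs.append((
            mean(m[0] for m in members),  # mean x position (km)
            mean(m[1] for m in members),  # mean y position (km)
            mean(m[2] for m in members),  # mean radial velocity
        ))
    return super_obs


# Made-up observations: five raw values collapse to two super-observations.
raw = [(0.5, 0.5, 30.0), (1.5, 2.5, 34.0), (4.0, 1.0, 40.0),
       (7.0, 7.0, 20.0), (8.0, 8.5, 24.0)]
sos = make_super_obs(raw)
```

Averaging within boxes whose size matches the model grid spacing reduces the data volume while keeping roughly one value per resolvable scale, which is the property the text attributes to the SO procedure.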
The prototype ensemble analysis and prediction system developed for this study uses Version 3.4.1 of the Weather Research and Forecasting (WRF) model with a purpose-built EnKF data assimilation algorithm. The model configuration is very similar to that of Zhang et al. (2011), but with an increase in model spatial resolution (the grid spacing of the innermost domain is decreased from 4.5 km to 3 km) and an improved parameterization of fluxes across the air–sea interface. The WRF model used for this study has 43 vertical levels and 3 two-way-nested domains (D1 to D3) with horizontal grid spacings of 27, 9, and 3 km covering areas 10,200 km × 6,600 km, 2,700 km × 2,700 km, and 900 km × 900 km, respectively (Fig. ES1). The outermost domain (D1) covers the Caribbean Sea and Gulf of Mexico as well as much of North America and the North Atlantic Ocean. D2 and D3 are “vortex following”—moving throughout the forecast such that the storm center is always in the middle of these inner domains. There is no ocean model coupled to WRF in this study. The EnKF configuration for the current study is the same as that used in Weng and Zhang (2012), except that the number of ensemble members is increased from 30 to 60. The current study focuses exclusively on the added value of assimilating airborne Doppler observations (this real-time system has since been expanded to ingest other in situ or remotely sensed inner-core data from reconnaissance aircraft and/or satellites from 2013). The ensemble is initialized by the NOAA Global Forecast System (GFS) operational analysis 6–12 h before a scheduled airborne Doppler mission, and uses the corresponding GFS operational forecast for its boundary conditions.
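As a quick consistency check on the nested-domain configuration, the sketch below derives the approximate number of grid cells in each domain from the stated grid spacings and domain extents. The cell counts are computed here by simple division and are not stated in the text; the actual WRF dimensions may differ slightly because of grid staggering and rounding.

```python
# Grid spacing (km), east-west extent (km), north-south extent (km), as stated above.
DOMAINS = {
    "D1": (27, 10200, 6600),
    "D2": (9, 2700, 2700),
    "D3": (3, 900, 900),
}


def approx_cells(dx_km, width_km, height_km):
    """Approximate cell counts along each axis (derived, not from the source)."""
    return round(width_km / dx_km), round(height_km / dx_km)


cells = {name: approx_cells(*spec) for name, spec in DOMAINS.items()}

# Parent-to-child grid-spacing ratios for the two nests.
nest_ratios = (DOMAINS["D1"][0] // DOMAINS["D2"][0],
               DOMAINS["D2"][0] // DOMAINS["D3"][0])
```

Under these assumptions the two vortex-following nests have the same number of cells, so refining from 9 km to 3 km shrinks the covered area rather than increasing the grid size.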
FIVE-YEAR PERFORMANCE OF THE WRF-ENKF HURRICANE PREDICTION SYSTEM.
The deterministic forecast in each case is initialized by the EnKF analysis (time-shifted to the closest 0000, 0600, 1200, or 1800 UTC) that assimilates the airborne Doppler radar observations in D1–D3, and then integrated forward for 126 h. Figures 2a,b show the mean absolute forecast errors—verified against poststorm best-track data estimated by the NHC—of track and intensity (in terms of maximum 10-m wind speed) for the WRF-EnKF system in comparison with the NHC official forecasts issued at the same synoptic time. A homogeneous comparison was applied, which means that these errors are averages over the same number of forecasts for both WRF-EnKF and the NHC official forecasts at each forecast time (Table ES3). Following a procedure known as “variable interpolator” for the “late models” used operationally at NHC (more details can be found at www.nhc.noaa.gov/modelsummary.shtml), a simple case-dependent bias correction (Fig. 2, bottom) is used to modify the maximum wind speed of the WRF-EnKF deterministic forecasts at 6–30-h lead times. The bias correction applies the full adjustment—the difference between each model forecast and the best track at 6 h—to the time-lagged forecast out to 18 h, with a linearly decreasing adjustment from 18 to 30 h (but no adjustment for the remainder of the forecast). Despite comparable errors in forecast track (Fig. 2a), the intensity forecasts from the WRF-EnKF system substantially outperformed the NHC official forecasts, with an error reduction between 15% and 43% for lead times of at least 24 h (Fig. 2b, Table ES3).
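The bias correction described above can be written as a weight that depends only on lead time. The sketch below is a minimal illustration under stated assumptions: the offset is taken as best track minus model at 6 h (the sign convention is assumed), the function names are hypothetical, and the wind values are made up.

```python
def adjustment_weight(lead_h):
    """Fraction of the 6-h offset applied at a given lead time (hours)."""
    if lead_h <= 18:
        return 1.0                    # full adjustment out to 18 h
    if lead_h >= 30:
        return 0.0                    # no adjustment from 30 h onward
    return (30 - lead_h) / 12.0       # linear decrease between 18 and 30 h


def bias_correct(forecast_kt, offset_kt):
    """Apply the lead-time-weighted offset to a {lead_h: max_wind_kt} forecast."""
    return {lead: wind + adjustment_weight(lead) * offset_kt
            for lead, wind in forecast_kt.items()}


# Made-up example: the model is 10 kt too strong at 6 h.
raw_forecast = {6: 80.0, 12: 85.0, 18: 90.0, 24: 95.0, 30: 100.0, 36: 100.0}
best_track_6h = 70.0
offset = best_track_6h - raw_forecast[6]   # assumed sign: observed minus model
corrected = bias_correct(raw_forecast, offset)
```

In this example the 24-h value receives half of the offset, and values at 30 h and beyond are unchanged.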
However, it is worth noting that, given the time needed to process and assimilate the observations and integrate the model, the WRF-EnKF forecast in our experiment would have been classified by NHC as one of the “late guidance models,” since it would not be available to forecasters at the same synoptic time. We thus further compared the WRF-EnKF forecasts with the NHC official forecasts issued at the next synoptic time, 6 h later, by interpolating the WRF-EnKF (APSU) forecast to that later time (Fig. 2, bottom). For example, the 06-h forecast initialized at 0000 UTC is interpolated and treated as the 00-h forecast at 0600 UTC, and then compared with the NHC official 00-h forecast at 0600 UTC (the interpolation method is similar to that used for all late-model guidance at NHC). With the late-model treatment, the WRF-EnKF forecast is still comparable to the NHC official forecasts in track (with noticeable degradation from day 3 to day 5; Fig. 2, bottom left), but the advantage over the NHC official intensity prediction is noticeably reduced, especially during day 1 and day 5 (Fig. 2b versus Fig. 2, bottom right). Nevertheless, the mean absolute intensity forecast error for the WRF-EnKF system is still 25%–28% lower than that of the NHC official forecasts from day 2 through day 4.
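The time shift used in the late-model comparison, and the percentage error reduction quoted above, can be sketched as follows. This is a simplified illustration: it only relabels lead times by 6 h and does not reproduce the operational NHC interpolator. The function names and all numerical values are hypothetical.

```python
def shift_to_later_cycle(forecast, shift_h=6):
    """Relabel a {lead_h: value} forecast as if issued shift_h hours later.

    The original 6-h value becomes the new 0-h value, and so on; leads
    earlier than shift_h are dropped.
    """
    return {lead - shift_h: value
            for lead, value in sorted(forecast.items())
            if lead >= shift_h}


def percent_reduction(model_mae, reference_mae):
    """Error reduction of the model relative to a reference forecast, in percent."""
    return 100.0 * (reference_mae - model_mae) / reference_mae


# Made-up maximum wind speeds (kt) from a forecast initialized at 0000 UTC.
early = {0: 60.0, 6: 65.0, 12: 70.0, 18: 75.0}
late = shift_to_later_cycle(early)          # treated as the 0600 UTC forecast

# Made-up mean absolute errors (kt), for illustration only.
reduction = percent_reduction(model_mae=9.0, reference_mae=12.0)
```

Because the shifted forecast loses its first 6 h, a model run out to 126 h still provides a full 120-h forecast after the late-model treatment.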
Although we have included all 102 applicable airborne Doppler missions for the Atlantic basin during 2008–2012, we acknowledge that the current study is limited by the still-small sample size, especially at long lead times. Under continued HFIP collaboration, ongoing experimental efforts at NOAA and PSU seek to considerably expand the sample size by including other airborne inner-core observations, such as dropsondes and flight-level in situ and remote-sensing measurements from all hurricane reconnaissance flights, as well as through continuously cycled data assimilation that includes routine observations.
HIGHLIGHTS FROM FORECASTS OF HURRICANES IKE (2008), IRENE (2011), AND SANDY (2012).
Hurricanes Ike (2008), Irene (2011), and Sandy (2012), all among the top 10 costliest Atlantic storms on record, were the 3 deadliest and costliest storms during the 5-year period. Figure 3 shows exemplar WRF-EnKF ensemble and deterministic forecasts of track and intensity for these storms from the PSU 2012 experimental real-time system (part of the HFIP real-time demonstration project); note that the Ike and Irene experiments are retrospective runs, whereas those for Sandy are real-time runs. The forecasts are initialized at the end of the EnKF assimilation window, around four days before final landfall; the ensemble forecasts are from the individual members, whereas each deterministic forecast uses the ensemble mean. For consistency with Figs. 2a,b, the WRF-EnKF system as configured for the 2012 season is used in all three cases.
Reliable cloud-permitting deterministic and ensemble forecasts initialized with assimilation of high-resolution observations provide the potential for a fundamental shift away from emphasis on the “point metrics” of track (position of the hurricane’s center) and intensity (maximum 10-m wind speed anywhere in the storm) toward the use of more accurate and quantitative products that provide location-specific predictions, with associated uncertainties, of weather hazards such as rainfall and winds. Such products would alleviate the major limitation of point metrics: they cannot describe the dangerous and damaging weather that hurricanes cause over large areas far from their centers. More accurate and quantitative location-specific forecast products would allow emergency managers, businesses, and individuals to allocate resources more effectively and efficiently (examples include prestorm evacuations and placement of power line repair crews).
While it remains an active area of research, location-specific metrics, like the NHC “Cone/Warnings” products (www.nhc.noaa.gov/aboutnhcprobs3.shtml), will most efficiently convey the forecasts to emergency managers and the general public. One such model product is the maximum wind swath (left column of Fig. 4) derived from hourly model forecasts, which shows for each of the three storms the maximum 10-m wind speed that each model grid point is predicted to experience over the entirety of the WRF-EnKF deterministic forecast. For example, although Ike was correctly predicted by the WRF-EnKF deterministic forecast to make landfall as a Category 2 hurricane, the maximum wind swath shows hurricane-force wind speeds (64 kt or greater) over land to be confined to a small area along the Gulf Coast in extreme east Texas (Fig. 4a). For both Irene and Sandy (Figs. 4e,i), the WRF-EnKF correctly predicted barely any hurricane-force winds over land. These wind swath plots indicate not only the areas potentially impacted by varying degrees of wind intensity, but also the size of the storms (as verified by surface observations, Fig. ES5). A related product (Figs. 4b–d, f–h, and j–l) is the probability, derived from the ensemble forecasts, of 10-m wind speed exceeding various threshold values (i.e., tropical storm force winds of 35 kt, gale force winds of 50 kt, and hurricane-force winds of 64 kt), which provides a measure of forecast uncertainty. This product may supplement the current operational wind speed probability forecasts by NHC, which are based on a Monte Carlo method that randomly samples from the track and intensity forecast error distributions of the operational forecast center over the past five years.
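Both products reduce to simple operations over the forecast grids, as the following sketch suggests. It assumes that the exceedance probability is the fraction of ensemble members whose own wind swath reaches the threshold at a grid point; the text does not specify this detail. The function names and the tiny two-point "grids" are hypothetical.

```python
def max_wind_swath(hourly_fields):
    """Maximum 10-m wind speed at each grid point over all forecast hours.

    hourly_fields is a list (time) of 2D lists (y, x) of wind speed in kt.
    """
    ny, nx = len(hourly_fields[0]), len(hourly_fields[0][0])
    return [[max(field[j][i] for field in hourly_fields) for i in range(nx)]
            for j in range(ny)]


def exceedance_probability(member_swaths, threshold_kt):
    """Fraction of members whose swath meets or exceeds the threshold.

    Assumption: each member counts equally and is judged on its own swath.
    """
    n = len(member_swaths)
    ny, nx = len(member_swaths[0]), len(member_swaths[0][0])
    return [[sum(1 for s in member_swaths if s[j][i] >= threshold_kt) / n
             for i in range(nx)]
            for j in range(ny)]


# Made-up data: four members, two forecast hours, a 1 x 2 grid.
members_hourly = [
    [[[30.0, 50.0]], [[40.0, 66.0]]],
    [[[20.0, 60.0]], [[35.0, 62.0]]],
    [[[34.0, 70.0]], [[10.0, 20.0]]],
    [[[50.0, 40.0]], [[45.0, 30.0]]],
]
swaths = [max_wind_swath(hourly) for hourly in members_hourly]
prob_64kt = exceedance_probability(swaths, 64.0)
prob_35kt = exceedance_probability(swaths, 35.0)
```

The same exceedance function could be applied to event-total rainfall from each member with a threshold in millimeters.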
While hurricanes are best known for their extremely powerful winds, a hazard that can be just as devastating is severe flooding caused by prolonged heavy rainfall. This was the case for Irene, which caused record flooding over much of New England (refer to the NHC report at www.nhc.noaa.gov/data/tcr/AL092011_Irene.pdf). For all three cases, in addition to providing good track and intensity forecasts, the WRF-EnKF deterministic forecasts of event total rainfall (middle column of Fig. 5) verified qualitatively well with the NOAA gridded observational analyses (left column of Fig. 5) (http://water.weather.gov/precip/). For Ike, the WRF-EnKF system skillfully predicted not only the heavy rainfall along the coast associated with the storm’s inner core and outer rainband, but also the area of precipitation stretching from southeastern New Mexico all the way to Chicago associated with the storm’s outer fringes. Similar skill is evident in the WRF-EnKF forecast for Irene. The deterministic rainfall forecast for Sandy is even more noteworthy: in addition to depicting heavy precipitation along the mid-Atlantic coast, the WRF-EnKF forecast skillfully predicted the location and structure of the heavy precipitation along the Appalachians in West Virginia and Pennsylvania, and along the southern shores of Lake Erie. As with surface wind speed, the precipitation uncertainty can also be depicted by the ensemble members: the right column of Fig. 5 shows the probability of event total precipitation exceeding 100 mm at any given location.
Such high-resolution deterministic and ensemble forecasts can modernize the prediction of hurricane intensity (and associated hazards), as demonstrated to be feasible in our experimental real-time system. Essential ingredients of such future hurricane systems are enhanced inner-core observations that can be ingested efficiently by advanced data assimilation algorithms, state-of-the-science forecast models capable of resolving inner-core dynamics, and sufficient computing resources to perform ensemble-based probabilistic analysis and forecasts.
In all three events, the experimental system provided excellent deterministic forecasts of both track and intensity that are comparable to or better than the NHC official forecasts, consistent with the summary performance (Figs. 2a,b). Most notably, the system correctly predicted Ike’s landfall near Galveston, whereas the NHC official forecast placed landfall far to the left (Fig. 3a). The WRF-EnKF also captured the reintensification of Sandy before its final landfall on the New Jersey coast (Fig. 3f). For Irene, the WRF-EnKF intensity forecasts were better than the NHC official forecasts (possibly because the former had a slightly better and more inland track forecast), although both had a considerable and consistent high bias (Fig. 3e).
While it may be difficult to systematically evaluate the performance of any probabilistic forecast with a limited number of cases, the ensemble forecasts (also shown in Fig. 3) initialized with the EnKF perturbations provide case-dependent uncertainties associated with the deterministic track and intensity forecasts discussed above. For both track and intensity, the spread of the ensemble forecasts in all three cases covered the best track reasonably well (as well as the WRF-EnKF deterministic and NHC official forecasts), except that the intensity in both the WRF-EnKF and NHC official forecasts is stronger than in the best track.
It is clear from these ensemble forecasts that the track uncertainty is greater for Ike and Sandy than for Irene, which is consistent with the flow and track patterns for these storms before and during landfall: storms with curved paths (Ike and Sandy) are usually less predictable than storms with straight paths (Irene). The larger uncertainty associated with the Ike and Sandy track forecasts was also reflected in the larger disagreement among several dynamic models relied upon by NHC when issuing official forecasts (Fig. ES4). Systematic evaluations of the deterministic and ensemble forecasts by the PSU WRF-EnKF real-time system for Hurricane Sandy can be found in Munsell and Zhang (2014).
This study examines the performance of an experimental hurricane prediction system developed at PSU based on the convection-permitting WRF model and an advanced data assimilation technique known as the EnKF. The system uses the EnKF to ingest high-resolution airborne Doppler radar observations of the hurricane’s inner core to provide a more realistic vortex to the WRF model. The PSU WRF-EnKF system demonstrated promising performance for all landfalling hurricanes from 2008 through 2012: averaged over all 102 applicable airborne Doppler missions, errors in forecast intensity for lead times of 2 to 4 days were 25%–28% lower than those of the corresponding official forecasts issued by the National Hurricane Center. Highlights of this experimental system include the promising real-time forecasts of track and intensity for 3 of the 10 costliest Atlantic hurricanes: Ike (2008), Irene (2011), and Sandy (2012). This is the first comprehensive study to demonstrate that hurricane intensity prediction may be improved through a combination of an advanced data assimilation algorithm capable of efficiently ingesting high-resolution inner-core observations, state-of-the-science forecast models that can resolve inner-core dynamics, and sufficient computing resources to perform ensemble-based probabilistic analyses and forecasts.
The authors wish to thank Ben Green, Wei Li, and Chris Cappella for their insightful comments and suggestions, and Green for proofreading an earlier version of the manuscript. We also benefitted greatly from formal review comments by two anonymous reviewers and Chris Landsea. This research is partially supported by the NOAA HFIP Program, NSF Grants 0840651 and 1305798, Office of Naval Research Grant N000140910526, and NASA Grant NNX12AJ79G. We also acknowledge the collaboration of many entities and individuals at NOAA and/or HFIP, without whom this extensive work would not have been feasible. Special thanks are due to John Gamache, Frank Marks, Fred Toepfer, Bob Gall, James Franklin, Ed Rappaport, and Ron Ferek for their input and support. The computing was conducted on the NOAA ESRL and the NSF-sponsored TACC high-performance computer clusters.
FOR FURTHER READING
A supplement to this article is available online (DOI:10.1175/BAMS-D-13-00231.2)