A regional-scale weather model is used to determine the potential for flood forecasting based on model-predicted rainfall. Extreme precipitation and flooding events are a significant concern in central Texas, due to both the high occurrence and severity of flooding in the area. However, many current regional prediction models do not provide sufficient accuracy at the watershed scale necessary for flood mitigation efforts. The Weather Research and Forecasting (WRF) model, created with the purpose of improving upon the current fifth-generation Pennsylvania State University–National Center for Atmospheric Research (PSU–NCAR) Mesoscale Model (MM5), is specifically designed for regional grid spacings of 1–10 km. Previous research by the authors resulted in the development of a regional-scale prediction system over the San Antonio River basin, using a geographic information system (GIS) database, a hydrologic model, and a hydraulic model. Observed precipitation drives the prediction system; the authors hypothesize that the WRF model has the potential to predict flooding, at a lead time of several days, with a level of accuracy near that of observed precipitation. Causes of model error are also investigated, to determine the relative errors caused by model physics, initialization interval, buffer zone and domain size, and small-amplitude random errors. Results show that the Betts–Miller–Janjić cumulus and Lin et al. microphysics schemes, 48-h initialization interval, and two-domain configuration covering minimal ocean and having a parent-to-nest area ratio of greater than 10 best simulates a recent (July 2002) large storm event over the San Antonio River basin. This particular storm was selected because it produced extreme rainfall volumes and intensities, and also because its meteorological characteristics are typical of central Texas storm events. Location errors in rainfall are most significant because of their typically nonlinear patterns (increasing location error does not linearly modify streamflow output). Errors in intensity and timing show a more predictable (linear) watershed response that may be useful in the estimation of streamflow ranges for flood forecasting.
Precipitation is the single most difficult and often erroneously modeled parameter in numerical weather models (Wang and Seaman 1997; Nielsen-Gammon et al. 2006). Determining the appropriate grid spacing that captures both small- and large-scale processes is vital to accurately representing storm events. For example, neglecting the subgrid-scale variability of precipitation has been shown to result in an underestimation of the total volume and runoff and, consequentially, an overestimation of the evapotranspiration (Wang et al. 2005). Mitigation decisions based on local-scale engineering models may not fully consider the hydrologic effects of large storm systems. Regional weather models have been successfully applied to the forecasting of hydrological phenomena and quantitative precipitation forecasts (Fovell 2006; Marchok et al. 2007; Westrick et al. 2002). A nested regional model provides a realistic method for modeling flood events at the watershed scale by capturing synoptic-scale triggers over the entire domain and downscaling to capture mesoscale factors over the region of interest. This approach has proved successful at various grid spacings (Liang et al. 2004a, b; Smedsmo et al. 2005).
Systematic errors still exist in regional models. Major causes of model error are investigated in this study. Since causes of error vary with the specific region and climate of interest, a regional model must be adapted for each application (Giorgi and Mearns 1999). Major errors arise from model physics, domain and buffer zone treatments, and initial and boundary conditions. Even after accounting for major sources of error, small-amplitude random errors may remain in the model.
The implementation of various physics schemes causes a large variation in the forecast output (Zhang et al. 2006), especially the choice of cumulus scheme (Gallus and Segal 2001). The particular skill of a cumulus scheme in simulating rainfall is dependent upon the region and storm being modeled (Giorgi and Mearns 1999). The Grell–Devenyi (GD) scheme has displayed widespread skill in simulating precipitation (Giorgi and Shields 1999; Warner and Hsu 2000; Xu and Small 2002; Zhang et al. 2006). However, several studies have shown the Kain–Fritsch (KF) scheme to produce realistic precipitation over the North American monsoon region (Gochis et al. 2002; Leung et al. 2003; Liang et al. 2004b); other research has demonstrated the strength of the Betts–Miller–Janjić (BMJ) scheme at producing accurate forecasts (Baik et al. 1991; Vaidya and Singh 2000). Emanuel and Zivkovic-Rothman (1999) found that the BMJ scheme performs consistently with observations for relative humidity from the surface up to 500 mb. Jankov and Gallus (2004) found that neither the BMJ nor the KF scheme accurately reproduced mesoscale convective systems (MCS) in the Upper Midwest, with the BMJ scheme consistently overestimating and the KF scheme underestimating rainfall.
Cloud microphysics is also an important parameterization in the production of precipitation (Fritsch and Carbone 2004; Smedsmo et al. 2005). Cloud microphysics schemes vary widely. Some schemes only include water classes (e.g., Kessler 1969). Other schemes include some ice hydrometeors such as the Ferrier (1994) method with four total hydrometeor classes, the Hong and Lim (2006) six-class scheme with graupel, or the Thompson et al. graupel scheme (Thompson et al. 2004). Microphysics schemes for the present study were chosen from those available in version 2 of the Weather Research and Forecasting (WRF) model. The scheme most suited to the model application varies with location; for example, the Lin scheme (Lin et al. 1983) may be appropriate for hail-bearing storms in the Midwest (Smedsmo et al. 2005).
Emanuel and Zivkovic-Rothman (1999) assert that correct microphysics is integral to determining cumulus convection and hence precipitation. They postulate that reevaporation of condensed water, currently not included in most parameterizations, allows for correct estimates of convective moistening. Their cumulus parameterization controls entrainment and detrainment by allowing the level of natural buoyancy to vary depending on the moisture levels in cloud air. Both BMJ (Janjić 1994) and Emanuel’s scheme are comparable to the observed relative humidity profiles; other schemes tested were less so.
Errors in the driving initial and boundary conditions can cause large variations in forecast output. Liang et al. (2004b) found large uncertainties in boundary conditions, mostly over oceans and other areas lacking complete data, contributed greatly to model error. The use of the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis caused less sensitivity to model domain when compared to the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (Liang et al. 2001). The initialization interval is important because of the spinup time needed for model adjustment, as well as the inability of most weather models to accurately forecast beyond several days. In this study, the initialization interval is taken to mean the difference in initialization times between each series of model runs over the storm event. A more detailed description of this methodology can be found in section 4b.
The size of the model domains and the interaction between nested domains greatly impact forecasts. If the domain edges are too close to the area of interest, edge effects may cause inaccuracies in the forecast. For example, Seth and Giorgi (1998) found that small domains caused unrealistic responses in the inner domain that were inconsistent with large-scale forcing. Locations of domain edges must be chosen so that important regional meteorological features can be resolved, and the edges must be located away from areas where reanalysis inaccuracies exist; parent and nest domains should be spaced far enough apart to avoid edge distortions (Liang et al. 2001). Increasing the width across which the dynamical relaxation of boundary conditions occurs, or nudging over the entire domain, may help preserve large-scale waves and hence reduce errors due to domain size (Juang and Hong 2001).
The major objectives of the present study surround the following questions: 1) How well does WRF model the synoptic setting that produced the July 2002 storm event? 2) What are major causes of model error in terms of physics, initialization, domain, and small-amplitude errors? 3) What types of precipitation error propagate most strongly into streamflow: magnitude, location, or timing? How do these each affect flood forecast outcomes? In the next section, an overview of the July 2002 storm meteorology is presented. Section 3 discusses the background information on the driving datasets and the WRF model structure, section 4 presents experimental methods and results, and section 5 discusses the conclusions and implications of this research.
2. Meteorology of the July 2002 storm event
The high frequency of flooding in Texas results from a unique juxtaposition of meteorological factors including moisture influx from the Gulf of Mexico, easterly waves moving across the area, and orographic uplift from the Balcones Escarpment (Hirschboeck 1987). In fact, Texas has recorded some of the greatest precipitation intensities in the world, for storms with durations of up to 24 h (Patton and Baker 1977). Storm events over central Texas are generally controlled by large-scale anomalies in atmospheric circulation patterns. The interaction of these anomalous behaviors with local and mesoscale processes, such as topographic influences and the location of mesohigh outflow boundaries, determines whether or not a flood event occurs in any given region. Extreme storms in Texas may be triggered by tropical cyclogenesis, cyclonic dissipation, frontal processes, orographic lifting, or a combination of several local- and synoptic-scale events. Most extreme storm events over Texas show large-scale meteorology that is significantly different from climatology at the 95% significance level (Nielsen-Gammon et al. 2006). The July 2002 storm was related to a deep upper-level trough that became stationary over south-central Texas. Much of the subsequent discussion is based upon the work of Nielsen-Gammon et al. (2006), who performed a compositing analysis to determine typical meteorological factors present during Texas storms and particularly the July 2002 event.
At the 850-mb level during typical storm evolution in central Texas, strong trade winds blow across the Yucatan toward Mexico and Texas, bringing moisture into the area. Decreasing wind speed from the Gulf of Mexico to the Great Plains implies low-level moisture convergence over the region. This is most easily seen as a deceleration of winds over eastern Texas. However, this deceleration was less prominent in the July 2002 storm compared to other large historical storm events over Texas. Observations on 30 June, several days before the onset of heavy rainfall, still show strong southeasterlies over the Gulf of Mexico. Observations for 5 and 6 July (after peak rainfall has occurred) show weaker winds and less deceleration over Texas.
Typical midlevel development (500 mb) is dominated by cyclonic circulation in the form of a north–south-elongated trough over south Texas. This cyclonic circulation causes a high potential vorticity (PV) anomaly that directs moisture away from Central America and into Texas. Precipitation efficiency is increased by southerly winds blowing into Texas from the Gulf of Mexico. A strong low-level jet (LLJ) feature, defined by Bonner (1968) as a low-level local maximum in the vertical wind profile, is generally observed in Texas nontropical storms during September and October. In summer storm events, the strongest LLJ features are located more to the north in the central Great Plains; the southerly winds present during the July 2002 event do not show such localized maxima and are better referred to as a southerly wind event (SWE; Mitchell et al. 1995). However, similar to the Great Plains LLJ, the strong SWE transports vast amounts of moisture into the region and plays a large role in precipitation formation during the storm. In the upper levels of the atmosphere (200 mb), typical storm evolution in the region shows a high PV anomaly producing upper-level divergence over Texas. This PV anomaly causes an upper-level ridge to develop over the southeastern states, which prevents further movement of the trough.
Locally, mechanical lifting is thought to be a factor in central Texas storm evolution, demonstrated by the location of Balcones Fault Zone (BFZ) in relation to heavy rain concentration areas. Southeasterly to easterly winds at the 850-mb level over the region imply upward motion due to topography (Patton and Baker 1977). Many convective cells and high precipitation accumulations were observed in July 2002 along the BFZ. In addition to trigger factors for heavy rainfall, basin flooding is aggravated by the shallow soils, karstic terrain, bedrock channels, and increasing levels of urbanization over the San Antonio River watershed.
3. Reanalysis dataset and meteorological model
a. North American Regional Reanalysis
Initialization and boundary conditions were interpolated from the NCEP North American Regional Reanalysis (NARR; Mesinger et al. 2006). NARR is a comprehensive hydrometeorological dataset used to drive regional-scale models. Based on the Global Reanalysis (GR) Project that ran 30 yr of simulations using the Eta Model and assimilated observations, NARR assimilates observational data for analysis, boundary conditions, and execution of the Eta Model.
Data are output as 3-hourly files on a 32-km Eta grid with 29 pressure levels, distributed on a Lambert conformal grid. Data are available for free download from the NCAR Web site in gridded binary (GRIB) format. The regional reanalysis appears to correctly capture important elements of the hydrologic cycle; NARR streamflow output compares well with the measured discharge, and has been shown to perform better than the previous 40-yr ECMWF Re-Analysis (ERA-40; Déry et al. 2007). Although simulations with NARR have been unable to completely balance the moisture budget, summer flux convergence over Texas and the northern Gulf of Mexico is much better represented with NARR than with previous global reanalyses (Nigam and Ruiz-Barradas 2006).
b. WRF model
WRF version 2 (Michalakes et al. 2005), a nonhydrostatic, mesoscale weather model developed in a collaborative effort between NCAR, the National Oceanic and Atmospheric Administration (NOAA), and the Department of Defense (DOD), is used in this study. Initial conditions from NARR drive the model at the first time step, and boundary conditions (also from NARR) are assimilated continuously at hourly time steps. The preliminary WRF configuration used in this study consists of a 12-km outer domain with a horizontal grid spacing of 105 × 105 grid points, and a 4-km inner domain with a horizontal grid spacing of 97 × 94 grid points. The area of interest for flood modeling, the San Antonio River basin (SAB), lies in the center of the inner domain; for reference, the basin area accounts for approximately 1/15 of the inner-domain area (Fig. 1). Both domains have 31 variably spaced, vertical grid cells in sigma coordinates, σ = (P − Pt)/(Ps – Pt), with its model top at 100 mb. Ten levels reside at or below the planetary boundary layer (1.5 km); high vertical resolution in the boundary layer is thought to improve the simulation of the LLJ or SWE (Ting and Wang 2006), hence improving the model’s representation of moisture transport and storm generation. The model physics are variable and described further in the following section.
4. Experimental design and results
a. Physics parameterizations
The WRF model contains multiple options for the physics parameterizations. Adjustable physics components include the microphysics, cumulus parameterization, long- and shortwave radiation, boundary layer turbulence (PBL), surface layer, land surface parameterization, and subgrid-scale diffusion. Warm season precipitation output is especially sensitive to the choice of cumulus scheme and moderately sensitive to the microphysics, radiation, and boundary layer scheme (Wang and Seaman 1997; Liang et al. 2004a; Jankov et al. 2005; Nielsen-Gammon et al. 2006). Warm season convective precipitation is the most difficult variable to model accurately (Olson et al. 1995; Nielsen-Gammon et al. 2006), possibly due to the importance of the initial precipitation peak locations in providing feedback mechanisms that dictate the subsequent event evolution. For this study, sensitivity is tested using various combinations of schemes (approximately 40 different combinations); only those combinations that produce significantly different simulations of precipitation are described in this paper. The physics options are combined as shown in Table 1 and included the Grell–Devenyi, Kain–Fritsch, and Betts–Miller–Janjić cumulus schemes; the Kessler, Lin, Eta, and WRF Single-Moment 3-Class (WSM) cloud microphysics; the Rapid Radiative Transfer Model (RRTM) and Eta longwave radiation; and Dudhia, Goddard, and Eta shortwave radiation. Several studies have suggested that for domains using a fine grid spacing (<5 km), a regional model may simulate the convective processes equally or better than using a parameterization of the convection (Bélair and Mailhot 2001; Smedsmo et al. 2005); hence, the effect of removing the convection parameterization from the nested domain is also studied. Each model is run for the entire 11-day storm event using only a single initialization at day 1 of the storm.
Comparisons between observed and modeled rainfall are performed by upscaling from the fine-resolution WRF output to the coarser-resolution Higgins observed dataset (Higgins et al. 2000). The Higgins dataset is derived from rain gauge records from approximately 2500 stations on a 0.25° latitude × 0.25° longitude grid (about 32 km2). Upscaling is performed by averaging the rain rate (mm day−1) across the WRF grid points; the total water volume is conserved in the upscaling process. Regional averages of WRF precipitation output are computed for 32 km2 regions aligning with the Higgins rainfall grid that intersect the study basin, for a total of 25 grid cells. These grid regions are the comparison points for all subsequent analyses.
Precipitation is found to be most sensitive to the cumulus and to a lesser extent the microphysics schemes used. In Figs. 2 –4, the San Antonio River watershed and its gauges are superimposed on the WRF rainfall map for the purpose of identifying areas of large errors across the basin. Although comparisons between the model and the observations were made over the entire 10-day storm, the following discussion pertains specifically to the peak days of rainfall (2–3 July). The Kain–Fritsch scheme (Fig. 2) produces spurious locations of high precipitation surrounding the basin on 2 July, especially in combination with the RRTM shortwave radiation scheme, while underestimating precipitation inside the watershed. The Grell–Devenyi cumulus scheme (Fig. 4) largely underestimates precipitation on 2 July, completely missing the large peak over San Antonio. Removing convection from the nested domain, regardless of the cumulus scheme used, results in a gross underestimation of the rainfall over the basin (Fig. 2d). This result is consistent with Kain and Fritsch (1998), who found the parameterization of convection to be necessary at grid lengths as fine as 5–10 km. Variations in both the longwave and shortwave radiation schemes appear to have insignificant effects on the precipitation forecasts over the basin (Fig. 3d for 2 July; others not shown). Overall, the Betts–Miller–Janjić cumulus scheme (Fig. 3) most accurately simulates the precipitation and subsequent river flooding over the SAB; however, this scheme produces a wider swath of peak precipitation than observed on 3 July, accounting for the large errors in the lower portions of the basin on that day (Table 2). Results from experiment MP2CU2 most closely match the observed mean and standard deviation of rainfall for the basin. Despite the emphasis on the correct microphysical properties in modeling deep convection (Fritsch and Carbone 2004; Smedsmo et al. 2005), model runs using varying microphysics schemes produce less striking differences in precipitation than do model runs with varying cumulus schemes. This observation is most prominent when the BMJ convective scheme is used (Figs. 3b–f), likely because the BMJ dries the atmosphere and reduces the grid-resolved precipitation, minimizing the role of the microphysics in generating precipitation (Jankov et al. 2005). In Fig. 2b, the Kessler scheme underestimates rainfall over much of the basin on 2 July, especially the lower portions of the area. In the WSM scheme (Fig. 2e), the peak rainfall signal is much weaker than is observed. The Lin (Fig. 2c) and Ferrier (Fig. 2f) schemes produce more accurate rainfall over all areas of the basin, despite both having a peak that is located southwest of the observed rainfall peak. The finding that rainfall rate is most sensitive to changes in convective scheme and less sensitive to changes in microphysics and PBL schemes agrees with the recent results of Jankov et al. (2005), who also found the microphysics along with the convective treatment to be most influential on the total rain volume.
In Figs. 5 and 6, it is clear that most physics combinations underestimate precipitation during the peak days of 2 and 3 July; on average, the combination runs underestimated precipitation by 47%. Experiments utilizing the KF cumulus scheme consistently underestimated rainfall for both days over nearly the entire basin. Experiments with the BMJ scheme tended to underestimate rainfall for 2 July and overestimate rainfall for 3 July, which may be attributed to error in the timing of the convection. However, the BMJ scheme produces some of the most accurate values over the central basin region. Experiments using the GD cumulus scheme show sporadic results, with the highest accuracy over the southern portions of the basin on 3 July. Daily totals averaged over the upper, middle, and lower basin areas (not shown) further support these characteristics. The physics configuration for the succeeding experiments are as follows: Lin microphysics, BMJ cumulus parameterization, Janjić Eta surface layer scheme, Rapid Update Cycle (RUC) land surface model, Mellor–Yamada–Janjić Eta boundary layer physics, and RRTM (longwave) and Dudhia (shortwave) radiation.
b. Initialization interval
Decision making in flood management demands a high accuracy of weather model output, and many weather forecasts are considered impractical for use beyond 36 h. The potential performance of WRF in modeling future storm events is investigated in this study; in the following experiments, the effect of forecast length on convective rainfall output is addressed through variation of the initialization interval in WRF.
The WRF model is initialized at intervals of 24 h (1 day), 48 h (2 days), 72 h (3 days), 120 h (5 days), and 264 h (11 days, covering the entire storm length). A series of model runs are completed with each interval as necessary to cover the 11-day storm (Table 3). For example, the 24-h initialization experiment consists of 11 runs, the 48-h initialization consisted of 6 runs, etc. Results from the model runs are combined for each interval length and plotted for the entire storm length. Precipitation output grids are then run through the hydrologic model and the streamflow output compared between experiments.
Comparisons between precipitation output for different initialization intervals are shown in Figs. 7 –9. Experiment results demonstrate that a 48-h initialization best reproduces the observed rainfall (Fig. 9). The 24-h run misses or underestimates several precipitation peaks, while the longer forecasts severely underestimate precipitation, and in some cases produce spurious areas of precipitation. Although this finding may be due to numerous factors, one possible reason that an intermediate initialization interval is more accurate in representing rainfall may be the adjustment time in the model, as the model integrates the NARR initial boundary conditions (IBCs) and begins producing grid variables on its own (i.e., the time needed for the model to reach equilibrium). In this particular case, the adjustment time may also include an adjustment between the differences in the Eta land surface model used in the NARR data and the RUC land surface model used in the WRF model runs for this experiment.
The correlation with the observations is significantly higher for smaller initialization intervals, with an r value of 0.47 for a 1-day initialization and 0.63 for a 2-day initialization (Table 4). The streamflow output shows patterns that are similar to those for the precipitation (Fig. 10), with all but the 48-h run underestimating the streamflow. The 48-h run produces a slight time lag compared to the observed streamflow, but captures the overall peak intensity and timing of the basin streamflow. Correlation values are high for both experiments 1D and 2D. At urban gauges, the correlation is very low (Table 5); this low correlation was also observed when driving the streamflow model with observed radar data (Knebl et al. 2005) and, therefore, is likely due to error inherent in the hydrological model and not error in the WRF output. A comparison of the mean biases for each experiment (Table 5) as well as an analysis of the bias distribution (Fig. 11) clearly demonstrates that the 2-day initialization best represents the entire storm. These results demonstrate that WRF performs reasonably well at producing precipitation on its own (without assimilation of observations) for 48-h intervals.
c. Domain and boundary size
To study the sensitivity of precipitation forecasts to the size and location of the model domain, various domain configurations are designed as shown in Fig. 1. Six experiments are completed based on the six configurations (Table 6). All experiments utilize two domains, and the outer domain is driven by NARR forcing data. Physics and initialization details are dictated by best-fit results from the previous sections. Experiments are named using capital letters for larger domains and lowercase letters for smaller domains, according to the following convention: P and p for parent domains 1 and 2; N, n, and nn for nest domains 1, 2, and 3, and BZ for increased boundary zone width. Experiment pn is designated as the control run, with the outer domain covering the entire state of Texas, and the inner domain covering the area of interest (SAB), including a wide radius around the basin to allow for the adjustment of parameters between the two domains. The outer domain in experiment Pn is enlarged to cover a significant portion of the Gulf of Mexico at its southeast corner (20 points added to the south and east sides). The objective of this configuration is to allow more adjustment time between ocean and land variables. To lessen the boundary effects that may occur when the forecast area is too close to the parent–nest boundary, experiment pN enlarges only the nested domain, approximately doubling its grid points in both directions. Both the parent and nest domains are enlarged in experiment PN, to observe the combined effects of the previous two configurations. Experiment pnn shrinks the nested domain eight points in the north–south direction and four points in the east–west direction to singularly model the area of flood interest. Last, experiment pnBZ doubles the default buffer zone width from 5 to 10 grid cells in an effort to reduce edge effects from nudging between the parent and nest domains.
A qualitative comparison of the precipitation grids appears to show the control experiment pn to be the most reasonable configuration for modeling the 2002 storm event over the San Antonio River basin (Figs. 12 –13; others not shown). Statistical analysis, however, demonstrates that each configuration has strengths and weaknesses (Tables 7 and 8). Enlarging the parent domain results in location errors of rainfall over the San Antonio River basin during times of peak precipitation (experiment Pn; results not shown). These errors in the rainfall simulations may be, in part, due to the placement of the southern boundary in the Gulf of Mexico where large forcing errors exist (Liang et al. 2001); they may also be, in part, introduced via competition for convection at the land–ocean boundary (Nigam and Ruiz-Barradas 2006). In experiment pN (Fig. 13), rainfall is underestimated, suggesting that enlarging the nest domain may reduce the effects of the synoptic forcing from the parent domain. In experiment PN (result not shown), the effects of enlarging both domains are compounded, with large errors in both the intensity (underestimation) and location of the rainfall. Doubling the buffer zone width in experiment pnBZ (result not shown) results in an underestimation of the precipitation, which is presumably caused by the diminished importance of small-scale mechanisms in producing convective precipitation in the nested domain. Decreasing the area of the nest in experiment pnn (result not shown) has a similar effect to increasing the buffer zone width; precipitation patterns remain similar but decreased in intensity, and boundary edge effects are clearly observable.
d. Performance of best-fit WRF model
The best-fit WRF model, using a 48-h initialization, domain pn, and physics as described previously, is investigated in more detail in order to assess its ability to simulate the July 2002 storm event. Figure 14 compares the horizontal winds and divergence between the NARR reanalysis data and two WRF configurations: the best-fit configuration and a poor-performing configuration (i.e., one that produces significant precipitation error, domain PN).
Figure 14 displays the observed winds (from the NARR reanalysis) and the modeled wind vectors for the upper, middle, and lower levels of the atmosphere. Strong trade winds are apparent at the 850-mb level over the Gulf of Mexico; these winds bring moisture into the central Texas region and weaken as the storm progresses. As these southerly winds move across land, they begin to weaken slightly, triggering low-level convergence. Weakening winds are also apparent as air flows over the BFZ in central Texas. Both the best-fit and poor-performing WRF models appear to simulate this near-surface wind pattern quite well (Figs. 14b and 14c).
At midlevels of the atmosphere, a strong cyclonic circulation, consistent with the observations, develops in the best-fit model (Fig. 14e). The poor-performing WRF places this circulation too far north, over the Texas–Oklahoma boundary, which may contribute to its underestimation of the storm’s extent (Fig. 14f).
Both the upper-level divergence and a related anticyclone east of the storm are clearly displayed in the WRF best-fit model (Fig. 14h). The poor-performing model run (Fig. 14i) demonstrates much weaker divergence and only minimal anticyclonic winds. This lack of divergence in the upper atmosphere may inhibit convection and contribute to precipitation underestimation.
Additionally, water vapor mixing ratios display patterns that are consistent with the observations (not shown), with increasing values as the storm progresses and significantly decreasing values several days after the storm has passed. The largest variation in moisture occurs at midlevels of the atmosphere, where cyclonic circulation dominates, and over the middle and upper regions of the basin, where convective precipitation is strongest.
e. Small-amplitude random errors
Even after adjusting the model physics, initialization, and domain configurations, the resulting best-fit model still shows significant errors compared to the observations. The remaining errors may be partially attributed to random error but also partially attributed to errors in the best-fit model physics (it should be noted that this study only reduces errors attributable to physics by finding the best-fit physics set; the set of physics schemes still contains substantial errors). Random errors can develop from small inaccuracies in the initial conditions that grow exponentially with time. Introducing small perturbations into the model can aid in understanding how these initial errors grow. Zhang et al. (2006) perturbed their model domain using thermal bubbles and found that nonlinear error growth mechanisms are of secondary importance to practical errors such as erroneous physical parameterization. Their results implied that error growth is greatest for high-resolution domains and in regions of high convective activity. Major error types encountered in regional models include intensity, location, and timing errors, with erroneous location shifts most common.
Since precipitation errors are nonlinear, their effects on various hydrologic fluxes may be manifest in either an amplification or a dampening of the variable. To test the effects of common precipitation errors in the WRF model on streamflow output from a surface hydrologic model, WRF output grids were perturbed arbitrarily (in both the positive and negative directions) in intensity, location, and timing. Only those grid cells residing in the area of flood interest (Fig. 1) were perturbed. Intensity error was introduced by adding the following rainfall to each grid cell (in mm h−1): ±0.01, 0.05, 0.1, 0.5, 1, 2, 3, 4, and 5. The absolute value, instead of a percentage, was used to obtain more meaningful results over the basin. Location error was introduced by shifting rainfall amounts 1, 3, 5, and 10 grid cells in each of four directions: NE, NW, SE, and SW. Finally, timing errors were introduced in two ways. First, hours were added or subtracted from each rainfall grid cell as follows: ± 3, 6, 12, and 18 h. This simple shift is not particularly realistic for typical rainfall errors, so additional experiments were run changing the duration of the rainfall. Rainfall grid cells during peak times were perturbed to (a) widen the peak, (b) narrow the peak, and (c) extensively narrow the peak (to mimic flash flooding).
Table 9 uses a moderate rainfall error of +2 mm h−1 to demonstrate the variation in streamflow error with location in the basin, over the 72-h peak rainfall period. Correlation values over the storm period are high for all locations. The value of the best-fit to modeled ratio CTL/EXP2 allows study of the relative overestimation and underestimation of precipitation in each subbasin region. As expected, increasing rainfall error in a positive direction results in an overestimation of the streamflow in all subbasins, with the exception of the lowermost subbasin. This large subbasin consists of mostly agricultural soils, and may be better equipped to dampen the response to increased rainfall and streamflow upstream. In addition, this subbasin has a long response time; much of the impact from upstream rainfall may be delayed beyond the time scope observed in this study. In subbasins with a shorter response time, such as those with bedrock-dominated streams (gauge 2) or those in urban areas (gauge 7), increases in precipitation are quickly translated to runoff, and hence in these basins streamflow is greatly overestimated.
Statistics for various error types are displayed in Table 10 and Figs. 15 –17. Positive (negative) perturbations to the rainfall magnitude show a comparable increase (decrease) in streamflow. Interestingly, positive rainfall errors are manifested in streamflow error throughout the entire storm, while negative errors result in lower overall streamflow error and are concentrated at the storm peak (Fig. 15). Intensity errors in precipitation show a linear translation pattern for streamflow, especially for positive rainfall perturbations; intensity errors also tend to retain the shape of the hydrograph. Location shifts in precipitation are a significant error type, because the streamflow response is erratic and difficult to predict. Rainfall location errors to the east, especially the southeast, result in the highest streamflow errors (Fig. 16). Location errors to the west (NW and SW) appear to retain the correct hydrograph shape and have fewer errors in discharge magnitude. This directional variability of the streamflow error is highly dependent on the actual location and size of the peak rainfall; glancing back at Fig. 12 (the best-fit rainfall), it is clear that rainfall changes most rapidly in an ESE direction away from the peak. Lagging or advancing the precipitation peak timing results in a linear lag or advance of the streamflow peak (Figs. 17a and 17b); however, the error is significantly larger for time-lagged experiments. For example, perturbing the precipitation −12 h results in a mean absolute error (MAE) of 213.3%, while perturbing the precipitation +12 h results in an MAE of only 58.4%. In the case of a negative lag, the peak also increases with increased lag time, due to increased time of concentration, which allows the basin more time to accumulate and then translate runoff to the river network. The effect of the peak shape perturbations is most pronounced for an erroneously wide peak (Fig. 17c); in this case the overall volume of the streamflow is increased to the extent that the hydrograph peak is not well defined compared to the best-fit result.
5. Conclusions and implications
This paper studies the sensitivity of WRF model precipitation forecasts to different physics, initialization, and domain configurations, and examines the ability of WRF to accurately reproduce a summer convective rainfall event. The propagation of error from rainfall to streamflow is investigated by applying a variety of perturbations to the precipitation and finding the relative significance of each in terms of streamflow.
WRF’s ability to accurately simulate a July 2002 central Texas storm event is most sensitive to cumulus parameterization, slightly sensitive to microphysics parameterization, and nearly unaffected by the choice of shortwave and longwave radiation schemes. The Betts–Miller–Janjić cumulus scheme combined with Lin microphysics produces the most reasonable rainfall over the San Antonio River basin. Other cumulus schemes tend to underestimate rainfall intensity and shift the peak away from its true location. Contrary to modeling studies suggesting that domains with close grid spacing have the ability to resolve convective motions on their own, this research found that simulation of the 11-day convective storm requires parameterization of convection in the inner 4-km domain.
A 48-h initialization interval produces optimum performance in WRF, as determined by the high correlation values and low errors in both rainfall and streamflow. This interval allows WRF to successfully produce precipitation on its own, and the forecast is not substantially degraded, as in longer forecasts. Investigation of various domain configurations suggests it is best to avoid placing any domain over oceans or other regions lacking sufficient data. The relative size and placement of the parent and nested grid(s) are also important; to best simulate the combined effects of synoptic, mesoscale, and local atmospheric forcings, the nest must be close to the area of flood interest while being kept a significant distance from the parent domain. Additionally, studies of domain grid spacing should be undertaken; Zhang et al. (2006) found that a finely gridded nest (3.3 km) did not produce the best simulation (using the MM5 model) compared to coarser nest grids. It would be useful to test alternate grid spacings to determine if forecast degradation occurs at high resolutions in WRF.
The best-fit WRF model accurately reproduces precipitation patterns and intensity over the warm season, convective central Texas storm of interest. Comparisons of observed and modeled wind vectors at three different atmospheric levels suggest that errors in horizontal winds for the poor-performing WRF configurations may be a major factor in producing insufficient precipitation. There are several implications from the success of WRF in simulating the July 2002 storm and its subsequent streamflow. For data-sparse areas or International River Basins (IRB) nations that do not have access to upstream data, due to either a lack of infrastructure or to a lack of data sharing between coincident riparian nations, WRF output can provide a proxy for such data. However, WRF may not increase the lead time enough to significantly improve flood forecasting in these regions. Additionally, WRF output can be used as the driving meteorological forcing data for regional- and watershed-scale hydrological models.
Current numerical weather models categorically contain small errors associated with initial and boundary conditions, which result in errors in the precipitation forecast. This research investigates numerous perturbations to the WRF rainfall grid and their effects on streamflow output after running through a hydrological model. Results indicate that 1) error dampening from rainfall to streamflow is greatest when the initial rainfall perturbation caused an increase in rainfall relative to the control and 2) translation of error from precipitation to streamflow is strongly affected by surface characteristics such as topography, land use, and soils. Intensity and timing errors propagate fairly linearly to streamflow error, which therefore may be simpler to predict. Simulations that result in location error are most important because the effects of erroneous precipitation locations on streamflow are dependent on the pattern of observed precipitation. Streamflow effects from these errors are nonlinear and difficult to determine for operational flood prediction.
The study of error in WRF may help to establish limits on acceptable rainfall error for flood forecasting of large events. For example, WRF rainfall errors less than or equal to ±0.5 mm h−1 appear to have a negligible effect on the quality of a streamflow forecast, as do timing errors of 6 h or less. This type of research can serve as a guideline for error estimation and correction in operational modeling with the WRF model. Further research is necessary, using a wider range of perturbation experiments, to better define the relative effects of different error types. Future publications on this subject should include more quantitative analysis, such as the growth rate of error for each perturbation type. In addition, a finescale precipitation product such as NOAA stage III should be used for more detailed verifications of results.
This research was performed while on appointment as a U.S. Department of Homeland Security (DHS) Fellow under the DHS Scholarship and Fellowship Program, a program administered by the Oak Ridge Institute for Science and Education (ORISE) for DHS through an interagency agreement with the U.S. Department of Energy. ORISE is managed by Oak Ridge Associated Universities under DOE Contract DE-AC05-00pOR22750. All opinions expressed in this paper are the authors’ and do not necessarily reflect the policies and views of DHS, DOE, or ORISE. Software support for the WRF model was provided by UCAR user support. The authors thank Guo-Yue Niu, Jeff Lo, and Xiaoyan Jiang for their insight and suggestions on the WRF model and experimental design. We also appreciate the comments of two anonymous reviewers that helped to improve the original manuscript.
Corresponding author address: Marla R. Knebl Lowrey, Dept. of Geological Sciences, 1 University Station, C-1100, University of Texas at Austin, Austin, TX 78712-0254. Email: email@example.com