Rainfall diagnostics from 48-h, 20-km mesoscale runs of the RAMS model configured for the New Zealand region have been characterized and compared to forecasts from the U.K. Meteorological Office global model with a view to operational use. The accuracy and precision of these diagnostics and their sensitivity to various model parameters have been determined by conducting several parallel series of experiments for the month-long SALPEX96 observing period (October–November 1996) and by comparing model results with rain gauge data.
A detailed validation reveals that the mesoscale configuration of RAMS adds significant value to rainfall forecasts from the global model in situations of heavy orographic rain, particularly when the full RAMS microphysics scheme is used. The higher spatial resolution of the mesoscale model allows a better representation of the steep New Zealand orography and the observed sharp rainfall gradients. The mesoscale model and the global model both overforecast light rain and perform more poorly for light rain than for moderate or heavy rain.
In the sensitivity study it is found that snow, graupel, and aggregates provide important enhancement mechanisms for rainfall in the Southern Alps, and that modeling processes related to these hydrometeor species improves forecasts in the lee of the Southern Alps (the “spillover” effect). It is also found that the soil moisture initialization strongly affects forecasts of light rain in our study, and that increasing the size of the mesoscale model domain does not always improve rainfall forecasts in the data-sparse New Zealand region. The implications of these findings for future data assimilation work are also discussed.
New Zealand’s extreme orography is not well resolved in today’s generation of global models but has a profound effect on the country’s weather (Wratt et al. 1996). Consequently there is a natural tendency to investigate the value that can be added to the forecasting process by operational runs of mesoscale models forced by analyses and forecasts from global numerical weather prediction (NWP) centers. These mesoscale runs are expected to provide improved guidance for forecasts of rainfall for the general public and the agricultural community, for the day-to-day management of inflows into hydro-lakes, and for hazard warnings in case of floods and storms.
The first goal of this paper is the validation of mesoscale numerical rainfall forecasts for the New Zealand region in comparison with forecasts from a global model. Rainfall is a particularly attractive weather element to validate due to its importance for hazard warnings and the forecasting difficulties involved (e.g., Olson et al. 1995). For the validation of quantitative rainfall forecasts, various statistical scores are usually calculated for operational forecasts (e.g., Mesinger 1996; Olson et al. 1995; Ebert and McBride 1997) or single case studies (Wang and Warner 1988). Although these statistical scores are still generally very poor compared to perfect scores, some improvements have been reported resulting from the use of mesoscale models due to their higher horizontal and vertical resolution and better model physics.
A second goal of this paper is to characterize the sensitivities of the rainfall forecasts to various model parameters and in particular to identify areas in which further work would be rewarded. Issues to investigate include the most suitable size of model domain (e.g., Warner et al. 1997), parameter settings for cloud microphysics schemes (e.g., Walko et al. 1995a), the impact of various initialization data, and the influence of surface parameterizations. Here, we do not address the problem of spinup of precipitation (e.g., Turpeinen et al. 1990). It is not crucial in New Zealand: the global run from which a mesoscale run is initialized takes 6–12 h to arrive, so the initial part of any mesoscale run is already out of date by the time the run has completed.
To characterize our rainfall forecasts we examined rainfall diagnostics from several sets of mesoscale forecasts for the October to early November 1996 field campaign of the Southern Alps experiment (SALPEX—Wratt et al. 1996). This is a multiyear, collaborative mountain meteorology research program undertaken to improve understanding of the processes through which the Southern Alps influence New Zealand’s weather and climate. We trust that our validation and the identification of the sensitivities involved are the first steps toward forecast improvements (Ebert and McBride 1997).
The structure of the paper is as follows: the numerical model and the experiments used in our study are described in section 2. The statistical scores used are introduced in section 3, and the results of the validation are presented in section 4. Our conclusions and some further discussions are presented in section 5.
2. Global data, numerical model, and experiments
a. Global forcing data
The most valuable resources for this study are sets of global analyses and forecasts spanning the SALPEX96 observing period (8 October–7 November 1996). These are used to initialize the mesoscale forecasts and to provide boundary forcing data. The first set consists of 6-hourly European Centre for Medium-Range Weather Forecasts (ECMWF) Tropical Ocean and Global Atmosphere (TOGA) advanced operational analyses specified on a regular latitude–longitude grid at a resolution of 0.5625° in the north–south and east–west directions (i.e., approximately 60 km north–south and 50 km east–west over New Zealand). Fifteen pressure levels are provided: 1000, 925, 850, 700, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, and 10 hPa. Data on rainfall are not included in the analyses.
During the SALPEX96 field campaign, operational runs of our mesoscale model entailed the use of analyses and forecasts from the global “Bracknell Fine Mesh Model” (BRACKFM), which is run by the U.K. Meteorological Office (UKMO). The data were made available to us by MetService of New Zealand. These data are 6-hourly but at a lower resolution of 1.25° in the north–south and east–west directions (i.e., approximately 140 km north–south and 105 km east–west over New Zealand), with only 10 levels provided: 1000, 850, 700, 500, 400, 300, 250, 200, 150, and 100 hPa. These BRACKFM forecasts are the most extensive forecast data available in New Zealand in real time and as such represent the best source of initial and boundary forcing data for an operational mesoscale system in New Zealand. The data include predictions of 6-h accumulated precipitation.
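The approximate grid spacings quoted for the two global datasets can be checked with a few lines of arithmetic. The following is a minimal sketch only: the length of one degree of latitude (111.2 km), the representative latitude of 41°S, and the function name are assumptions introduced for illustration and are not taken from the datasets themselves.

```python
import math

KM_PER_DEG_LAT = 111.2   # assumed mean length of one degree of latitude
NZ_LATITUDE_DEG = 41.0   # assumed representative latitude for New Zealand


def grid_spacing_km(resolution_deg, latitude_deg=NZ_LATITUDE_DEG):
    """Return (north-south, east-west) spacing in km of a regular lat-lon grid."""
    north_south = resolution_deg * KM_PER_DEG_LAT
    # Meridians converge toward the pole, so east-west spacing shrinks with cos(lat).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ecmwf_ns, ecmwf_ew = grid_spacing_km(0.5625)
brackfm_ns, brackfm_ew = grid_spacing_km(1.25)
print(f"ECMWF:   {ecmwf_ns:.0f} km N-S, {ecmwf_ew:.0f} km E-W")
print(f"BRACKFM: {brackfm_ns:.0f} km N-S, {brackfm_ew:.0f} km E-W")
```

Under these assumptions the results fall within a few kilometers of the rounded values quoted in the text.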
The key differences between the BRACKFM and the ECMWF datasets are the lower spatial and vertical resolution of the BRACKFM data and the fact that the ECMWF data contain analyses only, whereas the BRACKFM data are analyses at the initial times and forecasts at later times. Due to the higher spatial resolution we expect the ECMWF analyses to be superior in quality to the BRACKFM analyses, but due to the sparsity of observations in the New Zealand region the difference in quality may be small.
b. The model
For this study we have used version 3b of the Regional Atmospheric Modelling System (RAMS), described in Pielke et al. (1992). This model has an excellent pedigree and earlier versions of the code have been used in a wide variety of mesoscale studies. These include precipitation events in the neighborhood of significant orography (e.g., Snook and Pielke 1995; Albers et al. 1996; Zhong et al. 1996). Because one of the long-term goals of the SALPEX project is to improve mesoscale forecasting in the New Zealand region, RAMS seems an excellent choice, as it is used for many real-time forecasting systems (e.g., Manobianco et al. 1996; Snook et al. 1995).
While RAMS has a sophisticated interactive grid nesting capability, for our study we use just one grid for each run nested within ECMWF and BRACKFM fields. The baseline model grid is illustrated as the inner grid in Fig. 1a and as dots in Fig. 1b, and has the characteristics given in Table 1. The baseline model options that we use in our experiments are also briefly described in Table 1. Most of these are discussed in more detail in Pielke et al. (1992). Three-dimensional fields used to initialize the RAMS model are: geopotential, temperature, zonal and meridional velocities, and relative humidity.
Data assimilation was not included in any of our RAMS experiments, apart from the built-in relaxation toward global fields during the first 6 h as listed in Table 1. This relaxation toward global forecast fields as part of the initialization process was included to ensure an adequate specification of the synoptic situation in RAMS (M. R. Sinclair 1998, personal communication) and its usefulness has been investigated in trial experiments with and without the relaxation (not shown).
Of particular importance for the rainfall forecasts is the comprehensive microphysics package of RAMS, and it is worthwhile to give a brief description of this here. Extensive details can be found in Walko et al. (1995a). The initial 3D relative humidity field (i.e., total water) is specified and the mixing ratios of rain, pristine ice, snow, aggregates, graupel, and hail are obtained by integrating forward (from initial zero values) their respective equations, which include terms due to advection, turbulent transport, sedimentation, etc. A prognostic equation for total water is also integrated, and by subtracting from this the sum of the six mixing ratios mentioned above the sum of mixing ratios of cloud water and vapor is obtained. The difference (if any) between this and the local saturation mixing ratio then yields the cloud water mixing ratio, leaving the water vapor mixing ratio as the remainder.
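The diagnosis of cloud water and vapor described above can be sketched in a few lines. This is an illustration of the bookkeeping only, not the RAMS code: the function name, argument names, and sample mixing ratios are hypothetical, and the saturation mixing ratio is simply passed in rather than computed from temperature and pressure.

```python
def partition_water(r_total, r_hydrometeors, r_sat):
    """Split total water into vapor and cloud water mixing ratios (kg/kg).

    r_total        -- prognosed total water mixing ratio
    r_hydrometeors -- the six prognosed mixing ratios (rain, pristine ice,
                      snow, aggregates, graupel, hail)
    r_sat          -- local saturation mixing ratio (assumed given)
    """
    # Total water minus the six prognosed species leaves vapor plus cloud water.
    r_vapor_plus_cloud = r_total - sum(r_hydrometeors)
    # Any excess over saturation is diagnosed as cloud water.
    r_cloud = max(r_vapor_plus_cloud - r_sat, 0.0)
    # The remainder is water vapor.
    r_vapor = r_vapor_plus_cloud - r_cloud
    return r_vapor, r_cloud


# Hypothetical supersaturated grid point.
species = [0.0005, 0.0001, 0.0002, 0.0001, 0.0001, 0.0]
vapor, cloud = partition_water(0.012, species, 0.010)
print(f"vapor = {vapor:.4f} kg/kg, cloud water = {cloud:.4f} kg/kg")
```

In an unsaturated grid box the excess is zero, so no cloud water is diagnosed and all of the remainder is vapor.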
Included in the prognostic equations are terms describing the interaction between species, and these processes require knowledge of the size distribution of each hydrometeor category. In version 3b these are assumed to follow a generalized gamma distribution with a shape parameter (ν) to be specified (e.g., ν = 1 gives the Marshall–Palmer distribution). With the functional form of the distribution known, the total mass of each hydrometeor category requires a knowledge of either the total number concentration (Nt) or the mean diameter (Dm); RAMS offers users the choice of specifying either of these.
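To illustrate how the shape parameter and the choice of Nt or Dm close the problem, the sketch below assumes the normalized gamma form f(D) = (D/Dn)^(ν−1) exp(−D/Dn) / [Γ(ν) Dn] with characteristic diameter Dn = Dm/ν, so that ν = 1 reduces to an exponential (Marshall–Palmer type) distribution. This functional form, the treatment of particles as liquid spheres, and all function names are assumptions of the sketch; the scheme itself is documented in Walko et al. (1995a).

```python
import math


def gamma_size_pdf(d, d_n, nu):
    """Normalized gamma distribution of diameter d (same units as d_n)."""
    x = d / d_n
    return x ** (nu - 1) * math.exp(-x) / (math.gamma(nu) * d_n)


def number_density(d, n_t, d_mean, nu):
    """Number of particles per unit volume per unit diameter at diameter d."""
    d_n = d_mean / nu  # for this form, the mean diameter equals nu * d_n
    return n_t * gamma_size_pdf(d, d_n, nu)


def mass_content(n_t, d_mean, nu, density=1000.0):
    """Mass per unit volume (kg m-3), assuming spherical liquid particles.

    Uses the third moment of the gamma distribution:
    E[D^3] = d_n^3 * Gamma(nu + 3) / Gamma(nu).
    """
    d_n = d_mean / nu
    third_moment = d_n ** 3 * math.gamma(nu + 3) / math.gamma(nu)
    return n_t * (math.pi / 6.0) * density * third_moment


# Hypothetical values: 1000 drops per m3 with a mean diameter of 1 mm.
for nu in (1, 2):
    print(nu, mass_content(n_t=1000.0, d_mean=1.0e-3, nu=nu))
```

With the mixing ratio prognosed by the model, specifying either Nt or Dm fixes the other through a relation of this kind, which is why only one of the two needs to be supplied.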
All experiments described here concern 48-h runs starting at 1200 UTC (New Zealand standard time is UTC + 12 h) every day of the SALPEX96 observing period for which data from both global datasets were available: while temporal coverage of ECMWF data is 100%, transmission problems caused BRACKFM data to be missing on almost one-third of days, so that a total of 20 runs were possible. Several parallel series of model runs with altered parameters were performed for the SALPEX96 period, and these are summarized in Table 2 and discussed in the following:
The S96BFM runs were forced with BRACKFM forecasts and use the model parameters given in Table 1. The model configuration used in S96BFM was our “first guess,” which at the start of this study seemed appropriate for an operational environment. S96BFM serves as control experiment in our sensitivity study.
To investigate the sensitivity of our rainfall forecasts to the model setup, we performed three further series of experiments that were identical to S96BFM except for subtle changes to the model configuration regarding the soil moisture initialization, the microphysics scheme, and the domain size (Table 2). The S96BFMDRY runs use a lower horizontally homogeneous initial soil moisture content than S96BFM, and are thus designed to test the hypothesis that the surface moisture (via the soil moisture initialization) is crucial in determining the accuracy of forecasts of light rain. This has been found to be the case in the summertime continental United States (Paegle et al. 1996) and also during slowly evolving synoptic situations in the United Kingdom (B. Macpherson 1998, personal communication). For experiment S96BFMMIC we included the hydrometeor species snow, graupel, and aggregates in addition to the rain, pristine ice, and cloud water already used in S96BFM to investigate their impact on rainfall generation in the model. S96BFMGNU2 also included the hydrometeor species snow, graupel, and aggregates, but used the hydrometeor size distribution given by ν = 2 instead of the Marshall–Palmer distribution used in S96BFMMIC [see Walko et al. (1995a), their Fig. 1, for the resulting distribution functions]. This was done to test the sensitivity of the simulations to one further parameter of the microphysics scheme. Last, the S96BFMBIG experiments were conducted with a domain covering twice the area of the other experiments (see outer grid in Fig. 1a). This was done in order to move possible errors introduced at the lateral boundary further away from the region of interest (e.g., Majewski 1997).
Two further experiments have been conducted with an improved model configuration as motivated by results from the sensitivity study: S96BFMFULL used the same setup as S96BFM, except that it incorporated additional hydrometeor species as in S96BFMMIC and used the lower initial soil moisture content of S96BFMDRY. Like S96BFM, S96BFMFULL was forced with BRACKFM forecasts. In contrast, S96ECFULL was forced using the ECMWF high-resolution analyses, but used the same model configuration as S96BFMFULL. Comparison of S96BFMFULL and S96ECFULL thus enables us to investigate the influence of global model forecast errors (an appreciation of which is required before starting major efforts to assimilate local data). S96BFMFULL and S96ECFULL are considered our best sets of runs.
3. Validation methodology and statistical scores
The rainfall validation is done against 6-, 12-, and 24-h accumulations derived from 3-hourly rain gauge data from the 166 sites shown in Fig. 1c. While rain gauge data are not perfect for model evaluation, this is the only way to get coverage for the whole of New Zealand. Gridded model fields of accumulated precipitation totals (the sum of all precipitation categories in the model) were bilinearly interpolated to station locations, yielding pairs of observed and model values. These pairs are subsequently used to calculate an ensemble of statistical scores defined below.
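The interpolation step that produces the model value of each observed and model pair can be sketched as follows. This is a minimal illustration under simplifying assumptions: the grid is treated as regular in grid-relative coordinates (km) with uniform spacing, and the function name, argument names, and sample values are hypothetical.

```python
import math


def bilinear_interpolate(field, x0, y0, dx, dy, x, y):
    """Interpolate a gridded field to the point (x, y).

    field[j][i] holds the value at (x0 + i * dx, y0 + j * dy).
    """
    ny, nx = len(field), len(field[0])
    fx = (x - x0) / dx  # fractional column index
    fy = (y - y0) / dy  # fractional row index
    if not (0.0 <= fx <= nx - 1 and 0.0 <= fy <= ny - 1):
        raise ValueError("point lies outside the grid")
    # Lower-left corner of the enclosing grid box (clamped on the upper edges).
    i = min(int(math.floor(fx)), nx - 2)
    j = min(int(math.floor(fy)), ny - 2)
    tx, ty = fx - i, fy - j
    bottom = (1.0 - tx) * field[j][i] + tx * field[j][i + 1]
    top = (1.0 - tx) * field[j + 1][i] + tx * field[j + 1][i + 1]
    return (1.0 - ty) * bottom + ty * top


# Hypothetical 24-h totals (mm) on a 2 x 2 patch of a 20-km grid.
rain = [[0.0, 10.0],
        [20.0, 30.0]]
station_value = bilinear_interpolate(rain, 0.0, 0.0, 20.0, 20.0, x=5.0, y=10.0)
print(station_value)  # 12.5
```

Because the result is a weighted average of the four surrounding gridbox values, it cannot exceed the largest of them, which is one reason point observations of peak rainfall can be higher than the interpolated model values.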
Rainfall verification based on gridded analyses is usually preferred to our approach to avoid comparing fields of different representativeness, that is, point measurements on the one hand and model gridbox averages on the other (e.g., Ebert and McBride 1997; Mesinger 1996). However, for several reasons this does not work for the New Zealand situation. First, the observational coverage makes it impossible to calculate a meaningful rainfall analysis on our 20 km RAMS grid. Second, to verify the model forecasts on a grid with coarser resolution is not desirable as this produces less useful results for a small country like New Zealand and moreover degrades the detail of the high-resolution mesoscale model forecast.
The station density of the rain gauge data is not uniform across the country (Fig. 1c). It is slightly higher in the alpine region, and gaps are noticeable in the northern and southern parts of the South Island. While this can lead to biases toward the performance of the forecasts in particular regions, further rain gauge data were not available and the data coverage is assumed to be acceptable. However, note that the spacing of some stations limits the identification of positioning or timing errors in our validation method. Potentially, this makes it more difficult to demonstrate that our mesoscale rainfall forecasts added value to the global forecasts.
The statistical measures used are the bias score (BS), the equitable threat score (ETS), and the correlation coefficient. Here BS and ETS are threshold dependent and defined as
$$\mathrm{BS} = \frac{F}{O}, \qquad \mathrm{ETS} = \frac{C - E}{F + O - C - E},$$
with O representing the number of stations reporting rain in excess of a certain threshold, F the number of stations with model values exceeding the threshold, C the number of stations with model values and observations greater than the threshold, and E = FO/N the number of random correct forecasts expected due to chance within the total number N of model and observational pairs. In this study, BS and ETS are calculated as a function of forecast hour from the whole pool of pairs of model and observed values.
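The two threshold-dependent scores can be computed from a pool of observed and model pairs as sketched below. The sketch uses the standard definitions BS = F/O and ETS = (C − E)/(F + O − C − E), which we take to be those intended here; the function names and the sample rainfall values are hypothetical.

```python
def contingency_counts(observed, forecast, threshold):
    """Return (O, F, C, N) for paired observed and model rainfall values."""
    pairs = list(zip(observed, forecast))
    n = len(pairs)
    o = sum(1 for obs, _ in pairs if obs > threshold)
    f = sum(1 for _, fc in pairs if fc > threshold)
    c = sum(1 for obs, fc in pairs if obs > threshold and fc > threshold)
    return o, f, c, n


def bias_score(o, f):
    """BS = F / O; undefined (NaN) when no station observes the event."""
    return f / o if o > 0 else float("nan")


def equitable_threat_score(o, f, c, n):
    """ETS = (C - E) / (F + O - C - E), with E = F * O / N."""
    e = f * o / n
    denominator = f + o - c - e
    return (c - e) / denominator if denominator != 0 else float("nan")


# Hypothetical 24-h accumulations (mm) at six stations, 5-mm threshold.
observed = [0.0, 2.0, 12.0, 30.0, 0.5, 8.0]
forecast = [1.0, 6.0, 9.0, 25.0, 0.0, 11.0]
o, f, c, n = contingency_counts(observed, forecast, threshold=5.0)
bs = bias_score(o, f)
ets = equitable_threat_score(o, f, c, n)
print(o, f, c, n, bs, ets)
```

In this example O = 3, F = 4, C = 3, and N = 6, so E = 2, BS = 4/3 (the event is overforecast), and ETS = 0.5.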
The interpretation of these scores is as follows: BS gives a measure of the bias in event frequency regardless of location, with the ideal score being 1. Typical scores for 24-h accumulated rainfall vary between 0 and 2 depending on the threshold (e.g., Mesinger 1996; Stauffer and Seaman 1990). ETS measures the agreement between forecast and observed rainfall pattern with ETS = 0 for a random forecast and ETS = 1 for a perfect forecast, which correctly places rain exceeding the threshold at the right locations. For operational models, ETSs are typically of the order of 0.3 for 24-h accumulated rainfall (e.g., Mesinger 1996). Further discussion of these scores can be found in Mesinger (1996) and Ebert and McBride (1997).
To assess the performance of the rainfall forecasts regardless of the magnitude, spatial and temporal correlation coefficients were also calculated (e.g., Ebert and McBride 1997). We use an average spatial correlation coefficient calculated in the following way: for each model run we compute spatial correlation coefficients between model and observed rainfall at 3-h intervals; we then average these spatial correlations, yielding the average spatial correlation coefficient as a function of forecast hour. It was found that this method of calculating a spatial correlation coefficient yielded about 50% smaller scores than deriving a correlation coefficient from the whole pool of values valid at a certain forecast time. However, the average spatial correlations are more relevant for day-to-day forecast performance and are therefore used in this study.
We also calculated a correlation coefficient for each station based on all pairs of observed and model values available from all model runs at each station. This correlation coefficient will be referred to as the temporal correlation coefficient. We chose not to calculate individual temporal correlation coefficients for each model run and then average over all model runs, as the number of available observed and model values from one run and for one station was considered too small (e.g., eight values for 24-h accumulations).
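The correlation measures described above differ only in how the pairs of model and observed values are grouped before the correlation is taken. The sketch below illustrates this with a plain Pearson correlation. It is a simplified illustration: each run contributes a single accumulation window at one forecast hour, and the data layout, function names, and rainfall values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_spatial_correlation(runs):
    """Mean over runs of the across-station correlation within each run."""
    return sum(pearson(model, obs) for model, obs in runs) / len(runs)


def pooled_correlation(runs):
    """Single correlation from all stations of all runs pooled together."""
    model = [v for m, _ in runs for v in m]
    obs = [v for _, o in runs for v in o]
    return pearson(model, obs)


def temporal_correlation(runs, station):
    """Correlation at one station across all runs."""
    return pearson([m[station] for m, _ in runs], [o[station] for _, o in runs])


# Each run is (model values, observed values), indexed by station (mm).
runs = [
    ([40.0, 50.0, 60.0], [60.0, 40.0, 50.0]),  # wet day, pattern misplaced
    ([1.0, 2.0, 3.0], [3.0, 1.0, 2.0]),        # dry day, pattern misplaced
]
avg_spatial = average_spatial_correlation(runs)
pooled = pooled_correlation(runs)
print(avg_spatial, pooled, temporal_correlation(runs, station=0))
```

In this constructed example the within-run patterns are poorly placed, so the average spatial correlation is negative, whereas the pooled correlation is high simply because the model separates the wet day from the dry day. This is the kind of effect that makes pooled scores larger than averaged ones. Note that a run with a constant field would give a zero denominator and would need to be excluded.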
4. Validation results
We will first investigate the performance of our two “best” sets of RAMS experiments S96BFMFULL and S96ECFULL, and contrast the results against the validation of corresponding rainfall forecasts gained directly from the global BRACKFM. Here, our objectives are to characterize deficiencies resulting from forecast errors and to determine whether or not RAMS can add value to the global forecasts for the 20 runs considered. The sensitivity study that motivated the setup for the experiments S96BFMFULL and S96ECFULL is discussed later in this section.
a. General performance
Both RAMS experiments S96BFMFULL and S96ECFULL show a similar behavior in terms of bias statistics and pattern accuracy as measured by BS and ETS for various thresholds (Fig. 2). Light and moderate rain is overforecast, whereas the frequency of heavy rain events is better predicted for the thresholds considered. Accordingly, the pattern accuracy of the predicted rainfall is best for heavy rain, but worst for light rain. In contrast to the RAMS experiments, the global BRACKFM strongly underforecasts heavy rain and shows a smaller range of forecast rain amounts. For all three forecasts considered, BSs and ETSs at most thresholds are similar to those reported for operational models in the literature (e.g., Mesinger 1996; Ebert and McBride 1997). Although there are differences in the validation methodology, this indicates the usefulness of all three forecast products despite the fact that ETSs are small compared to the ideal score of 1.
For heavy rain, both RAMS experiments yield better ETSs than BRACKFM, demonstrating the superiority of the mesoscale model in resolving heavy rainfall (Fig. 2). This is related to the finer horizontal resolution in the mesoscale model, which allows a better representation of the underlying terrain with more realistic mountain heights and slopes (Fig. 1b), thus yielding better results for orographic rain, which is the primary source for moderate and heavy rain in New Zealand (e.g., Sinclair et al. 1997).
For light rain, all three forecasts perform poorly and produce too much light rain, as can be seen from too high BSs and low ETSs in Fig. 2. For S96BFMFULL, this wet bias for light rain increases slightly during the integration. In addition to the wet bias for light rain, S96BFMFULL also strongly overforecasts moderate rain later in the forecast, similar to BRACKFM, but in contrast to S96ECFULL (Fig. 2b). Comparison of S96ECFULL and S96BFMFULL suggests that the wet bias for moderate rain results from forecast errors introduced by the global model rather than from deficiencies in the RAMS model, as the use of analyses as forcing data eliminated the wet bias. A main contributor to inaccurate global forecast fields and in particular an oversmooth moisture field is certainly the lack of observations in the New Zealand region.
The ability of RAMS to yield better forecasts for moderate and heavy rain is also highlighted if we compare spatial correlation scores for S96BFMFULL and BRACKFM for individual forecasts. For forecasts with significant amounts of rain observed across the country, S96BFMFULL yields significantly better spatial correlation scores than BRACKFM, whereas BRACKFM gives better correlations for light rain events (not shown). In total, this leads to comparable average spatial correlation coefficients for both series of forecasts with a slight advantage for BRACKFM.
It is worth mentioning that all validation scores show a strong forecast-to-forecast variability. For example, individual forecasts of 24-h accumulated rainfall between forecast hours 24 and 48 in S96BFMFULL produced spatial correlations as good as 0.86 and as poor as −0.17. Similar findings can be reported for the other scores and the other series of experiments. This variability is partly due to the variability of the observed rainfall for each forecast. This shows that it is inadequate to base model validation on single case studies alone, because the apparent model performance then depends largely on choosing the “right” case. However, single case studies may be useful in investigating certain aspects of model performance and in highlighting typical characteristics, as done later in this section.
The average spatial correlation coefficients for the experiment S96BFMFULL show the decline with forecast hour that is to be expected in an operational environment (Fig. 3a). Forecasts for shorter accumulation periods are consistently poorer, as a longer accumulation period helps to reduce the influence of timing errors common in rainfall forecasts. Similar findings can be reported for BRACKFM (not shown). In contrast, average spatial correlations for S96ECFULL show a significantly slower decline with forecast hour (cf. Figs. 3a and 3b) starting from values similar to the ones in S96BFMFULL during the early portion of the forecast. The slower decline shows that the use of analyses as forcing data reduces error growth during the integration. Note, however, that the average spatial correlation coefficient of S96ECFULL also shows a slight decline with forecast hour despite using analyses of approximately constant quality as forcing data. This decline is likely to be a result of deficiencies in the mesoscale model.
The finding that average spatial correlation coefficients for S96BFMFULL and S96ECFULL are comparable for the early portion of the forecast is surprising. For early forecast hours, the initial fields are the dominating factor in the performance of RAMS, and the difference in the quality of the boundary fields (analyses/forecasts) for S96ECFULL and S96BFMFULL is less important. As the ECMWF TOGA analyses have a higher spatial resolution than the initial data used in S96BFMFULL and therefore allow a better fit to observations, we expected that S96ECFULL would generally lead to better simulations, that is, also during the early portion of the forecast. The most likely explanation why this was not the case is that in the New Zealand region the higher spatial resolution of the ECMWF analyses does not guarantee a higher quality due to the sparsity of observations. Moreover, horizontal diffusion or filtering used in RAMS to suppress numerical noise might smooth out additional detail early in the integration similar to findings reported by Mesinger (1996) for the Eta model. The latter explanation is supported by a comparison of sample moisture fields from S96BFMFULL and S96ECFULL at forecast hour 12, which show similar levels of detail (not shown).
Temporal correlation coefficients for individual stations (Fig. 4) highlight advantages in the RAMS runs in regions where the mesoscale model can provide a more detailed representation of the terrain. An example of this is the central North Island, where both RAMS runs yield excellent temporal correlations (≈0.85), but BRACKFM performs poorly (temporal correlations ≈ 0.45) as it cannot resolve detail in the orography. All three forecasts show the best temporal correlations in areas of orographic rain on windward (western) slopes of larger mountains (cf. Figs. 4a–c and 1b), that is, where large-scale atmospheric forcing and interaction with the terrain dominate. Note that these are also the areas with higher rainfall (Fig. 4d), and the signal measured by the temporal correlation coefficient is thus stronger.
BRACKFM outperforms S96BFMFULL in some regions. In the northern and northeastern part of the North Island or the southern part of the South Island temporal correlations are about 0.1–0.2 lower for the RAMS runs than for BRACKFM. This is caused by the proximity to the lateral boundary of the RAMS grid (Fig. 1b) where relaxation to the global fields and zero vertical velocity takes place. This generates boundary errors as described in Majewski (1997) and suppresses the generation of mesoscale circulations needed for rainfall. The latter point is supported by an examination of sample vertical velocity fields in this region (not shown). The influence of the proximity of the lateral boundary condition is addressed further in our sensitivity study later in this section. Also, the poor performance in the eastern part of the North Island is linked to a poor performance of RAMS for light rain (cf., e.g., Figs. 4b and 4d).
b. Forecast example
In the following we briefly investigate one particular rainfall forecast for the storm of 19–20 October 1996 to further highlight how in some cases RAMS adds significant value to forecasts of orographic rain. During 19–20 October 1996, a long frontal rainband extended over the South Island of New Zealand and the Tasman Sea and moved slowly eastward (Fig. 5). A strong northwesterly flow lay over the country caused by a large anticyclone to the east of New Zealand. This is a typical situation for New Zealand leading to excessive rainfall on the west coast of the South Island (e.g., Wratt et al. 1996).
To investigate the performance of RAMS and BRACKFM, we compare the 24–48-h forecast of daily precipitation from the runs initialized at 1200 UTC 18 October 1996. Only the run from experiment S96BFMFULL is considered in this comparison as only this one is a forecast. For ease of comparison with observational data, we also only consider model values at observing stations.
The RAMS forecast shows an area of heavy precipitation along the west coast of the South Island and a sharp rainfall gradient across the Southern Alps (Fig. 6a). The area of heavy precipitation and the sharp rainfall gradient are in good agreement with the observations (Fig. 6c). In contrast, the forecast from BRACKFM is much smoother, reflecting the coarser spatial resolution of the global model. This coarser spatial resolution makes it impossible to resolve New Zealand’s rugged topography and to capture the observed sharp gradients in rainfall. Due to these limitations in the global model, BRACKFM strongly overestimates the amount of spillover across the alpine divide, whereas spillover appears to be well predicted in the RAMS run (see Fig. 1c for the position of the alpine area). Both of the above findings demonstrate the significant value added to this forecast by the higher resolution of the RAMS run and the better representation of the terrain. Note, however, that the sparsity of the observations makes it difficult to identify deficiencies in the mesoscale model in this regard.
At most stations west of the alpine divide RAMS predicts about 60–110 mm daily rain and these amounts usually agree well with the observations (within 20%–40%), although there is a tendency toward underforecasting. However, some deficiencies in the positioning of the predicted rainfall can be identified: the region of heavy rain in the RAMS run extends farther north than observed, and for the stations shown the largest model rain amount lies about 70 km southeast of the observed peak of 239.5 mm. Moreover, the peak rainfall is strongly underestimated (148.5 mm). We assume this underestimation could be ameliorated by the use of a higher spatial resolution, which would allow an even better representation of the terrain. Revell et al. (1995) found a spatial resolution of 5 km necessary to model accurately the contribution of mountain gradients to rain production in the Southern Alps. However, current computer power does not allow a further increase in resolution in our operational system. Due to an even coarser resolution, underestimation of heavy falls is a more severe problem in BRACKFM than it is in RAMS. Moreover, BRACKFM puts its highest falls even farther to the south than the RAMS run.
c. Sensitivity experiments
We now investigate the sensitivities of the rainfall forecasts to certain aspects of the model setup. To do so we compare the experiment S96BFM with experiments S96BFMDRY, S96BFMMIC, and S96BFMBIG (described in Table 2).
Comparing S96BFM and S96BFMDRY reveals a strong sensitivity of light rain forecasts to the initial soil moisture content. Compared to S96BFM, S96BFMDRY yields a substantially reduced wet bias for light rain, which also resulted in an improved pattern accuracy (Fig. 7). This suggests that an initial soil moisture content of 30% is more representative for New Zealand at this time of the year than 50% and that the poorer performance of RAMS for light rain can partly be attributed to a poor specification of the soil moisture variable in our runs. As expected, moderate and heavy rainfall is mainly unaffected by horizontally homogeneous changes in the initial soil moisture content as other processes such as the interaction of large-scale fields with the terrain dominate in these cases. The better performance of S96BFMDRY compared to S96BFM led us to choose an initial soil moisture content of 30% in S96BFMFULL.
The importance of an adequate specification of soil moisture in numerical weather models has been stressed by many authors and the derivation of a suitable analysis of this field has been identified as a major component of regional and global numerical weather prediction where further progress is needed (e.g., Majewski 1997; Bougeault 1997; Betts et al. 1996). Given the strong sensitivity of model light rain to soil moisture, a homogeneous initialization of the soil moisture content is certainly a major limitation in our forecasts. This initialization was necessary due to the lack of observations. However, a realistic soil moisture field would certainly show large geographical variations, especially given the strong geographical differences in observed rainfall during the trial period (Fig. 4d).
Comparing S96BFM and S96BFMMIC reveals a strong sensitivity of heavy precipitation forecasts of RAMS to processes related to snow, graupel, and aggregates. In general the inclusion of these processes leads to substantially improved bias statistics for heavy falls and accordingly better pattern accuracies (Fig. 7) resulting from an improved range of forecast amounts. For example, the peak 24-h observed precipitation during the trial period was 334.5 mm, and in S96BFMMIC the modeled value is 184.3 mm compared to 97.8 mm for S96BFM. We thus chose to include snow, graupel, and aggregates in our improved setup S96BFMFULL. The additional hydrometeor species also act to produce more light precipitation, leading to poorer bias statistics for light rain (Fig. 7).
Compared to S96BFM, the additional hydrometeor species in S96BFMMIC lead to a net improvement of precipitation forecasts associated with the Southern Alps. Temporal correlations at stations eastward of the main divide are substantially improved, particularly in the Christchurch region (see Fig. 1c for the location of Christchurch). At Christchurch, temporal correlations increased from 0.38 to 0.81 for 24-h accumulations. Similar improvements can be reported for stations in the central North Island. These findings suggest that hydrometeor species such as snow, graupel, and aggregates provide important enhancement mechanisms for precipitation in the Southern Alps and play a key role in determining spillover across the main divide. This is in agreement with findings from, for example, Sinclair et al. (1997), who hypothesized that snow blown over the Southern Alps by strong northwesterly winds is at least partly responsible for spillover events.
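The temporal correlations quoted above compare, station by station, the time series of forecast and observed 24-h accumulations. As a minimal sketch, assuming the statistic is the ordinary Pearson correlation coefficient (the exact definition used in the study is not restated in this section), it could be computed as follows. The function name and the sample values are hypothetical and are not data from the experiments.

```python
import math


def temporal_correlation(forecast, observed):
    """Pearson correlation between two equally long rainfall series (mm)."""
    if len(forecast) != len(observed) or len(forecast) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    # Sums of products of deviations from the respective means.
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_f == 0 or var_o == 0:
        return float("nan")  # undefined when either series is constant
    return cov / math.sqrt(var_f * var_o)


# Hypothetical daily totals (mm) at one station, for illustration only.
forecast_mm = [0.0, 4.0, 12.0, 1.0, 0.0]
observed_mm = [0.0, 6.0, 20.0, 0.5, 0.0]
print(round(temporal_correlation(forecast_mm, observed_mm), 2))
```

A value near 1 means the model captures the timing of wet and dry days at that station, even if the amounts themselves are biased; this is why correlation is reported alongside, rather than instead of, the bias statistics.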
Once snow, graupel, and aggregates are included in the model setup, changing the shape parameter ν for the size distribution of the hydrometeors in the microphysics scheme from ν = 1 in S96BFMMIC to ν = 2 in S96BFMGNU2 did not result in significantly altered simulations (not shown). This suggests that some error is acceptable in modeling the size distribution, and accounting for snow, graupel, and aggregates in the model is more important than the specification of the shape of the size distribution. Nevertheless, other parameters in the microphysics scheme potentially affect the performance of the scheme (e.g., total number concentration of the hydrometeors, the mean diameter), and further studies are needed to investigate their impact (Walko et al. 1995a).
S96BFMBIG shows a significantly decreased range of model rain amounts compared to S96BFM. The largest 24-h accumulated model rain amount in S96BFMBIG reaches 73.1 mm compared to 97.8 mm in S96BFM, leading to poor BSs and ETSs for heavy rain, which are worse than those of S96BFM (Fig. 7). Moreover, rainfall forecasts at stations close to the lateral boundaries show some sensitivity to the domain size, but are not generally improved: although bias statistics indicate a slightly better performance in these regions (not shown), temporal correlations in the southern part of the South Island are about 0.1–0.2 lower (poorer) in S96BFMBIG compared to S96BFM, whereas temporal correlations in the northern part of the North Island are about 0.1 higher (not shown). Overall, S96BFMBIG exhibits a slightly poorer performance for rainfall forecasts than S96BFM.
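To make the categorical scores concrete, the sketch below reads BS as the bias score and ETS as the equitable threat score, and uses their usual contingency-table forms; that these match the exact definitions used in the study is an assumption. The heavy-rain event is taken as a daily amount above 20 mm, following the category definition given in the footnote. All function names and sample values are hypothetical.

```python
def contingency(forecast, observed, threshold=20.0):
    """Count hits, false alarms, misses, and correct negatives
    for the event 'daily rain > threshold' (mm)."""
    hits = false_alarms = misses = correct_neg = 0
    for f, o in zip(forecast, observed):
        f_yes, o_yes = f > threshold, o > threshold
        if f_yes and o_yes:
            hits += 1
        elif f_yes:
            false_alarms += 1
        elif o_yes:
            misses += 1
        else:
            correct_neg += 1
    return hits, false_alarms, misses, correct_neg


def bias_score(hits, false_alarms, misses):
    """Ratio of forecast event count to observed event count."""
    observed_events = hits + misses
    if observed_events == 0:
        return float("nan")
    return (hits + false_alarms) / observed_events


def equitable_threat_score(hits, false_alarms, misses, correct_neg):
    """Threat score corrected for the hits expected by chance."""
    n = hits + false_alarms + misses + correct_neg
    if n == 0:
        return float("nan")
    hits_random = (hits + false_alarms) * (hits + misses) / n
    denom = hits + false_alarms + misses - hits_random
    if denom == 0:
        return float("nan")
    return (hits - hits_random) / denom


# Hypothetical forecast/observed daily totals (mm), for illustration only.
forecast_mm = [25.0, 30.0, 5.0, 0.0, 22.0, 10.0]
observed_mm = [30.0, 10.0, 25.0, 0.0, 40.0, 3.0]
table = contingency(forecast_mm, observed_mm)
print(table, bias_score(*table[:3]), equitable_threat_score(*table))
```

With these definitions a bias score below 1 means the event is forecast less often than it is observed, which is the behavior expected when the range of model rain amounts shrinks; an ETS of 1 is a perfect forecast and an ETS of 0 indicates no skill beyond chance.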
This was unexpected as mesoscale modelers usually argue that a larger domain of the limited-area model leads to better forecasts, as the influence of boundary errors in the region of interest is reduced (e.g., Warner et al. 1997). For our experiments, this argument holds true for the rainfall forecasts for stations closest to the northern boundary, but fails for the stations closest to the southern boundary and for the predicted range of model rain amounts. A likely explanation for the poorer performance of S96BFMBIG in the southern part of the South Island is data sparsity: south of New Zealand the quality of the global forecast data is expected to drop markedly toward higher southern latitudes because observations over the Southern Ocean are sparse. Thus moving the southern boundary of the RAMS domain farther south means that RAMS is forced by global fields of poorer quality, leading to poorer forecasts. Moreover, the poor performance of S96BFMBIG might be a result of model deficiencies, which can have a stronger impact in the larger domain. While beyond the scope of this study, experiments with an altered position of the larger model domain or with a much larger domain and possibly nested grids would be useful to further investigate the impact of the domain size on our simulations.
Comparing the sensitivities investigated in this subsection, we find the strongest response with the inclusion of snow, graupel, and aggregates in the microphysics scheme, highlighting the importance of model physics. Light rain was strongly sensitive to the soil moisture initialization indicating the importance of the treatment of surface processes. Increasing the domain size in data sparse regions does not always lead to better rainfall forecasts. Our experiments showed no significant sensitivity to the shape parameter used to determine the size distribution function for the hydrometeors in the microphysics scheme.
5. Discussion and conclusions
In this paper we have characterized rainfall diagnostics from the RAMS model (Pielke et al. 1992) configured for use over New Zealand in a set of mesoscale forecasting experiments covering the month-long SALPEX96 field campaign. Mesoscale modeling and forecasting is essential in order to resolve features resulting from the atmosphere’s interaction with New Zealand’s varied and mountainous terrain, which is not well represented in global models. Before such a model can be used operationally several issues have to be evaluated: namely, the accuracy and precision of rainfall diagnostics and their sensitivity to the quality of initial data, the size of the domain, and the treatment of surface parameters. Also of interest are indications of the areas in which further investment of effort is likely to bring about most cost-effective improvement.
Rainfall forecasts from RAMS show significant skill for New Zealand. Comparison with rainfall forecasts direct from the global UKMO Fine-Mesh model shows that our 20-km RAMS configuration has higher skill for heavy precipitation associated with orography, especially when its full microphysics scheme is used. In particular, the higher spatial resolution allows RAMS to better resolve the strong mountain and rainfall gradients of New Zealand’s topography, and using detailed microphysics enables RAMS to better model orographic enhancement and spillover. Our study thus highlights the benefits of running a mesoscale model over New Zealand. At present RAMS performs poorly for light rain events without accurate specification of the initial soil moisture field. In general the validation scores for all forecasts considered in this study are of a similar magnitude to those reported in the literature (e.g., Mesinger 1996; Ebert and McBride 1997) indicating the usefulness of the forecast products.
Our experiments show the strongest sensitivity to the complexity of the microphysics scheme and demonstrate that modeling processes related to snow, graupel, and aggregates in the model is essential. The lack of sensitivity to the shape parameter for the hydrometeor size distribution suggests that some error is acceptable in modeling this size distribution. The sensitivities to other parameters specified in the microphysics scheme have not been investigated, as it was beyond the scope of this study. Forecasts of light rain are sensitive to the soil moisture field indicating that the horizontally homogeneous initialization used in our experiments is a severe limitation and a more careful treatment of the soil moisture initialization is desirable. Our experiments with the domain size show that increasing the size of the mesoscale model domain does not always improve rainfall forecasts in data sparse regions.
The sensitivities discussed above suggest that further work is needed regarding the configuration of the RAMS microphysics scheme and the soil moisture initialization. For a better specification of parameters used in the microphysics scheme (e.g., the total number concentration), it seems sensible to make use of the various observations available for the SALPEX96 period and to investigate how precisely the various hydrometeor species act in the model and in reality. To improve the soil moisture initialization it would be beneficial to implement a method to derive a spatially variable soil moisture field, for example, using data assimilation schemes based on screen-level forecast errors (e.g., Bouttier et al. 1993) or satellite inferred heating rates (e.g., McNider et al. 1994; Jones et al. 1998).
Our results show that using analyses as forcing fields at the lateral boundaries significantly limits error growth in the mesoscale model during the integration, highlighting the importance of high quality global forcing fields. Similar findings have been reported by Anthes et al. (1989) for regional models. This finding also suggests that data assimilation would be beneficial for our operational forecasting system: by the time the global forcing data arrives in New Zealand (6–12 h late), more recent observations are locally available (e.g., surface observations, satellite data), and incorporating these data into the forecasting system should help to reduce forecast errors.
One example of locally available data is satellite-derived moisture information. Work at major centers has shown that infrared satellite imagery from geostationary satellites plays a very important role in the mesoscale moisture analysis process. The UKMO supplements sparse upper-air moisture observations with synthetic moisture profiles output from a moisture preprocessing scheme (MOPS; Macpherson et al. 1996, 1995). In a similar fashion the LAPS mesoscale cloud and precipitation analysis scheme described in Albers et al. (1996) has been used with success in the initialization of the RAMS model for mesoscale forecasts in the United States (Snook and Pielke 1995; Snook et al. 1995).
Of course the difficulty of assimilating moisture data into models is stated explicitly by Raymond et al. (1995): “The initialisation and assimilation of cloud and rainwater in numerical models is nontrivial because these quantities are the cumulative products of the transport of atmospheric moisture, both vapour and liquid, plus the physical processes associated with cloud development and decay.” However, there are many studies in this area in which theoretically simple approaches to data assimilation using analyzed precipitation and cloud fields have added value to forecasts (e.g., Wang and Warner 1988; Turpeinen 1990; Nehrkorn et al. 1993; Chang and Holt 1994; Manobianco et al. 1994; Ledvina and Pfaendtner 1995; Wu et al. 1995; Treadon 1996; Peng and Chang 1996). Even basic information such as the delineation of areas of rain/no-rain has proved useful in this regard (Manobianco et al. 1994), so we have high hopes of eventual success.
The authors would like to thank John Lumsden, Roddy Henderson, Andrew Harper, Jeff Copeland, James McGregor, and two anonymous reviewers for their help with and comments on this work, which has been completed under Foundation for Research Science and Technology Contract CO1521 and with the support of a Victoria University of Wellington Ph.D. scholarship.
Corresponding author address: Niels Bormann, National Institute of Water and Atmospheric Research, Ltd., P.O. Box 14-901, Wellington, New Zealand.
The terms “light,” “moderate,” and “heavy” rain used in the following refer to daily rain amounts of 0.2–5, 5–20, and >20 mm, respectively. These categories have been chosen as a compromise between meteorological and statistical significance.