## Abstract

The fifth-generation Pennsylvania State University–NCAR Mesoscale Model (MM5)-based regional climate model (CMM5) simulations of U.S.–Mexico summer precipitation are quite sensitive to the choice of Grell or Kain–Fritsch convective parameterization. An ensemble based on these two parameterizations provides superior performance because distinct regions exist where each scheme complementarily captures certain observed signals. For the interannual anomaly, the ensemble provides the most significant improvement over the Rockies, Great Plains, and North American monsoon region. For the climate mean, the ensemble has the greatest impact on skill over the southeast United States and North American monsoon region, where CMM5 biases associated with the individual schemes are of opposite sign. Results are very sensitive to the specific methods used to generate the ensemble. While equal weighting of individual solutions provides a more skillful result overall, considerable further improvement is achieved when the weighting of individual solutions is optimized as a function of location.

## 1. Introduction

No existing model, global or regional, fully represents the observed climate system. Each model contains substantial climate biases and inherits distinct climate sensitivities, both of which mask the correct prediction of regional climate responses to any forcing. Even for a single model, solutions may vary substantially depending on which physical parameterizations are activated. Thus, consensus weather and climate predictions based on the ensemble of multiple models or multiple physical configurations of a model have recently been highlighted due to their superior skill over those using a single model or configuration (Krishnamurti et al. 1999; Fritsch et al. 2000; Rajagopalan et al. 2002; Palmer et al. 2004; Murphy et al. 2004; LaRow et al. 2005).

Precipitation modeling is particularly sensitive to the cumulus parameterization, especially in summer when convection is most active and model predictive skill is low. While numerous cumulus schemes exist, none performs equally well under all conditions (Wang and Seaman 1997; Giorgi and Shields 1999). Given incomplete physical understanding of convective processes and the large-scale forces that modulate them, different schemes have been developed with different conceptual underpinnings and tunable parameters (Arakawa 2004), neither of which is universal and both of which remain quite uncertain. Their predictive skills are thus highly dependent upon weather or climate regimes (Liang et al. 2004a, b; Mapes et al. 2004; Zhu and Liang 2007). Given this circumstance, significant skill improvement is achievable through the use of an optimal ensemble of multiple cumulus schemes, including regime dependence for activation and relative contribution from the different schemes. This is the motivation for the present study.

The fifth-generation Pennsylvania State University–NCAR Mesoscale Model (MM5)-based regional climate model (CMM5) produces important regional differences in many characteristics when the Grell (1993) and Kain and Fritsch (1993) cumulus parameterization schemes are interchanged, including the diurnal and annual cycles (Liang et al. 2004a, b) and interannual variation (Zhu and Liang 2007). These differences indicate a strong degree of independence and complementarity between the two parameterizations. The purpose of this study is to generate effectively a two-member ensemble in which the relative contributions or weights associated with different parameterizations are optimized locally to yield overall minimum rms errors and/or maximum temporal correlations with observations, and then to demonstrate the superior skill of such an ensemble in simulating U.S.–Mexico summer precipitation for both the climate mean and the interannual variability.

The targeted application of this ensemble technique is improving the prediction of seasonal to interannual precipitation variability (although, in concept, it may have many related applications in weather forecasting and climate change projection). Hence, we use CMM5 simulations of 21-yr historical records, as downscaled from or driven by a global observational reanalysis (see below), to develop the optimization. In principle, it can be applied for the optimal ensemble downscaling of the actual predictions from global climate models like the operational National Centers for Environmental Prediction (NCEP) Climate Forecast System (Saha et al. 2006). In the present context, the word “prediction” and the phrase “predictive skill” are in reference to the CMM5 downscaling ability.

## 2. Model simulations and observations

Liang et al. (2004b) described the CMM5 model formulation, computational domain design, and its skill in reproducing the annual cycle of precipitation using the Grell cumulus scheme for a continuous baseline integration of the period 1982–2002 with a 30-km resolution as driven by the NCEP–Department of Energy (DOE) Atmospheric Model Intercomparison Project II (AMIP II) global reanalysis (Kanamitsu et al. 2002). Liang et al. (2004a) later documented the CMM5 skill in simulating the precipitation diurnal cycle and its dependence on the cumulus parameterization by comparing the baseline integration with a summer simulation using the Kain–Fritsch scheme. They demonstrated that the CMM5 results are sensitive to the choice of cumulus schemes because the skill of individual schemes is regime dependent. In particular, the Grell scheme realistically simulates the nocturnal precipitation maxima over the central United States and the associated eastward propagation of convective systems from the Rockies to the Great Plains where the diurnal timing of convection is controlled by large-scale tropospheric forcing, whereas the Kain–Fritsch scheme is more accurate for the late afternoon peaks in the southeast where moist convection is governed by near-surface forcing (Liang et al. 2004a). Summer rainfall amounts in the North American monsoon region are very poorly simulated by the Grell scheme but well reproduced by the Kain–Fritsch scheme, whereas rainfall amounts from moist convection in the southeast are underestimated by the former and overestimated by the latter (Liang et al. 2004b). The above sensitivity characteristics have also been reflected by the CMM5 simulations of interannual variations of precipitation and surface air temperature (Zhu and Liang 2007).

This study focuses on the summer (June, July, and August) months. Monthly mean precipitation fields for the 1982–2002 summers, as simulated by the CMM5 with the Grell and Kain–Fritsch schemes, are combined with observations to develop and optimize a two-member ensemble and verify its skill. Observations are derived from a composite of three objective analyses, all based on gauge measurements over the United States and Mexico and mapped onto the CMM5 30-km grid mesh [see Liang et al. (2004b) for the data source and analysis procedure].

## 3. Ensemble optimization solution

Given the actual (observed or modeled) precipitation rate $R$, a variable $P$, denoted here as the “estimator,” is first constructed to represent its various transformations:

$$P_X = \frac{R_X - A_X}{D_X}, \qquad X = \mathrm{GR},\ \mathrm{KF},\ M,\ O, \tag{1}$$

where $A$ and $D$ are choices of the offset and scaling; subscripts GR and KF denote the CMM5 simulations using the Grell and Kain–Fritsch cumulus schemes, while $M$ and $O$ denote the model ensemble and observations. For the purpose of this study, $A_M = A_O$ and $D_M = D_O$, since $A_M$ and $D_M$ are not known a priori while seeking the ensemble to best reproduce observations. Assuming $\overline{R}$ and $\sigma$ represent the climate mean and interannual standard deviation of $R$ during the data period, three conventional forms are evaluated below: $A_X = 0$, $D_X = 1$ (including mean bias); $A_X = \overline{R}_X$, $D_X = 1$ (excluding mean bias); and $A_X = \overline{R}_X$, $D_X = \sigma_X$ (excluding mean bias and scaled by the standard deviation). They are referred to as the biased, unbiased, and normalized precipitation estimator, respectively.

The ensemble solution is assumed to be a linear combination of two members, according to

$$P_M(s, t) = \alpha(s)\,P_{\mathrm{GR}}(s, t) + [1 - \alpha(s)]\,P_{\mathrm{KF}}(s, t), \tag{2}$$

where the arguments in parentheses indicate the dependence on space $s$ and time $t$, and $\alpha$ is the weight to be estimated.
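As a concrete illustration of the three estimators and the two-member combination, the following minimal Python sketch works on the time series at a single grid point for one calendar month. It assumes the estimator takes the form P = (R − A)/D, that the weight α multiplies the Grell member, and that σ is the population standard deviation. The function names `estimator`, `combine`, and `to_rate` and the sample rates are hypothetical and are not taken from the CMM5 code.

```python
from statistics import mean, pstdev

def estimator(r, kind):
    """Return (P, A, D) for a series R, with P = (R - A) / D."""
    if kind == "biased":          # keep the mean bias
        a, d = 0.0, 1.0
    elif kind == "unbiased":      # remove the climate mean
        a, d = mean(r), 1.0
    elif kind == "normalized":    # remove the mean, scale by the std dev
        a, d = mean(r), pstdev(r)  # population std dev is an assumption
    else:
        raise ValueError(f"unknown estimator: {kind}")
    return [(x - a) / d for x in r], a, d

def combine(p_gr, p_kf, alpha):
    """Linear two-member ensemble; alpha weights GR (assumed), 1 - alpha weights KF."""
    return [alpha * g + (1.0 - alpha) * k for g, k in zip(p_gr, p_kf)]

def to_rate(p_m, a_o, d_o):
    """Map the ensemble estimator back to a rate using the observed A and D."""
    return [a_o + d_o * p for p in p_m]

# Illustrative (made-up) rates in mm/day at one grid point
r_gr, r_kf, r_o = [1.0, 2.0, 3.0], [2.0, 2.0, 5.0], [2.0, 3.0, 4.0]
p_gr, _, _ = estimator(r_gr, "unbiased")
p_kf, _, _ = estimator(r_kf, "unbiased")
_, a_o, d_o = estimator(r_o, "unbiased")
r_m = to_rate(combine(p_gr, p_kf, 0.5), a_o, d_o)
print(r_m)  # [2.0, 2.5, 4.5]
```

Because the ensemble offset and scaling are taken from observations, the unbiased ensemble carries the observed climate mean by construction and only its anomalies depend on α.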

Our main purpose is to seek the optimal matrix of weight *α* for the ensemble to best represent observations. This requires an objective function *F* to be minimized by the optimization solver in estimating *α*. We consider three objective functions:

$$F = \begin{cases} \mathrm{RMS} \\ 1 - \mathrm{COR} \\ \mathrm{RMS}\,(1 - \mathrm{COR}) \end{cases} \tag{3}$$

where RMS and COR are the pointwise rms error and correlation coefficient between the precipitation rates of the model ensemble ($R_M$) and observations ($R_O$) during the data period. Given a precipitation estimator from Eq. (1), $R_M$ can be calculated from Eq. (2) and an objective function can be constructed from Eq. (3). Thus we evaluate the sensitivity of the result to three different precipitation estimators and three different objective functions used in the optimization procedures. Note that the method using the unbiased estimator with the RMS objective function is equivalent to the “superensemble” technique of Krishnamurti et al. (1999).

The remaining challenge is solving the nonlinear constrained optimization problem to derive the geographic distribution of weight *α*(*s*). A common approach is to apply a linear multiple regression technique along with the least squares minimization of the model−observation difference (e.g., Krishnamurti et al. 1999), but this strategy limits the use of other linear and nonlinear objective functions and conditional constraints. The optimization solver used in this study is Feasible Sequential Quadratic Programming, which is designed to find the optimal solution for the minimization of the maximum of a set of smooth objective functions subject to equality and inequality constraints, linear or nonlinear, and simple bounds on the variables (Zhou et al. 1997). It has been applied by Rajagopalan et al. (2002) in their superensemble creation and by Liang et al. (2005) in developing a new dynamic–statistical land surface albedo parameterization.
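To make the pointwise optimization concrete, the sketch below estimates the weight at one grid point for each of the three objective functions named in the text (RMS, 1 − COR, and RMS(1 − COR)). It is not the Feasible Sequential Quadratic Programming solver used in the study: a brute-force scan over α stands in for it. The bounds 0 ≤ α ≤ 1, the assignment of α to the Grell member, the evaluation of the objective directly on the supplied series, and all function names are assumptions made for illustration.

```python
import math

def rms(x, y):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def cor(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

# The three objective functions F considered in the text
OBJECTIVES = {
    "rms": lambda m, o: rms(m, o),
    "one_minus_cor": lambda m, o: 1.0 - cor(m, o),
    "rms_times_one_minus_cor": lambda m, o: rms(m, o) * (1.0 - cor(m, o)),
}

def optimal_weight(x_gr, x_kf, x_o, objective="rms", steps=100):
    """Scan alpha over [0, 1] (assumed bounds) and return (alpha, F) at the minimum."""
    f = OBJECTIVES[objective]
    best_alpha, best_value = 0.0, float("inf")
    for i in range(steps + 1):
        alpha = i / steps
        x_m = [alpha * g + (1.0 - alpha) * k for g, k in zip(x_gr, x_kf)]
        value = f(x_m, x_o)
        if value < best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value

# Made-up anomalies: each member captures a different part of the signal
alpha, value = optimal_weight([1.0, -1.0, 0.0, 0.0],
                              [0.0, 0.0, 2.0, -2.0],
                              [1.0, -1.0, 2.0, -2.0])
print(alpha, round(value, 3))  # 0.2 0.632
```

Repeating the scan independently at every grid point would give a weight map α(s). A gradient-based constrained solver would be preferable in practice, since the scan resolves α only to 1/steps.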

The optimization requires a long time series of historical observations and model hindcasts to train the ensemble scheme and derive the appropriate weight. Many previous studies have adopted the cross-validation approach because of constraints imposed by short data records, although this approach may mask the advantage of the superensemble over a simple composite (Peng et al. 2002; Kharin and Zwiers 2002). Here we focus on monthly mean variations during the 1982–2002 summers using the leave-one-out cross-validation approach (Michaelsen 1987), where the weight is trained on data from all years except one and then used to make an ensemble prediction for that excluded year. Specifically, 20 yr of data are used for training in each optimization solution and 21 yr of the resulting ensemble prediction are applied for verification. Thus, the total number of monthly data samples for training (verification) is 60 (63), sufficient to obtain robust statistics.
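The leave-one-out procedure can be sketched for a single grid point as follows. This is a simplified illustration under stated assumptions rather than the study's implementation: it covers only the unbiased estimator with the RMS objective, for which the weight has a closed-form least-squares solution, and it clips the weight to [0, 1]. The data layout (a dictionary mapping each year to its June, July, and August values), the assignment of α to the Grell member, and all function names are hypothetical.

```python
def monthly_means(series, years):
    """Mean over the given years for each calendar month (list index)."""
    n_months = len(series[years[0]])
    return [sum(series[y][m] for y in years) / len(years) for m in range(n_months)]

def anomalies(series, years, means):
    """Flatten year-by-month departures from the supplied monthly means."""
    return [series[y][m] - means[m] for y in years for m in range(len(means))]

def fit_alpha(gr, kf, obs):
    """Least-squares alpha for obs ~ alpha*gr + (1 - alpha)*kf, clipped to [0, 1]."""
    num = sum((o - k) * (g - k) for g, k, o in zip(gr, kf, obs))
    den = sum((g - k) * (g - k) for g, k in zip(gr, kf))
    if den == 0.0:
        return 0.5  # members identical: any weight gives the same ensemble
    return min(1.0, max(0.0, num / den))

def leave_one_out(gr, kf, obs):
    """gr, kf, obs: dict mapping year -> list of monthly rates.

    Returns dict year -> predicted anomalies for that held-out year,
    using a weight and monthly means trained on all other years.
    """
    years = sorted(obs)
    predictions = {}
    for held_out in years:
        train = [y for y in years if y != held_out]
        m_gr, m_kf, m_o = (monthly_means(s, train) for s in (gr, kf, obs))
        alpha = fit_alpha(anomalies(gr, train, m_gr),
                          anomalies(kf, train, m_kf),
                          anomalies(obs, train, m_o))
        predictions[held_out] = [
            alpha * (gr[held_out][m] - m_gr[m])
            + (1.0 - alpha) * (kf[held_out][m] - m_kf[m])
            for m in range(len(m_o))
        ]
    return predictions
```

With 21 years and three summer months, each training fold holds 60 monthly samples and the 21 held-out predictions together supply 63 samples for verification, matching the counts given above.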

Note that all calculations in this study are conducted on monthly mean precipitation, whether biased, unbiased, or normalized. As such, statistical quantities $\overline{R}$ and $\sigma$ are obtained for each summer month (June, July, and August) over all years of the target period, while RMS and COR are calculated once from all the months.

## 4. Results

The simplest ensemble is the arithmetic mean (or equal weight composite) of all members (here KF and GR), which defines the minimum skill score over which any effective optimization should improve. We quantify the predictive skill score by two measures: pointwise rms errors and correlation coefficients with observations during the entire verification period 1982–2002. Both are based on interannual anomalies at each grid point, where the monthly climate bias of each respective 20-yr training period is removed from the prediction during the skill assessment. This enables direct comparisons between individual members and their ensembles using different precipitation estimators. Assuming independence of the monthly samples, correlations exceeding +0.21 (+0.29) are statistically significant at the 95% (99%) confidence level.
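The quoted significance thresholds can be checked with a short worked calculation. It assumes, which the text does not state, that a one-sided Student's *t* test is applied to the *N* = 63 verification samples, giving *N* − 2 = 61 degrees of freedom with critical values of about 1.670 (95%) and 2.389 (99%):

```latex
r_c = \frac{t_c}{\sqrt{N - 2 + t_c^{2}}}, \qquad
r_c^{(95\%)} \approx \frac{1.670}{\sqrt{61 + 1.670^{2}}} \approx 0.21, \qquad
r_c^{(99\%)} \approx \frac{2.389}{\sqrt{61 + 2.389^{2}}} \approx 0.29 .
```

Under that assumption the thresholds reproduce the +0.21 and +0.29 values given above. Serial correlation among the monthly samples would reduce the effective sample size and raise both thresholds.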

Figures 1a–d compare frequency distributions of correlation coefficients and rms errors of precipitation interannual anomalies between KF, GR, and their ensembles with various weights. The statistics are based on all 30-km land grids with available observational data over the United States and Mexico. Clearly, KF is on average more skillful than GR mainly because of the general failure of the latter in simulating precipitation features in the south and southeast United States and Mexico (see Liang et al. 2004a, b; Zhu and Liang 2007). Their ensemble with an equal or optimal weight, however, is superior to both individuals because there are distinct regions (such as central United States) where GR is persistently more skillful than KF and thus supplies independent information for the composite improvement. Among the ensembles, the superensemble approach, using the unbiased estimator plus RMS objective function, produces the highest predictive skill of interannual anomalies in both score measures. The superensemble with the locally optimized weight predicts realistic interannual anomalies with large correlations >0.4 and small rms errors <1.5 (mm day^{−1}) over 15% more area as compared with the equal weight.

The objective function (1 − COR) results in the worst ensemble prediction, with a skill score no better than the equal weight composite. The skill is improved by using RMS(1 − COR) but is still worse than that using RMS. Thus the optimization by minimizing RMS is considered the best method, and all subsequent ensemble weights are solved on this basis. On the other hand, the mean climate bias removal *before* the optimization to derive the weight, that is, using the unbiased precipitation estimator, is found to be the most critical step in enhancing the ensemble predictive skill. Without this removal, the biased ensemble gains no skill when the optimal weight is used in place of the equal weight. This supports Yun et al. (2003) in explaining the disagreement of Krishnamurti et al. (1999) with Kharin and Zwiers (2002) about the advantage of the superensemble over the simple ensemble approach. In contrast, the normalization using the standard deviation does not improve the correlation skill but actually degrades the rms error score. This normalization yields optimal weights that are very close to 0.5 everywhere, so its ensemble predictive skill is similar to the equal weight composite.

Figures 1e and 1f compare geographic distributions of correlation coefficients of precipitation interannual anomalies between KF and the superensemble. Clearly, almost everywhere the superensemble increases correlations over KF (recall that KF has an overall higher skill score than GR). The most significant enhancement is identified over the Rockies where light precipitation occurs and over the Great Plains and the North American monsoon region where light to moderate rainfall amounts prevail. Important correlation increases are also seen in the northeast United States. As a result, the superensemble captures observed precipitation anomalies over 90% (80%) of the land area of the United States and Mexico in the domain with correlation coefficients statistically significant at the 95% (99%) confidence level. These include high correlations over most of the western United States, Mexico, and the Great Plains. Note that small differences in the correlation skill between KF, GR, and their ensembles over the northwest reflect a dominance of precipitation induced by the orographically forced uplift, with minimal contributions from convection, as discussed in Liang et al. (2004b) and Zhu and Liang (2006). Thus, using the superensemble, the CMM5 skill in simulating precipitation interannual anomalies over most of the western United States and Mexico is excellent for both summer and winter and likely high in other seasons. Relatively lower skill is produced over the eastern United States, especially along the Gulf Coast where rainfall is heavy, indicating substantial room for further improvement by the superensemble based on more cumulus schemes and/or multiple models. The low correlation in the Gulf Coast could be attributed to variability in precipitation associated with tropical depression and cyclone activity, which may not adequately be resolved by CMM5 at the 30-km grid. 
Errors in the driving reanalysis, especially over the south and east buffer zones, may also play an important role in the CMM5 failure (Liang et al. 2004b).

One disadvantage of the superensemble approach is that it predicts only the anomaly from the observed climate mean of the training period, not the total precipitation amount. In many practical applications, the climate mean precipitation is as essential as the anomaly. This can be best represented by the optimized ensemble using the biased precipitation estimator. Figures 1g–i compare geographic distributions of the climate mean precipitation observed and simulated by KF and the biased ensemble. As shown by Liang et al. (2004b), over the southeast United States and North American monsoon region, rainfall amounts are overestimated by KF but underestimated by GR. As such, the optimal ensemble produces a substantially improved precipitation distribution, superior to KF or GR and their equal weight composite. By the ensemble, monthly mean climate rms errors (mm day^{−1}) are less than 1.0 over most of the domain except for northern Texas (1.2–1.5) and the North American monsoon region (1.5–3.0). These latter areas need further model improvement.
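A short worked relation illustrates why opposite-sign biases favor a weighted ensemble for the climate mean. It is only an idealization: it considers the mean bias alone, whereas the optimization in this study minimizes the rms error, and it assumes that the weight α multiplies the GR member. Suppose that at some location GR underestimates the climate mean by an amount $b_{\mathrm{GR}} > 0$ and KF overestimates it by $b_{\mathrm{KF}} > 0$ (hypothetical symbols introduced here):

```latex
\overline{R}_M - \overline{R}_O
  = \alpha\,(-b_{\mathrm{GR}}) + (1 - \alpha)\,b_{\mathrm{KF}} = 0
\quad\Longrightarrow\quad
\alpha = \frac{b_{\mathrm{KF}}}{b_{\mathrm{GR}} + b_{\mathrm{KF}}}, \qquad 0 < \alpha < 1 .
```

A bias-cancelling weight strictly between 0 and 1 therefore exists only where the two biases have opposite signs, and the equal weight composite removes the bias only in the special case $b_{\mathrm{GR}} = b_{\mathrm{KF}}$.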

## 5. Summary and discussion

It is demonstrated that a two-member ensemble based on the CMM5 simulations using the GR and KF cumulus parameterizations is superior to both individual members because there exist distinct regions where one scheme is persistently more skillful than the other. For the interannual anomaly, the superensemble approach, using the unbiased estimator plus the RMS minimization, produces the highest predictive skill. The removal of the mean climate bias before optimizing the weight is found to be a key procedure that significantly enhances the ensemble predictive skill. In contrast, further normalization by the interannual standard deviation does not improve the skill. The most significant improvement is found over the Rockies, Great Plains, and North American monsoon region where light to moderate rainfall prevails in summer. For the climate mean, the optimal ensemble using the biased estimator results in substantial improvements over the southeast United States and North American monsoon region where the CMM5 biases in GR and KF are of opposite sign. To simulate the total rainfall amount, we suggest integrating the separately optimized ensembles of the climate mean and interannual anomaly by minimizing rms errors using, respectively, the biased and unbiased precipitation estimators. Note that the optimized weight *α* differs considerably between the two, especially over the Rockies, with values near zero for the climate mean and close to one for the interannual anomaly (not shown). This may indicate a scale separation in the model sensitivity to cumulus parameterizations.

The above results raise several important issues that warrant further investigation. First, we implicitly assume that the linear combination of the two CMM5 simulations using only GR and KF cumulus schemes fully represents observations. This would be relaxed and likely improved by incorporating more model physics configurations or multiple models. Second, since the precipitation distribution is known to be non-Gaussian, a more appropriate procedure would be to initially rank the data into several threads, such as light, moderate, heavy, and very heavy rainfall events, and then conduct a separate optimization for each thread. This would require substantially longer data records to obtain statistically robust weights for all threads. We plan to study the ranking effect on the optimization of daily precipitation ensemble forecasts, where data records are sufficient. Third, this study focuses on using multiple model outputs for postcomposite analyses rather than attempting to improve a specific model or a particular parameterization. But our recent research shows that the accuracy of model simulations with specific physical schemes is regime dependent, suggesting that better results are achievable through intelligent optimization of an overall modeling system. This indicates the need for further investigation of techniques similar to Grell and Dévényi (2002), who developed an ensemble cumulus parameterization (ECP) that utilizes a suite of alternative closure assumptions. In concept, this approach applies an ensemble of cumulus parameterizations at every time step and grid point and then feeds back the average of all solutions to the predictive system. Their current ECP implements an identical weight for all closures, implying that each one is an equally likely representation of reality, irrespective of regime, location, and time. Given the known regime dependence of closure assumptions, the ECP appears to have great potential for improvement if appropriate regime-specific weights can be derived. Future work should focus on this area, both for postprocessing output from multiple ensemble members and for developing dynamic weighting of physical parameters and parameterizations.

As a final note, our emphasis on interannual variability makes the developed optimization procedure and optimal weight most applicable to improving the seasonal–interannual prediction of precipitation through the CMM5 ensemble downscaling from actual global climate model forecasts under the present-day condition. We have tested this approach over the U.S.–Mexico domain, where CMM5 with both KF and GR cumulus schemes are driven by the output of a fully coupled general circulation model (Liang et al. 2006); the ensemble mean using the same weight derived in this study from the reanalysis-driven CMM5 simulations significantly reduces overall precipitation biases with individual schemes. We are now experimenting with the application as driven by the operational NCEP Climate Forecast System (CFS) predictions. As pointed out above, the model sensitivity to cumulus parameterizations depends on time scales, where the distribution of the optimal weight may differ between daily and monthly precipitation or change from the present to future climate conditions. As such we cannot guarantee the statistically derived weight is the optimal choice for all. Nonetheless, the procedure for optimizing the ensemble can be applied in general and the interactive ECP with dynamic (rather than statistical) weighting as proposed is anticipated to be more credible for future climate change projections.

## Acknowledgments

We appreciate the constructive comments of Michael Palecki and James Angel. We acknowledge NOAA/ESRL/GSD and NCSA/UIUC for the supercomputing support. The research was partially supported by the U.S. Environmental Protection Agency Award RD-83096301-0 and the National Oceanic and Atmospheric Administration Award NA04OAR4310162. The views expressed are those of the authors and do not necessarily reflect those of the sponsoring agencies or the Illinois State Water Survey.

## Footnotes

*Corresponding author address:* Dr. Xin-Zhong Liang, Illinois State Water Survey, University of Illinois at Urbana–Champaign, 2204 Griffith Drive, Champaign, IL 61820-7495. Email: xliang@uiuc.edu