1. Introduction
Seasonal climate forecasts are necessarily probabilistic, and forecast information is most completely characterized by a probability density function (pdf). Estimation of the forecast pdf is required to measure predictability and to issue accurate forecasts. For reliable forecasts, the difference between the climatological and forecast pdfs represents predictability, and several measures of this difference have been developed to quantify predictability (Kleeman 2002; DelSole 2004; Tippett et al. 2004; DelSole and Tippett 2007). Quantile probabilities are the probabilities assigned to quantile-delimited categories and provide a coarse-grained description of the forecast and climatological pdfs, which is appropriate for ensembles with relatively few members. The International Research Institute for Climate and Society (IRI) issues seasonal forecasts of precipitation and temperature in the form of tercile-based categorical probabilities (hereafter called tercile probabilities), that is, the probability of the below-normal, near-normal, and above-normal categories (Barnston et al. 2003). Forecasts that differ from equal-odds probabilities, to the extent that they are reliable, are indications of predictability in the climate system. Accurate estimation of quantile probabilities is therefore important both for quantifying seasonal predictability and for making climate forecasts.
In single-tier seasonal climate forecasts, initial conditions of the ocean–land–atmosphere system are the source of predictability, and ensembles of coupled model forecasts provide samples of the model atmosphere–land–ocean system evolution consistent with the initial conditions, their uncertainty, and the internal variability of the coupled model. In two-tier seasonal forecasts, ensembles of atmospheric general circulation models (GCMs) provide samples of equally likely model atmospheric responses to a particular configuration of sea surface temperature (SST). Tercile probabilities must be estimated from finite ensembles in either system. A simple nonparametric estimate of the tercile probabilities is the fraction of ensemble members in each category (the counting estimate). Alternatively, the entire forecast pdf, including the tercile probabilities, can be estimated by modeling the ensemble as a sample from an analytical pdf with adjustable parameters for mean, spread, shape, etc. Here we use a Gaussian distribution described by its mean and variance. The counting method has the advantage of making no assumptions about the form of the forecast pdf. Both approaches are affected by sampling error due to finite ensemble size, though to different degrees. This paper is about the impact of sampling error on parametric and nonparametric estimates of simulated and forecast tercile probabilities for seasonal precipitation totals. We analyze precipitation because of its societal importance and because, even on seasonal time scales, its distribution is farther from being Gaussian, and hence more challenging to describe, than quantities like temperature and geopotential height, which have been previously examined.
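For concreteness, the two estimators can be sketched as follows; this is our illustration rather than code from the paper, and the tercile boundaries t_lower and t_upper are assumed to have been computed beforehand from a reference climatology.

```python
import numpy as np
from scipy.stats import norm

def tercile_probs_counting(ens, t_lower, t_upper):
    """Counting estimate: the fraction of ensemble members in each
    tercile-delimited category."""
    ens = np.asarray(ens, dtype=float)
    p_below = np.mean(ens < t_lower)
    p_above = np.mean(ens > t_upper)
    return p_below, 1.0 - p_below - p_above, p_above

def tercile_probs_gaussian(ens, t_lower, t_upper):
    """Gaussian fit estimate: tercile probabilities of a normal distribution
    whose mean and variance are estimated from the ensemble."""
    mu, sigma = np.mean(ens), np.std(ens, ddof=1)
    p_below = norm.cdf(t_lower, mu, sigma)
    p_above = 1.0 - norm.cdf(t_upper, mu, sigma)
    return p_below, 1.0 - p_below - p_above, p_above

# Example: a 20-member ensemble shifted relative to a climatology with zero
# mean and unit variance, whose terciles are at +/- norm.ppf(2/3) ~ 0.431.
rng = np.random.default_rng(0)
ens = rng.normal(0.5, 1.0, size=20)
t = norm.ppf(2.0 / 3.0)
print(tercile_probs_counting(ens, -t, t))
print(tercile_probs_gaussian(ens, -t, t))
```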
In this paper we present analytical descriptions of the accuracy of the counting and Gaussian tercile probability estimators. These analytical results facilitate the comparison of the counting and Gaussian estimates and show how the accuracy of the estimators increases as ensemble size and predictability level increase. The analytical results support previous empirical results showing the advantage of the parametric estimators. Wilks (2002) found that modeling numerical weather prediction ensembles with Gaussian or Gaussian mixture distributions gave more accurate estimates of quantile values than counting, especially for quantiles near the extremes of the distribution. Kharin and Zwiers (2003) used Monte Carlo simulations to show that a Gaussian fit estimate was more accurate than counting for Gaussian distributed forecast variables.
We show how the accuracy of the tercile probability estimates affects the ranked probability skill score (RPSS). The RPSS is a multicategory generalization of the two-category Brier skill score. Richardson (2001) found that finite ensemble size had an adverse effect on the Brier skill score, with low-skill regions being more negatively affected by small ensemble size. Changes in ensemble size that cause only modest changes in the Brier skill score can lead to large changes in the economic value implied by a simple cost–loss decision model, particularly for extreme events (Richardson 2001).
Accurate estimation of tercile probabilities from GCM ensembles does not ensure a skillful simulation or forecast if there are systematic errors in the GCM pdf. Calibration of model probabilities is needed to account for model deficiencies and produce reliable climate forecasts (Robertson et al. 2004). We expect that forecast skill would be improved by reducing sampling error in the GCM probabilities that are inputs to both the calibration system and the procedure to estimate calibration parameters. We investigate the roles of sampling and model error using a 79-member ensemble of GCM simulations of seasonal precipitation made with observed SST; we examine the impact of reducing sampling error on the skill of the simulations with and without calibration. Additionally, we use the GCM data to assess the importance of some simplifying assumptions used in the calculation of the analytical results by comparing the analytical results with empirical ones obtained by subsampling from the large ensemble of GCM simulations.
An important predictability issue relevant to parametric estimation of tercile probabilities is the relative roles of the forecast mean and variance in determining predictability (Kleeman 2002). Since predictability is a measure of the difference between forecast and climatological distributions, identifying the parameters associated with predictability also identifies the parameters that are useful for estimating tercile probabilities. For instance, if the predictability of a system is due to only the changes in the forecast mean, then the forecast mean should also be useful for estimating tercile probabilities. One approach to this question is to identify the parameters that give the most skillful forecast probabilities (Buizza and Palmer 1998; Atger 1999). Kharin and Zwiers (2003) showed that the Brier skill score of hindcasts of 700-mb temperature and 500-mb height was improved when probabilities were estimated from a Gaussian distribution with constant variance as compared with counting; fitting a Gaussian distribution with time-varying variance gave inferior results. Hamill et al. (2004) used a generalized linear model (GLM; logistic regression) to estimate forecast tercile probabilities of 6–10 day and week-2 surface temperature and precipitation and found that the ensemble variance was not a useful predictor of tercile probabilities. In addition to looking at skill, we examine the relative importance of the forecast mean and variance for predictability in the perfect model setting by asking whether including ensemble variance in the Gaussian estimate and the GLM estimate reduces sampling error.
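The connection between the GLM and Gaussian approaches can be made explicit for the constant-variance case; the following short derivation is ours, with t_B denoting the climatological lower tercile:

```latex
% Below-normal probability of a Gaussian forecast with mean \mu and
% constant spread \sigma:
p_B = \Phi\!\left(\frac{t_B - \mu}{\sigma}\right) = \Phi(a + b\mu),
\qquad a = \frac{t_B}{\sigma}, \qquad b = -\frac{1}{\sigma}.
```

That is, a GLM with probit link and the ensemble mean as the sole explanatory variable has exactly the functional form of the constant-variance Gaussian fit, which is why the two estimates can coincide (cf. section 5).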
The paper is organized as follows. The GCM and observation data are described in section 2. In section 3, we derive some theoretical results about the relative size of the error of the counting and fitting estimates and about the effect of sampling error on the ranked probability skill score. The GLM is also introduced and related to Gaussian fitting. In section 4, we compare the analytical results with empirical GCM-based ones and include effects of model error. A summary and conclusions are given in section 5.
2. Data
The precipitation observations used to evaluate model skill and to calibrate model output come from the extended New et al. (2000) gridded dataset of monthly precipitation for the period 1950 to 1998, interpolated to the T42 model grid.
3. Theoretical considerations
a. Variance of the counting estimate
Since the counting estimate pN is not normally distributed, or even symmetric for p ≠ 0.5 (for instance, the distribution of the sampling error necessarily has a positive skew when the true probability p is close to zero), it is not immediately apparent whether its variance is a useful measure. However, the binomial distribution becomes approximately normal for large N. Figure 2 shows that the standard deviation gives a good estimate of the 16th and 84th percentiles of pN for p = 1/3 and modest values of N; in this case, the variance of the counting estimate is 2/(9N). The percentiles are obtained by inverting the cumulative distribution function of the sampling error. Since the binomial cumulative distribution function is discrete, we show the smallest value at which it exceeds 0.16 and 0.84. Figure 2 also shows that for modest-sized ensembles (N > 20) the standard deviation is fairly insensitive to incremental changes in ensemble size; increasing the ensemble size by a factor of 4 is necessary to reduce the standard deviation by a factor of 2.
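The behavior in Fig. 2 is easy to reproduce; the following sketch (ours) compares the exact binomial percentiles of pN with the normal approximation p ± sqrt(p(1 − p)/N) and illustrates the 1/sqrt(N) scaling:

```python
import numpy as np
from scipy.stats import binom

p = 1.0 / 3.0
for N in (10, 20, 40, 80):
    sd = np.sqrt(p * (1.0 - p) / N)       # std dev of the counting estimate
    # smallest values of k/N at which the binomial cdf reaches 0.16 and 0.84
    lo = binom.ppf(0.16, N, p) / N
    hi = binom.ppf(0.84, N, p) / N
    print(f"N={N:3d}  p-sd={p-sd:.3f}  p+sd={p+sd:.3f}  "
          f"binomial 16th={lo:.3f}  84th={hi:.3f}")
```

Quadrupling N from 20 to 80 halves sd, consistent with the factor-of-4 statement above.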
b. Variance of the Gaussian fit estimate
c. Estimates from generalized linear models
We show an example with synthetic data to give some indication of the robustness of the GLM estimate when the population that the ensemble represents does not have a Gaussian distribution. We take the forecast pdf to be a gamma distribution with shape and scale parameters (2, 1). The pdf is asymmetric and positively skewed (see Fig. 3a). Samples are taken from this distribution, and the probability of the below-normal category is estimated by counting, Gaussian fit, and GLM; the Gaussian fit assumes constant known variance, and the GLM uses the ensemble mean as an explanatory variable. Interestingly, the rms error of both the GLM and Gaussian fit estimates is smaller than that of counting for modest ensemble sizes (Fig. 3b). As the ensemble size increases further, counting becomes a better estimate than the Gaussian fit. For all ensemble sizes, the performance of the GLM estimate is better than that of the Gaussian fit.
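A minimal version of this experiment can be written as follows; this sketch is ours, it takes the true below-normal probability to be the climatological 1/3, and it fits the one-parameter probit GLM across many synthetic forecasts, which may differ in detail from the protocol behind Fig. 3:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gamma, norm

rng = np.random.default_rng(1)
shape, scale = 2.0, 1.0
t = gamma.ppf(1.0 / 3.0, shape, scale=scale)  # lower tercile; true p = 1/3
var_known = shape * scale**2                  # known population variance = 2

def rms_errors(N, n_trials=2000):
    ens = rng.gamma(shape, scale, size=(n_trials, N))
    xbar = ens.mean(axis=1)
    k = (ens < t).sum(axis=1)                 # members below the tercile
    p_count = k / N                           # counting estimate
    p_gauss = norm.cdf((t - xbar) / np.sqrt(var_known))  # known-variance fit

    # one-parameter GLM: probit regression of the member counts on the mean
    def negloglik(beta):
        q = norm.cdf(beta[0] + beta[1] * xbar).clip(1e-9, 1.0 - 1e-9)
        return -(k * np.log(q) + (N - k) * np.log(1.0 - q)).sum()
    beta = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead").x
    p_glm = norm.cdf(beta[0] + beta[1] * xbar)

    truth = 1.0 / 3.0
    return [float(np.sqrt(np.mean((q - truth) ** 2)))
            for q in (p_count, p_gauss, p_glm)]

for N in (10, 20, 40, 80):
    print(N, rms_errors(N))
```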
Other experiments (not shown) compare the counting, Gaussian fit, and GLM estimates when the ensemble is Gaussian with nonconstant variance. As expected, the GLM estimate with ensemble mean and variance as explanatory variables and the two-parameter Gaussian fit have smaller error than counting and, for large enough ensemble sizes, than the one-parameter models.
d. Ranked probability skill score
4. Estimates of GCM-simulated seasonal precipitation tercile probability
a. Variance of the counting estimate
We expect especially close agreement between the subsampling calculations and the analytical results of (5) in regions where there is little predictability and the signal-to-noise ratio S² is small, since, for S² = 0, the analytical result is exact. In regions where the signal-to-noise ratio is not zero, though generally fairly small, we expect that the average counting variance still decreases as 1/N. However, there is no guarantee that the Gaussian approximation will provide an adequate description of the actual behavior of the GCM data.
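The subsampling calculation can be sketched as follows (our illustration); we draw subsamples with replacement so that the binomial result p(1 − p)/N applies without a finite-population correction, which may differ in detail from the procedure used for the figures below:

```python
import numpy as np
from scipy.stats import norm

def counting_variance_by_subsampling(full_ens, t_lower, N,
                                     n_draws=2000, seed=0):
    """Empirical variance of the counting estimate of the below-normal
    probability from size-N subsamples of a larger ensemble."""
    rng = np.random.default_rng(seed)
    full_ens = np.asarray(full_ens, dtype=float)
    est = np.array([np.mean(rng.choice(full_ens, size=N) < t_lower)
                    for _ in range(n_draws)])
    return est.var()

# With little predictability, the below-normal probability is near 1/3 and
# the subsampled variance should be close to p(1 - p)/N = (2/9)/N.
full = np.random.default_rng(2).normal(size=79)  # stand-in 79-member ensemble
N = 20
print(counting_variance_by_subsampling(full, norm.ppf(1.0 / 3.0), N))
print((2.0 / 9.0) / N)
```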
Figure 5 shows that, in the land gridpoint average, the variance of the counting estimate is very well described by the analytical result in (5): the difference from the analytical result is on the order of a few percent for the below-normal category probability and less than one percent for the above-normal category probability. The difference in accuracy between the below- and above-normal categories may be due to the below-normal category being more affected by non-Gaussian behavior. Figure 6a shows the spatial variation of the convergence factor, that is, the coefficient of 1/N in the average variance of the counting estimate.
b. Error of counting, Gaussian fit, and GLM estimators
We begin by examining the land gridpoint average of the sampling error of the three methods. Figure 7a shows the gridpoint-averaged rms error of the tercile probability estimates as a function of ensemble size. The error of the counting estimate is well described by theory (Fig. 7a) and is larger than that of the parametric estimates. The one-parameter GLM and the constant-variance Gaussian fit have similar rms error for larger ensemble sizes; the GLM estimate is slightly better for very small ensemble sizes. While the magnitude of the error reduction due to using the parametric estimates is modest, it is equivalent to a substantial increase in ensemble size and hence to a significant savings in computational cost.
The one-parameter estimates, that is, the constant-variance Gaussian fit and the GLM based on the ensemble mean, have smaller rms error than the estimates based on the ensemble mean and variance (Fig. 7b). The advantage of the one-parameter estimates is greatest for smaller ensemble sizes. This result is important because it shows that attempting to account for changes in variance, even in the perfect model setting where ensemble size is the only source of error, does not improve estimates of the tercile probabilities for the range of ensemble sizes considered here (Kharin and Zwiers 2003). The sensitivity of the tercile probabilities to changes in variance is, of course, problem specific.
Figure 8 shows the spatial features of the rms error of the below-normal tercile probability estimates for ensemble size 20. The Gaussian fit with constant variance and the GLM based on the ensemble mean both have error that is, on average, smaller than that of counting, and their average performances are similar. In a few dry regions, especially in Africa, the error of the parametric estimates is larger. This problem with the parametric estimates in the dry regions is reduced when a Box–Cox transformation is applied to the data (not shown), and overall error levels are slightly reduced as well. The spatial features of the rms error when the variance of the Gaussian is estimated, and when the mean and standard deviation are used in the GLM, are similar to those in Fig. 8, but the overall error levels are slightly higher.
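The Box–Cox remedy can be sketched as follows; this is our illustration, and the small offset added to keep the data strictly positive (as scipy.stats.boxcox requires) is an ad hoc choice rather than the paper's implementation:

```python
import numpy as np
from scipy.stats import boxcox, norm

def tercile_probs_gaussian_boxcox(ens, t_lower, t_upper, offset=0.1):
    """Gaussian fit tercile probabilities computed in Box-Cox-transformed
    space; the category boundaries are transformed with the same lambda."""
    y, lam = boxcox(np.asarray(ens, dtype=float) + offset)  # ML estimate of lam
    mu, sigma = y.mean(), y.std(ddof=1)
    tb = boxcox(np.array([t_lower, t_upper]) + offset, lmbda=lam)
    p_below = norm.cdf(tb[0], mu, sigma)
    p_above = 1.0 - norm.cdf(tb[1], mu, sigma)
    return p_below, 1.0 - p_below - p_above, p_above
```

Fitting λ point by point in this way corresponds to the spatially varying λ of Eq. (1) shown in Fig. 1.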
c. RPSS
In the previous section we evaluated the three probability estimation methods in the perfect model setting, applying the estimators to small ensembles and asking how well they reproduce the probabilities obtained from the large ensemble. We now compare the three probability estimation methods in an imperfect model setting by computing their RPSS against observations. We expect the reduction in sampling error to result in improved RPSS, but we cannot know beforehand the extent to which model error confounds or offsets the reduction in sampling error. Figure 9 shows maps of RPSS for ensemble size 20 for the counting, Gaussian fit, and GLM estimates. The results are averaged over 100 random selections of the 20-member ensemble from the full 79-member ensemble. The overall skill of the Gaussian fit and GLM estimates is similar, and both are generally more skillful than the counting estimate.
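The RPSS computation itself is standard; for completeness, a minimal sketch (ours) following the usual definition of the ranked probability score (Epstein 1969; Murphy 1973), with equal-odds tercile probabilities as the reference forecast:

```python
import numpy as np

def rps(probs, obs_cat):
    """Ranked probability score per forecast.
    probs: (n_forecasts, n_categories) forecast probabilities
    obs_cat: (n_forecasts,) integer index of the observed category"""
    probs = np.asarray(probs, dtype=float)
    n, k = probs.shape
    obs = np.zeros((n, k))
    obs[np.arange(n), np.asarray(obs_cat)] = 1.0
    return np.sum((np.cumsum(probs, axis=1) - np.cumsum(obs, axis=1)) ** 2,
                  axis=1)

def rpss(probs, obs_cat):
    """RPSS relative to climatological equal-odds tercile probabilities."""
    clim = np.full_like(np.asarray(probs, dtype=float), 1.0 / 3.0)
    return 1.0 - rps(probs, obs_cat).mean() / rps(clim, obs_cat).mean()
```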
Figure 10 shows the fraction of points with positive RPSS as a function of ensemble size. Again, results are averaged over 100 random draws at each ensemble size, except for N = 79, for which the entire ensemble is used. The parametrically estimated probabilities lead to more grid points with positive RPSS. The Gaussian fit and GLM have similar skill levels, with the GLM estimate having larger RPSS for the smallest ensemble sizes and the Gaussian fit being slightly better for larger ensemble sizes. It is useful to interpret the increases in RPSS statistics in terms of effective ensemble size. For instance, applying the Gaussian fit estimator to a 24-member ensemble gives RPSS statistics that are on average comparable to those of the counting estimator applied to an ensemble of about 39 members. Although all methods show improvement as ensemble size increases, it is interesting to ask to what extent the improvement in RPSS due to increasing ensemble size predicted by (24) is affected by the presence of model error. For a realistic approximation of the RPSS in the limit of infinite ensemble size, we compute the RPSS for N = 1 and solve (24) for RPSS_perfect; we expect that in this case sampling error dominates model error and the relation in (24) holds approximately. Then we use (24) to compute the gridpoint-averaged RPSS for other values of N; the theory curve in Fig. 10 shows these values. In the absence of model error, the count and theory curves of RPSS in Fig. 10 would be the same. However, the effect of model error is such that the curves are close for N = 5 and N = 10 and diverge for larger ensemble sizes, with the actual increase in RPSS being smaller than that predicted by (24).
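The effective ensemble sizes quoted above are consistent with the roughly 40% small-signal reduction in error variance of the Gaussian fit noted in section 5; a back-of-the-envelope check (ours):

```latex
% Counting with N_eff members matches the Gaussian fit error with N members
% when the fit reduces the error variance by a factor of about 0.6:
\frac{p(1-p)}{N_{\mathrm{eff}}} \approx 0.6\,\frac{p(1-p)}{N}
\quad\Longrightarrow\quad
N_{\mathrm{eff}} \approx \frac{N}{0.6},
\qquad N = 24 \;\Rightarrow\; N_{\mathrm{eff}} \approx 40,
```

close to the value of about 39 found empirically.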
The presence of model error means that some calibration of the model output with observations is needed. The GCM ensemble tends to be overconfident, and calibration tempers this. To see if reducing sampling error still has a noticeable impact after calibration, we use a simple version of Bayesian weighting (Rajagopalan et al. 2002; Robertson et al. 2004). In this method, the calibrated probability is a weighted average of the GCM probability and the climatological probability (1/3), with the weights chosen to maximize the likelihood of the observations. There is cross-validation in the sense that the weights are computed with a particular ensemble of size N, and the RPSS is computed by applying those weights to a different ensemble of the same size and then comparing the result with observations. The calibrated counting-estimated probabilities still have slightly negative RPSS in some areas (Fig. 11a), but the overall area of positive RPSS is increased compared to the uncalibrated simulations (cf. Fig. 9a); the ensemble size is 20, and results are averaged over 100 realizations. The calibrated Gaussian and GLM probabilities have modestly higher overall RPSS than the calibrated counting estimates, with noticeable improvement in skillful areas like southern Africa (Figs. 11b,c). We note that a simpler calibration method based on a Gaussian fit whose variance is determined by the correlation between the ensemble mean and observations, as in Tippett et al. (2005), rather than by the ensemble spread, performs nearly as well as the Gaussian fit with Bayesian calibration.
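The weighting step can be sketched as follows (our illustration of the single-weight version described above; the function and variable names are ours):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def calibrate_weight(gcm_probs, obs_cat):
    """Weight w in [0, 1] for p_cal = w * p_gcm + (1 - w)/3 chosen to
    maximize the log likelihood of the observed categories."""
    gcm_probs = np.asarray(gcm_probs, dtype=float)
    obs_cat = np.asarray(obs_cat)
    p_obs = gcm_probs[np.arange(len(obs_cat)), obs_cat]  # prob of what occurred

    def negloglik(w):
        return -np.sum(np.log(np.clip(w * p_obs + (1.0 - w) / 3.0,
                                      1e-9, None)))

    return minimize_scalar(negloglik, bounds=(0.0, 1.0), method="bounded").x

# Cross-validated use, as in the text: fit w on one ensemble of size N, then
# apply p_cal = w * p_gcm + (1 - w)/3 to probabilities from a different
# ensemble of the same size before scoring against observations.
```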
It is interesting to look at examples of the probabilities given by the counting and Gaussian fit estimates to see how their spatial distributions may differ in appearance. Figure 12 shows uncalibrated tercile probabilities for DJF 1996 (ENSO neutral) and DJF 1998 (strong El Niño). The counting and Gaussian probabilities appear similar, with the Gaussian probabilities being spatially smoother.
5. Summary and conclusions
Here we have explored how the accuracy of tercile category probability estimates is related to ensemble size and to the chosen probability estimation technique. The counting estimate, which uses the fraction of ensemble members that fall in the tercile category, is attractive because it is simple and places no restrictions on the form of the ensemble distribution. The error variance of the counting estimate is a function of the ensemble size and the tercile category probability. For Gaussian variables, the tercile category probability is a function of the ensemble mean and variance. Therefore, for Gaussian variables, the counting estimate variance for an individual forecast depends on ensemble size, mean, and variance; the average (over forecasts) counting estimate variance depends on ensemble size and the signal-to-noise ratio. An alternative to the counting estimate is the Gaussian fit estimate, which computes tercile probabilities from a Gaussian distribution with parameters estimated from the forecast ensemble. Like that of the counting estimate, the variance of the Gaussian fit tercile probabilities is a function of the ensemble size and the ensemble mean and variance, and the average variance depends on ensemble size and the signal-to-noise ratio. When the variables are indeed Gaussian, the error variance of the Gaussian fit estimate is smaller than that of the counting estimate by approximately 40% in the limit of small signal. The advantage of the Gaussian fit over the counting estimate is equivalent to a fairly substantial increase in ensemble size. However, this advantage depends on the forecast distribution being well described by a Gaussian distribution. Generalized linear models (GLMs) provide a parametric estimate of the tercile probabilities using a nonlinear regression with the ensemble mean, and possibly the ensemble variance, as predictors. The GLM estimator does not explicitly assume a distribution but, as implemented here, is equivalent to the Gaussian fit estimate in some circumstances.
The accuracy of the tercile probability estimates affects probability forecast skill measures such as the commonly used ranked probability skill score (RPSS). Reducing the variance of the tercile probability estimate is shown to increase the RPSS. We examined this connection in the perfect model setting used extensively in predictability studies, in which the "observations" are assumed to be indistinguishable from an arbitrary ensemble member. We find the expected RPSS in terms of the above- and below-normal tercile probabilities and, for Gaussian variables, in terms of the ensemble mean and variance. Finite ensemble size degrades the expected RPSS in a manner conceptually similar to the way it reduces the expected correlation (Sardeshmukh et al. 2000; Richardson 2001).
Many of the analytical results are obtained assuming that the ensemble variables have a Gaussian distribution. We test the robustness of these findings using simulated seasonal precipitation from an ensemble of GCM integrations forced by observed SST, subsampling from the full ensemble to estimate sampling error. We find that the theoretical results give a good description of the average variance of the counting estimate, particularly in a spatially averaged sense. This means that the theoretical scalings can be used in practice to understand how sampling error depends on ensemble size and level of predictability. Although the GCM-simulated precipitation departs somewhat from Gaussianity, the Gaussian fit estimate has smaller error than the counting estimate. The behavior of the GLM estimate is similar to that of the Gaussian fit estimate. The parametric estimators based on the ensemble mean have the best performance; adding the ensemble variance as a parameter does not reduce the error. This means that, with the moderate ensemble sizes typically used, differences between the forecast tercile probabilities and the equal-odds probabilities are due essentially to shifts of the forecast mean away from its climatological value rather than to changes in variance. Since differences between the forecast tercile probabilities and the equal-odds probabilities are a measure of predictability, this result means that predictability in the GCM is due to changes in the ensemble mean rather than to changes in spread. This result is consistent with Tippett et al. (2004), who found that differences between forecast and climatological GCM seasonal precipitation distributions, as measured by relative entropy, were essentially due to changes in the mean rather than changes in the variance.
The reduced sampling error of the Gaussian fit and GLM is shown to translate into better simulation skill when the tercile probabilities are compared to actual observations. Examining the dependence of the RPSS on ensemble size shows that, although RPSS increases with ensemble size, model error limits the rate of improvement compared to the ideal case. Calibration improves RPSS, regardless of the probability estimator used. However, estimators with larger sampling error retain their disadvantage in RPSS even after calibration. The application of the Gaussian fit estimator to specific years shows that the parametric fit achieves its advantages while also producing probabilities that are spatially smoother than those estimated by counting.
In summary, our main conclusion is that carefully applied parametric estimators provide noticeably more accurate tercile probabilities than do counting estimates. This conclusion is completely rigorous for variables with Gaussian statistics. We find that for variables that deviate modestly from Gaussianity, such as seasonal precipitation totals, the error of the Gaussian fit tercile probabilities is smaller than that of the counting estimates. More substantial deviation from Gaussianity may be treated by transforming the data or using the related GLM approach.
Acknowledgments
We thank Lisa Goddard and Simon Mason for stimulating discussions and Benno Blumenthal for the IRI Data Library. GCM integrations were performed by David DeWitt, Shuhua Li, and Lisa Goddard with computer resources provided in part by the NCAR CSL. Comments from two anonymous reviewers greatly improved the clarity of this paper. IRI is supported by its sponsors and NOAA Office of Global Programs Grant NA07GP0213. The views expressed herein are those of the authors and do not necessarily reflect the views of NOAA or any of its subagencies.
REFERENCES
Atger, F., 1999: The skill of ensemble prediction systems. Mon. Wea. Rev., 127, 1941–1953.
Barnston, A. G., S. J. Mason, L. Goddard, D. G. DeWitt, and S. E. Zebiak, 2003: Multimodel ensembling in seasonal climate forecasting at IRI. Bull. Amer. Meteor. Soc., 84, 1783–1796.
Buizza, R., and T. N. Palmer, 1998: Impact of ensemble size on ensemble prediction. Mon. Wea. Rev., 126, 2503–2518.
DelSole, T., 2004: Predictability and information theory. Part I: Measures of predictability. J. Atmos. Sci., 61, 2425–2440.
DelSole, T., and M. K. Tippett, 2007: Predictability, information theory, and stochastic models. Rev. Geophys., in press.
Epstein, E. S., 1969: A scoring system for probability forecasts of ranked categories. J. Appl. Meteor., 8, 985–987.
Hamill, T. M., J. S. Whitaker, and X. Wei, 2004: Ensemble reforecasting: Improving medium-range forecast skill using retrospective forecasts. Mon. Wea. Rev., 132, 1434–1447.
Kharin, V. V., and F. W. Zwiers, 2003: Improved seasonal probability forecasts. J. Climate, 16, 1684–1701.
Kleeman, R., 2002: Measuring dynamical prediction utility using relative entropy. J. Atmos. Sci., 59, 2057–2072.
Kleeman, R., and A. M. Moore, 1999: A new method for determining the reliability of dynamical ENSO predictions. Mon. Wea. Rev., 127, 694–705.
Kumar, A., A. G. Barnston, and M. P. Hoerling, 2001: Seasonal predictions, probabilistic verifications, and ensemble size. J. Climate, 14, 1671–1676.
McCullagh, P., and J. A. Nelder, 1989: Generalized Linear Models. Chapman and Hall, 387 pp.
Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600.
New, M., M. Hulme, and P. Jones, 2000: Representing twentieth-century space–time climate variability. Part II: Development of 1901–96 monthly grids of terrestrial surface climate. J. Climate, 13, 2217–2238.
Rajagopalan, B., U. Lall, and S. E. Zebiak, 2002: Categorical climate forecasts through regularization and optimal combination of multiple GCM ensembles. Mon. Wea. Rev., 130, 1792–1811.
Richardson, D. S., 2001: Measures of skill and value of ensemble prediction systems, their interrelationship and the effect of ensemble size. Quart. J. Roy. Meteor. Soc., 127, 2473–2489.
Robertson, A. W., U. Lall, S. E. Zebiak, and L. Goddard, 2004: Improved combination of multiple atmospheric GCM ensembles for seasonal prediction. Mon. Wea. Rev., 132, 2732–2744.
Roeckner, E., and Coauthors, 1996: The atmospheric general circulation model ECHAM-4: Model description and simulation of present-day climate. Max Planck Institute for Meteorology Tech. Rep. 218, 90 pp.
Sardeshmukh, P. D., G. P. Compo, and C. Penland, 2000: Changes of probability associated with El Niño. J. Climate, 13, 4268–4286.
Tippett, M. K., R. Kleeman, and Y. Tang, 2004: Measuring the potential utility of seasonal climate predictions. Geophys. Res. Lett., 31, L22201, doi:10.1029/2004GL021575.
Tippett, M. K., L. Goddard, and A. G. Barnston, 2005: Statistical–dynamical seasonal forecasts of central-southwest Asian winter precipitation. J. Climate, 18, 1831–1843.
Wilks, D. S., 2002: Smoothing forecast ensembles with fitted probability distributions. Quart. J. Roy. Meteor. Soc., 128, 2821–2836.
APPENDIX
Error in Estimating Tercile Probabilities
Variance of the counting estimate
Error of the Gaussian fit estimate
Fig. 1. Spatial distribution of λ appearing in the Box–Cox transformation of Eq. (1).
Fig. 2. The 16th and 84th percentiles (see text for details) of the counting estimate pN (solid lines) and p plus and minus the standard deviation of pN (dashed lines) for p = 1/3 (dotted line).
Fig. 3. (a) The gamma distribution with shape and scale parameters (2, 1), respectively, and (b) rms error as a function of ensemble size N for the counting, Gaussian fit, and GLM tercile probability estimates.
Fig. 4. Perfect model measures of potential probability forecast skill: (a) RPS_perfect and (b) RPSS_perfect for DJF precipitation.
Fig. 5. Percent difference between the gridpoint average of the theoretical and empirically estimated variance of the tercile probability estimate for the below-normal and above-normal categories.
Fig. 6. (a) Spatial variation of the convergence factor (the coefficient of 1/N in the average variance) of the below-normal tercile probability counting estimate.
Fig. 7. The rms error of the below-normal probability as a function of ensemble size N for the (a) one-parameter and (b) two-parameter estimates. The gray curves in (a) are the theoretical error levels for the counting and Gaussian fit methods. Fit-2 (GLM-2) denotes the two-parameter Gaussian (GLM) method.
Fig. 8. (a) The rms error of the counting estimate of the below-normal tercile probability with ensemble size 20. The rms error of the counting estimate minus that of (b) the Gaussian fit and (c) the GLM based on the ensemble mean. The gridpoint averages are shown in the titles.
Fig. 9. RPSS of (a) the counting-based probabilities and its difference with that of the (b) Gaussian and (c) GLM-estimated probabilities. Positive values in (b) and (c) correspond to increased RPSS compared to counting. The gridpoint averages are shown in the titles.
Fig. 10. The fraction of land points with RPSS > 0 as a function of ensemble size N.
Fig. 11. As in Fig. 9, but for the Bayesian calibrated probabilities.
Fig. 12. Probability of above-normal precipitation for DJF 1996 estimated by (a) counting and (b) Gaussian fit, and for DJF 1998 using (c) counting and (d) Gaussian fit.