Improved Trend-Aware Postprocessing of GCM Seasonal Precipitation Forecasts

Yawen Shao,a Quan J. Wang,a Andrew Schepen,b Dongryeol Ryu,a and Florian Pappenbergerc

a Department of Infrastructure Engineering, University of Melbourne, Melbourne, Australia
b CSIRO Land and Water, Brisbane, Australia
c European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom

Abstract

Climate trends have been observed over the recent decades in many parts of the world, but current global climate models (GCMs) for seasonal climate forecasting often fail to capture these trends. As a result, model forecasts may be biased above or below the trendline. In our previous research, we developed a trend-aware forecast postprocessing method to overcome this problem. The method was demonstrated to be effective for embedding observed trends into seasonal temperature forecasts. In this study, we further develop the method for postprocessing GCM seasonal precipitation forecasts. We introduce new formulation and evaluation features to cater for special characteristics of precipitation amounts, such as having a zero lower bound and highly positive skewness. We apply the improved method to calibrate ECMWF SEAS5 forecasts of seasonal precipitation for Australia. Our evaluation shows that the calibrated forecasts reproduce observed trends over the hindcast period of 36 years. In some regions where observed trends are statistically significant, forecast skill is greatly improved by embedding trends into the forecasts. In most regions, the calibrated forecasts outperform the raw forecasts in terms of bias, skill, and reliability. Wider applications of the new trend-aware postprocessing method are expected to boost user confidence in seasonal precipitation forecasts.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Yawen Shao, yawens@student.unimelb.edu.au


1. Introduction

Skillful seasonal climate forecasts are valuable for managing climate variability and change (An-Vo et al. 2019; Pechlivanidis et al. 2020). Global climate models (GCMs) are commonly employed to produce seasonal climate forecasts (Saha et al. 2014; Kirtman et al. 2014; Johnson et al. 2019). Typically, GCMs are run to generate retrospective forecasts (reforecasts) for a historical period of two to four decades. These reforecasts are mainly used to evaluate forecast performance, to produce tailored products such as anomalies, and to establish calibration models for new forecasts. One issue that has been identified is the inability of the GCM reforecasts to capture observed climate trends (Cai et al. 2009; Krakauer 2017; Shin and Huang 2019; Shao et al. 2021a). This inability lowers seasonal climate forecast skill and reliability and, importantly, undermines user confidence in using the forecasts (Livezey and Timofeyeva 2008; Barnston et al. 2010).

Precipitation is a climate variable of crucial importance to climate-sensitive sectors, such as agriculture and water resource management. In recent decades, precipitation has exhibited both increasing and decreasing trends around the world (Hartmann et al. 2013). For example, mixed statewide trends have been observed in seasonal and annual precipitation in North Carolina in the United States for 1950–2009 (Sayemuzzaman and Jha 2014). In Australia, the southwest and southeast have shown declining trends in April–October precipitation, while most of northern Australia has received more precipitation across all seasons since the 1970s, particularly during the northern wet season from October to April (Bhend and Whetton 2015; CSIRO and BoM 2020; Wasko et al. 2021). Despite notable changes in precipitation, the observed trends are often sensitive to the evaluation period and may vary substantially over time (Hartmann et al. 2013); their associated uncertainty therefore needs to be carefully interpreted.

Previous research has explored how climatic trends are represented in seasonal precipitation forecasts produced by GCMs. Huang et al. (2019) compared trends in observations and 2-month-ahead U.S. seasonal precipitation reforecasts from a modified version of the CFSv2 model for 1958–78, 1979–99, and 2000–17 separately. They found that the reforecasts roughly reproduced the observed trends in winter over the full 60-yr period and in spring and summer since the 2000s. However, the reforecasts failed to capture the observed trends in spring, summer, and autumn during 1958–78 and 1979–99. In this study, we will demonstrate that precipitation reforecasts of ECMWF SEAS5 (Johnson et al. 2019) fail to match trends seen in observations over parts of Australia in some seasons.

While an ultimate solution to the trend mismatch problem lies in further improving the GCMs, there is a practical approach that can yield more immediate benefits: observed climate trends can be embedded into forecasts through statistical postprocessing of GCM raw forecasts. Postprocessing has, in the past, aimed at removing biases and improving skill and reliability of forecasts. Methods for addressing the trend issue in seasonal forecasts are beginning to emerge (Krikken et al. 2016; Dirkson et al. 2019). Most recently, a trend-aware method was developed to embed trends while also achieving the other aims of postprocessing (Shao et al. 2021a,b). Building on a Bayesian joint probability (BJP) modeling approach (Wang et al. 2009; Wang and Robertson 2011; Wang et al. 2019), this method explicitly models trends in both observations and GCM forecasts. It has been shown to be effective for postprocessing seasonal temperature forecasts (Shao et al. 2021a,b).

Before employing the trend-aware method to postprocess forecasts of seasonal accumulated precipitation, we need to give careful attention to the following special characteristics of precipitation data: 1) precipitation records have a natural lower bound of zero, which is not directly compatible with the use of a continuous bivariate normal distribution in the trend-aware model (Shao et al. 2021a); 2) precipitation amounts can be strongly positively skewed; and 3) precipitation records and trends are often associated with large uncertainties (Hartmann et al. 2013), and trend magnitude varies widely across regions (Kumar et al. 2013) as a result of underlying physical processes (Rowell 2012) that give precipitation lower spatial and temporal autocorrelation than temperature.

In this study, we extend the trend-aware method for postprocessing GCM forecasts of precipitation. New formulations and evaluation features are introduced to account for the above characteristics of precipitation amounts. We evaluate the improved method on ECMWF SEAS5 seasonal forecasts of precipitation (i.e., total precipitation, ECMWF parameter ID 228) for the Australian continent.

The remainder of the paper is organized as follows. Section 2 introduces datasets of SEAS5 forecasts and observations. Section 3 describes the trend-aware method and forecast verification metrics while the appendix supplements the trend-aware algorithm. Section 4 presents the findings. Section 5 discusses the results and extension opportunities and concludes the study.

2. Study data

In this study, gridded daily precipitation forecasts are derived from the ECMWF SEAS5 seasonal forecasting system (Johnson et al. 2019). This global climate model is composed of atmosphere, land, ocean, and sea ice components. It uses the Integrated Forecast System (IFS) atmosphere model cycle 43r1 with a horizontal resolution of ∼36 km and integrates the Hydrology Tiled ECMWF Scheme of Surface Exchanges over Land (HTESSEL) land surface model into IFS. It implements the Nucleus for European Modeling of the Ocean (NEMO) v3.4.1 model at 0.25° resolution. The atmosphere component of the SEAS5 hindcasts is initialized from ERA-Interim, while the initial conditions for the land surface component are provided by a more recent version of HTESSEL (cycle 43r1) that has been run offline for the hindcast period. The initial conditions for the ocean and sea ice components are generated by the historical reanalyses (ORAS5) from an operational ocean analysis system, OCEAN5. For ensemble generation, perturbations are applied to the atmosphere initial conditions, while perturbations to the assimilated observations and the surface forcing fields are used for the ocean initial conditions. Furthermore, both the stochastically perturbed physical tendency scheme and the stochastic kinetic energy backscatter scheme are used to perturb the atmospheric model in the generation of all ensemble members. The greenhouse gas (GHG) radiative forcing implemented in SEAS5 uses a seasonally varying climatology from the Monitoring Atmospheric Composition and Climate reanalysis (Inness et al. 2013). This climatology is scaled to capture the trend in GHG emissions, following CMIP5 GHG concentrations during 1981–2000 and CMIP5 RCP 3-PD from 2000 onward. More detailed descriptions of SEAS5 are available in Stockdale (2021) and Johnson et al. (2019).

SEAS5 hindcasts are initialized on the first day of every month for 1981–2016 and run for 7 months ahead. The ensemble generation scheme produces 25 ensemble members to represent forecast uncertainty. In this study, we will mainly present results for the 1-month-lead forecasts of seasonal precipitation for 12 overlapping seasons from January–March (JFM) to December–February (DJF) in Australia (seasons are hereafter abbreviated by the initial letters of the three consecutive months). Forecasts with a 1-month lead time are forecasts for a rolling season beginning in 1 month’s time. Moreover, seasonal forecasts at all lead times (0–4-month lead) will be investigated for selected cases. As an example, for forecasts initialized on 1 January, the forecasts aggregated for JFM have a 0-month lead time, while the forecasts aggregated for FMA have a 1-month lead time.

Monthly observations of precipitation are derived from the Australian Water Availability Project (AWAP) climate dataset (Jones et al. 2009), and then accumulated over three consecutive months to obtain seasonal precipitation data. The AWAP observations at 0.05° resolution are regridded using the area-conservative interpolation method to match the SEAS5 data at 0.4° resolution.

3. Methods

a. Model formulation

We introduce a more advanced version of the trend-aware forecast-calibration method, with adaptations and extensions necessary for postprocessing precipitation forecasts. Precipitation variables pose unique challenges because seasonal amounts follow a skewed distribution that is bounded below at zero. Furthermore, precipitation is highly variable in space and time. Precipitation trends are inherently more uncertain and difficult to detect than temperature trends.

1) Data transformation

The forecast calibration model works under the assumption that the predictor (raw ensemble forecast means y1) and the predictand (observations y2) jointly follow a continuous bivariate normal distribution, with normal marginal distributions of yi, i = 1, 2. Since precipitation amounts are generally highly skewed, we employ a two-parameter log–sinh transformation scheme to facilitate modeling precipitation data using a normal distribution (Wang et al. 2012),

$$y_i = \frac{1}{\lambda_i}\log[\sinh(\varepsilon_i + \lambda_i y_i)], \tag{1}$$

where $\varepsilon_i$ and $\lambda_i$ are transformation parameters, and yi hereafter denotes the transformed variable.

When there are instances of zero precipitation amounts, for example, in dry regions and seasons, zero values are treated as left-censored data (see further below).
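To make the transformation concrete, here is a minimal sketch in Python of the log–sinh transform and its inverse. This is a sketch only: the parameter values below are arbitrary, whereas in the method εi and λi are MAP estimates obtained per grid cell and variable [section 3a(4)].

```python
import numpy as np

def log_sinh(y, eps, lam):
    # Log-sinh transform of Wang et al. (2012): (1/lam) * log(sinh(eps + lam*y))
    return np.log(np.sinh(eps + lam * y)) / lam

def inv_log_sinh(z, eps, lam):
    # Inverse: z = (1/lam) log(sinh(x))  =>  x = arcsinh(exp(lam*z)); y = (x - eps)/lam
    return (np.arcsinh(np.exp(lam * z)) - eps) / lam

# Arbitrary illustrative parameters; in the method, eps and lam are fitted by MAP.
y = np.array([0.0, 15.0, 80.0, 250.0])          # seasonal totals (mm)
z = log_sinh(y, eps=0.01, lam=0.002)            # roughly normal scale
y_back = inv_log_sinh(z, eps=0.01, lam=0.002)   # recovers y up to round-off
```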

2) Model specification

After data transformation, we calculate the anomalies zi(t), t = 1, 2,…, N, from the trendline of the transformed variables yi,
$$z_i(t) = y_i(t) - \alpha_i (t - t_m), \tag{2}$$
where t is the event time, αi is a trend parameter, and tm is chosen to be approximately the time of the middle event in the analysis period. The choice of tm will only affect the mean of the marginal distribution of zi but not the final postprocessing results. The joint distribution of the detrended transformed predictor z1 and detrended transformed predictand z2 is modeled as
$$\mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma}), \tag{3}$$
where μ and Σ are the mean vector and the covariance matrix, respectively.
We use a Bayesian approach to infer the parameter set θ = {μ, Σ, α1, α2}. Before inferring the model parameters, their Bayesian prior distributions need to be specified. For parameters μ and Σ, we use noninformative multivariate Jeffreys priors (Gelman et al. 2014). For the trend parameters αi, i = 1, 2, Shao et al. (2021b) compared two types of prior distribution. The first type was a noninformative uniform prior, and the second was an informative normal prior, centered at zero and with an empirically determined variance. The resulting models were named BJP-t and BJP-ti, respectively. The informative prior in the BJP-ti model incorporates information on the broad range of trends observed across Australia and leads to more stable calibrations. The informative prior is of the form
$$p(\alpha_i) = N(0, m_i^2), \tag{4}$$
where mi is empirically determined.

Shao et al. (2021b) determined mi for temperature variables using all available data (all seasons of the year and all grid cells across the Australian continent). We suggest that this global approach leads to a weakly determined variance parameter because of the vast heterogeneity in the dataset, a problem that may be more detrimental for precipitation applications given the distinct precipitation regimes across, for example, tropical, arid, and temperate regions. We therefore propose a modification to determine mi using local data. That is, by prescribing a neighborhood in terms of grid cells and seasons, mi is uniquely determined on a cell-by-cell and season-by-season basis. Details of this new scheme are given in the appendix.

3) Zero lower bound

Precipitation has a natural zero lower bound. To allow the use of the existing modeling approach, we treat the precipitation variables as left censored, where zero values are treated as having unknown true values that are equal to or below zero (Wang and Robertson 2011). When the variable yi(t) has a constant censoring threshold yic, the corresponding detrended variable zi(t) has a censoring threshold that varies with t,

$$z_i^c(t) = y_i^c - \alpha_i (t - t_m). \tag{5}$$

4) Parameter inference

As a first step, the best set of transformation parameters εi and λi is estimated for each grid cell and for each variable separately by using the method of Bayesian maximum a posteriori (MAP; Schepen et al. 2020). These transformation parameters are fixed for the rest of the modeling process.

The parameter set θ, the unknown censored values, and any missing values are inferred from $D = \{[y_1(t), y_2(t)],\ t = 1, 2, \ldots, N\}$, a sequence of training data pairs. We use Bayesian inference with Gibbs sampling to successively obtain samples of the parameters or variables being inferred (Wang et al. 2019; Shao et al. 2021a).

To begin with the model parameters, the posterior distribution of the model parameter set θ is given by
$$p(\theta \mid D) \propto p(\theta)\, p(D \mid \theta), \tag{6}$$
where p(θ) is the prior distribution for the model parameters and p(D|θ) is the likelihood function. In applying Gibbs sampling, this overall posterior distribution is broken down into a series of conditional distributions for different subsets of parameters. In one iteration, each parameter is sampled from its conditional distribution in turn, with the remaining parameters fixed at their current values. This sampling process continues until convergence, that is, until the sampled parameters follow the same distribution as samples from the overall posterior distribution. The conditional distributions for the model parameters can be found in the online supplemental material.
To deal with missing and censored data of the variables, values are sampled from the conditional distribution,
$$[z_i(t) \mid \cdot\,] = N[\mu_i^*(t), \Sigma_{i,i}^*], \tag{7}$$

where

$$\Sigma_{i,i}^* = \sigma_i^2 - (\rho\sigma_1\sigma_2)^2 / \sigma_{(i)}^2, \tag{8}$$

$$\mu_i^*(t) = \mu_i + \frac{\rho\sigma_1\sigma_2}{\sigma_{(i)}^2}\, [z_{(i)}(t) - \mu_{(i)}], \tag{9}$$
where (i) denotes the index in {1, 2} that is not i; σ1, σ2, and ρ are the parameters that constitute Σ; and μ(i) is the parameter that constitutes μ. For censored values, the sampling of zi(t) is restricted to zi(t) ≤ zic(t). We note that the frequency of zero occurrences has an impact on the inference of the trend parameters αi. With a large number of zero values present, say over 20% of the available data, or with consecutive zero occurrences at the start or end of the training data, the magnitude of the inferred trend may be greater than that inferred without treating the zero values as censored, owing to the restriction on the sampling of zi(t).
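To make the censored-sampling step concrete, the sketch below shows one way such a draw could be implemented, assuming the bivariate normal parameterization of Eqs. (7)–(9); the function name and interface are illustrative, not the authors’ code.

```python
import numpy as np
from scipy import stats

def sample_conditional(z_other, i, mu, sigma, rho, z_cens=None, rng=None):
    # Draw z_i(t) given the other variable z_{(i)}(t), following Eqs. (7)-(9).
    # If z_cens is given, the datum is left-censored and the draw is
    # restricted to z_i(t) <= z_cens via inverse-CDF sampling.
    rng = rng or np.random.default_rng()
    j = 1 - i                                   # index of the other variable
    cond_var = sigma[i] ** 2 - (rho * sigma[0] * sigma[1]) ** 2 / sigma[j] ** 2
    cond_mean = mu[i] + rho * sigma[0] * sigma[1] / sigma[j] ** 2 * (z_other - mu[j])
    sd = np.sqrt(cond_var)
    if z_cens is None:
        return rng.normal(cond_mean, sd)
    # A uniform draw on [0, F(z_cens)] maps to a truncated normal sample;
    # a robust implementation would guard the far-left-tail case F ~ 0.
    u = rng.uniform(0.0, stats.norm.cdf(z_cens, cond_mean, sd))
    return stats.norm.ppf(u, cond_mean, sd)
```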

The sampling of the parameters and the sampling of the missing and censored data are carried out in sequence, and the whole process is repeated 30 000 times to generate inference chains. The first 5000 iterations are discarded as burn-in because the early iterations may not be representative of the actual posterior distribution. The implementation and pseudocode of the Gibbs sampling steps are elaborated in the supplemental material.
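Schematically, the Gibbs loop could look like the following sketch, where the sampler functions are hypothetical stand-ins for the conditional distributions given in the supplemental material.

```python
import numpy as np

N_ITER, BURN_IN = 30_000, 5_000   # iteration counts used in this study

def gibbs_chain(data, init, samplers, rng=None):
    # Cycle through the conditional samplers (parameters, then missing and
    # censored values); keep the draws after burn-in.
    rng = rng or np.random.default_rng()
    state, chain = dict(init), []
    for it in range(N_ITER):
        for name, sampler in samplers.items():
            state[name] = sampler(state, data, rng)  # draw with others fixed
        if it >= BURN_IN:
            chain.append(dict(state))
    return chain
```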

5) Model use for prediction

The established model can be used in predictive mode once all the parameter sets θ are inferred. Given a new transformed predictor value y1(t*), we obtain a calibrated ensemble forecast member y2(t*) corresponding to each set of the model parameters. In each iteration, we treat the predictand as missing and use a Gibbs sampler to sample a new calibrated forecast value z2(t*) from the conditional distribution of the predictand variable given by Eqs. (7)–(9), and retrend it to y2(t*). When the variable y1(t*) is censored, the sampling range of the detrended variable z1(t*) is restricted to z1(t*) ≤ z1c(t*). Again, the first 5000 iterations are discarded as burn-in in predictive mode.

In addition to the steps described above, a pragmatic approach is used to adjust extremely large z1(t*) values that occur in prediction before sampling z2(t*). These large transformed values are considered unrealistic based on the marginal distribution of the transformed raw forecasts. In this study, we set the extreme threshold at a nonexceedance probability of 0.999 based on the marginal distribution of z1 (Wang et al. 2019).

By back-transforming each sampled y2(t*) and converting negative values to zero, we derive and save an ensemble of 1000 calibrated forecast values to represent forecast uncertainty. Readers are referred to the supplemental material for the complete algorithm and pseudocode of the trend-aware method.
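Putting the prediction steps together, here is a sketch of generating one calibrated ensemble member from one sampled parameter set, reusing the log_sinh, inv_log_sinh, and sample_conditional helpers sketched earlier (all names and the parameter layout are illustrative).

```python
def predict_member(y1_new, t_new, t_m, params, eps, lam, rng):
    # One calibrated ensemble member from one sampled parameter set.
    # params: dict with "mu", "sigma" (length-2), "rho", and "alpha" (trends);
    # eps, lam: fixed log-sinh parameters for predictor (0) and predictand (1).
    # Transform and detrend the new predictor value
    z1 = log_sinh(y1_new, eps[0], lam[0]) - params["alpha"][0] * (t_new - t_m)
    # Sample the predictand anomaly from its conditional distribution, Eqs. (7)-(9)
    z2 = sample_conditional(z1, i=1, mu=params["mu"], sigma=params["sigma"],
                            rho=params["rho"], rng=rng)
    # Retrend, back-transform, and set any negative amount to zero
    y2 = inv_log_sinh(z2 + params["alpha"][1] * (t_new - t_m), eps[1], lam[1])
    return max(float(y2), 0.0)
```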

b. Forecast verification

In this study, we evaluate and compare the 1000 ensemble members of the trend-aware BJP-ti calibrated forecasts with the 25 ensemble members of the raw forecasts and the 1000 ensemble members of the BJP calibrated forecasts. The postprocessing models are established separately for each grid cell, each season, and each lead time under a leave-one-year-out cross-validation scheme, in which the pair of data points for the year left out is omitted from the data series and verified against the calibration model trained on the remaining data. This configuration is only appropriate for validating the anomaly component rather than the trend component. For the latter, the results from cross validation are similar to those of model fitting, which is an inherent limitation of trend evaluation over the record period. The cross-validated forecasts over the hindcast period may contain artificial skill because information from the future period is used to train the calibration model. However, such future information would not be available when real-time forecasts are calibrated for operational use. Consequently, real-time forecasts may have lower skill than expected after calibration (Risbey et al. 2021). Given the short data records, alternative validation methods, such as validating the forecasts at the start and end of the full evaluation period, are subject to large sampling uncertainties because there are not sufficient events to train the model and cover the multidecadal climate variability (Huang et al. 2019). As a result, such methods are not suitable for the model validation here (Shao et al. 2020).

We assess the ensemble forecasts via trend testing methods and forecast verification tools. For trend analysis, we use the Theil–Sen approach (Theil 1992; Sen 1968) to calculate trend slopes in the observations, raw ensemble forecast medians, and calibrated ensemble forecast medians. This nonparametric trend detection technique does not require the estimated trend to be linear or to conform to a Gaussian distribution (Kumar et al. 2013). In addition, this method can deal with positively skewed distributions and is not sensitive to extreme values (Sayemuzzaman and Jha 2014).

Given a data sequence y(t), t = 1, 2, …, N, the Theil–Sen slope is calculated as
$$\beta = \operatorname{median}\left[\frac{y(b) - y(a)}{b - a}\right], \quad \text{for all } 1 \le a < b \le N. \tag{10}$$

Here, the Theil–Sen slope β is the median value of the slopes estimated from [N(N − 1)]/2 combinations of two data points in the data sequence.
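A compact implementation of Eq. (10), assuming unit time spacing, is sketched below; scipy.stats.theilslopes offers an equivalent routine with confidence intervals.

```python
import numpy as np

def theil_sen_slope(y):
    # Median of the slopes from all N(N-1)/2 pairs of points, Eq. (10)
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    a, b = np.triu_indices(len(y), k=1)   # all index pairs with a < b
    return np.median((y[b] - y[a]) / (t[b] - t[a]))
```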

The statistical significance of the trend is checked by the nonparametric two-sided Mann–Kendall test (Mann 1945; Kendall 1975), a distribution-free method frequently applied in hydroclimatic trend testing (Kumar et al. 2013). Here, we check the trend significance for each grid cell across Australia and summarize the findings based on the individual test results. Wilks (2016) pointed out that global statistical significance is often overinterpreted when the input to a global hypothesis test is a collection of results from individual local hypothesis tests. Although we do not have this problem for all the significance tests in this study, we advise caution in interpreting the collective significance of multiple hypothesis tests.
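A minimal sketch of the test using the normal approximation, ignoring the tie corrections that a production implementation would include:

```python
from math import erf, sqrt
import numpy as np

def mann_kendall(y):
    # Two-sided Mann-Kendall trend test via the normal approximation
    y = np.asarray(y, dtype=float)
    n = len(y)
    a, b = np.triu_indices(n, k=1)
    s = np.sign(y[b] - y[a]).sum()                 # MK statistic S
    var_s = n * (n - 1) * (2 * n + 5) / 18.0       # Var(S) under no trend
    z = (s - np.sign(s)) / sqrt(var_s) if s != 0 else 0.0
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))  # two-sided p
    return s, p
```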

Forecast skill is evaluated by the continuous ranked probability score (CRPS; Matheson and Winkler 1976). For an individual event t, the CRPS is defined as
$$\mathrm{CRPS}(t) = \int_{-\infty}^{\infty} \{F(t, y) - H[y - y_{\mathrm{obs}}(t)]\}^2\, dy, \tag{11}$$

$$H[y - y_{\mathrm{obs}}(t)] = \begin{cases} 0 & \text{if } y < y_{\mathrm{obs}}(t) \\ 1 & \text{if } y \ge y_{\mathrm{obs}}(t), \end{cases} \tag{12}$$
where H is the Heaviside step function; for an event t, F(t, y) is the cumulative distribution function (CDF) of the ensemble forecasts, and yobs(t) is the observed value. For each grid cell, the CRPS skill score compares the averaged CRPS of the target forecasts with the averaged CRPS of the reference forecasts across all events,

$$\mathrm{CRPS\ skill\ score} = \frac{\overline{\mathrm{CRPS}}_{\mathrm{ref}} - \overline{\mathrm{CRPS}}}{\overline{\mathrm{CRPS}}_{\mathrm{ref}}} \times 100\ (\%). \tag{13}$$

Here, reference forecasts are leave-one-year-out cross-validated climatology ensemble forecasts generated from the BJP model. A higher value of the resulting CRPS skill score indicates more skillful forecasts.
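For a finite ensemble, the integral in Eq. (11) can be evaluated through the equivalent energy form CRPS = E|X − yobs| − ½E|X − X′|; a sketch of the score and of the skill score in Eq. (13):

```python
import numpy as np

def crps_ensemble(ens, obs):
    # Energy form of Eq. (11) for an empirical ensemble CDF:
    # CRPS = E|X - obs| - 0.5 * E|X - X'|
    ens = np.asarray(ens, dtype=float)
    term1 = np.mean(np.abs(ens - obs))
    term2 = 0.5 * np.mean(np.abs(ens[:, None] - ens[None, :]))
    return term1 - term2

def crps_skill_score(crps_target, crps_ref):
    # Eq. (13): skill relative to the reference climatology forecasts, in %
    return (np.mean(crps_ref) - np.mean(crps_target)) / np.mean(crps_ref) * 100.0
```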

Forecast bias is measured by the percentage bias, which is the relative error between ensemble forecast means and observations,
$$\mathrm{bias} = \frac{\sum_{t=1}^{N} [\bar{y}(t) - y_{\mathrm{obs}}(t)]}{\sum_{t=1}^{N} y_{\mathrm{obs}}(t)} \times 100\ (\%), \tag{14}$$

where $\bar{y}(t)$ is the ensemble forecast mean and $y_{\mathrm{obs}}(t)$ is the observation for event t in each grid cell. A bias value of 0 indicates that the ensemble forecast means perfectly correspond to the observations.
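In code, Eq. (14) is essentially a one-liner; a sketch:

```python
import numpy as np

def percentage_bias(fc_mean, obs):
    # Eq. (14): relative error of ensemble forecast means vs observations, in %
    fc_mean, obs = np.asarray(fc_mean, float), np.asarray(obs, float)
    return np.sum(fc_mean - obs) / np.sum(obs) * 100.0
```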
Reliability is quantified by a PIT-based score (Renard et al. 2010), measuring the tendency of the PIT (probability integral transform; Gneiting et al. 2007) values to deviate from the corresponding theoretical standard quantiles. Theoretically, a reliable forecasting system has a collection of the PIT values that follows a standard uniform distribution. For each grid cell, the PIT value πt for an event t is given as
$$\pi_t = F[t, y = y_{\mathrm{obs}}(t)], \tag{15}$$
where F(t, y) is the ensemble forecast CDF and yobs(t) is the corresponding observation. When yobs(t) equals zero, we randomly sample a pseudovalue from a uniform distribution on [0, πt] to replace the zero-observation PIT value (Wang and Robertson 2011). The final PIT score is calculated as
$$\text{PIT score} = 1 - \frac{2}{T}\sum_{t=1}^{T}\left|\pi_{(t)} - \frac{t}{T+1}\right|, \tag{16}$$

where $\pi_{(t)}$ is the tth ranked PIT value in increasing order and T is the number of events. The PIT score ranges from 0 (worst reliability) to 1 (perfect reliability).
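A sketch of computing the empirical PIT values, including the pseudovalue treatment for zero observations, and the score of Eq. (16):

```python
import numpy as np

def pit_values(ens, obs, rng=None):
    # Empirical PIT, Eq. (15): F(y_obs) from each event's ensemble; for zero
    # observations, a pseudovalue is drawn uniformly on [0, pi_t].
    rng = rng or np.random.default_rng()
    pit = np.array([np.mean(np.asarray(e) <= o) for e, o in zip(ens, obs)])
    zero = np.asarray(obs) == 0
    pit[zero] = rng.uniform(0.0, pit[zero])
    return pit

def pit_score(pit):
    # Eq. (16): 1 - (2/T) * sum |pi_(t) - t/(T+1)| over ranked PIT values
    pit = np.sort(np.asarray(pit, dtype=float))
    T = len(pit)
    return 1.0 - 2.0 / T * np.sum(np.abs(pit - np.arange(1, T + 1) / (T + 1)))
```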

Sharpness is checked by the sharpness ratio, defined as the ratio of the average interval width of the target ensemble forecasts to that of the BJP calibrated ensemble forecasts (the baseline). Here, for each grid cell, the average interval width is calculated as the average of the central 50% [0.25, 0.75] and 90% [0.05, 0.95] interquantile intervals over all individual events (Gneiting et al. 2007). If the resulting ratio is lower than 1, the target ensemble forecasts are interpreted as sharper than the BJP calibrated forecasts.
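And a sketch of the sharpness ratio, averaging the central 50% and 90% interval widths over events:

```python
import numpy as np

def sharpness_ratio(target, baseline):
    # Ratio of average central 50% and 90% interval widths, target vs the
    # BJP baseline; inputs are arrays of shape (n_events, n_members).
    def avg_width(ens):
        q = np.quantile(ens, [0.05, 0.25, 0.75, 0.95], axis=1)
        return np.mean(((q[3] - q[0]) + (q[2] - q[1])) / 2.0)
    return avg_width(np.asarray(target)) / avg_width(np.asarray(baseline))
```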

4. Results

a. Trend of observations and forecasts

The geographic patterns of precipitation trends in observations and model forecasts for the period FMA 1981–JFM 2017 are shown in Fig. 1. The AWAP observational precipitation dataset is spatially interpolated from the rain gauge network across Australia. In central Australia, the network is extremely sparse, resulting in missing daily data in several clusters (see Fig. 2 in Brocca et al. 2016). Although this study uses seasonal precipitation data aggregated from the monthly precipitation product, which has been recalibrated for improved data quality, we hereafter focus on interpreting the results in data-rich regions.

Fig. 1. Decadal Theil–Sen slopes for observations, raw, BJP, and BJP-ti calibrated ensemble forecast medians of seasonal precipitation for 12 overlapping seasons with 1-month lead time, from FMA 1981 to JFM 2017.

Decadal trends of the observed precipitation are evident across Australia. Precipitation increases at over 20 mm decade⁻¹ during the warmer seasons (from OND to JFM) across many parts of the continent (first column in Fig. 1). Strong increasing trends are found in northern Australia during the northern wet seasons (i.e., October–April). Meanwhile, drying trends are dominant during the southern wet seasons (i.e., April–November) in southwestern and southeastern Australia.

Raw forecasts reproduce the observed precipitation trends over some parts of Australia (second column in Fig. 1), such as the wetter conditions in the north during the warmer seasons and the drier conditions in the southeast in most seasons. However, these forecast trends are generally weaker than the observed ones. In many other parts, such as western Australia in FMA, trends in the raw forecasts differ markedly from the trends in observations. Likewise, the BJP calibrated forecasts also fail to capture the observed trends in some regions (third column in Fig. 1). Visually, the spatial patterns of the raw and BJP calibrated forecasts are remarkably consistent in most regions, but different trends are still discernible elsewhere.

By using the trend-aware postprocessing model, BJP-ti, the resulting calibrated forecasts are shown to reproduce the observed trends more accurately than raw and BJP calibrated forecasts (fourth column in Fig. 1). Strong trends are closely reproduced in, for example, the wetter conditions in the north from OND to JFM, and the drier conditions in the southwest from AMJ to JJA.

b. Overall performance of the forecasts

CRPS skill score maps for the BJP-ti calibrated forecasts are shown in Fig. 2 (left plot). Skillful forecasts dominate large areas from ASO to NDJ, particularly in northern Australia, while climatology-like forecasts (i.e., scores ranging from −5% to 5%, in white) are widespread in some seasons, such as MJJ and DJF. Interestingly, for individual cells where the observed trend is statistically significant at the 10% significance level (Fig. 3), the skill of the BJP-ti forecasts is mostly no worse than climatology. Note that in this study, the collection of individual test results for trend significance is not indicative of regional significance in Fig. 3 (Wilks 2016). For example, it is not valid to say that statistically significant trends at the 10% significance level prevail in northern Australia from OND to DJF.

Fig. 2. (left) CRPS skill score (%) of the BJP-ti calibrated ensemble forecasts and (right) the score difference between the BJP-ti and BJP calibrated ensemble forecasts of seasonal precipitation at 1-month lead time.

The skill score difference between the BJP-ti and BJP calibrated forecasts is presented in the right plot of Fig. 2, indicating how forecast skill changes when observed trends are embedded into the forecasts through postprocessing. Noticeable skill improvement (in darker blue) of the BJP-ti calibrated forecasts predominantly occurs in the trend-significant clusters where the BJP calibrated forecasts do not properly represent the observed trends, such as in northwestern Australia in SON. We also find that rectifying the trend direction or counteracting large trend differences rarely improves forecast skill where observed trends are nonsignificant, such as in parts of eastern Australia in SON and OND. Overall, the BJP-ti calibration leads to detectable skill gains, with score increases larger than 5%, in many trend-significant regions, for example, parts of northern Australia from OND to DJF, and leads to slight skill losses, with score decreases of less than 5% relative to the BJP calibration, elsewhere, such as northern Australia in FMA and MAM. The BJP-ti calibration also turns negative skill scores positive in some regions, such as parts of western Australia in AMJ and MJJ, and parts of northern Australia in NDJ and DJF (not shown). We also employ a bootstrap procedure, as detailed in Schepen et al. (2016) and Shao et al. (2021a), to test whether the BJP-ti calibration significantly improves or worsens the CRPS skill score relative to the BJP calibrated forecasts at the 5% level. Significant skill improvement is largely found in the regions with score increases larger than 5%, while fewer cases show significant skill worsening (Fig. S1 in the online supplemental material). These findings indicate that the underlying decadal trend is an important contributor to the interannual variability of precipitation, especially in the regions with significant trends, where skill improvement is pronounced when the underlying historical trend is properly incorporated into the calibrated forecasts. Other sources also contribute to high forecast skill, such as good modeling of the teleconnection between large-scale climate drivers and seasonal precipitation (Wang et al. 2019).

Fig. 3. Statistical significance of the trend in observations at the 5% and 10% significance levels for seasonal precipitation using the Mann–Kendall test.

We then compare the overall performance of the raw, BJP, and BJP-ti calibrated ensemble forecasts in terms of the trend difference between the model forecast medians and observations (mm decade⁻¹), CRPS skill score (%), percentage bias (%), PIT score, and sharpness ratio. We pool the forecast verification results from all grid cells and all seasons to plot the proportion of cells that do not exceed a given score value (Fig. 4). Consistent with the trend results shown in Fig. 1, there are substantial trend discrepancies between the raw forecasts, the BJP calibrated forecasts, and the observations (Fig. 4a). By comparison, only minor trend differences are seen between the BJP-ti calibrated forecasts and observations, reiterating that the BJP-ti calibration is effective at embedding the observed trends into the resulting forecasts.

Fig. 4. Nonexceedance plot comparing the overall performance of the raw, BJP, and BJP-ti calibrated ensemble forecasts at 1-month lead time. Note: the blue line is behind the green line in the PIT score plot.

For forecast skill (Fig. 4b), around half of the raw forecasts have negative skill scores, while the BJP and BJP-ti calibrated forecasts rarely have scores lower than −5%. Postprocessing also increases the proportion of cases with positive skill. The BJP calibration produces slightly fewer cases with negative skill, while the BJP-ti and BJP calibrated forecasts are equally skillful in terms of positive skill scores. For forecast bias (Fig. 4c), raw forecasts are largely biased, with a higher proportion being negatively biased. Both postprocessing methods effectively reduce the biases in forecast means, moving the cumulative lines closer to the zero vertical line. The BJP model slightly outperforms the BJP-ti model in removing the biases in forecast means. After postprocessing with the BJP and BJP-ti models, more positive biases than negative ones remain. As discussed in Schepen et al. (2020), in very dry regions, small absolute biases appear as large percentage biases. Moreover, the calibration models introduce parameter uncertainty into the resulting forecasts, which may lead to some extreme values and give rise to higher forecast means in these dry regions. Apart from the visual comparison, we also use the bootstrap method to determine whether the BJP-ti calibration results in a greater number of cases with negative skill, and in a larger bias magnitude, than the BJP calibration, both at the 5% significance level. The results suggest that, overall, the BJP-ti model does not lead to significantly more cases with negative skill but does lead to a significantly larger bias magnitude than the BJP model.

For reliability (Fig. 4d) and sharpness (Fig. 4e), although raw ensemble forecasts tend to have narrower interquantile intervals than the BJP calibrated forecasts, they are generally not reliable, as indicated by their low PIT scores. This means that the raw forecasts cannot accurately estimate the likelihood of the observed events and are thus not informative for forecast users. In contrast, the BJP and BJP-ti calibrated forecasts are comparably reliable (Fig. 4d), while the BJP-ti calibrated forecasts are slightly less sharp (Fig. 4e) than the BJP forecasts due to the introduction of additional trend parameters.

We also explore the forecast performance in the regions with statistically significant observed trends at the 10% level (Fig. S2 in the online supplemental material). Again, the BJP-ti calibrated forecasts represent the observed trend more accurately than both the raw and BJP calibrated forecasts. The BJP-ti calibration leads to more skillful forecasts than the raw and BJP calibrated forecasts for the trend-significant cases. This finding is consistent with Figs. 2 and 3, demonstrating that the BJP-ti calibration leads to apparent skill gains in the trend-significant regions. For forecast bias and reliability, the BJP-ti and BJP models are comparably effective at removing biases and making calibrated forecasts more reliable than raw forecasts. As for sharpness, the BJP-ti calibrated forecasts have narrower interquantile intervals than the BJP calibrated forecasts, indicating that the BJP-ti calibration can produce ensemble forecasts with high sharpness and high reliability in regions with statistically significant observed trends.

c. Forecast performance of selected grid cells

To explore the performance of the BJP-ti model for individual cells and at longer forecast lead times, we select four grid cells (A–D in Fig. 5) for detailed evaluation. The selection criteria are: 1) observed trends are statistically significant at the 5% significance level by the Mann–Kendall test over 1981–2016; 2) the cells are located in regions of high rain gauge density but in different climate zones; and 3) the BJP-ti calibration improves the skill of 1-month-ahead forecasts in one of the four main seasons.

Fig. 5. Locations of the selected cases. Gray contours show the boundaries of the major climate zones in Australia (Peel et al. 2007).

To investigate how raw ensemble forecasts represent observed precipitation trends, we examine the trends of the raw forecast means and of each of the 25 ensemble members at 1-month lead time for winter (JJA) of cell A, spring (SON) of cell B, autumn (MAM) of cell C, and summer (DJF) of cell D (Fig. S3). For a straightforward comparison of trend magnitudes, the trendlines are shown after subtracting the temporal mean of each line, so that all the trendlines meet at the midpoint of the horizontal axis. The trendlines of the raw ensemble forecast means generally fail to follow the magnitude of the observed trends, except for cell A. Among the trendlines of the 25-member raw ensemble forecasts, only a few follow the direction and magnitude of the observed trendlines in cells B and C, while none of the member trendlines is aligned with the magnitude of the observed trendlines in cells A and D.

For these four cases, we plot the BJP and BJP-ti calibrated ensemble forecast quantiles and the linear trendlines estimated from the Theil–Sen slopes and the corresponding intercepts (Fig. 6). The historical data for cells A and B include occurrences of zero precipitation. As shown in Figs. 6a and 6b (right plots), the BJP-ti calibration is capable of concurrently modeling zero and nonzero precipitation values. Compared with the raw and BJP calibrated ensemble forecasts, which do not explicitly follow the trend signal (left plots), the interquantile ranges ([0.25, 0.75] quantiles in deep blue and [0.1, 0.9] quantiles in light blue) of the BJP-ti calibrated forecasts roughly follow the underlying trend (right plots), with the trendline of the forecast medians close to or overlapping the trendline of the observations. This is also evident for cells C and D, which have no zero precipitation values (right plots of Figs. 6c and 6d), where the BJP-ti calibrated ensemble forecasts appear to model the interannual variability more explicitly than the BJP calibrated ensemble forecasts. Overall, the BJP-ti model is capable of properly incorporating the climate trend into the forecasts while improving the prediction of the interannual variability.

Fig. 6. Forecast quantile plots for the selected cells at 1-month lead time. Red dots are observed data. White squares are calibrated forecast median values. Dashed black lines are trendlines for raw forecast medians. Black lines are trendlines for calibrated forecast medians. Red lines are trendlines for observed data. Light blue strips are [0.1, 0.9] quantile forecasts. Deep blue strips are [0.25, 0.75] quantile forecasts.

We also examine the performance of the BJP-ti calibrated forecasts at all forecast lead times. For the four target cells, we apply the BJP-ti model to each of the 12 overlapping seasons and each forecast lead time separately, and explore the forecast skill (Fig. 7) and the trend slope. Most of the resulting forecasts in cells D and A are skillful at all lead times. Negative skill scores are no lower than −10% in all cells, indicating that the BJP-ti calibration is highly effective in producing skillful, or at least climatology-like, forecasts at longer lead times. Again, the trend evaluation demonstrates the ability of the BJP-ti model to properly represent the underlying trend in the calibrated forecasts. As shown in Fig. 8, the trend difference between the BJP-ti calibrated forecasts and observations is generally smaller than 20 mm decade⁻¹. In a few other cases, such as NDJ in cell D, although the trend difference is still large at all lead times, the trends in the BJP-ti calibrated forecasts are much closer to the observations than those in the BJP calibrated forecasts, which show trend differences larger than 60 mm (Fig. S4).

Fig. 7. CRPS skill score (%) of the BJP-ti calibrated ensemble forecasts of seasonal precipitation at all lead times. Locations of the grid cells are shown in Fig. 5.

Fig. 8. Trend difference between the BJP-ti calibrated ensemble forecast medians and observations of seasonal precipitation at all lead times. Locations of the grid cells are shown in Fig. 5.

5. Discussion and conclusions

Retrospective forecasts from global climate models (GCMs) have often failed to reproduce historical climate trends, making the forecasts less informative and undermining user confidence. In this study, we aim to resolve the trend mismatch between GCM reforecasts of seasonal precipitation and observations. The trend-aware forecast postprocessing method introduced in previous work has proven effective for postprocessing seasonal temperature forecasts. However, it is not directly applicable to precipitation forecasts because of the unique features of seasonal precipitation amounts, such as following a positively skewed distribution, having zero occurrences, and being more variable and uncertain than temperature. To overcome these challenges, we make significant improvements to the algorithm for postprocessing precipitation forecasts.

This study adopts a refined prior specification scheme for the trend parameters. The previous method employed a single prior distribution for trend parameters across all grid cells and all seasons (Shao et al. 2021b). We found that this approach could result in weaker trends in the calibrated precipitation forecasts, particularly in regions where observed trends are strong (not shown). Here we determine the prior distribution cell by cell and season by season based on neighborhood information, which explicitly accounts for local precipitation regimes at the continental scale. This prior selection approach is applicable to other continental or large-scale studies where sufficient data are available to specify the prior. When the trend-aware method is applied to a single site or a small spatial scale, the informative prior settings elaborated in this study may not be valid. In such cases, it is more suitable to use the trend-aware BJP-t model (Shao et al. 2021a) with noninformative trend priors, or the previous prior selection method that applies a fixed trend prior across all cells and/or seasons.

To comprehensively assess trends in the precipitation variable, we employ the Theil–Sen approach to quantify the trend slope and the Mann–Kendall test to check the statistical significance of the trend. Referring to Figs. 1 and 3, strong but not statistically significant observed trends are detected in some regions, such as eastern Australia from MAM to MJJ. Indeed, a monotonic trend of large magnitude is not necessarily recognized as statistically significant by the Mann–Kendall test (Wang et al. 2020). Apart from trend magnitude, other factors can also affect the power of the test: its effectiveness is reduced by a short data series, large data variance, and positive autocorrelation in the time series.

In this study, we validate the improved trend-aware method on ECMWF SEAS5 forecasts of seasonal precipitation for Australia over the 36-yr period 1981–2016 and compare the trend-aware calibrated forecasts against the raw forecasts and the BJP calibrated forecasts, which have no observed trend information embedded. The results reveal that the trend-aware calibrated forecasts properly capture the observed trends and reproduce the magnitude of strong trends where raw and BJP calibrated forecasts fail to do so. The trend-aware method greatly enhances forecast skill primarily over the regions where the observed trend is statistically significant, while it slightly lowers forecast skill elsewhere. Overall, the resulting forecasts substantially outperform the raw forecasts and perform comparably to the BJP calibrated forecasts in terms of bias, skill, and reliability. With this method, skillful or at least climatology-like forecasts are produced at all lead times in the selected cases.

Our trend-aware postprocessing method has shown promise for forecasting seasonal precipitation. The method can be further improved or extended in subsequent studies. First, alternative prior specification schemes for the trend parameters can be tested in future applications. For example, cells located in the same climate zone could share the same prior value for a season to reduce computational costs. Other forms of the prior distribution are also worth investigating to more explicitly represent local neighborhood behaviors. Other physical and localized properties of the variable, such as mountainous terrain, may also be considered in the prior selection. Second, this work, along with the method developed for postprocessing seasonal temperature forecasts (Shao et al. 2021a,b), is limited to a single-target-variable configuration, which models the relationship between one predictor and one predictand climate variable using a continuous bivariate normal distribution. Their predecessor, the BJP method, allows for simultaneously calibrating multiple climate variables in high-dimensional settings (Wang et al. 2019; Schepen et al. 2020), where any number of predictors and predictands are jointly modeled. In this respect, the trend-aware postprocessing is also extendable to multivariate forecast calibration of hydrometeorological variables, such as forecasting seasonal streamflow for multiple sites and months ahead (Wang and Robertson 2011), and potentially using calibrated forecasts of one variable (e.g., temperature) to forecast another (e.g., precipitation). Third, future work will test the effectiveness of the trend-aware method for postprocessing subseasonal climate forecasts and will improve the algorithm or develop a more robust model calibration scheme where necessary.

Acknowledgments.

This study is funded by the Australian Research Council and industry partners through an ARC Linkage Project (LP170100922). We thank the European Centre for Medium-Range Weather Forecasts (ECMWF) for providing the SEAS5 forecast data. We thank the Australian Bureau of Meteorology for making the AWAP dataset freely available. We thank Research Computing Services at the University of Melbourne for providing computational resources. We appreciate the valuable comments made by three anonymous reviewers.

Data availability statement.

Access to the SEAS5 forecast data is subject to ECMWF terms and conditions. The AWAP dataset can be downloaded from http://www.bom.gov.au/climate/maps/rainfall.

APPENDIX

Trend Prior Selection Scheme

In Eq. (4), mi is set as mi = δi × mi′, where δi is the MAP estimate of the standard deviation of yi derived at the transformation step. By removing from mi the scaling factor δi, which is affected by the transformation parameter values, the remaining factor mi′ should be more uniform spatially and easier to specify.

To select mi′, we first run the trend-aware model with a noninformative prior to determine the trend slope from the transformed data only. Specifically, we run the BJP-t model (Shao et al. 2021a) without a cross-validation setup. Our preliminary analysis shows that the final prior values derived from cross-validation and non-cross-validation experiments do not differ much, so we simply employ the latter setup to reduce computation costs. We record the values of the trend parameters αi sampled in the parameter inference mode. We then calculate the median value of αi to represent the trend slope of yi and divide the slope by δi. This trend slope/δi value is calculated and recorded for each grid cell, for each season, and for raw forecasts and observations separately.

After archiving all trend slope/δi values, the prior parameter mi′ for each case is specified via a temporal and spatial window. That is, for the case of interest, we choose the values of trend slope/δi from the nearby 49 cells (i.e., cells located within a 7 cell × 7 cell square) and from the three adjacent seasons (last, this, and next season). In total, the trend slope/δi values from 147 cells (49 × 3) are pooled to determine the specific prior parameter mi′, estimated as the 75th percentile of the pooled absolute trend slope/δi values. When we pool the cells, only land values within the national boundary are considered, which means that for cells along the coastline, fewer than 147 values are pooled to specify the prior parameters. We note that it is also valid to pool cells within a different spatial window, for example a 5 cell × 5 cell square, and the results do not differ much between the two settings.

The 75th percentile is specified as a compromise to slightly moderate the inferred trend. In our work, the rationale for applying the zero-centered normal informative prior is that trends inferred from a limited training period are subject to sampling errors and thus may not properly represent the underlying observed trends. If mi′ is determined from a higher percentile (e.g., the 90th percentile), the prior may not be useful for moderating the trend because there is a higher chance for the trends to fall in a wider value range; consequently, the inferred trends are likely to closely align with the trends in the training data. In contrast, selecting mi′ from a relatively lower percentile (e.g., the 60th percentile) may overly constrain the inference of the trends and make the inferred trends closer to zero.
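As an illustration of the pooling step, here is a sketch assuming the trend slope/δi values are archived in an array of shape (rows, columns, 12 seasons), with NaN for cells outside the national boundary; the names and layout are ours, not the authors’.

```python
import numpy as np

def prior_scale(slope_over_delta, row, col, season, half_win=3, pct=75):
    # 75th percentile of pooled |trend slope / delta_i| from a 7x7 cell
    # window and three adjacent seasons (up to 49 x 3 = 147 values;
    # fewer near the coastline, where NaN ocean cells are ignored).
    r0, r1 = max(row - half_win, 0), row + half_win + 1
    c0, c1 = max(col - half_win, 0), col + half_win + 1
    seasons = [(season - 1) % 12, season, (season + 1) % 12]
    pooled = slope_over_delta[r0:r1, c0:c1, :][:, :, seasons]
    return np.nanpercentile(np.abs(pooled), pct)
```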

REFERENCES

  • An-Vo, D., S. Mushtaq, K. Reardon-Smith, L. Kouadio, S. Attard, D. Cobon, and R. Stone, 2019: Value of seasonal forecasting for sugarcane farm irrigation planning. Eur. J. Agron., 104, 37–48, https://doi.org/10.1016/j.eja.2019.01.005.

  • Barnston, A. G., S. Li, S. J. Mason, D. G. DeWitt, L. Goddard, and X. Gong, 2010: Verification of the first 11 years of IRI’s seasonal climate forecasts. J. Appl. Meteor. Climatol., 49, 493–520, https://doi.org/10.1175/2009JAMC2325.1.

  • Bhend, J., and P. Whetton, 2015: Evaluation of simulated recent climate change in Australia. Aust. Meteor. Oceanogr. J., 65, 4–18, https://doi.org/10.22499/2.6501.003.

  • Brocca, L., and Coauthors, 2016: Rainfall estimation by inverting SMOS soil moisture estimates: A comparison of different methods over Australia. J. Geophys. Res. Atmos., 121, 12 062–12 079, https://doi.org/10.1002/2016JD025382.

  • Cai, M., C.-S. Shin, H. van den Dool, W. Wang, S. Saha, and A. Kumar, 2009: The role of long-term trend in seasonal predictions: Implication of global warming in the NCEP CFS. Wea. Forecasting, 24, 965–973, https://doi.org/10.1175/2009WAF2222231.1.

  • CSIRO and BoM, 2020: State of the Climate 2020. 24 pp., http://www.bom.gov.au/state-of-the-climate/documents/State-of-the-Climate-2020.pdf.

  • Dirkson, A., W. J. Merryfield, and A. H. Monahan, 2019: Calibrated probabilistic forecasts of Arctic sea ice concentration. J. Climate, 32, 1251–1271, https://doi.org/10.1175/JCLI-D-18-0224.1.

  • Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin, 2014: Bayesian Data Analysis. 3rd ed. CRC Press.

  • Gneiting, T., F. Balabdaoui, and A. E. Raftery, 2007: Probabilistic forecasts, calibration and sharpness. J. Roy. Stat. Soc., 69B, 243–268, https://doi.org/10.1111/j.1467-9868.2007.00587.x.

  • Hartmann, D. L., and Coauthors, 2013: Observations: Atmosphere and surface. Climate Change 2013: The Physical Science Basis, T. F. Stocker et al., Eds., Cambridge University Press, 159–254.

  • Huang, B., C.-S. Shin, and A. Kumar, 2019: Predictive skill and predictable patterns of the U.S. seasonal precipitation in CFSv2 reforecasts of 60 years (1958–2017). J. Climate, 32, 8603–8637, https://doi.org/10.1175/JCLI-D-19-0230.1.

  • Inness, A., and Coauthors, 2013: The MACC reanalysis: An 8 yr data set of atmospheric composition. Atmos. Chem. Phys., 13, 4073–4109, https://doi.org/10.5194/acp-13-4073-2013.

  • Johnson, S. J., and Coauthors, 2019: SEAS5: The new ECMWF seasonal forecast system. Geosci. Model Dev., 12, 1087–1117, https://doi.org/10.5194/gmd-12-1087-2019.

  • Jones, D. A., W. Wang, and R. Fawcett, 2009: High-quality spatial climate data-sets for Australia. Aust. Meteor. Oceanogr. J., 58, 233–248, https://doi.org/10.22499/2.5804.003.

  • Kendall, M. G., 1975: Rank Correlation Methods. 4th ed. Charles Griffin, 202 pp.

  • Kirtman, B. P., and Coauthors, 2014: The North American Multimodel Ensemble: Phase-1 seasonal-to-interannual prediction; phase-2 toward developing intraseasonal prediction. Bull. Amer. Meteor. Soc., 95, 585–601, https://doi.org/10.1175/BAMS-D-12-00050.1.

  • Krakauer, N. Y., 2017: Temperature trends and prediction skill in NMME seasonal forecasts. Climate Dyn., 53, 7201–7213, https://doi.org/10.1007/s00382-017-3657-2.

  • Krikken, F., M. Schmeits, W. Vlot, V. Guemas, and W. Hazeleger, 2016: Skill improvement of dynamical seasonal Arctic sea ice forecasts. Geophys. Res. Lett., 43, 5124–5132, https://doi.org/10.1002/2016GL068462.

  • Kumar, S., V. Merwade, J. L. Kinter III, and D. Niyogi, 2013: Evaluation of temperature and precipitation trends and long-term persistence in CMIP5 twentieth-century climate simulations. J. Climate, 26, 4168–4185, https://doi.org/10.1175/JCLI-D-12-00259.1.

  • Livezey, R. E., and M. M. Timofeyeva, 2008: The first decade of long-lead U.S. seasonal forecasts—Insights from a skill analysis. Bull. Amer. Meteor. Soc., 89, 843–854, https://doi.org/10.1175/2008BAMS2488.1.

  • Mann, H. B., 1945: Nonparametric tests against trend. Econometrica, 13, 245–259, https://doi.org/10.2307/1907187.

  • Matheson, J. E., and R. L. Winkler, 1976: Scoring rules for continuous probability distributions. Manage. Sci., 22, 1087–1096, https://doi.org/10.1287/mnsc.22.10.1087.

  • Pechlivanidis, I., L. Crochemore, J. Rosberg, and T. Bosshard, 2020: What are the key drivers controlling the quality of seasonal streamflow forecasts? Water Resour. Res., 56, e2019WR026987, https://doi.org/10.1029/2019WR026987.

  • Peel, M. C., B. L. Finlayson, and T. A. McMahon, 2007: Updated world map of the Köppen–Geiger climate classification. Hydrol. Earth Syst. Sci., 11, 1633–1644, https://doi.org/10.5194/hess-11-1633-2007.

  • Renard, B., D. Kavetski, G. Kuczera, M. Thyer, and S. W. Franks, 2010: Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors. Water Resour. Res., 46, https://doi.org/10.1029/2009WR008328.

  • Risbey, J. S., and Coauthors, 2021: Standard assessments of climate forecast skill can be misleading. Nat. Commun., 12, 4346, https://doi.org/10.1038/s41467-021-23771-z.

  • Rowell, D. P., 2012: Sources of uncertainty in future changes in local precipitation. Climate Dyn., 39, 1929–1950, https://doi.org/10.1007/s00382-011-1210-2.

  • Saha, S., and Coauthors, 2014: The NCEP Climate Forecast System version 2. J. Climate, 27, 2185–2208, https://doi.org/10.1175/JCLI-D-12-00823.1.

  • Sayemuzzaman, M., and M. K. Jha, 2014: Seasonal and annual precipitation time series trend analysis in North Carolina, United States. Atmos. Res., 137, 183–194, https://doi.org/10.1016/j.atmosres.2013.10.012.

  • Schepen, A., Q. J. Wang, and Y. Everingham, 2016: Calibration, bridging, and merging to improve GCM seasonal temperature forecasts in Australia. Mon. Wea. Rev., 144, 2421–2441, https://doi.org/10.1175/MWR-D-15-0384.1.

  • Schepen, A., Y. Everingham, and Q. J. Wang, 2020: On the joint calibration of multivariate seasonal climate forecasts from GCMs. Mon. Wea. Rev., 148, 437–456, https://doi.org/10.1175/MWR-D-19-0046.1.

  • Sen, P. K., 1968: Estimates of the regression coefficient based on Kendall’s tau. J. Amer. Stat. Assoc., 63, 1379–1389, https://doi.org/10.1080/01621459.1968.10480934.

  • Shao, Y., Q. J. Wang, A. Schepen, and D. Ryu, 2021a: Embedding trend into seasonal temperature forecasts through statistical calibration of GCM outputs. Int. J. Climatol., 41, E1553–E1565, https://doi.org/10.1002/joc.6788.

  • Shao, Y., Q. J. Wang, A. Schepen, and D. Ryu, 2021b: Going with the trend: Forecasting seasonal climate conditions under climate change. Mon. Wea. Rev., 149, 2513–2522, https://doi.org/10.1175/MWR-D-20-0318.1.
    • Search Google Scholar
    • Export Citation
  • Shin, C.-S., and B. Huang, 2019: A spurious warming trend in the NMME equatorial Pacific SST hindcasts. Climate Dyn., 53, 72877303, https://doi.org/10.1007/s00382-017-3777-8.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Stockdale, T. N., 2021: SEAS5 user guide. ECMWF, 44 pp., https://doi.org/10.21957/2y67999y.

    • Crossref
    • Export Citation
  • Theil, H., 1992: A rank-invariant method of linear and polynomial regression analysis. Henri Theil’s Contributions to Economics and Econometrics, Springer, 345–381.

    • Crossref
    • Export Citation
  • Wang, F., W. Shao, H. Yu, G. Kan, X. He, D. Zhang, M. Ren, and G. Wang, 2020: Re-evaluation of the power of the Mann-Kendall test for detecting monotonic trends in hydrometeorological time series. Front. Earth Sci., 8, 14, https://doi.org/10.3389/feart.2020.00014.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wang, Q. J., and D. E. Robertson, 2011: Multisite probabilistic forecasting of seasonal flows for streams with zero value occurrences. Water Resour. Res., 47, W02546, https://doi.org/10.1029/2010WR009333.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wang, Q. J., D. E. Robertson, and F. H. S. Chiew, 2009: A Bayesian joint probability modeling approach for seasonal forecasting of streamflows at multiple sites. Water Resour. Res., 45, W05407, https://doi.org/10.1029/2008WR007355.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wang, Q. J., D. L. Shrestha, D. E. Robertson, and P. Pokhrel, 2012: A log-sinh transformation for data normalization and variance stabilization. Water Resour. Res., 48, W05514, https://doi.org/10.1029/2011WR010973.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wang, Q. J., Y. W. Shao, Y. Song, A. Schepen, D. E. Robertson, D. Ryu, and F. Pappenberger, 2019: An evaluation of ECMWF SEAS5 seasonal climate forecasts for Australia using a new forecast calibration algorithm. Environ. Modell. Software, 122, 104550, https://doi.org/10.1016/j.envsoft.2019.104550.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Wasko, C., Y. Shao, E. Vogel, L. Wilson, Q. Wang, A. Frost, and C. Donnelly, 2021: Understanding trends in hydrologic extremes across Australia. J. Hydrol., 593, 125877, https://doi.org/10.1016/j.jhydrol.2020.125877.

    • Crossref
    • Export Citation
  • Wilks, D. S., 2016: “The stippling shows statistically significant grid points”: How research results are routinely overstated and overinterpreted, and what to do about it. Bull. Amer. Meteor. Soc., 97, 22632273, https://doi.org/10.1175/BAMS-D-15-00267.1.

    • Crossref
    • Search Google Scholar
    • Export Citation

  • Fig. 1.

    Decadal Theil–Sen slopes of seasonal precipitation for observations and for the raw, BJP, and BJP-ti calibrated ensemble forecast medians, for 12 overlapping seasons at 1-month lead time from FMA 1981 to JFM 2017.
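
    The trends mapped in Fig. 1 come from the Theil–Sen slope estimator (Sen 1968; Theil 1992), the median of the slopes over all pairwise point combinations in a series. A minimal sketch, assuming a synthetic 36-year record standing in for the 1981–2017 hindcast period (the values and parameters below are illustrative, not the study's data or code); the per-year slope is scaled by 10 to obtain a decadal rate:

```python
import numpy as np
from scipy.stats import theilslopes

# Synthetic 36-year series of seasonal precipitation totals (mm),
# standing in for the FMA 1981 to JFM 2017 hindcast period.
rng = np.random.default_rng(0)
years = np.arange(1981, 2017)
precip = 300 + 1.2 * (years - 1981) + rng.normal(0, 40, years.size)

# theilslopes returns (slope, intercept, low_slope, high_slope);
# the slope is per year, so scale by 10 for a decadal rate.
slope, intercept, lo, hi = theilslopes(precip, years)
print(f"Decadal Theil-Sen slope: {10 * slope:.1f} mm per decade")
```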

  • Fig. 2.

    (left) CRPS skill score (%) of the BJP-ti calibrated ensemble forecasts and (right) the score difference between the BJP-ti and BJP calibrated ensemble forecasts of seasonal precipitation at 1-month lead time.
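
    The skill metric mapped in Fig. 2 is the CRPS skill score: the percentage reduction of the continuous ranked probability score (Matheson and Winkler 1976) relative to a reference forecast, typically climatology. Below is a minimal sketch using the standard sample-based CRPS estimator for an ensemble; the observation, forecast, and climatology values are synthetic stand-ins, and in practice the CRPS would be averaged over all forecast events before the skill score is formed:

```python
import numpy as np

def crps_ensemble(ens, obs):
    """Sample-based CRPS of one ensemble forecast against a scalar observation."""
    ens = np.asarray(ens, dtype=float)
    term1 = np.mean(np.abs(ens - obs))                           # spread around the observation
    term2 = 0.5 * np.mean(np.abs(ens[:, None] - ens[None, :]))   # internal ensemble spread
    return term1 - term2

# Synthetic example for one grid cell and season.
rng = np.random.default_rng(1)
obs = 120.0                              # observed seasonal total (mm)
forecast = rng.gamma(4.0, 30.0, 100)     # 100-member forecast ensemble
climatology = rng.gamma(4.0, 32.0, 36)   # 36-year climatological reference

skill = (1.0 - crps_ensemble(forecast, obs) / crps_ensemble(climatology, obs)) * 100.0
print(f"CRPS skill score: {skill:.1f}%")  # >0 means better than climatology
```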

  • Fig. 3.

    Statistical significance of trends in observed seasonal precipitation at the 5% and 10% significance levels, based on the Mann–Kendall test.
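
    The significance shown in Fig. 3 is assessed with the Mann–Kendall test (Mann 1945; Kendall 1975), a nonparametric test for monotonic trend based on the signs of all pairwise differences. A minimal sketch without the correction for ties, applied to a synthetic series (illustrative only; operational implementations also handle ties and serial correlation):

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Two-sided Mann-Kendall trend test (no tie correction)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # S statistic: sum of signs of all later-minus-earlier pairwise differences
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S in the absence of ties
    if s > 0:                                 # continuity-corrected normal statistic
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2.0 * (1.0 - norm.cdf(abs(z)))        # two-sided p-value
    return z, p

# Synthetic 36-year series with an imposed upward trend.
rng = np.random.default_rng(2)
series = 300 + 1.5 * np.arange(36) + rng.normal(0, 40, 36)
z, p = mann_kendall(series)
print(f"z = {z:.2f}, p = {p:.3f}, significant at 5%: {p < 0.05}")
```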

  • Fig. 4.

    Nonexceedance plot comparing the overall performance of the raw, BJP, and BJP-ti calibrated ensemble forecasts at 1-month lead time. Note that the blue line lies behind the green line in the PIT score plot.

  • Fig. 5.

    Locations of the selected cases. Gray contours show the boundaries of major climate zones in Australia (Peel et al. 2007).

  • Fig. 6.

    Forecast quantile plots for selected cells at 1-month lead time. Red dots are observed data. White squares are calibrated forecast median values. Dashed black lines are trendlines for raw forecast medians; solid black lines are trendlines for calibrated forecast medians; red lines are trendlines for observed data. Light blue strips are [0.1, 0.9] quantile forecasts; deep blue strips are [0.25, 0.75] quantile forecasts.

  • Fig. 7.

    CRPS skill score (%) of the BJP-ti calibrated ensemble forecasts of seasonal precipitation for all lead times. Locations of the grid cells are shown in Fig. 5.

  • Fig. 8.

    Trend difference between the BJP-ti calibrated ensemble forecast medians and observations of seasonal precipitation for all lead times. Locations of the grid cells are shown in Fig. 5.
