Reply to “Comment on ‘Bias Correction, Quantile Mapping, and Downscaling: Revisiting the Inflation Issue’”

Douglas Maraun, Wegener Center for Climate and Global Change, University of Graz, Graz, Austria


Corresponding author address: Douglas Maraun, Wegener Center for Climate and Global Change, University of Graz, Graz, Austria. E-mail: douglas.maraun@uni-graz.at

The original article that was the subject of this comment/reply can be found at http://journals.ametsoc.org/doi/abs/10.1175/JCLI-D-12-00821.1.


Glahn (2016) raises several important points in his comment on my articles, Maraun (2013) and Maraun (2014). Overall, he concludes that the statement that inflation is flawed is unjustified: “Regression inflation is a technique that has some good features and, like any technique, some bad” (Glahn 2016, p. xxx). More specifically, Glahn highlights the severe limitations of standard linear models in forecasting rare events, and in contrast the respective merits of inflation. He considers probabilistic forecasts as discussed in Maraun (2014) “an excellent avenue”, but points out that often “a specific value is imperative” rather than a probability (Glahn 2016, p. xxx). I do agree with the advantages of inflated regression over linear regression as discussed by Glahn when it comes to forecasting extreme events, but still I believe that probabilistic forecasts are the method of choice, even if a specific value is required. The point about simulation of time series for climate science is completely untouched by this discussion, and remains valid.

Glahn lists a couple of examples to support his line of argument, in particular the prediction of rare events in nonnormal distributions. A theoretical example might help to illustrate the case. Consider some weather variable $y_i$ at time steps $t_i$, which depends on some predictor $x_i$. The weather variable could be local wind speed, and the predictor a numerical weather forecast of some grid-average wind speed. Assume that the predictor is normally distributed, and that the predictand follows an exponential distribution conditional on the predictor, with a rate parameter λ that is tied to the predictor through an exponential link function. The exponential link function is chosen to ensure positive rate parameters. A typical time series is shown in Fig. 1 (left, solid black line): it is highly skewed with only positive values. Assume one aims to predict excesses of the climatological 95th percentile of the $y_i$ (top dashed black line).
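A minimal sketch of how such a synthetic series could be generated is given here. The intercept `a`, the slope `b`, the standard normal predictor, and the seed are assumptions chosen for illustration; the values behind Fig. 1 are not stated in the text.

```python
import numpy as np

def simulate(n, a=0.0, b=-0.5, seed=0):
    """Draw a normal predictor x and an exponential predictand y given x.

    Assumed model (illustrative only): rate_i = exp(a + b * x_i),
    so the rate is positive for every value of the predictor.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    rate = np.exp(a + b * x)
    # NumPy parameterizes the exponential by its scale, which is 1 / rate.
    y = rng.exponential(scale=1.0 / rate)
    return x, y

x, y = simulate(10_000)
# Climatological 95th percentile of the predictand, used as event threshold.
threshold = np.quantile(y, 0.95)
exceedance_rate = np.mean(y > threshold)
```

With a negative `b`, larger predictor values lower the rate and therefore raise the expected value of the predictand, so predictor and predictand are positively correlated.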

Fig. 1.

Forecast of exponentially distributed data, conditional on a normally distributed predictor. (left) Time series: observed (black), predicted mean of linear model (orange), inflated linear prediction (red), and predicted 86.3% quantile of generalized linear model (blue; i.e., the threat score is maximized). Black circles: observed threshold exceedances; red: true positive predictions with inflation; blue: true positive predictions with generalized linear model. (right) Histogram of 10 000 observations (black), predicted means of linear model (orange), inflated linear prediction (red), and simulation from the generalized linear model (blue).

Citation: Journal of Climate 29, 23; 10.1175/JCLI-D-16-0592.1

As discussed by Glahn, the prediction based on a linear regression model fitted to the ($x_i$, $y_i$) pairs (orange line) hardly ever exceeds the chosen threshold (in a simulation with 10 000 data points, never). This is not necessarily because the model is bad, but because it predicts the expected value of $y_i$ given a forecast of $x_i$, not the extremes of the distribution. As a result, the threat score, TS = true positives/(true positives + false positives + false negatives), has a value of zero, even worse than a climatological forecast that randomly predicts exceedances with a 5% probability.
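The threat score defined above can be computed from two boolean series, as in the following sketch. The function name `threat_score` and the convention of returning NaN when no event is observed or forecast are assumptions made here.

```python
import numpy as np

def threat_score(observed, forecast):
    """TS = true positives / (true positives + false positives + false negatives)."""
    observed = np.asarray(observed, dtype=bool)
    forecast = np.asarray(forecast, dtype=bool)
    true_pos = np.sum(observed & forecast)
    false_pos = np.sum(~observed & forecast)
    false_neg = np.sum(observed & ~forecast)
    denom = true_pos + false_pos + false_neg
    # Undefined if no event is observed or forecast at all.
    return true_pos / denom if denom > 0 else float("nan")

# A forecast that never issues an event has no true positives, hence TS = 0.
ts_never = threat_score([True, True, False], [False, False, False])
# One hit, one false alarm, one miss.
ts_mixed = threat_score([True, True, False, False], [True, False, True, False])
```

True negatives do not enter the score, which is why it is often used for rare events.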

Inflation “blows up” the spread of the linear model prediction so that it has the same variance as the predictand data. This ad hoc procedure, as highlighted by Glahn, improves the forecast of threshold excesses (Fig. 1, red line): the threat score is much improved, also compared to the climatological forecast. The reason is obvious: the linear model forecast of course has skill (the correlation between forecast and observations in the example is almost 0.4; Table 2), and the inflation helps to exceed the threshold at the right time steps.
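As a rough sketch of the procedure, the anomalies of a least-squares prediction can be divided by the absolute correlation between predictor and predictand, which restores the predictand variance. The synthetic data, parameter values, and function name below are assumptions for illustration, not the setup used for Fig. 1.

```python
import numpy as np

def linear_and_inflated(x, y):
    """Return the least-squares prediction and its variance-inflated version."""
    slope, intercept = np.polyfit(x, y, deg=1)
    y_lin = intercept + slope * x
    # The least-squares prediction has variance r**2 * var(y); dividing its
    # anomalies by |r| therefore gives the prediction the variance of y.
    r = np.corrcoef(x, y)[0, 1]
    y_infl = y.mean() + (y_lin - y.mean()) / abs(r)
    return y_lin, y_infl

# Assumed synthetic example: normal predictor, exponential predictand.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = rng.exponential(scale=np.exp(0.5 * x))

y_lin, y_infl = linear_and_inflated(x, y)
u = np.quantile(y, 0.95)              # climatological 95th percentile
n_exceed_lin = np.sum(y_lin > u)      # exceedances predicted by the linear model
n_exceed_infl = np.sum(y_infl > u)    # exceedances predicted after inflation
```

In this setup the plain linear prediction almost never reaches the threshold, whereas the inflated prediction exceeds it whenever the predictor is sufficiently large.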

An alternative to inflation is a probabilistic forecast. For instance, one might construct a generalized linear model (McCullagh and Nelder 1983; Dobson 2001) that predicts a time-varying exponential distribution, given a predictor $x_i$. If a specific value of either “event” or “no event” is required, a probabilistic forecast can easily be transformed into a deterministic dichotomous forecast by choosing an appropriate probability threshold (Wilks 2006). If the predicted occurrence probability is below the threshold, a forecast of no event is issued; if the threshold is exceeded, an event is forecast. The choice of this probability threshold is arbitrary and depends on the user (Wilks 2006). At first sight, this freedom might seem disturbing, but it is a key strength: Is a high hit rate (true positives) more important, or a low false alarm rate (false positives)? Is the overall skill relevant, or a high threat score? For illustrative purposes, I have chosen two thresholds: one (0.920) that raises the hit rate to the level obtained with inflated regression, and one (0.863) that maximizes the threat score. In the first example, the resulting threat score is, by construction, essentially identical to that of inflated regression (the ranking of the two approaches depends on the sample). In the second example, the threat score is considerably improved, as the number of false alarms is greatly reduced (Table 1 and Fig. 1, blue line).
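The following sketch shows one way such a dichotomous forecast could be derived. It fits an exponential model with rate exp(a + b x) by Newton iterations on the log-likelihood and forecasts an event whenever the predicted q-quantile exceeds the climatological threshold, which is equivalent to a predicted exceedance probability above 1 - q. The fitting routine, function names, synthetic data, and the two quantile levels are assumptions for illustration, not the implementation behind Table 1.

```python
import numpy as np

def fit_exponential_glm(x, y, n_iter=25):
    """ML fit of y_i ~ Exp(rate_i) with rate_i = exp(a + b * x_i)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.array([-np.log(y.mean()), 0.0])   # start at the climatological rate
    for _ in range(n_iter):
        w = np.exp(X @ beta) * y                # rate_i * y_i
        grad = X.T @ (w - 1.0)                  # gradient of the negative log-likelihood
        hess = X.T @ (X * w[:, None])           # Hessian of the negative log-likelihood
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def forecast_event(x, beta, threshold, q):
    """Forecast an event when the predicted q-quantile exceeds the threshold."""
    rate = np.exp(beta[0] + beta[1] * x)
    predicted_quantile = -np.log1p(-q) / rate   # exponential quantile function
    return predicted_quantile > threshold

# Assumed synthetic example with true parameters a = 0 and b = -0.5.
rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)
y = rng.exponential(scale=np.exp(0.5 * x))

beta = fit_exponential_glm(x, y)
u = np.quantile(y, 0.95)
events_low_q = forecast_event(x, beta, u, q=0.80)    # fewer events forecast
events_high_q = forecast_event(x, beta, u, q=0.95)   # more events forecast
```

Raising `q` issues more event forecasts (more hits, more false alarms), while lowering it does the opposite; scanning `q` against historical observations is one way to tailor the forecast to a given score.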

Table 1.

Forecast verification based on 10 000 values. Climatology: random prediction based on climatological exceedance rate, LM: linear model, ILM: inflated linear model, GLM: prediction based on GLM, with a probability threshold chosen to maximize true positives (tpos) or threat score (TS).


So, indeed, inflated regression improves forecasts of extreme events compared to the predicted mean of a linear model. Glahn lists several successful applications of inflation. But still, probabilistic forecasts, even if transformed into deterministic forecasts, have a key advantage beyond the fact that they are based on sound statistical theory: in the chosen example, inflated regression produces many false positive results. These may or may not be acceptable; the inflation procedure provides no straightforward way to adjust the forecast. Yet the threshold choice of a probabilistic forecast allows one, based on historical observations and forecasts, to easily and transparently tailor the forecast to the needs of a given user.

For downscaling and simulating local climate time series, the purpose is different, as stated by Glahn. Here, inflation is not a suitable procedure (von Storch 1999; Maraun 2013, 2014). First, it retains the distribution of the predictor¹ and may produce unphysical negative values (Fig. 1, right). But more importantly, the simulated local temporal structure is that of the large-scale predictor (Table 2). The observed correlation between predictor and predictand is 0.397, which is well reproduced by a simulation from a generalized linear model. But the correlation between predictor and local simulation based on inflated regression is one. In other words, when in reality local weather is not fully determined by large-scale weather, but may exhibit random small-scale variations, inflated regression assumes an unjustified deterministic one-to-one relationship.²
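A small numerical check illustrates both points. The synthetic data and parameter values are assumptions, and the stochastic simulation draws from the known conditional distribution as a stand-in for a fitted generalized linear model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
x = rng.standard_normal(n)
scale = np.exp(0.5 * x)                  # assumed link: rate = exp(-0.5 * x)
y = rng.exponential(scale=scale)         # "observations"

# Inflated linear regression: a rescaled linear function of the predictor.
slope, intercept = np.polyfit(x, y, deg=1)
y_lin = intercept + slope * x
y_infl = y.mean() + (y_lin - y.mean()) * y.std() / y_lin.std()

# Stochastic simulation: a fresh draw from the conditional distribution.
y_sim = rng.exponential(scale=scale)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

corr_obs = corr(x, y)                    # predictor vs. observations
corr_infl = corr(x, y_infl)              # predictor vs. inflated regression
corr_sim = corr(x, y_sim)                # predictor vs. stochastic simulation
frac_negative = np.mean(y_infl < 0)      # share of unphysical negative values
```

Because the inflated series is a linear function of the predictor, its correlation with the predictor is one in magnitude and it inherits the predictor's symmetric distribution, including negative values. The stochastic simulation stays nonnegative and keeps a correlation close to the observed one.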

Table 2.

Correlations between predictor and observation/simulation based on 10 000 values. LM: mean of linear model, ILM: inflated linear model, GLM: simulation from generalized linear model.


REFERENCES

  • Dobson, A. J., 2001: An Introduction to Generalized Linear Models. 2nd ed. Chapman and Hall, 240 pp.

  • Glahn, H., 2016: Comments on “Bias correction, quantile mapping, and downscaling: Revisiting the inflation issue.” J. Climate, 29, 8665–8667, doi:10.1175/JCLI-D-16-0362.1.

  • Maraun, D., 2013: Bias correction, quantile mapping, and downscaling: Revisiting the inflation issue. J. Climate, 26, 2137–2143, doi:10.1175/JCLI-D-12-00821.1.

  • Maraun, D., 2014: Reply to “Comment on ‘Bias correction, quantile mapping, and downscaling: Revisiting the inflation issue.’” J. Climate, 27, 1821–1825, doi:10.1175/JCLI-D-13-00307.1.

  • McCullagh, P., and J. A. Nelder, 1983: Generalized Linear Models. Chapman and Hall, 261 pp.

  • von Storch, H., 1999: On the use of “inflation” in statistical downscaling. J. Climate, 12, 3505–3506.

  • Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press/Elsevier, 627 pp.

¹ If no nonlinear transformations are applied, as mentioned by Glahn.

² In those cases where such a relationship would be justified, the correlation would be perfect and no inflation would happen.
