## 1. Introduction

### a. Forecasting stochastic process

An element of sensible weather is typically forecasted and observed at predetermined times in the daily cycle. The associated sequence of predictands (variates whose realizations are forecasted) forms a discrete-time *stochastic process*, or a time series, which can be characterized by its marginal distribution functions and its temporal dependence structure (in particular, the autocorrelation structure). For a continuous element, such as temperature, humidity, or pressure, and a short time step, say 24 h or less, the temporal dependence may be strong enough to allow forecasting future realizations based on antecedent observations. This is, of course, well known and has been exploited in various time series models (e.g., Brown et al. 1984; Murphy and Katz 1985; Wilks 1995). Typically, such statistical models produce deterministic forecasts for a few steps ahead. A recent example is the periodic autoregressive model of the daily temperature time series by Lund et al. (2006).

Much less known in meteorology are (i) the concept of forecasting probabilistically a time series via a stochastic model (e.g., Campbell and Diebold 2005; Gneiting et al. 2006) and (ii) the concept of fusing a probabilistic forecast produced by a stochastic model with a deterministic forecast produced by a numerical weather prediction (NWP) model, or by a human forecaster in a National Weather Service (NWS) field office, or by a statistical postprocessor of the NWP model output, such as the model output statistics technique of the NWS.

### b. Markov Bayesian processor of forecast

This article presents the Bayesian theory and the meta-Gaussian model for implementing both concepts. Toward this end, we extend the previously developed Bayesian processor of forecast (BPF) for the National Digital Forecast Database (NDFD). The purpose of that BPF (Krzysztofowicz and Evans 2008) is to process a *deterministic forecast* (a point estimate of the predictand) into a *probabilistic forecast* (a distribution function, a density function, and a quantile function). The quantification of uncertainty is accomplished via Bayes theorem, which extracts and fuses two kinds of information from two different sources: (i) Information about the natural variability of the predictand is extracted from a (relatively) long climatic sample of the predictand, and is quantified by a prior distribution function. (ii) Information about the predictive performance of the deterministic forecast system is extracted from a (relatively) short joint sample of the forecast and the predictand, and is quantified by a family of the likelihood functions.

The prior distribution function *G _{k}* of predictand *W _{k}* constitutes a climatic probabilistic forecast of *W _{k}*; in the application reported herein, *W _{k}* denotes the maximum temperature on day *k* of the year (*k* = 1, . . . , 365) at a given station. When the stochastic process {*W _{k}*: *k* = 1, . . . , 365} is not independent but Markov (of order one), it may be advantageous to formulate a Markov BPF in which both the prior distribution function and the family of the likelihood functions are conditioned on the antecedent observation *W _{k−l}* = *w _{k−l}*, the last observation preceding the forecast time when *W _{k}* is forecasted with the lead time of *l* days. Thus, in the Markov BPF, the prior marginal distribution function *G _{k}* is replaced by the prior *l*-step transition distribution function *H _{kl}*(·|*w _{k−l}*), which constitutes a climatic Markov probabilistic forecast of *W _{k}* with lead time *l*. In particular, *H _{k1}*(·|*w _{k−1}*) is the distribution function of *W _{k}*, conditional on the antecedent observation *W _{k−1}* = *w _{k−1}*; it is the general stochastic model of the Markov process.

### c. Stochastic–deterministic model fusion

Three research questions arise: How to obtain a flexible yet simple parametric model for the family *H _{kl}* of the *l*-step transition distribution functions for all lead times *l* of interest when the stochastic process {*W _{k}*: *k* = 1, . . . , 365} is Markov and nonstationary, as is typically the case in meteorology? How to incorporate this model into the BPF? What advantages can be expected from the Markov BPF, beyond the advantages of the simpler BPF? The last question may be rephrased in the context of weather forecasting: What advantages can be expected from fusing (i) a climatic probabilistic forecast produced by a stochastic model of the autocorrelated predictand time series and (ii) an operational deterministic forecast produced by an NWP model, or a human forecaster, or a statistical postprocessor?

### d. Modeling approach

The Markov BPF is a natural extension of the BPF developed previously (Krzysztofowicz and Evans 2008) and represents a specialized application of the Bayesian theory of probabilistic forecasting formulated and tested for various time series over the past two decades (e.g., Krzysztofowicz 1983, 1985; Krzysztofowicz and Kelly 2000; Krzysztofowicz and Herr 2001). This Markov BPF is applicable to any continuous predictand. Herein, it is applied to quantify the uncertainty in a deterministic forecast of the daily maximum temperature—one of the predictands selected by the NWS for development of its technique (Peroutka et al. 2005). This deterministic forecast is the official forecast produced by an NWS field office and stored in the NDFD.

The article is organized as follows. Section 2 outlines the theoretic foundation of the Markov BPF, and defines its major components. Section 3 details the modeling and estimation of the first component: the Markov prior (climatic) distribution function (the first research question). Section 4 does the same for the second component: the family of the conditional likelihood functions. Section 5 gives the forecasting equations (the second research question), presents examples of probabilistic forecasts from the Markov BPF, and compares them with examples of probabilistic forecasts from the BPF. Section 6 reports a numerical experiment designed to uncover the role and the impact of the autocorrelation in the predictand time series on probabilistic forecasts (the third research question). Section 7 summarizes the advantages of the Markov BPF.

## 2. Bayesian processor

The Bayesian theory of probabilistic forecasting of time series (Krzysztofowicz 1985) provides the structure for modeling the stochastic dependence between the predictand, the predictor, and the antecedent. A schematic of this structure for forecasting a Markov process two steps ahead is shown in Fig. 1. The multivariate dependence structure is decomposed into the prior dependence structure and the likelihood dependence structure; the posterior dependence structure is derived from them.

### a. Prior uncertainty

Let *W _{k}* denote the *predictand* on day *k* of the year (*k* = 1, . . . , 365). Herein, *W _{k}* is the maximum temperature on day *k* at a given station (or a grid point). Suppose the time series of the predictands {*W _{k}*: *k* = 1, . . . , 365} forms a *nonstationary Markov process of order one*. The climatic uncertainty (or the prior uncertainty) about such a process is fully characterized in terms of a sequence of families of 1-step transition density functions {*r _{k}*(·|*w _{k−1}*): all *w _{k−1}*} for *k* = 1, . . . , 365, where *r _{k}*(·|*w _{k−1}*) is the density function of *W _{k}*, conditional on the antecedent observation *W _{k−1}* = *w _{k−1}*. (The periodicity convention is assumed throughout the article, whereby *k* − 1 = 365 if *k* = 1; in general, *k* − *l* = 365 + *k* − *l* if *k* − *l* < 1 for any integer *l*, 1 ≤ *l* ≤ 364. For simplicity, 29 February is excluded.)

The 1-step transition density function *r _{k}*(·|*w _{k−1}*) gives a climatic Markov forecast of *W _{k}* with the lead time of one day. To obtain a climatic Markov forecast of *W _{k}* with the lead time of *l* days, the density function *h _{kl}*(·|*w _{k−l}*) of *W _{k}*, conditional on observation *W _{k−l}* = *w _{k−l}*, must be derived. It is, in fact, the *l*-step transition density function, which can be obtained recursively:

$$ h_{k1}(w_k \mid w_{k-1}) = r_k(w_k \mid w_{k-1}), \tag{1} $$

and for *l* = 2, 3, . . . , *L* (*L* < 365),

$$ h_{kl}(w_k \mid w_{k-l}) = \int_{-\infty}^{\infty} r_k(w_k \mid w_{k-1})\, h_{k-1,\,l-1}(w_{k-1} \mid w_{k-l})\, dw_{k-1}. \tag{2} $$

Given the antecedent observation *W _{k−l}* = *w _{k−l}*, the function *h _{kl}*(·|*w _{k−l}*) constitutes the *Markov prior density function* of *W _{k}*. It quantifies the uncertainty about the predictand *W _{k}* which exists after the observation *w _{k−l}* is collected but before a forecast is issued by the NWS.
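As an illustrative numerical sketch (not part of the original formulation), the recursion for the *l*-step transition density can be evaluated on a grid. A stationary Gaussian 1-step density with a constant coefficient *c* stands in for the day-specific *r _{k}*; all names here are hypothetical:

```python
import numpy as np

def one_step_density(w_k, w_prev, c=0.6):
    # Illustrative stationary Gaussian 1-step transition density r(w_k | w_prev);
    # the article's r_k is meta-Gaussian and day-specific.
    var = 1.0 - c ** 2
    return np.exp(-(w_k - c * w_prev) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def l_step_density(grid, w_anchor, l, c=0.6):
    # Chapman-Kolmogorov recursion: start from the 1-step density and
    # repeatedly integrate out the intermediate day on the grid.
    h = one_step_density(grid, w_anchor, c)
    dw = grid[1] - grid[0]
    for _ in range(l - 1):
        h = np.array([np.sum(one_step_density(w, grid, c) * h) * dw for w in grid])
    return h
```

For a Gaussian process this recursion reproduces the known *l*-step law with mean *c*^{l}*w*_{0} and variance 1 − *c*^{2l}, which provides a convenient check on the numerics.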

### b. Forecast uncertainty

Let *X _{kl}* denote a *predictor* of *W _{k}*, whose realization is available with the lead time of *l* days. Herein, the realization *X _{kl}* = *x _{kl}* is a deterministic forecast of the maximum temperature *W _{k}* on day *k* issued by the NWS with the lead time of *l* days. The forecast uncertainty or, more specifically, the stochastic dependence between the predictor *X _{kl}* and the predictand *W _{k}*, is characterized by the family of conditional density functions {*f _{kl}*(·|*w _{k}*, *w _{k−l}*): all *w _{k}*, *w _{k−l}*}, where *f _{kl}*(·|*w _{k}*, *w _{k−l}*) is the density function of *X _{kl}*, conditional on the hypothesis that the realization of the predictand is *W _{k}* = *w _{k}*, and given that the antecedent observation is *W _{k−l}* = *w _{k−l}*. Then, for fixed realizations *X _{kl}* = *x _{kl}* and *W _{k−l}* = *w _{k−l}*, there exists a function *f _{kl}*(*x _{kl}*|·, *w _{k−l}*); it is called the conditional likelihood function of *W _{k}*. More generally, there exists a *family of the conditional likelihood functions* {*f _{kl}*(*x _{kl}*|·, *w _{k−l}*): all *x _{kl}*, *w _{k−l}*}.

The family of the conditional likelihood functions *f _{kl}* is doubly nonstationary because, in general, the performance of a deterministic forecast varies throughout the year (hence index *k*) and with the lead time (hence index *l*).

### c. Bayesian revision

For any day *k* (*k* = 1, . . . , 365) and lead time *l* (*l* = 1, . . . , *N*), the Bayesian procedure for information fusion and revision of uncertainty involves two steps. First, the *expected density function* *κ _{kl}*(·|*w _{k−l}*) of predictor *X _{kl}*, conditional on an antecedent observation *W _{k−l}* = *w _{k−l}*, is derived via the total probability law:

$$ \kappa_{kl}(x_{kl} \mid w_{k-l}) = \int_{-\infty}^{\infty} f_{kl}(x_{kl} \mid w_k, w_{k-l})\, h_{kl}(w_k \mid w_{k-l})\, dw_k. \tag{3} $$

Second, the *posterior density function* *ϕ _{kl}*(·|*x _{kl}*, *w _{k−l}*) of predictand *W _{k}*, conditional on a deterministic forecast *X _{kl}* = *x _{kl}* and an antecedent observation *W _{k−l}* = *w _{k−l}*, is derived via Bayes theorem:

$$ \phi_{kl}(w_k \mid x_{kl}, w_{k-l}) = \frac{f_{kl}(x_{kl} \mid w_k, w_{k-l})\, h_{kl}(w_k \mid w_{k-l})}{\kappa_{kl}(x_{kl} \mid w_{k-l})}. \tag{4} $$

The posterior density function constitutes the probabilistic forecast of the predictand *W _{k}* with the lead time of *l* days; it quantifies the uncertainty about *W _{k}* that remains after the NWS collects the antecedent observation *W _{k−l}* = *w _{k−l}* and issues deterministic forecast *X _{kl}* = *x _{kl}*.

The *posterior distribution function* Φ _{kl}(·|*x _{kl}*, *w _{k−l}*) of predictand *W _{k}* is defined by

$$ \Phi_{kl}(w_k \mid x_{kl}, w_{k-l}) = \int_{-\infty}^{w_k} \phi_{kl}(w \mid x_{kl}, w_{k-l})\, dw. \tag{5} $$

The inverse function Φ^{−1}_{kl}(·|*x _{kl}*, *w _{k−l}*) is called the *posterior quantile function*. For any number *p*, such that 0 < *p* < 1, the *posterior p-probability quantile* of predictand *W _{k}* is the quantity *w _{kp}* such that Φ _{kl}(*w _{kp}*|*x _{kl}*, *w _{k−l}*) = *p*. Therefrom,

$$ w_{kp} = \Phi_{kl}^{-1}(p \mid x_{kl}, w_{k-l}). \tag{6} $$

Equations (1)–(4) define the theoretic structure of the Markov BPF. Equations (4)–(6) specify the three outputs, each of which constitutes the probabilistic forecast of *W _{k}*, given deterministic forecast *x _{kl}* and antecedent observation *w _{k−l}*.
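As a sketch (with a hypothetical Gaussian prior and likelihood chosen purely for illustration), the two-step revision can be mimicked on a grid: the total probability law supplies the normalizing constant, and Bayes theorem supplies the posterior density:

```python
import numpy as np

def normal_pdf(y, mu, var):
    # Hypothetical Gaussian building block for the illustration.
    return np.exp(-(y - mu) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def bayes_revision(grid, prior, likelihood):
    # prior: h(w) on the grid; likelihood: f(x | w) evaluated on the same grid.
    # Step 1 (total probability law): kappa = integral of f(x | w) h(w) dw.
    # Step 2 (Bayes theorem): posterior(w) = f(x | w) h(w) / kappa.
    dw = grid[1] - grid[0]
    joint = likelihood * prior
    kappa = np.sum(joint) * dw
    return joint / kappa, kappa
```

With a conjugate normal pair the grid result matches the closed-form posterior, which is a quick sanity check on the implementation.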

_{l}### d. Bayesian meta-Gaussian model

The Markov BPF, being a natural extension of the BPF described by Krzysztofowicz and Evans (2008), will be implemented likewise—by adapting the meta-Gaussian model of Krzysztofowicz and Kelly (2000). This model has the structural properties that are quintessential for correct representation of meteorological processes: (i) Each element is allowed to be nonstationary. (ii) The predictand *W _{k}* and the predictor *X _{kl}* are allowed to have distribution functions of any form. (iii) The temporal dependence structure between *W _{k−1}* and *W _{k}* is allowed to be nonlinear and heteroscedastic. (iv) The likelihood dependence structure between *X _{kl}*, *W _{k}*, and *W _{k−l}* is pairwise and is allowed to be nonlinear and heteroscedastic (which is the case with most meteorological forecasts, especially for longer lead times).

### e. Modeling and estimation

The remaining sections describe the modeling process, the estimation procedure, the goodness of fit to data, the statistical properties, and the practical advantages of the meta-Gaussian Markov BPF. In all illustrations, the predictand is the daily maximum temperature; the forecast lead times are 24 h (1 day), 96 h (4 days), and 168 h (7 days) after 0000 UTC; the forecast point is Savannah, Georgia.

## 3. Markov prior distribution function

The development of a meta-Gaussian model for the family of 1-step transition density functions involves two steps: (i) modeling the marginal distribution functions and (ii) modeling the dependence structure. The first step was performed by Krzysztofowicz and Evans (2008) and is summarized in the next section. The second step is detailed in the subsequent sections.

### a. Marginal distribution functions

For the maximum temperature *W _{k}* on each day *k* (*k* = 1, . . . , 365), define the mean, the variance, and the marginal distribution function:

$$ m_k = E(W_k), \qquad s_k^2 = \mathrm{Var}(W_k), \qquad G_k(w) = P(W_k \le w), \tag{7} $$

where *E* stands for expectation, *P* stands for probability, and *w* is any point in the sample space of *W _{k}*. The standardized maximum temperature,

$$ W_k' = \frac{W_k - m_k}{s_k}, \tag{8} $$

has stationary first two moments, *E*(*W*′_{k}) = 0 and Var(*W*′_{k}) = 1, as is well known, and an approximately stationary marginal distribution function *G*′, as shown by Krzysztofowicz and Evans (2008): for *k* = 1, . . . , 365 and at any point *w*,

$$ G_k(w) = G'\!\left( \frac{w - m_k}{s_k} \right), \tag{9} $$

where *G*′(*w*′) = *P*(*W*′_{k} ≤ *w*′) at any point *w*′ in the standardized sample space. The estimation of *m _{k}*, *s _{k}*, and *G*′ is detailed in the previous article.

### b. Meta-Gaussian dependence model

Consider the temporal dependence structure of the standardized time series {*W*′_{k}: *k* = 1, . . . , 365}. In general, this dependence structure may be nonlinear and heteroscedastic, while the marginal distribution function *G*′ may be of any form. To allow for such general properties, we employ the meta-Gaussian model of Krzysztofowicz and Kelly (2000). At the heart of this model is the *normal quantile transform* (NQT):

$$ V_k = Q^{-1}(G'(W_k')), \tag{10} $$

where *Q* is the standard normal distribution function and *Q*^{−1} is its inverse. The NQT guarantees that the marginal distribution of the transformed variate *V _{k}* is standard normal (Kelly and Krzysztofowicz 1995, 1997). Our modeling hypothesis is that the 1-step transition distribution from (*V _{k−1}* = *υ _{k−1}*) to *V _{k}* is normal as well. This hypothesis has been validated empirically for the river stage process (Krzysztofowicz and Kelly 2000; Krzysztofowicz and Herr 2001), and will be tested herein for the temperature process.
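A minimal empirical implementation of the NQT (using Weibull plotting positions as an assumed estimator of *G*′; the article's estimation procedure may differ) can be sketched as:

```python
import numpy as np
from statistics import NormalDist

def nqt(sample):
    # Normal quantile transform: replace each realization by the standard
    # normal variate with the same empirical non-exceedance probability.
    # Weibull plotting positions n / (M + 1) are an assumed estimator of G'.
    x = np.asarray(sample, dtype=float)
    ranks = x.argsort().argsort() + 1          # ranks 1..M
    p = ranks / (len(x) + 1.0)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return inv_cdf(p)
```

The transform is rank preserving, so it leaves the dependence structure intact while forcing a standard normal marginal.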

The Gaussian model of the transformed time series {*V _{k}*: *k* = 1, . . . , 365} is

$$ V_k = c_k V_{k-1} + \Theta_k, \tag{11} $$

where Θ _{k} is a variate stochastically independent of *V _{k−1}* and normally distributed with mean zero and variance 1 − *c*^{2}_{k}, and *c _{k}* is the Pearson’s product-moment autocorrelation coefficient:

$$ c_k = \mathrm{Cor}(V_{k-1}, V_k). \tag{12} $$

It follows that for any *p* such that 0 < *p* < 1, the *p*-probability quantile of *V _{k}*, conditional on observation *V _{k−1}* = *υ _{k−1}*, is

$$ \upsilon_k(p \mid \upsilon_{k-1}) = c_k \upsilon_{k-1} + (1 - c_k^2)^{1/2}\, Q^{-1}(p). \tag{13} $$

Thus for any *p*, the conditional quantile of *V _{k}* is a linear function of the antecedent observation *υ _{k−1}*. In particular, the conditional median is *υ _{k}*(0.5|*υ _{k−1}*) = *c _{k}υ _{k−1}*, and is equal to the conditional mean *E*(*V _{k}*|*V _{k−1}* = *υ _{k−1}*) = *c _{k}υ _{k−1}*.
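The Gaussian model above can be simulated directly; the sketch below uses a constant coefficient *c* purely for illustration (the actual *c _{k}* varies by day), and confirms that the marginal of each transformed variate remains standard normal:

```python
import numpy as np

def simulate_gaussian_ar1(c, n, seed=0):
    # Simulate the Gaussian transition model with a constant coefficient c:
    #   V_k = c * V_{k-1} + Theta_k,   Theta_k ~ N(0, 1 - c^2),
    # so the marginal of every V_k remains standard normal.
    rng = np.random.default_rng(seed)
    v = np.empty(n)
    v[0] = rng.standard_normal()
    theta = np.sqrt(1.0 - c * c) * rng.standard_normal(n - 1)
    for i in range(1, n):
        v[i] = c * v[i - 1] + theta[i - 1]
    return v
```

The choice Var(Θ) = 1 − *c*² is what keeps the marginal variance at unity step after step.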

The corresponding *p*-probability quantile of *W*′_{k}, conditional on observation *W*′_{k−1} = *w*′_{k−1}, is obtained by embedding the NQT in Eq. (13):

$$ w_k'(p \mid w_{k-1}') = G'^{-1}\!\left( Q\!\left( c_k\, Q^{-1}(G'(w_{k-1}')) + (1 - c_k^2)^{1/2}\, Q^{-1}(p) \right) \right). \tag{14} $$

Under the Gaussian model (11), *c _{k}* is a fully efficient measure of stochastic dependence between *V _{k−1}* and *V _{k}*. Under the meta-Gaussian model, *c _{k}* remains a fully efficient measure of stochastic dependence between the standardized variates *W*′_{k−1} and *W*′_{k}, as well as between the original variates *W _{k−1}* and *W _{k}* (Kelly and Krzysztofowicz 1997); it can be transformed into the Spearman’s rank autocorrelation coefficient:

$$ \rho_k = \frac{6}{\pi} \sin^{-1}\!\left( \frac{c_k}{2} \right). \tag{15} $$
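The Pearson-to-Spearman transformation is the standard relation for a bivariate normal pair; a small sketch for checking values:

```python
import math

def spearman_from_pearson(c):
    # Standard bivariate-normal relation: Pearson's c maps to Spearman's
    # rank correlation rho = (6 / pi) * arcsin(c / 2).
    return (6.0 / math.pi) * math.asin(c / 2.0)
```

The map is monotone, fixes 0, and sends 1 to 1, so ranks of the coefficients are preserved.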

### c. Empirical analyses

#### 1) Joint samples

Let {(*w _{k−1}*(*n*), *w _{k}*(*n*)): *n* = 1, . . . , *M*} be the climatic joint sample of the maximum temperatures (*W _{k−1}*, *W _{k}*) on two consecutive days. Herein, it is an augmented sample in that for each day *k* data were pooled from the consecutive five days centered on day *k*. Thus, the record of 119 yr (from 1874 to 2001, with 9 yr missing) gave the sample size *M* = 119 × 5 = 595.

For each of the days *k* and *k* − 1, every realization from the climatic joint sample is first standardized,

$$ w_j'(n) = \frac{w_j(n) - m_j}{s_j}, \qquad j \in \{k-1,\, k\}, $$

and then processed through the NQT with the stationary marginal distribution function:

$$ \upsilon_j(n) = Q^{-1}(G'(w_j'(n))). $$

The realizations are reassembled to form the transformed climatic joint sample {(*υ _{k−1}*(*n*), *υ _{k}*(*n*)): *n* = 1, . . . , *M*} for each day *k* (*k* = 1, . . . , 365). This sample is used to estimate the Pearson’s product-moment correlation coefficient *c _{k}* between the standard normal variates *V _{k−1}* and *V _{k}*, from which the Spearman’s rank correlation coefficient *ρ _{k}* between the original variates *W _{k−1}* and *W _{k}* is calculated according to Eq. (15).
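The estimation from the augmented joint sample can be sketched as follows (a hypothetical layout with the NQT-transformed values arranged as a years-by-days array):

```python
import numpy as np

def pooled_autocorrelation(v_by_year, k, half_window=2):
    # v_by_year: hypothetical (n_years, 365) array of NQT-transformed values.
    # Pool the (V_{k-1}, V_k) pairs from the five days centered on day k
    # across all years, then estimate Pearson's c_k from the pooled sample.
    pairs = []
    for d in range(k - half_window, k + half_window + 1):
        day = d % 365
        prev = (day - 1) % 365
        pairs.append(np.column_stack([v_by_year[:, prev], v_by_year[:, day]]))
    sample = np.vstack(pairs)
    return float(np.corrcoef(sample[:, 0], sample[:, 1])[0, 1])
```

Pooling the five centered days quintuples the sample size per day at the cost of assuming the coefficient varies slowly across adjacent days.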

#### 2) Model validation

Validation of the meta-Gaussian dependence structure amounts to checking the three requirements of the Gaussian model (11):

- *Linearity*—the regression of *V _{k}* on *V _{k−1}* must be linear.
- *Homoscedasticity*—the variance of the residual Θ _{k} = *V _{k}* − *c _{k}V _{k−1}* must be independent of *V _{k−1}*.
- *Normality*—the distribution function of Θ _{k} must be normal (Gaussian) with mean 0 and variance 1 − *c*^{2}_{k}.

These requirements can be validated graphically (as well as through formal hypothesis testing); the results are shown for *k* = 32 and *k* = 214.

Figure 2 shows the empirical dependence structure (the scatterplot of 595 points, some of which overlap) and the parametric dependence structure (the conditional quantile functions for *p* = 0.1, 0.5, 0.9). In the standardized sample space (Figs. 2a,b), the dependence structure is slightly nonlinear and heteroscedastic; the conditional median of *W*′_{k} plots as a slightly concave–convex function of *w*′_{k−1}; and the width of the 80% central credible interval decreases with *w*′_{k−1}. In the normal sample space (Figs. 2c,d), the dependence structure is linear (as the conditional median of *V _{k}* varies linearly with *υ _{k−1}*) and homoscedastic [as the width of the 80% central credible interval, *υ _{k}*(0.9|*υ _{k−1}*) − *υ _{k}*(0.1|*υ _{k−1}*), is constant with *υ _{k−1}*]. The homoscedasticity is validated in Figs. 3a,b: the scatter of residuals appears independent of the predictor value. The normality is validated in Figs. 3c,d: the quantile–quantile plot of the residuals is predominantly linear. (The grid pattern in Figs. 2b,d and its effect on Figs. 3b,d are the artifacts of the precision of measurement, 1°F, which appear when the observation variability is low and the sample size is large. The pattern affects especially the tails in Fig. 3d, where realizations are sparse.)

#### 3) Nonstationary autocorrelation

The time series of the autocorrelation coefficients {*c _{k}*: *k* = 1, . . . , 365} at Savannah (Fig. 4) reveals two properties. First, there is a relatively high variability of estimates from day to day; for operational forecasting, the time series can be smoothed and approximated by a fourth-order Fourier series expansion, as shown in Fig. 4. Second, the fitted function and the envelope of estimates show that the autocorrelation of the daily maximum temperatures is moderate (between 0.5 and 0.75) and periodic, with two maxima (a higher one in July and a lower one in January) and two minima (a lower one in April and a higher one in October). In other words, the autocorrelation is the strongest in the middle of a season (warm, cold) and the weakest in the transition between the seasons.
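The smoothing by a fourth-order Fourier series can be done by ordinary least squares; a sketch with hypothetical data:

```python
import numpy as np

def fourier_smooth(series, n_harmonics=4):
    # Ordinary least-squares fit of a fourth-order Fourier series with the
    # annual period to a day-of-year series, as a way to smooth noisy
    # day-by-day estimates such as c_k.
    y = np.asarray(series, dtype=float)
    k = np.arange(len(y))
    cols = [np.ones(len(y))]
    for j in range(1, n_harmonics + 1):
        cols.append(np.cos(2.0 * np.pi * j * k / len(y)))
        cols.append(np.sin(2.0 * np.pi * j * k / len(y)))
    design = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ beta
```

A signal inside the harmonic basis is recovered exactly, while day-to-day noise is averaged away by the low-dimensional fit.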

The main implication for further modeling is that the process of the standardized daily maximum temperatures {*W* ′_{k}: *k* = 1, . . . , 365} is nonstationary, even though it has a stationary marginal distribution function *G*′.

### d. Structural assumptions

The analyses of the climatic samples reported in the preceding section and in the previous article (Krzysztofowicz and Evans 2008) support four assumptions upon which the meta-Gaussian model for the Markov prior distribution function will be built.

- The predictand time series {*W _{k}*: *k* = 1, . . . , 365} forms a *nonstationary Markov process of order one*. After the standardization, this process has distribution functions with the following properties.
- The marginal distribution function *G*′ is *stationary*.
- The family of 1-step transition distribution functions *R*′_{k} is *nonstationary*.
- The family of 1-step transition distribution functions is *locally stationary*: between day *k* − *l* and day *k*, every 1-step transition is governed by the same family of 1-step transition distribution functions *R*′_{k}; this applies to every *k* ∈ {1, . . . , 365} and every *l* ∈ {1, . . . , *L*}, when *L* is small.

Assumption 1 formalizes the basic dependence structure present in daily time series of many meteorological variates, for example the temperature (Lund et al. 2006). Assumption 2 has empirical support for the daily maximum temperature (Krzysztofowicz and Evans 2008); in general, it should be viewed as an approximation that may be reasonable for operational forecasting. Assumption 3 recognizes the empirical evidence supplied by the time series of the autocorrelation coefficients (Fig. 4). Assumption 4 states an approximation. Its graphical interpretation is that the plot of the autocorrelation coefficients in Fig. 4 can be approximated stepwise, using the step width of *L* days. Given the day-to-day variability of *c _{k}* within a relatively narrow envelope, it appears reasonable to assume that *c _{k}* does not vary appreciably within *L* = 14 days.

### e. Transition distribution functions

Under the local stationarity assumption, for any day *k* ∈ {1, . . . , 365}, given the 1-step autocorrelation coefficient Cor(*V _{k−1}*, *V _{k}*) = *c _{k}*, the *l*-step autocorrelation coefficient is

$$ \mathrm{Cor}(V_{k-l}, V_k) = c_k^l. $$

Thus, a single function, such as the Fourier series of *c _{k}* in Fig. 4, is sufficient to model the autocorrelation coefficients for all lead times. This property is obviously convenient for operational forecasting, and it ensures monotonicity of the *l*-step autocorrelation: *c*^{l}_{k} converges toward zero as *l* increases. For example, *c _{k}* = 0.50 (the lower bound in Fig. 4) yields *c*^{4}_{k} = 0.06 and *c*^{7}_{k} = 0.01, whereas *c _{k}* = 0.75 (the upper bound in Fig. 4) yields *c*^{4}_{k} = 0.32 and *c*^{7}_{k} = 0.13. Table 1 reports the *c _{k}* values used later in the examples.

_{k}*W*:

_{k}*k*= 1, . . . , 365} is constructed (Krzysztofowicz and Kelly 2000). For any day

*k*∈ {1, . . . , 365} and any lead time

*l*∈ {1, . . . ,

*L*}, the prior

*l*-step transition distribution function takes the formFor any

*p*such that 0 <

*p*< 1, the prior

*p*-probability quantile of

*W*, conditional on observation

_{k}*W*

_{k}_{−}

_{l}= w_{k}_{−}

*isThe prior*

_{l},*l*-step transition density function takes the form

Figure 5 shows two examples. For a short lead time (*l* = 1 in Figs. 5a,c), the prior 1-step transition density functions are shifted relative to the prior marginal density function; the shift direction and magnitude depend upon the antecedent observation *w*_{31}. There is also a reduction in the prior variance of *W*_{32}; this reduction depends upon *w*_{31} because of the heteroscedasticity of the Markov prior dependence structure (as revealed in Fig. 2). For a long lead time (*l* = 4 in Figs. 5b,d), the effect of the antecedent observation *w*_{31} is negligible. This is explained by the declining autocorrelation coefficient (Table 1): *c*_{32} = 0.572, but *c*^{4}_{35} = 0.147.
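The prior *l*-step transition quantile can be sketched as a composition of standardization, the NQT, the conditional normal quantile with the *l*-step coefficient, and the inverse transforms (the marginal *G*′ is taken standard normal here purely for illustration; in the application it would be the fitted climatic marginal):

```python
from statistics import NormalDist

ND = NormalDist()

def prior_transition_quantile(p, w_prev, l, c, m_k, s_k, m_prev, s_prev,
                              g_cdf=ND.cdf, g_inv=ND.inv_cdf):
    # Prior p-probability quantile of W_k given W_{k-l} = w_prev: standardize,
    # apply the NQT, take the conditional normal quantile with the l-step
    # coefficient c**l, then invert the NQT and destandardize.  g_cdf/g_inv
    # stand in for G' and its inverse (standard normal by default).
    cl = c ** l
    v_prev = ND.inv_cdf(g_cdf((w_prev - m_prev) / s_prev))
    v_p = cl * v_prev + (1.0 - cl * cl) ** 0.5 * ND.inv_cdf(p)
    return m_k + s_k * g_inv(ND.cdf(v_p))
```

With a standard normal marginal the conditional median reduces to *c*^{l}*w*_{prev}, which makes the decay of the antecedent's influence with lead time easy to see.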

## 4. Conditional likelihood function

The formulation and estimation of the meta-Gaussian model for the family of conditional likelihood functions follow the methodology of Krzysztofowicz and Evans (2008). Therefore, details are omitted and only the equations which define the parameters of the model are presented.

### a. Likelihood parameters

For each day *k* ∈ {1, . . . , 365} and each lead time *l* ∈ {1, . . . , *L*}, there are now three variates: the predictand *W _{k}* having the marginal distribution function *G _{k}*, the antecedent *W _{k−l}* having the marginal distribution function *G _{k−l}*, and the predictor *X _{kl}* having the marginal distribution function *K _{kl}*. Each variate is subjected to the NQT:

$$ V_k = Q^{-1}(G_k(W_k)), \qquad V_{k-l} = Q^{-1}(G_{k-l}(W_{k-l})), \qquad Z_{kl} = Q^{-1}(K_{kl}(X_{kl})). $$

The likelihood parameters *a _{kl}*, *b _{kl}*, *d _{kl}*, and *σ _{kl}* are defined by the Gaussian model

$$ Z_{kl} = a_{kl} V_k + d_{kl} V_{k-l} + b_{kl} + \Theta_{kl}, $$

in which Θ _{kl} is a variate stochastically independent of (*V _{k}*, *V _{k−l}*) and normally distributed with mean zero and variance *σ*^{2}_{kl}.

_{kl}Although the likelihood parameters are indexed by *k*, their values need not change from day to day because the performance of a forecasting system does not change, in a statistical sense, every 24 h. Thus, the frequency of updating the likelihood parameters may be dictated by operational considerations. For instance, under the adaptive scheme for sampling, estimation, and forecasting (Krzysztofowicz and Evans 2008), the standardized time series of each variate is assumed to be stationary and ergodic within the *sampling window* (about 90–120 days) and the subsequent *forecasting window* (about 5–10 days). Consequently, the likelihood parameters are reestimated every 5–10 days, after the sampling window shifts forward.

Table 2 reports the estimates obtained from two sampling windows. Of particular interest here are the values of *d _{kl}*. They are significantly different from zero for *l* = 1, 4 in the cool season and for *l* = 1 in the warm season. In other words, the antecedent observation explains (or predicts), in part, the error of a deterministic forecast up to 4 days ahead. Hence, the conditioning of the likelihood function on the antecedent observation *W _{k−l}* = *w _{k−l}*, as dictated by the theory of the Markov BPF (section 2c), serves not merely to ensure coherence but also to improve the probabilistic forecast.

### b. Forecast informativeness

To gauge the effect of the antecedent observation *W _{k−l}* on the probabilistic forecast of *W _{k}*, it is necessary to characterize the informativeness of the predictor *X _{kl}* alone. For this purpose, the likelihood function from the original BPF (Krzysztofowicz and Evans 2008) must be recalled. Its parameters *ȧ _{kl}*, *ḃ _{kl}*, and *σ̇ _{kl}* are defined by the Gaussian model

$$ Z_{kl} = \dot a_{kl} V_k + \dot b_{kl} + \dot\Theta_{kl}, \qquad \dot\sigma_{kl}^2 = \mathrm{Var}(\dot\Theta_{kl}), $$

and the *informativeness score* of predictor *X _{kl}* with respect to predictand *W _{k}* is given by

$$ \mathrm{IS}_{kl} = |\gamma_{kl}|. $$

The score is bounded, 0 ≤ IS _{kl} ≤ 1, with IS _{kl} = 0 for an uninformative predictor, and IS _{kl} = 1 for a perfect predictor. The quantity *γ _{kl}* is the Pearson’s product-moment correlation coefficient:

$$ \gamma_{kl} = \mathrm{Cor}(V_k, Z_{kl}) = \frac{\dot a_{kl}}{(\dot a_{kl}^2 + \dot\sigma_{kl}^2)^{1/2}}. $$

Table 3 reports the estimates obtained from two sampling windows.

_{kl}## 5. Posterior distribution from Markov prior

### a. Posterior parameters

Let *t*^{2}_{kl} = 1 − *c*^{2l}_{k}, the variance of the prior *l*-step transition distribution of *V _{k}* given *V _{k−l}* = *υ _{k−l}*. The posterior distribution of *V _{k}*, conditional on *Z _{kl}* = *z* and *V _{k−l}* = *υ _{k−l}*, is then normal, with variance and mean

$$ T_{kl}^2 = \frac{\sigma_{kl}^2\, t_{kl}^2}{a_{kl}^2 t_{kl}^2 + \sigma_{kl}^2}, \qquad \mu_{kl}(z, \upsilon_{k-l}) = \frac{a_{kl} t_{kl}^2\, (z - b_{kl} - d_{kl} \upsilon_{k-l}) + \sigma_{kl}^2\, c_k^l\, \upsilon_{k-l}}{a_{kl}^2 t_{kl}^2 + \sigma_{kl}^2}. $$

These parameters are for the lead time of *l* days, and are valid for every day *k* within the forecasting window, as explained in section 4a.
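The parameters follow from normal-normal conjugacy in the transformed space; the sketch below (names are illustrative) returns the posterior mean and variance of the transformed predictand:

```python
def posterior_normal_params(z, v_prev, a, b, d, sigma, c_l):
    # Normal-normal conjugacy in the transformed space:
    #   prior       V_k | v_prev ~ N(c_l * v_prev, t2),  t2 = 1 - c_l**2
    #   likelihood  Z | v_k, v_prev ~ N(a*v_k + d*v_prev + b, sigma**2)
    # Returns the posterior mean and variance of V_k given Z = z.
    t2 = 1.0 - c_l * c_l
    denom = a * a * t2 + sigma * sigma
    var = sigma * sigma * t2 / denom
    mean = (a * t2 * (z - b - d * v_prev) + sigma * sigma * c_l * v_prev) / denom
    return mean, var
```

Two limiting cases make the fusion visible: an uninformative predictor (*a* = 0) returns the Markov prior, while a nearly perfect predictor (*σ* → 0 with *a* = 1) returns the forecast itself.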

### b. Forecasting equations

Given the antecedent observation *W _{k−l}* = *w _{k−l}* and the deterministic forecast *X _{kl}* = *x _{kl}*, the probabilistic forecast of *W _{k}* on day *k* with the lead time of *l* days is specified by one of the following constructs (Krzysztofowicz and Kelly 2000): the posterior distribution function Φ _{kl}, the posterior density function *ϕ _{kl}*, or the posterior quantile function Φ^{−1}_{kl}, each obtained by composing the NQT with the normal posterior distribution of section 5a.

### c. Posterior functions

Figures 6 and 7 show examples of probabilistic forecasts of the daily maximum temperature in Savannah. (The notation in the figures is complete for the density functions but abbreviated for the distribution functions because of lack of space.)

For a cool day and the 24-h lead time (Fig. 6), the prior 1-step transition density function (Fig. 6d) differs from the prior marginal density function (Fig. 6c), but the two posterior density functions are not significantly affected by the choice of the prior density function. The reason is that the informativeness of the deterministic forecast (*γ*_{32,1} = 0.920) is high enough to render even a moderate autocorrelation of the predictand series (*c*_{32} = 0.572) useless.

For a warm day and the 96-h lead time (Fig. 7), the prior 4-step transition density function (Fig. 7d) differs from the prior marginal density function (Fig. 7c), and the two posterior density functions are significantly affected by the choice of the prior density function. Given the forecast *X*_{217,4} = 80°F, the posterior density function resulting from the Markov prior density function (Fig. 7d) is flatter and shifted toward the antecedent observation *W*_{213} = 105°F. The larger posterior variance may be explained by the large difference between the forecasted temperature and the antecedent observation. Given the forecast *X*_{217,4} = 95°F, the posterior density function resulting from the Markov prior density function (Fig. 7d) is sharper and also shifted toward *W*_{213} = 105°F. Overall, the Markov prior density function has a significant effect on the posterior density function. The reason is that the informativeness of the deterministic forecast (*γ*_{217,4} = 0.725) is low enough to make even a weak autocorrelation of the predictand series (*c*^{4}_{217} = 0.248) useful.

In summary, the usage of the Markov prior density function instead of the marginal prior density function may, or may not, affect (or improve) the resultant probabilistic forecast. It is important, therefore, to identify the conditions under which a significant effect occurs.

## 6. Comparison of forecasting models

### a. Experimental design

To isolate the impacts of the autocorrelation, the experiment employed a stylized stationary setting. A single prior marginal distribution function *G* was chosen for every day in the forecasting window. [In the notation of Krzysztofowicz and Evans (2008, their appendix), its parameters (*α* = 55, *β* = 6, *η* = 12) were representative of the cool season in Savannah.] Similarly, to eliminate forecast bias, the marginal distribution function of the forecast variate was set equal to the prior marginal distribution function of the predictand. Because no specific day is being considered, the *k* subscript can be dropped from the notation. Thus, the following distribution functions are equivalent for all *k* and *l* in the experiment:

$$ G_k = G_{k-l} = K_{kl} = G. $$

Next, the likelihood parameters were set to *a* = 1, *b* = 0, and *d* = 0. As a result, the informativeness score of the predictor *X* is given by the equation

$$ \mathrm{IS} = \gamma = (1 + \sigma^2)^{-1/2}, $$

and, with *t*^{2}_{l} = 1 − *c*^{2l}, the posterior parameters are given by the equations

$$ T_l^2 = \frac{\sigma^2 t_l^2}{t_l^2 + \sigma^2}, \qquad \mu_l(z, \upsilon_0) = \frac{t_l^2\, z + \sigma^2 c^l\, \upsilon_0}{t_l^2 + \sigma^2}. $$

Finally, given a deterministic forecast *X* = *x* and an antecedent observation *W*_{0} = *w*_{0}, the posterior quantile of *W* corresponding to probability *p* (0 < *p* < 1) is specified by the equation

$$ w_p = G^{-1}\!\left( Q\!\left( \mu_l(z, \upsilon_0) + T_l\, Q^{-1}(p) \right) \right), \qquad z = Q^{-1}(G(x)), \quad \upsilon_0 = Q^{-1}(G(w_0)). $$
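The experiment's forecasting equation can be sketched end to end (with the common marginal *G* taken standard normal purely for illustration, rather than the parametric marginal used in the article); it reproduces the qualitative behavior of Fig. 8, the posterior 50% CCI narrowing as the informativeness *γ* increases and, for fixed *γ*, as *c*^{l} increases:

```python
from statistics import NormalDist

ND = NormalDist()

def posterior_quantile_experiment(p, x, w0, gamma, c, l):
    # Experiment of section 6 with a = 1, b = 0, d = 0, so sigma^2 relates
    # to the informativeness score by sigma^2 = 1/gamma^2 - 1.  The common
    # marginal G is taken standard normal here, making the NQT the identity.
    sigma2 = 1.0 / (gamma * gamma) - 1.0
    t2 = 1.0 - c ** (2 * l)
    z = ND.inv_cdf(ND.cdf(x))      # NQT of the forecast under G = Q
    v0 = ND.inv_cdf(ND.cdf(w0))    # NQT of the antecedent observation
    denom = t2 + sigma2
    mean = (t2 * z + sigma2 * (c ** l) * v0) / denom
    var = sigma2 * t2 / denom
    return ND.inv_cdf(ND.cdf(mean + var ** 0.5 * ND.inv_cdf(p)))
```

The width of the 50% central credible interval is the difference between the 0.75- and 0.25-probability quantiles.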

One of the simplest probabilistic forecasts, which conveys a minimum information about uncertainty, consists of the posterior median, *w*_{0.5}, and the posterior 50% central credible interval (CCI), given by (*w*_{0.25}, *w*_{0.75}), whose width, *w*_{0.75} − *w*_{0.25}, is a measure of the posterior uncertainty. In the experiment, this forecast is assumed to be well calibrated (Krzysztofowicz and Sigrest 1999).

### b. Impacts of autocorrelation

Figure 8 shows how the width of the posterior 50% CCI varies with *γ*, *c*, and *l*. The horizontal line in each graph is the case with *c*^{l} = 0, effectively the case of the BPF with the marginal prior distribution function. The intercept of this line decreases with *γ*; the highest intercept is 13.4°F, which is the width of the prior 50% CCI under the marginal prior distribution function *G*.

In Figs. 8a–c, the antecedent temperature *W*_{0} = 64°F equals the climatic median of *W*, and the forecast temperature *X* = 60°F is nearby. For every value of *γ* and *c* (not only those shown in the three graphs), the posterior 50% CCI resulting from the Markov prior distribution is never wider than the posterior 50% CCI resulting from the marginal prior distribution.

In Figs. 8d–f, the antecedent temperature *W*_{0} = 46°F equals the 5th percentile of *W* under *G* and thus predicts a rather cold day, whereas the forecast temperature *X* = 78°F equals the 95th percentile and thus predicts a rather warm day. With the two predictor values being contradictory, there exists a region of *γ* and *c* values producing the posterior 50% CCI that is wider than the posterior 50% CCI resulting from the marginal prior distribution. This illustrates an important property of probabilistic forecasts based on two or more predictors: relative to the degree of uncertainty indicated by a single predictor, the degree of uncertainty indicated by two predictors may be smaller, or equal, or larger, depending on the predictor values. Loosely speaking, the predictor values may be either “confirmatory” (as in Figs. 8a–c), or “contradictory” (as in Figs. 8d–f).

With regard to the main objective of this study, Fig. 8 demonstrates that the impact of the autocorrelation in the predictand time series on the posterior uncertainty (i) is nonlinear and (ii) depends on the informativeness of the predictor: the less informative the predictor is, the greater is the impact of the autocorrelation.

Figure 9 depicts the probabilistic forecasts corresponding to the lowest curve (*l* = 1) and the highest curve (*l* = 7) in Figs. 8d–f. A forecast is represented by three posterior quantiles (*w*_{0.25}, *w*_{0.5}, *w*_{0.75}), given the particular values of *γ*, *c*, and *l*, and given “contradictory” predictor values *X* = 78°F, *W*_{0} = 46°F. The posterior median *w*_{0.5} approaches 46°F as *c* increases while *γ* is fixed, and approaches 78°F as *γ* increases while *c* is fixed. Obviously, the day 1 forecasts are more sensitive to changes in *c* than the day 7 forecasts.

Overall, the example in Fig. 9 demonstrates that the autocorrelation of the predictand time series does have an impact on the location of the posterior quantiles regardless of the *c* value (when *l* is small) and on the degree of posterior uncertainty when *c* is large enough.

### c. Comprehensive evaluation

#### 1) Methodology

There are four basic models for producing probabilistic forecasts that differ in their use of climatic data and deterministic forecasts:

- The climatic model—The forecast of *W*_{k} is simply the marginal prior (climatic) distribution function *G*_{k}.
- The Markov climatic model—The forecast of *W*_{k} is the Markov prior (climatic) distribution function *H*_{k}(·|*w*_{k−l}), conditional on the antecedent observation *W*_{k−l} = *w*_{k−l}.
- The BPF—The forecast of *W*_{k} is the posterior distribution function Φ_{k}(·|*x*_{k}), conditional on the deterministic forecast *X*_{k} = *x*_{k}, and derived from the marginal prior distribution function *G*_{k}.
- The Markov BPF—The forecast of *W*_{k} is the posterior distribution function Φ_{k}(·|*x*_{k}, *w*_{k−l}), conditional on the deterministic forecast *X*_{k} = *x*_{k}, and derived from the Markov prior distribution function *H*_{k}(·|*w*_{k−l}), which is conditional on the antecedent observation *W*_{k−l} = *w*_{k−l}.

Let *t* = *w*_{0.75} − *w*_{0.25} be the width of a 50% CCI, let *m* (*m* = 1, 2, 3, 4) be the index of the model, and let *t*(*m*) be the width of the 50% CCI under the distribution function output as the forecast by model *m*. Then the percent reduction in the width of the 50% CCI achieved by the Markov BPF (*m* = 4) relative to model *m* (*m* = 1, 2, 3) is

*R*_{4m} = 100[*t*(*m*) − *t*(4)]/*t*(*m*).

For each *m*, the surface of *R*_{4m} in the space of the informativeness score *γ* and the autocorrelation coefficient *c* is depicted by isoquants drawn at 10% increments. Of course, the values of *R*_{4m} depend upon the values of *X* and *W*_{0}; however, the pattern of isoquants is essentially invariant. For this reason, only the results for *X* = 60°F and *W*_{0} = 64°F are discussed.
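Because the 50% CCI width in the standardized Gaussian space is proportional to the posterior standard deviation, the surfaces *R*_{4m} can be sketched directly. This is a minimal sketch under the experiment's simplifications (*a* = 1, *b* = 0, *d* = 0); computing widths in the standardized space rather than in degrees Fahrenheit after the NQT is an illustrative assumption:

```python
import math

def posterior_std(gamma, c, l):
    """Posterior std dev in the standardized Gaussian space (a = 1, b = 0, d = 0);
    gamma = 0 or c = 0 recover the simpler models as special cases."""
    t2 = 1.0 - c ** (2 * l)                 # Markov prior variance
    if gamma == 0.0:
        return math.sqrt(t2)                # no forecast information
    sigma2 = (1.0 - gamma**2) / gamma**2    # likelihood variance implied by gamma
    return math.sqrt(t2 * sigma2 / (t2 + sigma2))

def r4m(gamma, c, l=1):
    """Percent reduction in 50% CCI width of the Markov BPF (m = 4) relative to
    m = 1 (climatic), m = 2 (Markov climatic), m = 3 (BPF); the width is
    proportional to the posterior std dev in the standardized space."""
    widths = {
        1: 1.0,                             # climatic: standard normal prior
        2: posterior_std(0.0, c, l),        # Markov climatic
        3: posterior_std(gamma, 0.0, l),    # BPF (marginal prior)
        4: posterior_std(gamma, c, l),      # Markov BPF
    }
    return {m: 100.0 * (widths[m] - widths[4]) / widths[m] for m in (1, 2, 3)}

print(round(r4m(0.6, 0.0)[1]))  # → 20
print(round(r4m(0.0, 0.6)[1]))  # → 20
```

The two printed values reproduce the exchangeability noted in section 6c(2): *c* = 0 with *γ* = 0.6, and *c* = 0.6 with *γ* = 0, both reduce the climatic 50% CCI width by about 20%.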

#### 2) Markov BPF versus climatic model

This comparison serves to evaluate gains from two simultaneous predictors, *X* and *W*_{0}. The surface of *R*_{41} (Fig. 10) shows an essentially symmetric influence of *γ* and *c*: when *γ* = *c*, the predictors *X* and *W*_{0} are essentially exchangeable. For example, the Markov BPF with *c* = 0 and *γ* = 0.6 (effectively using only predictor *X*) gives *R*_{41} of about 20%, which is the same as would be given by the Markov BPF with *c* = 0.6 and *γ* = 0 (effectively using only predictor *W*_{0}).

Finally, when either *γ* increases, or *c* increases, or both *γ* and *c* increase, the spacing of the *R*_{41} isoquants decreases, implying the increasing marginal gain in uncertainty reduction.

#### 3) Markov BPF versus Markov climatic model

This comparison serves to evaluate gains from the deterministic forecast *X* as the second predictor, adjoining the antecedent observation *W*_{0}. The surface of *R*_{42} (Fig. 11) shows (i) that, for a fixed *c*, an increase in *γ* reduces the uncertainty; (ii) that, to achieve a sizable reduction (say 10%), *γ* must be larger than a threshold (which increases with *c*); (iii) that the marginal gain in *R*_{42} increases as *γ* approaches 1; and (iv) that, as the autocorrelation rises, the informativeness must also be rising, at an increasing rate, to maintain a constant gain in *R*_{42}.

#### 4) Markov BPF versus BPF

This comparison serves to evaluate gains from the antecedent observation *W*_{0} as the second predictor, adjoining the deterministic forecast *X*. The surface of *R*_{43} (Fig. 12) shows (i) that, for a fixed *γ*, an increase in *c* reduces the uncertainty; (ii) that, to achieve a sizable reduction (say 10%), *c* must be larger than a threshold (which increases with *γ*); (iii) that the marginal gain in *R*_{43} increases as *c* approaches 1; and (iv) that, as the informativeness rises, the autocorrelation must also be rising, at an increasing rate, to maintain a constant gain in *R*_{43}.

Finally, Fig. 13 shows two surfaces of *R*_{43}, which result from “contradictory” predictor values, giving an idea about the range of variability of *R*_{43}, and confirming the essential invariance of the pattern of isoquants.

### d. Conclusions

The Markov BPF offers several advantages over the BPF and, of course, over the two climatic models. When the informativeness of the deterministic forecast is high but the autocorrelation of the predictand series is low, the Markov BPF automatically gives more weight to the deterministic forecast. When the informativeness is low but the autocorrelation is high, the Markov BPF gives more weight to the antecedent observation. Thus, in principle, the Markov BPF should always be preferred over the BPF because it can automatically account for any level of forecast informativeness and any degree of predictand autocorrelation. But as the development of the Markov BPF entails a greater cost than the development of the BPF, the following practical and preliminary (because of the limited scope of this study) guidelines may be offered. Depending upon the level of informativeness of the deterministic forecast, the Markov BPF should be considered a contender for operational implementation when the autocorrelation of the predictand time series is approximately between 0.3 and 0.6, and should be considered the preferred processor when the autocorrelation exceeds 0.6. (Nota bene: the above values of the autocorrelation coefficient pertain to variates that have been suitably transformed and conform to the multivariate Gaussian distribution.)

## 7. Closure

Three contributions to the field of probabilistic forecasting of nonstationary, discrete-time, continuous-state stochastic processes in meteorology have been presented. The first one is the meta-Gaussian Markov model; it characterizes the (nonstationary) autocorrelation of the process and provides, for each day (or some other suitable time step), a family of the (prior) *l*-step transition distribution functions, from which a climatic probabilistic forecast with the lead time of *l* days can be obtained, given an antecedent observation. The second one is the meta-Gaussian Markov BPF; it fuses the (prior) climatic probabilistic forecast with a deterministic forecast produced by any system (such as a numerical weather prediction model, a human forecaster, or a statistical postprocessor), and outputs a probabilistic forecast of the predictand; this forecast is in the form of a (posterior) *l*-step transition distribution function, which quantifies the uncertainty about the predictand that remains, given the antecedent observation and the deterministic forecast. The third one is the demonstration that the climatic autocorrelation of the predictand time series, when suitably exploited within the Bayesian forecast processor, can play a significant role in quantifying and in reducing the meteorological forecast uncertainty. Further research should, therefore, consider extending the meta-Gaussian Markov BPF so that a probabilistic forecast could be obtained from an ensemble of deterministic forecasts.

## Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant ATM-0641572, “New Statistical Techniques for Probabilistic Weather Forecasting.” The Meteorological Development Laboratory of the National Weather Service provided the data.

## REFERENCES

Brown, B. G., R. W. Katz, and A. H. Murphy, 1984: Time series models to simulate and forecast wind speed and wind power. *J. Climate Appl. Meteor.*, **23**, 1184–1195.

Campbell, S. D., and F. X. Diebold, 2005: Weather forecasting for weather derivatives. *J. Amer. Stat. Assoc.*, **100**, 6–16.

Gneiting, T., K. Larson, K. Westrick, M. G. Genton, and E. Aldrich, 2006: Calibrated probabilistic forecasting at the Stateline wind energy center: The regime-switching space–time method. *J. Amer. Stat. Assoc.*, **101**, 968–979.

Kelly, K. S., and R. Krzysztofowicz, 1995: Bayesian revision of an arbitrary prior density. *Proc. Section on Bayesian Statistical Science*, Alexandria, VA, American Statistical Association, 50–53.

Kelly, K. S., and R. Krzysztofowicz, 1997: A bivariate meta-Gaussian density for use in hydrology. *Stochastic Hydrol. Hydraul.*, **11**, 17–31.

Krzysztofowicz, R., 1983: A Bayesian Markov model of the flood forecast process. *Water Resour. Res.*, **19**, 1455–1465.

Krzysztofowicz, R., 1985: Bayesian models of forecasted time series. *Water Resour. Bull.*, **21**, 805–814.

Krzysztofowicz, R., 1992: Bayesian correlation score: A utilitarian measure of forecast skill. *Mon. Wea. Rev.*, **120**, 208–219.

Krzysztofowicz, R., and A. A. Sigrest, 1999: Calibration of probabilistic quantitative precipitation forecasts. *Wea. Forecasting*, **14**, 427–442.

Krzysztofowicz, R., and K. S. Kelly, 2000: Hydrologic uncertainty processor for probabilistic river stage forecasting. *Water Resour. Res.*, **36**, 3265–3277.

Krzysztofowicz, R., and H. D. Herr, 2001: Hydrologic uncertainty processor for probabilistic river stage forecasting: Precipitation-dependent model. *J. Hydrol.*, **249**, 46–68.

Krzysztofowicz, R., and W. B. Evans, 2008: Probabilistic forecasts from the National Digital Forecast Database. *Wea. Forecasting*, **23**, 270–289.

Lund, R., Q. Shao, and I. Basawa, 2006: Parsimonious periodic time series modeling. *Aust. N. Z. J. Stat.*, **48**, 33–47.

Murphy, A. H., and R. W. Katz, 1985: *Probability, Statistics, and Decision Making in the Atmospheric Sciences*. Westview Press, 545 pp.

Peroutka, M. R., G. J. Zylstra, and J. L. Wagner, 2005: Assessing forecast uncertainty in the National Digital Forecast Database. Preprints, *21st Conf. on Weather Analysis and Forecasting*, Washington, DC, Amer. Meteor. Soc., P2B.3. [Available online at http://ams.confex.com/ams/pdfpapers/94464.pdf.]

Wilks, D. S., 1995: *Statistical Methods in the Atmospheric Sciences: An Introduction*. Academic Press, 467 pp.

## APPENDIX

### Forecasting Experiment

#### Objective and design

The three contributions presented in this article are theoretic. How to implement them effectively in operational forecasting is a separate issue that cannot be treated thoroughly in the same article, if only because of its length. Yet a reviewer craved some verification results. Therefore, we performed a simple forecasting experiment using solely the data and the estimates already reported. The objective of this experiment is to illustrate the *coherence* and the *robustness* of our Bayesian theory on real data.

The four basic models compared in section 6c are employed to produce probabilistic forecasts of the daily maximum temperature in Savannah. The parameters of these models are set to the estimates reported in this article. To recall, the climatic model and the Markov climatic model have their parameters estimated (section 3) for each day of the year from a sample of size *M* = 595 recorded in 119 yr (1874–2001). The BPF and the Markov BPF have their likelihood parameters (Tables 2 and 3) estimated for a cool season and a warm season, and for each lead time, from a sample of size *N*, varying between 38 and 116, and recorded in 1.5 years: October 2004–January 2005 for cool season and April–July 2005 for warm season.

Forecasts with lead time of *l* days (*l* = 1, 4, 7) are next produced by each of the four models for every day for which the joint realization (*x*_{kl}, *w*_{k}, *w*_{k−l}) is available but was not included in the joint sample for the likelihood parameter estimation. Thereby the *verification sample* comprises days from 9 months: (i) February–March 2005 and October 2005–February 2006 in the cool season and (ii) August–September 2005 in the warm season. Four of the months (February, March, August, September) were not represented at all in the joint sample for the likelihood parameter estimation. Thus, the test is unfavorable to the BPF and the Markov BPF because it presumes the stationarity of the likelihood function during two months beyond the sampling window—which is a gross approximation, especially for longer lead times (here *l* = 4, 7), as evidenced by the differences in the likelihood parameter estimates for the cool season and the warm season (Tables 2 and 3).

#### Verification measures

Each forecast is represented by three *p*-probability quantiles of *W*_{k} (*p* = 0.25, 0.5, 0.75). The *calibration* of the *p*-probability quantile is evaluated in terms of *r*_{p}—the relative frequency with which the quantile is not exceeded by the predictand realization. The calibration of the forecast is evaluated in terms of the *calibration score* (Krzysztofowicz and Sigrest 1999):

CS = [(1/3) Σ_{p} (*r*_{p} − *p*)²]^{1/2}.

The score is bounded, 0 ≤ CS ≤ 0.677, with CS = 0 being the best; it represents the root-mean-square distance between the empirical and the normative nonexceedance probabilities.
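The calibration score can be sketched as follows; the root-mean-square form over the three quantiles is an assumption chosen to match the stated bounds 0 ≤ CS ≤ 0.677, not necessarily the exact expression of Krzysztofowicz and Sigrest (1999):

```python
import math

PROBS = (0.25, 0.50, 0.75)  # nominal nonexceedance probabilities

def calibration_score(r):
    """Root-mean-square distance between the empirical nonexceedance
    frequencies r_p and the nominal probabilities p (assumed form,
    consistent with the stated bounds 0 <= CS <= 0.677)."""
    return math.sqrt(sum((r[p] - p) ** 2 for p in PROBS) / len(PROBS))

cs_perfect = calibration_score({0.25: 0.25, 0.50: 0.50, 0.75: 0.75})  # best case
cs_worst = calibration_score({0.25: 1.0, 0.50: 0.0, 0.75: 0.0})       # worst case
print(round(cs_worst, 3))  # → 0.677
```

The worst case (every realization below the 0.25 quantile yet above the 0.5 and 0.75 quantiles, in relative-frequency terms) attains the stated upper bound of 0.677.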

The *sharpness* of forecast (equivalently, the degree of uncertainty indicated by the forecast about the predictand) is evaluated in terms of AW—the *average width* of the 50% CCI, in degrees Fahrenheit. Of course, AW is meaningful provided the 50% CCI is well calibrated; that is, *r*_{0.75} − *r*_{0.25} = 0.5, at least approximately.

The *informativeness* of the forecast is evaluated in terms of IS—the *informativeness score* calculated from the joint realizations of the forecast median (which in each of the four models carries all predictive information there is) and the predictand (Krzysztofowicz 1992). The score is bounded, 0 ≤ IS ≤ 1, with IS = 0 for an uninformative forecast model and IS = 1 for a perfect forecast model; it ranks the forecast models consistently with their economic values (which would be output from a Bayesian decision procedure with any utility function and any prior distribution function). Because the time series of the predictand within the 9-month verification window is obviously nonstationary (Krzysztofowicz and Evans 2008, their Fig. 2), the *IS* is calculated from the joint sample of the forecast-predictand vector in the standardized sample space; and because the marginal distribution function of each standardized variate is non-Gaussian, the joint sample is processed through the empirical NQT (Krzysztofowicz 1992, his appendix D).

[Note that the standardization of the forecast-predictand vector (i) does not affect the values of *r*_{p} and CS and (ii) does not alter the ranking of AW values across the forecast models, but only changes their scale from degrees Fahrenheit to dimensionless. Therefore, all three measures—CS, AW, and IS—are compatible.]

#### Hypotheses

For the months covered in this forecasting experiment, the autocorrelation coefficients *c*_{k} are in the range 0.52–0.72 (Fig. 4) and the informativeness scores *γ*_{k1} are about 0.9 (Table 3). When these values are inserted into Figs. 12 and 13, they yield the hypothesis that the impact of the climatic autocorrelation on forecasts of the daily maximum temperature at Savannah is *negligible*.

The second hypothesis is that the theoretical *coherence* between the four models is preserved despite the pitifully small joint samples (Tables 2 and 3) for the likelihood parameter estimation of the BPF and the Markov BPF. The coherence means (i) a stable calibration of models for all lead times and (ii) a stable preference order of models (climatic, Markov climatic, BPF, Markov BPF) in terms of sharpness and informativeness for all lead times.

The third hypothesis is that the Markov BPF is *robust* in that it performs no worse than the BPF despite (i) the experimental conditions in which its advantage is negligible, (ii) the two extra parameters (*c*_{k}, *d*_{kl}), and (iii) the potentially large parameter uncertainty due to the small joint samples.

#### Results

The calibration of the three quantiles of the climatic model (Table A1) reveals that the 9-month verification period was somewhat warmer than the historical period: the prior (climatic) distribution would have to be shifted to the right (so that the climatic median becomes the 0.43-probability quantile) to match the relative nonexceedance frequencies during the verification period. (Whether this shift is caused by a random fluctuation, and therefore is unpredictable, or by a long-term trend, and therefore can be accounted for, is of no interest herein.)

What is of interest is the stability of the calibration of all three quantiles across the models and the lead times (Table A1). Taking the calibration of the climatic model as the standard, the other three models are calibrated nearly as well, with the exception of the 0.25-probability quantiles from the BPF and the Markov BPF at *l* = 7; but even in these cases the largest relative miscalibration is only 0.08, and it can be traced to the warm season for which the samples were the smallest (the estimation sample of size 38; the verification sample of size 44).

Despite the warmer-than-normal verification period, the 50% CCI is about equally well calibrated across the models and the lead times (except for the Markov BPF at *l* = 7, where the miscalibration by 0.08 is the largest, for the reason explained above).

For the calibration of forecast (all three quantiles), the CS of the climatic model (Table A2) sets the standard: its values reflect the warmer-than-normal verification period, and its variability across the lead times reflects the small verification samples. Relative to this standard, the Markov climatic model is calibrated equally well, whereas the BPF and the Markov BPF are either slightly better calibrated or slightly worse calibrated, depending on the lead time. Based strictly on the CS, the best calibrated model is the BPF for *l* = 1, the Markov BPF for *l* = 4, and the Markov climatic model for *l* = 7. This shows that the complexity of the Bayesian forecasting models does not degrade their calibration, unless the estimation sample is pitifully small (as for *l* = 7 in the warm season), but even then the maximum miscalibration relative to the standard remains small (0.099 − 0.062 = 0.037).

The ranking of the four models in terms of AW is coherent. Theoretically, the climatic model has a constant AW independent of the lead time *l*, but because the verification sample size varies slightly with *l*, so does AW. For each of the other three models, AW should not decrease with *l*, and it does not. For each lead time *l*, AW should not increase with model complexity, and it does not.

The ranking of the four models in terms of IS is also coherent. Theoretically and practically, the climatic model is uninformative (IS = 0). For each of the other three models, *IS* should not increase with *l*, and it does not. For each lead time *l*, IS should not decrease with model complexity, and it does not, except for *l* = 7 where the order of scores 0.509 and 0.508 is reversed. This is a numerical fluke, and with the verification sample size of 213, the difference of 0.0009 is insignificant: the BPF and the Markov BPF are equally informative.

#### Conclusions

The confirmation of the first hypothesis illustrates the advantage of the theoretic analysis from section 6: it can predict the potential impact of the climatic autocorrelation in the Markov BPF, without the need for forecasting experiments.

The confirmation of the second and the third hypotheses illuminates two unique properties of our Bayesian forecasting theory: *coherence* and *robustness*. In particular, when the theory is properly implemented (using proper parametric models and proper estimation procedures), the four forecasting models (i) are about equally well calibrated and (ii) are progressively more informative: climatic model, Markov climatic model, BPF, Markov BPF. Thus, the seeming complexity of the BPF and of the Markov BPF is not a hindrance, despite the pitifully small joint samples for the likelihood parameter estimation, and despite the testing conditions being unfavorable to them (as explained in sections a and c of this appendix).

Overall, this forecasting experiment sheds some light on the empirical properties of our Bayesian forecasting models. But the full power of their coherence and robustness is yet to be demonstrated and appreciated.

Estimates of the autocorrelation coefficients obtained from the climatic sample at Savannah.

Estimates of the likelihood parameters for the Markov BPF obtained from the sampling windows ending before a cool day (31 Jan, *k* = 31) and a warm day (1 Aug, *k* = 213) at Savannah.

Estimates of the likelihood parameters for the BPF obtained from the sampling windows ending before a cool day (31 Jan, *k* = 31) and a warm day (1 Aug, *k* = 213) at Savannah.

Table A1. Calibration of the three quantiles and the 50% CCI.

Table A2. The CS, average width of the 50% CCI (AW) in degrees Fahrenheit, and IS.