## 1. Introduction

### a. The uncertainty quantification problem

The National Digital Forecast Database (NDFD) was designed by the National Weather Service (NWS) to store the official forecasts of the sensible weather elements produced by the NWS field offices throughout the United States (Glahn and Ruth 2003). The official forecasts are subjective in that they are made judgmentally by human forecasters with the support of software systems and are based on information from multiple sources, including output from numerical weather prediction models and guidance from the national centers. With the exception of the occurrence of precipitation, which is forecasted in terms of probability, all other weather elements are forecasted deterministically. Hence the deficiency of the NDFD: it contains no information about forecast uncertainty (Ryan 2003).

To remedy this deficiency, the Meteorological Development Laboratory of the NWS began developing statistical techniques for assessing the uncertainty in forecasts disseminated through the NDFD (Peroutka et al. 2005). This article presents a solution to the same problem, but via a different technique and in a different format.

### b. Bayesian processor of forecast

The Bayesian processor of forecast (BPF) for the NDFD is a specialized application of the Bayesian theory of probabilistic forecasting formulated and tested in various settings over the past two decades (e.g., Krzysztofowicz 1983; Alexandridis and Krzysztofowicz 1985; Krzysztofowicz and Watada 1986; Krzysztofowicz and Reese 1991; Krzysztofowicz 1999; Krzysztofowicz and Kelly 2000b).

The BPF developed and illustrated herein quantifies the uncertainty in a deterministic forecast of the daily maximum temperature—one of the predictands selected by the NWS for development and testing of their technique (Peroutka et al. 2005). In general, this BPF is applicable to any continuous predictand. The inputs to and the outputs from the BPF are as follows (Fig. 1). In the estimation phase, the inputs are a climatic sample of the predictand, and a joint sample of the forecast and the predictand for a given forecast point (a grid point or a station) and a specified lead time; the outputs are the values of parameters (as the BPF is entirely parametric). In the forecasting phase, the input is a *deterministic forecast* (a point estimate of the predictand) and the output is a *probabilistic forecast* (a distribution function, a density function, and a quantile function). Thus, the BPF outputs the complete and well-calibrated characterization of uncertainty needed by rational decision makers who use formal decision models and by information providers who want to extract various forecast products for their customers (e.g., quantiles with specified exceedance probabilities, credible intervals with specified inclusion probabilities, probabilities of exceedance for specified thresholds).

### c. Information fusion

In concept, the BPF quantifies the *total uncertainty* about a predictand, given a deterministic forecast. The quantification of uncertainty is accomplished via Bayes theorem, which extracts and fuses two kinds of information from two different sources (Fig. 1). (i) Information about the natural variability of the predictand is extracted from a climatic sample, which may be retrieved from the National Climatic Data Center (NCDC). (ii) Information about the predictive performance of the deterministic forecast is extracted from a joint sample, which may be retrieved from the NDFD.

The size of each sample may be limited, not only by the length of the record, but also by the requirement of statistical homogeneity. For instance, the nonstationarity of climate may necessitate a truncation of the available climatic sample, and a modification in the forecasting system (e.g., hiring of an experienced or a novice forecaster by a field office; an improvement of a numerical weather prediction model by a national center) may necessitate a truncation of the available joint sample. However, the most important fact here is that the joint sample is typically much shorter than the climatic sample. This gives rise to the statistical problem of information fusion, which can be solved correctly only by a proper application of Bayes theorem.

### d. Modeling approach

The key challenges that must be overcome during the development of a proper technique for quantifying uncertainty include the nonstationary behavior of the meteorological time series due to seasonality, the non-Gaussian form of the probability distributions of meteorological variates, and the nonlinear and heteroscedastic dependence structure between the forecast and the predictand. These features of meteorological data, and the ways the BPF handles them, are explained and illustrated throughout the article.

The article is organized as follows. Section 2 outlines the theoretic foundation of the BPF and defines its major components. Section 3 details the modeling and estimation of the first component: the prior (climatic) distribution function. Section 4 does the same for the second component: the family of the likelihood functions. Section 5 presents examples of probabilistic forecasts. Section 6 discusses several attributes of the BPF and the empirical results obtained for three stations. Section 7 summarizes the unique advantages of the BPF.

## 2. Bayesian processor

### a. Concept

Let *W* be the *predictand*—a continuous variate whose realization *w* is being forecasted. Let *X* be the *estimator*—a continuous variate whose realization *x* constitutes a point estimate of *W*.

From the viewpoint of the NWS, *x* is the official, deterministic forecast of *W* prepared by a field office. From the viewpoint of a rational decision maker, who recognizes the uncertainty about *W*, forecast *x* is merely a realization of the predictor *X*—a piece of information that may reduce the uncertainty about *W*, but cannot eliminate it. What the rational decision maker then needs is not a number *x*, but a function of *w*—the distribution function of predictand *W*, conditional on the predictor realization *X* = *x*. The purpose of the BPF is to supply such a conditional distribution function.

### b. Characterization of uncertainties

The inputs to the BPF are the prior density function and the family of likelihood functions. These inputs are defined and interpreted as follows.

Let *g* denote the *prior density function* of the predictand *W*. This density function characterizes the natural variability of *W*. Equivalently, it characterizes the uncertainty about the predictand that exists before the NWS issues a forecast. This uncertainty may be called the climatic uncertainty (or the prior uncertainty) and may be quantified based on climatic data.

Let *f* (·|*w*) denote the density function of the predictor *X*, conditional on the hypothesis that the realization of the predictand is *W* = *w*. This density function characterizes the variability of *X* on all those occasions on which *W* = *w* is observed. What is needed is a family of the conditional density functions {*f* (·|*w*): all *w*}. Then, for a fixed forecast *X* = *x*, a function *f* (*x*|·) exists; it is called the likelihood function of *W*. More generally, there exists a *family of the likelihood functions* {*f* (*x*|·): all *x*}. The family *f* quantifies the stochastic dependence between the predictor *X* and the predictand *W*.

### c. Bayesian revision

First, the *expected density function* *κ* of predictor *X* is derived via the total probability law:

*κ*(*x*) = ∫ *f* (*x*|*w*)*g*(*w*) *dw*.  (1)

Second, the *posterior density function* *ϕ*(·|*x*) of predictand *W*, conditional on a deterministic forecast *X* = *x*, is derived via Bayes theorem:

*ϕ*(*w*|*x*) = *f* (*x*|*w*)*g*(*w*)/*κ*(*x*).  (2)

In concept, Bayes theorem revises the prior density function *g*, which characterizes the climatic uncertainty about *W*, given forecast *X* = *x*. The extent of the revision is determined by the likelihood function *f* (*x*|·), which characterizes the degree to which *X* = *x* reduces the uncertainty about *W*. The result of this revision is the posterior density function *ϕ*(·|*x*); it quantifies the uncertainty about *W* that remains after the NWS issues forecast *X* = *x*.

The *posterior distribution function* Φ(·|*x*) of predictand *W* is defined by

Φ(*w*|*x*) = ∫_{−∞}^{*w*} *ϕ*(*t*|*x*) *dt*.  (3)

It gives *P*(*W* ≤ *w*|*X* = *x*) = Φ(*w*|*x*), where *P* stands for probability. The inverse function Φ^{−1}(·|*x*) is called the posterior quantile function. For any number *p*, such that 0 < *p* < 1, and any deterministic forecast *X* = *x*, the *p*-*probability posterior quantile* of predictand *W* is the quantity *w _{p}* such that Φ(*w _{p}*|*x*) = *p*. Therefrom,

*w _{p}* = Φ^{−1}(*p*|*x*).  (4)

Equations (1) and (2) define the theoretic structure of the BPF. Equations (2)–(4) specify the three outputs, each of which constitutes the probabilistic forecast of *W*, given deterministic forecast *x*.
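
Equations (1)–(4) can be implemented numerically on a discrete grid. The sketch below uses Gaussian placeholder densities for *g* and *f* purely for illustration (they are not the meta-Gaussian forms developed later in the article, and all parameter values are hypothetical):

```python
import numpy as np
from scipy import stats

# Grid over the predictand's sample space (bounds are illustrative).
w = np.linspace(-10.0, 40.0, 2001)
dw = w[1] - w[0]

# Placeholder models: a Gaussian prior density g, and a Gaussian
# conditional density f(x|w) of the forecast given the predictand.
g = stats.norm(loc=15.0, scale=8.0).pdf(w)

def f(x, w):
    return stats.norm(loc=0.9 * w + 1.0, scale=3.0).pdf(x)

x = 20.0                          # deterministic forecast
like = f(x, w)                    # likelihood function f(x|.)
kappa = np.sum(like * g) * dw     # Eq. (1): expected density of X at x
phi = like * g / kappa            # Eq. (2): posterior density of W
Phi = np.cumsum(phi) * dw         # Eq. (3): posterior distribution function
Phi /= Phi[-1]                    # normalize the cumulative quadrature

# Eq. (4): posterior quantiles by inverse interpolation of Phi.
p = np.array([0.1, 0.5, 0.9])
w_p = np.interp(p, Phi, w)
```

With the rectangle-rule quadrature above, the posterior density integrates to one by construction, and the quantiles follow by monotone interpolation.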

### d. Bayesian meta-Gaussian model

The theoretic structure of the BPF can be implemented in many ways, as different mathematical models for *g* and *f* can be formulated, and different solution techniques for *ϕ*, Φ, and Φ^{−1} can be developed. A particularly elegant BPF is Gaussian linear (Krzysztofowicz 1983, 1987; Krzysztofowicz and Watada 1986; Krzysztofowicz and Reese 1991), but its applicability in meteorology is limited to a few predictands (with near-Gaussian *g*) and a few short lead times (situations wherein a linear and homoscedastic dependence structure between *X* and *W* is mostly found).

Our objective is to propose a BPF of wide applicability—in that the predictand *W* and the predictor *X* are allowed to have distribution functions of any form, and the dependence structure between *X* and *W* is allowed to be nonlinear and heteroscedastic (which is the case with most meteorological forecasts, especially for longer lead times). This is the meta-Gaussian BPF (Kelly and Krzysztofowicz 1995; Krzysztofowicz and Kelly 2000b). In addition to the aforementioned advantages, it subsumes the Gaussian-linear BPF and, thus, is a proper generalization thereof.

### e. Modeling and estimation

The remaining sections describe the modeling process, the estimation procedure, the goodness of fit to data, the statistical properties, and the practical advantages of the meta-Gaussian BPF. In all illustrations, the predictand is the daily maximum temperature; the forecast lead times are 24 h (1 day), 96 h (4 days), and 168 h (7 days) after 0000 UTC; the forecast points are the three stations: Savannah, Georgia (KSAV); Portland, Maine (KPWM); and Kalispell, Montana (KFCA).

## 3. The prior distribution function

### a. Climatic sample

Let *W _{k}* denote the maximum temperature on day *k* of the year (*k* = 1, . . . , 365) at a given station. To keep the number of days in each year constant, for simplicity, 29 February is excluded. Because the time series {*W _{k}*: *k* = 1, . . . , 365} is obviously nonstationary, the climatic sample is formed for each day *k*. To increase the sample size, data are pooled from the consecutive 5 days centered on the given day. (The 5-day window offers a reasonable compromise between increasing the sample size and precluding the nonstationarity effects.) Because of missing data, the sample size varies from day to day. To ensure uniformity, for comparison and convenience, the oldest data are removed from each day until the sample size for that day equals the smallest size among the 365 samples.

In summary (Table 1), for each day *k* of the year (*k* = 1, . . . , 365), there is a climatic sample {*w _{k}*(*n*): *n* = 1, . . . , *M*} of size *M*, where *w _{k}*(*n*) denotes the *n*th observation of the maximum temperature in the 5-day *sampling window* for day *k*.

### b. Standardization

The daily time series of sample deciles, the lowest observation, and the highest observation at Savannah are plotted in Fig. 2a. As expected, these time series confirm the nonstationarity of the maximum temperature. Hence, each predictand *W _{k}* has a different prior distribution function *G _{k}* (*k* = 1, . . . , 365). Whereas modeling and estimation of 365 different distribution functions is feasible—and has been done by the NWS (Peroutka et al. 2005)—we seek a more efficient method of handling the nonstationarity.

The seasonality of *W _{k}* can be removed via standardization. First, for each day *k* (*k* = 1, . . . , 365), the mean *E*(*W _{k}*) = *m _{k}* and the variance Var(*W _{k}*) = *s*^{2}_{k} are estimated from the *original climatic sample* {*w _{k}*(*n*): *n* = 1, . . . , *M*}. For operational forecasting, the time series of the estimates can be smoothed and approximated by first-order Fourier series expansions (Wilks 1995), as shown in Fig. 3; herein, they are used directly. Second, each observation is standardized:

*w*′_{k}(*n*) = [*w _{k}*(*n*) − *m _{k}*]/*s _{k}*.  (5)

The resultant *standardized climatic sample* {*w*′_{k}(*n*): *n* = 1, . . . , *M*} is reanalyzed. Figure 2b shows the daily time series of the sample statistics: all decile time series are fairly flat, suggesting that (i) the seasonality of the maximum temperature is explained almost entirely by the seasonality of the mean and variance, and (ii) the prior distribution function *G*′_{k} of the standardized maximum temperature is approximately stationary. Of course, the standardization guarantees the stationarity of the first two moments: *E*(*W*′_{k}) = 0 and Var(*W*′_{k}) = 1 for *k* = 1, . . . , 365. The qualification “approximately” stationary is made, at least tentatively, because the time series of the lowest and the highest standardized observations exhibit some variability and a slight trend in the range, which appears wider in the warm season than in the cool season; however, these are extreme observations, and only two per day. Thus, their practical significance cannot be ascertained until the distribution functions are estimated.

### c. Empirical distribution function

Results are reported for four diverse days in order to make a convincing case for the stationarity of the standardized maximum temperature. The chosen days are (Table 2) one of the coldest, one of the warmest, one with the maximum range in Fig. 2b, and one with the minimum range in Fig. 2b.

For each chosen day *k*, the empirical distribution function of *W*′_{k} was constructed; it is specified by the set of *M* points {(*w*′_{k(n)}, *p _{n}*): *n* = 1, . . . , *M*}, where *w*′_{k(n)} is the *n*th realization in the sample for day *k* sorted in the ascending order, *p _{n}* = *n*/(*M* + 1) is the Weibull plotting position, and *M* is the sample size.
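
The construction of these points is direct; a minimal sketch:

```python
import numpy as np

def empirical_df(sample):
    """Empirical distribution function points (w'_(n), p_n), with the
    Weibull plotting positions p_n = n / (M + 1)."""
    w_sorted = np.sort(sample)
    M = len(w_sorted)
    p = np.arange(1, M + 1) / (M + 1)
    return w_sorted, p

# Tiny illustrative sample (M = 5), so p = [1/6, 2/6, 3/6, 4/6, 5/6].
sample = np.array([0.3, -1.2, 0.8, -0.1, 1.5])
w_sorted, p = empirical_df(sample)
```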

### d. Parametric distribution function

A parametric distribution function *G*′_{k} of *W*′_{k} was estimated and its goodness of fit to the empirical distribution function was evaluated in terms of the maximum absolute difference (MAD), a measure consistent with the Kolmogorov–Smirnov statistic:

MAD = max{|*G*′_{k}(*w*′_{k(n)}) − *p _{n}*|: *n* = 1, . . . , *M*}.  (9)

When there were several identical realizations in the sample, they formed a step in the empirical distribution function; the median plotting position in this step was used to calculate the absolute difference.

Excellent fits were obtained (Table 3) for all 4 days with *G*′_{k} coming from the Weibull family of distributions (see the appendix). Moreover, the parameter estimates were similar across the 4 days, implying that the standardization successfully removed the seasonality. Therefore, instead of estimating a different distribution function for every day of the year, it may be sufficient to estimate just one distribution function that will be valid for every day.

### e. Stationary prior distribution function

Under the stationarity hypothesis, *W* ′_{k} = *W* ′ and *G*′_{k} = *G*′ for *k* = 1, . . . , 365, where *W* ′ is the stationary-standardized maximum temperature on any day of the year and *G*′ is the stationary prior distribution function of *W* ′.

To obtain *G*′, the standardized climatic samples for the four diverse days are pooled together and a single parametric distribution function is estimated from the pooled sample (Fig. 4). It is a Weibull distribution (see the appendix) with parameter estimates *α* = 5.409, *β* = 5.570, and *η* = −5. [Although temperature is bounded from below at absolute zero, a lower bound of −5 barely restricts the sample space: under the standard normal distribution, *P*(*W* ′ < −5) = 0.2867 × 10^{−6}.] The goodness of fit of this stationary parametric distribution function to the empirical distribution function for each day is reported in Table 3. When the stationary parameter estimates are used, the MAD is somewhat higher for 2 days and slightly lower, by chance, for the other 2 days. Overall, the fits remain very good (MAD < 0.05), corroborating the stationarity hypothesis.
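
Fitting a shifted Weibull distribution with the lower bound fixed at −5 and computing the MAD of (9) can be sketched with `scipy.stats.weibull_min`; the sample below is synthetic, drawn from the parameter values quoted in the text:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the pooled standardized climatic sample; the
# generating parameters (alpha=5.409, beta=5.570, eta=-5) are those
# quoted in the text for Savannah.
alpha, beta, eta = 5.409, 5.570, -5.0
sample = stats.weibull_min.rvs(beta, loc=eta, scale=alpha, size=500,
                               random_state=np.random.default_rng(2))

# Fit a three-parameter Weibull with the shift fixed at eta = -5.
beta_hat, loc_hat, alpha_hat = stats.weibull_min.fit(sample, floc=eta)

# Goodness of fit, Eq. (9): maximum absolute difference between the
# fitted and the empirical distribution functions.
w_sorted = np.sort(sample)
p = np.arange(1, sample.size + 1) / (sample.size + 1)   # plotting positions
fitted = stats.weibull_min.cdf(w_sorted, beta_hat, loc=loc_hat, scale=alpha_hat)
mad = np.max(np.abs(fitted - p))
```

Note the SciPy parameterization: the shape argument `c` corresponds to *β*, `scale` to *α*, and `loc` to the shift *η*.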

### f. Destandardization

For each day *k* (*k* = 1, . . . , 365), the transformation between the stationary distribution function *G*′ of *W*′ and the distribution function *G _{k}* of *W _{k}* is

*G _{k}*(*w*) = *G*′((*w* − *m _{k}*)/*s _{k}*),  (10)

giving *P*(*W _{k}* ≤ *w*) = *G _{k}*(*w*) at any point *w* in the original sample space of the maximum temperature. The transformation between the corresponding density functions is

*g _{k}*(*w*) = (1/*s _{k}*) *g*′((*w* − *m _{k}*)/*s _{k}*).  (11)

When *G*′ is the Weibull distribution with parameters (*α*, *β*, *η*), it can be shown, via (10), that *G _{k}* is also the Weibull distribution with parameters (*α _{k}*, *β*, *η _{k}*), where

*α _{k}* = *s _{k}α*,  *η _{k}* = *m _{k}* + *s _{k}η*.  (12)

Thus, the scale parameter *α _{k}* and the shift parameter *η _{k}* capture the seasonality of the maximum temperature, whereas the shape parameter *β* captures the intrinsic, season-invariant, stochasticity of the maximum temperature.
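
The parameter mapping in (12) and its consistency with (10) can be checked directly; the day-*k* climatic moments below are hypothetical:

```python
import numpy as np
from scipy import stats

# Stationary standardized prior: Weibull with the parameters from the text.
alpha, beta, eta = 5.409, 5.570, -5.0

def destandardize_params(m_k, s_k):
    """Eq. (12): parameters of G_k from (alpha, beta, eta) and the day's
    climatic mean m_k and standard deviation s_k."""
    return s_k * alpha, beta, m_k + s_k * eta

m_k, s_k = 62.0, 9.5                      # hypothetical day-k moments
alpha_k, beta_k, eta_k = destandardize_params(m_k, s_k)

# Consistency check with Eq. (10): G_k(w) = G'((w - m_k)/s_k).
w = 70.0
lhs = stats.weibull_min.cdf(w, beta_k, loc=eta_k, scale=alpha_k)
rhs = stats.weibull_min.cdf((w - m_k) / s_k, beta, loc=eta, scale=alpha)
```

The two evaluations agree to machine precision, reflecting that the destandardization is an affine change of variable.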

The Weibull distribution functions *G _{k}* with the parameters calculated according to (12) are plotted along with the empirical distribution functions in the original sample spaces in Fig. 5. The nonstationarity of *G _{k}* is vivid, yet the fit remains excellent for each day. [The MAD defined in (9) remains invariant under the destandardization.]

The above procedure, illustrated herein for Savannah, performed equally well for Portland and Kalispell, with the only distinction being that the best-fit parametric distribution for Portland turned out to be log-logistic rather than Weibull. This suggests that the best form of the distribution may be different for different climatic regions.

In conclusion, the prior distribution functions *G _{k}* for all days of the year (*k* = 1, . . . , 365) can be specified by (i) a single, stationary, three-parameter distribution drawn from one of the common families (Johnson and Kotz 1970a,b), with parameters (*α*, *β*, *η*) estimated from the pooled standardized sample, and (ii) the means and the standard deviations of the maximum temperatures {(*m _{k}*, *s _{k}*): *k* = 1, . . . , 365} estimated from the climatic samples for all days. Once estimated, the prior distribution functions remain valid until enough additional climatic observations are collected to detect a change in climate since the last estimation.

### g. Comparison of models

The NWS took a different approach to modeling the nonstationary distribution functions *G _{k}* for *k* = 1, . . . , 365 (Peroutka et al. 2005). It chose a four-parameter generalized lambda distribution (GLD) of Karian and Dudewicz (2000); then it modeled the four time series of daily parameter estimates by fitting cosine series. We calculated the parameter values for each of the four test days and evaluated the goodness of fit of the GLD to the empirical distribution function (Table 3). The fit of the GLD is decisively inferior to the fit of the Weibull distribution. In an extended test, we compared the two models on the first day of every month and then calculated the average MAD; it is 0.0270 for the Weibull model and 0.0456 for the GLD.

Overall, the model employing the standardization of the daily variates (as a means of obtaining a stationary time series) and a single, stationary, three-parameter Weibull distribution fits the data better than does the model employing a nonstationary four-parameter GLD. Moreover, the estimation of the nonstationary GLD is more complex because of the cosine series involved, and the use of the GLD is computationally far more demanding because there are no closed-form expressions for the distribution function and the density function. (The GLD is defined by its quantile function—the inverse of the distribution function.)

## 4. The likelihood function

### a. Joint sample

Let *x _{k}* denote a deterministic forecast of the maximum temperature on day *k* of the year at a given station, issued by a NWS field office with a specified lead time; the lead times considered herein are 24, 96, and 168 h after 0000 UTC. Let *w _{k}* denote the corresponding observation of the maximum temperature. The pair (*x _{k}*, *w _{k}*) forms a joint realization of (*X _{k}*, *W _{k}*), where *X _{k}* is the forecast variate (the predictor, from the viewpoint of a rational decision maker) and *W _{k}* is the predictand. A joint sample {(*x _{k}*, *w _{k}*)} can be retrieved from the NDFD; however, its usage poses three challenges.

_{k}First, the available joint sample of (*X _{k}*,

*W*) is typically much shorter than the climatic sample of

_{k}*W*. For example, for the three stations analyzed herein, the joint samples are about 1 yr long (Table 4) whereas the climatic samples are more than 100 yr long (Table 1). The idea of augmenting the joint sample via simulation (Krzysztofowicz and Kelly 2000b) or “reforecasting” (Hamill et al. 2006) is not applicable here because the official NWS forecasts are subjective—they incorporate numerous human judgments made by different forecasters at different aggregation levels, or scales, as in national centers and field offices (Glahn and Ruth 2003)—and it is infeasible to determine, efficiently and reliably, the judgmental modifications of various model outputs that would have been made in years past by the currently employed NWS forecasters.

_{k}Second, the joint sample should be homogeneous: all forecasts included in it should have been produced by one forecast system—the same system for which the BPF is being developed. Inasmuch as the numerical weather prediction models, which provide guidance to forecasters, undergo modifications, and the forecasters in the NWS field offices change (as some relocate or retire and others are hired), the homogeneous joint sample may, in fact, be shorter than the sample stored in the NDFD.

Third, the performance of the forecast system may be nonstationary; for instance, forecasts produced during a cool season may be more informative than forecasts produced during a warm season. This means that the stochastic dependence between *X _{k}* and *W _{k}*, which must be captured by the likelihood function, varies with season; in general, it varies with *k* (*k* = 1, . . . , 365). Therefore, the likelihood function must be allowed to be nonstationary, despite the lack of an adequate sample to develop and estimate a formal nonstationary model.

### b. Adaptive scheme

To cope with the first challenge, it is necessary to formulate a statistical technique that can extract information from a small joint sample and then fuse it with information extracted from a large climatic sample. As this is the unique capability of the Bayesian approach, this approach becomes inevitable.

To cope with the other two challenges, an *adaptive scheme* is formulated for sampling, estimation, and forecasting as follows.

- The joint sample is taken from about 90–120 days preceding a forecast day—the *sampling window* within which the system homogeneity and performance stationarity can be assumed. (The two time series {*X _{k}*} and {*W _{k}*} are still treated as nonstationary.)
- The likelihood function estimated from a given joint sample is used in forecasting on each of the subsequent 5–10 days—the *forecasting window* to which the sampling assumptions are extended.
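
The window bookkeeping implied by the adaptive scheme can be sketched with a hypothetical helper (`sampling_window_days` and its day-of-record indexing are assumptions for illustration, not part of the article):

```python
def sampling_window_days(forecast_day, lead_days, window_len=90):
    """Hypothetical helper: indices of issuance days whose forecast-
    observation pairs can enter the joint sample.  Only forecasts whose
    verifying observation is already available on the forecast day are
    usable, so issuance days within lead_days of the forecast day are
    excluded; negative indices reach into the preceding year."""
    last_usable = forecast_day - lead_days   # one past the latest usable day
    return list(range(last_usable - window_len, last_usable))

# A 90-day sampling window for a 168-h (7-day) forecast made on day 200.
days = sampling_window_days(forecast_day=200, lead_days=7, window_len=90)
```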

### c. Joint standardization

Like the time series of the predictand {*W _{k}*: *k* = 1, . . . , 365}, the time series of the forecast variate {*X _{k}*: *k* = 1, . . . , 365} is nonstationary. But unlike *W _{k}*, which could be characterized statistically using a large climatic sample, *X _{k}* cannot be characterized in the parallel manner because the joint sample is so small. Therefore, we resort to the following procedure.

Each joint realization (*x _{k}*, *w _{k}*) is standardized using the climatic mean *m _{k}* and the climatic standard deviation *s _{k}* of predictand *W _{k}* for day *k*:

*x*′_{k} = (*x _{k}* − *m _{k}*)/*s _{k}*,  *w*′_{k} = (*w _{k}* − *m _{k}*)/*s _{k}*.  (13)

The plots of the original time series {(*x _{k}*, *w _{k}*)} and the standardized time series {(*x*′_{k}, *w*′_{k})} in Fig. 6 for the lead time of 168 h show how this standardization eliminates, or at least reduces, the seasonality in the joint sample. Ditto for other lead times and stations. (Note the degree of similarity between the time series of forecasts and observations.)

Each sampling window comprises *N* days. Realizations are retrieved from the *N* days preceding the current forecast day (the day on which the forecast is to be made) to obtain a joint sample {(*x*′(*n*), *w*′(*n*)): *n* = 1, . . . , *N*}. This is considered to be a random sample of the pair of variates (*X*′, *W*′) such that *X*′ = *X*′_{k} and *W*′ = *W*′_{k} for every day *k* within the forecasting window for which a forecast is to be made with the specified lead time. While this standardization guarantees *E*(*W*′) = 0 and Var(*W*′) = 1, no presumption is made, or needed, about the moments of *X*′.

The sampling windows and the sample sizes, *N*, chosen for the examples are listed in Table 4. Note that the joint samples include only past forecasts for which the corresponding observations are available on the forecast day. So, for the 168-h lead time, no forecasts issued within 7 days of the forecast day are included. The sample sizes vary because of missing data.
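The joint standardization of section 4c can be sketched as follows (the climatic moments and the three forecast-observation pairs are hypothetical):

```python
import numpy as np

def standardize_joint_sample(x, w, day_idx, m, s):
    """Standardize each forecast-observation pair (x_k, w_k) with the
    climatic mean m_k and standard deviation s_k of the predictand for
    the day k on which the forecast verifies (section 4c)."""
    x = np.asarray(x, float)
    w = np.asarray(w, float)
    mk, sk = m[day_idx], s[day_idx]
    return (x - mk) / sk, (w - mk) / sk

# Hypothetical mini-sample: three pairs verifying on days 10, 11, 12,
# with flat climatic moments for simplicity.
m = np.full(365, 40.0)
s = np.full(365, 10.0)
x_std, w_std = standardize_joint_sample([45.0, 38.0, 52.0], [47.0, 35.0, 50.0],
                                        np.array([10, 11, 12]), m, s)
# x_std = [0.5, -0.2, 1.2]
```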

### d. Marginal distribution function

Let *K̄*′ denote the marginal distribution function of the standardized forecast variate *X*′, such that *K̄*′(*x*′) = *P*(*X*′ ≤ *x*′) for any *x*′. This *K̄*′ is to be estimated from the marginal sample {*x*′(*n*): *n* = 1, . . . , *N*} of the standardized joint sample. [The overbar signifies that *K̄*′ is only an initial estimate of the marginal distribution function of *X*′; this estimate can be revised later as a result of modeling (Krzysztofowicz and Kelly 2000a, b); but because the revised estimate is not needed for the operational BPF, it is not discussed herein.]

Figure 7 shows the empirical distribution functions constructed as explained in section 3c, and the estimated parametric distribution functions *K̄*′, all Weibull, for two sampling windows (cool and warm) and two lead times (24 and 168 h) at Savannah. The fits are good (Table 5). As the lead time increases, the distribution function becomes steeper and concentrated around the median. The effect of season is slight and is confounded by the difference in sample sizes.

A comparison of Fig. 7 with Fig. 4 reveals that in each of the four cases, *K̄*′ ≠ *G*′; most significantly, the tails of *K̄*′ are much shorter than the tails of *G*′. Clearly, the forecasts have a distributional bias: it is evident in the shape parameter (*β̄* vs *β*), but not in the scale parameter (*ᾱ* vs *α*), and it increases with lead time. Of course, the distributional bias implies a bias in both the mean and the variance of the forecast.

The general patterns shown here for Savannah hold also for Portland and Kalispell, with the exception of the distribution type, which is log-logistic at Portland.

The marginal distribution function *K _{k}* of the forecast variate *X _{k}* on day *k*, with the specified lead time, is obtained from *K̄*′ via (14):

*K _{k}*(*x*) = *K̄*′((*x* − *m _{k}*)/*s _{k}*),  (14)

giving *P*(*X _{k}* ≤ *x*) = *K _{k}*(*x*) at any point *x* in the original sample space of the maximum temperature. When the adaptive scheme (section 4b) is deployed, *K _{k}* is nonstationary (because the climatic mean *m _{k}* and standard deviation *s _{k}* vary with day *k* of the year) even though the function *K̄*′ remains the same on all forecast days within the specified forecasting window. However, *K̄*′ is different for different lead times, as can be seen in Fig. 7.

When *K̄*′ is the Weibull distribution with parameters (*ᾱ*, *β̄*, *η̄*), *K _{k}* is also the Weibull distribution with parameters (*α _{k}*, *β̄*, *η _{k}*); the relations between the corresponding parameters parallel those in (12). For example, the forecast of the maximum temperature at KSAV on 7 February made with the lead time of 168 h (7 days) is characterized by the variate *X*_{38} whose marginal distribution function *K*_{38} is Weibull with parameters

*α*_{38} = *s*_{38}*ᾱ*,  *η*_{38} = *m*_{38} + *s*_{38}*η̄*,  (15)

where the values of *ᾱ*, *β̄*, *η̄* are those estimated for the 168-h lead time, and *m*_{38}, *s*_{38} come from Fig. 3. Four examples of *K _{k}* for different days and lead times are shown in Fig. 8.

### e. Meta-Gaussian likelihood model

It remains to model the stochastic dependence between the standardized forecast variate *X*′ and the standardized predictand *W*′. Toward this end, we employ the meta-Gaussian likelihood model of Krzysztofowicz and Kelly (2000a, b). At the heart of this model is the *normal quantile transform* (NQT):

*Z* = *Q*^{−1}(*K̄*′(*X*′)),  *V* = *Q*^{−1}(*G*′(*W*′)),  (16)

where *Q* is the standard normal distribution function and *Q*^{−1} is its inverse. The NQT guarantees that in the sample space of the transformed variates (*Z*, *V*), each marginal distribution function is Gaussian (Kelly and Krzysztofowicz 1995, 1997), and several empirical studies using hydrological and meteorological data have demonstrated that the bivariate distribution function is Gaussian as well (e.g., Herr and Krzysztofowicz 2005; Krzysztofowicz and Kelly 2000b). Readers interested in these properties of the meta-Gaussian likelihood model are referred to the works cited above. The presentation below focuses on the practical estimation of the likelihood parameters.

### f. Likelihood parameters

Given the standardized joint sample {(*x*′(*n*), *w*′(*n*)): *n* = 1, . . . , *N*} for the specified lead time, the marginal distribution function *K̄*′ for the specified lead time, and the prior distribution function *G*′, each joint realization is processed through the NQT to obtain

*z*(*n*) = *Q*^{−1}(*K̄*′(*x*′(*n*))),  *υ*(*n*) = *Q*^{−1}(*G*′(*w*′(*n*))).  (17)

Then, the transformed joint sample {(*z*(*n*), *υ*(*n*)): *n* = 1, . . . , *N*} is used to estimate the parameters *a*, *b*, *σ* of the Gaussian model:

*Z* = *aV* + *b* + Θ,  (18)

in which the residual variate Θ is stochastically independent of *V* and normally distributed with mean zero and variance *σ*^{2}. The maximum likelihood estimators should be used (DeGroot 1986).
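
For the Gaussian model (18), maximum likelihood estimation reduces to ordinary least squares for *a* and *b*, with *σ* the root-mean-square residual; a sketch on synthetic transformed data (the true parameter values are arbitrary):

```python
import numpy as np

def likelihood_parameters(z, v):
    """ML estimates for the Gaussian model Z = aV + b + Theta: a and b by
    least squares (equivalent to ML under Gaussian residuals), and sigma
    as the 1/N root-mean-square of the residuals."""
    v = np.asarray(v, float)
    z = np.asarray(z, float)
    a = np.cov(v, z, bias=True)[0, 1] / np.var(v)
    b = z.mean() - a * v.mean()
    resid = z - a * v - b
    sigma = np.sqrt(np.mean(resid ** 2))
    return a, b, sigma

rng = np.random.default_rng(4)
v = rng.standard_normal(2000)
z = 0.8 * v + 0.1 + 0.5 * rng.standard_normal(2000)   # true a, b, sigma
a, b, sigma = likelihood_parameters(z, v)
```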

Let 0 < *p* < 1. The *p*-probability quantile of *Z*, conditional on *V* = *υ*, is the quantity *z _{p|υ}* such that *P*(*Z* ≤ *z _{p|υ}*|*V* = *υ*) = *p*. Under the Gaussian model,

*z _{p|υ}* = *aυ* + *b* + *σQ*^{−1}(*p*).  (19)

As is well known, the linear regression equals the conditional median: *E*(*Z*|*V* = *υ*) = *z*_{0.5|*υ*}. The mapping of (19) into the standardized sample space gives

*x*′_{p|w′} = *K̄*′^{−1}(*Q*(*aQ*^{−1}(*G*′(*w*′)) + *b* + *σQ*^{−1}(*p*))),  (20)

where *x*′_{p|w′} is the *p*-probability quantile of *X*′, conditional on *W*′ = *w*′.
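
The conditional quantile curves of *X*′ given *W*′ = *w*′ can be traced by composing the three monotone maps in the mapping of (19); the distribution functions and likelihood parameters below are hypothetical stand-ins:

```python
import numpy as np
from scipy import stats

K = stats.weibull_min(4.0, loc=-4.0, scale=4.0)      # hypothetical marginal of X'
G = stats.weibull_min(5.570, loc=-5.0, scale=5.409)  # prior G' (text parameters)
a, b, sigma = 0.8, 0.0, 0.5                          # hypothetical likelihood params

def conditional_quantile(p, w_std):
    """p-probability quantile of X' given W' = w': map w' to v by the NQT,
    take the Gaussian conditional quantile of Z, and map back through K."""
    v = stats.norm.ppf(G.cdf(w_std))
    z_p = a * v + b + sigma * stats.norm.ppf(p)
    return K.ppf(stats.norm.cdf(z_p))

# Three conditional quantile curves (p = 0.1, 0.5, 0.9) over a grid of w'.
w_grid = np.linspace(-2.0, 2.0, 5)
lo, med, hi = (conditional_quantile(p, w_grid) for p in (0.1, 0.5, 0.9))
```

Because every map in the composition is monotone and *a* > 0, the median curve increases with *w*′ and the quantile curves never cross.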

### g. Model validation

Validation of the meta-Gaussian likelihood model amounts to checking the three requirements of the Gaussian model (18):

- *linearity*—the regression of *Z* on *V* must be linear;
- *homoscedasticity*—the variance *σ*^{2} of the residual Θ = *Z* − *aV* − *b* must be independent of *V*; and
- *normality*—the distribution function of Θ must be normal (Gaussian).

To visualize the dependence structure and to judge the requirements, scatterplots should be produced, as shown in Fig. 9 for two lead times. The scatterplots of the transformed sample points (*z*(*n*), *υ*(*n*)) in Figs. 9c and 9d are linear and homoscedastic, and the Gaussian model (19), depicted by three conditional quantile functions, with *p* = 0.1, 0.5, 0.9, fits the sample points well. The scatterplots of the standardized sample points (*x*′(*n*), *w*′(*n*)) in Figs. 9a and 9b are slightly nonlinear and heteroscedastic, especially for the 168-h lead time. In particular, the regression *E*(*Z*|*V* = *υ*) = *z*_{0.5|*υ*}, which is linear (Figs. 9c and 9d), is mapped into the conditional median *x*′_{0.5|w′}, which is slightly nonlinear (Figs. 9a and 9b). The 80% central credible interval about the conditional median of *Z*, whose width *z*_{0.9|*υ*} − *z*_{0.1|*υ*} is constant with *υ* (Figs. 9c and 9d), is mapped into the 80% central credible interval about the conditional median of *X*′, whose width *x*′_{0.9|w′} − *x*′_{0.1|w′} decreases with *w*′ (Figs. 9a and 9b). The homoscedasticity is validated in Figs. 10a and 10b: the scatter of the residuals appears to be independent of the transformed observation. The normality is validated in Figs. 10c and 10d: the quantile–quantile plot of the residuals is nearly linear.

In summary, these analyses demonstrate that applying the Gaussian model directly in the standardized sample space of *X*′ and *W* ′ would be a wrong approach for the data at hand. Further evidence to this effect is given in section 6a.

### h. Forecast informativeness

Under the Gaussian model (18), the stochastic dependence between *Z* and *V* is fully characterized by Pearson's product-moment correlation coefficient *γ*, which may be expressed in terms of the likelihood parameters (Krzysztofowicz 1992). Under the meta-Gaussian likelihood model, *γ* remains a fully efficient measure of stochastic dependence between the standardized variates *X*′ and *W*′, as well as between the original variates *X* and *W* (Kelly and Krzysztofowicz 1997), and it can be transformed into Spearman's rank correlation coefficient *ρ* = (6/*π*) arcsin(*γ*/2). The values of both measures, *γ* and *ρ*, are reported in Fig. 9.
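For a bivariate Gaussian dependence structure, the transformation between the two measures is the standard arcsine relation *ρ* = (6/*π*) arcsin(*γ*/2). A quick numerical check on a synthetic standardized sample (the helper names are ours, not the paper's):

```python
import math
import random

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    def ranks(u):
        order = sorted(range(len(u)), key=lambda i: u[i])
        r = [0] * len(u)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(x), ranks(y))

# Standard bivariate Gaussian sample with correlation gamma_true.
random.seed(2)
gamma_true = 0.8
x = [random.gauss(0.0, 1.0) for _ in range(4000)]
w = [gamma_true * xi + math.sqrt(1.0 - gamma_true ** 2) * random.gauss(0.0, 1.0)
     for xi in x]

g = pearson(x, w)                              # estimate of gamma
rho_formula = (6.0 / math.pi) * math.asin(g / 2.0)  # arcsine transformation
rho_sample = spearman(x, w)                    # direct rank correlation
```

The two estimates of *ρ* agree closely, which is what makes *γ* a sufficient summary of the dependence under the meta-Gaussian model.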

A measure of the informativeness of predictor *X* with respect to predictand *W* is the *informativeness score* IS (Krzysztofowicz 1987, 1992). The score is bounded, 0 ≤ IS ≤ 1, with IS = 0 for an uninformative predictor and IS = 1 for a perfect predictor. [The informativeness score was called the Bayesian correlation score in the original publication by Krzysztofowicz (1992).] The value of IS is determined by the signal-to-noise ratio |*a*|/*σ*, estimated by regressing *Z* on *V*, as illustrated in Fig. 9: the absolute value of the slope coefficient, |*a*|, is the measure of signal, and the standard deviation of the residual, *σ*, is the measure of noise. Figure 9 shows that, as expected, the forecast with lead time of 24 h is significantly more informative than the forecast with lead time of 168 h.
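As a sketch, the signal-to-noise ratio can be normalized onto [0, 1]; the particular form below, IS = |a|/√(a² + σ²), is an assumed stand-in for the paper's definition, used only to illustrate the bounds and the monotone behavior:

```python
import math

def informativeness_score(a, sigma):
    """Illustrative normalization of the signal-to-noise ratio |a|/sigma onto
    [0, 1]; an assumed stand-in, not the paper's own definition."""
    return abs(a) / math.sqrt(a * a + sigma * sigma)

# Signal |a| and noise sigma as would be estimated by regressing Z on V
# (cf. Fig. 9): a short lead time (strong signal) vs. a long one (weak signal).
is_24h = informativeness_score(0.9, 0.4)
is_168h = informativeness_score(0.5, 0.9)
```

Any monotone map of |a|/σ onto [0, 1] would reproduce the qualitative ordering: the 24-h forecast scores higher than the 168-h forecast.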

## 5. Probabilistic forecast

### a. Posterior parameters

The posterior parameters *A*, *B*, *T* are calculated for each day *k* within the forecasting window, as described in section 4b.

### b. Forecasting equations

To simplify the forecasting equations, the index *k* of the predictand's day is omitted, which is equivalent to the following substitutions in the forecasting problem. Given a deterministic forecast *x* = *x_{k}*, with a specified lead time, of predictand *W* = *W_{k}*, and given (i) the prior distribution function *G* = *G_{k}* and the corresponding prior density function *g* = *g_{k}* for day *k*, (ii) the marginal distribution function *K* = *K_{k}* for day *k* and for the specified lead time, and (iii) the posterior parameters *A*, *B*, *T* for the specified lead time, the probabilistic forecast is specified by one of the following constructs.

The posterior distribution function Φ of *W*, defined in (3), is specified by the equation

Φ(*w*|*x*) = Q{[Q^{−1}(*G*(*w*)) − *A*Q^{−1}(*K*(*x*)) − *B*]/*T*},

where Q denotes the standard normal distribution function, and where it is to be understood that Φ(*w*) = Φ(*w*|*x*), with the abbreviated form being used when the value *x* of the deterministic forecast need not be shown. The *p*-probability posterior quantile of *W*, defined in (4), is specified by the equation

*w_{p}* = *G*^{−1}(Q{*A*Q^{−1}(*K*(*x*)) + *B* + *T*Q^{−1}(*p*)}).

The posterior density function *ϕ* of *W*, defined in (2), is specified by the equation

*ϕ*(*w*|*x*) = (1/*T*) q{[Q^{−1}(*G*(*w*)) − *A*Q^{−1}(*K*(*x*)) − *B*]/*T*} *g*(*w*)/q{Q^{−1}(*G*(*w*))},

where q denotes the standard normal density function, and it is understood that *ϕ*(*w*) = *ϕ*(*w*|*x*) when *x* need not be shown.
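These constructs are straightforward to implement. The sketch below uses an assumed meta-Gaussian reconstruction of the forecasting equations, consistent with the posterior parameters *A*, *B*, *T* and the functions *G* and *K* but not copied from the paper's numbered equations; the toy prior and marginal are Gaussian purely for illustration:

```python
from statistics import NormalDist

N = NormalDist()  # standard normal: Q = N.cdf, q = N.pdf, Q^{-1} = N.inv_cdf

def posterior_cdf(w, x, G, K, A, B, T):
    """Posterior distribution function Phi(w | x), assumed meta-Gaussian form."""
    return N.cdf((N.inv_cdf(G(w)) - A * N.inv_cdf(K(x)) - B) / T)

def posterior_quantile(p, x, G_inv, K, A, B, T):
    """p-probability posterior quantile; closed-form inverse of posterior_cdf."""
    return G_inv(N.cdf(A * N.inv_cdf(K(x)) + B + T * N.inv_cdf(p)))

# Toy specification: Gaussian prior for W and Gaussian marginal for X
# (all numbers are illustrative, not estimated from the paper's data).
G = NormalDist(55.0, 8.0).cdf
G_inv = NormalDist(55.0, 8.0).inv_cdf
K = NormalDist(55.0, 9.0).cdf
A, B, T = 0.8, 0.0, 0.6

w_median = posterior_quantile(0.5, 70.0, G_inv, K, A, B, T)  # given x = 70
```

As in Fig. 11, the deterministic forecast *x* = 70 shifts the posterior median away from the prior median of 55 toward the forecasted value.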

### c. Posterior functions

Figure 11 shows examples of probabilistic forecasts for Savannah, in the cool season, with the lead times of 24 and 168 h. The posterior distribution functions (Figs. 11a and 11b), and the corresponding posterior density functions (Figs. 11c and 11d), quantify the uncertainty about the maximum temperature, given a deterministic forecast, either *X* = 50°F or *X* = 70°F. They also illustrate how the climatic uncertainty, quantified in terms of the prior distribution function (Figs. 11a and 11b) and the prior density function (Figs. 11c and 11d), is revised based on a deterministic forecast. Two effects of the forecast are apparent. First, the forecast shifts the center of the probability mass (under the density function) toward the forecasted temperature. Second, the forecast usually (but not always) reduces the uncertainty as the posterior density function is sharper than the prior density function; however, this reduction of uncertainty depends on the forecast and the lead time. Forecast *X* = 70°F reduces the uncertainty more than forecast *X* = 50°F, which reflects the heteroscedasticity of the dependence structure between *X*′ and *W* ′ that was captured by the likelihood function. But as the lead time increases, from 24 to 168 h, the degree by which the forecast reduces the climatic uncertainty diminishes: at 168 h, the posterior density functions are only slightly sharper than the prior density function.

### d. Limit of predictability

One of the most important properties of the BPF is its limiting behavior: when the informativeness score IS decreases to zero, the posterior distribution function Φ converges uniformly to the prior distribution function *G*. (Likewise *ϕ* converges to *g*.) Why is this practically important? The example (Fig. 9) illustrates a common property of meteorological forecasts: the longer the lead time, the lower the informativeness of the forecast. (This is further illustrated in section 6a.) At the limit of predictability, the forecast contains no signal (*a* = 0), just noise (*σ* > 0); thus, IS = 0. It follows that as the lead time approaches the limit of predictability, the BPF guarantees that the posterior distribution functions Φ converge to the prior distribution function *G*—the behavior that is evident already in Fig. 11. At the limit of predictability, Φ = *G*, which is the climatic probabilistic forecast for the day because *G* was estimated from the climatic sample for the day. In conclusion, at the limit of predictability and beyond, the BPF automatically provides the decision maker with the correct and complete assessment of the climatic uncertainty.
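This limiting behavior is easy to verify numerically under an assumed meta-Gaussian posterior form (an illustrative reconstruction, not the paper's numbered equation): with no signal, *A* = *B* = 0 and *T* = 1, and the posterior coincides with the prior for every forecast value *x*:

```python
from statistics import NormalDist

N = NormalDist()

def posterior_cdf(w, x, G, K, A, B, T):
    # Assumed meta-Gaussian posterior form (illustrative reconstruction).
    return N.cdf((N.inv_cdf(G(w)) - A * N.inv_cdf(K(x)) - B) / T)

# Toy Gaussian prior for W and marginal for X (illustrative numbers).
G = NormalDist(55.0, 8.0).cdf
K = NormalDist(55.0, 9.0).cdf

# At the limit of predictability the forecast carries no signal (A = B = 0,
# T = 1), so Phi collapses onto G regardless of the forecast value x.
gaps = [abs(posterior_cdf(w, x, G, K, 0.0, 0.0, 1.0) - G(w))
        for w in (40.0, 55.0, 70.0) for x in (45.0, 65.0)]
```

Every gap is numerically zero: the decision maker receives exactly the climatic probabilistic forecast.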

## 6. Discussion

### a. Forms of heteroscedasticity

The heteroscedasticity of the dependence structure between *X*′ and *W* ′ may take many forms. That is why a flexible model for the likelihood function, such as the meta-Gaussian model, is necessary in order to correctly capture the forecast uncertainty. For instance, at Savannah, the conditional variance of *X*′ decreases with *w*′ (the 80% central credible interval in Fig. 9a narrows). But at Portland, the conditional variance of *X*′ increases with *w*′ (the 80% central credible interval in Fig. 12a widens). Loosely speaking, the 24-h forecasts are “better” for higher temperatures at Savannah, but for lower temperatures at Portland.

For the decision maker, the practical effect of this heteroscedasticity resurfaces in the posterior density function. As the forecasted temperature increases, the posterior density function becomes sharper at Savannah (Fig. 11c)—implying decreasing uncertainty, but grows flatter at Portland (Fig. 13c)—implying increasing uncertainty.

### b. Revision of uncertainty

The probabilistic forecasts for Portland (Fig. 13) illustrate also the interaction between the climatic uncertainty about the predictand and the informativeness of the forecast. The prior density functions (Figs. 13c and 13d) indicate that the climatic uncertainty is larger on the cool day than on the warm day. The likelihood parameters (Figs. 12c and 12d) indicate that the 24-h forecast is more informative on a cool day than on a warm day. The Bayesian revisions of (i) a larger climatic uncertainty based on a more informative forecast (for the cool day) and (ii) a smaller climatic uncertainty based on a less informative forecast (for the warm day) turn out to give nearly identical posterior uncertainty: when the plots in Fig. 13c are shifted 2°F to the right and then superposed on the plots in Fig. 13d, the posterior density functions match (almost). This illustrates the compensatory nature of the interaction between the climatic uncertainty and the forecast informativeness, which only Bayes theorem can capture correctly and fully (in the shape of the revised density function).

### c. Comparison of performance

Figure 14 presents a compilation of the likelihood parameters *a*, *b*, and *σ* for three stations, two seasons, and three lead times. The compilation includes also the informativeness score, IS, for the same stations, seasons, and seven lead times ranging from 24 to 168 h. As the lead time increases, the signal measure *a* decreases, the noise measure *σ* increases (with one exception, explainable by the variability due to small sample size; see Table 4), and, consequently, the informativeness score IS decreases (again, with one exception). The informativeness score is the most stable measure. It declines at a similar rate for each station and season up to the lead time of 96 h. After that, the decline is more rapid for all but one case, and the differences between the stations and the seasons grow larger. Finally, the bias parameter *b* is largely independent of the lead time, but highly variable across stations and seasons.

### d. Generalization of parameters

For implementation of the BPF in the NDFD, where it is desirable to store as few parameters as possible, an attempt should be made to determine which parameters vary least across stations, seasons, and lead times, and thus could be generalized. Figure 14 suggests some possibilities. Since both IS and *a* exhibit relatively small variability across stations and seasons, perhaps a simple model could describe each of them as a function of the lead time. Then the values of IS and *a* could be used to calculate *σ* via (23). Therefore, only the two functions describing IS and *a* and the values of *b* would need to be stored.
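A sketch of this storage scheme, under the assumed relation IS = |a|/√(a² + σ²) standing in for (23) (the paper's actual equation may differ in form): given stored values of IS and *a*, the noise *σ* is recovered in closed form, so it need not be stored.

```python
import math

def informativeness(a, sigma):
    """Assumed stand-in for (23): IS as a normalized signal-to-noise ratio."""
    return abs(a) / math.sqrt(a * a + sigma * sigma)

def sigma_from(a, IS):
    """Closed-form inverse of the assumed relation: recover sigma from a and IS."""
    return abs(a) * math.sqrt(1.0 / (IS * IS) - 1.0)

# Round trip: sigma is reconstructed exactly from the stored a and IS.
sigma_rec = sigma_from(0.9, informativeness(0.9, 0.4))
```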

## 7. Closure

We have presented a theoretically based, empirically validated Bayesian processor that can be attached to the NDFD in order to process a deterministic forecast of a continuous predictand into a full-fledged probabilistic forecast. This probabilistic forecast is specified by a distribution function, or a density function, or a quantile function. Thus, it provides the complete characterization of uncertainty needed by rational decision makers who use formal decision models and by information providers who want to extract various forecast products for their customers.

The BPF offers the users a quintessential property: the posterior probability of any event can be taken at its face value. This is so because the Bayesian probabilistic forecast is well calibrated against the climatic distribution function of the predictand, which is estimated from the longest available homogeneous sample that can be retrieved from the NCDC or some other archive.

The BPF comes furnished with a statistical procedure that copes with several nonstationarities of the stochastic processes involved and that ensures a parsimonious parameterization. In effect, all archived data pertinent to a given predictand and forecast lead time are processed into two samples for parameter estimation: the climatic sample of the predictand, and the joint sample of the predictor–predictand pair. The two samples may be of different sizes, as is almost always the case in meteorology. The unique attribute of the BPF is that it extracts information from each sample and then fuses it according to Bayes theorem, thereby ensuring that the resultant probabilistic forecast is always well calibrated and most informative, given the data at hand.

## Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant ATM-0641572, “New Statistical Techniques for Probabilistic Weather Forecasting.” The Meteorological Development Laboratory of the National Weather Service provided the data. Matthew R. Peroutka introduced the authors to the NDFD and the uncertainty quantification problem; his cooperation is much appreciated.

## REFERENCES

Alexandridis, M. G., and R. Krzysztofowicz, 1985: Decision models for categorical and probabilistic weather forecasts. *Appl. Math. Comput.*, **17**, 241–266.

DeGroot, M. H., 1986: *Probability and Statistics*. 2nd ed. Addison-Wesley, 724 pp.

Glahn, H. R., and D. P. Ruth, 2003: The new digital forecast database of the National Weather Service. *Bull. Amer. Meteor. Soc.*, **84**, 195–201.

Hamill, T. M., J. S. Whitaker, and S. L. Mullen, 2006: Reforecasts: An important dataset for improving weather predictions. *Bull. Amer. Meteor. Soc.*, **87**, 33–46.

Herr, H. D., and R. Krzysztofowicz, 2005: Generic probability distribution of rainfall in space: The bivariate model. *J. Hydrol.*, **306**, 234–263.

Johnson, N., and S. Kotz, 1970a: *Distributions in Statistics: Continuous Univariate Distributions*. Vol. 1. Wiley, 300 pp.

Johnson, N., and S. Kotz, 1970b: *Distributions in Statistics: Continuous Univariate Distributions*. Vol. 2. Wiley, 306 pp.

Karian, Z. A., and E. J. Dudewicz, 2000: *Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalized Bootstrap Methods*. CRC Press, 456 pp.

Kelly, K. S., and R. Krzysztofowicz, 1995: Bayesian revision of an arbitrary prior density. *Proc. Section on Bayesian Statistical Science*, Alexandria, VA, American Statistical Association, 50–53.

Kelly, K. S., and R. Krzysztofowicz, 1997: A bivariate meta-Gaussian density for use in hydrology. *Stochastic Hydrol. Hydraul.*, **11**, 17–31.

Krzysztofowicz, R., 1983: Why should a forecaster and a decision maker use Bayes theorem. *Water Resour. Res.*, **19**, 327–336.

Krzysztofowicz, R., 1987: Markovian forecast processes. *J. Amer. Stat. Assoc.*, **82**, 31–37.

Krzysztofowicz, R., 1992: Bayesian correlation score: A utilitarian measure of forecast skill. *Mon. Wea. Rev.*, **120**, 208–219.

Krzysztofowicz, R., 1999: Bayesian forecasting via deterministic model. *Risk Anal.*, **19**, 739–749.

Krzysztofowicz, R., and L. M. Watada, 1986: Stochastic model of seasonal runoff forecasts. *Water Resour. Res.*, **22**, 296–302.

Krzysztofowicz, R., and S. Reese, 1991: Bayesian analyses of seasonal runoff forecasts. *Stochastic Hydrol. Hydraul.*, **5**, 295–322.

Krzysztofowicz, R., and K. S. Kelly, 2000a: Bayesian improver of a distribution. *Stochastic Environ. Res. Risk Assess.*, **14**, 449–470.

Krzysztofowicz, R., and K. S. Kelly, 2000b: Hydrologic uncertainty processor for probabilistic river stage forecasting. *Water Resour. Res.*, **36**, 3265–3277.

Peroutka, M. R., G. J. Zylstra, and J. L. Wagner, 2005: Assessing forecast uncertainty in the National Digital Forecast Database. Preprints, *21st Conf. on Weather Analysis and Forecasting*, Washington, DC, Amer. Meteor. Soc., 2B.3.

Ryan, R. T., 2003: Digital forecasts: Communication, public understanding, and decision making. *Bull. Amer. Meteor. Soc.*, **84**, 1001–1005.

Schütte, T., O. Salka, and S. Israelsson, 1987: The use of the Weibull distribution for thunderstorm parameters. *J. Climate Appl. Meteor.*, **26**, 457–463.

Wilks, D. S., 1995: *Statistical Methods in the Atmospheric Sciences*. Academic Press, 468 pp.

## APPENDIX

### Weibull Distribution

Let *Y* be a continuous variate having a bounded-below sample space (*η*, ∞). Let *η* < *y* < ∞ and 0 < *p* < 1. Then variate *Y* has a Weibull distribution with the scale parameter *α* > 0, the shape parameter *β* > 0, and the shift parameter −∞ < *η* < ∞ if the distribution function of *Y* is

*F*(*y*) = 1 − exp{−[(*y* − *η*)/*α*]^{β}},

the density function of *Y* is

*f*(*y*) = (*β*/*α*)[(*y* − *η*)/*α*]^{β−1} exp{−[(*y* − *η*)/*α*]^{β}},

and the quantile function of *Y* is

*y_{p}* = *η* + *α*[−ln(1 − *p*)]^{1/β}.

For properties and estimation methods, see Johnson and Kotz (1970a) and Schütte et al. (1987).
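The three closed-form functions of the shifted Weibull distribution translate directly into code. A minimal sketch, with a round-trip check of the quantile against the distribution function (parameter values are arbitrary):

```python
import math

def weibull_cdf(y, alpha, beta, eta=0.0):
    """Distribution function of the shifted Weibull variate Y on (eta, inf)."""
    if y <= eta:
        return 0.0
    return 1.0 - math.exp(-(((y - eta) / alpha) ** beta))

def weibull_pdf(y, alpha, beta, eta=0.0):
    """Density function of the shifted Weibull variate Y."""
    if y <= eta:
        return 0.0
    u = (y - eta) / alpha
    return (beta / alpha) * u ** (beta - 1.0) * math.exp(-(u ** beta))

def weibull_quantile(p, alpha, beta, eta=0.0):
    """p-probability quantile; closed-form inverse of weibull_cdf, 0 < p < 1."""
    return eta + alpha * (-math.log(1.0 - p)) ** (1.0 / beta)
```

With *β* = 1 and *η* = 0, the distribution reduces to the exponential with scale *α*, which provides a convenient sanity check.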

Climatic samples for estimation of the prior distribution functions.

Days and climatic sample statistics for the prior distribution functions at KSAV.

MAD between the empirical prior distribution function of *W* ′_{k} for the given day *k* and an estimated parametric prior distribution function at KSAV.

Joint samples for estimation of the likelihood functions.

MAD between the empirical marginal distribution function of *X*′ and the estimated Weibull marginal distribution function of *X*′ at KSAV.