Model-Inspired Predictors for Model Output Statistics (MOS)

Piet Termonia Royal Meteorological Institute, Brussels, Belgium

and
Alex Deckmyn Royal Meteorological Institute, Brussels, Belgium


Abstract

This article addresses the problem of the choice of the predictors for the multiple linear regression in model output statistics. Rather than devising a selection procedure directly aimed at the minimization of the final scores, it is examined whether taking the model equations as a guidance may render the process more rational. To this end a notion of constant fractional errors is introduced. Experimental evidence is provided that they are approximately present in the model and that their impact is sufficiently linear to be corrected by a linear regression. Of particular interest are the forcing terms in the coupling of the physics parameterization to the dynamics of the model. Because such parameterizations are estimates of subgrid processes, they are expected to represent degrees of freedom that are independent of the resolved-scale model variables. To illustrate the value of this approach, it is shown that the temporal accumulation of sensible and latent heat fluxes and net solar and thermal radiation utilized as predictors add a statistically significant improvement to the 2-m temperature scores.

Corresponding author address: P. Termonia, Royal Meteorological Institute, Ringlaan 3, B-1180 Brussels, Belgium. Email: piet.termonia@oma.be


1. Introduction

Model output statistics (MOS) (Glahn and Lowry 1972) provides a practical tool to improve the skill scores of raw NWP model output. Such scores increasingly play a decisive role in determining the economic value of weather forecasts (Katz and Murphy 1997) in social and commercial applications. MOS provides a simple yet powerful tool to increase their competitiveness.

Since its introduction in operational weather forecasting, the skill of combined systems of models and MOS has steadily improved, up to the point where they can compete with human forecasters (Vislocky and Fritsch 1997), especially at longer forecast ranges (Baars and Mass 2005). MOS has also been shown to be beneficial at mesoscales (Hart et al. 2004).

MOS is relatively simple and easy to apply, but it does have some practical inconveniences (see Wilks 1995). The most conspicuous one is that the training of the statistical model has to be performed separately for different models, and hence for subsequent versions of an operational NWP model. This implies that relatively long training sets and sufficient computing power are needed every time a new version of a model becomes available. This can be partially overcome by the introduction of updateable MOS, as described by Wilson and Vallée (2002).

Some progress in MOS systems is still being made. Baars and Mass (2005) considered weighted MOS systems where the weights can be adapted to favor the systems that perform best. Efforts are also being made to improve perfect-prog systems (Marzban et al. 2006).

Another possibility is to add more predictors to the equation. Sokol (2003) for instance did a very extensive study introducing a basic set of candidate predictors containing 600 model variables and proposing a procedure for selecting a relatively small subset to be used operationally. Such methods are very time consuming and rather tedious to repeat for every new model version.

In fact, one might go one step further and introduce nonlinear relations of the model variables as predictors, for instance, the fourth power of relative humidity as in Sokol (2003). This would render the number of possible candidates impractically large, reducing the choice of appropriate predictors to a cumbersome data mining exercise with varying results depending on the situation in which the model is used. Another way of introducing nonlinear relations is to use neural networks (Yuval and Hsieh 2003; Marzban 2003).

In principle, the predictor selection procedure should be part of the training of the MOS with each model upgrade. Ideally it would be convenient to have predictors that, upon being added to the MOS, either give an improvement or a neutral effect (in the sense that the difference before and after adding them is statistically not significant). Once such a predictor is identified it would not be necessary to subject it to the selection procedure.

Most MOS systems use resolved-scale model variables as predictors. However, physics parameterizations give estimates of subgrid processes and can thus be viewed as projections of subgrid physical degrees of freedom onto the resolved scales. They may thus be used to provide some independent subgrid predictors, supposing that physically independent variables are also largely uncorrelated. As such, they are good candidates for avoiding overfitting.

Rather than arbitrarily trying new predictors, it is proposed that the choice of predictors might be inspired by the equations of the NWP model. This paper makes a first attempt at doing so and provides solid evidence that taking inspiration from the NWP model internals leads to a statistically significant improvement of MOS performance.

From a modeling perspective, MOS treats the model output in a rather blind manner; that is, it does not take into account the internals of the NWP model. It is our experience that in contrast to end users, modelers often consider this as a serious drawback of the method, because it does not provide any feedback as to how to improve systematic model errors. By creating a conceptual link with the model equations, the procedure may become more attractive for them.

This article does not report on an operational system but tests the introduced ideas in a context that mimics certain aspects of an operational setup. Model-derived predictors are combined with a limited set of conventional predictors. An exhaustive study of the competitiveness of this approach with respect to eliminative procedures as in Sokol (2003) lies beyond the scope of this paper.

This paper is organized as follows. Section 2 introduces a notion of errors (which will be called constant fractional errors) that are convenient for making the link between the model equations and linear regression models. This leads to the identification of model-inspired predictors. Section 3 introduces the utilized model data of three versions of the Aire Limitée Adaptation Dynamique Développement International (ALADIN) model (ALADIN International Team 1997) and the observational data, discusses the used methodology, and provides some experimental evidence for the existence of these errors in this model. Section 4 discusses the application of model-inspired predictors in the three versions of the model. Section 5 provides the conclusions.

2. Constant fractional errors and the choice of the predictors

Prognostic equations used in the model (either in the dynamical part or in the physics parameterization) have the general form
\frac{\partial \psi}{\partial t} = \sum_a F_a(\Psi_i), \quad (1)
where ψ represents a generic model variable and Fa represents different dynamical and forcing terms depending on model variables Ψi. These forcings have the physical dimensions of a tendency and often represent a coupling between different parts of the model, for example, a coupling between a physics-parameterization module and the dynamics or a coupling between the surface scheme and the atmospheric module. As will be shown below, some examples exist where such parameterizations can be externalized from the model while the communication with the model goes through a well-defined interface. As such, systematic errors made in a coupled module may “leak” through the interface into ψ via this forcing term: each time step a small error will get accumulated into ψ.
To make a practical estimate of this error leakage, Eq. (1) should be simplified on the basis of some assumptions to retain a form depending on one or more parameters that can be fitted by a linear regression. Making abstraction of the details of the numerical time-stepping scheme, a rough estimate of the contribution of the ath forcing term to the time step integration is given by F_a[Ψ_i(t)]Δt, with Δt the time step. The model error ε_a can then be expressed as a fraction η_a of this contribution, ε_a(t) = η_a(t)F_aΔt. It is assumed that η_a is small, but that the error accumulates over a time interval T that is to be determined. A last simplifying assumption is that in this time interval η_a stays constant, so that the accumulated error contribution can be estimated as
\varepsilon_a(t) \simeq \eta_a\,\overline{F_a}(t), \quad (2)
with
\overline{\phi}(t) = \sum_{n=0}^{N-1} \phi(t - n\,\Delta t)\,\Delta t, \quad (3)
the back-in-time accumulation of the variable ϕ over the time period T = NΔt. This type of error where
\eta_a(t) \simeq \mathrm{const} \quad (4)
in a sufficiently long interval shall be referred to as constant fractional errors (CFEs). In view of the nonlinearity of the atmospheric equations, this simplification [(4)] may seem at first sight severely restrictive. However, in section 4a it will be shown that it is valid to a sufficiently large extent for the particular fluxes studied in the present paper. Now the total accumulated error becomes the sum of the accumulated errors of the different forcings:
\varepsilon(t) = \sum_a \eta_a\,\overline{F_a}(t), \quad (5)
Please note that even if the η_a remain constant for a period much longer than the time interval T, the accumulated error ε does not necessarily have to do so, because the accumulated fluxes may change in time; it may even decrease.
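The back-in-time accumulation in (3) is simply a running sum of a flux over the last N time steps. A minimal sketch (Python/NumPy; the flux series and step sizes are hypothetical illustrations, not model output):

```python
import numpy as np

def accumulate(phi, n_steps, dt):
    """Back-in-time accumulation of a flux series phi, as in (3):
    bar_phi[t] = sum_{n=0}^{N-1} phi[t - n] * dt, over T = N * dt."""
    acc = np.full(len(phi), np.nan)      # undefined before N steps are available
    for t in range(n_steps - 1, len(phi)):
        acc[t] = phi[t - n_steps + 1 : t + 1].sum() * dt
    return acc

# Example: constant flux of 100 W m^-2, 3-h steps, 12-h accumulation (N = 4)
phi = np.full(8, 100.0)                           # W m^-2
bar_phi = accumulate(phi, n_steps=4, dt=3 * 3600.0)
print(bar_phi[-1])                                # 100 * 4 * 10800 = 4.32e6 J m^-2
```

For a constant flux the accumulation grows linearly with the period T, which is why a constant fractional error on the flux produces an error contribution proportional to the accumulated flux.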
MOS corrects the output of an NWP model by means of a statistical model. Usually a linear regression model is taken of the form
\psi^{\mathrm{corr}} = \alpha + \beta\,\psi + \sum_{I=1}^{q-2} \gamma_I\,\upsilon_I, \quad (6)
where ψ is the model variable to be corrected (predictand), and υI, I = 1, . . . , q − 2 are additional predictors. The variables υI represent only a tiny fraction of the total number of degrees of freedom of the model state vector. The parameter q represents the number of degrees of freedom of the regression, that is, the number of parameters of the model. The right-hand side of (2) can now formally be added to the right-hand side of (6):
\psi^{\mathrm{corr}} = \alpha + \beta\,\psi + \sum_{I=1}^{q-2} \gamma_I\,\upsilon_I + \sum_a \eta_a\,\overline{F_a}, \quad (7)
where the accumulated forcings Fa become predictors. Of course it is understood that the added fluxes depend on variables other than υI. Using such predictors that represent the coupling of an external parameterization may give useful feedback to the modeler as to where to look for improving the model.1
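In practice, appending the accumulated forcings to (6) to obtain (7) only widens the design matrix of the least-squares fit. The following sketch illustrates this on synthetic data with a single hypothetical flux predictor; all coefficient values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
T_model = 280 + 8 * rng.standard_normal(n)   # raw 2-m temperature forecast (K)
bar_H = rng.uniform(0.0, 5e6, n)             # accumulated sensible heat flux (J m^-2)

# Synthetic "truth": a bias, a slope error, and a constant fractional
# error proportional to the accumulated flux, plus observation noise
T_obs = 1.5 + 0.98 * T_model + 2.0e-7 * bar_H + 0.3 * rng.standard_normal(n)

# Design matrix for (7): intercept, raw forecast, accumulated-flux predictor
X = np.column_stack([np.ones(n), T_model, bar_H])
coef, *_ = np.linalg.lstsq(X, T_obs, rcond=None)
T_corr = X @ coef

rmse_raw = np.sqrt(np.mean((T_obs - T_model) ** 2))
rmse_mos = np.sqrt(np.mean((T_obs - T_corr) ** 2))
print(rmse_mos < rmse_raw)                   # flux-augmented MOS reduces the error
```

The fitted coefficient on the flux column plays the role of the η_a in (7): it estimates the constant fraction of the accumulated forcing that leaks into the error.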

Common choices for the other predictors υI are the standard meteorological variables such as temperature, wind, air pressure, and specific humidity. These variables typically represent averages over the area surrounding the grid points of the model. However, the parameterizations of the subgrid processes can be seen as projections of extra physical degrees of freedom in the model and their systematic errors may thus add new possibilities for statistically improving the model output.

The present approach will thus be based on two assumptions: (a) the existence of constant fractional errors in (4), and (b) the assumption that these can be corrected by a linear regression [(7)]. Apart from testing the usefulness of including such accumulated forcing terms as predictors, some tests of these two assumptions will be presented below. Note that because of the nonlinearity of NWP model equations, CFEs will “propagate” in a nonlinear manner through the variables. So it is to be expected that after a sufficiently long time the effect of CFEs can only be approximately described. Second, the nonlinearity implies that linear regression will not be able to correct the full impact of CFEs. But compromises of this type have to be made for any MOS system and have to be accepted.

To test the above-introduced ideas, the prognostic equations of the dynamical part of the model are an obvious choice. However, the fact that their forcing terms involve horizontal gradients that are not directly obtained as model output makes it less easy to use them as a first test. This paper will therefore be restricted to 2-m temperature. This is not a prognostic model variable, but is in general determined by the temperature at the lowest model level TL and the surface temperature Ts.

The prognostic equation for the surface temperature has the general form (Kalnay 2003; Giard and Bazile 2000)
\frac{\partial T_s}{\partial t} = C_t\,(R_{\mathrm{lw}} + R_{\mathrm{sw}} - H - L\cdot E) - \frac{2\pi}{\tau}\,(T_s - T_p), \quad (8)
where C_t is the thermal inertia coefficient, R_lw is the net longwave radiation, R_sw is the net shortwave radiation, H is the sensible heat flux, L·E is the latent heat flux, τ = 1 day, and T_p is the mean soil temperature. The dot represents the sum of different types of evaporation and transpiration; see Giard and Bazile (2000) for details in the ALADIN model, which will be introduced in the next section. The sensible and latent heat fluxes computed with bulk aerodynamic formulas have the form (Kalnay 2003)
H = \rho\,C_H\,|\mathbf{V}|\,(S_s - S_L), \qquad L\cdot E = L\,\rho\,C_H\,|\mathbf{V}|\,(a\,q_s - b\,q_L), \quad (9)
with the dry static energy given by S = c_pT + Φ, where ρ is the air density, C_H is the turbulent exchange coefficient of heat, V is the wind, and the subscripts L and s indicate, respectively, the lowest model level and the surface. The coefficients a and b are taken according to the type of evapotranspiration. It can be seen from (9) that both fluxes are actually nonlinear expressions as functions of the resolved-scale meteorological variables ρ, V, T, q, and Φ.
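As an illustration of the bulk form in (9), here is a minimal sketch of the sensible heat part, assuming the dry-static-energy definition above and a sign convention in which the flux is positive when the surface is warmer than the lowest model level (all numerical values are hypothetical):

```python
CP = 1004.0  # specific heat of dry air at constant pressure (J kg^-1 K^-1)

def sensible_heat_flux(rho, c_h, wind, t_s, phi_s, t_l, phi_l):
    """Bulk-aerodynamic sensible heat flux H = rho * C_H * |V| * (S_s - S_L),
    with dry static energy S = cp * T + phi."""
    s_surface = CP * t_s + phi_s
    s_level = CP * t_l + phi_l
    return rho * c_h * wind * (s_surface - s_level)

# Example: surface 2 K warmer than the lowest model level, equal geopotential
H = sensible_heat_flux(rho=1.2, c_h=2e-3, wind=5.0,
                       t_s=290.0, phi_s=0.0, t_l=288.0, phi_l=0.0)
print(H)   # 1.2 * 2e-3 * 5 * 1004 * 2 ~ 24.1 W m^-2
```

Note how the flux is a product of several resolved-scale variables, which is what makes the error propagation through (8) nonlinear in those variables.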

Because T2m is given by an interpolation between TL and Ts, it is reasonable to assume that modeling errors coming from the parameterization of the turbulent fluxes will contribute to the errors in T2m at each time step. The accumulated values of R_lw, R_sw, H, and L·E will thus be considered as candidate nonlinear predictors as proposed in (7).

Best et al. (2004) proposed a general way to externalize a surface scheme from the atmospheric module by passing the fluxes of latent heat and sensible heat to the atmosphere through an interface. Any systematic error made in the surface scheme will thus be transferred through these fluxes to the lowest model level.

Of course a practical condition for including such forcing terms as predictors is that they be easily obtainable in an operational context. It will be argued later that the operational model investigated here provides the necessary tool to extract the above-mentioned fluxes.

3. Data and methodology

In this section the validity of the CFE assumption is tested in the context of a limited-area NWP model. First the model and datasets are introduced (section 3a) and the methodology is presented (section 3b).

a. Model and observational datasets

The model used to test the present MOS approach is the ALADIN model. ALADIN is a hydrostatic limited-area model developed within the international ALADIN collaboration (ALADIN International Team 1997). The version of this model is the one used at the Royal Meteorological Institute of Belgium (RMI), referred to as ALADIN-Belgium. It is run operationally at a resolution of 7 km, 4 times a day (at 6-h intervals), based on analyses coming from the Action de Recherche Petite Echelle Grande Echelle (ARPEGE) global model of Météo-France, which also provides the 3-h lateral boundary coupling data.

Models and data were chosen to test two aspects: (a) the progression to a new operational model version, and (b) the value of the MOS system with respect to the Belgian synoptic measurement net.

The studied dataset was created by running three versions of the model over a 2-yr period. The choice of these different versions was such as to mimic the problems to be encountered in an operational context:

  • Version A corresponds to the model version that was operational at the RMI at the start of this study. The model data were the forecasts based on the 0000 UTC analyses from 1 October 2003 to 30 September 2005, covering 2 yr.

  • Version B1 is planned as the successor of the current operational model. This model contains a switch to have a better representation of low clouds (Brožková et al. 2006). In version B1 this switch is off. This version of the model was run for all 0000 UTC analyses from 1 January 2004 to 31 December 2005.

  • Version B2 is actually the same as B1 but now run with the low-cloud switch on. This model was run for exactly the same period as model version B1.

These three models make manifest two difficulties when changing an operational model to a new version: (a) a change to an upgraded version (the change from A to B1), and (b) improving a specific part of the parameterization (change from B1 to B2).

The model runs were performed on a smaller domain than the operational domain of ALADIN-Belgium to be able to comfortably rerun several years of forecasts with different model versions. It was checked whether those runs on the smaller domain differ from the runs on the large operational domain, and no significant difference was found. All model runs in this study were performed based on analyses at 0000 UTC.

The commonly used variables 2-m temperature (T; the subscript 2m is dropped hereafter), 10-m wind speed (|V|), mean sea level pressure (P_msl), specific humidity (q), relative humidity (r), total cloud cover, and the geopotential of the lowest model level were stored at 3-h time intervals. Temperature, relative humidity, and wind speed at 925 and 850 hPa were stored as well. Additionally, the 3-hourly accumulated energy fluxes of sensible heat (H), latent heat (L·E), net solar radiation (R_sw), and net thermal radiation (R_lw) can be easily obtained as part of the operational output software and were stored as well. So any accumulation over a multiple of 3 h could be obtained.

For the validation of the ideas, a single-station approach was followed as in Taylor and Leslie (2005). Termonia (2001) showed that four synoptic stations are sufficient to capture more than 90% of the variance in the 2-m temperatures of the Belgian synoptic measurement net. So a small database of the above-described ALADIN data was created containing the above model variables at the synoptic station of the RMI [Ukkel, World Meteorological Organization (WMO) number 6447] and four other stations:2 the coastal station Koksijde (WMO number 6400) and the inland stations Deurne (WMO number 6450), Saint-Hubert (WMO number 6476), and Kleine-Brogel (WMO number 6479). These stations also provide data at 3-h time intervals. For Saint-Hubert (6476) we only have observational data during daytime.

b. Methodology

The parameters α, β, and γI in (6) and (7) are estimated by minimizing the residual error variance (EV):
\mathrm{EV} = \frac{1}{N-q} \sum_{\mu=1}^{N} \varepsilon_\mu^2, \quad (10)
with the errors defined by the difference
\varepsilon_\mu = o_\mu - T^{\mathrm{corr}}_\mu(T, \upsilon_I), \quad (11)
between observed values oμ and the MOS-corrected forecast Tcorrμ(T, υI). Note the presence of q in the definition of EV to account for the degrees of freedom of the regression model and the definition of the standard root-mean-square error
\mathrm{rmse} = \sqrt{\frac{1}{N} \sum_{\mu=1}^{N} \varepsilon_\mu^2}, \quad (12)
being related to the EV, except that here the sample size N is used instead of N − q. EV is used to measure the quality of the linear fit in the training of our model, whereas the rmse measures the quality of a model, be it the original model or the MOS system.
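The distinction between EV in (10) and the rmse in (12) is only the divisor, N − q versus N. A minimal sketch with hypothetical residuals:

```python
import numpy as np

def error_variance(residuals, q):
    """Residual error variance as in (10): divides by N - q to account for
    the q fitted parameters of the regression."""
    n = len(residuals)
    return np.sum(residuals ** 2) / (n - q)

def rmse(residuals):
    """Root-mean-square error as in (12): divides by the sample size N."""
    return np.sqrt(np.mean(residuals ** 2))

eps = np.array([0.5, -0.3, 0.8, -0.6, 0.1, -0.4])   # hypothetical residuals (K)
ev = error_variance(eps, q=2)
print(ev)            # 1.51 / 4 = 0.3775
print(rmse(eps))     # sqrt(1.51 / 6)
```

Because N − q < N, EV is always at least as large as the squared rmse of the same residuals; the gap shrinks as the sample grows relative to the number of fitted parameters.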
The significance of the differences between the different MOS systems presented below will be quantified by confidence intervals computed with bootstrap techniques. It has been shown by Zwiers (1987, 1990) that using these methods on autocorrelated data may lead to erroneous conclusions. The errors of the 2-m temperature generally showed sinusoidal yearly cycles with amplitudes of more than 2 K, and it was observed that they created yearly cycles in the autocorrelation with amplitudes ranging from 0.2 to about 0.4 at a one-day time lag, depending on the chosen model and the forecast range. Therefore, and in order to reduce the impact of the annual cycle in the 2-m temperature errors, a mode of the form
A\,\cos\left(\frac{2\pi d}{\tau}\right) + B\,\sin\left(\frac{2\pi d}{\tau}\right)
was included in all linear regression models where confidence intervals were computed, where d counts the day of the year, τ = 365.25 days, and A and B are the result of the fit.
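Including this mode amounts to adding a cosine and a sine column to the design matrix, with A and B estimated alongside the other regression coefficients. A minimal sketch (Python/NumPy):

```python
import numpy as np

TAU = 365.25  # period of the annual cycle, in days

def annual_cycle_predictors(day_of_year):
    """Harmonic pair for the annual-cycle mode A*cos(2*pi*d/tau) + B*sin(2*pi*d/tau).
    Returns the two predictor columns; A and B come out of the regression fit."""
    phase = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / TAU
    return np.cos(phase), np.sin(phase)

# Fitting A and B reduces to appending two columns to the design matrix:
d = np.arange(1, 366)                       # day of the year
cos_d, sin_d = annual_cycle_predictors(d)
X_cycle = np.column_stack([cos_d, sin_d])
print(X_cycle.shape)                        # (365, 2)
```

This keeps the regression linear in its parameters even though the mode itself is a periodic function of the day of the year.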

4. Evaluation

In this section the CFE assumption is tested from a pragmatic point of view by artificially introducing errors in the parameterization of the NWP model in section 4a. The performance of different MOS approaches will be discussed in section 4b.

a. Testing the CFE assumption by artificially induced errors

As can be seen from (9), two of the forcing terms in (8) are proportional to the exchange coefficient CH. So if the assumption of the presence of the CFEs in (4) is correct, an artificial rescaling of CH inside the NWP model should be mirrored in the error of Ts, and thus also in T. This allows for an experimental test of the CFE assumption. Also, if the second assumption holds, then the extra induced error should to a large extent be corrected by removing a linear fit from the model output data.

Version B1 of the ALADIN model was modified to rescale CH immediately after its computation in the specific physics parameterization routine. This model was then run with three different scalings of CH:
C_H^{(1)} = 2\,C_H, \qquad C_H^{(2)} = C_H, \qquad C_H^{(3)} = \tfrac{1}{2}\,C_H, \quad (13)
where the superscript in brackets refers to the three versions of the model; C_H^{(2)} corresponds to the original model B1.
To test for the presence of a constant fractional error and a linear growth of the accumulated error, the three versions of the model were run from 1 January 2005 until 31 December 2005 and the errors [(11)] for the station in Ukkel were fitted with the model
\varepsilon_\mu = \alpha + \beta\,T_\mu + \gamma_1\,\overline{H}_\mu + \gamma_2\,\overline{L\cdot E}_\mu, \quad (14)
where H and L·E are accumulated [as in (3)] over a time period T = 12 h.

Figure 1 shows the test of the CFE assumption: rmse of the T model output, and the residual error variance EV[TH] of the fit in (14), comparing the three versions of the model. Note that taking half of the exchange coefficient when going from version 1 to version 3 produces some very large changes of the rmses, thus allowing for a clear check.

From the figure it can also be seen that diminishing the value of the exchange coefficient increases the rmse, version 1 giving the best score, and degrading subsequently when going to version 2 and to version 3. However, the large differences in the rmses of the original temperatures have almost completely disappeared from the EVs of the fit according to Eq. (14). So, even though the model behaves in a nonlinear way, the artificial increase in the CFEs is almost entirely corrected by a fit that assumes a linear relationship.

This behavior of the scores should not be interpreted as meaning that version 1 is necessarily the best-tuned model for all variables and locations. This experiment was only intended to study the behavior of the errors and the MOS corrections. Further study of such results is outside the scope of this paper.

Note that the rmse and the linear regression are most sensitive to CFEs during daytime, in particular at 1200 and 1500 UTC. So using accumulated sensible and latent heat as predictors in a MOS is expected to contribute primarily during the day, and most specifically to the maximum 2-m temperature.

b. Model-inspired MOS for three versions of the NWP model

The data of the three versions A, B1, and B2 of the operational model introduced in the previous section have been subject to corrections with different MOS models that include the proposed accumulated fluxes. These models have been trained with the first year of data and validated with the second year of data of the three available 2-yr datasets (A, B1, and B2).

To prove that the proposed approach improves the MOS to a statistically significant extent, hypothesis tests are used. Because the difference of the original rmse and the rmse of the MOS correction is not a standard statistic of a distribution, it is necessary to rely on bootstrap methods.

Confidence intervals were calculated by resampling the N-fold samples 1000 times as described in Wilks (1995) and taking the 2.5% and 97.5% percentiles of rmse_corr − rmse as lower and upper values, giving a 95% confidence interval for the difference. For instance, this means that a null hypothesis stating that the difference of two rmses is negative is accepted at the 97.5% confidence level.
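The resampling procedure can be sketched as follows (Python/NumPy; the error series are synthetic stand-ins, and this sketch uses plain i.i.d. resampling of paired errors, ignoring the autocorrelation treatment discussed in section 3b):

```python
import numpy as np

def bootstrap_rmse_diff_ci(err_corr, err_raw, n_boot=1000, seed=0):
    """95% bootstrap confidence interval for rmse(err_corr) - rmse(err_raw):
    resample the paired error series with replacement and take the
    2.5% and 97.5% percentiles of the resampled differences."""
    rng = np.random.default_rng(seed)
    n = len(err_corr)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                  # resample with replacement
        rmse_c = np.sqrt(np.mean(err_corr[idx] ** 2))
        rmse_r = np.sqrt(np.mean(err_raw[idx] ** 2))
        diffs[b] = rmse_c - rmse_r
    return np.percentile(diffs, [2.5, 97.5])

rng = np.random.default_rng(1)
err_raw = 1.0 * rng.standard_normal(500)    # hypothetical raw-model errors (K)
err_corr = 0.7 * rng.standard_normal(500)   # hypothetical MOS-corrected errors (K)
lo, hi = bootstrap_rmse_diff_ci(err_corr, err_raw)
```

When the whole interval [lo, hi] lies below zero, the correction is a significant improvement in this sense; an interval straddling zero corresponds to the "neutral effect" discussed in the introduction.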

In order not to overload the presentation of this article, it will be restricted to illustrating the work with four linear models built with a subset of the variables introduced in section 3a, that is, only the synoptic observables T, |V|, P_msl, q, and r, and the accumulated fluxes of H, L·E, R_sw, and R_lw:
T^{\mathrm{corr}}_{TT} = \alpha + \beta\,T, \quad (15)
T^{\mathrm{corr}}_{TV} = \alpha + \beta\,T + \gamma_1\,|V| + \gamma_2\,P_{\mathrm{msl}} + \gamma_3\,q + \gamma_4\,r, \quad (16)
T^{\mathrm{corr}}_{TF} = \alpha + \beta\,T + \delta_1\,\overline{H} + \delta_2\,\overline{L\cdot E} + \delta_3\,\overline{R_{\mathrm{sw}}} + \delta_4\,\overline{R_{\mathrm{lw}}}, \quad (17)
T^{\mathrm{corr}}_{TVF} = \alpha + \beta\,T + \gamma_1\,|V| + \gamma_2\,P_{\mathrm{msl}} + \gamma_3\,q + \gamma_4\,r + \delta_1\,\overline{H} + \delta_2\,\overline{L\cdot E} + \delta_3\,\overline{R_{\mathrm{sw}}} + \delta_4\,\overline{R_{\mathrm{lw}}}. \quad (18)
So regression model TT has only the 2-m temperature forecast T as a predictor; TV adds to this a set of resolved-scale model output variables, while TF adds only the model inspired fluxes; TVF, finally, combines both sets of predictors. MOS systems that include model-inspired predictors will be denoted MIMOS.

The corresponding MOS systems were trained separately for every 3-h forecast range. Tests were performed with accumulation periods T of 3, 6, 9, and 12 h; the latter gave the most satisfactory results, so T was taken to be 12 h. The accumulated solar radiation flux (R_sw) was not taken into account from 0000 to 1200 UTC.
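Fitting the four nested models reduces to least-squares fits with growing predictor sets. A minimal sketch on synthetic data (all series are hypothetical stand-ins for the model output and observations):

```python
import numpy as np

def fit_rmse(X, y):
    """Least-squares fit of y on X (with intercept); returns the in-sample rmse."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sqrt(np.mean((y - A @ coef) ** 2))

rng = np.random.default_rng(2)
n = 600
T = 280 + 6 * rng.standard_normal(n)      # raw 2-m temperature forecast
V = np.abs(rng.standard_normal(n))        # stand-in for |V|
flux = rng.uniform(0, 5e6, n)             # stand-in for an accumulated flux
obs = 1.0 + 0.97 * T + 0.2 * V + 3e-7 * flux + 0.4 * rng.standard_normal(n)

rmse_TT = fit_rmse(T[:, None], obs)                       # (15): T only
rmse_TV = fit_rmse(np.column_stack([T, V]), obs)          # (16): + resolved-scale vars
rmse_TF = fit_rmse(np.column_stack([T, flux]), obs)       # (17): + accumulated fluxes
rmse_TVF = fit_rmse(np.column_stack([T, V, flux]), obs)   # (18): both sets
print(rmse_TVF <= rmse_TV <= rmse_TT)     # nested fits: in-sample rmse cannot grow
```

For nested predictor sets the in-sample rmse can only shrink as columns are added; whether the gain survives on independent data is exactly what the train/validation split above is designed to test.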

The performance of these different models is illustrated in Figs. 2 –6. Figures 2 –4 show the results for model version A, B1, and B2 at Ukkel (6447). The top graphs show the rmse of the different MOS forecasts. The other graphs show more clearly the improvements obtained from the model-inspired fluxes. The confidence intervals for (rmse[TVF] − rmse[TV]) and (rmse[TF] − rmse[TT]) were calculated with the bootstrap method explained above.

This illustrates the progression from model A to model B. A MOS is first introduced for model A. As can be seen from Fig. 2a, the biggest gain of a MOS system is already obtained by using only T, in model TT. The extra gain is of the order of 0.1 K. This is relatively small but, in applications, can represent a crucial difference in competitiveness and/or economic value with respect to other forecast systems.

It can also be seen from Fig. 2a that adding the accumulated fluxes to TV to obtain TVF yields an extra improvement to the MOS, except in the latter half of the forecast. As can be seen from Fig. 2c, in that case the differences are not significant. The same holds when adding the accumulated fluxes to TT to obtain TF, as shown in Fig. 2b. Based on this, it could be decided to use the accumulated fluxes for all forecast ranges. Making the switch to model B1 in Fig. 3, it appears that in this model, too, adding the fluxes gives only improvements.

Figure 5 compares the 2-m temperature rmse (upper curves) and rmse[TVF] (lower curves) of model B2 to those of model B1. By switching on the low-cloud correction, improvements of about 0.2 K (e.g., the rmse of the B2 runs was 0.19 K lower than the rmse of the B1 runs at the 27- and 39-h forecast ranges) are found. However, the difference in rmse of the MIMOS with TVF between model B1 and model B2 is an order of magnitude smaller, that is, about 0.02 K for all forecast ranges (with a maximum difference of 0.041 K at the 21-h forecast range). This is positive news for the modeler of the low-cloud modification (Brožková et al. 2006): the better modeling of low clouds has a positive impact on the incoming solar radiation, which influences the surface and reduces the error in Fig. 4a with respect to Fig. 3a. This also shows that the MIMOS succeeds in correcting the larger errors of the B1 model with respect to the B2 model. However, ideally, progress in modeling should gradually make statistical corrections unnecessary.

Further tests have been done with linear models containing more variables, such as cloud cover, lowest-level geopotential, and temperature, wind speed, and relative humidity at 850 and 925 hPa. In all of these cases the addition of the accumulated forcing terms still improved the rmse.

Figure 6 shows the difference rmse[TVF] − rmse[TV] at different stations for model B2. At the other stations, too, it is observed that adding the accumulated fluxes gives either improvements or a neutral impact. It may be observed that the effect is minimal for the coastal region (station 6400), while it is more pronounced inland.

This article is restricted to linear models. In fact, careful inspection of scatterplots did not reveal or suggest any nonlinear relationships between predictors and model errors. We have also constructed similar nonlinear systems (neural networks with one hidden layer) using the same predictors, and the results were very similar to those of the linear models.

5. Conclusions

The present approach originated from a twofold objective: select predictors that (a) are inspired by the model equations, and (b) represent the physical degrees of freedom of the model that are present in the subgrid parameterization.

As a first example of such an approach, this paper studied temporally accumulated heat and radiation fluxes, appearing in the prognostic equation for the surface temperature, as predictors for MOS correction of 2-m temperature forecasts.

First, by artificially rescaling the fluxes within the model, it was shown that their influence on the model error is linear in a first approximation. The notion of CFE was introduced for such errors that behave as a constant linear fraction of an accumulated model flux.

It has also been shown that the inclusion of these model-inspired fluxes as predictors gives statistically significant improvements in the performance of a MOS system. Even though these improvements may seem small, they could make the difference in competition with other models, for instance, in commercial applications. The improvement was preserved when switching from one version of the operational model to the next.

It was also shown how the use of model-inspired predictors may provide feedback to the modeler. Indeed, a reduction of the model errors was shown to correspond almost completely to a correction that the MOS had applied in the preceding model version. As such, the MIMOS provides an additional a posteriori validation of developments in the model, putting MOS in a broader context than the postprocessing of model data.

It has also been shown that it is safe to include the accumulated fluxes computed in the physics parameterization, which represent independent physical degrees of freedom, as predictors alongside the resolved-scale variables: their contribution provides a significant improvement for most of the forecast ranges and did not lead to significant degradations of the rmse for the others.

Acknowledgments

The useful comments of C. Nicolis on this work were much appreciated. The manuscript was considerably improved by comments from H. R. Glahn and two anonymous reviewers. This work was financially supported by the Belgian Science Policy.

REFERENCES

  • ALADIN International Team, 1997: The ALADIN project: Mesoscale modelling seen as a basic tool for weather forecasting and atmospheric research. WMO Bull., 46, 317–324.
  • Baars, J. A., and C. F. Mass, 2005: Performance of National Weather Service forecasts compared to operational, consensus, and weighted model output statistics. Wea. Forecasting, 20, 1034–1047.
  • Best, M. J., A. Beljaars, J. Polcher, and P. Viterbo, 2004: A proposed structure for coupling tiled surfaces with the planetary boundary layer. J. Hydrometeor., 5, 1271–1278.
  • Brožková, R., M. Derková, M. Bellus, and A. Farda, 2006: Atmospheric forcing by ALADIN/MFSTEP and MFSTEP oriented tunings. Ocean Sci. Discuss., 3, 1–24.
  • Giard, D., and E. Bazile, 2000: Implementation of a new assimilation scheme for soil and surface variables in a global NWP model. Mon. Wea. Rev., 128, 997–1015.
  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211.
  • Hart, K. A., W. J. Steenburgh, D. J. Onton, and A. J. Siffert, 2004: An evaluation of mesoscale-model-based output statistics (MOS) during the 2002 Olympic and Paralympic winter games. Wea. Forecasting, 19, 200–218.
  • Kalnay, E., 2003: Atmospheric Modelling, Data Assimilation and Predictability. Cambridge University Press, 341 pp.
  • Katz, R. W., and A. H. Murphy, 1997: Economic Value of Weather and Climate Forecasts. Cambridge University Press, 222 pp.
  • Marzban, C., 2003: Neural networks for postprocessing model output: ARPS. Mon. Wea. Rev., 131, 1103–1111.
  • Marzban, C., S. Sandgathe, and E. Kalnay, 2006: MOS, perfect prog, and re-analysis. Mon. Wea. Rev., 134, 657–663.
  • Sokol, Z., 2003: MOS-based precipitation forecasts for river basins. Wea. Forecasting, 18, 769–781.
  • Taylor, A. A., and L. M. Leslie, 2005: A single-station approach to model output statistics temperature forecast error assessment. Wea. Forecasting, 20, 1006–1020.
  • Termonia, P., 2001: On the removal of random variables in data sets of meteorological observations. Meteor. Atmos. Phys., 78, 143–156.
  • Vislocky, R. L., and J. M. Fritsch, 1997: Performance of an advanced MOS system in the 1996–97 national collegiate weather forecasting contest. Bull. Amer. Meteor. Soc., 78, 2851–2857.
  • Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.
  • Wilson, L. J., and M. Vallée, 2002: The Canadian updateable model output statistics (UMOS) system: Design and development tests. Wea. Forecasting, 17, 206–222.
  • Yuval, and W. W. Hsieh, 2003: An adaptive nonlinear scheme for precipitation forecasts using neural networks. Wea. Forecasting, 18, 303–310.
  • Zwiers, F. W., 1987: Statistical considerations for climate experiments. Part II: Multivariate tests. J. Climate Appl. Meteor., 26, 477–487.
  • Zwiers, F. W., 1990: The effect of serial correlation on statistical inferences made by resampling procedures. J. Climate, 3, 1452–1461.

Fig. 1.

Test of the CFE hypothesis: rmse of the T model output (higher values), and error variance of the linear fit EV[TH] (lower values). Values are given for model runs with C(1)H (solid), C(2)H (long dashed), and C(3)H (dot–dashed).

Citation: Monthly Weather Review 135, 10; 10.1175/MWR3469.1

Fig. 2.

Model version A: (a) rmse of T (dashed), TT (dash–dotted), TV (dotted), and TVF (solid); (b) Δrmse = rmse[TVF] − rmse[TV]; (c) Δrmse = rmse[TF] − rmse[TT].


Fig. 3.

Same as in Fig. 2 but for model version B1.


Fig. 4.

Same as in Fig. 2 but for model version B2.


Fig. 5.

The 2-m temperature rmse of the raw model output (upper curves) and rmse[TVF] (lower curves) for model versions B1 (dashed) and B2 (solid).


Fig. 6.

Model version B2: rmse[TVF] − rmse[TV] for stations (a) 6400, (b) 6450, (c) 6476, and (d) 6479.


* In memoriam Dr. Edward De Dycker.

1. For instance, the modeler's objective could be to improve the model such that the fractions ηa are minimized.

2. During the present study, data of the stations with WMO indices 6473 and 6480, as proposed in Termonia (2001), were not available, so they were replaced by similar stations 6476 and 6479.

