• Arribas, A., K. B. Robertson, and K. R. Mylne, 2005: Test of a poor man’s ensemble prediction system for short-range probability forecasting. Mon. Wea. Rev., 133, 1825–1839.
• Bakhshaii, A., and R. Stull, 2009: Deterministic ensemble forecasts using gene-expression programming. Wea. Forecasting, 24, 1431–1451.
• Bourke, W., T. Hart, P. Steinle, R. Seaman, G. Embery, M. Naughton, and L. Rikus, 1995: Evolution of the Bureau of Meteorology’s Global Assimilation and Prediction System. Part 2: Resolution enhancements and case studies. Aust. Meteor. Mag., 44, 19–40.
• Cheng, W. Y. Y., and J. Steenburgh, 2007: Strengths and weaknesses of MOS, running-mean bias removal, and Kalman filter techniques. Wea. Forecasting, 22, 1304–1318.
• Clemen, R. T., and R. L. Winkler, 1985: Limits for the precision and value of information from dependent sources. Oper. Res., 33, 427–442.
• Davies, T., M. J. P. Cullen, A. J. Malcolm, M. H. Mawson, A. Staniforth, A. A. White, and N. Wood, 2005: A new dynamical core for the Met Office’s global and regional modelling of the atmosphere. Quart. J. Roy. Meteor. Soc., 131, 1759–1794.
• Denis, B., J. Côté, and R. Laprise, 2002: Spectral decomposition of two-dimensional atmospheric fields on limited-area domains using the discrete cosine transform (DCT). Mon. Wea. Rev., 130, 1812–1829.
• Durrant, T. H., F. Woodcock, and D. J. M. Greenslade, 2009: Consensus forecasts of modeled wave parameters. Wea. Forecasting, 24, 492–503.
• Ebert, E. E., 2001: Ability of a poor man’s ensemble to predict the probability and distribution of precipitation. Mon. Wea. Rev., 129, 2461–2480.
• Engel, C., and E. Ebert, 2007: Performance of hourly operational consensus forecasts (OCFs) in the Australian region. Wea. Forecasting, 22, 1345–1359.
• Feser, F., and H. von Storch, 2005: A spatial two-dimensional discrete filter for limited-area-model evaluation purposes. Mon. Wea. Rev., 133, 1774–1786.
• Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211.
• Glahn, H. R., and D. P. Ruth, 2003: The new Digital Forecast Database of the National Weather Service. Bull. Amer. Meteor. Soc., 84, 195–201.
• Glowacki, T., X. Yi, and P. Steinle, 2012: Mesoscale Surface Analysis System for the Australian domain: Design issues, development status, and system validation. Wea. Forecasting, 27, 141–157.
• Greybush, S. J., and S. E. Haupt, 2008: The regime dependence of optimally weighted ensemble model consensus forecasts of surface temperature. Wea. Forecasting, 23, 1146–1161.
• Hacker, J. P., and D. L. Rife, 2007: A practical approach to sequential estimation of systematic error on near-surface mesoscale grids. Wea. Forecasting, 22, 1257–1273.
• Hamill, T. M., and J. S. Whitaker, 2006: Probabilistic quantitative precipitation forecasts based on forecast analogs: Theory and application. Mon. Wea. Rev., 134, 3209–3229.
• Johnson, C., and R. Swinbank, 2009: Medium-range multimodel ensemble combination and calibration. Quart. J. Roy. Meteor. Soc., 135, 777–794.
• Kanamitsu, M., 1989: Description of the NMC global data assimilation and forecast system. Wea. Forecasting, 4, 335–342.
• Krishnamurti, T. N., C. M. Kishtawal, T. E. LaRow, D. R. Bachiochi, Z. Zhang, C. E. Williford, S. Gadgil, and S. Surendran, 1999: Improved weather and seasonal climate forecasts from multimodel superensemble. Science, 285, 1548–1550.
• Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141.
• Miller, P. A., and S. G. Benjamin, 1992: A system for the hourly assimilation of surface observations in mountainous and flat terrain. Mon. Wea. Rev., 120, 2342–2359.
• National Meteorological and Oceanographic Centre, 2010: Operational upgrades to the gridded OCF and PME systems. BOM Analysis and Prediction Operations Bull. 85, 27 pp. [Available online at http://www.bom.gov.au/australia/charts/bulletins/apob85.pdf.]
• Persson, A., 2005: Early operational numerical weather prediction outside the USA: An historical introduction. Part III: Endurance and mathematics—British NWP, 1948–1965. Meteor. Appl., 12, 381–413.
• Persson, A., and F. Grazzini, 2007: User guide to ECMWF forecast products. Meteor. Bull. M3.2, ECMWF, Reading, United Kingdom, 71 pp.
• Puri, K., G. S. Dietachmayer, G. A. Mills, N. E. Davidson, R. Bowen, and L. W. Logan, 1998: The new BMRC Limited Area Prediction System, LAPS. Aust. Meteor. Mag., 47, 203–223.
• Puri, K., and Coauthors, 2010: Preliminary results from numerical weather prediction implementation of ACCESS. CAWCR Res. Lett., 5, 15–22.
• Raftery, A., T. Gneiting, F. Balabdaoui, and M. Polakowski, 2005: Using Bayesian model averaging to calibrate forecast ensembles. Mon. Wea. Rev., 133, 1155–1174.
• Ruiz, J., C. Saulo, and E. Kalnay, 2009: Comparison of methods used to generate probabilistic quantitative precipitation forecasts over South America. Wea. Forecasting, 24, 319–336.
• Saito, K., and Coauthors, 2006: The operational JMA nonhydrostatic mesoscale model. Mon. Wea. Rev., 134, 1266–1298.
• Seaman, R., W. Bourke, P. Steinle, T. Hart, G. Embery, M. Naughton, and L. Rikus, 1995: Evolution of the Bureau of Meteorology’s Global Assimilation and Prediction System. Part 1: Analysis and initialisation. Aust. Meteor. Mag., 44, 1–18.
• Seed, A., 2003: A dynamic and spatial scaling approach to advection forecasting. J. Appl. Meteor., 42, 381–388.
• Silva Dias, P. L., D. Soares Moreira, and G. D. Neto, 2006: The MASTER Model Ensemble System (MSMES). Proc. Eighth Int. Conf. on Southern Hemisphere Meteorology and Oceanography, Foz do Iguaçu, Brazil, INPE, 1751–1757.
• Sloughter, J. M., A. E. Raftery, T. Gneiting, and C. Fraley, 2007: Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Mon. Wea. Rev., 135, 3209–3220.
• Stamus, P., F. H. Carr, and D. P. Baumhefner, 1992: Application of a scale-separation verification technique to regional forecast models. Mon. Wea. Rev., 120, 149–163.
• Stensrud, D. J., and J. A. Skindlov, 1996: Gridpoint predictions of high temperature from a mesoscale model. Wea. Forecasting, 11, 103–110.
• Stensrud, D. J., and N. Yussouf, 2005: Bias-corrected short-range ensemble forecasts of near surface variables. Meteor. Appl., 12, 217–230.
• Tapp, R. G., F. Woodcock, and G. A. Mills, 1986: The application of model output statistics to precipitation prediction in Australia. Mon. Wea. Rev., 114, 50–61.
• Thompson, P. D., 1977: How to improve accuracy by combining independent forecasts. Mon. Wea. Rev., 105, 228–229.
• Tustison, B., D. Harris, and E. Foufoula-Georgiou, 2001: Scale issues in verification of precipitation forecasts. J. Geophys. Res., 106, 11 775–11 784.
• Wilson, L. J., and M. Vallée, 2002: The Canadian Updateable Model Output Statistics (UMOS) system: Design and development tests. Wea. Forecasting, 17, 206–222.
• Wilson, L. J., S. Beauregard, A. E. Raftery, and R. Verret, 2007: Calibrated surface temperature forecasts from the Canadian Ensemble Prediction System using Bayesian model averaging. Mon. Wea. Rev., 135, 1364–1385.
• Wonnacott, T. H., and R. J. Wonnacott, 1972: Introductory Statistics. J. Wiley and Sons, 510 pp.
• Woodcock, F., 1984: Australian experimental model output statistics forecasts of daily maximum and minimum temperature. Mon. Wea. Rev., 112, 2112–2121.
• Woodcock, F., and C. Engel, 2005: Operational consensus forecasts. Wea. Forecasting, 20, 101–111.
• Woodcock, F., and B. Southern, 1983: The use of linear regression to improve official temperature forecasts. Aust. Meteor. Mag., 31, 57–62.
Figure captions:

  • Fig. 1. Australian automatic weather station observing network, with an inset of Tasmania to highlight differences in topography over small regions. [Reprinted from Engel and Ebert (2007).]
  • Fig. 2. (a)–(f) The 1.25° mean error (K) with respect to MSAS during the summer period for projection hour 24 for all component models. Negative contours are dashed; zero contour is set in boldface.
  • Fig. 3. As in Fig. 2, but for projection hour 30.
  • Fig. 4. Summer period 1.25° mean error and error standard deviation (K) with respect to MSAS for each component model for each projection hour for (a)–(f) the different models. Statistics are taken over the entire (tropical or midlatitude) sample.
  • Fig. 5. As in Fig. 4, but for the winter period.
  • Fig. 6. Example 1.25° EC FCST-ANAL 15-day bias estimates, with respect to MSAS, for projection hours (a),(c),(e) 30 and (b),(d),(f) 36, for (c),(d) dates 4 and (e),(f) 38 days apart, as compared to (a),(b) zero days. Negative contours are dashed; zero contour is set in boldface.
  • Fig. 7. Summer period 1.25° bias (BES) mean and standard deviation values (K) in the tropics with respect to MSAS for bias-correction window sizes (left to right) of 5, 10, 15, 20, 25, and 30 days: (a) L and (b) UK models. Statistics are taken over the entire sample. Other models show similar sensitivities to bias-correction window sizes (not shown).
  • Fig. 8. As in Fig. 7, but for inverse UMAE.
  • Fig. 9. Box-and-whiskers plots of the 1.25° NWP mean errors (K) after bias correction with respect to MSAS for window sizes of 5, 10, 15, 20, 25, and 30 days over the summer period in the tropics, with each point in the box-and-whiskers plot a single gridbox statistic: (a)–(f) the various models. The boxes show the median and 25th and 75th percentiles, while the whiskers indicate 1.5 × (75th − 25th percentile) below the 25th percentile and above the 75th percentile. Outliers beyond the whisker values are plotted as circles.
  • Fig. 10. As in Fig. 9, but for the increase in error variance relative to the uncorrected error sample (square root) (K).
  • Fig. 11. Box-and-whiskers plots of the 1.25° consensus mean errors (K), with respect to MSAS, for bias correction and weighting window sizes of 5, 10, 15, 20, 25, and 30 days, with each point in the box-and-whiskers plot a single gridbox statistic: winter (a) tropics and (b) midlatitudes. (c),(d) As in (a),(b), but for summer.
  • Fig. 12. As in Fig. 11, but for error standard deviation (K).
  • Fig. 13. The 1.25° RMSE (K), with respect to MSAS, for (left to right at each forecast hour) L and LL (unshaded); EC, the Japan Meteorological Agency Global Spectral Model (JM), and the Met Office model (UK) (plus US in summer) (light gray); and OCF (dark gray) for the (a),(b) winter and (c),(d) summer periods and (a),(c) tropical and (b),(d) midlatitude regions (statistics taken over the entire sample) using a 15-day bias correction and weighting.
  • Fig. 14. The 1.25° RMSE (K) of OCF (after 15-day bias correction and weighting) for (a)–(l) projection hours 6, 12, … , 72 over the summer period. The 1.0 K contour is set in boldface.
  • Fig. 15. Mean error (K) from 1.25° OCF (after 15-day bias correction and weighting) interpolated to 5 km (a) without and (b) with 15-day small-scale correction, over Tasmania (see Fig. 1), for projection hour 30 of the summer period. Negative contours are dashed; the zero contour is set in boldface.
  • Fig. 16. Mean error (K) and error standard deviation (K) of 1.25° OCF (after 15-day bias correction and weighting) interpolated to 5 km and corrected with small-scale corrections of 0 (uncorrected), 5, 10, 15, 20, 25, and 30 days for the winter and summer periods and tropical and midlatitude regions (statistics taken over the entire sample): winter (a) tropics and (b) midlatitudes. (c),(d) As in (a),(b), but for summer.
  • Fig. 17. As in Fig. 14, but for 1.25° OCF (after 15-day bias correction and weighting) interpolated to 5 km and corrected using a 15-day small-scale correction.


Gridded Operational Consensus Forecasts of 2-m Temperature over Australia

  • 1 University of Melbourne, Melbourne, Victoria, Australia
  • 2 Centre for Australian Weather and Climate Research, Melbourne, Victoria, Australia

Abstract

This paper describes an extension of an operational consensus forecasting (OCF) scheme from site forecasts to gridded forecasts. OCF is a multimodel consensus scheme including bias correction and weighting. Bias correction and weighting are done on a scale common to almost all multimodel inputs (1.25°), which are then downscaled using a statistical approach to an approximately 5-km-resolution grid. Local and international numerical weather prediction model inputs are found to have coarse-scale biases that respond to simple bias correction, with the weighted average consensus at 1.25° outperforming all models at that scale. Statistical downscaling is found to remove the systematic representativeness error when downscaling from 1.25° to 5 km, though it cannot resolve scale differences associated with transient small-scale weather.

Corresponding author address: Chermelle Engel, School of Earth Sciences, University of Melbourne, Melbourne VIC 3010, Australia. E-mail: cbengel@unimelb.edu.au


1. Introduction

Public weather forecast practices change with the quality and amount of information and tools available. In Australia, public weather forecasts have evolved over the decades from short-term forecasts (around 24 h ahead), based on locally observed information (Persson 2005; Persson and Grazzini 2007), to forecasts with lead times out to 8 days, which are strongly guided by numerical weather prediction (NWP) model forecasts. As NWP models improve, public weather forecasting practices are likely to change even further.

Australian weather forecasting offices are in the process of changing from site-based to grid-based forecast preparation with the adoption of the Graphical Forecast Editor (GFE) from the United States (Glahn and Ruth 2003). Instead of forecasting mainly for well-populated, well-observed locations, forecasters will prepare public weather forecasts on grid domains that together cover Australia, with site-based forecasts derived from the resultant grid. The target spatial resolution of these domains is around 5 km (2.5 arc minutes). This change in forecasting practice is a factor driving the need for more accurate small-scale NWP guidance.

Australia has some unique modeling challenges. It is a large continent encompassing tropical and midlatitude regions, with a population concentrated in a few mainly coastal areas. The sparsely populated regions of Australia are less well observed, leading to a highly inhomogeneous observational network (Fig. 1). These poorly observed regions present a challenge to NWP data assimilation, especially in areas with complex topography (Fig. 1, inset), thus contributing to NWP model error. Even in parts of the domain that are better observed, NWP models contain forecast error due to imperfect modeling of physical and dynamical processes (Stensrud and Yussouf 2005) and imprecise initial conditions (e.g., Lorenz 1963).

Fig. 1.

Australian automatic weather station observing network, with an inset of Tasmania to highlight differences in topography over small regions. [Reprinted from Engel and Ebert (2007).]

Citation: Weather and Forecasting 27, 2; 10.1175/WAF-D-11-00069.1

In addition to NWP models run locally at the Bureau of Meteorology, Australian forecasters have access to a variety of model output from international forecasting centers. Current Australian public weather forecasts make heavy use of an automated first-guess forecast that combines local and international forecasts into an operational consensus forecast (OCF; Woodcock and Engel 2005; Engel and Ebert 2007) at specific locations. OCF uses a statistical scheme that bias corrects and weights NWP forecasts from multiple centers and combines them into an objective consensus forecast. The OCF scheme outperforms direct model output (DMO) forecasts derived directly from NWP grids (generally via nearest grid point or bilinear interpolation) because DMO forecasts are sensitive to both model errors and representation errors (the model resolution not matching that of the observations), and because representation errors are often systematic and respond well to bias correction. Simple bias correction and weighting schemes similar to those used in OCF are also used in the schemes described by Stensrud and Skindlov (1996), Stensrud and Yussouf (2005), Silva Dias et al. (2006), Ruiz et al. (2009), and Hacker and Rife (2007). More complicated techniques include regression (Glahn and Lowry 1972; Woodcock and Southern 1983; Woodcock 1984; Tapp et al. 1986; Krishnamurti et al. 1999; Wilson and Vallée 2002), gene-expression programming (Bakhshaii and Stull 2009), ensemble Kalman filter methods (Cheng and Steenburgh 2007), and regime matching (Greybush and Haupt 2008). Regression and gene-expression programming may remove more model and representativeness error, but they require years and almost a year, respectively, of stable paired forecast and observation data, and such data are not currently available for most NWP models, which undergo frequent upgrade cycles. Ensemble Kalman filter and regime-matching techniques may be able to work with smaller datasets but are considered beyond the scope of this study. Consensus forecasts that reduce forecast error via regression to the mean (Thompson 1977; Ebert 2001; Johnson and Swinbank 2009) are therefore a viable alternative. This paper describes a grid-based OCF (GOCF) process that performs coarse-scale bias correction and weighted averaging, followed by statistical downscaling to fine scales.

2. Data

Table 1 shows the NWP model forecasts available at the Bureau of Meteorology in 2007, at the start of the study period in this paper. International models were not always available at their native resolution; some model resolutions were reduced before transmission to Australia. Models had varying arrival times, with some (such as EC) unavailable until after the consensus forecast is run. The goal is to combine available coarse-scale information from multiple models into a finescale deterministic (approximately 5 km) consensus forecast over Australia, with an issue time suitable for operational forecasters.

Table 1.

Input NWP model characteristics for forecasts available at the Bureau of Meteorology in 2007.

Table 1.

To perform bias correction, weighting, and downscaling, we require “truth” on both the 1.25° and 5-km grids. We used the Australian Mesoscale Surface Assimilation System (MSAS) (Glowacki et al. 2012), which is available every hour at 2.5′ (approximately 5 km; denoted 5 km hereafter) resolution over Australia. MSAS is a modified Miller and Benjamin (1992) statistical interpolation scheme that analyses automatic weather station surface observations onto a background field at a higher resolution than is achievable with the full data assimilation systems currently available within Australia. MSAS fields at 1.25° are obtained via area averaging. While more sophisticated scale-separation techniques exist (Feser and von Storch 2005; Denis et al. 2002), they were considered beyond the scope of this study due to the lack of stable MSAS data.
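The area averaging used to bring the 5-km MSAS fields to the 1.25° grid can be sketched as a simple block mean. This is a minimal illustration, not the operational code: the function name and the assumption that each coarse box spans an integer number of fine grid boxes are ours.

```python
import numpy as np

def upscale_block_mean(fine, factor):
    """Upscale a 2-D field by averaging non-overlapping factor x factor
    blocks (e.g. 5-km MSAS boxes aggregated into 1.25-degree boxes)."""
    ny, nx = fine.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be a multiple of factor")
    return fine.reshape(ny // factor, factor,
                        nx // factor, factor).mean(axis=(1, 3))

# Example: a 30x30 fine grid averaged into 10x10 coarse boxes.
fine = np.arange(900, dtype=float).reshape(30, 30)
coarse = upscale_block_mean(fine, 3)
```

Each coarse value is then interpreted as the area-mean temperature of its box, matching the grid-box interpretation used in section 3a.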

We demonstrate the gridded OCF technique for 2-m temperature, for forecasts valid every 6 h out to 3 days, using as input the NWP models outlined in Table 1. Two Australian Limited Area Prediction System runs (L and LL) were used to emulate the original OCF studies (Woodcock and Engel 2005; Engel and Ebert 2007). Note the 12-h lag for the European Centre for Medium-Range Weather Forecasts (EC) and Australian Limited Area Prediction System model (LL). The majority of the model forecasts were available with spatial resolutions equal to or coarser than 1.25°. Model output was averaged or interpolated to 1.25° scale if required.

Concurrent upgrades to the operational MSAS analysis scheme meant that a stable dataset of only short duration was available (Glowacki et al. 2012). Evaluation was conducted on independent summer and winter periods to test seasonality, with results segregated into tropical and midlatitude regions (equatorward or poleward of 25°S, respectively). The study uses data from 9 June to 8 August 2007 (winter period) and 23 November 2007 to 7 February 2008 (summer period), with a 30-day training period prior to the beginning of each period. The winter and summer periods have 52 and 77 samples, respectively. Forecasts for the National Centers for Environmental Prediction Global Forecast System (US) were unavailable during the winter period. Sea points were not included in the study due to the lack of surface observations as input to MSAS over that area.

3. Grid-based OCF methodology

a. OCF consensus forecast in grid boxes

The site-based OCF methodology of Woodcock and Engel (2005) and Engel and Ebert (2007) is a simple statistical scheme that takes a weighted average of bias-corrected component model forecasts on a day-by-day (or hour by hour) basis, for a directly observed location. The gridded OCF methodology extends this concept to a latitude–longitude grid using the MSAS analysis (Glowacki et al. 2012) as truth, with each grid box assumed to represent an area-mean value centered on the point of interest. In this study, a resolution of 1.25° is chosen as the “common” scale of the forecast information available (Table 1). NWP forecasts unavailable at the common 1.25° scale are upscaled or downscaled to this resolution, using area averaging or bilinear interpolation, respectively. Apart from the transition from specific locations to grid boxes, the process of site-based OCF and grid-based OCF is the same. The process is summarized here for clarity.

For each grid box a consensus forecast based at time (t0) and forecast hour (h = 0, 6, … , 72) is calculated as follows. Starting with N + 1 days of historical forecasts from each model (m), valid at times ti (with i = 0, 1, … , N referring to a date i days prior but at the same forecast hour), the historical bias and weight parameters are calculated with reference to the corresponding MSAS analyses.

At each grid box the bias parameter (bm) for model m is provided by the best easy systematic mean statistic (BES; Wonnacott and Wonnacott 1972, section 7.3) using forecasts (fm) and analysis values (o) from the same forecast hour (h) from 1 to N days ago, with

bm = (Q1 + 2Q2 + Q3)/4, (1)

where Q1, Q2, and Q3 are the first, second, and third quartiles, respectively, of the (fm − o) error sample (the distribution of daily bias errors). Model-dependent weighting parameters (wm) are then calculated using an inverse unbiased mean absolute error (UMAE):

wm = 1/UMAE_m, where UMAE_m = (1/N) Σ(i=1..N) |fm(ti) − bm − o(ti)|, (2)

which is then normalized to become

w̃m = wm / Σ(k=1..nm) wk, (3)

where nm is the number of models in the consensus. From this, the bias-corrected, weighted-average consensus forecast (fc) at that model run time and hour becomes

fc = Σ(m=1..nm) w̃m [fm(t0) − bm]. (4)
The bias and weighting parameters are approximated using the historical sample of errors over N days. Deficiency in either bias removal or weighting may result in a suboptimal consensus forecast. While the bias removal may be relatively stable (due to the BES procedure excluding outliers), the weighting parameter is more sensitive to outliers. A single outlier in the historical window may result in a model being severely downweighted from the consensus at that hour while that outlier is in the statistical window. While such outliers may be rare for the 1.25° scale being processed, they may become less rare as the scale of consensus processing becomes smaller. Another practical point of consideration is the normalization of the weighting. With not all models available at all hours, the influence of individual models can vary between forecast hours. Normalized weighting of model forecasts also varies between forecast hours due to the relative skill of the models at each hour through the forecast period.
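The per-grid-box procedure above (BES bias estimate, inverse-UMAE weights, normalized weighted average) can be sketched as follows. This is a minimal illustration under our own naming; the operational system additionally handles missing models, lagged runs, and full latitude–longitude grids.

```python
import numpy as np

def bes(errors):
    """Best easy systematic estimator: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = np.percentile(errors, [25, 50, 75])
    return (q1 + 2.0 * q2 + q3) / 4.0

def consensus(forecasts, analyses, current):
    """Bias-corrected, inverse-UMAE-weighted consensus at one grid box.

    forecasts: dict model -> array of N historical forecasts (one hour h)
    analyses:  array of N matching MSAS analysis values (truth)
    current:   dict model -> today's forecast for the same hour
    """
    biases, weights = {}, {}
    for m, hist in forecasts.items():
        biases[m] = bes(hist - analyses)                       # Eq. (1)
        umae = np.mean(np.abs(hist - biases[m] - analyses))    # unbiased MAE
        weights[m] = 1.0 / umae                                # Eq. (2)
    total = sum(weights.values())                              # Eq. (3)
    return sum(weights[m] / total * (current[m] - biases[m])   # Eq. (4)
               for m in forecasts)
```

Note that a model whose bias-corrected errors vanish over the window would give a zero UMAE; in practice a small floor on UMAE would be needed before inverting it.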

b. Statistical downscaling of 1.25° consensus 2-m temperature forecasts

The basis for this part of the procedure is described in Engel and Ebert (2007), with the initial ideas of scale separation inspired by Seed (2003), Tustison et al. (2001), and Stamus et al. (1992).

Much of the representativeness error introduced by the discrepancy in scales between the 1.25° input grid and the 5-km output grid is due to persistent local effects related to topography, coastlines, etc., and can be removed by a simple statistical process. The systematic scale difference (common to all models, including the consensus) is found for each finescale grid box using the MSAS analysis at its native 5-km resolution, upscaled to 1.25° to provide a large-scale truth. The information lost in this upscaling (the scale difference) is calculated by bilinearly interpolating the 1.25° MSAS back to 5-km resolution and subtracting the result from the original 5-km MSAS analysis to obtain a field of scale differences. A historical window of scale differences is then collated using the BES [Eq. (1)] to form a small-scale correction, which is added to the 1.25° consensus forecast interpolated onto the 5-km domain. While the consensus forecast exists for hours 0, 6, … , 72, the MSAS small-scale correction exists only for one diurnal cycle (i.e., hours 0, 6, 12, and 18). Beyond hour 18, therefore, the small-scale correction is repeated, matching the respective time of day.
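The downscaling step can be sketched as below. This is a simplified illustration, not the operational code: the names are ours, and plain block replication stands in for the bilinear interpolation described above.

```python
import numpy as np

def upscale(fine, factor):
    """Area-average a fine field into coarse boxes (5 km -> 1.25 deg)."""
    ny, nx = fine.shape
    return fine.reshape(ny // factor, factor,
                        nx // factor, factor).mean(axis=(1, 3))

def downscale(coarse, factor):
    """Put a coarse field back on the fine grid. The paper uses bilinear
    interpolation; block replication is used here for brevity."""
    return np.kron(coarse, np.ones((factor, factor)))

def small_scale_correction(msas_history, factor):
    """BES of the historical scale-difference fields: what each fine-scale
    analysis contains that the coarse grid cannot represent."""
    diffs = np.array([m - downscale(upscale(m, factor), factor)
                      for m in msas_history])
    q1, q2, q3 = np.percentile(diffs, [25, 50, 75], axis=0)
    return (q1 + 2.0 * q2 + q3) / 4.0   # BES [Eq. (1)] per fine grid box

def downscaled_forecast(coarse_consensus, correction, factor):
    """Interpolate the coarse consensus to the fine grid and add the
    small-scale correction."""
    return downscale(coarse_consensus, factor) + correction
```

Because the correction is built from analyses alone, it captures only the persistent (e.g. topographic) part of the scale difference, consistent with the finding that transient small-scale weather is not recovered.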

4. Results

a. Experimental setup

As indicated in section 2, this study tests the gridded consensus methodology for 2-m temperature forecasts using the NWP input given in Table 1. The tests were run in a pseudo-operational mode starting on the first day of the period, using the 30-day training period, with weights and biases updated on a daily basis. Statistics were collated only for dates on which all the model inputs and consensus outputs exist. In this paper we describe 0000 UTC runs of GOCF; 1200 UTC runs were tested outside this study and gave similar results.

To test the optimal temporal window used for bias and weighting (section 3a), historical window sizes were varied between 5 and 30 days in increments of 5 days. For each window size the bias and weighting parameters along with the consensus were calculated. Then, from the 1.25° consensus forecast based on an optimal historical window size of 15 days (section 4b), downscaled consensus forecasts (section 3b) were calculated, using small-scale correction window sizes ranging from 5 to 30 days in increments of 5 days, again to test the optimal window.

b. Large-scale error characteristics

With each NWP model having an independent formulation (except L and LL), the model errors should be somewhat independent, which is beneficial to the consensus forecast as errors may partially cancel out. Systematic NWP model errors are expected to be less pronounced at 1.25° horizontal resolution than at smaller scales since large-scale meteorological processes are typically better represented. Despite this, mean errors from the 1.25° NWP forecasts are nonnegligible and persistent across the summer period, and vary from model to model (Fig. 2) and from hour to hour (cf. Figs. 2 and 3). Errors of similar magnitude but different location occurred in the winter period (not shown). Figures 2 and 3 display regions of high and low mean error at 1.25° with the interior of Australia tending to show greater model bias than the coastal regions. The values of the mean error and error variance change throughout the day with a pronounced diurnal cycle for some models (Figs. 4 and 5).

Fig. 2.

(a)–(f) The 1.25° mean error (K) with respect to MSAS during the summer period for projection hour 24 for all component models. Negative contours are dashed; zero contour is set in boldface.

Fig. 3.

As in Fig. 2, but for projection hour 30.

Fig. 4.

Summer period 1.25° mean error and error standard deviation (K) with respect to MSAS for each component model for each projection hour for (a)–(f) the different models. Statistics are taken over the entire (tropical or midlatitude) sample.

Fig. 5.

As in Fig. 4, but for the winter period.

The success of the bias removal depends on the error stability over the correction window. At a specific projection hour, bias errors can vary significantly from day to day and hour to hour (Fig. 6). Sometimes bias errors can be fairly static over a number of days (say, due to a slow-moving weather system) or they may be very different (due to changing weather conditions). The length of time over which the bias errors are representative of each other will depend on the season and similarity of the weather conditions, so it is important to investigate the optimal length of the time window used for bias correction and weighting parameters.

Fig. 6.

Example 1.25° EC FCST-ANAL 15-day bias estimates, with respect to MSAS, for projection hours (a),(c),(e) 30 and (b),(d),(f) 36, for (c),(d) dates 4 and (e),(f) 38 days apart, as compared to (a),(b) zero days. Negative contours are dashed; zero contour is set in boldface.

The bias corrections bm(*, h) and the inverse UMAE both vary with window size (shown for tropical summer in Figs. 7 and 8, with similar behavior for other model–region–season combinations). Both quantities vary most for the smallest window (5 days), with their spread diminishing as the window size increases. Beyond a window size of 5 days the inverse UMAE is less sensitive to window size (Fig. 8); the same is true for wm(*, h) (not shown).

Fig. 7.

Summer period 1.25° bias (BES) mean and standard deviation values (K) in the tropics with respect to MSAS for bias-correction window sizes (left to right) of 5, 10, 15, 20, 25, and 30 days: (a) L and (b) UK models. Statistics are taken over the entire sample. Other models show similar sensitivities to bias-correction window sizes (not shown).

Fig. 8.

As in Fig. 7, but for inverse UMAE.

While the sensitivity of bm(*, h) and the inverse UMAE to window size may seem relatively small (across the entire summer period and tropical region), it does affect the error remaining after bias correction (Fig. 9). As the window size increases, the median and range of the mean error values also increase, indicating that the correction sample becomes less representative of the period. Conversely, the error variance after correction is greater for smaller window sizes, as imperfect bias correction can produce a larger range of errors. Beyond a window size of 15 days, the increase in error variance relative to an uncorrected error sample is similar, with some variation from model to model (Fig. 10). (Note that only a perfect bias correction would produce no increase in error variance.) These results indicate that a window size of 5 or 10 days is too small to generate stable bm(*, h) or wm(*, h) values, as they vary much more than those from the larger statistical windows. A statistical window of 5 or 10 days may yield a smaller mean error after correction, but can produce a larger error variance after correction. As the statistical window size increases beyond 10 days, the bias parameter and the error after bias correction stabilize.
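This tradeoff between window length and residual error variance can be illustrated with a small synthetic experiment (assuming independent, identically distributed daily errors, which real NWP errors are not):

```python
import numpy as np

# Synthetic daily forecast errors: a constant bias plus random noise.
rng = np.random.default_rng(1)
true_bias, sigma, n = 1.0, 1.0, 3000
err = true_bias + rng.normal(0.0, sigma, n)

def residual_var(window):
    """Variance of the error remaining after subtracting a bias estimate
    formed from the mean of the previous `window` days."""
    resid = [err[t] - err[t - window:t].mean() for t in range(window, n)]
    return float(np.var(resid))

# A short window gives a noisier bias estimate and hence a larger
# residual variance (for i.i.d. errors the expectation is
# sigma^2 * (1 + 1/window)), mirroring the behavior seen in Fig. 10.
print(residual_var(5), residual_var(30))
```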

Fig. 9.

Box-and-whiskers plots of the 1.25° NWP mean errors (K) after bias correction with respect to MSAS for window sizes of 5, 10, 15, 20, 25, and 30 days over the summer period in the tropics, with each point in the box-and-whiskers plot a single gridbox statistic: (a)–(f) the various models. The boxes show the median and 25th and 75th percentiles, while the whiskers indicate 1.5 × (75th − 25th percentile) below the 25th percentile and above the 75th percentile. Outliers beyond the whisker values are plotted as circles.

Fig. 10.

As in Fig. 9, but for the increase in error variance relative to the uncorrected error sample (square root, K).

The consensus forecasts resulting from the various statistical correction window sizes (Figs. 11 and 12) also reflect these findings, and indicate that a window size of 10 days or more adequately captures the forecast error variability along with the systematic behavior, with a slight degradation in the bias approximation beyond a window size of 15 days.

Fig. 11.

Box-and-whiskers plots of the 1.25° consensus mean errors (K), with respect to MSAS, for bias correction and weighting window sizes of 5, 10, 15, 20, 25, and 30 days, with each point in the box-and-whiskers plot a single gridbox statistic: winter (a) tropics and (b) mid-latitudes. (c),(d) As in (a),(b), but for summer.

Fig. 12.

As in Fig. 11, but for error standard deviation (K).

The remainder of this paper focuses on consensus forecasts derived from a statistical window size of 15 days. A window size of 15 days was chosen rather than 10 days for the purpose of robustness to missing input, which can occur in practice in an operational setting. Note that Stensrud and Yussouf (2005) use a 12-day window for similar reasons.

Figure 13 shows the RMSE for the NWP forecasts after the 15-day bias correction and the RMSE for the weighted consensus forecast calculated using a 15-day bias correction and weighting. Mapped values of the 15-day consensus RMSE values are shown in Fig. 14.

Fig. 13.

The 1.25° RMSE (K), with respect to MSAS for (left to right at each forecast hour) L and LL (unshaded); EC, Japan Meteorological Agency Global Spectral Model (JM), Met Office (UK) (and US in summer) (light gray); and OCF (dark gray) for the (a),(b) winter and (c),(d) summer periods and (a),(c) tropical and (b),(d) midlatitude regions (statistics taken over entire sample) using a 15-day bias correction and weighting.

Fig. 14.

The 1.25° RMSE (K) of OCF (after 15-day bias correction and weighting) for (a)–(l) projection hours 6, 12, … , 72 over the summer period. The 1.0 K contour is set in boldface.

Midlatitude model forecasts have relatively low RMSEs in the winter period (Fig. 13b) in comparison to the summer period (Fig. 13d). The winter RMSEs (across all hours) vary from 0.8 to 2.1 K, peaking during the night (forecast lead times of 12, 18, 36, 42, 60, and 66 h). In winter, the consensus process reduces the RMSE relative to the individual bias-corrected model forecasts by between 2% and 39%. The RMSEs of the bias-corrected model forecasts in the summer period vary from 1.2 to 2.3 K, slightly higher than in winter, peaking in the afternoon and evening (forecast lead times of 6, 12, 30, 36, 54, and 60 h) (Fig. 13d). During the summer period the consensus process reduces the RMSE by 12%–40%. The higher RMSE of the bias-corrected model forecasts in summer carries over into a correspondingly higher consensus RMSE in summer than in winter.
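The percentage reductions quoted here follow from the standard RMSE definition; a minimal helper (the function name is ours, not the paper's) makes the calculation explicit:

```python
import numpy as np

def pct_rmse_reduction(model_errors, consensus_errors):
    """Percent reduction in RMSE of the consensus relative to a single
    bias-corrected component model (illustrative helper)."""
    rmse = lambda e: float(np.sqrt(np.mean(np.square(e))))
    return 100.0 * (1.0 - rmse(consensus_errors) / rmse(model_errors))

# Halving every error halves the RMSE, i.e., a 50% reduction.
print(pct_rmse_reduction([2.0, -2.0, 1.0], [1.0, -1.0, 0.5]))
```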

Note that the bias [bm(*, h)] differs from model to model and hour to hour, with each individual model having a distinct diurnal cycle. After bias removal (using a window size of 15 days), some small nonzero mean errors persist, since bm(*, h) is only an estimate of the true bias (Figs. 9 and 11), but these are small in comparison to the error variance.

The performance of the consensus in the tropics is similar to that in the midlatitudes, apart from a slightly lower error variance after bias correction (Fig. 13). During the winter period the bias-corrected model forecast RMSEs in the tropics range from 0.9 to 1.8 K, with the consensus process reducing the RMSEs by 7%–42%. In summer in the tropics, the bias-corrected model forecast RMSEs range from 1.2 to 2.3 K, with the consensus improving the RMSEs by 17%–45%. Consensus forecasts appear to be slightly more accurate in the tropics than in the midlatitudes, though the difference may not be statistically significant (Figs. 13 and 14). Note the spatial variation in consensus RMSE (Fig. 14).

c. Performance of consensus at small scales

Small-scale deviations from the large-scale environment, for example due to terrain effects or differential surface heating, can be difficult to forecast, especially in highly dynamic environments. Since local weather responds to the large-scale environment, an error in the large scale can lead to error in the smaller scale. Therefore, improving forecasts of the large-scale environment can reduce small-scale error.

A simple and practical approach to downscaling coarse-resolution forecasts is a statistical small-scale correction that removes the most stable representation errors from the 1.25° consensus forecast (section 3b). Since the greatest influence on the local correction for temperature forecasts is altitude, this approach may be expected to work well. Nevertheless, the ability of the statistical small-scale correction to remove the representation error also depends on how well the large-scale scenario is represented in the statistical window. When the large-scale environment drives small-scale activity that is not represented in the statistical sample (e.g., a change in prevailing wind direction from offshore to onshore flow), the method will not work as well; this is a limitation of the approach.
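The idea of the statistical small-scale correction can be sketched as follows, with block replication standing in for bilinear interpolation; all names and the toy grids are assumptions for illustration, not the operational scheme:

```python
import numpy as np

def to_fine(coarse, factor):
    """Upscale a coarse grid to the fine grid by block replication
    (a simple stand-in for bilinear interpolation)."""
    return np.kron(coarse, np.ones((factor, factor)))

def downscale(coarse_fcst, hist_coarse, hist_fine_analysis, factor):
    """Interpolate the coarse consensus to the fine grid, then add the
    time-mean fine-scale deviation (e.g., a terrain signal) estimated
    from a window of historical fine-scale analyses."""
    correction = np.mean(
        [a - to_fine(c, factor)
         for c, a in zip(hist_coarse, hist_fine_analysis)],
        axis=0)
    return to_fine(coarse_fcst, factor) + correction

# Toy case: the fine-scale analysis differs from the coarse field by a
# fixed terrain offset, which the windowed correction recovers exactly.
terrain = np.array([[0.0, 1.0, 0.0, -1.0]] * 4)
hist_coarse = [np.zeros((2, 2))] * 3
hist_fine = [terrain.copy()] * 3
fine_fcst = downscale(np.ones((2, 2)), hist_coarse, hist_fine, factor=2)
```

A static terrain offset is recovered perfectly here, which is exactly the "systematic deviation" case; weather-dependent deviations absent from the window would not be.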

Figure 15 shows that the statistical correction reduces the mean error to nearly zero, with the greatest improvements over areas of complex topography (results are similar over all of Australia; not shown). Tests confirmed that the statistical correction also reduces the error variance compared to simple bilinear interpolation from 1.25° to 5 km (not shown). The reduction in RMSE is similar for all temporal window sizes tested, with little variation (Fig. 16), indicating that the systematic small-scale differences are approximately static and can be removed effectively with even a small temporal window. Note that the bias correction and weighting in the gridded OCF forecasts reduce the mean error and error standard deviation relative to the uncorrected consensus forecast.

Fig. 15.

Mean error (K) from 1.25° OCF (after 15-day bias correction and weighting) interpolated to 5 km (a) without and (b) with 15-day small-scale correction, over Tasmania (see Fig. 1), for projection hour 30 of the summer period. Negative contours are dashed; the zero contour is set in boldface.

Fig. 16.

Mean error (K) and error standard deviation (K) of 1.25° OCF (after 15-day bias correction and weighting) interpolated to 5 km and corrected with small-scale correction of 0 (uncorrected), 5, 10, 15, 20, 25, and 30 days for the winter and summer periods and tropical and midlatitude regions (statistics taken over entire sample): winter (a) tropics and (b) mid-latitudes. (c),(d) As in (a),(b), but for summer.

Comparing the RMSE values of the 1.25° and 5-km consensus forecasts, the RMSE values are larger for the 5-km consensus (Fig. 17) than for the 1.25° consensus (Fig. 14). The 5-km consensus incorporates only systematic small-scale deviations from the large scale, where these deviations are derived from MSAS. Nonsystematic small-scale deviations (e.g., from transient weather systems) are not accounted for by the scheme, and inaccurate predictions of these unresolved features increase the RMSE. Moreover, as MSAS is an assimilation scheme, it cannot be depended upon to accurately represent small-scale features in regions with poor observation coverage, such as the data gap around 25°S, 122.5°E (Fig. 1). Therefore, there are regions where even systematic small-scale deviations cannot be adequately corrected. In summary, while the 5-km statistical correction can remove systematic small-scale differences, it cannot capture dynamic (nonsystematic) small-scale deviations such as those due to weather variability, and it cannot accurately remove systematic deviations in poorly observed areas. Forecasters, and statistical techniques such as regime matching, can potentially add value in such situations.

Fig. 17.

As in Fig. 14, but for 1.25° OCF (after 15-day bias correction and weighting) interpolated to 5 km and corrected using 15-day small-scale correction.

5. Discussion

Forecast characteristics frequently change with upgrades to local and international models. In Australia, this has been further affected by the recent acquisition of international model output with grid resolutions finer than are available on the Global Telecommunications System. As the resolved scales of the models become finer, the consensus process may benefit from a smaller "common" scale for the bias correction and consensus, where dynamic small-scale information replaces statistical downscaling at some scales. However, as forecast scales get smaller, the nature of the errors starts to change. For example, if models suffer from large-scale errors in the timing or placement of features, the subsequent small-scale features may also be in the wrong place or over-/underdeveloped. This may be reflected in this study by the poor performance of the L and LL models in the later forecast periods. L and LL are nested within the Australian Global Assimilation and Prognosis model (GASP; Seaman et al. 1995; Bourke et al. 1995), which during the period of this study had comparatively lower skill than the other global NWP models. This may have contributed to the smaller contribution of L and LL to the accuracy of the consensus forecast. The Australian Bureau of Meteorology has since upgraded its global NWP model (Puri et al. 2010), and higher-resolution versions of forecasts from other international centers have become available.

The gridded OCF system currently running at the Bureau of Meteorology uses a “common scale” of 0.5° (National Meteorological and Oceanographic Centre 2010), which has resulted in more accurate forecasts in coastal regions. The optimal choice of a common scale given the NWP information available will require further study.

The question may be asked whether it is better to perform the bias correction at point scale, as has been done in other studies (e.g., Woodcock and Engel 2005; Stensrud and Yussouf 2005), or at grid scale, as has been demonstrated here. If forecasts are desired only at locations where observations are available to be used for bias correction, then clearly a point approach is advantageous. When gridded forecasts are required, the point bias correction may be highly localized in some areas and hence might be less suitable than a mesoscale analysis for “filling in the gaps.” In practice, the Bureau of Meteorology uses both approaches: site-based OCF forecasts are made for locations where observations are taken and applied to the relevant grid box, while gridded OCF is used to produce forecasts for the remainder of the grid.

In this study, the bias correction and weighting parameters (on the large scale) and the downscaling corrections (on the small scale) were computed using simple temporal window-based statistics. No attempt is made to determine whether the statistics in the historical sample come from weather scenarios similar to the current one. While there are many advantages to using a simple process, there are also some disadvantages. Simple error characterizations require smaller historical datasets, which in turn allow rapid adjustment to changes in the underlying models or observations. This rapid adjustment enhances operational reliability, since a forecast of reasonable quality can still be produced while the underlying inputs change. Use of a simple method may facilitate the modeling of systematic errors under certain conditions, further reducing the consensus error. Given the lack of control over the international inputs, more sophisticated model error characterizations (such as regression or artificial neural networks) may lead to a technique that is operationally less stable.

As models increase in skill, the performance of the consensus forecasts will continue to improve and the associated reduction in error due to consensus may reach a level of diminishing returns. In this scenario, it may be viable to instead focus on more sophisticated error characterization of a single (stable) model forecast, using reanalysis datasets (e.g., Hamill and Whittaker 2006). In Australia, consensus methods are more practical at present.

One way to enhance consensus methods may be to refine the weighting of the component models. The current weighting method, based on inverse MAE, is relatively unresponsive to differences in error variance. Other weightings such as inverse RMSE have been suggested, but these also respond only mildly to skill differences after normalization. There may be some merit in weighting directly by error variance, so that the normalized weights of the component model forecasts directly reflect the differences in error variance. While this may dampen the input of some models and reduce the tendency to regress to the mean (especially if one model dominates), it may increase consensus forecast accuracy. Bayesian model averaging accounts for both the prior performance of component models and their interdependence, and has been successfully applied to temperature forecasting from multiple models (Raftery et al. 2005; Wilson et al. 2007).
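A minimal sketch of the two weighting options just discussed (normalized inverse MAE, as used in this study, vs. inverse error variance); the function name and array shapes are our assumptions:

```python
import numpy as np

def consensus(forecasts, error_samples, scheme="inv_mae"):
    """Weighted consensus of bias-corrected model forecasts at one grid
    box and hour. `forecasts` has shape (models,); `error_samples` has
    shape (models, days) of recent errors. Illustrative sketch only."""
    if scheme == "inv_mae":          # weighting used in this study
        w = 1.0 / np.mean(np.abs(error_samples), axis=1)
    else:                            # variance-based alternative
        w = 1.0 / np.var(error_samples, axis=1)
    w = w / w.sum()                  # normalize weights to unit sum
    return float(np.dot(w, forecasts)), w

# Three models with MAEs of 1, 2, and 4 K: the most skillful model
# receives the largest normalized weight.
fx = np.array([1.0, 2.0, 3.0])
errs = np.array([[1.0, -1.0], [2.0, -2.0], [4.0, -4.0]])
value, weights = consensus(fx, errs)
```

In this toy case the inverse-variance weights (1, 1/4, 1/16 before normalization) penalize the noisy model much more strongly than the inverse-MAE weights (1, 1/2, 1/4), illustrating the sharper skill discrimination discussed above.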

When one or more model forecasts consistently outperform the other model forecasts, the reduction in error through consensus may become smaller than if all models performed equally, but the results shown here suggest that the consensus is still the best performer. Note that this may not hold if any model forecasts were redundant (Clemen and Winkler 1985). It would be worthwhile to test the accuracy of consensus forecasts using a subset of the most skillful models, as other studies have shown that removing the consistently poorer models can benefit the final forecast (e.g., Arribas et al. 2005; Durrant et al. 2009).

Simple statistical downscaling of the large-scale consensus forecast using statistical small-scale information further improves forecast accuracy at small scales. Within its limitations, this method provides inexpensive, relatively accurate temperature forecasts over a 5-km-resolution Australian domain.

In conclusion, the gridded operational consensus forecast methodology described here is a first step in the development of consensus NWP guidance that makes good use of the model information currently available. This relatively inexpensive methodology produces consensus forecasts that are up to 40% more accurate than the individual component models. While more sophisticated methods may be developed, this study demonstrates that a simple method can have a dramatic impact on forecast quality.

The methodology is also being used to produce probabilistic forecasts. The poor man’s ensemble for precipitation described by Ebert (2001) is now run within the gridded OCF system, and Bayesian model averaging (Raftery et al. 2005; Sloughter et al. 2007) is being tested as a calibration method for probabilistic forecasts. Uncertainty products for mean sea level pressure and other variables, based on the variability among input models, are also being developed (T. Hume 2011, personal communication).

Acknowledgments

The authors would like to acknowledge the local and international institutions providing the forecast information used in this study, along with the Australian Bureau of Meteorology for making the forecasts available. John Bally provided helpful advice during the course of this work. We would also like to thank Tomasz Glowacki for early access to Mesoscale Surface Assimilation Scheme data, along with Jim Fraser and Charles Sanders for assistance in obtaining data.

REFERENCES

  • Arribas, A., Robertson, K. B., and Mylne, K. R., 2005: Test of a poor man’s ensemble prediction system for short-range probability forecasting. Mon. Wea. Rev., 133, 1825–1839.

  • Bakhshaii, A., and Stull, R., 2009: Deterministic ensemble forecasts using gene-expression programming. Wea. Forecasting, 24, 1431–1451.

  • Bourke, W., Hart, T., Steinle, P., Seaman, R., Embery, G., Naughton, M., and Rikus, L., 1995: Evolution of the Bureau of Meteorology’s Global Assimilation and Prediction System. Part 2: Resolution enhancements and case studies. Aust. Meteor. Mag., 44, 19–40.

  • Cheng, W. Y. Y., and Steenburgh, J., 2007: Strengths and weaknesses of MOS, running-mean bias removal, and Kalman filter techniques. Wea. Forecasting, 22, 1304–1318.

  • Clemen, R. T., and Winkler, R. L., 1985: Limits for the precision and value of information from dependent sources. Oper. Res., 33, 427–442.

  • Davies, T., Cullen, M. J. P., Malcolm, A. J., Mawson, M. H., Staniforth, A., White, A. A., and Wood, N., 2005: A new dynamical core for the Met Office’s global and regional modelling of the atmosphere. Quart. J. Roy. Meteor. Soc., 131, 1759–1794.

  • Denis, B., Côté, J., and Laprise, R., 2002: Spectral decomposition of two-dimensional atmospheric fields on limited-area domains using the discrete cosine transform (DCT). Mon. Wea. Rev., 130, 1812–1829.

  • Durrant, T. H., Woodcock, F., and Greenslade, D. J. M., 2009: Consensus forecasts of modeled wave parameters. Wea. Forecasting, 24, 492–503.

  • Ebert, E. E., 2001: Ability of a poor man’s ensemble to predict the probability and distribution of precipitation. Mon. Wea. Rev., 129, 2461–2480.

  • Engel, C., and Ebert, E., 2007: Performance of hourly operational consensus forecasts (OCFs) in the Australian region. Wea. Forecasting, 22, 1345–1359.

  • Feser, F., and von Storch, H., 2005: A spatial two-dimensional discrete filter for limited-area-model evaluation purposes. Mon. Wea. Rev., 133, 1774–1786.

  • Glahn, H. R., and Lowry, D. A., 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211.

  • Glahn, H. R., and Ruth, D. P., 2003: The new Digital Forecast Database of the National Weather Service. Bull. Amer. Meteor. Soc., 84, 195–201.

  • Glowacki, T., Yi, X., and Steinle, P., 2012: Mesoscale Surface Analysis System for the Australian domain: Design issues, development status, and system validation. Wea. Forecasting, 27, 141–157.

  • Greybush, S. J., and Haupt, S. E., 2008: The regime dependence of optimally weighted ensemble model consensus forecasts of surface temperature. Wea. Forecasting, 23, 1146–1161.

  • Hacker, J. P., and Rife, D. L., 2007: A practical approach to sequential estimation of systematic error on near-surface mesoscale grids. Wea. Forecasting, 22, 1257–1273.

  • Hamill, T. M., and Whittaker, J. S., 2006: Probabilistic quantitative precipitation forecasts based on forecast analogs: Theory and application. Mon. Wea. Rev., 134, 3209–3229.

  • Johnson, C., and Swinbank, R., 2009: Medium-range multimodel ensemble combination and calibration. Quart. J. Roy. Meteor. Soc., 135, 777–794.

  • Kanamitsu, M., 1989: Description of the NMC global data assimilation and forecast system. Wea. Forecasting, 4, 335–342.

  • Krishnamurti, T. N., Kishtawal, C. M., LaRow, T. E., Bachiochi, D. R., Zhang, Z., Williford, C. E., Gadgil, S., and Surendran, S., 1999: Improved weather and seasonal climate forecasts from multimodel superensemble. Science, 285, 1548–1550.

  • Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141.

  • Miller, P. A., and Benjamin, S. G., 1992: A system for the hourly assimilation of surface observations in mountainous and flat terrain. Mon. Wea. Rev., 120, 2342–2359.

  • National Meteorological and Oceanographic Centre, 2010: Operational upgrades to the gridded OCF and PME systems. BOM Analysis and Prediction Operations Bull. 85, 27 pp. [Available online at http://www.bom.gov.au/australia/charts/bulletins/apob85.pdf.]

  • Persson, A., 2005: Early operational numerical weather prediction outside the USA: An historical introduction. Part III: Endurance and mathematics—British NWP, 1948–1965. Meteor. Appl., 12, 381–413.

  • Persson, A., and Grazzini, F., 2007: User guide to ECMWF forecast products. Meteor. Bull. M3.2, ECMWF, Reading, United Kingdom, 71 pp.

  • Puri, K., Dietachmayer, G. S., Mills, G. A., Davidson, N. E., Bowen, R., and Logan, L. W., 1998: The new BMRC Limited Area Prediction System, LAPS. Aust. Meteor. Mag., 47, 203–223.

  • Puri, K., and Coauthors, 2010: Preliminary results from numerical weather prediction implementation of ACCESS. CAWCR Res. Lett., 5, 15–22.

  • Raftery, A., Gneiting, T., Balabdaoui, F., and Polakowski, M., 2005: Using Bayesian model averaging to calibrate forecast ensembles. Mon. Wea. Rev., 133, 1155–1174.

  • Ruiz, J., Saulo, C., and Kalnay, E., 2009: Comparison of methods used to generate probabilistic quantitative precipitation forecasts over South America. Wea. Forecasting, 24, 319–336.

  • Saito, K., and Coauthors, 2006: The operational JMA nonhydrostatic mesoscale model. Mon. Wea. Rev., 134, 1266–1298.

  • Seaman, R., Bourke, W., Steinle, P., Hart, T., Embery, G., Naughton, M., and Rikus, L., 1995: Evolution of the Bureau of Meteorology’s Global Assimilation and Prediction System. Part 1: Analysis and initialisation. Aust. Meteor. Mag., 44, 1–18.

  • Seed, A., 2003: A dynamic and spatial scaling approach to advection forecasting. J. Appl. Meteor., 42, 381–388.

  • Silva Dias, P. L., Soares Moreira, D., and Neto, G. D., 2006: The MASTER Model Ensemble System (MSMES). Proc. Eighth Int. Conf. on Southern Hemisphere Meteorology and Oceanography, Foz do Iguaçu, Brazil, INPE, 1751–1757.

  • Sloughter, J. M., Raftery, A. E., Gneiting, T., and Fraley, C., 2007: Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Mon. Wea. Rev., 135, 3209–3220.

  • Stamus, P., Carr, F. H., and Baumhefner, D. P., 1992: Application of a scale-separation verification technique to regional forecast models. Mon. Wea. Rev., 120, 149–163.

  • Stensrud, D. J., and Skindlov, J. A., 1996: Gridpoint predictions of high temperature from a mesoscale model. Wea. Forecasting, 11, 103–110.

  • Stensrud, D. J., and Yussouf, N., 2005: Bias-corrected short-range ensemble forecasts of near surface variables. Meteor. Appl., 12, 217–230.

  • Tapp, R. G., Woodcock, R. F., and Mills, G. A., 1986: The application of model output statistics to precipitation prediction in Australia. Mon. Wea. Rev., 114, 50–61.

  • Thompson, P. D., 1977: How to improve accuracy by combining independent forecasts. Mon. Wea. Rev., 105, 228–229.

  • Tustison, B., Harris, D., and Foufoula-Georgiou, E., 2001: Scale issues in verification of precipitation forecasts. J. Geophys. Res., 106, 11 775–11 784.

  • Wilson, L. J., and Vallée, M., 2002: The Canadian Updateable Model Output Statistics (UMOS) system: Design and development tests. Wea. Forecasting, 17, 206–222.

  • Wilson, L. J., Beauregard, S., Raftery, A. E., and Verret, R., 2007: Calibrated surface temperature forecasts from the Canadian Ensemble Prediction System using Bayesian model averaging. Mon. Wea. Rev., 135, 1364–1385.

  • Wonnacott, T. H., and Wonnacott, R. J., 1972: Introductory Statistics. J. Wiley and Sons, 510 pp.

  • Woodcock, F., 1984: Australian experimental model output statistics forecasts of daily maximum and minimum temperature. Mon. Wea. Rev., 112, 2112–2121.

  • Woodcock, F., and Engel, C., 2005: Operational consensus forecasts. Wea. Forecasting, 20, 101–111.

  • Woodcock, F., and Southern, B., 1983: The use of linear regression to improve official temperature forecasts. Aust. Meteor. Mag., 31, 57–62.