1. Introduction
The ability to assess forecast model skill is a necessity for improving forecast systems. Forecast error has been monitored regularly at operational centers for decades for a variety of metrics, and long records of these performance metrics document steady improvements in the forecasting systems over the years (e.g., Simmons and Hollingsworth 2002; Magnusson and Källén 2013). Forecast error is due to both errors in the forecast model (often in the physical parameterizations) and errors in the initial state, and attributing forecast errors to initial condition errors or model errors is a difficult but worthy goal as it helps guide forecast system development. Lorenz (1982) noted that the differences between forecast error growth rates and growth rates of differences between consecutive forecasts are related to model deficiencies and that, as forecast models improve, these two growth rates should converge. Attempts to identify causes of the nonsystematic component of forecast error include that of Dalcher and Kalnay (1987), building on the work of Leith (1978), in which it is assumed that the exponential part of error growth is due to the self-growth of initial condition errors, while the linear part is due to model deficiencies. Several follow-on studies have applied similar error-attribution models to different forecasting systems (e.g., Stroe and Royer 1993; Reynolds et al. 1994; Simmons et al. 1995; Savijarvi 1995; Simmons and Hollingsworth 2002; Magnusson and Källén 2013; Peña and Toth 2014). Other methods developed for error attribution include geometric or shadowing techniques (Judd et al. 2008) and a mapping paradigm technique (Toth and Peña 2007).
A component of forecast error usually attributed to model error is the time-mean forecast error, often referred to as bias. Model bias is examined regularly for weather forecasts, as well as for seasonal forecasts and climate integrations (e.g., Klocke and Rodwell 2014). As has been shown for other metrics, weather forecast biases have been decreasing as forecast systems improve (e.g., Jung 2005). Forecast models can also be biased in ways that are not necessarily reflected in time-mean errors; for example, they may exhibit biases in spatial or temporal variability. Skamarock (2004) evaluates aspects of this type of bias by comparing the kinetic energy spectra of forecasts to the k^−3 spectra observed at larger scales and the k^−5/3 spectra observed on the mesoscale by Nastrom and Gage (1985). Many studies (e.g., Blackmon et al. 1977; Trenberth 1981; Lau and Nath 1987; Cai and Van Den Dool 1991) have evaluated the temporal variability of the atmosphere based on long-term sequences of analyses, using filters to examine variability on different time scales. If long model integrations are available (e.g., AMIP- or CMIP-type studies), then it is possible to compare the atmospheric model temporal variability on different time scales with the analysis variability, as in Lau and Nath (1987). Biases in temporal variability have been interpreted in light of biases in the time-mean state and the resulting waveguide characteristics (e.g., Reynolds and Gelaro 1997). Analyses that separate variability into westward- and eastward-propagating components have been very useful for diagnosing forecast model deficiencies in the simulation of equatorially trapped waves (e.g., Wheeler and Kiladis 1999; Lin et al. 2006). However, these types of temporal variability diagnostics require a long (usually multimonth to multiyear) forecast integration, something that is often not available from a weather forecast model.
Development of new verification methods, especially those designed to deal with high-resolution forecasts for which traditional verification scores may not be well suited, is a very active area of research. The papers of Gilleland et al. (2009) and Gilleland et al. (2010) provide overviews and intercomparisons of spatial forecast verification methods, including neighborhood, scale-separation, feature-based, and field deformation techniques. While these methods are designed to quantify total forecast error, there is potential for these methods to be used for error attribution as well. In particular, scale-separation techniques, in which forecast error is isolated by scale (e.g., Briggs and Levine 1997; Tustison et al. 2001; Harris et al. 2001; Casati 2010), may provide information on the forecast model's ability to reproduce the scale structure of the observed variability, especially if initial condition error can be ruled out as a source of these forecast errors.
In this paper we demonstrate simple diagnostics that can be used to determine if forecast temporal variability accurately captures the variability of the filtered versions of reality that the analyses and forecasts are designed to represent. The diagnostics also allow for an examination of how temporal variability on 1-day time scales in the forecast model can change as the forecast integration time increases. We propose that temporal variability diagnostics may serve as a useful complement to traditional model error diagnostics that examine time-mean errors (e.g., Klocke and Rodwell 2014), spatial scale-separation techniques (e.g., Harris et al. 2001), and new techniques to quantify differences between forecast fields and reality as represented on the scales resolved by the data-assimilation and forecast systems (e.g., Peña and Toth 2014). Diagnostics to assess both spatial and temporal variability will become increasingly important as stochastic techniques, which often have tunable spatial and temporal correlation scales, proliferate. Diagnostics measuring temporal variability (the mean square differences between fields 12 or 24 h apart in a forecast integration) have been used at operational centers such as ECMWF and NCEP to monitor model activity and detect consequences of model changes (A. Persson and Z. Toth 2015, personal communication), but we are not aware of published results concerning these diagnostics. In addition, we derive expected bounds on forecast temporal variability as a function of analysis temporal variability and analysis error variance. We demonstrate the utility of the diagnostics in evaluating the relative fidelity of the forecast model temporal variability for different metrics and different models. We apply these diagnostics to both control and perturbed forecasts produced from the National Centers for Environmental Prediction (NCEP) and Canadian Meteorological Centre (CMC) global ensemble forecasting systems. 
The diagnostics reflect differences in ensemble design and also reflect upgrades to the CMC ensemble system in a manner that is consistent with the expected impacts of the upgrades. The diagnostics and datasets are described in section 2, results are presented in section 3, and a summary and conclusions are given in section 4.
2. Methodology and dataset description
a. Description of diagnostics
Table 1. Description of forecasts and initial states for which diagnostics are calculated; 13 Feb 2013 is when the CMC system underwent a significant upgrade.
b. Derivation of bounds on expected temporal variability
Equations (11) and (12) place bounds on the relationship between forecast variability and analysis variability for a data assimilation and forecasting system that is functioning the way it should. That is, forecast temporal variability should lie between corresponding analysis variability and analysis variability minus 2 times the analysis error variance. Hence, violation of the bounds given in (11) and (12) is indicative of a time variability error in the forecasting and analysis system. Klocke and Rodwell (2014) have discussed error diagnostics based on the mean of short-term forecast change that reveal mean model error associated with fast processes. Equations (11) and (12) complement the Klocke and Rodwell diagnostic by providing bounds on the degree of variability that the model should support.
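The derivation itself is not reproduced above. A sketch consistent with the bounds as stated, under the assumptions of unbiased analysis errors that are uncorrelated with each other in time and with the true state (the notation a_t, x_t, e_t, V_a, V_f, V_x, and E_a is ours, not necessarily the paper's), is:

```latex
% Analysis = filtered true state + analysis error: a_t = x_t + e_t,
% with E[e_t] = 0, E[e_t^2] = E_a, and e_t uncorrelated with x and with e_{t+i}.
\begin{align*}
V_a(i) &= E\big[(a_{t+i} - a_t)^2\big] \\
       &= E\big[(x_{t+i} - x_t)^2\big]
        + 2\,E\big[(x_{t+i} - x_t)(e_{t+i} - e_t)\big]
        + E\big[(e_{t+i} - e_t)^2\big] \\
       &= V_x(i) + 0 + 2E_a .
\end{align*}
% A perfect forecast model follows a true trajectory of the system, so its
% temporal variability equals that of the true state:
\begin{equation*}
V_f(i) = V_x(i) = V_a(i) - 2E_a ,
\end{equation*}
% while a forecast whose variability matched that of the (error-inflated)
% analyses would give V_f(i) = V_a(i).  Together these yield the quoted bounds:
\begin{equation*}
V_a(i) - 2E_a \;\le\; V_f(i) \;\le\; V_a(i) .
\end{equation*}
```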
While these assumptions are not met by real systems, this derivation does point out that, for a well-functioning system, forecast temporal variability should be somewhat smaller than the corresponding analysis temporal variability, with the shortfall bounded by twice the analysis error variance.
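These bounds can be illustrated numerically with a synthetic toy system (our own construction, not a computation from the paper): an AR(1) process stands in for the filtered true state, analyses add white observational error, and the "forecast" is the perfect-model trajectory, i.e., the truth itself.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000

# AR(1) stand-in for the (filtered) true state x_t
phi = 0.9
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + rng.standard_normal()

E_a = 0.5                                      # analysis error variance
a = x + np.sqrt(E_a) * rng.standard_normal(T)  # analyses = truth + white error

def lag_var(s, i):
    """Temporal variability: mean square difference between states i steps apart."""
    return np.mean((s[i:] - s[:-i]) ** 2)

for i in (1, 2, 5):
    V_a = lag_var(a, i)   # analysis temporal variability
    V_f = lag_var(x, i)   # perfect-model forecast variability = truth variability
    # bounds: V_a - 2*E_a <= V_f <= V_a (small tolerance for sampling noise)
    assert V_a - 2 * E_a - 0.1 <= V_f <= V_a + 0.1
print("bounds satisfied for lags 1, 2, 5")
```

Here the lower bound is met essentially with equality, since the perfect-model variability equals the truth variability; a model with damped variability would fall below it, and one with excessive activity would exceed the analysis variability.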
c. Forecast set descriptions
We considered both NCEP and CMC global ensemble forecasts taken from the THORPEX Interactive Grand Global Ensemble (TIGGE) archive (Bougeault et al. 2010), using both the control ensemble member and the first perturbed ensemble member. This provides the opportunity to look at differences in the diagnostics between different forecast centers and between control and perturbed ensemble members. The NCEP Global Ensemble Forecast System produces forecasts from the NCEP Global Forecast System (GFS; Han and Pan 2011). The ensemble transform method with rescaling is used to produce the initial perturbations (Wei et al. 2008), and stochastic total tendency perturbations (STTP; Hou et al. 2006, 2008) are used to account for model uncertainty in the perturbed members. As the NCEP control does not include STTP, the control and perturbed members differ by initial conditions and stochastic forcing.
The Canadian Global Ensemble Prediction System (Charron et al. 2010; Gagnon et al. 2013; Houtekamer et al. 2014) produces forecasts from the Canadian Global Environmental Multiscale model (GEM; Côté et al. 1998a,b; Girard et al. 2014), using an ensemble Kalman filter (EnKF) for initial perturbations. Model uncertainty is incorporated in the perturbed members through both stochastic forcing [physics tendency perturbations (PTP) and stochastic kinetic energy backscatter (SKEB)] as well as parameterization differences. Therefore, the control and perturbed member of the Canadian ensemble differ through PTP and SKEB (included in the perturbed forecast but not the control), as well as differences in parameters in the parameterization schemes relating to gravity wave drag, mixing length, and orographic blocking.
The diagnostics were first calculated for the period 1 January–31 March 2013. However, the Canadian global ensemble prediction system underwent a significant upgrade at 1200 UTC 13 February, as described in Gagnon et al. (2013). This upgrade included a new version of the GEM model (from GEM 4.2.5 to GEM 4.4.1 with improved turbulent mixing and orographic blocking schemes and improved treatment of topography). The EnKF was also upgraded, including increased resolution (from approximately 100 to 66 km), increased volume of assimilated AMSU radiance observations, and improved observation bias corrections. The incorporation of model uncertainty was also upgraded, including changes to PTP and SKEB. Of particular interest here was the change made to the PTP scheme. Gagnon et al. (2013) note that very high quantities of precipitation in a 24-h period were forecast at a few grid points in the tropics in the older implementation—a problem traced to PTP and convection. Modifications to the PTP such that convective tendencies are not perturbed when there is a positive convective available potential energy (CAPE) reduced this problem. To see how this change is reflected in the diagnostics, we calculate the diagnostics for two time periods: before and after the upgrade (1 January–13 February and 14 February–31 March, respectively). This is done for both the NCEP and CMC forecasts to see if changes are apparent in the CMC ensemble that are not apparent in the NCEP ensemble.
Diagnostics were calculated for the initial states (and forecasts started from these initial states) every 0000 and 1200 UTC. The mean square differences were only calculated at 24-h increments so as to exclude diurnal variations. While the initial states and forecasts were produced at different resolutions, all of the diagnostics were performed on 1° × 1° gridded data. Configuration details for the different experiments considered are provided in Table 1. Diagnostics were calculated for temperature, zonal wind, meridional wind, geopotential height, and specific humidity at 850, 500, and 200 hPa. To be concise, we only present results for the 500-hPa level. Results for fields at the 850- and 200-hPa levels were qualitatively similar.
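The diagnostic quantities in (1)–(4) are built from the operation described above: mean square differences between gridded fields i days apart, area averaged with cosine-latitude weighting on the 1° × 1° grid. A minimal sketch (function names are ours, not the paper's):

```python
import numpy as np

def lagged_msd(fields, i):
    """Temporal variability map: mean square difference between fields
    i time steps (here, days) apart.  fields has shape (ntime, nlat, nlon)."""
    d = fields[i:] - fields[:-i]
    return np.mean(d ** 2, axis=0)

def area_mean(field2d, lats_deg):
    """Cosine-latitude weighted area average of a single lat-lon map."""
    w = np.cos(np.deg2rad(lats_deg))[:, None] * np.ones(field2d.shape)
    return np.average(field2d, weights=w)

# Toy example on a 1 deg x 1 deg grid, with random stand-in "analyses"
# (unit variance, so the 1-day MSD of white noise is ~2).
rng = np.random.default_rng(1)
lats = np.arange(-90.0, 91.0, 1.0)
lons = np.arange(0.0, 360.0, 1.0)
z500 = rng.standard_normal((20, lats.size, lons.size))  # 20 daily fields

v1 = lagged_msd(z500, 1)              # 1-day variability map
print(round(area_mean(v1, lats), 2))  # ~2 for unit-variance white noise
```

Applied to consecutive analyses, to forecasts i days apart within a single integration, or to forecast-minus-verifying-analysis differences, the same building block yields the analysis variability, forecast variability, and forecast error quantities shown in Fig. 1.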
3. Results
Results are first presented for area averages in order to give a concise overview of the diagnostic results. This is followed by plan-view examinations of the diagnostics in order to provide more detailed information on spatial differences.
a. Area-averaged diagnostics
In Fig. 1, we show the four diagnostic quantities defined in (1)–(4).
Fig. 1. Diagnostics calculated for 500-hPa geopotential height (m²) averaged for (left) 70°–30°S, (middle) 20°S–20°N, and (right) 30°–70°N, for i = 1, …, 10 days along the x axis.
Citation: Monthly Weather Review 143, 12; 10.1175/MWR-D-15-0083.1
For the NCEP control forecasts (top row),
Figure 2 is the same as Fig. 1 except the results are shown for 500-hPa specific humidity instead of 500-hPa geopotential height. While the analysis and forecast temporal variability in geopotential height (Fig. 1) were almost identical in the NCEP and CMC systems, larger differences between the two systems are apparent for specific humidity, especially in the tropics. In general, higher spatiotemporal resolution will yield higher temporal variability. Hence, the difference in resolution between NCEP and CMC may account for part of the higher variability of the NCEP system.
Fig. 2. As in Fig. 1, but for specific humidity (g² kg⁻²).
For the NCEP system, the analysis variability
The green mean square forecast error curves for both geopotential height and specific humidity show that, as expected, forecast error is substantially smaller than what would be obtained from a persistence forecast (black curves). The forecast errors from the two systems are quite similar, with smaller forecast errors for NCEP in the extratropical height fields and smaller forecast errors for the CMC system in the tropical specific humidity fields.
For both NCEP and CMC systems, both
To quantify the trends in the 1-day variability in the forecasts with increasing integration time, Fig. 3 shows the 1-day forecast variability as a function of forecast integration time.
Fig. 3. Trends in the 1-day variability with increasing forecast integration time (full caption truncated in source).
Diagnostics based on the control and perturbed ensemble member initial states and forecasts may differ owing to the way in which the ensembles are constructed.
Diagnostics calculated for the tropics for 500-hPa geopotential height, specific humidity, and temperature are shown in Figs. 4–6, respectively.
Fig. 4. Diagnostics calculated for 500-hPa geopotential height (m²) averaged for 20°S–20°N for (left) period 1 (1 Jan–13 Feb 2013) and (right) period 2 (14 Feb–31 Mar), for i = 1, …, 5 days along the x axis.
Fig. 5. As in Fig. 4, but for 500-hPa specific humidity (g² kg⁻²).
Fig. 6. As in Fig. 4, but for 500-hPa temperature (K²).
The CMC diagnostics (bottom panels of Figs. 4–6) share some characteristics with NCEP but also exhibit differences.
Figure 7 shows the analogous 1-day variability diagnostics for periods 1 and 2.
Fig. 7. (Full caption truncated in source.)
The qualitative characteristics of the diagnostics do not change between period 1 and period 2 for the NCEP members, while they do change for the CMC members, consistent with the NCEP system configuration remaining constant over the full time period considered while the CMC system underwent a major upgrade. That these changes in the diagnostics are clear for the perturbed member but not the control member indicates that the system upgrades affecting the perturbed member only, such as the changes to PTP and SKEB, had a larger impact on these particular diagnostics than changes affecting both perturbed and control members, such as the deterministic forecast model upgrades. These changes (the smaller values after the upgrade) are consistent with the modifications made in the upgrade, such as the change to the PTP scheme.
b. Spatial characteristics
While area-averaged diagnostics are useful for gaining a broad understanding of the general behavior of the systems, important spatial variability can be masked in the averaging. The plan-view diagnostics shown in this section will allow for a more detailed understanding of the spatial differences in these diagnostics.
Figure 8 shows maps of the 500-hPa geopotential height temporal variability for i = 1 and i = 10, along with their percent differences.
Fig. 8. Diagnostics calculated for 500-hPa geopotential height (m): variability maps for i = 1 and i = 10 (top), and percent differences of 10-day vs 1-day variability (bottom); full caption truncated in source.
In the Southern Hemisphere, the maximum shifts from the southern Indian Ocean at i = 1 to the southern Pacific Ocean at i = 10. As in the Northern Hemisphere, these patterns are consistent with previous studies of temporally filtered variances, such as Fig. 17 in Trenberth (1981), which likewise shows a shift in the maximum from the southern Indian Ocean for the 2–8-day bandpass-filtered variance to the southern Pacific Ocean for the 8–64-day bandpass-filtered variance.
The similarities between the spatial patterns produced using the current diagnostics and the spatial patterns available from previous studies using low-pass and bandpass filters are highlighted to stress the potential of this technique for diagnosing forecast model temporal characteristics on different time scales without the need for multimonth forecast integrations. Blackmon et al. (1977), Lau and Nath (1987), and Cai and Van Den Dool (1991) examined analysis time series of 3–4 months for 9–18 winter seasons [Trenberth (1981) used 8 yr of continuous data]. For their comparison of analysis and forecast temporal variability, Lau and Nath (1987) used 12 winter periods from a 15-yr-long model integration. The 10-day forecasts considered here are not sufficiently long to apply the low-pass (>10 day) or bandpass (7–90- or 8–64-day) filters used in the previous studies.
The percent differences of 10-day variability versus 1-day variability (Fig. 8, bottom panels) show considerable spatial inhomogeneity, with large values (over 150%) in some areas of the tropics/subtropics and polar regions and relatively small values in the storm-track regions. In fact, the values in the central North Pacific and southern Indian Ocean are close to zero in the forecasts, indicating that high-frequency (1 day) variability accounts for most of the changes seen over a 10-day time period. The results for the CMC system (not shown) are similar. The large values in the subtropics and polar regions may indicate seasonal trends in temperature and height that will be more prominent in 10-day differences than in 1-day differences. Similar figures for specific humidity (not shown) indicate that throughout much of the midlatitudes, 10-day variability is no larger than 1-day variability. In contrast, there are subtropical–tropical regions where the 10-day variability is over 60% larger than the 1-day variability, perhaps reflecting convectively coupled equatorial waves or seasonal trends.
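The contrast between storm-track points dominated by day-to-day variability and subtropical or polar points dominated by slow trends can be mimicked with two synthetic time series (a toy illustration of the percent-difference diagnostic, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400

def lag_msd(s, i):
    """Mean square difference between values i days apart."""
    return np.mean((s[i:] - s[:-i]) ** 2)

def pct_excess(s):
    """Percent by which 10-day variability exceeds 1-day variability."""
    return 100.0 * (lag_msd(s, 10) - lag_msd(s, 1)) / lag_msd(s, 1)

# Storm-track-like point: high-frequency noise only, so 10-day differences
# are no larger than 1-day differences and the percent excess is small.
noise = rng.standard_normal(n)

# Subtropical/polar-like point: a slow, seasonal-trend-like drift dominates
# the 10-day differences, giving a large percent excess.
t = np.arange(n, dtype=float)
trend = 0.5 * t + 0.2 * rng.standard_normal(n)

print(f"noise: {pct_excess(noise):6.1f}%   trend: {pct_excess(trend):8.1f}%")
```

For the pure-noise series the 1-day and 10-day mean square differences are statistically identical, while for the trending series the 10-day differences are dominated by the drift, mirroring the spatial contrast described above.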
To highlight differences between analysis and forecast temporal variability, Fig. 9 shows the percent difference between the forecast temporal variability and the corresponding analysis temporal variability.
Fig. 9. Diagnostics calculated for 500-hPa geopotential height: percent difference between forecast and analysis temporal variability (full caption truncated in source).
The same plots shown for 500-hPa geopotential height in Fig. 9 are shown for 500-hPa specific humidity in Fig. 10. These figures indicate that the area-averaged increasing trend for CMC and decreasing trend for NCEP are not spatially uniform. In particular, while the temporal variability decreases over most of the deep tropics in the NCEP forecasts, there are areas in the northern subtropics, notably the western Atlantic, where values increase. The CMC system shows substantial increases in temporal variability over much of the globe, which, as noted before, may be related to the smoothing effect of the digital filter on the analysis and subsequent spinup (Buehner et al. 2015). Examination of the moisture biases for these forecasts (not shown) does not reveal a straightforward correspondence between higher temporal variability and a high specific humidity bias. It is interesting that some of the areas of enhanced variability, such as the Caribbean Sea and the western and central North Pacific subtropics, are common to both the NCEP and CMC systems, indicating that both forecast models tend to overestimate temporal variability in these regions when compared to the initial states.
Fig. 10. As in Fig. 9, but for 500-hPa specific humidity.
As noted previously, the CMC ensemble system underwent a significant upgrade in February 2013, after which the area-averaged diagnostic characteristics for the perturbed ensemble member changed considerably (Figs. 4–7). Figure 11 shows the corresponding percent differences for 500-hPa temperature for each period separately.
Fig. 11. Diagnostics calculated for 500-hPa temperature: percent differences (full caption truncated in source).
4. Summary and conclusions
We use simple diagnostics to quantify the temporal variability in analyses and forecasts, based on mean square differences between fields 1–10 days apart, along with the corresponding persistence and forecast errors.
The diagnostics also clearly reflect the upgrade in the CMC system on 13 February 2013. Before the upgrade, the perturbed-member diagnostics exhibited larger temporal variability than afterward, consistent with the modifications to the PTP scheme.
An advantage of these diagnostics is the ability to assess forecast temporal variability on different time scales without the need for very long forecast integrations. For example, the locations of the maxima in height field variability (Fig. 8) shift or extend from the North Atlantic and North Pacific jet regions for i = 1 downstream to northern Europe and the eastern North Pacific for i = 10. These shifts are consistent with patterns found in temporal filtering diagnostics of analysis time series that differentiate between regions of synoptic variability and blocking (e.g., Blackmon et al. 1977; Lau and Nath 1987; Cai and Van Den Dool 1991) using low-pass (>10 day) and bandpass (7–90 and 8–64 day) filters that could not be applied to the 10-day forecast integrations considered here.
Diagnostics measuring temporal variability are complementary to other diagnostics, such as those that focus on time-mean quantities or model bias (e.g., Klocke and Rodwell 2014), spatial scale-separation techniques (e.g., Harris et al. 2001), and techniques to quantify differences between forecast fields and reality as represented on the scales resolved by the data-assimilation and forecast systems (e.g., Peña and Toth 2014). Using diagnostics to assess the accuracy of both temporal and spatial variability will become increasingly important as stochastic techniques to account for model uncertainty proliferate in ensemble forecasting systems, as both spatial and temporal correlations are often parameters in these schemes that need to be tuned. Potential future work includes consideration of other forecast systems, as well as an extension to a comparison with observations.
Acknowledgments
This research is supported by the Chief of Naval Research through the NRL Base Program, PE 0601153N. The forecasts were obtained from the THORPEX Interactive Grand Global Ensemble (TIGGE) data portal at ECMWF. We thank three anonymous reviewers for very conscientious critiques that have helped us improve the manuscript.
REFERENCES
Blackmon, M. L., J. M. Wallace, N.-C. Lau, and S. L. Mullen, 1977: An observational study of the Northern Hemisphere wintertime circulation. J. Atmos. Sci., 34, 1040–1053, doi:10.1175/1520-0469(1977)034<1040:AOSOTN>2.0.CO;2.
Bougeault, P., and Coauthors, 2010: The THORPEX Interactive Grand Global Ensemble. Bull. Amer. Meteor. Soc., 91, 1059–1072, doi:10.1175/2010BAMS2853.1.
Briggs, W. M., and R. A. Levine, 1997: Wavelets and field forecast verification. Mon. Wea. Rev., 125, 1329–1341, doi:10.1175/1520-0493(1997)125<1329:WAFFV>2.0.CO;2.
Buehner, M., and Coauthors, 2015: Implementation of deterministic weather forecasting systems based on ensemble-variational data assimilation at Environment Canada. Part I: The global system. Mon. Wea. Rev., 143, 2532–2559, doi:10.1175/MWR-D-14-00354.1.
Cai, M., and H. M. Van Den Dool, 1991: Low-frequency waves and traveling storm tracks. Part I: Barotropic component. J. Atmos. Sci., 48, 1420–1436, doi:10.1175/1520-0469(1991)048<1420:LFWATS>2.0.CO;2.
Casati, B., 2010: New developments of the intensity-scale technique within the spatial verification methods intercomparison project. Wea. Forecasting, 25, 113–143, doi:10.1175/2009WAF2222257.1.
Charron, M., G. Pellerin, L. Spacek, P. L. Houtekamer, N. Gagnon, H. L. Mitchell, and L. Michelin, 2010: Toward random sampling of model error in the Canadian Ensemble Prediction System. Mon. Wea. Rev., 138, 1877–1901, doi:10.1175/2009MWR3187.1.
Côté, J., S. Gravel, A. Méthot, A. Patoine, M. Roch, and A. Staniforth, 1998a: The Operational CMC–MRB Global Environmental Multiscale (GEM) model. Part I: Design considerations and formulation. Mon. Wea. Rev., 126, 1373–1395, doi:10.1175/1520-0493(1998)126<1373:TOCMGE>2.0.CO;2.
Côté, J., J.-G. Desmarais, S. Gravel, A. Méthot, A. Patoine, M. Roch, and A. Staniforth, 1998b: The Operational CMC–MRB Global Environmental Multiscale (GEM) model. Part II: Results. Mon. Wea. Rev., 126, 1397–1418, doi:10.1175/1520-0493(1998)126<1397:TOCMGE>2.0.CO;2.
Dalcher, A., and E. Kalnay, 1987: Error growth and predictability in operational ECMWF forecasts. Tellus, 39A, 474–491, doi:10.1111/j.1600-0870.1987.tb00322.x.
Gagnon, N., and Coauthors, 2013: Improvements to the Global Ensemble Prediction System (GEPS) from version 2.0.3 to version 3.0.0. Environment Canada, 49 pp. [Available online at http://collaboration.cmc.ec.gc.ca/cmc/cmoi/product_guide/docs/lib/op_systems/doc_opchanges/technote_geps300_20130213_e.pdf.]
Gilleland, E., D. Ahijevych, B. G. Brown, B. Casati, and E. E. Ebert, 2009: Intercomparison of spatial forecast verification methods. Wea. Forecasting, 24, 1416–1430, doi:10.1175/2009WAF2222269.1.
Gilleland, E., D. Ahijevych, B. G. Brown, and E. E. Ebert, 2010: Verifying forecasts spatially. Bull. Amer. Meteor. Soc., 91, 1365–1373, doi:10.1175/2010BAMS2819.1.
Girard, C., and Coauthors, 2014: Staggered vertical discretization of the Canadian Environmental Multiscale (GEM) model using a coordinate of the log-hydrostatic-pressure type. Mon. Wea. Rev., 142, 1183–1196, doi:10.1175/MWR-D-13-00255.1.
Han, J., and H.-L. Pan, 2011: Revision of convection and vertical diffusion schemes in the NCEP Global Forecast System. Wea. Forecasting, 26, 520–533, doi:10.1175/WAF-D-10-05038.1.
Harris, D., E. Foufoula-Georgiou, K. K. Droegemeier, and J. J. Levit, 2001: Multiscale statistical properties of a high-resolution precipitation forecast. J. Hydrometeor., 2, 406–418, doi:10.1175/1525-7541(2001)002<0406:MSPOAH>2.0.CO;2.
Hou, D., Z. Toth, and Y. Zhu, 2006: A stochastic parameterization scheme within NCEP global ensemble forecast system. 18th Conf. on Probability and Statistics in the Atmospheric Sciences, Atlanta, GA, Amer. Meteor. Soc., 4.5. [Available online at https://ams.confex.com/ams/Annual2006/techprogram/paper_101401.htm.]
Hou, D., Z. Toth, Y. Zhu, and W. Yang, 2008: Impact of a stochastic perturbation scheme on NCEP global ensemble forecast system. 19th Conf. on Probability and Statistics, New Orleans, LA, Amer. Meteor. Soc., 1.1. [Available online at https://ams.confex.com/ams/88Annual/techprogram/paper_134165.htm.]
Houtekamer, P. L., X. Deng, H. L. Mitchell, S.-J. Baek, and N. Gagnon, 2014: Higher resolution in an operational ensemble Kalman filter. Mon. Wea. Rev., 142, 1143–1162, doi:10.1175/MWR-D-13-00138.1.
Judd, K., C. A. Reynolds, T. E. Rosmond, and L. A. Smith, 2008: The geometry of model error. J. Atmos. Sci., 65, 1749–1772, doi:10.1175/2007JAS2327.1.
Jung, T., 2005: Systematic errors of the atmospheric circulation in the ECMWF forecasting system. Quart. J. Roy. Meteor. Soc., 131, 1045–1073, doi:10.1256/qj.04.93.
Klocke, D., and M. J. Rodwell, 2014: A comparison of two numerical weather prediction methods for diagnosing fast-physics errors in climate models. Quart. J. Roy. Meteor. Soc., 140, 517–524, doi:10.1002/qj.2172.
Lau, N.-C., and M. J. Nath, 1987: Frequency dependence of the structure and temporal development of wintertime tropospheric fluctuations—Comparison of a GCM simulation with observations. Mon. Wea. Rev., 115, 251–271, doi:10.1175/1520-0493(1987)115<0251:FDOTSA>2.0.CO;2.
Leith, C. E., 1978: Objective methods for weather prediction. Annu. Rev. Fluid Mech., 10, 107–128, doi:10.1146/annurev.fl.10.010178.000543.
Lin, J.-L., and Coauthors, 2006: Tropical intraseasonal variability in 14 IPCC AR4 climate models. Part I: Convective signals. J. Climate, 19, 2665–2690, doi:10.1175/JCLI3735.1.
Lorenz, E. N., 1982: Atmospheric predictability experiments with a large numerical model. Tellus, 34, 505–513, doi:10.1111/j.2153-3490.1982.tb01839.x.
Magnusson, L., and E. Källén, 2013: Factors influencing skill improvements in the ECMWF forecasting system. Mon. Wea. Rev., 141, 3142–3153, doi:10.1175/MWR-D-12-00318.1.
Nastrom, G. D., and K. S. Gage, 1985: A climatology of atmospheric wavenumber spectra of wind and temperature observed by commercial aircraft. J. Atmos. Sci., 42, 950–960, doi:10.1175/1520-0469(1985)042<0950:ACOAWS>2.0.CO;2.
Peña, M., and Z. Toth, 2014: Estimation of analysis and forecast error variances. Tellus, 66A, 21767, doi:10.3402/tellusa.v66.21767.
Reynolds, C., and R. Gelaro, 1997: The effect of model bias on the equatorward propagation of extratropical waves. Mon. Wea. Rev., 125, 3249–3265, doi:10.1175/1520-0493(1997)125<3249:TEOMBO>2.0.CO;2.
Reynolds, C., P. J. Webster, and E. Kalnay, 1994: Random error growth in NMC’s global forecasts. Mon. Wea. Rev., 122, 1281–1305, doi:10.1175/1520-0493(1994)122<1281:REGING>2.0.CO;2.
Savijarvi, H., 1995: Error growth in a large numerical forecast system. Mon. Wea. Rev., 123, 212–221, doi:10.1175/1520-0493(1995)123<0212:EGIALN>2.0.CO;2.
Simmons, A., and A. Hollingsworth, 2002: Some aspects of the improvement in skill of numerical weather prediction. Quart. J. Roy. Meteor. Soc., 128, 647–677, doi:10.1256/003590002321042135.
Simmons, A., R. Mureau, and T. Petroliagis, 1995: Error growth and estimates of predictability from the ECMWF forecasting system. Quart. J. Roy. Meteor. Soc., 121, 1739–1771, doi:10.1002/qj.49712152711.
Skamarock, W. C., 2004: Evaluating mesoscale NWP models using kinetic energy spectra. Mon. Wea. Rev., 132, 3019–3032, doi:10.1175/MWR2830.1.
Stroe, R., and J.-F. Royer, 1993: Comparison of different error growth formulas and predictability estimation in numerical extended-range forecasts. Ann. Geophys., 11, 296–316.
Toth, Z., and M. Peña, 2007: Data assimilation and numerical forecasting with imperfect models: The mapping paradigm. Physica D, 230, 146–158, doi:10.1016/j.physd.2006.08.016.
Trenberth, K. E., 1981: Observed Southern Hemisphere eddy statistics at 500 mb: Frequency and spatial dependence. J. Atmos. Sci., 38, 2585–2605, doi:10.1175/1520-0469(1981)038<2585:OSHESA>2.0.CO;2.
Tustison, B., D. Harris, and E. Foufoula-Georgiou, 2001: Scale issues in verification of precipitation forecast. J. Geophys. Res., 106, 11 775–11 784, doi:10.1029/2001JD900066.
Wei, M., Z. Toth, R. Wobus, and Y. Zhu, 2008: Initial perturbations based on the ensemble transform (ET) technique in the NCEP global operational forecast system. Tellus, 60A, 62–79, doi:10.1111/j.1600-0870.2007.00273.x.
Wheeler, M., and G. N. Kiladis, 1999: Convectively coupled equatorial waves: Analysis of clouds and temperature in the wavenumber–frequency domain. J. Atmos. Sci., 56, 374–399, doi:10.1175/1520-0469(1999)056<0374:CCEWAO>2.0.CO;2.
This decomposition is similar to that employed in Peña and Toth (2014), in which perceived forecast error variances can be partitioned into true analysis errors and true forecast errors plus a term with the correlation of the two errors [their Eqs. (1) and (2)]. Their true analysis errors are analysis minus truth on the model grid and are, thus, analogous to the true filtered states referred to here.