# Search Results

## You are looking at 1 - 10 of 48 items for

- Author or Editor: Huug van den Dool

## Abstract

A probability forecast has advantages over a deterministic forecast as the former offers information about the probabilities of various possible future states of the atmosphere. As physics-based numerical models find their success in modern weather forecasting, an important task is to convert a model forecast, usually deterministic, into a probability forecast. This study explores methods to do such a conversion for NCEP’s operational 500-mb-height forecast and the discussion is extended to ensemble forecasting. Compared with traditional model-based statistical forecast methods such as Model Output Statistics, in which a probability forecast is made from statistical relationships derived from single model-predicted fields and observations, probability forecasts discussed in this study are focused on probability information directly provided by multiple runs of a dynamical model—eleven 0000 UTC runs at T62 resolution.

To convert a single model forecast into a strawman probability forecast (single forecast probability or SFP), a contingency table is derived from historical forecast–verification data. Given a forecast for one of three classes (below, normal, and above the climatological mean), the SFP probabilities are simply the conditional (or relative) frequencies at which each of three categories are observed over a period of time. These probabilities have good reliability (perfect for dependent data) as long as the model is not changed and maintains the same performance level as before. SFP, however, does not discriminate individual cases and cannot make use of information particular to individual cases. For ensemble forecasts, ensemble probabilities (EP) are calculated as the percentages of the number of members in each category based on the given ensemble samples. This probability specification method fully uses probability information provided by the ensemble. Because of the limited ensemble size, model deficiencies, and because the samples may be unrepresentative, EP probabilities are not reliable and appear to be too confident, particularly at forecast leads beyond day 6. The authors have attempted to combine EP with SFP to improve the EP probability (referred to as modified forecast probability). Results show that a simple combination (plain average) can considerably improve upon both the EP and SFP.
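The three probability specifications described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the `(forecast_class, observed_class)` input format, and the choice to condition the SFP on the class of one reference forecast are all assumptions made for the example.

```python
from collections import Counter

CLASSES = ("below", "normal", "above")

def sfp_table(history):
    """Conditional observed frequencies for each forecast class.

    history: iterable of (forecast_class, observed_class) pairs taken from
    past forecast-verification data (hypothetical input format).
    """
    counts = {f: Counter() for f in CLASSES}
    for fcst, obs in history:
        counts[fcst][obs] += 1
    table = {}
    for f in CLASSES:
        total = sum(counts[f].values())
        # Fall back to equal odds if a forecast class never occurred.
        table[f] = {o: (counts[f][o] / total if total else 1 / 3) for o in CLASSES}
    return table

def ensemble_probability(member_classes):
    """EP: fraction of ensemble members falling in each class."""
    n = len(member_classes)
    tally = Counter(member_classes)
    return {c: tally[c] / n for c in CLASSES}

def modified_probability(ep, sfp):
    """Plain average of EP and SFP, class by class."""
    return {c: 0.5 * (ep[c] + sfp[c]) for c in CLASSES}

# Made-up history: ten past "above" forecasts and what was observed.
history = [("above", "above")] * 6 + [("above", "normal")] * 3 + [("above", "below")]
sfp = sfp_table(history)["above"]
# Made-up 11-member ensemble: 9 members above, 2 near normal.
ep = ensemble_probability(["above"] * 9 + ["normal"] * 2)
mfp = modified_probability(ep, sfp)
```

In this toy case the ensemble assigns zero probability to the below class, while the averaged probability keeps a small nonzero value there, which is the kind of tempering of overconfident EP values the abstract describes.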

## Abstract

Teleconnection patterns have been extensively investigated, mostly with linear analysis tools. The lesser-known asymmetric characteristics between positive and negative phases of prominent teleconnections are explored here. Substantial disparity between opposite phases can be found. The Pacific–North American (PNA) pattern exhibits a large difference in structure and statistical significance in its downstream action center, showing either a large impact over the U.S. southern third region or over the western North Atlantic Ocean. The North Atlantic–based patterns display significant impacts over the North Atlantic for large positive anomalies and even larger impacts over the European sector for large negative anomalies.

The monthly variance is distributed nearly evenly over the entire North Atlantic basin. A teleconnection pattern based on different regions of the basin has been known to assume different structure and time variations. The extent of statistical significance is investigated for three typical North Atlantic–associated patterns based separately on the eastern (EATL), western (WATL), and southern (SATL) regions of the North Atlantic. The EATL teleconnection pattern is similar to the classical North Atlantic Oscillation (NAO). The WATL pattern, however, is more similar to the Arctic (Annular) Oscillation (AO). The sensitivity of the North Atlantic–based teleconnection to a slight shift in base point can be fairly large: the pattern can be an AO or an NAO, with distinctive significance structure between them. Other discernible features are also presented.

## Abstract

The time-mean tropical surface momentum balance is investigated with a simple model that calculates tropical surface winds from time mean sea level pressure fields. The model domain is the global tropical strip centered on the equator with lateral boundaries at ±30° latitude. Steady state surface winds are numerically calculated from the nonlinear horizontal momentum equations, with forcing from observed climatological monthly mean sea level pressures and prescribed lateral boundary winds. Dissipation is parameterized by linear damping and diffusion. Comparisons of model winds with observed climatological monthly mean winds show realistic simulations in most regions and in all months. The poorest simulations occur in the meridional component of the wind in near-equatorial areas of strongly convergent or weak winds. In these areas, and in the near-equatorial region generally, diffusion processes make a significant positive contribution to the realism of the model winds. Horizontal nonlinear advection also improves the simulation near the equator, though to a smaller degree. The generally skillful model winds refute the conventional idea that weak gradients make the tropical pressure field a poor tool for calculating tropical winds. To the contrary, tropical pressure fields contain substantial information about associated winds. Thus, a relatively complete momentum balance can be identified for the major features of the time-mean tropical wind field.
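To give a feel for how a pressure field constrains the wind even where the Coriolis force vanishes, the sketch below solves a much simpler balance than the one in the abstract: pressure gradient, Coriolis force, and linear damping only. Nonlinear advection and diffusion are deliberately left out, and the damping rate `eps` and the constant density `RHO` are assumed values chosen for illustration.

```python
import math

OMEGA = 7.292e-5   # Earth's rotation rate, s^-1
RHO = 1.2          # near-surface air density, kg m^-3 (assumed constant)

def damped_wind(dpdx, dpdy, lat_deg, eps=1.0 / 86400.0):
    """Steady wind (u, v) in m/s from pressure gradients in Pa/m.

    Solves the linear balance
        eps*u - f*v = -(1/RHO) * dpdx
        f*u + eps*v = -(1/RHO) * dpdy
    where eps (s^-1) is an assumed linear damping rate of one per day.
    """
    f = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    det = eps * eps + f * f
    u = (-eps * dpdx - f * dpdy) / (RHO * det)
    v = (f * dpdx - eps * dpdy) / (RHO * det)
    return u, v

# 1 mb per 1000 km eastward pressure increase, evaluated on the equator.
u_eq, v_eq = damped_wind(1.0e-4, 0.0, 0.0)
```

On the equator the solution reduces to flow straight down the pressure gradient, with speed set by the damping rate, so the result there is only as good as the assumed `eps`.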

## Abstract

The performance of ridge regression methods for consolidation of multiple seasonal ensemble prediction systems is analyzed. The methods are applied to predict SST in the tropical Pacific based on ensembles from the Development of a European Multimodel Ensemble System for Seasonal-to-Interannual Prediction (DEMETER) models, plus two of NCEP’s operational models. Strategies to increase the ratio of the effective sample size of the training data to the number of coefficients to be fitted are proposed and tested. These strategies include objective selection of a smaller subset of models, pooling of information from neighboring grid points, and consolidating all ensemble members rather than each model’s ensemble average. In all variations of the ridge regression consolidation methods tested, increased effective sample size produces more stable weights and more skillful predictions on independent data. While the scores may not increase significantly as the effective sampling size is increased, the benefit is seen in terms of consistent improvements over the simple equal weight ensemble average. In the western tropical Pacific, most consolidation methods tested outperform the simple equal weight ensemble average; in other regions they have similar skill as measured by both the anomaly correlation and the relative operating curve. The main obstacles to progress are a short period of data and a lack of independent information among models.
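As a rough illustration of the idea, the following sketch fits plain ridge-regression weights to a small set of model forecasts. It is not the consolidation scheme tested in the paper: the penalty form (shrinkage toward zero), the helper names, and the toy data are assumptions, and the input values are taken to be anomalies so that no intercept is fitted.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_weights(forecasts, obs, lam):
    """Weights w minimizing sum (obs - F w)^2 + lam * |w|^2.

    forecasts: list of rows, one per training case, one column per model.
    lam: ridge penalty; lam = 0 gives ordinary least squares.
    """
    k = len(forecasts[0])
    gram = [[sum(row[i] * row[j] for row in forecasts) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(forecasts, obs)) for i in range(k)]
    for i in range(k):
        gram[i][i] += lam  # the ridge term stabilizes a near-singular system
    return solve(gram, rhs)

# Toy training set: two models, four cases, obs = 0.7*model1 + 0.3*model2.
F = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [0.7, 0.3, 1.0, 1.7]
w_ols = ridge_weights(F, y, 0.0)
w_ridge = ridge_weights(F, y, 5.0)
```

With only a handful of training cases and many correlated models, the unpenalized system becomes ill-conditioned; the penalty trades a little bias for weights that change less from one training sample to the next, which is the stability the abstract refers to.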

## Abstract

It has been observed by many that skill of categorical forecasts, when decomposed into the contributions from each category separately, tends to be low, if not absent or negative, in the “near normal” (N) category. We have witnessed many discussions as to why it is so difficult to forecast near normal weather, without a satisfactory explanation ever having reached the literature. After presenting some fresh examples, we try to explain this remarkable fact from a number of statistical considerations and from the various definitions of skill. This involves definitions of rms error and skill that are specific for a given anomaly amplitude. There is low skill in the N-class of a 3-category forecast system because a) our forecast methods tend to have an rms error that depends little on forecast amplitude, while the width of the categories for predictands with a near Gaussian distribution is very narrow near the center, and b) it is easier, for the verifying observation, to ‘escape’ from the closed N-class (2-sided escape chance) than from the open ended outer classes. At a different level of explanation, there is lack of skill near the mean because in the definition of skill we compare the method in need of verification to random forecasts as the reference. The latter happens to perform, in the rms sense, best near the mean. Lack of skill near the mean is not restricted to categorical forecasts or to any specific lead time.

Rather than recommending a solution, we caution against the over-interpretation of the notion of skill-by-class. It appears that low skill near the mean is largely a matter of definition and may therefore not require a physical-dynamical explanation. We note that the whole problem is gone when one replaces the random reference forecast by persistence.

We finally note that low skill near the mean has had an element of applying the notion of “forecasting forecast skill” in practice long before it was deduced that we were making a forecast of that skill. We show analytically that as long as the forecast anomaly amplitude is small relative to the forecast rms error, one has to expect the anomaly correlation to increase linearly with forecast magnitude. This has been found empirically by Tracton et al. (1989).
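The narrow-class and two-sided-escape arguments can be made concrete with a small calculation. The sketch below assumes a standardized Gaussian predictand, equally likely terciles, and an observation that is normally distributed around the forecast with an rms error that does not depend on forecast amplitude. These are simplifying assumptions for illustration (regression toward climatology is ignored) and are not taken from the paper's own derivation.

```python
import math

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

TERCILE = 0.4307  # upper tercile boundary of a standard normal, in sigma units

def class_probs(forecast, rms_error):
    """P(below), P(normal), P(above) for obs ~ N(forecast, rms_error^2)."""
    p_below = phi((-TERCILE - forecast) / rms_error)
    p_above = 1.0 - phi((TERCILE - forecast) / rms_error)
    return p_below, 1.0 - p_below - p_above, p_above

# Same rms error (0.8 sigma) for a forecast at the center of the N class
# and for a forecast near the median of the above-normal class.
p_hit_normal = class_probs(0.0, 0.8)[1]
p_hit_above = class_probs(0.97, 0.8)[2]
```

With these numbers the observation stays in the N class about 41% of the time but stays in the above class about 75% of the time, even though the forecast error is identical in both cases.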

## Abstract

A low-resolution version of the National Meteorological Center's global spectral model was used to generate a 10-year set of simulated daily meteorological data. Wintertime low-frequency large-amplitude anomalies were examined and compared with those observed in the real atmosphere. The geographical distributions of the mean and variance of model and real atmosphere show some resemblance. However, careful comparisons reveal distinct regions where short-term climate anomalies prefer to develop. The model's low-frequency anomalies (LFAs) over the North Pacific (North Atlantic) tend to occur about 1500 miles east (southeast) of those observed, locating themselves much closer to the western continents. Because of the displacement of the model's LFA centers, their associated circulation patterns deviate substantially from those observed.

The frequency distributions of the LFAs for both the model and reality display large skewness. The positive and negative large LFAs were, therefore, examined separately, and four-way intercomparisons were conducted between the model, the observed, the positive, and the negative LFAs. The separate analyses resulted in distinguishable circulation patterns between the positive and negative large LFAs, which could not be identified if a linear analysis tool, such as an empirical orthogonal function analysis, were used to extract the most dominant mode of the circulations. Despite pronounced misplacement of large LFAs of both polarities and a general underestimation of their magnitudes, the model does have the capability of persisting its short-term climate anomaly at certain geographical locations. Over the North Pacific, the model's positive LFAs persist as long as or longer than those found in reality, while its negative LFAs persist only one-fourth as long (10 versus 40 days).

The principal storm tracks and mean zonal wind at 250 mb (U250) were also examined to supplement the low-frequency anomaly investigation. Contrasting with observations, the model's U250s display considerable eastward extension and its storm tracks near the jet exit show substantial equatorward displacement over both the North Pacific and the North Atlantic oceans. These model characteristics are consistent with the behavior that the model's large LFAs also prefer to develop over the regions far east and southeast of those observed in the real atmosphere.

## Abstract

A 10-year run was made with a reduced resolution (T40) version of NMC's medium range forecast model. The 12 monthly mean *surface* pressure fields averaged over 10 years are used to study the climatological seasonal redistribution of mass associated with the annual cycle in heating in the model. The vertically integrated divergent mass flux required to account for the surface pressure changes is presented in 2D vector form. The primary outcome is a picture of mass flowing between land and sea on planetary scales. The divergent mass fluxes are small in the Southern Hemisphere and tropics but larger in the midlatitudes of the Northern Hemisphere, although, when expressed as a velocity, nowhere larger than a few millimeters per second. Although derived from a model, the results are interesting because we have described aspects of the global monsoon system that are very difficult to determine from observations.

Two additional features are discussed, one physical, the other due to postprocessing. First, we show that the local imbalance between the mass of precipitation and evaporation implies a divergent water mass flux that is large in the aforementioned context (i.e., cm s⁻¹). Omission of surface pressure tendencies due to the imbalance of evaporation and precipitation (order 10–30 mb per month) may therefore be a serious obstacle in the correct simulation of the annual cycle. Within the context of the model world it is also shown that the common conversion from *surface* to *sea level* pressure creates very large errors in the mass budget over land. In some areas the annual cycles of surface and sea level pressure are 180° out of phase.
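The size of the pressure tendency quoted for the evaporation–precipitation imbalance can be checked with hydrostatic arithmetic. The helper below is a hypothetical illustration, and the water amounts passed to it are example values rather than figures from the paper.

```python
G = 9.80665  # standard gravity, m s^-2

def water_to_pressure_mb(mm_water):
    """Surface pressure change (mb) from adding mm_water of liquid water.

    1 mm of water over 1 m^2 has a mass of 1 kg, so the added weight per
    unit area is mm_water * G pascals; 100 Pa = 1 mb.
    """
    return mm_water * G / 100.0

# Net monthly imbalances of 100 mm and 300 mm (illustrative amounts).
dp_small = water_to_pressure_mb(100.0)
dp_large = water_to_pressure_mb(300.0)
```

A net imbalance of 100 to 300 mm of water per month therefore corresponds to roughly 10 to 30 mb per month, the order of magnitude given in the abstract.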

## Abstract

Retrospective forecasts of the new NCEP Climate Forecast System (CFS) have been analyzed out to 45 days from 1999 to 2009 with four members (0000, 0600, 1200, and 1800 UTC) each day. The new version of CFS [CFS, version 2 (CFSv2)] shows significant improvement over the older CFS [CFS, version 1 (CFSv1)] in predicting the Madden–Julian oscillation (MJO), with skill reaching 2–3 weeks in comparison with the CFSv1’s skill of nearly 1 week. Diagnostics of experiments related to the MJO forecast show that the systematic error correction, possible only because of the enormous hindcast dataset and the ensemble aspects of the prediction system (4 times a day), do contribute to improved forecasts. But the main reason is the improvement in the model and initial conditions between 1995 and 2010.

## Abstract

A special composite technique (“phase shifting” method) that records both the low- and high-frequency transient activity throughout the troposphere in a framework moving with an individual low-frequency wave of 500-mb geopotential height at 50°N was used to document the three-dimensional structure of the planetary-scale low-frequency waves as well as the attendant traveling storm tracks from the NMC twice-daily analyses of geopotential height and temperature at pressure levels 850, 700, 500, 300, and 200 mb for the ten winters 1967/68 through 1976/77.

The following are the main characteristics of the Northern Hemisphere midlatitude planetary-scale low-frequency waves (zonal wavenumber *m* = 1, 2, 3, and 4) in winter: (i) The amplitude of the planetary scale low-frequency waves is nearly constant with the zonal wavenumber *m*, and has a maximum at 300 mb for geopotential height and at 850 mb for temperature; (ii) All low-frequency waves have a nearly equivalent barotropic structure (much more so than the stationary waves); (iii) The instantaneous zonal phase speed of an individual low-frequency wave is nearly independent of height and latitude so that we may identify the three-dimensional structure of a low-frequency wave by following that wave at just one pressure level and one latitude in either geopotential height or temperature.

The traveling storm tracks, defined as the local maxima on the rms map of the phase-shifted high-frequency eddies, are identifiable from both geopotential height and temperature data throughout the troposphere. They are located over the trough regions of the low-frequency waves. The barotropic feedback (i.e., the geopotential tendency due to the vorticity flux) of the traveling storm tracks tends to reinforce the low-frequency waves and to retard their propagation throughout the troposphere. The baroclinic feedback (i.e., the temperature tendency due to the heat flux) of the traveling storm tracks appears to have an out-of-phase relation with the low-frequency waves in temperature from 850 mb to 300 mb. At 200 mb, the baroclinic feedback is nearly in phase with the low-frequency waves in the temperature field.

The mutual dependence between the low-frequency flow and its attendant traveling storm tracks dynamically resembles that between the climatological stationary waves and the climatological storm tracks. Therefore, our observational study seems to lend support to the local instability theory that accounts for the existence of the stationary/traveling storm tracks as the consequence of the zonal inhomogeneity of the climatological mean/low-frequency flow.

## Abstract

The North American Multimodel Ensemble (NMME) forecasting system has been continuously producing seasonal forecasts since August 2011. The NMME, with its suite of diverse models, provides a valuable opportunity for characterizing forecast confidence using probabilistic forecasts. The current experimental probabilistic forecast product (in map format) presents the most likely tercile for the seasonal mean value, chosen out of above normal, near normal, or below normal categories, using a nonparametric counting method to determine the probability of each class. The skill of the 3-month-mean probabilistic forecasts of 2-m surface temperature (T2m), precipitation rate, and sea surface temperature is assessed using forecasts from the 29-yr (1982–2010) NMME hindcast database. Three forecast configurations are considered: a full six-model NMME; a “mini-NMME” with 24 members, four each from six models; and the 24-member CFSv2 alone. Skill is assessed on the cross-validated hindcasts using the Brier skill score (BSS); forecast reliability and resolution are also assessed. This study provides a baseline skill assessment of the current method of creating probabilistic forecasts from the NMME system.

For forecasts in the above- and below-normal terciles for all variables and geographical regions examined in this study, BSS for NMME forecasts is higher than BSS for CFSv2 forecasts. Niño-3.4 forecasts from the full NMME and the mini-NMME have nearly identical BSS, and both exceed the BSS for CFSv2 forecasts. Even systems with modest BSS, such as T2m in the Northern Hemisphere, have generally high reliability, as shown in reliability diagrams.
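
The counting method and the Brier skill score mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the NMME implementation: the abstract gives no formulas, so the function names, the tercile boundaries passed in as `lower` and `upper`, and the use of a climatological probability of 1/3 as the reference forecast are all assumptions made for illustration. Cross-validation of the tercile boundaries is omitted.

```python
def tercile_probs(members, lower, upper):
    """Nonparametric counting: fraction of ensemble members per tercile.

    `lower` and `upper` are assumed tercile boundaries taken from a
    climatology; how they are estimated is outside this sketch.
    Returns (below, near, above) probabilities.
    """
    n = len(members)
    below = sum(1 for x in members if x < lower)
    above = sum(1 for x in members if x > upper)
    near = n - below - above
    return below / n, near / n, above / n


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, p_clim=1.0 / 3.0):
    """BSS relative to a constant climatological forecast (assumed to be 1/3)."""
    bs = brier_score(probs, outcomes)
    bs_ref = brier_score([p_clim] * len(outcomes), outcomes)
    return 1.0 - bs / bs_ref


# Hypothetical 8-member ensemble of standardized seasonal-mean anomalies.
members = [-1.2, -0.5, 0.1, 0.4, 0.9, 1.3, 1.8, 2.0]
below, near, above = tercile_probs(members, lower=-0.4, upper=0.4)

# Hypothetical above-normal probabilities for three seasons and whether
# the above-normal category was observed (1) or not (0).
bss = brier_skill_score([0.5, 0.25, 0.75], [1, 0, 1])
```

A positive BSS means the forecast probabilities beat the constant climatological forecast, and a value of 1 would be a perfect forecast. In a multimodel setting the same counting would be applied to the pooled members of all models.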
