Search Results
- Author or Editor: Huug van den Dool
Abstract
The time-mean tropical surface momentum balance is investigated with a simple model that calculates tropical surface winds from time mean sea level pressure fields. The model domain is the global tropical strip centered on the equator with lateral boundaries at ±30° latitude. Steady state surface winds are numerically calculated from the nonlinear horizontal momentum equations, with forcing from observed climatological monthly mean sea level pressures and prescribed lateral boundary winds. Dissipation is parameterized by linear damping and diffusion. Comparisons of model winds with observed climatological monthly mean winds show realistic simulations in most regions and in all months. The poorest simulations occur in the meridional component of the wind in near-equatorial areas of strongly convergent or weak winds. In these areas, and in the near-equatorial region generally, diffusion processes make a significant positive contribution to the realism of the model winds. Horizontal nonlinear advection also improves the simulation near the equator, though to a smaller degree. The generally skillful model winds refute the conventional idea that weak gradients make the tropical pressure field a poor tool for calculating tropical winds. To the contrary, tropical pressure fields contain substantial information about associated winds. Thus, a relatively complete momentum balance can be identified for the major features of the time-mean tropical wind field.
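As a rough illustration of how a pressure field can determine steady winds, the sketch below solves only the linear part of such a balance: Coriolis force, pressure gradient force, and linear (Rayleigh) damping. It leaves out the nonlinear advection, diffusion, and lateral boundary winds that the abstract describes, so it is not the paper's model. The function name `damped_wind`, the air density, and the damping rate `eps` are assumptions chosen for illustration.

```python
import math

OMEGA = 7.292e-5   # Earth's rotation rate (rad/s)
RHO = 1.2          # assumed near-surface air density (kg/m^3)


def damped_wind(dpdx, dpdy, lat_deg, eps=1.0e-5):
    """Steady winds (u, v) in m/s from the linear balance
         eps*u - f*v = -(1/rho) dp/dx
         eps*v + f*u = -(1/rho) dp/dy
    dpdx, dpdy are pressure gradients in Pa/m; eps is an assumed
    damping rate in 1/s (hypothetical value, not from the paper)."""
    f = 2.0 * OMEGA * math.sin(math.radians(lat_deg))  # Coriolis parameter
    px, py = dpdx / RHO, dpdy / RHO
    denom = eps * eps + f * f
    # Closed-form inverse of the 2x2 system above.
    u = -(eps * px + f * py) / denom
    v = (f * px - eps * py) / denom
    return u, v


if __name__ == "__main__":
    # On the equator f = 0, so the wind blows straight down the pressure gradient.
    print(damped_wind(1.2e-5, 0.0, 0.0))
    print(damped_wind(1.2e-5, 0.0, 15.0))
```

Because the damping term keeps the denominator nonzero at the equator, the solution stays finite where a purely geostrophic estimate would break down.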
Abstract
The North American Multimodel Ensemble (NMME) forecasting system has been continuously producing seasonal forecasts since August 2011. The NMME, with its suite of diverse models, provides a valuable opportunity for characterizing forecast confidence using probabilistic forecasts. The current experimental probabilistic forecast product (in map format) presents the most likely tercile for the seasonal mean value, chosen out of above normal, near normal, or below normal categories, using a nonparametric counting method to determine the probability of each class. The skill of the 3-month-mean probabilistic forecasts of 2-m surface temperature (T2m), precipitation rate, and sea surface temperature is assessed using forecasts from the 29-yr (1982–2010) NMME hindcast database. Three forecast configurations are considered: a full six-model NMME; a “mini-NMME” with 24 members, four each from six models; and the 24-member CFSv2 alone. Skill is assessed on the cross-validated hindcasts using the Brier skill score (BSS); forecast reliability and resolution are also assessed. This study provides a baseline skill assessment of the current method of creating probabilistic forecasts from the NMME system.
For forecasts in the above- and below-normal terciles for all variables and geographical regions examined in this study, BSS for NMME forecasts is higher than BSS for CFSv2 forecasts. Niño-3.4 forecasts from the full NMME and the mini-NMME receive nearly identical BSS that are higher than BSS for CFSv2 forecasts. Even systems with modest BSS, such as T2m in the Northern Hemisphere, have generally high reliability, as shown in reliability diagrams.
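The counting method and the Brier skill score mentioned above can be sketched as follows. This is a minimal illustration, assuming tercile boundaries are already known from a climatology and that the reference forecast is a constant climatological probability of 1/3; the function names and sample numbers are hypothetical and do not come from the NMME product.

```python
def tercile_probs(members, lower, upper):
    """Fraction of ensemble members below, within, and above the tercile bounds."""
    n = len(members)
    below = sum(1 for m in members if m < lower) / n
    above = sum(1 for m in members if m > upper) / n
    return below, 1.0 - below - above, above


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, clim_prob=1.0 / 3.0):
    """BSS = 1 - BS / BS_ref, with a constant climatological reference forecast."""
    bs = brier_score(probs, outcomes)
    bs_ref = brier_score([clim_prob] * len(outcomes), outcomes)
    return 1.0 - bs / bs_ref


if __name__ == "__main__":
    members = [-1.0, -0.5, 0.0, 0.6, 1.2, 2.0]   # made-up standardized anomalies
    print(tercile_probs(members, -0.43, 0.43))
    # Probabilities for the "above normal" class over four made-up forecasts.
    print(brier_skill_score([0.7, 0.2, 0.5, 0.1], [1, 0, 1, 0]))
```

A BSS of 0 means the forecast is no better than always issuing 1/3, and a BSS of 1 corresponds to perfect, fully confident forecasts.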
Abstract
The performance of ridge regression methods for consolidation of multiple seasonal ensemble prediction systems is analyzed. The methods are applied to predict SST in the tropical Pacific based on ensembles from the Development of a European Multimodel Ensemble System for Seasonal-to-Interannual Prediction (DEMETER) models, plus two of NCEP’s operational models. Strategies to increase the ratio of the effective sample size of the training data to the number of coefficients to be fitted are proposed and tested. These strategies include objective selection of a smaller subset of models, pooling of information from neighboring grid points, and consolidating all ensemble members rather than each model’s ensemble average. In all variations of the ridge regression consolidation methods tested, increased effective sample size produces more stable weights and more skillful predictions on independent data. While the scores may not increase significantly as the effective sample size is increased, the benefit is seen in terms of consistent improvements over the simple equal weight ensemble average. In the western tropical Pacific, most consolidation methods tested outperform the simple equal weight ensemble average; in other regions they have similar skill as measured by both the anomaly correlation and the relative operating characteristic. The main obstacles to progress are a short period of data and a lack of independent information among models.
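To make the idea of ridge-regression consolidation concrete, here is a minimal sketch of the textbook ridge estimator applied to model weights: it minimizes the squared error of the weighted forecast plus a penalty `lam` on the squared weights. The specific ridge variants, constraints, and pooling strategies tested in the paper are not reproduced here; the function names and the toy data are assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                      # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_weights(forecasts, obs, lam):
    """forecasts: one anomaly series per model; obs: observed anomaly series.
    Returns w solving (F F^T + lam I) w = F obs."""
    k = len(forecasts)
    gram = [[sum(x * y for x, y in zip(forecasts[i], forecasts[j]))
             + (lam if i == j else 0.0) for j in range(k)] for i in range(k)]
    rhs = [sum(x * y for x, y in zip(forecasts[i], obs)) for i in range(k)]
    return solve(gram, rhs)


if __name__ == "__main__":
    models = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]   # toy anomalies
    observed = [1.0, 1.0, -1.0, -1.0]
    print(ridge_weights(models, observed, 0.0))   # ordinary least squares
    print(ridge_weights(models, observed, 2.0))   # weights shrunk toward zero
```

Increasing `lam` trades a small bias for more stable weights, which matters when the training record is short relative to the number of coefficients.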
Abstract
A probability forecast has advantages over a deterministic forecast as the former offers information about the probabilities of various possible future states of the atmosphere. As physics-based numerical models find their success in modern weather forecasting, an important task is to convert a model forecast, usually deterministic, into a probability forecast. This study explores methods to do such a conversion for NCEP’s operational 500-mb-height forecast and the discussion is extended to ensemble forecasting. Compared with traditional model-based statistical forecast methods such as Model Output Statistics, in which a probability forecast is made from statistical relationships derived from single model-predicted fields and observations, probability forecasts discussed in this study are focused on probability information directly provided by multiple runs of a dynamical model—eleven 0000 UTC runs at T62 resolution.
To convert a single model forecast into a strawman probability forecast (single forecast probability or SFP), a contingency table is derived from historical forecast–verification data. Given a forecast for one of three classes (below, normal, and above the climatological mean), the SFP probabilities are simply the conditional (or relative) frequencies with which each of the three categories is observed over a period of time. These probabilities have good reliability (perfect for dependent data) as long as the model is not changed and maintains the same performance level as before. SFP, however, does not discriminate individual cases and cannot make use of information particular to individual cases. For ensemble forecasts, ensemble probabilities (EP) are calculated as the percentages of the number of members in each category based on the given ensemble samples. This probability specification method fully uses probability information provided by the ensemble. Because of the limited ensemble size, model deficiencies, and possibly unrepresentative samples, EP probabilities are not reliable and appear to be too confident, particularly at forecast leads beyond day 6. The authors have attempted to combine EP with SFP to improve the EP probability (referred to as modified forecast probability). Results show that a simple combination (plain average) can considerably improve upon both the EP and SFP.
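The three probability specifications described above can be sketched in a few lines. Classes are coded 0 (below), 1 (normal), 2 (above). Which single forecast selects the SFP row when combining with an ensemble is not stated in the abstract; the example below assumes the class of one designated member, which is only an illustrative choice. All names and numbers are hypothetical.

```python
def sfp_table(fcst_classes, obs_classes, n_classes=3):
    """Conditional relative frequencies P(observed class | forecast class)
    estimated from historical forecast-verification pairs."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for f, o in zip(fcst_classes, obs_classes):
        counts[f][o] += 1
    table = []
    for row in counts:
        total = sum(row)
        # Fall back to equal odds if a forecast class never occurred.
        table.append([c / total if total else 1.0 / n_classes for c in row])
    return table


def ensemble_probs(member_classes, n_classes=3):
    """EP: fraction of ensemble members falling in each class."""
    n = len(member_classes)
    return [member_classes.count(c) / n for c in range(n_classes)]


def modified_probs(ep, sfp):
    """Plain average of the ensemble and single-forecast probabilities."""
    return [0.5 * (e + s) for e, s in zip(ep, sfp)]


if __name__ == "__main__":
    hist_fcst = [0, 0, 1, 1, 2, 2, 2, 2]    # made-up history
    hist_obs = [0, 1, 1, 1, 2, 2, 1, 2]
    table = sfp_table(hist_fcst, hist_obs)
    ep = ensemble_probs([2] * 9 + [1] * 2)  # 11 members, 9 in "above"
    print(modified_probs(ep, table[2]))     # assumes the single forecast is "above"
```

Averaging pulls the overconfident ensemble probabilities toward the historically calibrated frequencies while keeping some case-specific information.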
Abstract
A simple bias correction method was used to correct daily operational ensemble week-1 and week-2 precipitation and 2-m surface air temperature forecasts from the NCEP Global Forecast System (GFS). The study shows some unexpected and striking features of the forecast errors or biases of both precipitation and 2-m surface air temperature from the GFS. They are dominated by relatively large-scale spatial patterns and low-frequency variations that resemble the annual cycle. A large portion of these forecast errors is removable, but the effectiveness is time and space dependent. The bias-corrected week-1 and week-2 ensemble precipitation and 2-m surface air temperature forecasts indicate some improvements over their raw counterparts. However, the overall levels of week-1 and week-2 forecast skill in terms of spatial anomaly correlation and root-mean-square error are still only modest. The dynamical soil moisture forecasts (i.e., land surface hydrological model forced with bias-corrected precipitation and 2-m surface air temperature integrated forward for up to 2 weeks) have very high skill, but hardly beat persistence over the United States. The inability to outperform persistence mainly relates to the skill of the current GFS week-1 and week-2 precipitation forecasts not being above a threshold (i.e., anomaly correlation > 0.5 is required).
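A simple bias correction and the spatial anomaly correlation used to score it might look like the sketch below. It assumes the bias is the mean forecast-minus-observation error at each grid point over a training sample; the abstract does not specify the training window or any time dependence, so those details are left out. Function names and values are hypothetical.

```python
import math


def bias_field(past_forecasts, past_obs):
    """Mean forecast-minus-observation error at each grid point.
    past_forecasts, past_obs: lists of fields, each a list of grid values."""
    n_times = len(past_forecasts)
    n_points = len(past_forecasts[0])
    return [sum(past_forecasts[t][p] - past_obs[t][p] for t in range(n_times)) / n_times
            for p in range(n_points)]


def correct(forecast, bias):
    """Subtract the estimated bias from a new forecast field."""
    return [f - b for f, b in zip(forecast, bias)]


def anomaly_correlation(forecast, obs, clim):
    """Uncentered spatial correlation of forecast and observed anomalies."""
    fa = [f - c for f, c in zip(forecast, clim)]
    oa = [o - c for o, c in zip(obs, clim)]
    num = sum(a * b for a, b in zip(fa, oa))
    den = math.sqrt(sum(a * a for a in fa) * sum(b * b for b in oa))
    return num / den


if __name__ == "__main__":
    bias = bias_field([[2.0, 3.0], [4.0, 5.0]], [[1.0, 1.0], [3.0, 3.0]])
    print(bias)
    print(correct([5.0, 6.0], bias))
    print(anomaly_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 0.0, 0.0]))
```

The abstract's threshold of an anomaly correlation above 0.5 would be checked against the value returned by `anomaly_correlation` for the corrected precipitation forecasts.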
Abstract
Retrospective forecasts of the new NCEP Climate Forecast System (CFS) have been analyzed out to 45 days from 1999 to 2009 with four members (0000, 0600, 1200, and 1800 UTC) each day. The new version of CFS [CFS, version 2 (CFSv2)] shows significant improvement over the older CFS [CFS, version 1 (CFSv1)] in predicting the Madden–Julian oscillation (MJO), with skill reaching 2–3 weeks in comparison with the CFSv1’s skill of nearly 1 week. Diagnostics of experiments related to the MJO forecast show that the systematic error correction, possible only because of the enormous hindcast dataset and the ensemble aspects of the prediction system (4 times a day), does contribute to improved forecasts. But the main reason is the improvement in the model and initial conditions between 1995 and 2010.
Abstract
Forecast skill and potential predictability of 2-m temperature, precipitation rate, and sea surface temperature are assessed using 29 yr of hindcast data from models included in phase 1 of the North American Multimodel Ensemble (NMME) project. Forecast skill is examined using the anomaly correlation (AC); skill of the bias-corrected ensemble means (EMs) of the individual models and of the NMME 7-model EM are verified against the observed value. Forecast skill is also assessed using the root-mean-square error. The models’ representation of the size of forecast anomalies is also studied. Predictability was considered from two angles: homogeneous, where one model is verified against a single member from its own ensemble, and heterogeneous, where a model’s EM is compared to a single member from another model. This study provides insight both into the physical predictability of the three fields and into the NMME and its contributing models.
Most of the models in the NMME have fairly realistic spread, as represented by the interannual variability. The NMME 7-model forecast skill, verified against observations, is equal to or higher than the individual models’ forecast ACs. Two-meter temperature (T2m) skill matches the highest single-model skill, while precipitation rate and sea surface temperature NMME EM skill is higher than for any single model. Homogeneous predictability is higher than reported skill in all fields, suggesting there may be room for some improvement in model prediction, although there are many regional and seasonal variations. The estimate of potential predictability is not overly sensitive to the choice of model. In general, models with higher homogeneous predictability show higher forecast skill.
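One way to read the homogeneous predictability estimate is as a perfect-model experiment: each ensemble member in turn stands in for the observations and is correlated with the mean of the remaining members. The sketch below follows that reading, which is an assumption about the procedure rather than a description taken from the paper; the function names and sample anomalies are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    return num / math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))


def homogeneous_predictability(ensemble):
    """ensemble: list of members, each a series of anomalies over hindcast years.
    Each member is treated as 'truth' and correlated with the mean of the others;
    the correlations are then averaged."""
    scores = []
    for i, truth in enumerate(ensemble):
        others = [m for j, m in enumerate(ensemble) if j != i]
        mean_others = [sum(vals) / len(vals) for vals in zip(*others)]
        scores.append(correlation(mean_others, truth))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    members = [[0.9, -0.2, 0.4, -1.1],    # made-up anomalies for four years
               [1.1, 0.1, 0.2, -0.8],
               [0.7, -0.4, 0.6, -1.3]]
    print(homogeneous_predictability(members))
```

For the heterogeneous case described in the abstract, the "truth" member would instead be drawn from a different model's ensemble.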
Abstract
A series of 90-day integrations by a low-resolution version (T40) of the National Meteorological Center's global spectral model was analyzed for its performance as well as its low-frequency variability behavior. In particular, 5-day mean 500-mb forecasts with leads up to 88 days were examined and compared with the observations. The forecast mean height decreased rapidly as forecast lead increased. A severe negative bias of the mean height in the Tropics was caused by a negative temperature bias and a drop of the surface pressure of about 2 mb. The forecast variance also dropped rapidly to a minimum of 75% of the atmospheric standard deviation before being stabilized at day 18. The model could not maintain large anomalous flows from the atmospheric initial conditions. However, it is quite capable of generating and maintaining large anomalies after drifting to its own climatology and temporal variability.
At extended ranges, the model showed better skill over the North Pacific than North Atlantic when the season advanced to the colder period of the DERF90 (dynamical extended-range forecasts 1990) experiments. The model also displayed dependence on circulation regimes, although the skill fluctuated widely from day to day in general. Blocking flows in the forecast were found to systematically retrogress to the Baffin Island area from the North Atlantic. Therefore, improvements of the model's systematic errors, including its drift, appear to be essential in order to achieve a higher level of forecast performance. However, no generalization can be made due to the usage of a low-resolution model and the experiments being carried out over a rather short time span, from only 3 May to 6 December 1990.
Abstract
The skill of a set of extended-range dynamical forecasts made with a modern numerical forecast model is examined. A forecast is said to be skillful if it produces a high quality forecast by correctly modeling some aspects of the dynamics of the real atmosphere; high quality forecasts may also occur by chance. The dangers of making a conclusion about model skill by verifying a single long-range forecast are pointed out by examples of apparently high “skill” verifications between extended-range forecasts and observed fields from entirely different years.
To avoid these problems, the entire distribution of forecast quality for a large set of forecasts as a function of lead time is examined. A set of control forecasts that clearly have no skill is presented. The quality distribution for the extended-range forecasts is compared to the distributions of quality for the no-skill control forecast set.
The extended-range forecast quality distributions are found to be essentially indistinguishable from those for the no-skill control at leads somewhat greater than 12 days. A search for individual forecasts with a “return of skill” at extended ranges is also made. Although it is possible to find individual forecasts that have a return of quality, a comparison to the no-skill controls demonstrates that these return of skill forecasts occur only as often as is expected by chance.
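The comparison between a forecast quality distribution and a no-skill control can be sketched as below. The abstract does not say how the control set was built; this example assumes it is formed by verifying each forecast against observed fields from other years, in the spirit of the mismatched-year examples mentioned earlier. Function names, the pattern-correlation score, and the threshold are illustrative assumptions.

```python
import math


def correlation(a, b):
    """Pearson (pattern) correlation between two fields given as flat lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [y - mb for y in b]
    num = sum(x * y for x, y in zip(da, db))
    return num / math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))


def quality_distributions(forecasts, observations):
    """forecasts[i] and observations[i] are fields valid in year i.
    Returns (matched scores, mismatched-year 'no-skill' control scores)."""
    matched = [correlation(f, o) for f, o in zip(forecasts, observations)]
    control = [correlation(f, observations[j])
               for i, f in enumerate(forecasts)
               for j in range(len(observations)) if j != i]
    return matched, control


def fraction_at_least(scores, threshold):
    """Share of scores at or above a quality threshold."""
    return sum(1 for s in scores if s >= threshold) / len(scores)


if __name__ == "__main__":
    fcst = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]   # toy fields
    obs = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
    matched, control = quality_distributions(fcst, obs)
    print(fraction_at_least(matched, 0.5), fraction_at_least(control, 0.5))
```

If the two fractions are similar at a given lead, high-scoring forecasts at that lead occur about as often as chance alone would produce them.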
Abstract
Teleconnection patterns have been extensively investigated, mostly with linear analysis tools. The lesser-known asymmetric characteristics between positive and negative phases of prominent teleconnections are explored here. Substantial disparity between opposite phases can be found. The Pacific–North American (PNA) pattern exhibits a large difference in structure and statistical significance in its downstream action center, showing either a large impact over the U.S. southern third region or over the western North Atlantic Ocean. The North Atlantic–based patterns display significant impacts over the North Atlantic for large positive anomalies and even larger impacts over the European sector for large negative anomalies.
The monthly variance is distributed nearly evenly over the entire North Atlantic basin. A teleconnection pattern based on different regions of the basin has been known to assume different structure and time variations. The extent of statistical significance is investigated for three typical North Atlantic–associated patterns based separately on the eastern (EATL), western (WATL), and southern (SATL) regions of the North Atlantic. The EATL teleconnection pattern is similar to the classical North Atlantic Oscillation (NAO). The WATL pattern, however, is more similar to the Arctic (Annular) Oscillation (AO). The sensitivity of the North Atlantic–based teleconnection to a slight shift in base point can be fairly large: the pattern can be an AO or an NAO, with distinctive significance structure between them. Other discernible features are also presented.
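A simple way to expose asymmetry between phases, which linear correlation maps cannot show, is to composite the field separately for large positive and large negative anomalies at the base point. The sketch below does this with a one-standard-deviation threshold; the compositing approach, the threshold, and all names are assumptions for illustration and may differ from the analysis in the paper.

```python
import math


def phase_composites(base_series, field_series, threshold=1.0):
    """Average the field over times when the standardized base-point anomaly
    exceeds +threshold (positive phase) or falls below -threshold (negative phase).
    field_series[t] is the list of grid-point anomalies at time t."""
    n = len(base_series)
    mean = sum(base_series) / n
    std = math.sqrt(sum((b - mean) ** 2 for b in base_series) / n)
    z = [(b - mean) / std for b in base_series]

    def composite(times):
        return [sum(field_series[t][p] for t in times) / len(times)
                for p in range(len(field_series[0]))]

    pos = [t for t in range(n) if z[t] > threshold]
    neg = [t for t in range(n) if z[t] < -threshold]
    return composite(pos), composite(neg)


if __name__ == "__main__":
    base = [2.0, -2.0, 0.0, 2.0, -2.0, 0.0]               # toy base-point index
    field = [[1.0, 1.0], [-3.0, 0.0], [0.0, 0.0],
             [3.0, 1.0], [-1.0, 0.0], [0.0, 0.0]]         # toy two-point field
    pos, neg = phase_composites(base, field)
    # For a purely linear response the two composites would sum to zero.
    print([p + q for p, q in zip(pos, neg)])
```

A nonzero sum of the two composites at a grid point indicates that the positive and negative phases do not mirror each other there.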