Search Results
You are looking at 1–10 of 13 items for
- Author or Editor: Elizabeth Satterfield
Abstract
The ability of an ensemble to capture the magnitude and spectrum of uncertainty in a local linear space spanned by the ensemble perturbations is assessed. Numerical experiments are carried out with a reduced-resolution 2004 version of the model component of the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS). The local ensemble transform Kalman filter (LETKF) data assimilation system is used to assimilate observations in three steps, gradually adding more realistic features to the observing network. In the first experiment, randomly placed, noisy, simulated vertical soundings, which provide 10% coverage of the horizontal model grid points, are assimilated. Next, the impact of an inhomogeneous observing system is introduced by assimilating simulated observations in the locations of real observations of the atmosphere. Finally, observations of the real atmosphere are assimilated.
The most important findings of this study are the following: predicting the magnitude of the forecast uncertainty and the relative importance of the different patterns of uncertainty is, in general, a more difficult task than predicting the patterns of uncertainty; the ensemble, which is tuned to provide near-optimal performance at analysis time, underestimates not only the total magnitude of the uncertainty, but also the magnitude of the uncertainty that projects onto the space spanned by the ensemble perturbations; and finally, a strong predictive linear relationship is found between the local ensemble spread and the upper bound of the local forecast uncertainty.
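The core diagnostic here, how much of a forecast error vector falls inside the subspace spanned by the ensemble perturbations, is easy to reproduce in miniature. The sketch below is not from the paper; it is a minimal numpy illustration with synthetic data, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 100, 20                    # state dimension, ensemble size (hypothetical)
X = rng.normal(size=(n, k))       # synthetic ensemble perturbations (columns)

err = rng.normal(size=n)          # synthetic forecast error (truth minus ensemble mean)

# Orthonormal basis for the ensemble-spanned subspace via a reduced QR factorization.
Q, _ = np.linalg.qr(X)
err_in_span = Q @ (Q.T @ err)     # projection of the error onto span(X)

total = np.linalg.norm(err)
captured = np.linalg.norm(err_in_span)
print(f"fraction of error magnitude captured by the ensemble subspace: {captured / total:.3f}")
```

With k much smaller than n, the captured fraction is well below one even for a well-tuned ensemble, which is the situation the abstract's second finding describes.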
Abstract
The performance of an ensemble prediction system is inherently flow dependent. This paper investigates the flow dependence of the ensemble performance with the help of linear diagnostics applied to the ensemble perturbations in a small local neighborhood of each model gridpoint location ℓ. A local error covariance matrix 𝗣ℓ is defined for each local region, and the diagnostics are applied to the linear space spanned by the local ensemble perturbations.
Numerical experiments are carried out with an implementation of the local ensemble transform Kalman filter (LETKF) data assimilation system on a reduced-resolution [T62 and 28 vertical levels (T62L28)] version of the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS). Both simulated observations under the perfect-model scenario and observations of the real atmosphere in a realistic setting are used in these experiments. It is found that (i) paradoxically, the linear space spanned by the ensemble perturbations …
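The construction of the local covariance 𝗣ℓ and its spectrum-style diagnostics can be sketched directly from the definition above. The snippet below is an illustration only (synthetic perturbations, hypothetical neighborhood size), not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

k = 20                                   # ensemble size (hypothetical)
local_dim = 27                           # e.g., a 3x3 horizontal stencil times 3 variables
pert = rng.normal(size=(local_dim, k))   # local ensemble perturbations at location l
pert -= pert.mean(axis=1, keepdims=True)

# Sample estimate of the local error covariance matrix P_l.
P_l = pert @ pert.T / (k - 1)

# Eigen-spectrum diagnostic: how is ensemble variance spread across directions?
evals = np.linalg.eigvalsh(P_l)[::-1]
explained = np.cumsum(evals) / evals.sum()
print("variance explained by leading 5 directions:", explained[:5].round(3))
```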
Abstract
A forecast “bust” or “dropout” can be defined as an intermittent but significant loss of model forecast performance. Deterministic forecast dropouts are typically defined in terms of the 500-hPa geopotential height (Φ500) anomaly correlation coefficient (ACC) in the Northern Hemisphere (NH) dropping below a predefined threshold. This study first presents a multimodel comparison of dropouts in the Navy Global Environmental Model (NAVGEM) deterministic forecast with the ensemble control members from the Environment and Climate Change Canada (ECCC) Global Ensemble Prediction System (GEPS) and the National Centers for Environmental Prediction (NCEP) Global Ensemble Forecast System (GEFS). Then, the relationship between dropouts and large-scale pattern variability is investigated, focusing on the temporal variability and correlation of flow indices surrounding dropout events. Finally, three severe dropout events are examined from an ensemble perspective. The main findings of this work are the following: 1) forecast dropouts exhibit some correspondence between models; 2) although forecast dropouts do not have a single cause, the most severe dropouts in NAVGEM can be linked to specific behavior of the large-scale flow indices; that is, they tend to follow periods of rapidly escalating volatility of the flow indices, and they tend to occur during intervals in which the Arctic Oscillation (AO) and Pacific–North American (PNA) indices exhibit unusually strong interdependence; and 3) for the dropout events examined from an ensemble perspective, the NAVGEM ensemble spread does not provide a strong signal of elevated potential for very large forecast errors.
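The ACC-threshold definition of a dropout is mechanical enough to show in a few lines. The sketch below is a generic illustration with synthetic fields; the grid shape and the 0.8 threshold are assumptions for demonstration, not values taken from the paper.

```python
import numpy as np

def anomaly_correlation(fcst, verif, clim):
    """Centered anomaly correlation coefficient over a grid."""
    fa = fcst - clim
    va = verif - clim
    fa -= fa.mean()
    va -= va.mean()
    return float((fa * va).sum() / np.sqrt((fa ** 2).sum() * (va ** 2).sum()))

rng = np.random.default_rng(2)
clim = rng.normal(size=(73, 144))                      # stand-in NH Z500 climatology
verif = clim + rng.normal(scale=50.0, size=clim.shape)  # verifying analysis anomalies
fcst = verif + rng.normal(scale=60.0, size=clim.shape)  # forecast with added error

acc = anomaly_correlation(fcst, verif, clim)
THRESHOLD = 0.8                                         # illustrative dropout threshold
print(f"ACC = {acc:.3f}, dropout = {acc < THRESHOLD}")
```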
Abstract
Ensemble variances provide a prediction of the flow-dependent error variance of the ensemble mean or, possibly, a high-resolution forecast. However, small ensemble size, unaccounted-for model error, and imperfections in ensemble generation schemes cause the predictions of error variance to be imperfect. In previous work, the authors developed an analytic approximation to the posterior distribution of true error variances, given an imperfect ensemble prediction, based on parameters recovered from long archives of innovation and ensemble variance pairs. This paper shows how heteroscedastic postprocessing enables climatological information to be blended with ensemble forecast information when information about the distribution of true error variances given an ensemble sample variance is available. A hierarchy of postprocessing methods is described, each graded on the amount of information about the posterior distribution of error variances used in the postprocessing. These methods are used to assess the value of knowledge of the mean and variance of the posterior distribution of error variances to ensemble postprocessing and to explore sensitivity to various parameter regimes. Testing was performed using both synthetic data and operational ensemble forecasts of a Gaussian-distributed variable, to provide a proof-of-concept demonstration in a semi-idealized framework. Rank frequency histograms, weather roulette, the continuous ranked probability score, and spread-skill diagrams are used to quantify the value of information about the posterior distribution of error variances. It is found that ensemble postprocessing schemes that utilize the full distribution of error variances given the ensemble sample variance outperform those that do not.
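The blending idea can be illustrated in its simplest form: shrink a flow-dependent ensemble variance toward a static climatological variance, with the weight notionally recovered from an archive of (innovation, ensemble variance) pairs. The linear-blend form and all numbers below are assumptions for illustration, not the paper's postprocessing equations.

```python
import numpy as np

rng = np.random.default_rng(3)

sigma2_clim = 1.0    # climatological (static) error variance (hypothetical)
w = 0.6              # weight, notionally estimated from archived statistics (hypothetical)

ens_var = rng.gamma(shape=2.0, scale=0.5, size=5)   # flow-dependent ensemble variances

# Posterior-mean-style blend: shrink each ensemble variance toward climatology.
blended = w * ens_var + (1.0 - w) * sigma2_clim
for ev, bv in zip(ens_var, blended):
    print(f"ensemble variance {ev:.2f} -> blended error variance {bv:.2f}")
```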
Abstract
A conundrum of predictability research is that while the prediction of flow-dependent error distributions is one of its main foci, chaos fundamentally hides flow-dependent forecast error distributions from empirical observation. Empirical estimation of such error distributions requires a large sample of error realizations given the same flow-dependent conditions. However, chaotic elements of the flow and the observing network make it impossible to collect a large enough conditioned error sample to empirically define such distributions and their variance. Such conditional variances are “hidden.” Here, an exposition of the problem is developed from an ensemble Kalman filter data assimilation system applied to a 10-variable nonlinear chaotic model and 25 000 replicate models. The 25 000 replicates reveal the error variances that would otherwise be hidden. It is found that the inverse-gamma distribution accurately approximates the posterior distribution of conditional error variances given an imperfect ensemble variance and provides a reasonable approximation to the prior climatological distribution of conditional error variances. A new analytical model shows how the properties of a likelihood distribution of ensemble variances given a true conditional error variance determine the posterior distribution of error variances given an ensemble variance. The analytically generated distributions are shown to satisfactorily fit empirically determined distributions. The theoretical analysis yields a rigorous interpretation and justification of hybrid error variance models that linearly combine static and flow-dependent estimates of forecast error variance; in doing so, it also helps justify and inform hybrid error covariance models.
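The inverse-gamma approximation named above is straightforward to explore in miniature. The sketch below uses synthetic draws rather than replicate-model output, so it only illustrates the fitting step, not the paper's experiment; all parameter values are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Stand-in for "revealed" conditional error variances from replicate runs.
true_vars = stats.invgamma.rvs(a=4.0, scale=3.0, size=5000, random_state=rng)

# Fit an inverse-gamma with the location parameter pinned at zero.
a_hat, loc, scale_hat = stats.invgamma.fit(true_vars, floc=0.0)
print(f"fitted shape {a_hat:.2f}, scale {scale_hat:.2f}")

# Compare a few empirical and fitted quantiles as a quick goodness check.
for q in (0.25, 0.5, 0.9):
    emp = np.quantile(true_vars, q)
    fit = stats.invgamma.ppf(q, a_hat, loc=0.0, scale=scale_hat)
    print(f"q={q}: empirical {emp:.2f}, fitted {fit:.2f}")
```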
Abstract
The statistics of model temporal variability ought to be the same as those of the filtered version of reality that the model is designed to represent. Here, simple diagnostics are introduced to quantify temporal variability on different time scales and are then applied to NCEP and CMC global ensemble forecasting systems. These diagnostics enable comparison of temporal variability in forecasts with temporal variability in the initial states from which the forecasts are produced. They also allow for an examination of how day-to-day variability in the forecast model changes as forecast integration time increases. Because the error in subsequent analyses will differ, it is shown that forecast temporal variability should lie between corresponding analysis variability and analysis variability minus 2 times the analysis error variance. This expectation is not always met and possible causes are discussed. The day-to-day variability in NCEP forecasts steadily decreases at a slow rate as forecast time increases. In contrast, temporal variability increases during the first few days in the CMC control forecasts, and then levels off, consistent with a spinup of the forecasts starting from overly smoothed analyses. The diagnostics successfully reflect a reduction in the temporal variability of the CMC perturbed forecasts after a system upgrade. The diagnostics also illustrate a shift in variability maxima from storm-track regions for 1-day variability to blocking regions for 10-day variability. While these patterns are consistent with previous studies examining temporal variability on different time scales, they have the advantage of being obtainable without the need for extended (e.g., multimonth) forecast integrations.
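The temporal-variability diagnostic and its stated bounds can be demonstrated on a toy time series. The sketch below is an illustration under assumed error statistics (white analysis errors, strongly persistent forecast errors), not the NCEP/CMC computation itself.

```python
import numpy as np

rng = np.random.default_rng(5)

days = 2000
truth = np.cumsum(rng.normal(size=days))             # toy "filtered reality" time series
analysis = truth + rng.normal(scale=0.3, size=days)  # analyses with white errors

# Toy forecasts whose errors persist from day to day (AR(1), lag-1 correlation 0.9),
# so that the forecast day-to-day differences stay small.
e = np.zeros(days)
for t in range(1, days):
    e[t] = 0.9 * e[t - 1] + rng.normal(scale=0.3 * np.sqrt(1 - 0.9 ** 2))
forecast = truth + e

def temporal_variability(x, lag=1):
    """Mean-square difference between states separated by `lag` days."""
    d = x[lag:] - x[:-lag]
    return float(np.mean(d ** 2))

ana_var = temporal_variability(analysis)
fc_var = temporal_variability(forecast)
r2 = 0.3 ** 2                                        # analysis error variance (known here)

print(f"forecast variability {fc_var:.3f}")
print(f"expected range: [{ana_var - 2 * r2:.3f}, {ana_var:.3f}]")
```

In this toy setup the forecast variability falls inside the expected interval between the analysis variability and the analysis variability minus twice the analysis error variance, mirroring the consistency check described in the abstract.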
Abstract
In Part I of this study, a model of the distribution of true error variances given an ensemble variance is shown to be defined by six parameters that also determine the optimal weights for the static and flow-dependent parts of hybrid error variance models. Two of the six parameters (the climatological mean of forecast error variance and the climatological minimum of ensemble variance) are straightforward to estimate. The other four parameters are (i) the variance of the climatological distribution of the true conditional error variances, (ii) the climatological minimum of the true conditional error variance, (iii) the relative variance of the distribution of ensemble variances given a true conditional error variance, and (iv) the parameter that defines the mean response of the ensemble variances to changes in the true error variance. These parameters are hidden because they are defined in terms of condition-dependent forecast error variance, which is unobservable if the condition is not sufficiently repeatable. Here, a set of equations that enable these hidden parameters to be accurately estimated from a long time series of (observation minus forecast, ensemble variance) data pairs is presented. The accuracy of the equations is demonstrated in tests using data from long data assimilation cycles with differing model error variance parameters as well as synthetically generated data. This newfound ability to estimate these hidden parameters provides new tools for assessing the quality of ensemble forecasts, tuning hybrid error variance models, and postprocessing ensemble forecasts.
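The paper's actual estimation equations are not reproduced in this listing, but the flavor of recovering hidden parameters from a long (observation minus forecast, ensemble variance) archive can be sketched. The snippet below is a generic moment-matching illustration under assumed synthetic distributions, not the authors' method: it bins innovations by ensemble variance and regresses the bin-wise implied error variances on the bin-mean ensemble variance to recover a mean-response slope.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 200_000
true_var = rng.gamma(shape=4.0, scale=0.25, size=n)              # hidden error variances
ens_var = true_var * rng.gamma(shape=8.0, scale=1 / 8, size=n)   # noisy ensemble estimates
obs_err_var = 0.5                                                # assumed known here
innov = rng.normal(scale=np.sqrt(true_var + obs_err_var))        # observation minus forecast

# Within each ensemble-variance bin, the innovation variance estimates
# (mean error variance in the bin) + (observation error variance).
bins = np.quantile(ens_var, np.linspace(0, 1, 21))
idx = np.digitize(ens_var, bins[1:-1])
x = np.array([ens_var[idx == i].mean() for i in range(20)])
y = np.array([innov[idx == i].var() - obs_err_var for i in range(20)])

slope, intercept = np.polyfit(x, y, 1)
print(f"estimated mean response: error variance ~ {slope:.2f} * ensemble variance + {intercept:.2f}")
```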
Abstract
Data assimilation schemes combine observational data with a short-term model forecast to produce an analysis. However, many characteristics of the atmospheric states described by the observations and the model differ. Observations often measure a higher-resolution state than coarse-resolution model grids can describe. Hence, the observations may measure aspects of gradients or unresolved eddies that are poorly resolved by the filtered version of reality represented by the model. This inconsistency, known as observation representation error, must be accounted for in data assimilation schemes. In this paper, the ability of the ensemble to predict the variance of the observation error of representation is explored; it is argued that the portion of representation error detected by the ensemble variance is the portion correlated with the smoothed features that the coarse-resolution forecast model is able to predict. This predictive relationship is explored using differences between model states and their spectrally truncated form, as well as commonly used statistical methods to estimate observation error variances. It is demonstrated that the ensemble variance is a useful predictor of the observation error variance of representation and that it could be used to account for flow dependence in the observation error covariance matrix.
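The "difference between a state and its spectrally truncated form" can be made concrete in one dimension. The sketch below is an illustration only: a toy red-spectrum field is truncated at an assumed wavenumber, and the residual serves as a stand-in for representation error.

```python
import numpy as np

rng = np.random.default_rng(7)

n, keep = 512, 21                       # grid points, retained wavenumbers (hypothetical)
# Toy high-resolution field with a red spectrum.
spec = rng.normal(size=n // 2 + 1) + 1j * rng.normal(size=n // 2 + 1)
spec /= np.arange(1, n // 2 + 2) ** 1.5
field = np.fft.irfft(spec, n=n)

trunc_spec = np.fft.rfft(field)
trunc_spec[keep:] = 0.0                 # zero out the unresolved wavenumbers
smooth = np.fft.irfft(trunc_spec, n=n)  # the "filtered reality" the model can represent

repr_err = field - smooth               # what the coarse model cannot represent
print(f"representation error variance: {repr_err.var():.3e}")
```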
Abstract
A new multiscale, ensemble-based data assimilation (DA) method, the multiscale local gain form ensemble transform Kalman filter (MLGETKF), is introduced. MLGETKF allows simultaneous update of multiple scales for both the ensemble mean and perturbations through assimilating all observations at once. MLGETKF performs DA in independent local volumes, which lends the algorithm a high degree of computational scalability. The multiscale analysis is enabled through the rapid creation of many pseudoensemble perturbations via a multiscale ensemble modulation procedure. The Kalman gain that is used to update the raw background ensemble mean and perturbations is based on this modulated ensemble, which intrinsically includes multiscale model-space localization. Experiments with a noncycled statistical model show that the full background covariance estimated by MLGETKF resembles the shape of the true covariance more accurately than does a covariance estimated with scale-unaware localization. The mean analysis from the best-performing MLGETKF is statistically significantly more accurate than that from the best-performing scale-unaware LGETKF. The accuracy of the MLGETKF analysis is more sensitive to the localization radius of the small-scale band than to that of the large-scale band. MLGETKF is further examined in a cycling DA context with a surface quasigeostrophic model. The root-mean-square potential temperature analysis error of the best-performing MLGETKF is 17.2% lower than that of the best-performing LGETKF. MLGETKF reduces analysis errors measured in kinetic energy spectral space by 30%–80% relative to LGETKF, with the largest improvement at large scales. MLGETKF deterministic and ensemble mean forecasts are more accurate than those of LGETKF for full and large scales up to 5–6-day lead times and for small scales up to 3–4-day lead times, gaining roughly 12 h to 1 day of predictability.
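The modulation step named in the abstract builds on the general modulated-ensemble construction: elementwise products of raw perturbations with scaled eigenvectors of a localization correlation matrix yield an expanded ensemble whose sample covariance approximates the Schur-localized covariance. The sketch below demonstrates that general construction on a toy 1D grid; it is a single-scale illustration under assumed sizes, not the MLGETKF algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(8)

n, k, m = 60, 10, 6                    # state size, ensemble size, retained loc. modes
grid = np.arange(n)

# Gaussian localization correlation matrix and its leading eigenpairs.
L = 10.0
C = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / L) ** 2)
evals, evecs = np.linalg.eigh(C)
evals, evecs = evals[::-1][:m], evecs[:, ::-1][:, :m]

X = rng.normal(size=(n, k))
X -= X.mean(axis=1, keepdims=True)     # raw ensemble perturbations

# Modulation: elementwise product of each scaled eigenvector with each member.
mods = [np.sqrt(ev) * vec[:, None] * X for ev, vec in zip(evals, evecs.T)]
Z = np.concatenate(mods, axis=1)       # n x (m*k) modulated perturbations

# Z Z^T / (k-1) approximates the localized covariance C o (X X^T / (k-1)),
# up to the truncation of C to its m leading modes.
B_loc = C * (X @ X.T / (k - 1))
B_mod = Z @ Z.T / (k - 1)
print("max abs difference:", np.abs(B_loc - B_mod).max())
```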
Abstract
Ensemble postprocessing is frequently applied to correct biases and deficiencies in the spread of ensemble forecasts. Methods involving weighted, regression-corrected forecasts address the typical biases and underdispersion of ensembles through a regression correction of ensemble members followed by the generation of a probability density function (PDF) from the weighted sum of kernels fit around each corrected member. The weighting step accounts for the situation where the ensemble is constructed from different model forecasts or generated in some way that creates ensemble members that do not represent equally likely states. In the present work, it is shown that an overweighting of climatology in weighted, regression-corrected forecasts can occur when one first performs a regression-based correction before weighting each member. This overweighting of climatology results in an increase in the mean-squared error of the mean of the predicted PDF. The overweighting of climatology is illustrated in a simulation study and a real-data study, where the reference is generated through a direct application of Bayes’s rule. The real-data example is a comparison of a particular method referred to as Bayesian model averaging (BMA) and a direct application of Bayes’s rule for ocean wave heights using U.S. Navy and National Weather Service global deterministic forecasts. This direct application of Bayes’s rule is shown to not overweight climatology and may be a low-cost replacement for the generally more expensive weighted, regression-correction methods.
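For Gaussian climatology and a Gaussian forecast-error model, the "direct application of Bayes's rule" reduces to the standard conjugate update, which is worth seeing in closed form. The sketch below is an illustration with hypothetical numbers, not the paper's wave-height configuration.

```python
# Climatological prior for, e.g., wave height at a site (hypothetical numbers).
mu_c, var_c = 2.0, 1.0

# Bias-corrected deterministic forecast treated as an observation of the truth
# with known error variance (also hypothetical).
fcst, var_f = 3.1, 0.25

# Conjugate Gaussian update: precision-weighted combination of prior and forecast.
w = var_c / (var_c + var_f)            # weight on the forecast
mu_post = (1 - w) * mu_c + w * fcst
var_post = var_c * var_f / (var_c + var_f)

print(f"posterior mean {mu_post:.2f} m, posterior variance {var_post:.3f} m^2")
```

Because the posterior weight on climatology is set by the two variances alone, this update cannot overweight climatology in the way the weighted, regression-corrected methods can, which is the comparison the abstract draws.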