Search Results
You are looking at 21 - 30 of 37 items for
- Author or Editor: David B. Stephenson
Abstract
In the forecasting of binary events, verification measures that are “equitable” were defined by Gandin and Murphy to satisfy two requirements: 1) they award all random forecasting systems, including those that always issue the same forecast, the same expected score (typically zero), and 2) they are expressible as the linear weighted sum of the elements of the contingency table, where the weights are independent of the entries in the table, apart from the base rate. The authors demonstrate that the widely used “equitable threat score” (ETS), as well as numerous others, satisfies neither of these requirements and only satisfies the first requirement in the limit of an infinite sample size. Such measures are referred to as “asymptotically equitable.” In the case of ETS, the expected score of a random forecasting system is always positive and only falls below 0.01 when the number of samples is greater than around 30. Two other asymptotically equitable measures are the odds ratio skill score and the symmetric extreme dependency score, which are more strongly inequitable than ETS, particularly for rare events; for example, when the base rate is 2% and the sample size is 1000, random but unbiased forecasting systems yield an expected score of around −0.5, reducing in magnitude to −0.01 or smaller only for sample sizes exceeding 25 000. This presents a problem since these nonlinear measures have other desirable properties, in particular being reliable indicators of skill for rare events (provided that the sample size is large enough). A potential way to reconcile these properties with equitability is to recognize that Gandin and Murphy’s two requirements are independent, and the second can be safely discarded without losing the key advantages of equitability that are embodied in the first. This enables inequitable and asymptotically equitable measures to be scaled to make them equitable, while retaining their nonlinearity and other properties such as being reliable indicators of skill for rare events. It also opens up the possibility of designing new equitable verification measures.
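As an illustration of why the ETS is only asymptotically equitable, the following minimal Python sketch (not from the paper) computes the ETS from the 2 × 2 contingency counts and estimates, by Monte Carlo, the expected score of an unbiased random forecasting system for a given sample size and base rate. The sample sizes, base rate, and trial count below are arbitrary illustrative choices.

```python
import numpy as np

def ets(hits, false_alarms, misses, correct_negatives):
    """Equitable threat score (Gilbert skill score) from 2x2 contingency counts."""
    n = hits + false_alarms + misses + correct_negatives
    hits_random = (hits + misses) * (hits + false_alarms) / n  # expected hits by chance
    denom = hits + false_alarms + misses - hits_random
    return (hits - hits_random) / denom if denom != 0 else 0.0

def expected_ets_random(n, base_rate, n_trials=10_000, seed=0):
    """Monte Carlo estimate of E[ETS] for an unbiased random forecast of n events."""
    rng = np.random.default_rng(seed)
    obs = rng.random((n_trials, n)) < base_rate
    fcst = rng.random((n_trials, n)) < base_rate   # random forecasts with the same base rate
    a = np.sum(fcst & obs, axis=1)                 # hits
    b = np.sum(fcst & ~obs, axis=1)                # false alarms
    c = np.sum(~fcst & obs, axis=1)                # misses
    d = np.sum(~fcst & ~obs, axis=1)               # correct negatives
    scores = np.array([ets(*counts) for counts in zip(a, b, c, d)])
    return scores.mean()

# The expected ETS of a random system is positive and decays only slowly with n:
for n in (10, 30, 100, 1000):
    print(n, round(expected_ets_random(n, base_rate=0.2), 4))
```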
Abstract
A simple and coherent framework for partitioning uncertainty in multimodel climate ensembles is presented. The analysis of variance (ANOVA) is used to decompose a measure of total variation additively into scenario uncertainty, model uncertainty, and internal variability. This approach requires fewer assumptions than existing methods and can be easily used to quantify uncertainty related to model–scenario interaction—the contribution to model uncertainty arising from the variation across scenarios of model deviations from the ensemble mean. Uncertainty in global mean surface air temperature is quantified as a function of lead time for a subset of the Coupled Model Intercomparison Project phase 3 ensemble and results largely agree with those published by other authors: scenario uncertainty dominates beyond 2050 and internal variability remains approximately constant over the twenty-first century. Both elements of model uncertainty, due to scenario-independent and scenario-dependent deviations from the ensemble mean, are found to increase with time. Estimates of model deviations that arise as by-products of the framework reveal significant differences between models that could lead to a deeper understanding of the sources of uncertainty in multimodel ensembles. For example, three models show a diverging pattern over the twenty-first century, while another model exhibits an unusually large variation among its scenario-dependent deviations.
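The additive decomposition described above can be sketched in a few lines of Python. The synthetic array below is a stand-in for multimodel projections (models × scenarios × runs), and the sums of squares follow the standard balanced two-way ANOVA identity rather than the paper's exact estimators.

```python
import numpy as np

# Hypothetical array of climate responses at one lead time:
# x[i, j, k] = projection from model i, scenario j, ensemble run k.
rng = np.random.default_rng(1)
n_models, n_scenarios, n_runs = 5, 3, 4
x = rng.normal(size=(n_models, n_scenarios, n_runs)) + \
    np.arange(n_scenarios)[None, :, None] * 0.5        # add a scenario signal

grand = x.mean()
model_mean = x.mean(axis=(1, 2))          # per-model means
scen_mean = x.mean(axis=(0, 2))           # per-scenario means
cell_mean = x.mean(axis=2)                # model-scenario cell means

# Balanced two-way ANOVA sums of squares
ss_model = n_scenarios * n_runs * np.sum((model_mean - grand) ** 2)
ss_scen = n_models * n_runs * np.sum((scen_mean - grand) ** 2)
ss_inter = n_runs * np.sum((cell_mean - model_mean[:, None]
                            - scen_mean[None, :] + grand) ** 2)
ss_internal = np.sum((x - cell_mean[:, :, None]) ** 2)
ss_total = np.sum((x - grand) ** 2)

print("scenario:", ss_scen, "model:", ss_model,
      "model-scenario interaction:", ss_inter, "internal:", ss_internal)
print("additivity check:", np.isclose(ss_total, ss_scen + ss_model + ss_inter + ss_internal))
```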
Abstract
Reliable estimates of future changes in extreme weather phenomena, such as tropical cyclone maximum wind speeds, are critical for climate change impact assessments and the development of appropriate adaptation strategies. However, global and regional climate model outputs are often too coarse for direct use in these applications, with variables such as wind speed having truncated probability distributions compared to those of observations. This poses two problems: How can model-simulated variables best be adjusted to make them more realistic? And how can such adjustments be used to make more reliable predictions of future changes in their distribution?
This study investigates North Atlantic tropical cyclone maximum wind speeds from observations (1950–2010) and regional climate model simulations (1995–2005 and 2045–55 at 12- and 36-km spatial resolutions). The wind speed distributions in these datasets are well represented by the Weibull distribution, albeit with different scale and shape parameters.
A power-law transfer function is used to recalibrate the Weibull variables and obtain future projections of wind speeds. Two different strategies, bias correction and change factor, are tested by using 36-km model data to predict future 12-km model data (pseudo-observations). The strategies are also applied to the observations to obtain likely predictions of the future distributions of wind speeds. The strategies yield similar predictions of likely changes in the fraction of events within Saffir–Simpson categories—for example, an increase from 21% (1995–2005) to 27%–37% (2045–55) for category 3 or above events and an increase from 1.6% (1995–2005) to 2.8%–9.8% (2045–55) for category 5 events.
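One way a power-law transfer function arises is from quantile mapping between two fitted Weibull distributions: matching the CDFs reduces to y = scale_to · (x / scale_from)^(shape_from / shape_to). The Python sketch below uses this mapping with synthetic wind speeds; the parameter values are illustrative assumptions, not the study's data, and whether this matches the study's exact calibration is an assumption here.

```python
import numpy as np
from scipy.stats import weibull_min

# Hypothetical samples standing in for observed and model-simulated
# tropical cyclone maximum wind speeds (m/s).
obs = weibull_min.rvs(c=1.8, scale=50.0, size=500, random_state=1)
model = weibull_min.rvs(c=2.3, scale=40.0, size=500, random_state=2)

# Fit two-parameter Weibull distributions (location fixed at zero).
shape_obs, _, scale_obs = weibull_min.fit(obs, floc=0)
shape_mod, _, scale_mod = weibull_min.fit(model, floc=0)

def power_law_transfer(x, scale_from, shape_from, scale_to, shape_to):
    """Quantile mapping between two Weibull distributions; matching the CDFs
    reduces to a power law: y = scale_to * (x / scale_from) ** (shape_from / shape_to)."""
    return scale_to * (x / scale_from) ** (shape_from / shape_to)

# Bias-correction strategy: map model-simulated wind speeds onto the observed
# distribution. (A change-factor strategy would instead transform the observations
# using the change between present and future model fits.)
corrected = power_law_transfer(model, scale_mod, shape_mod, scale_obs, shape_obs)
print("means (model, corrected, obs):", model.mean(), corrected.mean(), obs.mean())
```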
Abstract
Future climate change projections are often derived from ensembles of simulations from multiple global circulation models using heuristic weighting schemes. This study provides a more rigorous justification for this by introducing a nested family of three simple analysis of variance frameworks. Statistical frameworks are essential in order to quantify the uncertainty associated with the estimate of the mean climate change response.
The most general framework yields the “one model, one vote” weighting scheme often used in climate projection. However, a simpler additive framework is found to be preferable when the climate change response is not strongly model dependent. In such situations, the weighted multimodel mean may be interpreted as an estimate of the actual climate response, even in the presence of shared model biases.
Statistical significance tests are derived to choose the most appropriate framework for specific multimodel ensemble data. The framework assumptions are explicit and can be checked using simple tests and graphical techniques. The frameworks can be used to test for evidence of nonzero climate response and to construct confidence intervals for the size of the response.
The methodology is illustrated by application to North Atlantic storm track data from the Coupled Model Intercomparison Project phase 5 (CMIP5) multimodel ensemble. Despite large variations in the historical storm tracks, the cyclone frequency climate change response is not found to be model dependent over most of the region. This gives high confidence in the response estimates. Statistically significant decreases in cyclone frequency are found on the flanks of the North Atlantic storm track and in the Mediterranean basin.
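A schematic sketch of the weighting question, not the paper's frameworks or significance tests: it contrasts the equal-model ("one model, one vote") mean with a run-weighted pooled mean, and uses a one-way ANOVA F test as a stand-in for a formal test of model dependence of the climate change response. The run counts and response values are hypothetical.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical climate change responses (future minus historical) for the
# runs of several models; unequal numbers of runs per model are allowed.
responses = [np.array([1.2, 1.4, 1.3]),      # model A runs
             np.array([1.1, 1.0]),           # model B runs
             np.array([1.5, 1.6, 1.4, 1.5])] # model C runs

# "One model, one vote": each model's mean response gets equal weight.
one_vote_mean = np.mean([r.mean() for r in responses])

# Run-weighted pooled mean, one natural alternative when the response
# is not strongly model dependent.
pooled_mean = np.concatenate(responses).mean()

# One-way ANOVA F test for a model-dependent response: a small p value
# points toward the more general framework, a large one supports the
# simpler additive framework.
f_stat, p_value = f_oneway(*responses)
print(one_vote_mean, pooled_mean, f_stat, p_value)
```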
Abstract
The response of North Atlantic and European extratropical cyclones to climate change is investigated in the climate models participating in phase 5 of the Coupled Model Intercomparison Project (CMIP5). In contrast to previous multimodel studies, a feature-tracking algorithm is here applied to separately quantify the responses in the number, the wind intensity, and the precipitation intensity of extratropical cyclones. Moreover, a statistical framework is employed to formally assess the uncertainties in the multimodel projections. Under the midrange representative concentration pathway (RCP4.5) emission scenario, the December–February (DJF) response is characterized by a tripolar pattern over Europe, with an increase in the number of cyclones in central Europe and a decreased number in the Norwegian and Mediterranean Seas. The June–August (JJA) response is characterized by a reduction in the number of North Atlantic cyclones along the southern flank of the storm track. The total number of cyclones decreases in both DJF (−4%) and JJA (−2%). Classifying cyclones according to their intensity indicates a slight basinwide reduction in the number of cyclones associated with strong winds, but an increase in those associated with strong precipitation. However, in DJF, a slight increase in the number and intensity of cyclones associated with strong wind speeds is found over the United Kingdom and central Europe. The results are confirmed under the high-emission RCP8.5 scenario, where the signals tend to be larger. The sources of uncertainty in these projections are discussed.
Abstract
This study uses a Granger causality time series modeling approach to quantitatively diagnose the feedback of daily sea surface temperatures (SSTs) on daily values of the North Atlantic Oscillation (NAO) as simulated by a realistic coupled general circulation model (GCM). Bivariate vector autoregressive time series models are carefully fitted to daily wintertime SST and NAO time series produced by a 50-yr simulation of the Third Hadley Centre Coupled Ocean–Atmosphere GCM (HadCM3). The approach demonstrates that there is a small yet statistically significant feedback of SSTs on the NAO. The SST tripole index is found to provide additional predictive information for the NAO beyond that available from past values of NAO alone—the SST tripole is Granger causal for the NAO. Careful examination of local SSTs reveals that much of this effect arises from SSTs in the region of the Gulf Stream, especially south of Cape Hatteras. The effect of SSTs on NAO is responsible for the slower-than-exponential decay in lag-autocorrelations of NAO notable at lags longer than 10 days. The persistence induced in daily NAO by SSTs causes long-term means of NAO to have more variance than expected from averaging NAO noise if there is no feedback of the ocean on the atmosphere. There are greater long-term trends in NAO than can be expected from aggregating just short-term atmospheric noise, and NAO is potentially predictable provided that future SSTs are known. For example, there is about 10%–30% more variance in seasonal wintertime means of NAO and almost 70% more variance in annual means of NAO due to SST effects than one would expect if NAO were a purely atmospheric process.
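A minimal sketch of a Granger causality test based on a bivariate VAR, here applied to synthetic daily series standing in for the NAO and SST tripole indices; the weak SST-to-NAO coefficient is an arbitrary illustration, not an estimate from HadCM3.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Synthetic stand-ins for daily wintertime NAO and SST tripole indices.
rng = np.random.default_rng(3)
n = 2000
sst = np.zeros(n)
nao = np.zeros(n)
for t in range(1, n):
    sst[t] = 0.98 * sst[t - 1] + rng.normal(scale=0.1)             # slowly varying ocean
    nao[t] = 0.80 * nao[t - 1] + 0.05 * sst[t - 1] + rng.normal()  # weak SST feedback

data = pd.DataFrame({"NAO": nao, "SST": sst})

# Fit a bivariate vector autoregression; lag order chosen by AIC.
results = VAR(data).fit(maxlags=15, ic="aic")

# Granger causality test: do past SST values add predictive information
# for NAO beyond that available from past NAO values alone?
test = results.test_causality(caused="NAO", causing="SST", kind="f")
print(test.summary())
```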
Abstract
This study describes a systematic approach to selecting optimal statistical recalibration methods and hindcast designs for producing reliable probability forecasts on seasonal-to-decadal time scales. A new recalibration method is introduced that includes adjustments for both unconditional and conditional biases in the mean and variance of the forecast distribution and linear time-dependent bias in the mean. The complexity of the recalibration can be systematically varied by restricting the parameters. Simple recalibration methods may outperform more complex ones given limited training data. A new cross-validation methodology is proposed that allows the comparison of multiple recalibration methods and varying training periods using limited data.
Part I considers the effect on forecast skill of varying the recalibration complexity and training period length. The interaction between these factors is analyzed for gridbox forecasts of annual mean near-surface temperature from the CanCM4 model. Recalibration methods that include conditional adjustment of the ensemble mean outperform simple bias correction by issuing climatological forecasts where the model has limited skill. Trend-adjusted forecasts outperform forecasts without trend adjustment at almost 75% of grid boxes. The optimal training period is around 30 yr for trend-adjusted forecasts and around 15 yr otherwise. The optimal training period is strongly related to the length of the optimal climatology. Longer training periods may increase overall performance but at the expense of very poor forecasts where skill is limited.
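A schematic sketch of varying recalibration complexity: the linear model below regresses observations on the ensemble mean and on time, and dropping either column recovers a simpler variant (switching off the conditional adjustment falls back to a trend-only, climatological forecast; switching off the trend removes the time-dependent bias term). The parameterization and training data are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def recalibrate(obs, ens_mean, t, adjust_conditional=True, adjust_trend=True):
    """Fit a simple recalibration of the form obs ~ a + b*ens_mean + c*t.

    Dropping columns restricts the complexity: without the conditional term
    the forecast reduces to a (trend-adjusted) climatology. The residual
    standard deviation provides the predictive spread.
    """
    cols = [np.ones_like(obs)]
    if adjust_conditional:
        cols.append(ens_mean)
    if adjust_trend:
        cols.append(t)
    X = np.column_stack(cols)
    coefs, *_ = np.linalg.lstsq(X, obs, rcond=None)
    resid = obs - X @ coefs
    sigma = resid.std(ddof=X.shape[1])
    return coefs, sigma

# Hypothetical 30-yr hindcast training data.
rng = np.random.default_rng(4)
t = np.arange(30.0)
ens_mean = 0.02 * t + rng.normal(scale=0.3, size=30)
obs = 0.5 + 0.8 * ens_mean + 0.01 * t + rng.normal(scale=0.2, size=30)

coefs, sigma = recalibrate(obs, ens_mean, t)
print("fitted coefficients:", coefs, "predictive spread:", sigma)
```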
Abstract
Predictability estimates of ensemble prediction systems are uncertain because of limited numbers of past forecasts and observations. To account for such uncertainty, this paper proposes a Bayesian inferential framework that provides a simple 6-parameter representation of ensemble forecasting systems and the corresponding observations. The framework is probabilistic and thus allows for quantifying uncertainty in predictability measures, such as correlation skill and signal-to-noise ratios. It also provides a natural way to produce recalibrated probabilistic predictions from uncalibrated ensemble forecasts.
The framework is used to address important questions concerning the skill of winter hindcasts of the North Atlantic Oscillation for 1992–2011 issued by the Met Office Global Seasonal Forecast System, version 5 (GloSea5), climate prediction system. Although there is much uncertainty in the correlation between ensemble mean and observations, there is strong evidence of skill: the 95% credible interval of the correlation coefficient of [0.19, 0.68] does not overlap zero. There is also strong evidence that the forecasts are not exchangeable with the observations: with over 99% certainty, the signal-to-noise ratio of the forecasts is smaller than the signal-to-noise ratio of the observations, which suggests that raw forecasts should not be taken as representative scenarios of the observations. Forecast recalibration is thus required, which can be coherently addressed within the proposed framework.
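One common signal-plus-noise representation with six parameters is sketched below. It shows only a generative model of this general type and the correlation and signal-to-noise ratios it implies, not the Bayesian inference or recalibration steps; the exact parameterization used in the paper may differ, and all numbers are illustrative.

```python
import numpy as np

# Signal-plus-noise sketch of an ensemble forecasting system and observations:
#   x[t, r] = mu_x + s[t] + eta[t, r]      ensemble members
#   y[t]    = mu_y + beta * s[t] + eps[t]  observations
# with s, eta, eps independent zero-mean Gaussians (six parameters in total).

def simulate(mu_x, mu_y, beta, sig_s, sig_eta, sig_eps,
             n_years=20, n_members=24, seed=5):
    rng = np.random.default_rng(seed)
    s = rng.normal(scale=sig_s, size=n_years)
    x = mu_x + s[:, None] + rng.normal(scale=sig_eta, size=(n_years, n_members))
    y = mu_y + beta * s + rng.normal(scale=sig_eps, size=n_years)
    return x, y

def implied_measures(beta, sig_s, sig_eta, sig_eps, n_members):
    """Predictability measures implied by the model parameters."""
    var_ensmean = sig_s**2 + sig_eta**2 / n_members
    var_obs = beta**2 * sig_s**2 + sig_eps**2
    corr = beta * sig_s**2 / np.sqrt(var_ensmean * var_obs)  # ensemble mean vs obs
    snr_forecast = sig_s / sig_eta                           # per-member forecast SNR
    snr_obs = abs(beta) * sig_s / sig_eps                    # observation SNR
    return corr, snr_forecast, snr_obs

x, y = simulate(0.0, 0.0, beta=1.0, sig_s=0.3, sig_eta=1.0, sig_eps=0.5)
print("sample correlation of ensemble mean with obs:",
      np.corrcoef(x.mean(axis=1), y)[0, 1])
print(implied_measures(beta=1.0, sig_s=0.3, sig_eta=1.0, sig_eps=0.5, n_members=24))
```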
No abstract available.
No abstract available.