Abstract
A 16-member convective-scale ensemble prediction system (CEPS) developed at the Central Weather Bureau (CWB) of Taiwan is evaluated for probability forecasts of convective precipitation. To address the issue of limited predictability of convective systems, the CEPS provides short-range forecasts using initial conditions from a rapid-update ensemble data assimilation system. This study aims to identify the behavior of the CEPS forecasts, especially the impact of different ensemble configurations and forecast lead times. Warm-season afternoon thunderstorms (ATs) from 30 June to 4 July 2017 are selected. Since ATs usually occur between 1300 and 2000 LST, this study compares deterministic and probabilistic quantitative precipitation forecasts (QPFs) launched at 0500, 0800, and 1100 LST. This study demonstrates that initial and boundary perturbations (IBP) are crucial to ensure good spread–skill consistency over the 18-h forecasts. On top of IBP, additional model perturbations have insignificant impacts on upper-air and precipitation forecasts. The deterministic QPFs launched at 1100 LST outperform those launched at 0500 and 0800 LST, likely because the most recent data assimilation analyses enhance the practical predictability. However, the same analyses cannot improve the probabilistic QPFs launched at 1100 LST, because the limited error growth time yields inadequate ensemble spread. This study points out the importance of sufficient initial condition uncertainty for short-range probabilistic forecasts to exploit the benefits of rapid-update data assimilation analyses.
Significance Statement
This study aims to understand the behavior of convective-scale short-range probabilistic forecasts in Taiwan and the surrounding area. Taiwan is influenced by diverse weather systems, including typhoons, mei-yu fronts, and local thunderstorms. During the past decade, there has been promising improvement in predicting mesoscale weather systems (e.g., typhoons and mei-yu fronts). However, it is still challenging to provide timely and accurate forecasts for rapidly evolving high-impact convection. This study provides a reference for the design of convective-scale ensemble prediction systems, particularly those intended to provide short-range probabilistic forecasts. While the findings cannot be extrapolated to all ensemble prediction systems, this study demonstrates that initial and boundary perturbations are the most important factors, whereas model perturbations have an insignificant effect. This study suggests that in-depth studies are required to improve the convective-scale initial condition accuracy and uncertainty to provide reliable probabilistic forecasts within short lead times.
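As an illustrative aside (not part of the study), the spread–skill consistency mentioned above is commonly diagnosed by comparing the ensemble standard deviation against the RMSE of the ensemble mean. A minimal NumPy sketch with hypothetical inputs:

```python
# Sketch: spread-skill consistency check for an ensemble forecast.
# Assumes `fcst` has shape (n_members, n_cases) for one variable/lead time
# and `obs` has shape (n_cases,); both inputs are hypothetical.
import numpy as np

def spread_skill(fcst: np.ndarray, obs: np.ndarray) -> tuple[float, float]:
    """Return (spread, skill): ensemble std dev vs. RMSE of the ensemble mean.

    For a statistically consistent ensemble, spread and skill should be of
    comparable magnitude when averaged over many cases.
    """
    ens_mean = fcst.mean(axis=0)                     # (n_cases,)
    skill = np.sqrt(np.mean((ens_mean - obs) ** 2))  # RMSE of the mean
    # ddof=1 gives the unbiased member variance about the ensemble mean.
    spread = np.sqrt(np.mean(fcst.var(axis=0, ddof=1)))
    return spread, skill

# Example with a synthetic 16-member ensemble over 100 verification cases.
rng = np.random.default_rng(0)
truth = rng.normal(size=100)
members = truth + rng.normal(scale=1.0, size=(16, 100))
print(spread_skill(members, truth))
```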
Abstract
This article introduces an ensemble clustering tool developed at the Weather Prediction Center (WPC) to assist forecasters in the preparation of medium-range (3–7 day) forecasts. Effectively incorporating ensemble data into an operational forecasting process, like that used at WPC, can be challenging given time constraints and data infrastructure limitations. Often forecasters do not have time to view the large number of constituent members of an ensemble forecast, so they settle for viewing the ensemble’s mean and spread. This ignores the useful information about forecast uncertainty and the range of possible forecast outcomes that an ensemble forecast can provide. Ensemble clustering could be a solution to this problem as it can reduce a large ensemble forecast down to the most prevalent forecast scenarios. Forecasters can then quickly view these ensemble clusters to better understand and communicate forecast uncertainty and the range of possible forecast outcomes. The ensemble clustering tool developed at WPC is a variation of fuzzy clustering where operationally available ensemble members with similar 500-hPa geopotential height forecasts are grouped into four clusters. A representative case from 15 February 2021 is presented to demonstrate the clustering methodology and the overall utility of this new ensemble clustering tool. Cumulative verification statistics show that one of the four forecast scenarios identified by this ensemble clustering tool routinely outperforms all the available ensemble mean and deterministic forecasts.
Significance Statement
Ensemble forecasts could be used more effectively in medium-range (3–7 day) forecasting. Currently, the onus is put on forecasters to view and synthesize all of the data contained in an ensemble forecast. This is a task they often do not have time to adequately execute. This work proposes a solution to this problem. An automated tool was developed that would split the available ensemble members into four groups of broadly similar members. These groups were presented to forecasters as four potential forecast outcomes. Forecasters felt this tool helped them to better incorporate ensemble forecasts into their forecast process. Verification shows that presenting ensemble forecasts in this manner is an improvement on currently used ensemble forecast visualization techniques.
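For readers unfamiliar with fuzzy clustering, the sketch below illustrates the general idea of grouping ensemble members into four scenarios by the similarity of their 500-hPa height forecasts. It is a generic fuzzy c-means stand-in with hypothetical inputs, not the WPC implementation:

```python
# Sketch: grouping ensemble members by similarity of their 500-hPa height
# fields with fuzzy c-means. Illustrative stand-in only; `z500` is a
# hypothetical (n_members, ny, nx) array of height forecasts.
import numpy as np

def fuzzy_cmeans(X, c=4, m=2.0, iters=100, seed=0):
    """Pure-NumPy fuzzy c-means; returns (memberships[c, n], centers[c, d])."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0)                      # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m                          # fuzzified membership weights
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        # Squared distance of each sample to each cluster center.
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)          # guard against division by zero
        U = 1.0 / (d2 ** (1.0 / (m - 1)))   # standard FCM membership update
        U /= U.sum(axis=0)
    return U, centers

# Flatten each member's height field into one feature vector, then cluster.
z500 = np.random.default_rng(1).normal(size=(30, 40, 60))  # fake ensemble
U, _ = fuzzy_cmeans(z500.reshape(30, -1), c=4)
scenario = U.argmax(axis=0)   # hard assignment of each member to a scenario
```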
Abstract
To mitigate the impacts associated with adverse weather conditions, meteorological services issue weather warnings to the general public. These warnings rely heavily on forecasts issued by underlying prediction systems. When deciding which prediction system(s) to utilize when constructing warnings, it is important to compare systems in their ability to forecast the occurrence and severity of high-impact weather events. However, evaluating forecasts for particular outcomes is known to be a challenging task. This is exacerbated further by the fact that high-impact weather often manifests as a result of several confounding features, a realization that has led to considerable research on so-called compound weather events. Both univariate and multivariate methods are therefore required to evaluate forecasts for high-impact weather. In this paper, we discuss weighted verification tools, which allow particular outcomes to be emphasized during forecast evaluation. We review and compare different approaches to construct weighted scoring rules, both in a univariate and multivariate setting, and we leverage existing results on weighted scores to introduce conditional probability integral transform (PIT) histograms, allowing forecast calibration to be assessed conditionally on particular outcomes having occurred. To illustrate the practical benefit afforded by these weighted verification tools, they are employed in a case study to evaluate probabilistic forecasts for extreme heat events issued by the Swiss Federal Office of Meteorology and Climatology (MeteoSwiss).
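As a concrete illustration of a weighted scoring rule of the kind reviewed above, the following sketch estimates a threshold-weighted CRPS from an ensemble sample, emphasizing outcomes above a threshold t. The chaining function and inputs are illustrative and not taken from the case study:

```python
# Sketch: threshold-weighted CRPS from an ensemble, emphasizing outcomes
# above a threshold t via the chaining function v(z) = max(z, t), which
# corresponds to the indicator weight w(z) = 1{z > t}.
import numpy as np

def tw_crps(ens: np.ndarray, y: float, t: float) -> float:
    """Sample estimator of the threshold-weighted CRPS (lower is better)."""
    v = lambda z: np.maximum(z, t)
    vx, vy = v(ens), v(y)
    term1 = np.mean(np.abs(vx - vy))                          # E|v(X) - v(y)|
    term2 = 0.5 * np.mean(np.abs(vx[:, None] - vx[None, :]))  # E|v(X) - v(X')|
    return term1 - term2

# Example: a 50-member temperature ensemble, weighting extreme heat > 32.
ens = np.random.default_rng(2).normal(30.0, 2.0, size=50)
print(tw_crps(ens, y=33.0, t=32.0))
```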
Abstract
This study analyzes the potential of deep learning using probabilistic artificial neural networks (ANNs) for postprocessing ensemble precipitation forecasts at four observation locations. We split the precipitation forecast problem into two tasks: estimating the probability of precipitation and predicting the hourly precipitation. We then compare the performance with classical statistical postprocessing (logistic regression and a generalized linear model, GLM). ANNs show higher performance at three of the four stations for estimating the probability of precipitation and at all stations for predicting the hourly precipitation. Further, two more general ANN models are trained using the merged data from all four stations. These general ANNs exhibit an increase in performance compared to the station-specific ANNs at most stations. However, they show a significant decay in performance at one of the stations when estimating the hourly precipitation. The general models seem capable of learning meaningful interactions in the data and generalizing them to improve performance at other sites, although this generalization comes at the cost of local information at one station. Thus, this study indicates the potential of deep learning in weather forecasting workflows.
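For context, the classical probability-of-precipitation baseline referred to above can be sketched as a logistic regression fit by gradient descent. The predictors and data here are hypothetical and this is not the authors' configuration:

```python
# Sketch: probability-of-precipitation (PoP) via logistic regression on
# ensemble-derived predictors. Hypothetical inputs: X (n_samples, n_features)
# and binary labels y in {0, 1} (rain observed or not).
import numpy as np

def fit_logistic(X, y, lr=0.1, iters=2000):
    """Return weights w and bias b minimizing the log loss."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted PoP
        grad_w = X.T @ (p - y) / n              # log-loss gradient wrt w
        grad_b = np.mean(p - y)                 # log-loss gradient wrt b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 3))   # e.g., ensemble mean, spread, member PoP
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(float)
w, b = fit_logistic(X, y)
pop = 1.0 / (1.0 + np.exp(-(X @ w + b)))        # PoP estimates in [0, 1]
```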
Abstract
The extended-range forecast, with a lead time of 10–30 days, fills the gap between weather (<10 days) and climate (>30 days) predictions. Improving the forecast skill of extreme weather events at the extended range is crucial for risk management of disastrous events. In this study, three deep learning (DL) models based on convolutional neural networks and gated recurrent units are constructed to predict the rainfall anomalies and associated extreme events in East China at lead times of 1–6 pentads. All DL models show skillful prediction of the temporal variation of rainfall anomalies (in terms of temporal correlation coefficient skill) over most regions in East China beyond 4 pentads, outperforming the dynamical models from the China Meteorological Administration (CMA) and the European Centre for Medium-Range Weather Forecasts (ECMWF). The spatial distribution of the rainfall anomalies is also better predicted by the DL models than by the dynamical models, with the DL models showing higher pattern correlation coefficients at lead times of 3–6 pentads. The higher skill of the DL models in predicting the rainfall anomalies will help to improve the accuracy of extreme-event predictions. The Heidke skill scores of the extreme rainfall event forecasts performed by the DL models are also superior to those of the dynamical models at lead times beyond about 4 pentads. Heat map analysis for the DL models shows that the predictability sources are mainly the large-scale factors modulating the East Asian monsoon rainfall.
Significance Statement
Improving the forecast skill for extreme weather events at the extended range (10–30 days in advance), particularly over populated regions such as East China, is crucial for risk management. This study aims to develop skillful models of the rainfall anomalies and associated extreme heavy rainfall events using deep learning techniques. The models constructed here benefit from the capability of deep learning to identify the predictability sources of rainfall variability, and they outperform the current operational models, including the ECMWF and CMA models, at forecast lead times beyond 3–4 pentads. These results reveal the promising prospects of applying deep learning techniques in extended-range forecasting.
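As an illustrative note on the skill measures quoted above, the temporal correlation coefficient (per grid point) and the pattern correlation coefficient (per forecast map) can be computed as in the following NumPy sketch; the anomaly arrays are hypothetical:

```python
# Sketch: the two correlation skill measures used above, with NumPy.
# `fcst` and `obs` are hypothetical anomaly arrays of shape (time, ny, nx).
import numpy as np

def tcc(fcst, obs):
    """Temporal correlation coefficient at each grid point -> (ny, nx)."""
    fa = fcst - fcst.mean(axis=0)
    oa = obs - obs.mean(axis=0)
    num = (fa * oa).sum(axis=0)
    den = np.sqrt((fa ** 2).sum(axis=0) * (oa ** 2).sum(axis=0))
    return num / den

def pcc(fcst, obs):
    """Pattern correlation between forecast and observed maps -> (time,)."""
    f2 = fcst.reshape(fcst.shape[0], -1)
    o2 = obs.reshape(obs.shape[0], -1)
    fa = f2 - f2.mean(axis=1, keepdims=True)
    oa = o2 - o2.mean(axis=1, keepdims=True)
    return (fa * oa).sum(axis=1) / np.sqrt(
        (fa ** 2).sum(axis=1) * (oa ** 2).sum(axis=1))
```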
Abstract
A new probabilistic tornado detection algorithm was developed to potentially replace the operational tornado detection algorithm (TDA) for the WSR-88D radar network. The tornado probability algorithm (TORP) uses a random forest machine learning technique to estimate a probability of tornado occurrence based on single-radar data, and is trained on 166 145 data points derived from 0.5°-tilt radar data and storm reports from 2011 to 2016, of which 10.4% are tornadic. A variety of performance evaluation metrics show generally good model performance in discriminating between tornadic and nontornadic points. When using a 50% probability threshold to decide whether the model is predicting a tornado or not, the probability of detection and false alarm ratio are 57% and 50%, respectively, showing high skill by several metrics and vastly outperforming the TDA. The model weaknesses include false alarms associated with poor-quality radial velocity data and greatly reduced performance when used in the western United States. Overall, TORP can provide real-time guidance for tornado warning decisions, which can increase forecaster confidence and encourage swift decision-making. It has the ability to condense a multitude of radar data into a concise object-based information readout that can be displayed in visualization software used by the National Weather Service, core partners, and researchers.
Significance Statement
This study describes the tornado probability algorithm (TORP) and its performance. Operational forecasters can use TORP as real-time guidance when issuing tornado warnings, increasing confidence in warning decisions and, in turn, potentially extending tornado warning lead times.
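To make the quoted metrics concrete, here is a minimal sketch of the probability of detection (POD) and false alarm ratio (FAR) at a 50% probability threshold, using hypothetical inputs rather than TORP's actual data:

```python
# Sketch: contingency-table POD and FAR at a probability threshold.
# `prob` holds per-point tornado probabilities; `tornado` is a boolean
# array of verifying truths. Both are hypothetical.
import numpy as np

def pod_far(prob, tornado, thresh=0.5):
    """Probability of detection and false alarm ratio at `thresh`."""
    pred = prob >= thresh
    hits = np.sum(pred & tornado)
    misses = np.sum(~pred & tornado)
    false_alarms = np.sum(pred & ~tornado)
    pod = hits / (hits + misses)                 # fraction of events detected
    far = false_alarms / (hits + false_alarms)   # fraction of alerts wrong
    return pod, far

rng = np.random.default_rng(4)
prob = rng.random(10_000)
tornado = rng.random(10_000) < 0.104   # ~10.4% tornadic base rate, as above
print(pod_far(prob, tornado))
```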
Abstract
A hybrid three-dimensional ensemble–variational (En3D-Var) data assimilation system has been developed to explore incorporating information from an 11-member regional ensemble prediction system, dynamically downscaled from a global ensemble system, into a 3-hourly cycling convective-scale data assimilation system over the western Maritime Continent. Within the ensemble there exist small-scale perturbation structures associated with positional differences of tropical convection, but these structures are well represented only after the downscaled ensemble forecast has evolved for at least 6 h, owing to spinup. There was also a robust moderate negative correlation between total specific humidity and potential temperature background errors, presumably because of incorrect vertical motion in the presence of clouds. Time shifting of the ensemble perturbations, by using those available from adjacent cycles, helped to ameliorate the sampling error prevalent in their raw autocovariances. Monthlong hybrid En3D-Var trials were conducted using different weights assigned to the ensemble-derived and climatological background error covariances. The forecast fits to radiosonde relative humidity and wind observations were generally improved with hybrid En3D-Var, but in all experiments the fits to surface observations were degraded relative to the baseline 3D-Var configuration. Over the Singapore radar domain there was a general improvement in the precipitation forecasts, especially when the weighting toward the climatological background error covariance was larger and when time-shifted ensemble perturbations were also applied. Future work involves consolidating the ensemble prediction and deterministic systems, by centering the ensemble prediction system on the hybrid analysis, to better represent the analysis and forecast uncertainties.
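As a schematic illustration (not the operational formulation), a hybrid background error covariance combines a static climatological covariance with a localized ensemble covariance under tunable weights. All names, weights, and inputs below are hypothetical:

```python
# Sketch: hybrid background error covariance, B_hyb = beta_c * B_clim
# + beta_e * (localized ensemble covariance). Illustrative only.
import numpy as np

def hybrid_covariance(B_clim, X_pert, loc, beta_c=0.5, beta_e=0.5):
    """Combine climatological and localized ensemble covariances.

    B_clim : (n_state, n_state) climatological covariance.
    X_pert : (n_state, n_members) ensemble perturbations (member - mean).
    loc    : (n_state, n_state) localization matrix, applied via the
             Schur (element-wise) product to damp spurious covariances.
    """
    n_members = X_pert.shape[1]
    B_ens = (X_pert @ X_pert.T) / (n_members - 1)  # sample covariance
    return beta_c * B_clim + beta_e * (loc * B_ens)

# Time-shifted perturbations would simply enlarge X_pert by stacking
# valid-time neighbors from adjacent cycles along the member axis.
```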
Abstract
Herein, 14 severe quasi-linear convective systems (QLCSs) covering a wide range of geographical locations and environmental conditions are simulated at both 1- and 3-km horizontal grid spacing to further clarify the comparative capabilities of the two resolutions in representing convective system features associated with severe weather production. Emphasis is placed on validating the simulated reflectivity structures, cold pool strength, mesoscale vortex characteristics, and surface wind strength. Regarding the overall reflectivity characteristics, the basic leading-line trailing-stratiform structure was often better defined at 1 km than at 3 km, but both resolutions were capable of producing bow echo and line-echo wave pattern features. Cold pool characteristics were also well replicated at both resolutions for the differing environments, with the 1-km cold pools slightly colder and often a bit larger. Both resolutions captured the larger mesoscale vortices, such as line-end or bookend vortices, but the smaller leading-line mesoscale updraft vortices that often promote QLCS tornadogenesis were largely absent in the 3-km simulations. Finally, while maximum surface winds were only marginally well predicted at both resolutions, the simulations were able to reasonably differentiate the relative contributions of the cold pool versus the mesoscale vortices. The present results suggest that while many QLCS characteristics can be reasonably represented at a grid scale of 3 km, some of the more detailed structures, such as the overall reflectivity characteristics and the smaller leading-line mesoscale vortices, would likely benefit from the finer 1-km grid spacing.
Significance Statement
High-resolution model forecasts using 3-km grid spacing have proven to offer significant forecast guidance enhancements for severe convective weather. However, it is unclear whether additional enhancements can be obtained by decreasing the grid spacing further to 1 km. Herein, we compare forecasts of severe quasi-linear convective systems (QLCSs) simulated using 1- versus 3-km grids to document the potential value added by such increases in grid resolution. It is shown that some significant improvements can be obtained in the representation of many QLCS features, especially regarding reflectivity structure and the development of the small leading-line mesoscale vortices that can contribute to both severe surface wind and tornado production.
Abstract
A time–space shift method is developed for relocating model-predicted tornado vortices to radar-observed locations to improve the model initial conditions and subsequent predictions of tornadoes. The method consists of the following three steps. (i) Use the vortex center location estimated from radar observations to sample the best ensemble member from tornado-resolving ensemble predictions. Here, the best member is defined in terms of the predicted vortex center track that has a closest point, say at the time t = t*, to the estimated vortex center at the initial time t0 (when the tornado vortex signature is first detected in radar observations). (ii) Create a time-shifted field from the best ensemble member in which the field within a circular area of about 10-km radius around the vortex center is taken from t = t*, while the field outside this circular area is transformed smoothly via temporal interpolation to the best ensemble member at t0. (iii) Create a time–space-shifted field in which the above time-shifted circular area is further shifted horizontally to be co-centered with the estimated vortex center at t0, while the field outside this circular area is transformed smoothly via spatial interpolation to the nonshifted field at t0 from the best ensemble member. The method is applied to the 20 May 2013 Oklahoma Newcastle–Moore tornado case and is shown to be very effective in improving the tornado track and intensity predictions.
Significance Statement
The time–space shift method developed in this paper can smoothly relocate tornado vortices in model-predicted fields to match radar-observed locations. The method is found to be very effective in improving not only the model initial conditions but also the subsequent tornado track and intensity predictions. The method is also not sensitive to small errors in the radar-estimated vortex center location at the initial time. The method should be useful for future real-time or even operational applications, although further tests and improvements are needed (and are planned).
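As an illustration of the smooth blending idea in steps (ii) and (iii), the sketch below blends a field taken at t = t* inside a circular area with the field at t0 outside it, using a linear transition band. The radii, grid spacing, and ramp shape are illustrative choices, not the paper's exact formulation:

```python
# Sketch: radial blending of two 2D fields around a vortex center.
# Inside radius r0 (km) the field comes from the best member at t = t*;
# beyond r1 it is the field at t0; in between, a linear transition.
import numpy as np

def blend_fields(f_tstar, f_t0, center, r0=10.0, r1=20.0, dx=1.0):
    """Weighted blend: w=1 within r0, w=0 beyond r1, linear in between."""
    ny, nx = f_tstar.shape
    y, x = np.mgrid[0:ny, 0:nx]
    r = np.hypot((x - center[1]) * dx, (y - center[0]) * dx)  # distance, km
    w = np.clip((r1 - r) / (r1 - r0), 0.0, 1.0)   # smooth transition band
    return w * f_tstar + (1.0 - w) * f_t0

# The horizontal relocation step would apply the same weights after
# shifting the circular patch to co-center it on the radar-estimated
# vortex location at t0.
```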
Abstract
Hail forecasts produced by the CAM-HAILCAST pseudo-Lagrangian hail size forecasting model were evaluated during the 2019, 2020, and 2021 NOAA Hazardous Weather Testbed (HWT) Spring Forecasting Experiments (SFEs). As part of this evaluation, HWT SFE participants were polled about their definition of a “good” hail forecast. Participants were presented with two different verification methods conducted over three different spatiotemporal scales, and were then asked to subjectively evaluate the hail forecast as well as the different verification methods themselves. The results support the use of multiple verification methods tailored to the type of forecast expected by the end-user interpreting and applying the forecast. The hail forecasts evaluated during this period included an implementation of CAM-HAILCAST in the Limited Area Model of the Unified Forecast System with the Finite Volume 3 (FV3) dynamical core. Evaluation of FV3-HAILCAST over both 1- and 24-h periods found continued improvement from 2019 to 2021. The improvement largely resulted from the wide variability among FV3 ensemble members with different microphysics parameterizations in 2019 lessening significantly during 2020 and 2021. Overprediction throughout the diurnal cycle also lessened by 2021. A combination of upscaling neighborhood verification and an object-based technique that retained only matched convective objects was necessary to understand the improvement, agreeing with the HWT SFE participants’ recommendations for multiple verification methods.
Significance Statement
“Good” forecasts of hail can be defined in multiple ways and depend on both the performance of the guidance and the perspective of the end-user. This work looks at different verification strategies to capture the performance of the CAM-HAILCAST hail forecasting model across three years of the Spring Forecasting Experiment (SFE) in different parent models. Verification strategies were informed by SFE participant input via a survey. Skill variability among models decreased in SFE 2021 relative to prior SFEs. The FV3 model in 2021, compared to 2019, provided improved forecasts of both convective distribution and 38-mm (1.5 in.) hail size, as well as less overforecasting of convection from 1900 to 2300 UTC.
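For readers unfamiliar with upscaling neighborhood verification, the sketch below illustrates the general idea: both forecast and observed fields are upscaled with a neighborhood maximum before a contingency-table score is computed at a hail-size threshold. The window size and threshold are illustrative, and this is not the SFE verification code:

```python
# Sketch: upscaling neighborhood verification of hail size forecasts.
# Fields are reduced to moving-window maxima before scoring, so near
# misses in space are not penalized as both a miss and a false alarm.
import numpy as np

def neighborhood_max(field, half_width=2):
    """Upscale by taking the max over a (2k+1)^2 moving window."""
    ny, nx = field.shape
    out = np.zeros_like(field)
    for j in range(ny):
        for i in range(nx):
            j0, j1 = max(0, j - half_width), min(ny, j + half_width + 1)
            i0, i1 = max(0, i - half_width), min(nx, i + half_width + 1)
            out[j, i] = field[j0:j1, i0:i1].max()
    return out

def csi(fcst, obs, thresh=38.0, half_width=2):
    """Critical success index at `thresh` (mm) after upscaling both fields."""
    f = neighborhood_max(fcst, half_width) >= thresh
    o = neighborhood_max(obs, half_width) >= thresh
    hits = np.sum(f & o)
    return hits / max(1, hits + np.sum(f & ~o) + np.sum(~f & o))
```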