Abstract
Prediction of severe convective storms at timescales of 2–4 weeks is of interest to forecasters and stakeholders because of the storms' impacts on life and property. Prediction of severe convective storms on this timescale is challenging, since the large-scale weather patterns that drive this activity begin to lose dynamic predictability beyond week 1. Previous work related to severe convective storms on the subseasonal timescale has mostly focused on observed relationships with teleconnections. The skill of numerical weather prediction forecasts of convective-related variables has been comparatively less explored. In this study over the United States, a forecast evaluation of variables relevant to the prediction of severe convective storms is conducted using Global Ensemble Forecast System Version 12 reforecasts at lead times up to four weeks. We find that kinematic and thermodynamic fields are predicted with skill out to week 3 in some cases, while composite parameters struggle to achieve meaningful skill into week 2. Additionally, using a novel method of weekly summations of daily maximum composite parameters, we suggest that aggregation of certain variables may assist in providing additional predictability beyond week 1. These results should serve as a reference for forecast skill for the relevant fields and help inform the development of convective forecasting tools at timescales beyond current operational products.
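The weekly summation of daily maximum composite parameters mentioned above can be illustrated with a minimal sketch. The abstract does not give the details of the aggregation, so the data layout (a list of sub-daily values per lead day at one grid point), the 7-day non-overlapping windows, and the function name `weekly_sums_of_daily_max` are all assumptions made for illustration only.

```python
from typing import List, Sequence


def weekly_sums_of_daily_max(
    subdaily: Sequence[Sequence[float]], days_per_week: int = 7
) -> List[float]:
    """Aggregate a sub-daily forecast series into weekly sums of daily maxima.

    subdaily[d] holds every forecast value valid on lead day d (for example
    four 6-hourly values of a composite parameter at one grid point).
    This layout is an assumption, not taken from the abstract.
    """
    # Step 1: reduce each lead day to its maximum value.
    daily_max = [max(day) for day in subdaily]
    # Step 2: sum the daily maxima over non-overlapping 7-day windows.
    # Trailing days that do not fill a complete week are ignored.
    n_weeks = len(daily_max) // days_per_week
    return [
        sum(daily_max[w * days_per_week:(w + 1) * days_per_week])
        for w in range(n_weeks)
    ]


# Synthetic two-week example: four values per day, with one value growing by day.
example = [[0.0, float(d), 1.0, 0.5] for d in range(14)]
print(weekly_sums_of_daily_max(example))
```

Summing daily maxima rather than verifying each day separately trades timing precision for a smoother weekly signal, which is the kind of aggregation the abstract suggests may extend predictability.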
Abstract
Producing a quantitative snowfall forecast (QSF) typically requires a model quantitative precipitation forecast (QPF) and snow-to-liquid ratio (SLR) estimate. QPF and SLR can vary significantly in space and time over complex terrain, necessitating fine-scale or point-specific forecasts of each component. Little Cottonwood Canyon (LCC) in Utah’s Wasatch Range frequently experiences high-impact winter storms and avalanche closures that result in substantial transportation and economic disruptions, making it an excellent testbed for evaluating snowfall forecasts. In this study, we validate QPFs, SLR forecasts, and QSFs produced by or derived from the Global Forecast System (GFS) and High-Resolution Rapid Refresh (HRRR) using liquid precipitation equivalent (LPE) and snowfall observations collected during the 2019/20–2022/23 cool seasons at the Alta–Collins snow-study site (2945 m MSL) in upper LCC. The 12-h QPFs produced by the GFS and HRRR underpredict the total LPE during the four cool seasons by 33% and 29%, respectively, and underpredict 50th, 75th, and 90th percentile event frequencies. Current operational SLR methods exhibit mean absolute errors of 4.5–7.7. In contrast, a locally trained random forest algorithm reduces SLR mean absolute errors to 3.7. Despite the random forest producing more accurate SLR forecasts, QSFs derived from operational SLR methods produce higher critical success indices since they exhibit positive SLR biases that offset negative QPF biases. These results indicate an overall underprediction of LPE by operational models in upper LCC and illustrate the need to identify sources of QSF bias to enhance QSF performance.
Significance Statement
Winter storms in mountainous terrain can disrupt transportation and threaten life and property due to road snow and avalanche hazards. Snow-to-liquid ratio (SLR) is an important variable for snowfall and avalanche forecasts. Using high-quality historical snowfall observations and atmospheric analyses, we developed a machine learning technique for predicting SLR at a high mountain site in Utah’s Little Cottonwood Canyon that is prone to closure due to winter storms. This technique produces improved SLR forecasts for use by weather forecasters and snow-safety personnel. We also show that current operational models and SLR techniques underforecast liquid precipitation amounts and overforecast SLRs, respectively, which has implications for future model development.
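The compensation between a low QPF bias and a high SLR bias can be made concrete with a small sketch. It assumes the usual relationship snowfall = LPE × SLR and computes the mean absolute error and the critical success index (hits divided by hits plus misses plus false alarms). All numbers below are synthetic and chosen only to illustrate the effect; they are not data from the study, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def snowfall(lpe_mm: Sequence[float], slr: Sequence[float]) -> List[float]:
    """Snowfall depth (mm) as liquid precipitation equivalent times SLR."""
    return [p * r for p, r in zip(lpe_mm, slr)]


def mean_absolute_error(forecast: Sequence[float], observed: Sequence[float]) -> float:
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


def critical_success_index(
    forecast: Sequence[float], observed: Sequence[float], threshold: float
) -> float:
    """CSI = hits / (hits + misses + false alarms) for events >= threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        if f >= threshold and o >= threshold:
            hits += 1
        elif o >= threshold:
            misses += 1
        elif f >= threshold:
            false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else math.nan


# Synthetic observations (illustrative only).
obs_lpe = [10.0, 20.0, 5.0, 15.0]
obs_slr = [12.0, 10.0, 15.0, 8.0]
obs_snow = snowfall(obs_lpe, obs_slr)

# A QPF that is 30% too dry, paired with two SLR forecasts.
qpf = [7.0, 14.0, 3.5, 10.5]
slr_accurate = list(obs_slr)               # no SLR error
slr_high = [r + 4.0 for r in obs_slr]      # positive SLR bias

for name, slr in (("accurate SLR", slr_accurate), ("high-biased SLR", slr_high)):
    qsf = snowfall(qpf, slr)
    print(name,
          "SLR MAE:", mean_absolute_error(slr, obs_slr),
          "CSI:", critical_success_index(qsf, obs_snow, threshold=100.0))
```

In this toy case the high-biased SLR has the larger SLR error but the higher CSI, because its positive bias offsets the dry QPF. That mirrors the offsetting-bias explanation given in the abstract without reproducing its results.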
Abstract
Accurate tropical cyclogenesis (TCG) prediction is important because it allows national operational forecasting agencies to issue timely warnings and implement effective disaster prevention measures. In 2020, the Korea Meteorological Administration employed a self-developed operational model called the Korean Integrated Model (KIM). In this study, we verified KIM’s TCG forecast skill over the western North Pacific. Based on 9-day forecasts, TCG in the model was objectively detected and classified as well-predicted, early formation, late formation, miss, or false alarm by comparing their formation times and locations with those of 46 tropical cyclones (TCs) from June to November in 2020–21 documented by the Joint Typhoon Warning Center. The prediction of large-scale environmental conditions relevant to TCG was also evaluated. The results showed that KIM’s probability of detection was comparable to or better than previously reported statistics for other numerical weather prediction models. The intrabasin comparison revealed that the probability of detection in the Philippine Sea was the highest, followed by the South China Sea and central Pacific. The best TCG prediction performance in the Philippine Sea was supported by unbiased forecasts in large-scale environments. The missed and false alarm cases in all three regions had the largest prediction biases in the large-scale lower-tropospheric relative vorticity. Excessive false alarms may be associated with prediction biases in the vertical gradient of equivalent potential temperature within the boundary layer. This study serves as a primary guide for national forecasters and is useful to model developers for further refinement of KIM.
Abstract
Since the start of the operational use of ensemble prediction systems, ensemble-based probabilistic forecasting has become the most advanced approach in weather prediction. However, despite the persistent development of the last three decades, ensemble forecasts still often suffer from a lack of calibration and might exhibit systematic bias, which calls for some form of statistical post-processing. Nowadays, one can choose from a large variety of post-processing approaches, where parametric methods provide full predictive distributions of the investigated weather quantity. Parameter estimation in these models is based on training data consisting of past forecast-observation pairs; thus, post-processed forecasts are usually available only at those locations where training data are accessible. We propose a general clustering-based interpolation technique for extending calibrated predictive distributions from observation stations to any location in the ensemble domain where ensemble forecasts are at hand. Focusing on the ensemble model output statistics (EMOS) post-processing technique, in a case study based on 10-m wind speed ensemble forecasts of the European Centre for Medium-Range Weather Forecasts, we demonstrate the predictive performance of various versions of the suggested method and show its superiority over the regionally estimated and interpolated EMOS models, as well as over the raw ensemble forecasts.
Abstract
Minimum central pressure (Pmin) is an integrated measure of the tropical cyclone wind field and is known to be a useful indicator of storm damage potential. A simple model that predicts Pmin from routinely-estimated quantities, including storm size, would be of great value. Here we present a simple linear empirical model for predicting Pmin from maximum wind speed, the radius of 34-knot winds (R34kt), storm-center latitude, and the environmental pressure. An empirical model for the pressure deficit is first developed that takes as predictors specific combinations of these quantities that are derived directly from theory, based on gradient wind balance and a modified-Rankine-type wind profile known to capture storm structure inside of R34kt. Model coefficients are estimated using data from the southwestern North Atlantic and eastern North Pacific from 2004–2022 using aircraft-based estimates of Pmin, Extended Best Track data, and estimates of environmental pressure from Global Forecast System (GFS) analyses. The model has near-zero conditional bias even for low Pmin, explaining 94.2% of the variance. Performance is superior to a variety of other model formulations, including a standard wind-pressure model that does not account for storm size or latitude (89.2% variance explained). Model performance is also strong when applied to high-latitude data and data near coastlines. Finally, the model is shown to perform comparably well in an operations-like setting based solely on routinely-estimated variables, including the pressure of the outermost closed isobar. Case study applications to five impactful historical storms are discussed. Overall, the model offers a simple, fast, physically-based prediction for Pmin for practical use in operations and research.
Abstract
A new 20-year wave reforecast was generated based on the NOAA Global Ensemble Forecast System, version 12 (GEFSv12). It was produced using the same wave model setup as the NCEP’s operational GEFSv12 wave component, which employs the numerical wave model WAVEWATCH III and utilizes three grids with spatial resolutions of 0.2° and 0.25°. The reforecast comprises 5 members with one cycle per day and a forecast range of 16 days. Once a week, it expands to 35 days and 11 members. This paper describes the development of the wave ensemble reforecast, focusing primarily on validation against buoys and altimeters. The statistical analyses demonstrated very good performance in the short range for significant wave height, with correlation coefficients of 0.95–0.96 on day 1 and between 0.86 and 0.88 within week 1, along with bias close to zero. After day 10, correlation coefficients fall below 0.70. We found that the degradation of predictability and the increase in scatter errors predominantly occur in the forecast lead time between days 4 and 10, in terms of the ensemble mean and individual members, including the control. For week 2 and beyond, a probabilistic spatio-temporal analysis of the ensemble space provides useful forecast guidance. Our results provide a framework for expanding the usefulness of wave ensemble data in operational forecasting applications.
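The two headline statistics quoted above, bias and correlation coefficient as a function of lead time, can be computed as in the following sketch. It assumes forecast and observed significant wave heights have already been matched in space and time and grouped by lead day; the grouping structure and the function names are hypothetical, and the sample values are synthetic.

```python
import math
from typing import Dict, Sequence, Tuple


def mean_bias(forecast: Sequence[float], observed: Sequence[float]) -> float:
    """Mean of (forecast - observed); positive means overforecasting."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(forecast)


def pearson_r(forecast: Sequence[float], observed: Sequence[float]) -> float:
    """Pearson correlation coefficient between matched pairs."""
    n = len(forecast)
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    f_ss = sum((f - f_mean) ** 2 for f in forecast)
    o_ss = sum((o - o_mean) ** 2 for o in observed)
    if f_ss == 0.0 or o_ss == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(f_ss * o_ss)


def verify_by_lead(
    pairs_by_lead: Dict[int, Tuple[Sequence[float], Sequence[float]]]
) -> Dict[int, Tuple[float, float]]:
    """Return {lead_day: (bias, correlation)} for each lead day."""
    return {
        lead: (mean_bias(fcst, obs), pearson_r(fcst, obs))
        for lead, (fcst, obs) in sorted(pairs_by_lead.items())
    }


# Synthetic matched pairs of significant wave height (m), illustrative only.
observed = [1.0, 2.0, 3.0, 4.0]
pairs = {
    1: ([1.1, 2.1, 2.9, 4.1], observed),   # short lead: close to observations
    10: ([2.0, 1.5, 3.5, 2.5], observed),  # long lead: more scatter
}
for lead, (bias, r) in verify_by_lead(pairs).items():
    print(f"day {lead}: bias={bias:+.2f} m, r={r:.2f}")
```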
Abstract
Ensembles of Convection-Allowing Model (CAM) forecasts are increasingly being used in operational numerical weather forecasting. Several approaches have been devised to find consensus among ensemble forecast fields, including the arithmetic ensemble mean and, more recently, the patchwise Localized Probability-Matched (LPM) mean. However, differences in spatial distribution and intensity of precipitation features among ensemble members make it difficult to construct an ensemble mean product that characterizes the consensus while preserving precipitation structures forecasted by the individual ensemble members.
To overcome this problem, this study aims to develop and test a method for improving ensemble consensus precipitation forecasts by directly considering the spatial offsets among ensemble members. This study uses a multi-scale spatial alignment technique to align the precipitation features of each ensemble member to a common location, and the Spatial Aligned Mean (SAM) is obtained by averaging the re-aligned members.
It is shown that implementing SAM and subsequently applying the LPM technique to the average of all aligned members (SAM-LPM) can significantly improve the warm season precipitation forecast scores using common metrics such as Equitable Threat Score (ETS). Improvement in the structure of heavy-rainfall features is also shown for summer 2023 flash-flooding cases. Thus, SAM and SAM-LPM can be excellent candidate methods for calculating an ensemble consensus and providing ensemble consensus guidance to forecasters.
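To make the consensus products and the verification metric more tangible, the sketch below implements a plain domain-wide probability-matched mean and the Equitable Threat Score on flattened fields. It is not the patchwise LPM or the SAM alignment described above, neither of which is specified in enough detail here to reproduce; the thinning of the pooled distribution (keeping every n-th value) is one common variant and is an assumption, as are the function names and the synthetic values.

```python
import math
from typing import List, Sequence


def ensemble_mean(members: Sequence[Sequence[float]]) -> List[float]:
    """Arithmetic mean across members at each grid point."""
    return [sum(vals) / len(vals) for vals in zip(*members)]


def probability_matched_mean(members: Sequence[Sequence[float]]) -> List[float]:
    """Domain-wide PM mean: spatial pattern of the ensemble mean,
    amplitudes drawn from the pooled distribution of all members."""
    n_members, n_points = len(members), len(members[0])
    mean = ensemble_mean(members)
    # Pool all member values, sort descending, keep every n-th value
    # so that exactly n_points amplitudes remain.
    pooled = sorted((v for m in members for v in m), reverse=True)
    thinned = pooled[::n_members]
    # Rank grid points by ensemble-mean value, largest first.
    order = sorted(range(n_points), key=lambda i: mean[i], reverse=True)
    pm = [0.0] * n_points
    for rank, idx in enumerate(order):
        pm[idx] = thinned[rank]
    return pm


def equitable_threat_score(
    forecast: Sequence[float], observed: Sequence[float], threshold: float
) -> float:
    """ETS = (hits - hits_random) / (hits + misses + false_alarms - hits_random)."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        if f >= threshold and o >= threshold:
            hits += 1
        elif o >= threshold:
            misses += 1
        elif f >= threshold:
            false_alarms += 1
    total = len(forecast)
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else math.nan


# Two synthetic members whose rainfall maxima are displaced by one grid point.
members = [[0.0, 10.0, 2.0, 0.0], [0.0, 2.0, 8.0, 4.0]]
print("mean:   ", ensemble_mean(members))
print("PM mean:", probability_matched_mean(members))
```

In this toy case the arithmetic mean smooths the peak from 10 down to 6, while the probability-matched mean restores the peak amplitude. Spatial offsets between members still blur the pattern, which is the problem the alignment step in SAM is meant to address.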
Abstract
This study outlines the updates made to cumulus convection parameterizations between GFSv16 and the forthcoming GFSv17, which will be the first global forecast application to become operational under the Unified Forecast System infrastructure. The updates, addressing known systematic errors and biases, incorporate innovations such as stochasticity, 3-dimensional sub-grid organizational effects, and a prognostic closure evolution. The changes are shown to improve tropical temperature/humidity biases, CAPE forecasts, CONUS precipitation, and tropical variability. By examining individual updates' impact on temperature, humidity profiles, and precipitation power spectra, we find that the new prognostic closure and stricter precipitation evaporation criteria alleviate a dry and cold bias in the tropical boundary layer and enhance precipitation variability in the tropics. Stricter triggering criteria also allow for more CAPE buildup, particularly over the tropics. The cumulus convection updates have a modest impact on precipitation skill scores over CONUS, but overall, there is an improvement when comparing GFSv16 and the latest GFSv17 prototype configurations, in particular for larger thresholds. The study also highlights the challenges in developing convection parameterizations suitable for both coupled and uncoupled model configurations. Evaluation of the MJO shows varying responses to cumulus convection changes depending on whether the model is coupled with a dynamic ocean model.