Abstract
Heat is the leading cause of weather-related death in the United States. Wet bulb globe temperature (WBGT) is a heat stress index commonly used for activity modification among active populations, such as outdoor workers and athletes. Despite widespread use globally, WBGT forecasts were uncommon in the United States until recent years. This research assesses the accuracy of WBGT forecasts developed by NOAA’s Southeast Regional Climate Center (SERCC) and the Carolinas Integrated Sciences and Assessments (CISA). It also details efforts to refine the forecast by accounting for the impact of surface roughness on wind using satellite imagery. Comparisons are made between the SERCC/CISA WBGT forecast and a WBGT forecast modeled after NWS methods. Additionally, both of these forecasts are compared with in situ WBGT measurements (during the summers of 2019–2021) and estimates from weather stations to assess forecast accuracy. The SERCC/CISA WBGT forecast was within 0.6°C of observations on average and showed less bias than the forecast based on NWS methods across North Carolina. Importantly, the SERCC/CISA WBGT forecast was more accurate for the most dangerous conditions (WBGT > 31°C), although this resulted in a higher false alarm rate for these extreme conditions compared to the NWS method. In particular, this work improved the forecast for sites more sheltered from wind by better accounting for the influences of land cover on 2-meter wind speed. Accurate forecasts remain more challenging for sites with complex microclimates. Thus, appropriate caution is necessary when interpreting forecasts, and onsite, real-time WBGT measurements remain critical.
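The abstract does not give the forecast equations, but the standard outdoor WBGT weighting and a neutral log-profile wind adjustment illustrate the two pieces it discusses: the heat stress index itself and the roughness-dependent 2-meter wind. This is a generic sketch, not the SERCC/CISA implementation; the roughness lengths `z0` below are illustrative assumptions.

```python
import math

def wbgt_outdoor(t_nwb, t_globe, t_air):
    """Standard outdoor WBGT (deg C): 0.7 * natural wet-bulb
    + 0.2 * black-globe + 0.1 * dry-bulb temperature."""
    return 0.7 * t_nwb + 0.2 * t_globe + 0.1 * t_air

def wind_at_height(u_ref, z_ref, z_target, z0):
    """Translate a wind speed between heights with a neutral
    logarithmic profile; z0 is the surface roughness length (m),
    larger for sheltered (e.g. forested) sites than open ones."""
    return u_ref * math.log(z_target / z0) / math.log(z_ref / z0)

# Illustrative only: a 10-m wind of 4 m/s brought down to 2 m
u2_sheltered = wind_at_height(4.0, 10.0, 2.0, z0=0.5)   # rough, sheltered site
u2_open = wind_at_height(4.0, 10.0, 2.0, z0=0.03)       # open grassland
```

The larger roughness length yields a weaker 2-meter wind, which in turn raises WBGT at sheltered sites, consistent with the forecast refinement described above.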
Abstract
The increasing societal need for more precise and reliable weather forecasts, especially for extreme weather events, pushes research and development in meteorology toward novel numerical weather prediction (NWP) systems that can resolve atmospheric processes on hectometric scales on demand. Such high-resolution NWP systems require a more detailed representation of unresolved processes, i.e., scale-aware schemes for convection and three-dimensional turbulence (and radiation), which further increases computational demands. Therefore, developing and applying comprehensive, reliable, and computationally affordable parameterizations in NWP systems is of urgent importance. All operational NWP systems are based on averaged Navier-Stokes equations and thus require an approximation for the small-scale turbulent fluxes of momentum, energy, and matter in the system. The availability of high-fidelity data from turbulence experiments and direct numerical simulations has helped scientists construct and calibrate a range of turbulence closure approximations, from the relatively simple to the more complex, some of which have been adopted and are in use in current operational NWP systems. The significant development of learned-by-data (LBD) algorithms over the past decade (e.g., artificial intelligence) motivates engineers and researchers in fluid dynamics to explore alternatives for modeling turbulence that use turbulence data directly to quantify and reduce model uncertainties systematically. This review elaborates on LBD approaches and their current use in NWP, and surveys novel data-informed turbulence models that could potentially be applied in NWP. Based on this literature analysis, the associated challenges and perspectives are discussed.
Abstract
This study details a two-method, machine-learning approach to predict current and short-term intensity change in global tropical cyclones (TCs): ‘D-MINT’ and ‘D-PRINT’. Both use infrared imagery and environmental scalar predictors, while D-MINT also employs microwave imagery.
Results show that current TC intensity estimates from D-MINT and D-PRINT are more skillful than three established intensity estimation methods routinely used by operational forecasters for North Atlantic, eastern North Pacific, and western North Pacific TCs.
Short-term intensity predictions are validated against five operational deterministic guidances at 6-, 12-, 18-, and 24-hour lead times. D-MINT and D-PRINT are less skillful than NHC and consensus TC intensity predictions for North Atlantic and eastern North Pacific TCs, but are more skillful than the other guidances for at least half of the lead times. For western North Pacific, North Indian Ocean, and Southern Hemisphere TCs, D-MINT is more skillful than the JTWC and other individual TC intensity forecasts for over half of the lead times. When probabilistically predicting TC rapid intensification (RI), D-MINT is more skillful for North Atlantic and western North Pacific TCs than three operationally used RI guidances, but less skillful for yes-no RI forecasts.
In addition, this work demonstrates the importance of microwave imagery, as D-MINT is more skillful than D-PRINT. Since D-MINT and D-PRINT are convolutional neural network models that interrogate two-dimensional structures within TC satellite imagery, this study also demonstrates that those features can yield better short-term predictions than the scalar satellite-imagery statistics used in existing operational models. Finally, a diagnostics tool is presented to aid the attribution of the D-MINT/D-PRINT intensity predictions.
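Probabilistic RI forecasts of the kind verified above are commonly scored with the Brier score and Brier skill score. The sketch below is a generic illustration of those metrics, not the study's verification code, and the four-forecast example is invented.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and the
    0/1 observed outcome (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_skill_score(probs, outcomes, ref_probs):
    """Skill relative to a reference forecast such as climatology;
    positive values mean more skill than the reference."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(ref_probs, outcomes)

# Invented four-forecast example scored against a 25% RI climatology
probs = [0.8, 0.1, 0.3, 0.9]
outcomes = [1, 0, 0, 1]
bss = brier_skill_score(probs, outcomes, [0.25] * len(probs))
```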
Abstract
Quasi-linear convective systems (QLCSs) can produce multiple hazards (e.g., straight-line winds, flash flooding, and mesovortex tornadoes) that pose a significant threat to life and property, and they are often difficult to forecast accurately. The NSSL Warn-on-Forecast System (WoFS) is a convection-allowing ensemble system developed to provide short-term, probabilistic forecasting guidance for severe convective events. WoFS’s capability to predict QLCSs has yet to be systematically assessed across a large number of cases for 0–6-hr forecast times. In this study, the quality of WoFS QLCS forecasts for 50 QLCS days during 2017–2020 is evaluated using object-based verification techniques. First, a storm mode identification and classification algorithm is tuned to identify high-reflectivity, linear convective structures. The algorithm is used to identify convective line objects in WoFS forecasts and Multi-Radar Multi-Sensor system (MRMS) gridded observations. WoFS QLCS objects are matched with MRMS observed objects to generate bulk verification statistics. Results suggest WoFS’s QLCS forecasts are skillful, with the 3- and 6-hr forecasts having similar probability of detection and false alarm ratio values near 0.59 and 0.34, respectively. The WoFS objects are larger, more intense, and less eccentric than those in MRMS. A novel centerline analysis is performed to evaluate orientation, length, and tortuosity (i.e., curvature) differences, as well as spatial displacements, between observed and predicted convective lines. While no systematic propagation biases are found, WoFS centerlines are typically more tortuous and displaced to the northwest of MRMS centerlines, suggesting WoFS may overforecast the intensity of the QLCS’s rear-inflow jet and northern bookend vortex.
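For reference, the probability of detection and false alarm ratio quoted above follow directly from counts of matched and unmatched objects. A minimal sketch, with illustrative counts chosen to land near 0.59 and 0.34 (not the study's actual counts):

```python
def contingency_scores(hits, misses, false_alarms):
    """Probability of detection (POD) and false alarm ratio (FAR)
    from counts of matched/unmatched forecast and observed objects."""
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    return pod, far

# Illustrative counts only, not taken from the study
pod, far = contingency_scores(hits=59, misses=41, false_alarms=30)
```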
Abstract
Road surface temperatures are a critical factor in determining driving conditions, especially during winter storms. Road temperature observations across the United States are sparse and located mainly along major highways. A machine learning–based system for nowcasting the probability of subfreezing road surface temperatures was developed at NSSL to allow for widespread monitoring of road conditions in real time. In this article, these products were evaluated over two winter seasons. Strengths and weaknesses in the nowcast system were identified by stratifying the evaluation metrics into various subsets. These results show that the current system performed well in general, but significantly underpredicted the probability of subfreezing roads during frozen precipitation events. Machine learning experiments were performed to address these issues. Evaluations of these experiments indicate a reduction in errors when precipitation phase was included as a predictor and precipitating cases were more substantially represented in the training data for the machine learning system.
Significance Statement
The purpose of this study is to better understand the strengths and weaknesses of a system that predicts the probability of subfreezing road surface temperatures. We found that the system performed well in general, but underpredicted the probabilities when frozen precipitation was predicted to reach the surface. These biases were substantially improved by modifying the system to increase its focus on situations with falling precipitation. The updated system should allow for improved monitoring and forecasting of potentially hazardous conditions during winter storms.
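One simple way to make precipitating cases "more substantially represented" in training data, as described above, is to oversample them. This is a generic sketch of that rebalancing idea, not the NSSL system's actual pipeline; the case layout and factor of 3 are arbitrary choices for illustration.

```python
import random

def oversample_precip_cases(cases, factor=3, seed=0):
    """Duplicate precipitating cases so the learner sees them more
    often during training. Each case is a (features, label,
    is_precipitating) tuple; the default factor of 3 is arbitrary."""
    rng = random.Random(seed)
    precip = [c for c in cases if c[2]]
    boosted = list(cases)
    boosted.extend(rng.choice(precip) for _ in range((factor - 1) * len(precip)))
    rng.shuffle(boosted)
    return boosted

# Toy training set: 2 of 5 cases are precipitating
cases = [("x1", 0, True), ("x2", 1, False), ("x3", 0, True),
         ("x4", 1, False), ("x5", 0, False)]
boosted = oversample_precip_cases(cases)
```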
Abstract
Atmospheric River Reconnaissance has held field campaigns during cool seasons since 2016. These campaigns have provided thousands of dropsonde data profiles, which are assimilated into multiple global operational numerical weather prediction models. Data denial experiments, conducted by running a parallel set of forecasts that exclude the dropsonde information, allow testing of the impact of the dropsonde data on model analyses and the subsequent forecasts. Here, we investigate the differences in skill between the control forecasts (with dropsonde data assimilated) and denial forecasts (without dropsonde data assimilated) in terms of both precipitation and integrated vapor transport (IVT) at multiple thresholds. The differences are considered in the times and locations where there is a reasonable expectation of influence of an intensive observation period (IOP). Results for 2019 and 2020 from both the European Centre for Medium-Range Weather Forecasts (ECMWF) model and the National Centers for Environmental Prediction (NCEP) global model show improvements with the added information from the dropsondes. In particular, significant improvements in the control forecast IVT generally occur in both models, especially at higher values. Significant improvements in the control forecast precipitation also generally occur in both models, but the improvements vary depending on the lead time and metrics used.
Significance Statement
Atmospheric River Reconnaissance is a program that uses targeted aircraft flights over the northeast Pacific to take measurements of meteorological fields. These data are then ingested into global weather models with the intent of improving the initial conditions and resulting forecasts along the U.S. West Coast. The impacts of these observations on two global numerical weather models were investigated to determine their influence on the forecasts. The integrated vapor transport, a measure of both wind and humidity, saw significant improvements in both models with the additional observations. Precipitation forecasts were also improved, but with differing results between the two models.
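IVT, the "measure of both wind and humidity" described above, is conventionally computed as the magnitude of the pressure-integrated horizontal moisture flux. A minimal numerical sketch; the sounding values are invented for illustration and are not from the study.

```python
import math

G = 9.81  # gravitational acceleration, m s^-2

def ivt(pressure_pa, q, u, v):
    """Integrated vapor transport (kg m^-1 s^-1): the magnitude of
    (1/g) * integral of q * (u, v) dp, via the trapezoidal rule.
    Levels must be ordered from the surface upward."""
    qu = qv = 0.0
    for k in range(len(pressure_pa) - 1):
        dp = pressure_pa[k] - pressure_pa[k + 1]
        qu += 0.5 * (q[k] * u[k] + q[k + 1] * u[k + 1]) * dp
        qv += 0.5 * (q[k] * v[k] + q[k + 1] * v[k + 1]) * dp
    return math.hypot(qu, qv) / G

# Idealized sounding: moist, windy low levels typical of an atmospheric river
p = [100000.0, 85000.0, 70000.0, 50000.0, 30000.0]   # Pa
q = [0.010, 0.008, 0.005, 0.002, 0.0005]             # kg/kg
u = [10.0, 15.0, 20.0, 25.0, 30.0]                   # m/s
v = [10.0, 12.0, 15.0, 18.0, 20.0]                   # m/s
ivt_value = ivt(p, q, u, v)
```

This idealized profile yields an IVT of roughly 700 kg m⁻¹ s⁻¹, well above the ~250 kg m⁻¹ s⁻¹ threshold often used to identify atmospheric rivers.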
Abstract
This study investigates regional, seasonal biases in convection-allowing model forecasts of near-surface temperature and dewpoint in areas of particular importance to forecasts of severe local storms. One method compares model forecasts with objective analyses of observed conditions in the inflow sectors of reported tornadoes. A second method captures a broader sample of environments, comparing model forecasts with surface observations under certain warm-sector criteria. Both methods reveal a cold bias across all models tested in Southeast U.S. cool-season warm sectors. This is an operationally important bias given the thermodynamic sensitivity of instability-limited severe weather that is common in the Southeast cool season. There is no clear bias across models in the Great Plains warm season; instead, behavior varies with differing model physics.
Significance Statement
The severity of thunderstorms and the types of hazards they produce depend in part on the low-level temperature and moisture in the near-storm environment. It is important for numerical forecast models to accurately represent these fields in forecasts of severe weather events. We show that the most widely used short-term, high-resolution forecast models have a consistent cold bias of about 1 K (up to 2 K in certain cases) in storm environments in the southeastern U.S. cool season. Human forecasters must recognize and adjust for this bias, and future model development should aim to improve it.
Abstract
This study provides a comparison of the operational HRRR version 4 and its eventual successor, the experimental Rapid Refresh Forecast System (RRFS) model (summer 2022 version), at predicting the evolution of convective storm characteristics during widespread convective events that occurred primarily over the eastern United States during summer 2022. Thirty-two widespread convective events were selected using observations of MRMS composite reflectivity, comprising equal numbers of mesoscale convective systems (MCSs), quasi-linear convective systems (QLCSs), clusters, and cellular convection. Each storm system was assessed on four primary characteristics: total storm area, total storm count, storm area ratio (an indicator of mean storm size), and storm size distributions.
The HRRR predictions of total storm area were comparable to MRMS, while the RRFS overpredicted total storm area by 40–60% depending on forecast lead time. Both models tended to underpredict storm counts, particularly during the storm initiation and growth period. This bias in storm counts originates early in the model runs (forecast hour 1) and propagates through the simulation in both models, indicating that both miss storm initiation events and/or merge individual storm objects too quickly. Thus, both models end up with mean storm sizes much larger than observed (RRFS more so than HRRR). Additional analyses revealed that the storm area and individual storm biases were largest for the cluster and cellular convective modes. These results can serve as a benchmark for assessing future versions of RRFS and will aid model users in interpreting forecast guidance.
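Storm counts and total storm areas of the kind evaluated above can be derived by labeling contiguous regions exceeding a reflectivity threshold. This is a generic connected-component sketch, not the study's actual object-identification algorithm; the 40-dBZ threshold and tiny grid are illustrative assumptions.

```python
def storm_objects(refl, thresh=40.0):
    """Count contiguous (4-connected) storm objects with reflectivity
    >= thresh (dBZ) and their total area in grid cells."""
    nrow, ncol = len(refl), len(refl[0])
    seen = [[False] * ncol for _ in range(nrow)]
    count = area = 0
    for i in range(nrow):
        for j in range(ncol):
            if refl[i][j] >= thresh and not seen[i][j]:
                count += 1
                stack = [(i, j)]
                seen[i][j] = True
                while stack:  # iterative flood fill over the object
                    r, c = stack.pop()
                    area += 1
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < nrow and 0 <= cc < ncol
                                and refl[rr][cc] >= thresh
                                and not seen[rr][cc]):
                            seen[rr][cc] = True
                            stack.append((rr, cc))
    return count, area

# Toy reflectivity grid (dBZ) containing three distinct storm objects
grid = [[45, 46,  0,  0, 50],
        [44,  0,  0,  0, 52],
        [ 0,  0,  0,  0,  0],
        [ 0, 41, 42,  0,  0]]
n_storms, total_area = storm_objects(grid)
```

On real model or MRMS grids the same labeling is usually done with optimized library routines, but the counting logic is identical.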
Abstract
Fog exerts significant impacts on transportation, aviation, air quality, agriculture, and even water resources. While data-driven machine learning algorithms have shown promising performance in capturing nonlinear fog events at point locations, their applicability to other areas and time periods is questionable. This study addresses this issue by examining five decision-tree-based classifiers in a South Korean region where diverse fog formation mechanisms are at play. The five machine learning algorithms were trained at point locations and tested at other point locations for time periods independent of the training process. Using the ensemble classifiers and high-resolution atmospheric reanalysis data, we also attempted to establish fog occurrence maps over a regional area. Results showed that machine learning models trained on the local datasets exhibited superior performance in mountainous areas, where radiative cooling predominantly contributes to fog formation, compared to inland and coastal regions. As the fog generation mechanisms diversified, the tree-based ensemble models appeared to encounter challenges in delineating their decision boundaries. When trained with the reanalysis data, their predictive skill decreased significantly, resulting in high false alarm rates. This prompted the need for postprocessing techniques to rectify the overestimated fog frequency. While postprocessing may ameliorate overestimation, caution is needed when interpreting the resultant fog frequency estimates, especially in regions with more diverse fog generation mechanisms. The spatial upscaling of machine learning–based fog prediction models poses challenges owing to the intricate interplay of various fog formation mechanisms, data imbalances, and potential inaccuracies in reanalysis data.
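One common postprocessing technique of the kind invoked above for rectifying overestimated fog frequency is frequency matching: choose the probability threshold so that the forecast fog frequency equals a target (e.g., climatological) frequency. This is a generic sketch of that idea, not the study's actual method, and the probabilities below are invented.

```python
def frequency_matched_threshold(probs, target_freq):
    """Pick the probability threshold at which the fraction of 'fog'
    predictions matches target_freq (e.g. the climatological fog
    frequency), curbing systematic overforecasting."""
    ranked = sorted(probs, reverse=True)
    n_events = max(1, round(target_freq * len(probs)))
    return ranked[n_events - 1]

# Overconfident probabilities: a naive 0.5 cutoff would call fog 6 times
probs = [0.9, 0.8, 0.75, 0.7, 0.65, 0.6, 0.4, 0.3, 0.2, 0.1]
thr = frequency_matched_threshold(probs, target_freq=0.2)  # climatology: 20%
fog_calls = sum(p >= thr for p in probs)
```

As the abstract cautions, such a correction fixes the overall frequency but cannot recover cases where the underlying decision boundary is wrong, so the resulting maps still warrant careful interpretation.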