Evaluating the Multiscale Implementation of Valid Time Shifting within a Real-Time EnVar Data Assimilation and Forecast System for the 2022 HWT Spring Forecasting Experiment

Nicholas A. Gasperoni, Xuguang Wang, Yongming Wang, and Tsung-Han Li

School of Meteorology, University of Oklahoma, Norman, Oklahoma

Abstract

Multiscale valid time shifting (VTS) was explored for a real-time convection-allowing ensemble (CAE) data assimilation (DA) system featuring hourly assimilation of conventional in situ and radar reflectivity observations, developed by the Multiscale Data Assimilation and Predictability Laboratory. VTS triples the base ensemble size using two subensembles containing member forecast output before and after the analysis time. Three configurations were tested with 108-member VTS-expanded ensembles: VTS for individual mesoscale conventional DA (ConVTS) or storm-scale radar DA (RadVTS) and VTS integrated to both DA components (BothVTS). Systematic verification demonstrated that BothVTS matched the DA spread and accuracy of the best-performing individual component VTS. The 10-member forecasts showed BothVTS performs similarly to ConVTS, with RadVTS having better skill in 1-h precipitation at forecast hours 1–6, while Both/ConVTS had better skill at later hours 7–15. An objective splitting of cases by 2-m temperature cold bias revealed RadVTS was more skillful than Both/ConVTS out to hour 10 for cold-biased cases, while BothVTS performed best at most hours for less-biased cases. A sensitivity experiment demonstrated improved performance of BothVTS when reducing the underlying model cold bias. Diagnostics revealed enhanced spurious convection of BothVTS for cold-biased cases was tied to larger analysis increments in temperature than moisture, resulting in erroneously high convective instability. This study is the first to examine the benefits of a multiscale VTS implementation, showing that BothVTS can be utilized to improve the overall performance of a multiscale CAE system. Further, these results underscore the need to limit biases within a DA and forecast system to best take advantage of VTS analysis benefits.

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Nicholas A. Gasperoni, ngaspero@ou.edu

1. Introduction

Convective system evolution is a complex process involving multiscale interactions between convective scales and the larger mesoscale and synoptic-scale environments in which these systems evolve (e.g., Houze 2004; Zhang et al. 2007; Rotunno and Snyder 2008; Majda and Stechmann 2016). As such, the accurate prediction of these systems relies upon data assimilation (DA) methods that analyze observations at each of these atmospheric scales, in addition to a forecast model having fine enough spatial resolution and a domain sufficiently large to adequately resolve all involved scales (e.g., Johnson et al. 2015; Wang et al. 2021; Fabry and Meunier 2020; Sun et al. 2022). A vital component of attaining this accuracy is the ability for the DA method to adequately represent the flow-dependent background error covariances of the model. Ensemble-based DA approaches are the popular choice given they provide a statistical representation of these flow-dependent error covariances, if the ensemble is sufficiently large. While many synoptic- and global-scale operational systems use ensemble-based DA methods and techniques (e.g., Wang et al. 2013; Buehner et al. 2015; Bonavita et al. 2016; Zhou et al. 2022), demands of computational resources at convective scales are larger given the higher volume of both observations and model grid points, as well as higher DA frequency (Houtekamer and Zhang 2016). Nevertheless, recent advances in computing have made such systems operationally feasible, such as the experimental NOAA High-Resolution Rapid Refresh Ensemble (HRRRE; Kalina et al. 2021), the experimental Warn-on-Forecast System (WoFS; Wheatley et al. 2015), and the next-generation Rapid Refresh Forecast System (RRFS; Carley et al. 2021; Banos et al. 2022) slated for operational implementation within the next couple years.

Although these multiscale convection-allowing ensemble (CAE) forecast and DA systems are now operationally feasible, the constraints of computational resources place stricter limits on the ensemble sizes that can be run. Some operational global DA and forecast systems can run with 80–256 ensemble members (Houtekamer and Zhang 2016); however, CAE systems such as HRRRE and RRFS are limited to 36 members (or a similar value) currently. Directly increasing ensemble size can provide benefits to operational ensemble DA systems (e.g., Houtekamer et al. 2014; Bowler et al. 2017; Lei and Whitaker 2017) and is thus a desirable goal, especially since CAE systems require larger ensemble sizes to properly sample forecast errors. However, there may be trade-offs in utilizing increasing computational resources for other aspects of the ensemble DA system, such as increasing the grid resolution of each ensemble member (Lei and Whitaker 2017), improving the model and physics schemes, or increasing the size of the regional domain for better prediction beyond the 1-day forecast timeframe (Clark et al. 2021).

Another method is to indirectly increase ensemble size by adding ensemble members valid at different forecast lead times to the central ensemble. This method, termed valid time shifting (VTS; Huang and Wang 2018; Gasperoni et al. 2022, 2023; Li et al. 2023, manuscript submitted to Mon. Wea. Rev., hereinafter LGWW) or time-expanded sampling (e.g., Xu et al. 2008; Zhao et al. 2015; Zhang et al. 2023), can triple the base ensemble size with only about 50% increase in added computational costs, saving nearly three-quarters of the added costs in directly tripling the ensemble size while attaining similar skill (e.g., Gasperoni et al. 2022). Further, VTS can sample time- and phase-related errors commonly found in numerical weather prediction, such as mistimed convection initiation or phase errors in frontal boundary and mesoscale convective system (MCS) propagation.

VTS has been tested across many scales, from convective to synoptic, and a variety of observation types for ensemble DA systems, including global-scale DA with a focus on tropical cyclone prediction (Huang and Wang 2018), regional 12-hourly mesoscale assimilation of conventional in situ and satellite observations (Zhao et al. 2015), hourly mesoscale assimilation of conventional in situ observations in an RRFS-like system (LGWW), and storm-scale assimilation of simulated and real radar reflectivity observations for the prediction of convective evolution (Gasperoni et al. 2022, 2023; Zhang et al. 2023; Xu et al. 2008). VTS can be designed for use in any ensemble-based DA method, including any flavor of the ensemble Kalman filter (EnKF; e.g., Houtekamer and Zhang 2016) or hybrid ensemble–variational (EnVar; e.g., Bannister 2017) DA system. The results of each of these studies demonstrate the versatility of applications for VTS, both as a cost-saving method and as a method to increase ensemble size and forecast skill of an ensemble prediction system at minimal added costs. The VTS may further be used in a “cost-neutral” approach, which can indirectly increase ensemble size in a way that does not incur added computational costs to the DA system, as suggested by Gasperoni et al. (2022).

Past studies have shown that VTS applied to individual scale components of a DA system can lead to improvements in MCS prediction within the Gridpoint Statistical Interpolation (GSI)-based EnVar system extended for convective scales by Wang and Wang (2017, 2021). Gasperoni et al. (2022) and LGWW demonstrated through case studies that properly configured VTS for individual storm-scale radar assimilation and mesoscale conventional in situ assimilation components, respectively, led to improvements in MCS prediction compared to baseline experiments without VTS. Gasperoni et al. (2023) further showed systematic benefits of radar VTS during the 2021 real-time Hazardous Weather Testbed (HWT) Spring Forecasting Experiment (SFE; Clark et al. 2021), with significant improvements to high-impact cases in terms of small-scale model-derived surrogate severe report prediction and subsequent 0–18-h precipitation forecasts of upscale MCS evolution.

The results of these experimental VTS-enabled RRFS systems are encouraging; however, to this point, VTS has not been simultaneously adapted for all scale components of an ensemble multiscale DA system for CAE forecast purposes. In this study, we will evaluate VTS for the full multiscale experimental RRFS system developed by the University of Oklahoma (OU) Multiscale Data Assimilation and Predictability (MAP) laboratory. That is, VTS will be enabled for both storm-scale radar DA and mesoscale in situ DA for the first time (BothVTS). To develop a complete understanding of the systematic impacts of VTS for all scale components of a multiscale CAE DA system, BothVTS will be compared with individual component VTS applications for storm-scale radar DA (RadVTS) and mesoscale conventional DA (ConVTS). These three configurations were implemented for the real-time 2022 HWT SFE by OU MAP for systematic evaluation and comparisons (Clark et al. 2022).

The rest of the paper is arranged as follows. Section 2 gives an overview of the experimental RRFS system and configurations of RadVTS, ConVTS, and BothVTS. In section 3, the real-time model and DA settings used for the 2022 HWT SFE are explained. Section 4 presents systematic results for the DA and forecast system, including an additional soil replacement sensitivity experiment that helps explain some of those results as they relate to the near-surface model cold biases found during the real-time experiment. Finally, a discussion and summary are given in section 5.

2. System description

a. Overview

The OU MAP system used in this study employs the GSI-based EnVar for DA that was augmented with direct convective-scale assimilation of radar reflectivity by Wang and Wang (2017, 2021). The reflectivity assimilation is accomplished by incorporating model reflectivity as a control state variable within the EnVar minimization, thereby circumventing the nonlinearity of the reflectivity observation operator (Wang and Wang 2017). The EnVar system was modified to an RRFS-like implementation by Gasperoni et al. (2023), including directly interfacing the GSI-based EnVar with the Finite Volume Cubed Sphere Limited Area Model (FV3-LAM; Black et al. 2021). The dynamical core of FV3-LAM is equivalent to the global FV3 (e.g., Harris and Lin 2013) that is currently operational within the GFS, but modified by Black et al. (2021) to run on a standalone grid for regional applications such as the RRFS. As described in full detail by Gasperoni et al. (2023), the interfacing of FV3-LAM within the convective-scale EnVar system included modifications to the FV3-LAM model interface of GSI for the direct assimilation of radar reflectivity.

This system is considered a sequential multiscale DA system where different scales are updated in separate steps assimilating different scales of observations (e.g., Zhang et al. 2009; Yussouf et al. 2013; Johnson et al. 2015). Conventional in situ observations are assimilated first to analyze larger meso- and synoptic scales, followed by assimilation of radar reflectivity observations to cover small convective scales [see Fig. 1 of Gasperoni et al. (2023), also Fig. 1 here]. The multiscale approach is facilitated in EnVar by adjustments in the covariance localization, needed to limit sampling error of limited ensemble sizes (e.g., Hamill et al. 2001; Houtekamer and Mitchell 2001). That is, smaller localization spatial scales ∼O(10–30) km are used for dense radar observations, while larger scales ∼O(100–1000) km are used for assimilating less dense conventional observations (e.g., Wang and Wang 2021; Yussouf et al. 2013; Johnson et al. 2015). Such a multiscale approach has also been termed “successive covariance localization” by Zhang et al. (2009). Specific details on the configurations are given in section 3.
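The role of the localization length scale in separating the two assimilation steps can be illustrated with the Gaspari and Cohn (1999) fifth-order compactly supported function, which is the localization function named in section 3. The sketch below is illustrative only: the function name, the convention that `cutoff` is the distance at which the weight reaches zero, and the two cutoff values are assumptions, not the settings of the system described here.

```python
import numpy as np

def gaspari_cohn(dist, cutoff):
    """Gaspari and Cohn (1999) fifth-order piecewise rational function.

    dist   : separation distances (same units as cutoff)
    cutoff : distance at which the weight reaches zero (assumed convention)
    """
    c = cutoff / 2.0                                   # half-width; support is [0, 2c]
    r = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / c
    w = np.zeros_like(r)

    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)

    ri = r[inner]
    w[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                - (5.0 / 3.0) * ri**2 + 1.0)

    ro = r[outer]
    w[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return w

# Hypothetical cutoffs for illustration (not the Table 2 values)
dist_km = np.array([0.0, 10.0, 20.0, 100.0, 300.0])
w_radar = gaspari_cohn(dist_km, cutoff=20.0)    # tight, storm-scale localization
w_conv = gaspari_cohn(dist_km, cutoff=300.0)    # broad, mesoscale localization
```

With the tight cutoff, an observation has no influence on grid points 100 km away, whereas the broad cutoff still gives such points a nonzero weight.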

Fig. 1.

Flowchart of sequential multiscale EnVar assimilation strategies with VTS enabled for radar DA (RadVTS), conventional in situ DA (ConVTS), or both radar and conventional DA (BothVTS). Colored components indicate steps that are modified from enabling VTS. Symbols F, A, and rA refer to first guess, analysis, and recentered analysis, respectively, with subscripts indicating ensemble member (1–N) or control (C). BEC refers to background error covariances. For more details on the sequential multiscale EnKF update, please see Gasperoni et al. (2023).

Citation: Weather and Forecasting 38, 11; 10.1175/WAF-D-23-0096.1

b. Valid time shifting configurations

VTS is straightforward to implement into the GSI-based EnVar workflow. After defining a time-shifting interval τ, the VTS expands a base ensemble size K by including from each ensemble member output at times before (t − τ) and after (t + τ) the central analysis time t. By considering ensemble member output at different lead times within the base ensemble, the VTS-enabled ensemble covariances expand to a size 3 times (3K) as large as the base, which can be used for the EnVar update of conventional or radar observations (Fig. 1). The main EnKF component updating ensemble perturbations remains unchanged, analyzing only the central K-member ensemble. Expanding to 3K members in this way avoids the computational cost of running the forecast model for all 3K ensemble members, since the same number of members (K) is propagated by the forecast model as in the base system without VTS. There are some added costs, although generally we have found this cost increase to be limited to around 50% compared to a near-triple cost of directly increasing ensemble size (Gasperoni et al. 2022, 2023; LGWW). These added costs are related to the higher memory and I/O for the EnVar step for 3K background error covariances (∼27% in this study) and the longer forecast between DA cycles for each member needed to obtain the t + τ subensemble (∼23% for FV3-LAM with τ = 1 h).
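The expansion from K to 3K members and the associated cost arithmetic can be sketched as follows. This is a minimal illustration, not the GSI implementation: the function name is hypothetical, the random arrays stand in for model state, and centering each subensemble on its own mean is an assumption made here so that the time-shifted members contribute only perturbations.

```python
import numpy as np

def vts_perturbations(fcst_before, fcst_center, fcst_after):
    """Stack three K-member subensembles into 3K perturbations.

    Each input has shape (K, n_state): member output valid at
    t - tau, t, and t + tau. Assumption: every subensemble is
    centred on its own mean before stacking.
    """
    subs = [np.asarray(s, dtype=float)
            for s in (fcst_before, fcst_center, fcst_after)]
    perts = [s - s.mean(axis=0, keepdims=True) for s in subs]
    return np.vstack(perts)

K, n_state = 36, 5                     # K = 36 as in this study; n_state is arbitrary
rng = np.random.default_rng(0)
before, center, after = (rng.normal(size=(K, n_state)) for _ in range(3))
expanded = vts_perturbations(before, center, after)   # shape (3K, n_state)

# Sample covariance between state elements 0 and 1 from the 3K perturbations
# (dividing by 3K - 1 is a simple normalization choice for this sketch)
cov_01 = expanded[:, 0] @ expanded[:, 1] / (expanded.shape[0] - 1)

# Cost arithmetic using the percentages quoted in the text
added_cost_vts = 0.27 + 0.23           # EnVar memory and I/O + longer forecasts
added_cost_direct = 2.0                # running 3K members, taken as ~200% extra
saved_fraction = 1.0 - added_cost_vts / added_cost_direct
```

Under these numbers the saved fraction is 0.75, consistent with the "nearly three-quarters" saving quoted in the introduction.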

The specific scale component implementations of VTS are shown in Fig. 1. Three experiment configurations are tested in this study: VTS enabled for storm-scale radar reflectivity DA only (RadVTS, Fig. 1a), VTS enabled for mesoscale conventional in situ DA only (ConVTS, Fig. 1b), and VTS enabled for both scale components (BothVTS, Fig. 1c). For RadVTS and ConVTS, the base K-member ensemble covariances are used for conventional and radar assimilation steps, respectively. Although VTS components only directly modify the EnVar analysis step, the effects are propagated to the ensemble analysis through the final recentering update around the EnVar control analysis and to the next cycle via the forecast model.
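The recentering step, through which the VTS-modified EnVar control analysis influences the ensemble, can be sketched in a few lines. The function name and the toy arrays are hypothetical; the sketch assumes recentering means replacing the EnKF analysis mean with the control analysis while keeping the EnKF analysis perturbations unchanged.

```python
import numpy as np

def recenter(enkf_analyses, control_analysis):
    """Return recentered members rA_i = A_C + (A_i - mean(A)).

    enkf_analyses    : (K, n_state) EnKF member analyses A_1..A_K
    control_analysis : (n_state,) EnVar control analysis A_C
    """
    ens = np.asarray(enkf_analyses, dtype=float)
    perts = ens - ens.mean(axis=0, keepdims=True)     # EnKF analysis perturbations
    return np.asarray(control_analysis, dtype=float) + perts

rng = np.random.default_rng(1)
members = rng.normal(loc=290.0, scale=1.5, size=(36, 4))   # toy member analyses
control = np.array([291.0, 289.5, 290.2, 292.1])           # toy control analysis
recentered = recenter(members, control)
```

After recentering, the ensemble mean equals the control analysis, while the spread is still that produced by the EnKF.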

One important aspect of this study is how to properly choose the time-shifting interval τ. Past studies have demonstrated that VTS is successful so long as τ is confined to a proper range (e.g., Gasperoni et al. 2022; Zhao et al. 2015; Xu et al. 2008). Further, the choice is dependent on factors including resolved scales of the observations and DA system, cycling frequency, and base ensemble size. Many previous studies have only tested τ less than the cycling interval for convenience, such as 1–3 h for 6-h cycling of global DA in Huang and Wang (2018) or 2.5–7.5 min for 15-min cycling of radar DA in Zhang et al. (2023). However, there is no strict requirement that τ must be shorter than the cycling interval, and in fact, Gasperoni et al. (2022) and LGWW found some better results with τ greater than the cycling interval. The longer τ may be more costly due to longer forecasts required, but it has not been found to be a substantial barrier to success. A unique characteristic of this study in the choice of τ is that VTS is applied to two assimilation scales in a real-time multiscale CAE system, so it may be optimal to choose different τ for mesoscale in situ observation DA versus storm-scale radar DA. Based on case study results for hourly assimilation of radar and conventional observations, it was found that 30–60 min is best for radar DA (Gasperoni et al. 2023), while a longer 60–120-min range is best for conventional observation DA (LGWW). Given some overlap in the ranges, for the real-time implementation here, we have chosen 60 min as τ for both scale components of the DA system. We expect VTS to have benefits for both scale components with this τ, with the added benefit of limiting added computational costs and ensemble background forecast output needed between DA cycles to enable VTS for both components.

3. Real-time experiment design for the 2022 HWT SFE

a. Model and DA cycling configurations

Settings chosen for FV3-LAM model configurations largely follow the configurations used during HWT 2021 (Gasperoni et al. 2023), with only a few modifications. The model domain covers the continental United States (CONUS) with 3-km horizontal grid spacing (Fig. 2), and the vertical grid contained 65 stretched levels with a 2-hPa top. The same “FV3_HRRR” physics suite was used for all ensemble members (Table 1). Updated model and physics versions within the FV3-LAM were used for the 2022 HWT SFE in conjunction with other model contributors in the experiment (Clark et al. 2022).

Fig. 2.

(left) Hourly DA cycling configuration and (right) CONUS domains, including the FV3-LAM integration domain (red) and output Lambert conformal domain (gray).

Table 1.

Physics parameterizations composing the “FV3_HRRR” physics suite.

The hourly DA cycling strategy is also shown in Fig. 2. A 36-member ensemble was initialized each day from the Global Ensemble Forecast System (GEFS; Zhou et al. 2022) valid at 1800 UTC, with 30 members from the GEFS 1800 UTC analysis and 6 time-lagged members from the 6-h forecast off the GEFS 1200 UTC analysis. The EnVar control was initialized separately from the 1800 UTC GFS. After a 1-h spinup period, the hourly multiscale DA of conventional and radar observations was conducted from 1900 to 0000 UTC.

DA settings are shown in Table 2, including Gaspari and Cohn (1999) covariance localization cutoff radii for different observations, covariance inflation, and ensemble sizes for EnVar and EnKF components. The base localizations for radar and conventional observations follow Wang and Wang (2021), while the doubled localization widths of VTS-enabled components follow sensitivity experiments (Gasperoni et al. 2022; LGWW). The larger localization widths accompany the increased 108-member ensemble size of VTS-enabled covariances, which has reduced sampling error. One change from the 2021 HWT SFE was to incorporate an adaptive inflation technique with relaxation to prior spread (RTPS; Whitaker and Hamill 2012). The base value of 95% RTPS is used as in the past OU MAP real-time runs to inflate analysis ensemble covariances; however, a reduced value of 30% is used for hydrometeor and reflectivity state variables over locations where the observed Multi-Radar Multi-Sensor (MRMS; Smith et al. 2016) composite reflectivity is 0 dBZ (clear air). This “observation-aware” inflation technique prevents RTPS from inflating spurious hydrometeors since they should ideally have no spread in these areas, thus reducing the amount of spurious precipitation in the subsequent ensemble member forecasts.
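The "observation-aware" RTPS inflation described above can be sketched as follows. This is a simplified illustration rather than the system's code: the function names and toy values are hypothetical, and treating clear air as observed reflectivity at or below 0 dBZ is an assumption. The standard RTPS rescaling multiplies each analysis perturbation by 1 + α(σ_b − σ_a)/σ_a, where σ_b and σ_a are the prior and analysis spreads.

```python
import numpy as np

def obs_aware_alpha(mrms_dbz, is_hydrometeor, base=0.95, clear_air=0.30):
    """RTPS coefficient per state element.

    Uses the reduced value only for hydrometeor/reflectivity variables
    at points where the observed reflectivity indicates clear air.
    """
    clear = np.asarray(mrms_dbz, dtype=float) <= 0.0   # assumed clear-air test
    return np.where(clear & np.asarray(is_hydrometeor, dtype=bool),
                    clear_air, base)

def rtps_inflate(prior, posterior, alpha):
    """Relax analysis spread toward prior spread (Whitaker and Hamill 2012)."""
    prior = np.asarray(prior, dtype=float)
    posterior = np.asarray(posterior, dtype=float)
    sig_b = prior.std(axis=0, ddof=1)
    sig_a = posterior.std(axis=0, ddof=1)
    safe = np.where(sig_a > 0.0, sig_a, 1.0)           # avoid division by zero
    factor = np.where(sig_a > 0.0, 1.0 + alpha * (sig_b - sig_a) / safe, 1.0)
    mean_a = posterior.mean(axis=0)
    return mean_a + factor * (posterior - mean_a)

rng = np.random.default_rng(2)
prior = rng.normal(scale=2.0, size=(36, 4))
prior_mean = prior.mean(axis=0)
posterior = prior_mean + 0.5 * (prior - prior_mean)    # toy analysis, half the spread

mrms_dbz = np.array([0.0, 0.0, 25.0, 40.0])            # toy observed reflectivity
is_hydrometeor = np.array([True, False, True, False])
alpha = obs_aware_alpha(mrms_dbz, is_hydrometeor)

sig_b = prior.std(axis=0, ddof=1)
sig_a = posterior.std(axis=0, ddof=1)
inflated = rtps_inflate(prior, posterior, alpha)
```

The inflated spread equals (1 − α)σ_a + ασ_b, so the reduced coefficient leaves clear-air hydrometeor spread much closer to the (small) analysis spread.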

Table 2.

Data assimilation settings for the three OU MAP real-time system experiment configurations. The terms H (km) and V (scale height in natural log of pressure) refer to horizontal and vertical localization scales, respectively. Bold indicates settings of VTS-enabled components.

b. Free forecast settings and verification metrics

Ten-member free forecasts are initialized from the final 0000 UTC ensemble analysis each day. This includes the control and recentered ensemble members 1–9. GEFS 3-hourly forecasts provide unique boundary conditions for each ensemble member during the free forecast period. The use of 10 members is a practical choice that follows previous results demonstrating limited added value to CAM forecasts with more than 10 members (e.g., Clark et al. 2018; Schwartz et al. 2014). An addition to the 2022 HWT SFE is enabling all stochastic physics recently added to the RRFS workflow (Beck et al. 2022) for the 0000 UTC ensemble member forecasts to increase the spread during the 36-h free forecast period. This includes stochastic parameter perturbations (SPP) as well as the tendency-based stochastic energy backscatter (SKEB), stochastic perturbations of physics tendencies (SPPT), and stochastic perturbed humidity (SHUM) schemes (e.g., Duda et al. 2016; Jankov et al. 2019; Zhu et al. 2019).

The HWT SFE took place from 2 May to 3 June 2022, and OU MAP ran the three real-time VTS configurations from Monday through Friday during that period, resulting in 24 cases total for this study. Statistical verifications are done on these cases both for the CONUS domain (120°–70°W longitude, 25°–50°N latitude) and for the smaller daily “domains of interest” set by HWT where there is the greatest severe weather potential or other interesting convective forecast challenges. DA statistics were calculated against the assimilated observations (MRMS and conventional observations) to gauge the performance of the DA cycling, in terms of both root-mean-square fit (RMSF) and bias. Forecast precipitation areas were verified against observed MRMS reflectivity and gauge-corrected 1-h precipitation (Zhang et al. 2016) products in a neighborhood-based approach, with objective scores calculated for the 10-member ensemble. These scores include fractions skill score (FSS; Roberts and Lean 2008), frequency bias, and reliability diagrams. A 48-km neighborhood radius is used for the CAE free forecasts, with neighborhood probabilities representing an average number of grid points where a defined threshold is exceeded (Schwartz and Sobash 2017). Additional verification of forecast surface thermodynamic fields was conducted against the 2.5-km Real-Time Mesoscale Analysis (RTMA; e.g., De Pondeca et al. 2011), regridded to the output domain (Fig. 2). This study focuses on the 0–18-h forecast period where forecast differences may be mainly attributed to DA differences rather than model errors. Statistical significance of differences in scores is assessed using permutation resampling (Hamill 1999), with differences considered significant at 90% confidence in a two-tailed test. This level was chosen due to the somewhat limited sample size of 24 cases.
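A neighborhood-based ensemble FSS of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the verification code used in the study: the function names are hypothetical, the neighborhood is a circle measured in grid points with zero padding at the domain edges, and the toy fields are synthetic.

```python
import numpy as np

def neighborhood_fraction(field, radius):
    """Average of `field` over a circular neighborhood of `radius` grid points."""
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    padded = np.pad(field, radius, mode="constant")    # zeros outside the domain
    total = np.zeros((ny, nx))
    count = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy * dy + dx * dx <= radius * radius:
                total += padded[radius + dy: radius + dy + ny,
                                radius + dx: radius + dx + nx]
                count += 1
    return total / count

def ensemble_fss(member_fcsts, obs, threshold, radius):
    """FSS (Roberts and Lean 2008) using neighborhood ensemble probabilities."""
    exceed = (np.asarray(member_fcsts) > threshold).astype(float)  # (M, ny, nx)
    ens_prob = exceed.mean(axis=0)                     # gridpoint ensemble probability
    p_f = neighborhood_fraction(ens_prob, radius)
    p_o = neighborhood_fraction((np.asarray(obs) > threshold).astype(float), radius)
    num = np.sum((p_f - p_o) ** 2)
    den = np.sum(p_f ** 2) + np.sum(p_o ** 2)
    return 1.0 - num / den if den > 0 else float("nan")

# Toy 10-member example on a 20 x 20 grid
obs = np.zeros((20, 20))
obs[2:5, 2:5] = 5.0                                    # observed 1-h precipitation (mm)
members_hit = np.repeat(obs[None], 10, axis=0)         # every member matches obs
members_miss = np.zeros((10, 20, 20))
members_miss[:, 15:18, 15:18] = 5.0                    # every member far displaced

fss_perfect = ensemble_fss(members_hit, obs, threshold=2.54, radius=2)
fss_miss = ensemble_fss(members_miss, obs, threshold=2.54, radius=2)
```

On a 3-km grid, the 48-km radius used in this study would correspond to `radius=16` grid points in this sketch.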

4. Results

a. DA cycling performance

We first compare and contrast the case-average DA statistics in terms of first-guess RMSF, ensemble spread, and analysis RMSF of radar reflectivity in Fig. 3. The analysis fit (Fig. 3b) shows that the RadVTS and BothVTS analyses fit the observations 1–2 dBZ more closely than ConVTS, partially as a result of ensemble spread that is larger by about the same amount (Fig. 3a). This results in first-guess RMSF that is 0.5–1 dBZ lower than in ConVTS. The results demonstrate the analysis benefits of the radar VTS that is implemented in RadVTS and BothVTS, with minimal statistical differences between the two experiments.

Fig. 3.

DA statistics of three VTS configurations relative to MRMS observed reflectivity (dBZ) for (a) first-guess RMSF and ensemble spread (dashed) and (b) analysis fit. Statistics were computed for grid points with MRMS observations ≥ 10 dBZ and averaged over all 24 cases from the SFE, with 95% confidence intervals shown (shading).

Next, the case-average DA statistics are compared for conventional assimilated observations in temperature, specific humidity, and wind in Fig. 4 as “sawtooth” figures of first guess and analysis RMSF and bias. In this case, ConVTS and BothVTS perform nearly identically in all variables, with benefits over RadVTS. The largest differences are in temperature, where BothVTS and ConVTS have up to 0.4°C closer analysis fit to observations (i.e., cycle 5) and up to 0.2°C lower error in the first guess compared to RadVTS (Fig. 4a). Differences in moisture are more pronounced in the last three cycles, with up to 0.2 g kg−1 closer analysis fit and 0.1 g kg−1 lower first-guess error (Fig. 4b). The impact of conventional VTS is smallest for wind, although errors are still consistently lower than in the experiment without conventional VTS (RadVTS). The effect of VTS increasing ensemble spread is most pronounced for temperature and wind, with values that approximately match the first-guess errors. While VTS does increase the spread of specific humidity, the increase is not as large and the spread remains lower than the first-guess error for all configurations, indicating lingering underdispersion. In terms of bias (Figs. 4d–f), there is a strong cold bias throughout the DA cycling period up to −1.2°C, a moist bias seen in the last three cycles, and minimal wind bias. While conventional VTS does attempt to correct these thermodynamic biases, the biases are higher in magnitude compared to the correction by VTS and persist through first-guess forecasts between cycles. The strongest cold bias regions are located in the northeast and western CONUS, although some regional variation exists such as a weak warm bias over the central Great Plains (not shown).

Fig. 4.

Sawtooth plot of (top) first guess and analysis RMSF (solid lines) and (bottom) bias, computed against surface observations in (a),(d) temperature (°C); (b),(e) specific humidity (g kg−1); and (c),(f) wind (m s−1). Dashed lines in (a)–(c) indicate first-guess ensemble spread for each experiment.

Taken together, Figs. 3 and 4 clearly indicate the impacts of conventional VTS on conventional variables and the impacts of radar VTS on reflectivity. BothVTS manages to capture the analysis cycling benefits of both scale components at nearly identical accuracy to their individual implementations (i.e., BothVTS ≈ RadVTS accuracy for reflectivity DA; BothVTS ≈ ConVTS accuracy for in situ DA). Next, we will evaluate how these analysis benefits translate within the 18-h free forecast period.

b. Verification of free forecasts

Ensemble FSS of 1-h precipitation from low to high thresholds (2.54, 6.35, 12.7 mm) is shown in Fig. 5, aggregated over all 24 cases from 2022 HWT SFE. RadVTS tends to have the best scores in the early forecast hours, especially with light and medium thresholds showing significant differences compared to ConVTS and BothVTS out to hour 4. However, in the midforecast hours (7–15), ConVTS and BothVTS have higher scores than RadVTS, with statistically significant differences seen for hours 9–14 at 2.54 mm and 2–4 h at higher thresholds. BothVTS and ConVTS perform very similarly, with minimal significant differences.
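The significance testing behind these comparisons can be illustrated with a paired permutation test in the spirit of Hamill (1999). The sketch below is a simplification: it assumes one score per case and per experiment and tests the difference in case-mean score, whereas an aggregated FSS would be recomputed from resampled sums. The function name and the synthetic scores are hypothetical.

```python
import numpy as np

def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-tailed paired permutation test on the mean score difference.

    Randomly swapping the experiment labels within a case is equivalent
    to flipping the sign of that case's score difference.
    """
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    observed = diffs.mean()
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n_resamples, diffs.size))
    null = (signs * diffs).mean(axis=1)                # null distribution of mean diff
    p_value = float(np.mean(np.abs(null) >= abs(observed)))
    return observed, p_value

# Synthetic per-case scores for 24 cases (illustrative values only)
scores_b = np.linspace(0.3, 0.6, 24)
scores_a = scores_b + 0.1                              # experiment A better in every case

diff_same, p_same = paired_permutation_test(scores_b, scores_b)
diff_better, p_better = paired_permutation_test(scores_a, scores_b)
significant = p_better < 0.10                          # 90% confidence, two-tailed
```

A difference would be marked as significant at the 90% level used here when the p-value falls below 0.10.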

Fig. 5.

FSS of 1-h precipitation greater than (a) 2.54 mm (0.1 in.), (b) 6.35 mm (0.25 in.), and (c) 12.7 mm (0.5 in.), aggregated over all 24 cases from SFE. Statistically significant differences (90%) between pairs of experiments are shown in markers, colored by the experiment with higher FSS.

Since BothVTS had the best overall analysis of both conventional and radar reflectivity (Figs. 3 and 4), it is somewhat surprising that the benefits are not also seen over RadVTS at early forecast hours in FSS (Fig. 5). We found subjectively that BothVTS tended to have enhanced spurious precipitation for some cases compared to RadVTS and that it may be related to the substantial cold bias (i.e., Fig. 4d). To investigate this hypothesis objectively, we split cases into two categories according to the 2-m temperature bias: “cold-biased” (CB) and “less-biased” (LB). This bias was determined by averaging for each case the 1-h control forecast (valid 0100 UTC) temperature error relative to the 2.5-km RTMA analysis over the HWT primary domains (Fig. 6). Using a cutoff value of −1°C, we split the cases into 9 CB (≤−1°C) and 15 LB (>−1°C) cases. Subjectively, it was apparent that most of these CB cases were associated with primary domains in the north and east, and they had substantial broad cold biases throughout (see examples in Fig. 6b).
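The objective case split can be expressed compactly. In the sketch below, the function names, case labels, and bias values are hypothetical placeholders rather than values from the experiment; only the −1°C cutoff and the rule (CB if the domain-mean bias is ≤ −1°C, LB otherwise) come from the text.

```python
import numpy as np

CUTOFF_C = -1.0   # cutoff from the text: CB if bias <= -1 degC, LB otherwise

def domain_mean_bias(fcst_t2m, rtma_t2m, domain_mask):
    """Mean forecast-minus-analysis 2-m temperature over a masked domain."""
    err = np.asarray(fcst_t2m, dtype=float) - np.asarray(rtma_t2m, dtype=float)
    return float(err[np.asarray(domain_mask, dtype=bool)].mean())

def split_by_bias(case_bias_c, cutoff=CUTOFF_C):
    """Split {case: bias} into cold-biased and less-biased case lists."""
    cold, less = [], []
    for case, bias in case_bias_c.items():
        (cold if bias <= cutoff else less).append(case)
    return cold, less

# Hypothetical 2 x 2 fields and mask, for illustration only
fcst = np.array([[14.0, 15.0], [16.0, 17.0]])
rtma = np.array([[15.5, 16.5], [16.0, 30.0]])
mask = np.array([[True, True], [True, False]])
bias_example = domain_mean_bias(fcst, rtma, mask)      # (-1.5 - 1.5 + 0.0) / 3

# Hypothetical per-case biases (degC), for illustration only
case_bias = {"case_a": -0.4, "case_b": -1.0, "case_c": -1.7, "case_d": 0.2}
cb_cases, lb_cases = split_by_bias(case_bias)
```

Note that a case with a bias of exactly −1°C falls in the CB group under this rule.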

Fig. 6.

(a) Time series of HWT-primary-domain-averaged temperature bias (K), computed relative to RTMA analysis valid at 0000 UTC each day. (b) Examples of forecast-minus-RTMA 2-m temperature differences for three CB cases, taken from the 2022 HWT SFE model comparisons website (https://hwt.nssl.noaa.gov/sfe_viewer/2022/model_comparisons).

The grouped FSS is shown in Fig. 7. For the CB group (Figs. 7a,c,e), there is a large effect on skill, with RadVTS having significantly higher FSS than BothVTS and ConVTS for forecast hours 1–10, including differences as high as 0.15 at low and medium precipitation thresholds. BothVTS does match RadVTS in heavy precipitation for hours 1–3 and has somewhat higher skill than ConVTS at these times, but otherwise is similar to ConVTS throughout. On the other hand, for LB cases (Figs. 7b,d,f), BothVTS is at least as skillful as RadVTS at early hours for all thresholds, even exceeding RadVTS for heavy precipitation, with 3 h having statistically significant differences. The benefits of BothVTS and ConVTS over RadVTS are seen in later hours 6–15; however, BothVTS has twice as many significantly higher scores relative to RadVTS (22) as ConVTS does (11). In fact, some benefits of BothVTS over ConVTS can be seen in higher skill at early hours, especially hours 1–5 at the heavy 12.7-mm threshold (Fig. 7f).

Fig. 7.

As in Fig. 5, but grouped by (left) CB cases and (right) LB cases.

To expand on this result, we examined frequency bias in Fig. 8 for each group of cases. LB cases tend to be close to unbiased, especially for weak precipitation thresholds (0.254 and 2.54 mm), although a high frequency bias of around 1.2–1.8 is seen at medium and heavy thresholds (6.35 and 12.7 mm). There is a strong dichotomy between CB and LB cases, with CB cases having generally much higher precipitation frequency bias at all thresholds. Comparing VTS configurations, BothVTS and ConVTS have higher frequency biases than RadVTS. Although this is generally beneficial for 2.54-mm precipitation in LB cases (Fig. 8b), at heavier thresholds it enhances a positive precipitation bias in BothVTS and ConVTS. Interestingly, the CB cases show a much larger difference between the experiments with the conventional VTS component (BothVTS and ConVTS) and the experiment without it (RadVTS). In other words, the increase of frequency bias from LB cases to CB cases is much higher for Both/ConVTS than it is for RadVTS, especially at heavy thresholds. This difference indicates that Both/ConVTS have much more spurious and spuriously strong precipitation than RadVTS, leading to lower skill in FSS (Figs. 7a,c,e).
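As a reminder of how the metric reads, frequency bias is the ratio of the number of forecast exceedances to the number of observed exceedances at a given threshold, so values above 1 indicate overforecasting of areal coverage. The sketch below is illustrative only: it assumes flattened lists of gridpoint values, the sample numbers are made up, and the name `frequency_bias` is hypothetical rather than taken from the verification package used here.

```python
def frequency_bias(fcst, obs, threshold):
    """Ratio of forecast to observed gridpoint counts at or above a threshold."""
    n_fcst = sum(1 for v in fcst if v >= threshold)
    n_obs = sum(1 for v in obs if v >= threshold)
    return n_fcst / n_obs if n_obs else float("nan")


# Made-up 1-h precipitation values (mm) at six grid points.
fcst_mm = [0.0, 3.0, 7.0, 13.0, 14.0, 15.0]
obs_mm = [0.0, 3.0, 7.0, 13.0, 13.5, 0.0]
bias_light = frequency_bias(fcst_mm, obs_mm, 2.54)   # 5 forecast vs 4 observed
bias_heavy = frequency_bias(fcst_mm, obs_mm, 12.7)   # 3 forecast vs 2 observed
```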

Fig. 8.

Frequency bias in 1-h precipitation computed for (a) 0.254-, (b) 2.54-, (c) 6.35-, and (d) 12.7-mm thresholds. Solid lines are for LB cases, and dashed lines are for CB cases.

Reliability diagrams of grouped cases are displayed in Fig. 9 for precipitation forecasts, aggregated over forecast hours 1–12. Reliability assesses the calibration of probabilistic forecasts against observed frequencies. At the 2.54-mm threshold, RadVTS is the most reliable for CB cases (closest to the diagonal) at forecast probabilities above 40% (Fig. 9a), and BothVTS is generally similar to ConVTS with the exception of a small improvement at the highest probabilities (≥80%). In the LB group, the differences among experiments are far smaller at 2.54 mm, with ConVTS having slightly better reliability at 40%–80% probabilities (Fig. 9b). For heavy precipitation (12.7 mm h−1), RadVTS is better than BothVTS and ConVTS for CB cases, although extreme overforecasting limits the skillfulness of even RadVTS (Fig. 9c). Similarly, for LB cases, probabilities below 40% are not skillful, which further indicates spuriously strong precipitation in the system (Fig. 9d). However, BothVTS is skillful and more reliable for 40%–70% probabilities compared to the individual-scale VTS configurations, due in part to a more limited frequency of these forecasts.
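The quantities plotted in a reliability diagram can be sketched as follows: forecast probabilities are binned, and within each bin the mean forecast probability is compared with the observed event frequency, with the bin counts forming the accompanying histogram. This is a generic illustration under assumed equal-width bins; the function name `reliability_curve` and the sample values are hypothetical.

```python
def reliability_curve(probs, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) per occupied bin."""
    counts = [0] * n_bins
    events = [0] * n_bins
    prob_sum = [0.0] * n_bins
    for p, o in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # a probability of 1.0 goes in the last bin
        counts[k] += 1
        events[k] += o
        prob_sum[k] += p
    return [
        (prob_sum[k] / counts[k], events[k] / counts[k], counts[k])
        for k in range(n_bins)
        if counts[k] > 0
    ]


# Made-up forecast probabilities and binary observed outcomes.
curve = reliability_curve([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 0], n_bins=2)
```

Points below the diagonal, as in the second bin of this toy example (mean probability 0.85, observed frequency 0.5), correspond to overforecasting.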

Fig. 9.

(top) Reliability diagrams and (bottom) forecast histograms of gridpoint occurrence of probabilities within each bin for (a),(b) 2.54 mm of CB and LB cases, respectively; and (c),(d) 12.7-mm precipitation of CB and LB cases, respectively, aggregated over forecast hours 1–12.

Finally, the effect of VTS on thermodynamic forecasts is displayed in Fig. 10, with error computed against RTMA over the HWT primary domains and averaged by CB or LB grouped cases. In terms of both temperature and moisture, LB cases have lower RMSE and reduced bias compared to CB cases, with generally larger differences in terms of temperature (up to 1°C) than dewpoint (up to 0.3°C). There is an apparent diurnal trend in error and bias, with magnitudes decreasing after 0000 UTC in the overnight and early morning hours, followed by a ramp-up of error in the late morning and afternoon hours after 1200 UTC.
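For concreteness, the two error measures can be sketched as below, with error defined as forecast minus analysis so that a negative bias denotes a forecast that is too cold or too dry. The helper name `bias_and_rmse` and the sample values are assumptions for illustration; in practice the inputs would be forecast and RTMA values interpolated to common grid points within the domain.

```python
import math


def bias_and_rmse(forecast, analysis):
    """Mean error (bias) and root-mean-square error of forecast minus analysis."""
    errors = [f - a for f, a in zip(forecast, analysis)]
    n = len(errors)
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return bias, rmse


# Made-up 2-m temperatures (deg C) at three grid points.
t_bias, t_rmse = bias_and_rmse([20.0, 21.0, 22.0], [21.0, 21.0, 24.0])
```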

Fig. 10.

Control forecast RMSE and bias relative to RTMA analyses for (a),(b) 2-m temperature and (c),(d) dewpoint computed over HWT domains of interest for each case. Solid lines are averages over CB cases for each experiment, and dashed lines are averages over LB cases. Markers in each panel indicate statistically significant differences as in Figs. 5 and 7 for experiments among CB cases (top of panels) and LB cases (bottom of panels).

In terms of VTS comparisons, the effect of more accurate analysis from conventional component VTS lasts longer for temperature forecasts of LB cases than CB cases. Specifically, for LB cases, Both/ConVTS have lower RMSE by 0.1–0.2°C out to 12 h and reduced bias by about 0.2°C out to 18 h (Figs. 10a,b). In contrast, for CB cases, Both/ConVTS have lower RMSE than RadVTS for only 3 h with limited statistical significance, followed by RadVTS having significantly lower RMSE than Both/ConVTS for hours 6–10 (Fig. 10a). The bias reduction in CB cases is also limited to only 6 h, after which differences converge as the initial differences in model state are overpowered by systematic FV3-LAM biases (Fig. 10b). In terms of dewpoint forecasts, Both/ConVTS have lower RMSE than RadVTS for CB and LB cases, with the largest differences in hours 3–9 for LB cases. There is very little difference in dewpoint bias for CB cases; however, there are large statistically significant differences for LB cases, especially from hours 3 to 15. While there is a moist bias initially (with Both/ConVTS having higher bias than RadVTS), in the later morning through early afternoon hours (9–15), a dry bias develops in all configurations, with RadVTS having a larger-magnitude bias than Both/ConVTS. Comparing ConVTS to BothVTS, most of the differences are small and insignificant with only a few exceptions: BothVTS has lower dewpoint RMSE for hours 5–10, and the temperature bias of ConVTS is significantly lower (although only by about 0.05°C).

In summary, the grouped verifications indicate that the multiscale VTS can match or exceed the 1-h precipitation skill of individual-scale VTS, provided the cases do not have large-scale cold biases (≤−1°C). However, this finding also suggests that the extreme cold bias is negatively impacting the conventional component of VTS in some way.

c. Representative case examples

In this section, representative cases are shown to help synthesize the benefits of BothVTS seen objectively in the previous sections, particularly for LB cases. Figure 11 demonstrates two cases where BothVTS outperformed RadVTS in the midrange forecast hours (7–15). Between 0300 and 0500 UTC 9 May 2022, a strong spurious storm develops in RadVTS in Nebraska and intensifies over South Dakota, affecting the evolution of the MCS to such an extent that the orientation is nearly perpendicular to observations with the leading edge erroneously propagating into western Minnesota by 1100 UTC (Fig. 11a). BothVTS did not feature that spurious storm in southeast South Dakota, tied to a reduction in a local moist bias over the Nebraska Panhandle that conventional VTS could better correct (not shown). As such, the MCS develops in approximately the correct location and with improved orientation (Fig. 11b).

Fig. 11.

Control member and observed MRMS 1-h precipitation (color fill) valid at (a)–(c) 1100 UTC 9 May 2022 and (d)–(f) 0900 UTC 5 May 2022. Thin black, thick black, and magenta contours in (a), (b), (d), and (e) are observed MRMS precipitation at 0.1, 0.3, and 0.75 in. (2.54, 7.62, and 19.05 mm), respectively.

Another example is from 5 May 2022 (Figs. 11d–f) from an enhanced severe weather risk case over Oklahoma. BothVTS and RadVTS tended to have too much spurious activity in the early hours of the forecast; however, the upscale growth and evolution were generally captured over the correct locations by 0600–1200 UTC. Differences among experiments can be seen in the structural details of this upscale evolution. At 0900 UTC, RadVTS has a broad area of over 0.5-in. (12.7 mm) precipitation in Arkansas with a forward propagation error relative to observations (Fig. 11d). This broad area is reflective of a linear MCS that extended too far northward into Missouri. On the other hand, BothVTS is able to capture the small intense [≥1.00 in. (25.4 mm)] areas of precipitation in nearly the exact location and orientation as observations (Fig. 11e), due to the development of a smaller bow echo over northwest Arkansas as was observed. Further, although each experiment has overly strong precipitation in Oklahoma, BothVTS is able to capture the location of enhanced heavy precipitation ≥ 0.75 in. (19.05 mm) over southwest Oklahoma, while RadVTS is located too far eastward. Note that ConVTS is not shown for these cases as the results were subjectively very similar to BothVTS.

The other benefit of BothVTS in LB cases was better prediction of heavy precipitation in early forecast hours (1–4), especially compared to ConVTS. Two case examples of this are shown in Fig. 12. The first example is from 0300 UTC 3 May 2022 (Figs. 12a–c), where a mature linear segmented system was already present at the time of the 0000 UTC initial conditions. By 0300 UTC, the areas of heavy (≥0.5 in.) precipitation are precisely located over the observed areas for BothVTS, while there is a slow propagation error in ConVTS resulting in a northward displacement. Furthermore, BothVTS better constrains the widespread spurious activity seen in ConVTS over northwest Arkansas and southwest Missouri and improves the prediction of lighter precipitation in south-central Missouri, where a gap is seen in ConVTS. The second example is from 12 May 2022, which also featured an ongoing mature MCS over Minnesota (Figs. 12d–f). At 0300 UTC, BothVTS has heavy precipitation over eastern Minnesota in approximately the correct location and orientation, while ConVTS is lagging behind over central Minnesota. In both of these cases, the VTS-enabled radar assimilation resulted in a better analyzed mature MCS, leading to stronger cold pool maintenance that helped evolve the MCS more accurately compared to observations. RadVTS is not shown for these examples since its forecast was subjectively similar to BothVTS.

Fig. 12.

As in Fig. 11, but for cases valid at (a)–(c) 0300 UTC 3 May 2022 and (d)–(f) 0300 UTC 12 May 2022.

d. Diagnosing impact of extreme bias on VTS

Representative cases in the previous section focused mainly on LB cases; however, the question remains of why Both/ConVTS have substantially worse performance than RadVTS for extreme CB cases. It was found that the initial extreme cold bias was inherited from the FV3-based GFS external initial conditions, which have been known to exhibit large regional, diurnal cold biases similar to those seen in this study (e.g., Chen et al. 2021; Yang 2020). Similar biases were also reported in the northeast by Hu et al. (2023) for their experimental multiphysics RRFS ensemble during the winter season. To investigate the impact of cold bias on the VTS comparison here, we conducted a sensitivity experiment designed to reduce the model cold bias by utilizing available soil data from the experimental RRFS system, which is updated hourly using the atmospheric analysis of the lowest model level following the strategy employed for the HRRR (Dowell et al. 2022). In this experiment, the GFS soil data were replaced with the control RRFS hourly soil update prior to the initialization of each cycle forecast. This soil replacement sensitivity experiment was conducted for BothVTS and RadVTS to see how the VTS comparison is affected by biases present in the modeling system. The chosen case was 20 May 2022, which had its primary severe threat in the northeast where the surface cold bias was most prominent and which featured large spurious MCS development within the real-time runs that we hypothesized was impacted by this bias.

Figure 13 shows the sawtooth plots of RMSF and bias for temperature and moisture observations before and after the soil replacement, computed over the primary domain for that case. The RMSF is reduced by about 0.5°C for each VTS experiment after soil replacement, in conjunction with a near 50% reduction of the cold bias (Figs. 13a,b). There is also an accompanying improvement in the moisture RMSF (Fig. 13c), with a substantial reduction of the moist bias in the final three cycles (Fig. 13d). Note that BothVTS generally maintains an advantage over RadVTS throughout, with a closer analysis fit and reduced first-guess error and bias. Furthermore, the error growth in the cycle forecasts between the analysis and the next hour’s first guess is substantially reduced for BothVTS, especially in the final three cycles. The RMSF and bias of wind were nearly identical to those of the original HWT experiments (not shown).

Fig. 13.

As in Figs. 4a, 4b, 4d, and 4e, but computed for the 20 May 2022 case over the HWT primary domain only. Solid lines are RadVTS and BothVTS real-time experiments run during HWT, and dashed lines are RRFS soil replacement sensitivity experiments run for RadVTS and BothVTS.

The overall improved thermodynamic analysis accuracy led to improvements in the forecasts of composite reflectivity and 1-h precipitation for each experiment (Fig. 14). A reduction of the high frequency bias is seen in the RRFS soil replacement experiments. In particular, at more extreme thresholds such as 40 dBZ (Fig. 14b), there is a notable decrease in BothVTS, which becomes much closer to the bias level of RadVTS after soil replacement. This reduced bias was accompanied by some improvement in FSS, such as for 30-dBZ composite reflectivity (Fig. 14c) and heavy 12.7-mm precipitation (Fig. 14d) in forecast hours 2–12. After soil replacement, BothVTS is closer to or even better than RadVTS in skill, especially for hours 3–6. Essentially, a less-biased forecast reduced the negative impacts in BothVTS more than in RadVTS, allowing the skill of BothVTS to be comparable to or even better than that of RadVTS.

Fig. 14.

Forecast verification metrics for the 20 May 2022 case: (a),(b) frequency bias for 20- and 40-dBZ composite reflectivity thresholds; (c) FSS for 30-dBZ composite reflectivity; and (d) FSS for 12.7-mm precipitation. Solid lines are forecasts run in real time during HWT, and dashed lines are RRFS soil replacement sensitivity experiment forecasts.

Composite reflectivity is shown for BothVTS from the real-time run during HWT and the RRFS soil replacement experiment in Fig. 15. At 0300 UTC (Figs. 15a–c), the HWT run shows a large spuriously strong storm in Nebraska, with many additional storms initiating nearby. Further, the east–west line in Wisconsin is too strong and widespread compared to reality. However, the RRFS soil replacement experiment limits the amount of convection over the Wisconsin area and improves the prediction of the western extent of the line. Further, the strong storm in Nebraska is more limited in size, although still overly strong compared to observations. The soil replacement experiment also limits the initiation to two distinct areas, where weaker convection is seen in reality. By 0600 UTC (Figs. 15d–f), these strong storms in Nebraska and Iowa grow upscale in each run into a largely spurious MCS. The MCS in the soil replacement experiment is substantially smaller and weaker in intensity than that in the HWT run. Further, the reduced bias continues to better restrain convection over the Wisconsin area. Although the soil replacement experiment does not eliminate the spuriously strong convection over Iowa, there is a substantial reduction that brings BothVTS much closer to RadVTS after soil replacement (not shown).

Fig. 15.

Composite reflectivity (color fill) of BothVTS from (a),(d) real-time HWT run; (b),(e) RRFS soil replacement sensitivity experiment; and (c),(f) MRMS observations. Thin and thick black contours in (a), (b), (d), and (e) show 20- and 35-dBZ MRMS reflectivity, respectively. Plots are valid for (top) 0300 and (bottom) 0600 UTC 20 May 2022.

The soil replacement experiment demonstrates that the extreme cold bias in the background, influenced by the soil state, has a key effect on the comparison of VTS experiments. That is, BothVTS is closer to or even better than RadVTS in the presence of a less-biased model, rather than substantially worse as was seen in the HWT experiment. This was demonstrated with the 20 May 2022 case and further confirmed with a second soil replacement experiment (not shown). The next step is to diagnose how conventional VTS leads to a worse forecast in the presence of model bias, despite the improved accuracy of the analysis (e.g., Fig. 13). To examine this problem, we looked at plots of surface-based convective available potential energy (CAPE) differences as well as variance in thermodynamic variables. As seen in Fig. 16, although some small-scale variability exists, BothVTS had generally higher CAPE (by 100–250 J kg−1) than RadVTS over the Iowa region in the HWT real-time experiment, which may have led to the stronger MCS in this region. However, after soil replacement, BothVTS had a widespread reduction in CAPE of 250–1000 J kg−1, with some local areas having even larger reductions. This reduced CAPE had a clear influence in reducing the spuriously strong MCS development after soil replacement.

Fig. 16.

(a),(b) CAPE differences (J kg−1) of experiments indicated in panel titles; and (c),(d) percentage variance difference of 108-member ensemble variances from BothVTS (HWT run) relative to 36-member ensemble variances from RadVTS (HWT run) for temperature and dewpoint, respectively.

To further investigate how this ties to the conventional VTS differences of the DA, Figs. 16c and 16d show the relative percent variance increase of the 108-member VTS ensemble from BothVTS compared to the 36-member base ensemble variance from RadVTS for temperature and specific humidity. At each grid point, the ratio of the 108-member variance to the 36-member variance is computed. Different physical flow-dependent features can be seen in the variance plots, such as the cold front over Nebraska and a warm front in northern Iowa, and in both variables, the variance is generally higher at nearly all locations than without VTS. However, there is a distinct difference in the amount of this variance increase when comparing variables, with temperature having a much larger and more widespread increase than moisture. We found this was due to the influence of radiation, specifically the temperature decrease associated with sunset, and is seen most strongly in the final 2–3 cycles of the DA. This VTS variance increase reflects a physical feature that represents, e.g., errors associated with radiation parameterization schemes. However, when coupled with an extreme cold bias of the first guess and its associated large positive observation innovations, this led to a much stronger correction in the temperature analysis (surface warming) than in the moisture analysis (surface drying). The combined effect is to increase CAPE due to the larger amount of warming than drying from nearby observations, and the effect is stronger with VTS because VTS accounts for the added variance associated with rapid temperature changes near sunset.
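The link between background variance, innovation, and increment size can be seen in a textbook single-variable analysis update, in which the increment equals the innovation (observation minus background) scaled by the ratio of background error variance to total error variance. The sketch below is only a scalar analog and not the GSI-based EnVar algorithm used in this study; the function name `scalar_increment` and all numbers are hypothetical.

```python
def scalar_increment(background, observation, var_b, var_o):
    """Analysis increment for one directly observed variable.

    increment = var_b / (var_b + var_o) * (observation - background)
    """
    innovation = observation - background
    gain = var_b / (var_b + var_o)
    return gain * innovation


# Hypothetical cold-biased background: 290 K against an observation of 292 K,
# giving a positive innovation of 2 K.
inc_base = scalar_increment(290.0, 292.0, var_b=1.0, var_o=1.0)      # base variance
inc_inflated = scalar_increment(290.0, 292.0, var_b=3.0, var_o=1.0)  # larger variance
```

In this analog, tripling the background variance raises the warming increment from 1.0 to 1.5 K for the same innovation. A variable whose variance grows less under VTS, or whose innovation is smaller, receives a correspondingly smaller correction, which mirrors the warming-versus-drying imbalance described above.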

5. Summary and discussion

Multiple valid time shifting (VTS) configurations were explored in an hourly RRFS-like system coupled with the FV3-LAM for the real-time 2022 HWT SFE. Here, the multiscale implementation of VTS was studied for the sequential multiscale assimilation of mesoscale conventional observations and storm-scale radar reflectivity observations within a system developed by OU MAP. This system is a GSI-based EnVar system that has been extended for the direct assimilation of radar reflectivity observations by Wang and Wang (2017, 2021). VTS can increase the ensemble size available for flow-dependent covariances of an ensemble-based DA system without adding extra costly members, by incorporating base ensemble member forecast output at additional valid times before and after the central ensemble analysis time. Specifically, the base ensemble size of 36 was increased to 108 by incorporating subensembles valid before and after the analysis time, tripling the size of the ensemble used for background covariances.
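The ensemble expansion can be sketched for a single grid point as follows. This is a simplified illustration rather than the system's implementation: it assumes perturbations are taken about the mean of the pooled sample, uses the unbiased sample variance, and works with a made-up two-member base ensemble in place of 36 members; the function names are hypothetical.

```python
from statistics import variance


def pool_vts_members(before, central, after):
    """Pool member values valid before, at, and after the analysis time."""
    return list(before) + list(central) + list(after)


def percent_variance_change(base, expanded):
    """Percent change of the expanded-ensemble variance relative to the base."""
    base_var = variance(base)
    return 100.0 * (variance(expanded) - base_var) / base_var


# Made-up temperature anomalies (K) at one grid point near sunset: members are
# warmer one hour before the analysis time and cooler one hour after.
central = [0.0, 2.0]
before = [2.0, 4.0]
after = [-2.0, 0.0]
pooled = pool_vts_members(before, central, after)
change = percent_variance_change(central, pooled)
```

Under these assumptions, a systematic change between valid times (here a 2-K cooling per hour) appears as additional variance in the pooled sample, and the pooled ensemble is three times the size of the base ensemble.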

In this study, three VTS configurations were examined. The first two are individual-scale component implementations of VTS for conventional in situ DA only (ConVTS) and VTS for radar reflectivity DA only (RadVTS). These configurations serve as references for comparison to an implementation that combines VTS for both scale DA components in the multiscale DA system (BothVTS). A 1-h time-shifting interval is defined for all VTS components, which was shown to be beneficial for individual implementations of ConVTS (LGWW) and RadVTS (Gasperoni et al. 2022). The 2022 HWT SFE implementation largely follows the 2021 HWT SFE DA settings of Gasperoni et al. (2023). That is, each day, a 36-member ensemble is initialized at 1800 UTC from GEFS external data, with hourly multiscale DA of in situ and radar observations performed from 1900 to 0000 UTC. The final analysis initializes a 10-member ensemble forecast, and the VTS comparisons focus on the prediction of convective systems over the CONUS during the 0–18-h forecast period.

Results demonstrated that BothVTS captures the DA accuracy benefits of both best-performing individual-scale VTS configurations (i.e., RadVTS accuracy for reflectivity; ConVTS accuracy for conventional variables). The improved accuracy was accompanied by increased spread for VTS-enabled components. Results of the free forecasts showed that BothVTS tended to closely follow ConVTS in terms of FSS, with minimal significant differences. This included lower skill for early hours of the forecasts (1–6) relative to RadVTS, but improved prediction for mid-to-later forecast hours (7–15). Subjective evaluation revealed a clear sensitivity of the resulting forecasts to an underlying, persistent cold near-surface model bias. Cases were split into two groups by the magnitude of the 2-m temperature bias at 0000 UTC each day over the HWT primary domains, relative to the RTMA: a cold-biased (CB) group with bias ≤ −1°C and a less-biased (LB) group with bias > −1°C. This resulted in 9 CB cases and 15 LB cases. Verification of the two groups showed distinct differences. In the CB group, RadVTS had statistically significantly higher skill than ConVTS and BothVTS for hours 1–10. Further, RadVTS had better reliability and a less extreme high frequency bias than Both/ConVTS. On the other hand, LB cases showed that BothVTS had skill at or above the best-performing individual component VTS throughout the forecast hours 1–15, with the biggest benefits in heavy 12.7-mm precipitation in terms of FSS and reliability.

Representative case examples demonstrated that BothVTS has advantages over RadVTS in later forecast hours for cases where new and developing discrete convection at early hours grew upscale into a mature MCS at later hours, especially in the presence of local errors in the environment that BothVTS can better correct. This tended to decrease a fast forward propagation error that was seen for RadVTS in these cases. On the other hand, cases that featured a nearly mature MCS at initialization tended to show the most subjective benefits of BothVTS over ConVTS, owing to the better analyzed mature MCS from radar DA leading to more accurate evolution in the early forecast hours. In these cases, BothVTS tended to correct for a slow propagation error in MCS evolution. In other words, subjective evaluation indicates that BothVTS captures the forecast benefits of the better-performing individual component VTS over different cases and hours. Thus, BothVTS is able to correct deficiencies in individual-scale VTS implementations for LB cases.

The deficiency of BothVTS for CB cases was further examined through a soil replacement sensitivity experiment for the 20 May 2022 case. Before each cycle first-guess forecast, the soil state was replaced with the hourly one-way updated soil state from the experimental RRFS system. This soil replacement experiment was conducted for BothVTS and RadVTS configurations. Results demonstrated that about 50% of the cold bias was removed through the use of the RRFS soil state, with some improvement to the moist bias in the final three cycles as well. The impact of conventional VTS was still evident in BothVTS compared to RadVTS after soil replacement, and the soil replacement resulted in forecasts with substantially reduced frequency bias and generally improved skill. Further, the improvements to BothVTS were larger than those seen for RadVTS after soil replacement. The soil replacement substantially reduced a large spurious MCS and constrained precipitation for a line of storms in Wisconsin. This was tied to a reduction of up to 1000 J kg−1 in CAPE in BothVTS relative to the real-time HWT run. Diagnostics revealed that the VTS-enabled ensemble had a larger variance increase in temperature than in moisture. This variance difference, combined with an extreme positive observation innovation in temperature from the extreme cold model bias, resulted in larger analysis corrections to temperature (warming) than moisture (drying), leading to the higher CAPE values and subsequent enhanced spurious precipitation. Thus, although VTS clearly had better analysis accuracy, the influence of underlying model biases may lead to worse forecasts for some cases. This underscores the need to limit biases within the DA and forecast system in order to best take advantage of VTS. Correcting systematic model biases should lead to better utilization of the VTS strategy to increase ensemble size, as was demonstrated in the soil replacement sensitivity experiment.

This study demonstrated a successful sequential multiscale implementation of VTS for assimilating in situ mesoscale and storm-scale observations in a CAE system covering the CONUS. Although 1-h time shifting worked well in this study, more improvements may be seen with different time-shifting intervals for conventional and radar components. Future work will seek to further improve the utilization of BothVTS by examining a technique that blends VTS covariances from multiple time-shifting intervals. Such blending may also work well for the eventual implementation of VTS into a more advanced simultaneous multiscale method in ensemble-variational or pure ensemble frameworks (e.g., Caron and Buehner 2018; Huang et al. 2021; Wang et al. 2021; Wang and Wang 2023). These methods assimilate all observations and all scales simultaneously. The sensitivity of the optimal time-shifting interval to the underlying flow in different cases will also be examined in a future study.

1 The rest of this paper employs the VTS terminology.

2 With the exception of Memorial Day, 30 May 2022.

Acknowledgments.

This work was supported by NOAA Grants NA19OAR4590138 and NA19OAR4590231. The authors gratefully acknowledge the Texas Advanced Computing Center (TACC; http://www.tacc.utexas.edu) at The University of Texas at Austin for providing priority queue access and advanced scratch disk space on Frontera during the real-time experiment. Data storage, forecast verification, and plotting were done with resources from the OU Supercomputing Center for Education and Research (OSCER).

Data availability statement.

Model data produced from this study are tape archived at OSCER’s OU and Regional Research Store (OURRstore) archive and can be made available upon request to the corresponding author. RAP prepbufr observations, MRMS observations, and GEFS model data were obtained in real time from http://nomads.ncep.noaa.gov/pub/data/nccf/com/rap/prod, https://mrms.ncep.noaa.gov/data/2D/, and https://nomads.ncep.noaa.gov/pub/data/nccf/com/gens/prod/, respectively, and are also stored locally on OURRstore covering the 2022 HWT SFE period. RTMA analyses were obtained from the National Digital Guidance Database (https://www.ncei.noaa.gov/products/weather-climate-models/national-digital-guidance-database).

REFERENCES

  • Bannister, R. N., 2017: A review of operational methods of variational and ensemble‐variational data assimilation. Quart. J. Roy. Meteor. Soc., 143, 607–633, https://doi.org/10.1002/qj.2982.

  • Banos, I. H., W. D. Mayfield, G. Ge, L. F. Sapucci, J. R. Carley, and L. Nance, 2022: Assessment of the data assimilation framework for the Rapid Refresh Forecast System v0.1 and impacts on forecasts of a convective storm case study. Geosci. Model Dev., 15, 6891–6917, https://doi.org/10.5194/gmd-15-6891-2022.

  • Beck, J., and Coauthors, 2022: Implementation and testing of stochastic physics within FV3-LAM and RRFS prototype ensembles using the Common Community Physics Package (CCPP). ECMWF Workshop on Model Uncertainty, Reading, United Kingdom, ECMWF, 20 pp., https://events.ecmwf.int/event/290/contributions/2988/attachments/1818/3287/Model-WS_Beck.pdf.

  • Benjamin, S. G., G. A. Grell, J. M. Brown, T. G. Smirnova, and R. Bleck, 2004: Mesoscale weather prediction with the RUC hybrid isentropic–terrain-following coordinate model. Mon. Wea. Rev., 132, 473–494, https://doi.org/10.1175/1520-0493(2004)132<0473:MWPWTR>2.0.CO;2.

  • Black, T. L., and Coauthors, 2021: A limited area modeling capability for the Finite‐Volume cubed‐sphere (FV3) dynamical core and comparison with a global two‐way nest. J. Adv. Model. Earth Syst., 13, e2021MS002483, https://doi.org/10.1029/2021MS002483.

  • Bonavita, M., E. Hólm, L. Isaksen, and M. Fisher, 2016: The evolution of the ECMWF hybrid data assimilation system. Quart. J. Roy. Meteor. Soc., 142, 287–303, https://doi.org/10.1002/qj.2652.

  • Bowler, N. E., and Coauthors, 2017: Inflation and localization tests in the development of an ensemble of 4D‐ensemble variational assimilations. Quart. J. Roy. Meteor. Soc., 143, 1280–1302, https://doi.org/10.1002/qj.3004.

  • Buehner, M., and Coauthors, 2015: Implementation of deterministic weather forecasting systems based on ensemble–variational data assimilation at Environment Canada. Part I: The global system. Mon. Wea. Rev., 143, 2532–2559, https://doi.org/10.1175/MWR-D-14-00354.1.

  • Carley, J. R., and Coauthors, 2021: Status of NOAA’s next generation convection-allowing ensemble: The Rapid Refresh Forecast System. Special Symp. on Global Mesoscale Models, online, Amer. Meteor. Soc., 12.8, https://ams.confex.com/ams/101ANNUAL/meetingapp.cgi/Paper/378383.

  • Caron, J.-F., and M. Buehner, 2018: Scale-dependent background error covariance localization: Evaluation in a global deterministic weather forecasting system. Mon. Wea. Rev., 146, 1367–1381, https://doi.org/10.1175/MWR-D-17-0369.1.

  • Chen, X., and Coauthors, 2021: Evaluation of the offline-coupled GFSv15–FV3–CMAQv5.0.2 in support of the next-generation national air quality forecast capability over the contiguous United States. Geosci. Model Dev., 14, 3969–3993, https://doi.org/10.5194/gmd-14-3969-2021.

  • Clark, A. J., and Coauthors, 2018: The Community Leveraged Unified Ensemble (CLUE) in the 2016 NOAA/Hazardous Weather Testbed Spring Forecasting Experiment. Bull. Amer. Meteor. Soc., 99, 1433–1448, https://doi.org/10.1175/BAMS-D-16-0309.1.

  • Clark, A. J., and Coauthors, 2021: Spring forecasting experiment 2021 conducted by the experimental forecast program of the NOAA Hazardous Weather Testbed: Preliminary findings and results. NOAA Tech. Doc., 86 pp., https://hwt.nssl.noaa.gov/sfe/2021/docs/HWT_SFE_2021_Prelim_Findings_FINAL.pdf.

  • Clark, A. J., and Coauthors, 2022: Spring forecasting experiment 2022 conducted by the experimental forecast program of the NOAA Hazardous Weather Testbed: Program overview and operations plan. NOAA Tech. Doc., 43 pp., https://hwt.nssl.noaa.gov/sfe/2022/docs/HWT_SFE2022_operations_plan.pdf.

  • De Pondeca, M. S. F. V., and Coauthors, 2011: The real-time mesoscale analysis at NOAA’s National Centers for Environmental Prediction: Current status and development. Wea. Forecasting, 26, 593–612, https://doi.org/10.1175/WAF-D-10-05037.1.

  • Dowell, D. C., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part I: Motivation and system description. Wea. Forecasting, 37, 1371–1395, https://doi.org/10.1175/WAF-D-21-0151.1.

  • Duda, J. D., X. Wang, F. Kong, M. Xue, and J. Berner, 2016: Impact of a stochastic kinetic energy backscatter scheme on warm season convection-allowing ensemble forecasts. Mon. Wea. Rev., 144, 1887–1908, https://doi.org/10.1175/MWR-D-15-0092.1.

  • Fabry, F., and V. Meunier, 2020: Why are radar data so difficult to assimilate skillfully? Mon. Wea. Rev., 148, 2819–2836, https://doi.org/10.1175/MWR-D-19-0374.1.

  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723–757, https://doi.org/10.1002/qj.49712555417.

  • Gasperoni, N. A., X. Wang, and Y. Wang, 2022: Using a cost-effective approach to increase background ensemble member size within the GSI-based EnVar system for improved radar analyses and forecasts of convective systems. Mon. Wea. Rev., 150, 667–689, https://doi.org/10.1175/MWR-D-21-0148.1.

  • Gasperoni, N. A., X. Wang, and Y. Wang, 2023: Valid time shifting for an experimental RRFS convection-allowing EnVar data assimilation and forecast system: Description and systematic evaluation in real time. Mon. Wea. Rev., 151, 1229–1245, https://doi.org/10.1175/MWR-D-22-0089.1.

  • Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155–167, https://doi.org/10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2.

  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 2776–2790, https://doi.org/10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

  • Harris, L. M., and S.-J. Lin, 2013: A two-way nested global-regional dynamical core on the cubed-sphere grid. Mon. Wea. Rev., 141, 283–306, https://doi.org/10.1175/MWR-D-11-00201.1.

  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123–137, https://doi.org/10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2.

  • Houtekamer, P. L., and F. Zhang, 2016: Review of the ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 144, 4489–4532, https://doi.org/10.1175/MWR-D-15-0440.1.

  • Houtekamer, P. L., X. Deng, H. L. Mitchell, S.-J. Baek, and N. Gagnon, 2014: Higher resolution in an operational ensemble Kalman filter. Mon. Wea. Rev., 142, 1143–1162, https://doi.org/10.1175/MWR-D-13-00138.1.

  • Houze, R. A., Jr., 2004: Mesoscale convective systems. Rev. Geophys., 42, RG4003, https://doi.org/10.1029/2004RG000150.

  • Hu, X.-M., J. Park, T. Supinie, N. A. Snook, M. Xue, K. A. Brewster, J. Brotzge, and J. R. Carley, 2023: Diagnosing near-surface model errors with candidate physics parameterization schemes for the multiphysics Rapid Refresh Forecast System (RRFS) ensemble during winter over the northeastern United States and southern Great Plains. Mon. Wea. Rev., 151, 39–61, https://doi.org/10.1175/MWR-D-22-0085.1.

  • Huang, B., and X. Wang, 2018: On the use of cost-effective valid-time-shifting (VTS) method to increase ensemble size in the GFS hybrid 4DEnVar system. Mon. Wea. Rev., 146, 2973–2998, https://doi.org/10.1175/MWR-D-18-0009.1.

  • Huang, B., X. Wang, D. T. Kleist, and T. Lei, 2021: A simultaneous multiscale data assimilation using scale-dependent localization in GSI-based hybrid 4DEnVar for NCEP FV3-based GFS. Mon. Wea. Rev., 149, 479–501, https://doi.org/10.1175/MWR-D-20-0166.1.

  • Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins, 2008: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models. J. Geophys. Res., 113, D13103, https://doi.org/10.1029/2008JD009944.

  • Jankov, I., J. Beck, J. Wolff, M. Harrold, J. B. Olson, T. Smirnova, C. Alexander, and J. Berner, 2019: Stochastically perturbed parameterizations in an HRRR-based ensemble. Mon. Wea. Rev., 147, 153–173, https://doi.org/10.1175/MWR-D-18-0092.1.

  • Johnson, A., X. Wang, J. R. Carley, L. J. Wicker, and C. Karstens, 2015: A comparison of multiscale GSI-based EnKF and 3DVar data assimilation using radar and conventional observations for midlatitude convective-scale precipitation forecasts. Mon. Wea. Rev., 143, 3087–3108, https://doi.org/10.1175/MWR-D-14-00345.1.

  • Kalina, E. A., I. Jankov, T. Alcott, J. Olson, J. Beck, J. Berner, D. Dowell, and C. Alexander, 2021: A progress report on the development of the High-Resolution Rapid Refresh ensemble. Wea. Forecasting, 36, 791–804, https://doi.org/10.1175/WAF-D-20-0098.1.

  • Lei, L., and J. S. Whitaker, 2017: Evaluating the trade‐offs between ensemble size and ensemble resolution in an ensemble‐variational data assimilation system. J. Adv. Model. Earth Syst., 9, 781–789, https://doi.org/10.1002/2016MS000864.

  • Majda, A. J., and S. N. Stechmann, 2016: Models for multiscale interactions. Part II: Madden–Julian oscillation, moisture, and convective momentum transport. Multiscale Convection-Coupled Systems in the Tropics: A Tribute to Dr. Michio Yanai, Meteor. Monogr., No. 56, Amer. Meteor. Soc., https://doi.org/10.1175/AMSMONOGRAPHS-D-15-0005.1.

  • Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res., 102, 16 663–16 682, https://doi.org/10.1029/97JD00237.

  • Nakanishi, M., and H. Niino, 2009: Development of an improved turbulence closure model for the atmospheric boundary layer. J. Meteor. Soc. Japan, 87, 895–912, https://doi.org/10.2151/jmsj.87.895.

  • Olson, J. B., J. S. Kenyon, W. A. Angevine, J. M. Brown, M. Pagowski, and K. Sušelj, 2019: A description of the MYNN-EDMF scheme and the coupling to other components in WRF–ARW. NOAA Tech. Memo. OAR GSD-61, 42 pp., https://doi.org/10.25923/n9wm-be49.

  • Roberts, N. M., and H. W. Lean, 2008: Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Wea. Rev., 136, 78–97, https://doi.org/10.1175/2007MWR2123.1.

  • Rotunno, R., and C. Snyder, 2008: A generalization of Lorenz’s model for the predictability of flows with many scales of motion. J. Atmos. Sci., 65, 1063–1076, https://doi.org/10.1175/2007JAS2449.1.

  • Schwartz, C. S., and R. A. Sobash, 2017: Generating probabilistic forecasts from convection-allowing ensembles using neighborhood approaches: A review and recommendations. Mon. Wea. Rev., 145, 3397–3418, https://doi.org/10.1175/MWR-D-16-0400.1.

  • Schwartz, C. S., G. S. Romine, K. R. Smith, and M. L. Weisman, 2014: Characterizing and optimizing precipitation forecasts from a convection-permitting ensemble initialized by a mesoscale ensemble Kalman filter. Wea. Forecasting, 29, 1295–1318, https://doi.org/10.1175/WAF-D-13-00145.1.

  • Smith, T. M., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) severe weather and aviation products: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 1617–1630, https://doi.org/10.1175/BAMS-D-14-00173.1.

  • Sun, T., J. Sun, Y. Chen, Y. Zhang, Z. Ying, and H. Chen, 2022: Improving short-term precipitation forecasting with radar data assimilation and a multiscale hybrid ensemble–variational strategy. Mon. Wea. Rev., 150, 2357–2377, https://doi.org/10.1175/MWR-D-21-0325.1.

  • Thompson, G., and T. Eidhammer, 2014: A study of aerosol impacts on clouds and precipitation development in a large winter cyclone. J. Atmos. Sci., 71, 3636–3658, https://doi.org/10.1175/JAS-D-13-0305.1.

  • Wang, X., D. Parrish, D. Kleist, and J. Whitaker, 2013: GSI 3DVar-based ensemble variational hybrid data assimilation for NCEP global forecast system: Single-resolution experiments. Mon. Wea. Rev., 141, 4098–4117, https://doi.org/10.1175/MWR-D-12-00141.1.

  • Wang, X., H. G. Chipilski, C. H. Bishop, E. Satterfield, N. Baker, and J. S. Whitaker, 2021: A Multiscale Local Gain form Ensemble Transform Kalman Filter (MLGETKF). Mon. Wea. Rev., 149, 605–622, https://doi.org/10.1175/MWR-D-20-0290.1.

  • Wang, Y., and X. Wang, 2017: Direct assimilation of radar reflectivity without tangent linear and adjoint of the nonlinear observation operator in the GSI-based EnVar system: Methodology and experiment with the 8 May 2003 Oklahoma City tornadic supercell. Mon. Wea. Rev., 145, 1447–1471, https://doi.org/10.1175/MWR-D-16-0231.1.

  • Wang, Y., and X. Wang, 2021: Rapid update with EnVar direct radar reflectivity data assimilation for the NOAA regional convection-allowing NMMB model over the CONUS: System description and initial experiment results. Atmosphere, 12, 1286, https://doi.org/10.3390/atmos12101286.

  • Wang, Y., and X. Wang, 2023: Simultaneous multiscale data assimilation using scale‐ and variable‐dependent localization in EnVar for convection allowing analyses and forecasts: Methodology and experiments for a tornadic supercell. J. Adv. Model. Earth Syst., 15, e2022MS003430, https://doi.org/10.1029/2022MS003430.

  • Wheatley, D. M., K. H. Knopfmeier, T. A. Jones, and G. J. Creager, 2015: Storm-scale data assimilation and ensemble forecasting with the NSSL experimental Warn-on-Forecast system. Part I: Radar data experiments. Wea. Forecasting, 30, 1795–1817, https://doi.org/10.1175/WAF-D-15-0043.1.

  • Whitaker, J. S., and T. M. Hamill, 2012: Evaluating methods to account for system errors in ensemble data assimilation. Mon. Wea. Rev., 140, 3078–3089, https://doi.org/10.1175/MWR-D-11-00276.1.

  • Xu, Q., H. Lu, S. Gao, M. Xue, and M. Tong, 2008: Time-expanded sampling for ensemble Kalman filter: Assimilation experiments with simulated radar observations. Mon. Wea. Rev., 136, 2651–2667, https://doi.org/10.1175/2007MWR2185.1.

  • Yang, F., 2020: GFS development and transition to operations. Unified Forecast System (UFS) Medium-Range Weather (MRW) Application Users’ Training, 57 pp., https://dtcenter.org/sites/default/files/events/2020/20201106-0900a-gfsreview-fanglinyang.pdf.

  • Yussouf, N., E. R. Mansell, L. J. Wicker, D. M. Wheatley, and D. J. Stensrud, 2013: The ensemble Kalman filter analyses and forecasts of the 8 May 2003 Oklahoma City tornadic supercell storm using single- and double-moment microphysics schemes. Mon. Wea. Rev., 141, 3388–3412, https://doi.org/10.1175/MWR-D-12-00237.1.

  • Zhang, F., N. Bei, R. Rotunno, C. Snyder, and C. C. Epifanio, 2007: Mesoscale predictability of moist baroclinic waves: Convection-permitting experiments and multistage error growth dynamics. J. Atmos. Sci., 64, 3579–3594, https://doi.org/10.1175/JAS4028.1.

  • Zhang, F., Y. Weng, J. A. Sippel, Z. Meng, and C. H. Bishop, 2009: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 137, 2105–2125, https://doi.org/10.1175/2009MWR2645.1.

  • Zhang, H., J. Gao, Q. Xu, and L. Ran, 2023: Applying time-expended sampling to ensemble assimilation of remote-sensing data for short-term predictions of thunderstorms. Remote Sens., 15, 2358, https://doi.org/10.3390/rs15092358.

  • Zhang, J., and Coauthors, 2016: Multi-Radar Multi-Sensor (MRMS) quantitative precipitation estimation: Initial operating capabilities. Bull. Amer. Meteor. Soc., 97, 621–638, https://doi.org/10.1175/BAMS-D-14-00174.1.

  • Zhao, Q., Q. Xu, Y. Jin, J. McLay, and C. Reynolds, 2015: Time-expanded sampling for ensemble-based data assimilation applied to conventional and satellite observations. Wea. Forecasting, 30, 855–872, https://doi.org/10.1175/WAF-D-14-00108.1.

  • Zhou, X., and Coauthors, 2022: The development of the NCEP Global Ensemble Forecast System version 12. Wea. Forecasting, 37, 1069–1084, https://doi.org/10.1175/WAF-D-21-0112.1.

  • Zhu, Y., W. Li, X. Zhou, and D. Hou, 2019: Stochastic representation of NCEP GEFS to improve sub-seasonal forecast. Current Trends in the Representation of Physical Processes in Weather and Climate Models, D. A. Randall et al., Eds., Springer, 317–328.

Save
  • Bannister, R. N., 2017: A review of operational methods of variational and ensemble‐variational data assimilation. Quart. J. Roy. Meteor. Soc., 143, 607633, https://doi.org/10.1002/qj.2982.

    • Search Google Scholar
    • Export Citation
  • Banos, I. H., W. D. Mayfield, G. Ge, L. F. Sapucci, J. R. Carley, and L. Nance, 2022: Assessment of the data assimilation framework for the Rapid Refresh Forecast System v0.1 and impacts on forecasts of a convective storm case study. Geosci. Model Dev., 15, 68916917, https://doi.org/10.5194/gmd-15-6891-2022.

    • Search Google Scholar
    • Export Citation
  • Beck, J., and Coauthors, 2022: Implementation and testing of stochastic physics within FV3-LAM and RRFS prototype ensembles using the Common Community Physics Package (CCPP). ECMWF Workshop on Model Uncertainty, Reading, United Kingdom, ECMWF, 20 pp., https://events.ecmwf.int/event/290/contributions/2988/attachments/1818/3287/Model-WS_Beck.pdf.

  • Benjamin, S. G., G. A. Grell, J. M. Brown, T. G. Smirnova, and R. Bleck, 2004: Mesoscale weather prediction with the RUC hybrid isentropic–terrain-following coordinate model. Mon. Wea. Rev., 132, 473494, https://doi.org/10.1175/1520-0493(2004)132<0473:MWPWTR>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Black, T. L., and Coauthors, 2021: A limited area modeling capability for the Finite‐Volume cubed‐sphere (FV3) dynamical core and comparison with a global two‐way nest. J. Adv. Model. Earth Syst., 13, e2021MS002483, https://doi.org/10.1029/2021MS002483.

    • Search Google Scholar
    • Export Citation
  • Bonavita, M., E. Hólm, L. Isaksen, and M. Fisher, 2016: The evolution of the ECMWF hybrid data assimilation system. Quart. J. Roy. Meteor. Soc., 142, 287303, https://doi.org/10.1002/qj.2652.

    • Search Google Scholar
    • Export Citation
  • Bowler, N. E., and Coauthors, 2017: Inflation and localization tests in the development of an ensemble of 4D‐ensemble variational assimilations. Quart. J. Roy. Meteor. Soc., 143, 12801302, https://doi.org/10.1002/qj.3004.

    • Search Google Scholar
    • Export Citation
  • Buehner, M., and Coauthors, 2015: Implementation of deterministic weather forecasting systems based on ensemble–variational data assimilation at Environment Canada. Part I: The global system. Mon. Wea. Rev., 143, 25322559, https://doi.org/10.1175/MWR-D-14-00354.1.

    • Search Google Scholar
    • Export Citation
  • Carley, J. R., and Coauthors, 2021: Status of NOAA’s next generation convection-allowing ensemble: The Rapid Refresh Forecast System. Special Symp. on Global Mesoscale Models, online, Amer. Meteor. Soc., 12.8, https://ams.confex.com/ams/101ANNUAL/meetingapp.cgi/Paper/378383.

  • Caron, J.-F., and M. Buehner, 2018: Scale-dependent background error covariance localization: Evaluation in a global deterministic weather forecasting system. Mon. Wea. Rev., 146, 13671381, https://doi.org/10.1175/MWR-D-17-0369.1.

    • Search Google Scholar
    • Export Citation
  • Chen, X., and Coauthors, 2021: Evaluation of the offline-coupled GFSv15–FV3–CMAQv5.0.2 in support of the next-generation national air quality forecast capability over the contiguous United States. Geosci. Model Dev., 14, 39693993, https://doi.org/10.5194/gmd-14-3969-2021.

    • Search Google Scholar
    • Export Citation
  • Clark, A. J., and Coauthors, 2018: The Community Leveraged Unified Ensemble (CLUE) in the 2016 NOAA/Hazardous Weather Testbed Spring Forecasting Experiment. Bull. Amer. Meteor. Soc., 99, 14331448, https://doi.org/10.1175/BAMS-D-16-0309.1.

    • Search Google Scholar
    • Export Citation
  • Clark, A. J., and Coauthors, 2021: Spring forecasting experiment 2021 conducted by the experimental forecast program of the NOAA Hazardous Weather Testbed: Preliminary findings and results. NOAA Tech. Doc., 86 pp., https://hwt.nssl.noaa.gov/sfe/2021/docs/HWT_SFE_2021_Prelim_Findings_FINAL.pdf.

  • Clark, A. J., and Coauthors, 2022: Spring forecasting experiment 2022 conducted by the experimental forecast program of the NOAA Hazardous Weather Testbed: Program overview and operations plan. NOAA Tech. Doc., 43 pp., https://hwt.nssl.noaa.gov/sfe/2022/docs/HWT_SFE2022_operations_plan.pdf.

  • De Pondeca, M. S. F. V., and Coauthors, 2011: The real-time mesoscale analysis at NOAA’s National Centers for Environmental Prediction: Current status and development. Wea. Forecasting, 26, 593612, https://doi.org/10.1175/WAF-D-10-05037.1.

    • Search Google Scholar
    • Export Citation
  • Dowell, D. C., and Coauthors, 2022: The High-Resolution Rapid Refresh (HRRR): An hourly updating convection-allowing forecast model. Part I: Motivation and system description. Wea. Forecasting, 37, 13711395, https://doi.org/10.1175/WAF-D-21-0151.1.

    • Search Google Scholar
    • Export Citation
  • Duda, J. D., X. Wang, F. Kong, M. Xue, and J. Berner, 2016: Impact of a stochastic kinetic energy backscatter scheme on warm season convection-allowing ensemble forecasts. Mon. Wea. Rev., 144, 18871908, https://doi.org/10.1175/MWR-D-15-0092.1.

    • Search Google Scholar
    • Export Citation
  • Fabry, F., and V. Meunier, 2020: Why are radar data so difficult to assimilate skillfully? Mon. Wea. Rev., 148, 28192836, https://doi.org/10.1175/MWR-D-19-0374.1.

    • Search Google Scholar
    • Export Citation
  • Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723757, https://doi.org/10.1002/qj.49712555417.

    • Search Google Scholar
    • Export Citation
  • Gasperoni, N. A., X. Wang, and Y. Wang, 2022: Using a cost-effective approach to increase background ensemble member size within the GSI-based EnVar system for improved radar analyses and forecasts of convective systems. Mon. Wea. Rev., 150, 667689, https://doi.org/10.1175/MWR-D-21-0148.1.

    • Search Google Scholar
    • Export Citation
  • Gasperoni, N. A., X. Wang, and Y. Wang, 2023: Valid time shifting for an experimental RRFS convection-allowing EnVar data assimilation and forecast system: Description and systematic evaluation in real time. Mon. Wea. Rev., 151, 12291245, https://doi.org/10.1175/MWR-D-22-0089.1.

    • Search Google Scholar
    • Export Citation
  • Hamill, T. M., 1999: Hypothesis tests for evaluating numerical precipitation forecasts. Wea. Forecasting, 14, 155167, https://doi.org/10.1175/1520-0434(1999)014<0155:HTFENP>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. Mon. Wea. Rev., 129, 27762790, https://doi.org/10.1175/1520-0493(2001)129<2776:DDFOBE>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Harris, L. M., and S.-J. Lin, 2013: A two-way nested global-regional dynamical core on the cubed-sphere grid. Mon. Wea. Rev., 141, 283306, https://doi.org/10.1175/MWR-D-11-00201.1.

    • Search Google Scholar
    • Export Citation
  • Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 129, 123137, https://doi.org/10.1175/1520-0493(2001)129<0123:ASEKFF>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Houtekamer, P. L., and F. Zhang, 2016: Review of the ensemble Kalman filter for atmospheric data assimilation. Mon. Wea. Rev., 144, 44894532, https://doi.org/10.1175/MWR-D-15-0440.1.

    • Search Google Scholar
    • Export Citation
  • Houtekamer, P. L., X. Deng, H. L. Mitchell, S.-J. Baek, and N. Gagnon, 2014: Higher resolution in an operational ensemble Kalman filter. Mon. Wea. Rev., 142, 11431162, https://doi.org/10.1175/MWR-D-13-00138.1.

    • Search Google Scholar
    • Export Citation
  • Houze, R. A., Jr., 2004: Mesoscale convective systems. Rev. Geophys., 42, RG4003, https://doi.org/10.1029/2004RG000150.

  • Hu, X.-M., J. Park, T. Supinie, N. A. Snook, M. Xue, K. A. Brewster, J. Brotzge, and J. R. Carley, 2023: Diagnosing near-surface model errors with candidate physics parameterization schemes for the multiphysics Rapid Refresh Forecast System (RRFS) ensemble during winter over the northeastern United States and southern Great Plains. Mon. Wea. Rev., 151, 3961, https://doi.org/10.1175/MWR-D-22-0085.1.

    • Search Google Scholar
    • Export Citation
  • Huang, B., and X. Wang, 2018: On the use of cost-effective valid-time-shifting (VTS) method to increase ensemble size in the GFS hybrid 4DEnVar system. Mon. Wea. Rev., 146, 29732998, https://doi.org/10.1175/MWR-D-18-0009.1.

    • Search Google Scholar
    • Export Citation
  • Huang, B., X. Wang, D. T. Kleist, and T. Lei, 2021: A simultaneous multiscale data assimilation using scale-dependent localization in GSI-based hybrid 4DEnVar for NCEP FV3-based GFS. Mon. Wea. Rev., 149, 479501, https://doi.org/10.1175/MWR-D-20-0166.1.

    • Search Google Scholar
    • Export Citation
  • Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins, 2008: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models. J. Geophys. Res., 113, D13103, https://doi.org/10.1029/2008JD009944.

    • Search Google Scholar
    • Export Citation
  • Jankov, I., J. Beck, J. Wolff, M. Harrold, J. B. Olson, T. Smirnova, C. Alexander, and J. Berner, 2019: Stochastically perturbed parameterizations in an HRRR-based ensemble. Mon. Wea. Rev., 147, 153173, https://doi.org/10.1175/MWR-D-18-0092.1.

    • Search Google Scholar
    • Export Citation
  • Johnson, A., X. Wang, J. R. Carley, L. J. Wicker, and C. Karstens, 2015: A comparison of multiscale GSI-based EnKF and 3DVar data assimilation using radar and conventional observations for midlatitude convective-scale precipitation forecasts. Mon. Wea. Rev., 143, 30873108, https://doi.org/10.1175/MWR-D-14-00345.1.

    • Search Google Scholar
    • Export Citation
  • Kalina, E. A., I. Jankov, T. Alcott, J. Olson, J. Beck, J. Berner, D. Dowell, and C. Alexander, 2021: A progress report on the development of the High-Resolution Rapid Refresh ensemble. Wea. Forecasting, 36, 791804, https://doi.org/10.1175/WAF-D-20-0098.1.

    • Search Google Scholar
    • Export Citation
  • Lei, L., and J. S. Whitaker, 2017: Evaluating the trade‐offs between ensemble size and ensemble resolution in an ensemble‐variational data assimilation system. J. Adv. Model. Earth Syst., 9, 781789, https://doi.org/10.1002/2016MS000864.

    • Search Google Scholar
    • Export Citation
  • Majda, A. J., and S. N. Stechmann, 2016: Models for multiscale interactions. Part II: Madden–Julian oscillation, moisture, and convective momentum transport. Multiscale Convection-Coupled Systems in the Tropics: A Tribute to Dr. Michio Yanai, Meteor. Monogr., No. 56, Amer. Meteor. Soc., https://doi.org/10.1175/AMSMONOGRAPHS-D-15-0005.1.

  • Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res., 102, 16 66316 682, https://doi.org/10.1029/97JD00237.

    • Search Google Scholar
    • Export Citation
  • Nakanishi, M., and H. Niino, 2009: Development of an improved turbulence closure model for the atmospheric boundary layer. J. Meteor. Soc. Japan, 87, 895912, https://doi.org/10.2151/jmsj.87.895.

    • Search Google Scholar
    • Export Citation
  • Olson, J. B., J. S. Kenyon, W. A. Angevine, J. M. Brown, M. Pagowski, and K. Sušelj, 2019: A description of the MYNN-EDMF scheme and the coupling to other components in WRF–ARW. NOAA Tech. Memo. OAR GSD-61, 42 pp., https://doi.org/10.25923/n9wm-be49.

  • Roberts, N. M., and H. W. Lean, 2008: Scale-selective verification of rainfall accumulations from high-resolution forecasts of convective events. Mon. Wea. Rev., 136, 7897, https://doi.org/10.1175/2007MWR2123.1.

    • Search Google Scholar
    • Export Citation
  • Rotunno, R., and C. Snyder, 2008: A generalization of Lorenz’s model for the predictability of flows with many scales of motion. J. Atmos. Sci., 65, 10631076, https://doi.org/10.1175/2007JAS2449.1.

    • Search Google Scholar
    • Export Citation
  • Schwartz, C. S., and R. A. Sobash, 2017: Generating probabilistic forecasts from convection-allowing ensembles using neighborhood approaches: A review and recommendations. Mon. Wea. Rev., 145, 33973418, https://doi.org/10.1175/MWR-D-16-0400.1.