1. Introduction
Precipitation over the U.S. West Coast is mainly characterized by atmospheric rivers (ARs), which typically occur during winter and produce 92% of the West Coast's heaviest 3-day rain events (Ralph and Dettinger 2012). California experiences large variations in annual precipitation and streamflow, with only 5–15 wet days needed to accumulate most of the annual precipitation (Dettinger et al. 2011). Because much of the U.S. West Coast is susceptible both to prolonged drought and to flooding when extreme precipitation occurs (e.g., Ralph et al. 2006), resource management and hazard mitigation are challenging. Understanding the physical mechanisms of AR development, including cyclogenesis of the parent circulation, and the manner in which the horizontal vapor flux is translated into orographic precipitation, is foundational to leveraging forecast information in decision making. Numerical weather prediction (NWP) models aid in understanding the physical processes associated with the development and progression of ARs and AR-related impacts.
Retrospective weather forecasts (“reforecasts” or “hindcasts”) are generated from a numerical model with a configuration that remains consistent throughout the time period being simulated, resulting in products that are valuable for model development, process-based studies, and training of postprocessing and machine learning algorithms. A reforecast can provide a long record of forecast output to quantify the performance of the NWP system (Hamill et al. 2006). For example, they are used to examine skill of a wide variety of modeled phenomena of different spatial and temporal scales, including the Madden–Julian oscillation (Jones et al. 2015), tropical cyclones (Li et al. 2016), monsoon precipitation (Zuo et al. 2013), the intertropical convergence zone (Shonk et al. 2018), warm conveyor belts (Wandel et al. 2021), atmospheric rivers (Nardi et al. 2018; DeFlorio et al. 2018), and European winter storms (Pantillon et al. 2017).
Numerous studies have demonstrated the positive impact on precipitation forecasts by calibrating with reforecasts, from which model climatology and statistics are estimated (e.g., Roulin and Vannitsem 2012; Fundel et al. 2010; Yang et al. 2017; Hamill et al. 2004, 2008; Hamill and Whitaker 2006; Vitart 2004). This postprocessing technique has also been used to show improvement in global flood forecasting (Zsoter et al. 2020), fog and visibility for aviation (Herman and Schumacher 2016a), 2-m temperature forecasts (Hagedorn et al. 2008; Guan et al. 2015), and streamflow predictions (Jörg-Hess et al. 2015; Van den Bergh and Roulin 2016). The use of machine learning techniques to statistically postprocess forecasts is rapidly growing (Vannitsem et al. 2021), with reforecasts used as the basis to improve the accuracy of probabilistic forecasts (e.g., Scher and Messori 2018; Chapman et al. 2022).
Several operational weather forecasting centers create reforecast datasets, including the 11-member, 20-yr Global Ensemble Forecast System (GEFS) Reforecast, version 12, developed by the National Oceanic and Atmospheric Administration (NOAA); the European Centre for Medium-Range Weather Forecasts (ECMWF) 11-member, 20-yr ensemble; and the Met Office 7-member, 24-yr ensemble. There are also several regional deterministic reforecasts, such as the COSMO-CLM 37-yr reforecast over the Russian Arctic (Platonov and Varentsov 2021) and the RAS-NAAD 40-yr North Atlantic atmospheric reforecast (Gavrikov et al. 2020). The global reforecasts generally have the same spatial resolution as their operational counterparts, which is too coarse to resolve cloud processes and/or the local terrain effects that produce the extreme precipitation relevant to ARs (Wehner et al. 2010; Li et al. 2011; Tripathi and Dominguez 2013; Herman and Schumacher 2016b). Regional reforecasts are generated at finer resolution, aiming to improve the simulation of small-scale processes.
To capture a large collection of extreme events, a customized version of WRF (herein West-WRF) was used to generate a reforecast spanning 34 cool seasons (1 December–31 March), from 1986 to 2019. This high-resolution reforecast is targeted toward AR prediction over the northeast Pacific/U.S. West Coast region using the West-WRF regional atmospheric model (Martin et al. 2018). West-WRF is an ongoing effort at the Center for Western Weather and Water Extremes (CW3E) to develop a regional weather prediction system tailored to western U.S. weather and hydrological extremes, including heavy rainfall and extended dry periods. West-WRF has been established as a skillful tool, reproducing, for example, the dynamical relationship between water vapor flux and storm-total precipitation, the distribution of extreme precipitation in the Russian River watershed (Martin et al. 2018), and narrow cold-frontal rainbands in Southern California (Cannon et al. 2020).
This reforecast product supports research efforts geared toward improving the understanding of AR forecast skill, particularly at the watershed level, where reservoir management aims to leverage precipitation and hydrologic forecasts in decision making [see Forecast-Informed Reservoir Operations (FIRO); Delaney et al. 2020; Jasperse et al. 2020]. The aim of this paper is to introduce the West-WRF reforecast to the scientific community and present the first results of its evaluation. Section 2 describes the West-WRF Model and verification methods. The results are presented in section 3, followed by applications in section 4, in which current and potential dataset uses are considered. In section 5, the summary and conclusions are presented.
2. Data and methods
a. West-WRF configuration and forcing
The reforecast herein uses the regional Weather Research and Forecasting (WRF) Model, v4.0.1 (Skamarock et al. 2019; Powers et al. 2017), to generate high-resolution forecasts for AR prediction along the U.S. West Coast. The configuration of the WRF Model is largely adapted from Martin et al. (2018). West-WRF solves the nonhydrostatic Euler equations on a spatially discretized horizontal grid with a terrain-following hydrostatic-pressure vertical coordinate. The West-WRF Model dynamically downscales the 0.5° Global Ensemble Forecast System v10 control member into two domains. The outer domain has 9-km grid spacing over the northeast Pacific Ocean and U.S. West Coast, with 618 × 569 mass grid points, and the inner 3-km domain over California has 486 × 399 mass grid points (Fig. 1). Both domains have 60 full model vertical levels and a model top of 10 hPa. Forecasts are run to 168 h (7 days) for the 9-km domain and 120 h (5 days) for the 3-km domain (Fig. 2). The West-WRF reforecast is run for a total of 34 boreal cool seasons (1 December–31 March) extending from 1986 to 2019, and forecasts are initialized at 0000 UTC daily using a cold start and no soil spinup. GEFS v10 is used as forcing from 1986 to 2017, and the reforecast is extended to 2019 using Global Forecast System (GFS) operational 0.5° output as forcing (Fig. 2). An adaptive time step is used for all domains, starting at 5 × dx and ranging between dx and 8 × dx, targeting a domain-wide vertical Courant–Friedrichs–Lewy (CFL) criterion of 1.28.
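As an illustration only, the following Python sketch encodes the adaptive time-step bounds described above; the mapping of dx in kilometers to a time step in seconds follows the common WRF rule of thumb and is an assumption here, not the actual namelist settings.

```python
# Minimal sketch of the adaptive time-step bounds described above.
# Assumption: the "5 x dx" convention maps dx in km to a step in seconds,
# per the common WRF rule of thumb; this is not the actual namelist.
def adaptive_dt_bounds(dx_km: float) -> dict:
    """Starting, minimum, and maximum time steps (s) for one domain."""
    return {
        "start_dt_s": 5 * dx_km,   # initial step: 5 x dx
        "min_dt_s": 1 * dx_km,     # floor: 1 x dx
        "max_dt_s": 8 * dx_km,     # ceiling: 8 x dx
        "target_cfl": 1.28,        # domain-wide vertical CFL target
    }

for dx in (9.0, 3.0):              # outer and nested domains
    print(f"{dx:.0f}-km domain:", adaptive_dt_bounds(dx))
```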
Fig. 1. The 9-km West-WRF domain (full map) and 3-km nest (red outline) with terrain height (m; shaded). The grids are mapped with a Lambert conformal projection.
Fig. 2. West-WRF reforecast time period, forcings (GEFS v10 and GFS), and forecast lead times (arrows).
In this reforecast, the 9-km domain was expanded to the north and west from the original West-WRF configuration (Martin et al. 2018) to better capture the area of the parent cyclone development for landfalling ARs. Model parameterization schemes include the Thompson microphysics scheme (Thompson et al. 2008), the Yonsei University (YSU) planetary boundary layer scheme (Hong et al. 2006), the RRTMG longwave and shortwave radiation schemes (Iacono et al. 2008), the Noah land surface model (Chen and Dudhia 2001), and the Grell–Freitas convective scheme (9-km domain only; Grell and Freitas 2014). All simulations were performed on the Onyx supercomputer at the U.S. Army Engineer Research and Development Center (ERDC). Total output is generated every 3 h, with select near-surface 2D variables output hourly.
b. Verification
The West-WRF reforecast is verified against both observations and gridded reanalysis output. To assess the skill of this reforecast, we also verify the native 0.5° GEFS v10 control run (herein GEFS), from which the reforecast has been downscaled, and compare results. Although the reforecast extends to 2019, it is verified only up to 2017, the period during which the forcing dataset was consistent (Fig. 2). To provide a full evaluation of this reforecast with a focus on U.S. West Coast hydrological extremes, we evaluate 1) surface wind, temperature, and humidity; 2) vertical profiles of geopotential height; 3) atmospheric river intensity and landfall; and 4) precipitation, including extreme events and seasonal totals during the cool season (1 December–31 March).
1) Near-surface and upper-air verification
Observations from the National Centers for Environmental Prediction (NCEP) Meteorological Assimilation Data Ingest System (MADIS; Miller et al. 2005) are used for near-surface and upper-air verification of both the 3- and 9-km reforecasts and GEFS over the period 2002–17 (the time period over which MADIS and the GEFS v10 input overlap). Observation–model pairs of 2-m temperature, 2-m dewpoint temperature, and 10-m wind speed are constructed at 6-hourly intervals over the entire West-WRF domain. For upper-air verification, instantaneous geopotential height observations on seven pressure levels (925, 850, 700, 500, 400, 300, and 200 hPa) at 12-hourly intervals are compared with the reforecast. The bias, the bias-corrected root-mean-square error (BCRMSE; the RMSE computed after removing the mean error, so that the bias does not contribute), and the Pearson correlation coefficient are calculated as a function of lead time for each forecast, averaged across all model initialization times.
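For concreteness, the following is a minimal Python sketch of these metric definitions for one lead time, given matched observation–model pairs (function and variable names are illustrative, not from the verification code used in this study):

```python
import numpy as np

def surface_metrics(forecast: np.ndarray, observed: np.ndarray) -> dict:
    """Bias, BCRMSE, and Pearson correlation for one lead time,
    from matched forecast/observation pairs."""
    error = forecast - observed
    bias = error.mean()
    # BCRMSE: RMSE computed after removing the mean error, so the
    # systematic bias does not contribute (equivalently, the standard
    # deviation of the error).
    bcrmse = np.sqrt(np.mean((error - bias) ** 2))
    corr = np.corrcoef(forecast, observed)[0, 1]
    return {"bias": bias, "bcrmse": bcrmse, "correlation": corr}
```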
2) Integrated water vapor transport (IVT) evaluation
We use the ERA5 reanalysis (Hersbach et al. 2020) as the reference gridded dataset for IVT, following results from Cobb et al. (2021). This atmospheric reanalysis has 0.25° horizontal resolution and 37 pressure levels. The definition of an AR is based on IVT, vertically integrated from 1000 to 300 hPa, and this quantity is used to verify AR forecasts in terms of intensity, landfall position, and location. For this analysis, AR objects are detected over the 1986–2017 period in ERA5, GEFS, and the West-WRF reforecast 9-km domain (the 3-km domain is too small for entire ARs to be detected) using the Method for Object-Based Diagnostic Evaluation (MODE; Davis et al. 2006; Bullock et al. 2016). AR objects require a minimum IVT threshold of 500 kg m−1 s−1 and a minimum length of 1500 km. This IVT threshold focuses the analysis on more extreme events, with intensities below it considered weak on the AR scale (Ralph et al. 2019). The intensity of each AR object is defined as the 90th-percentile IVT value within that object. The 90th-percentile value is used, rather than the maximum IVT, to mitigate the effect of model resolution on the intensity (DeHaan et al. 2021). An intensity error is computed for any forecast in which ERA5, GEFS, and the West-WRF reforecast all have an AR object detected by MODE. The landfall position is defined as the latitude of maximum IVT at the coastline. ERA5, GEFS, and the reforecast must all have a landfalling AR for the position error to be computed. The error in AR location is evaluated using the “measure of effectiveness” (MoE; Warner et al. 2004; DeHaan et al. 2021). The MoE is a two-dimensional metric based on the area of intersection between the ERA5 and forecast AR objects. To create a one-dimensional metric, the distance from a perfect MoE is used here, as described in DeHaan et al. (2021). A distance of 0 (a perfect MoE) indicates that the ERA5 and forecast AR objects are exactly the same size and in exactly the same location. As the distance increases, there are either more grid points in the forecast AR object that are not in the ERA5 AR object, or more grid points in the ERA5 AR object that are not in the forecast AR object. The maximum distance (√2) occurs when the two AR objects do not intersect at any point.
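To make the IVT and MoE definitions concrete, the sketch below computes the IVT magnitude by trapezoidal integration over pressure and the distance from a perfect MoE from object areas. It assumes the two MoE components are the overlap area normalized by the observed and forecast object areas, which is consistent with the 0-to-√2 range described above; all names are illustrative.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m s-2)

def ivt_magnitude(q, u, v, p_hpa):
    """IVT (kg m-1 s-1) from specific humidity q (kg kg-1) and winds
    u, v (m s-1) on pressure levels p_hpa, integrated 1000-300 hPa."""
    q, u, v = map(np.asarray, (q, u, v))
    p = np.asarray(p_hpa, dtype=float) * 100.0      # hPa -> Pa
    layer = (p >= 30000.0) & (p <= 100000.0)        # 300-1000 hPa
    qu = abs(np.trapz((q * u)[layer], p[layer])) / G
    qv = abs(np.trapz((q * v)[layer], p[layer])) / G
    return float(np.hypot(qu, qv))

def moe_distance(area_fcst, area_obs, area_overlap):
    """Distance from a perfect measure of effectiveness (MoE): 0 for
    identical objects, sqrt(2) when the AR objects do not intersect."""
    return float(np.hypot(1.0 - area_overlap / area_obs,
                          1.0 - area_overlap / area_fcst))
```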
3) Precipitation verification
Precipitation forecasts are verified against quantitative precipitation estimates (QPE) from the California Nevada River Forecast Center (CNRFC). The gridded data are generated from quality-controlled gauge data mapped to a 4-km Hydrologic Rainfall Analysis Grid (HRAP) using index gauges across the watershed (Fig. 3). The HRAP is derived using the Mountain Mapper algorithm (Henkel and Peterson 1996), which interpolates the reference gauge information to nongauge locations, adjusting for orography using the Parameter–Elevation Relationships on Independent Slopes Model (PRISM; Daly et al. 1994).
Fig. 3. (left) Map of the U.S. West Coast showing the location of the domain; (right) zoom-in view of the domain location. The Russian and Yuba basin watersheds are shaded in green, with the GEFS grid outlined in red. California Nevada River Forecast Center (CNRFC) reference gauges used for computing mean areal precipitation (MAP) are identified (squares).
(i) Mean areal precipitation
The mean areal precipitation (MAP) is used to assess precipitation skill because it connects more directly to hydrologic impacts (e.g., watershed runoff), particularly in mountainous watersheds. The Yuba and Russian River watersheds were chosen as focus areas given their contrasting locations (Coast Ranges vs Sierra Nevada) and annual precipitation dominated by wintertime extratropical weather systems and associated ARs (Rutz et al. 2014).
The GEFS forecast MAP is derived by taking the 0.5° grid values centered nearest the individual watershed (Fig. 3). The MAPs are in 6-h intervals beginning at 1200 UTC each day. To arrive at 72-h totals, the individual 6-hourly MAP estimates are summed from 0000 UTC on day 1 to 0000 UTC on day 4, that is, twelve 6-h periods. Each grid cell was weighted by the percentage of the watershed's subbasins it contains. For the Russian watershed, the upper-left grid weight was 30%, the lower-left grid 10%, and the lower-right grid 60%. For the Yuba watershed, the upper-right grid was weighted 20%, the upper-left grid 70%, and the lower-right and lower-left grids both 5% (Fig. 3). This follows the methodology described by Brown et al. (2014).
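A minimal sketch of this weighting, using the percentages quoted above (names are illustrative; the actual processing follows Brown et al. 2014):

```python
# Grid-cell weights from the text: fraction of each watershed's subbasins
# contained in the corresponding 0.5-degree GEFS cell.
RUSSIAN_WEIGHTS = {"upper_left": 0.30, "lower_left": 0.10, "lower_right": 0.60}
YUBA_WEIGHTS = {"upper_right": 0.20, "upper_left": 0.70,
                "lower_right": 0.05, "lower_left": 0.05}

def weighted_map_6h(cell_precip: dict, weights: dict) -> float:
    """One 6-h GEFS MAP estimate (mm): precipitation in each grid cell
    weighted by the watershed fraction it contains."""
    return sum(weights[cell] * cell_precip[cell] for cell in weights)

def map_72h(six_hourly: list, weights: dict) -> float:
    """72-h MAP (mm): the twelve consecutive 6-h estimates from
    0000 UTC day 1 to 0000 UTC day 4, summed."""
    assert len(six_hourly) == 12
    return sum(weighted_map_6h(p, weights) for p in six_hourly)
```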
The West-WRF 3- and 9-km reforecast output was processed using HUC8 shapefiles (green shaded areas in Fig. 3) that define each watershed. All model grid points falling within the shapefiles are spatially averaged for each 6-h accumulation period. The individual 6-h forecasts available for each model (20 for the 3-km reforecast, 28 for the 9-km reforecast, and 32 for GEFS) were aggregated into 72-h MAP forecasts, i.e., days 1–3, 2–4, and 3–5 for all models, and additionally days 4–6 and 5–7 for the 9-km reforecast and GEFS. Days 6–8 are not included for GEFS because there were no matching reforecasts. The verification dates are staggered so that forecasts at all available lead times are validated for each model (e.g., beginning 8 December for the 9-km West-WRF). Given the large sample sizes (3878 GEFS, 3559 West-WRF 9-km, and 3656 West-WRF 3-km), it is assumed that the relatively small differences in sample size among the three models have little impact on the statistical significance of the final results.
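The watershed averaging can be sketched as follows, assuming a point-in-polygon approach with geopandas and shapely for illustration (the production workflow may differ):

```python
import numpy as np
import geopandas as gpd
from shapely.geometry import Point

def watershed_map_6h(precip2d, lat2d, lon2d, huc8_shapefile):
    """Spatially average one 6-h accumulation (mm) over all model grid
    points falling inside the HUC8 watershed polygon (slow but
    transparent point-in-polygon test)."""
    basin = gpd.read_file(huc8_shapefile).unary_union
    inside = np.fromiter(
        (basin.contains(Point(lon, lat))
         for lon, lat in zip(lon2d.ravel(), lat2d.ravel())),
        dtype=bool)
    return precip2d.ravel()[inside].mean()
```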
The MAP metrics evaluated include bias, BCRMSE, coefficient of determination (hereinafter R2), the symmetric extremal dependence index (SEDI; Ferro and Stephenson 2011), and the Heidke skill score (HSS; Jolliffe and Stephenson 2011). The HSS measures the fractional improvement of the forecast over a standard reference forecast (here, the proportion correct expected from random forecasts that are statistically independent of the observations). Like most skill scores, it is normalized by the total range of possible improvements over the standard, which makes it appropriate to compare Heidke skill scores across different datasets. The HSS ranges from −∞ to 1: negative values indicate that the chance forecast is better, zero means no skill, and a perfect forecast obtains an HSS of 1. The SEDI is a skill score designed for extreme events, remaining informative as the hit rate and false-alarm rate decrease toward zero. A value of 1 indicates a hit rate of 1 and a false-alarm rate of zero, and a value less than zero indicates a forecast worse than chance. The 90th-percentile 72-h observed December–March MAP for the period of record (1986–2017) was used as the threshold for the SEDI and HSS metrics.
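For reference, both scores can be computed from the 2 × 2 contingency table defined by the 90th-percentile threshold. The sketch below follows the standard definitions (Ferro and Stephenson 2011; Jolliffe and Stephenson 2011); it assumes the hit rate and false-alarm rate are strictly between 0 and 1 so that the SEDI logarithms are defined.

```python
import numpy as np

def sedi_and_hss(hits, misses, false_alarms, correct_negatives):
    """SEDI and HSS from a 2x2 contingency table; events are 72-h MAPs
    exceeding the 90th-percentile threshold. Requires hit rate and
    false-alarm rate strictly in (0, 1)."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    H = a / (a + c)          # hit rate
    F = b / (b + d)          # false-alarm rate
    sedi = ((np.log(F) - np.log(H) - np.log(1 - F) + np.log(1 - H)) /
            (np.log(F) + np.log(H) + np.log(1 - F) + np.log(1 - H)))
    hss = (2.0 * (a * d - b * c) /
           ((a + c) * (c + d) + (a + b) * (b + d)))
    return sedi, hss
```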
(ii) Seasonal precipitation
For seasonal total observations, the 6-h CNRFC QPE is aggregated into 24-h totals and summed across valid dates. The seasonal totals of West-WRF 3-km reforecast precipitation are calculated by first aggregating the 3-hourly precipitation into 24-h forecast totals. The 24-h totals are then summed across all valid times according to their lead time (e.g., all 48-h forecasts are summed together). Once aggregated, the seasonal totals from the forecast are bilinearly interpolated to the CNRFC grid for comparison. The seasonal totals are calculated for each individual cool season using valid dates between 6 December and 31 March; the valid dates start on 6 December so that a complete set of 5-day forecasts is available, given that the model was initialized on 1 December of each year. Although the 9-km output was also assessed, its overall spatial patterns are the same as those of the 3-km output.
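A hedged xarray sketch of this aggregation (dimension, coordinate, and variable names are illustrative, not those of the actual reforecast files):

```python
import xarray as xr

def seasonal_totals_by_lead(precip: xr.DataArray,
                            cnrfc_grid: xr.Dataset) -> xr.DataArray:
    """precip: 3-hourly accumulations with dims (init_time, lead, lat, lon),
    where 'lead' holds eight 3-h steps per forecast day (so its length is
    divisible by 8). Returns one seasonal total per 24-h lead window,
    bilinearly interpolated to the CNRFC grid."""
    daily = precip.coarsen(lead=8).sum()      # 3-hourly -> 24-h totals
    season = daily.sum(dim="init_time")       # sum across all valid dates
    # bilinear interpolation onto the CNRFC latitude/longitude grid
    return season.interp(lat=cnrfc_grid.lat, lon=cnrfc_grid.lon,
                         method="linear")
```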
3. Results
We present results for the verification and evaluation of the West-WRF reforecast against surface stations, vertical profile measurements, the ERA5 reanalysis, and CNRFC QPE, and compare these results with the verification of GEFS, from which the reforecast was downscaled. Reforecast output is available for the boreal cool season (1 December–31 March) extending from 1986 to 2019, although the analysis extends only to 2017, the period over which the forcing data remain consistent (see section 2a).
a. Surface station
Verification results for the West-WRF reforecast and GEFS against MADIS near-surface observations (both land and ocean) for 2-m temperature (T2), 2-m dewpoint temperature (TD2), and 10-m wind speed (WS10) across the 9- and 3-km domains are shown in Figs. 4 and 5, respectively. Additionally, verification metrics averaged across all lead times are presented in Table 1. For the 9-km reforecast, the T2 and WS10 biases are smaller than those of GEFS, but the TD2 bias is larger (Figs. 4a–c), with a statistically significant bias reduction in T2 by the reforecast during nighttime over days 1 and 2 (∼12 and ∼36 h). The reforecast improves upon GEFS for T2 BCRMSE (Fig. 4d), but results are mixed for TD2 and WS10 (Figs. 4e,f). However, the GEFS correlation is larger than that of the reforecast for all three variables (Figs. 4g–i). For both GEFS and the 9-km reforecast, for all metrics apart from WS10 bias, there is a degradation in skill (increased bias and BCRMSE, decreased correlation) with increasing lead time.
Fig. 4. (a)–(c) West-WRF reforecast 9-km domain (red) and GEFS (blue) bias, (d)–(f) bias-corrected root-mean-square error (BCRMSE), and (g)–(i) correlation as a function of forecast hour (0000 UTC initializations) for (left) 2-m temperature (°C), (center) 2-m dewpoint temperature (°C), and (right) 10-m wind speed (m s−1), using all Meteorological Assimilation Data Ingest System (MADIS) METAR and maritime (land and water) observation–model pairs over the West-WRF 9-km domain, averaged over all model initializations for the 2002–17 cool seasons (1 Dec–31 Mar). Light red and blue shading indicates the 95% confidence interval from bootstrapping for the reforecast and GEFS, respectively.
Fig. 5. As in Fig. 4, but for the West-WRF reforecast 3-km domain (red) and GEFS (blue).
Table 1. Summary of reforecast (“Ref”) verification against surface observations in comparison with GEFS, as the average values for each simulation, variable, and metric across all forecast lead times. Boldface values indicate the better-performing model (reforecast or GEFS) for a particular metric. Asterisks indicate statistically significant improvement by either model (95%, using bootstrapping) for at least one lead time.
The 3-km reforecast has several measures that are equal to or better than those of GEFS (Fig. 5, Table 1). The reforecast T2 bias is similar in magnitude to that of GEFS but of opposite sign (positive rather than negative), a systematic and statistically significant difference of about 1°C (Fig. 5a). The reforecast TD2 bias is similar to that of GEFS (Fig. 5b), while the 3-km reforecast WS10 bias is smaller, a statistically significant improvement at several lead times, particularly at night (local time is UTC − 8 h; Fig. 5c). The 3-km reforecast improves upon GEFS for BCRMSE (Figs. 5d–f) except for nighttime TD2. The correlation of all three surface variables is higher in the reforecast than in GEFS for almost all lead times (Figs. 5g–i). The lead time evolution of the errors shows a distinct diurnal cycle in both the reforecast and GEFS across all variables. The 2-m temperature bias is close to zero during the afternoon in the reforecast and close to zero during the night in GEFS. The biases and BCRMSE in 2-m dewpoint temperature and the biases in 10-m wind speed are larger in both the reforecast and GEFS during the nighttime hours.
b. Vertical profiles
Vertical profiles of geopotential height bias, BCRMSE, and correlation for the 9-km West-WRF reforecast and GEFS against MADIS upper-air radiosonde observations are shown in Fig. 6. Results for the 3-km domain are similar (not shown), even with a different (smaller) set of model–observation pairs, indicating that upper-air verification is dominated by the large-scale circulation. The West-WRF reforecast features a smaller negative bias than GEFS throughout the lower and middle troposphere (below 500 hPa) and during the first 24 h of the forecast (Figs. 6a–c). Otherwise, both models have negative biases in the middle and upper troposphere. BCRMSE (Figs. 6d–f) and correlation (Figs. 6g–i) are similar for both models, with the former increasing with forecast time, especially in the upper troposphere, and the latter falling off with increasing forecast time. Most of the differences, except for bias at several mid- and upper-level early lead times (Fig. 6c), are not statistically significant.
Fig. 6. (left) West-WRF reforecast 9-km domain, (center) GEFS, and (right) their difference for (a)–(c) bias, (d)–(f) bias-corrected root-mean-square error, and (g)–(i) correlation as a function of forecast hour (UTC) for geopotential height, using all Meteorological Assimilation Data Ingest System (MADIS) upper-air observation–model pairs over the West-WRF 9-km domain and averaged for all model initializations over the 2002–17 cool seasons (1 Dec–31 Mar).
c. AR intensity and landfall
The intensity error, landfall position error, and distance from perfect MoE averaged across all categorized ARs for both the West-WRF 9-km reforecast and GEFS, as well as the differences between the two, are shown in Fig. 7. Comparison of both the reforecast and GEFS with ERA5 allows for determination of the “value added” by the dynamical downscaling of GEFS. For all three metrics, the reforecast has smaller errors than GEFS across all forecast lead times. As indicated by the lead times at which the error bars for the difference lie entirely below zero, the improvements are statistically significant at the 95% confidence level for three lead times for landfall position error, four lead times for AR intensity error, and all seven lead times for MoE. The largest improvement in intensity error occurs at the 48-h forecast, improving the intensity forecast by almost 4 kg m−1 s−1, an average error reduction of over 5%, while the largest improvement in landfall position error occurs at the 144-h forecast, an improvement of over 30 km. The reduction in MoE error with the reforecast corresponds to an average improvement in the location of the whole AR of 1% of its area; this 1% can range from roughly 20 000 km2 for a small AR to over 500 000 km2 for a large, correctly forecast AR. While the improvements in AR skill with the West-WRF reforecast are modest on average, these results demonstrate a consistent, clear improvement in AR intensity, landfall, and location characteristics relative to GEFS.
Fig. 7. (a) AR intensity error using a threshold of 500 kg m−1 s−1, where intensity is defined as the 90th-percentile value within any AR object. (b) AR landfall position error using a threshold of 500 kg m−1 s−1, where landfall position is defined as the latitude of maximum IVT at the coastline. (c) Distance from perfect MoE for a threshold of 500 kg m−1 s−1. In all cases, the error bars are the 95% confidence interval computed with bootstrapping. For each of (a)–(c), the top plot shows full values and the bottom plot shows the difference (reforecast minus GEFS).
d. Mean areal precipitation
Figure 8 shows the forecast verification results of 72-h MAP from the 3- and 9-km West-WRF reforecasts and GEFS for the Russian and Yuba watersheds for the 1986–2017 cool seasons. For both watersheds, the 3- and 9-km reforecasts have statistically significantly higher R2 values than GEFS at all forecast lead times (Figs. 8a,b). There are no statistically significant differences between the 9- and 3-km reforecast R2 values for either watershed at any forecast lead time. The Yuba watershed R2 values are consistently higher than those for the Russian watershed at all forecast lead times for all models.
Fig. 8. (a),(b) Coefficient of determination (R2); (c),(d) root-mean-square error (RMSE); and (e),(f) bias for 72-h MAP for the West-WRF 3- and 9-km reforecasts and GEFS for cool seasons (1 Dec–31 Mar) 1986–2017 for the (left) Russian and (right) Yuba watersheds. Error bars denote 95% confidence intervals computed with bootstrapping.
Both the 3- and 9-km reforecasts for the Russian watershed have lower BCRMSE than GEFS for days 1–3, 2–4, and 3–5, statistically significant at the 95% level (Fig. 8c). For days 4–6 and 5–7, the 9-km reforecast shows improvement over GEFS, but it is not statistically significant. In the Yuba watershed (Fig. 8d), there are no statistically significant differences between the reforecasts and GEFS, although both reforecasts have lower BCRMSE for days 1–3 and the 9-km reforecast has higher BCRMSE for days 4–6 and 5–7.
The bias results for the Russian watershed indicate both the 3- and 9-km reforecasts have a statistically significant reduction in bias relative to GEFS at all lead times (Fig. 8e). The GEFS biases range from around −5 to −7 mm, increasing in magnitude with lead time, whereas the reforecast biases range from −4 to −5 mm. A low bias in the modeled precipitation in the Bay Area in Northern California, in which the Russian watershed resides, has been found in previous studies (e.g., English et al. 2021). For the Yuba watershed (Fig. 8f), both the 3- and 9-km reforecasts have a consistent high bias, whereas GEFS has a negative bias. These biases are of generally comparable magnitudes (∼4–5 mm). The GEFS bias increases slightly with lead time, whereas biases in the reforecasts decrease slightly with lead time. The Yuba is one of the wettest watersheds in the Sierra Nevada and West-WRF reforecast results are consistent with other studies showing high-resolution numerical guidance overestimating precipitation in the Sierra Nevada (Caldwell et al. 2009; Hughes et al. 2020). The similar biases for the 9- and 3-km reforecast domains suggest that biases in large-scale forcing are passed from the parent to the nested domain.
A threshold defining extreme rainfall is required to calculate the SEDI and HSS. It was defined as the 90th percentile of observed 3-day MAP for cool seasons 1986–2017, which is 63.5 mm for both watersheds, yielding 325 events for the Russian watershed and 365 for the Yuba watershed. For both watersheds, the West-WRF 3- and 9-km reforecasts tend to outperform GEFS based on the SEDI, but without statistically significant differences (Figs. 9a,b).
Fig. 9. Symmetric extremal dependence index (SEDI) and Heidke skill score (HSS) for the 90th-percentile 72-h MAP for the West-WRF 3- and 9-km reforecasts and GEFS for cool seasons (1 Dec–31 Mar) 1986–2017 for the Russian and Yuba watersheds. Error bars denote 95% confidence intervals computed with bootstrapping.
Both the 3- and 9-km reforecasts show statistically significant improvements in HSS over GEFS for days 1–3 in the Russian watershed (Fig. 9c) and for days 2–4 in the Yuba watershed (Fig. 9d). There is also a statistically significant improvement in HSS over GEFS in the Yuba watershed for the 9-km reforecast for days 3–5. The consistent low bias in the Russian watershed and the high bias in the Yuba watershed for West-WRF contribute to the lower scores in the Russian watershed and the higher scores in the Yuba watershed for these more extreme events. For GEFS, the magnitude of the bias in the Yuba watershed is smaller than in the Russian watershed, which may contribute to its higher scores in the Yuba watershed for these more extreme events.
To further evaluate the West-WRF reforecast, we analyzed the MAP of all landfalling ARs in the Russian watershed at or above category 2 on the AR scale of Ralph et al. (2019) (i.e., maximum IVT greater than 500 kg m−1 s−1 with a duration over 48 h, IVT of 750 kg m−1 s−1 with a duration over 24 h, or IVT greater than 1000 kg m−1 s−1), using the SIO_R1-Catalog (Gershunov et al. 2017). This catalog identifies all landfalling ARs along the U.S. West Coast, with their location, intensity, and duration, based on seven decades of the NCEP reanalysis. These results (not shown) indicate a consistent low bias in both the 3- and 9-km reforecasts for 6-h MAPs through day 5 and day 7, respectively. In total, this analysis shows that at the basin scale, the West-WRF reforecast improves MAP relative to GEFS. The consistent low bias in the reforecast for the Russian watershed and the high bias observed in the Yuba watershed also provide strong motivation to apply bias correction to the reforecasts to improve skill.
e. Seasonal precipitation
Seasonal precipitation patterns are critical for water resource management and reliability throughout California. As such, it is important to understand how well the high-resolution model replicates seasonal precipitation patterns and where biases arise as a function of lead time. The seasonal precipitation errors from the 3-km West-WRF reforecast are shown in Fig. 10 (the 9-km spatial patterns are similar; not shown), and the differences from GEFS are negligible when aggregated across this time period. The cool seasons analyzed (2012–15) span both above and below normal annual precipitation as a demonstration of seasonal precipitation skill. In cool season 2012, rainfall totals in areas encompassed by the California Department of Water Resources (DWR) northern Sierra eight-station index were 85% of normal, and 90%, 58%, and 71% of normal for cool seasons 2013, 2014, and 2015, respectively (California Data Exchange Center; https://cdec.water.ca.gov/precipapp/get8SIPrecipIndex.action). Wet biases of ∼400 mm of precipitation are noted in the northern Sierra Nevada along the California–Nevada border, while dry biases of ∼200 mm are located in the northern Central Valley and Coast Ranges. The magnitude of the errors fluctuates from year to year because of the high interannual variability in the number and location of ARs impacting the U.S. West Coast (e.g., Kim et al. 2013). However, the spatial pattern of the biases is relatively consistent between cool seasons, with large parts of California having a small precipitation error of less than 50 mm. The wet bias in the northern Sierra Nevada decreases with forecast lead time, but the dry bias is relatively consistent through all lead times. Although of varying magnitude, similar spatial bias patterns have been noted in other operational high-resolution regional models, such as the North American Mesoscale Forecast System (NAM) and the High-Resolution Rapid Refresh (HRRR) model (e.g., Dougherty et al. 2021). The persistence of systematic biases in these northern California regions across models suggests that 1) West-WRF's level of seasonal precipitation predictability is on par with other models and 2) the reforecast can be leveraged for in-depth and/or localized physical process investigation and further postprocessing techniques. Regardless of the years chosen for this analysis, the errors are consistent in space.
Fig. 10. Seasonal bias in West-WRF 3-km rainfall over California and Nevada during cool seasons (1 Dec–31 Mar) 2012, 2013, 2014, and 2015 as a function of lead time (days). Brown colors indicate dry bias, and green colors indicate wet bias.
4. Applications
The applications of the reforecast dataset are wide-ranging, including 1) meteorological studies (especially of ARs), 2) postprocessing, and 3) model performance analysis and model improvement. The dataset can be used to examine atmospheric dynamics over the northeast Pacific and U.S. West Coast at high resolution across multiple decades. For example, Fig. 11 shows the distribution of 9-km IVT along an arced band following the U.S. and Canadian West Coast. The distribution of IVT in Fig. 11b during the December–March period of cool season 2017 indicates that maxima of AR-like landfalling conditions occurred at both 43° and 34°N. This bimodal structure spans the entire California coast and coincides with the wettest year on record in the northern Sierra eight-station index (https://cdec.water.ca.gov/cgi-progs/products/PLOT_ESI.pdf). The frequency distribution of the forecasts also indicates that these same areas of analyzed (observed) high-frequency IVT activity are underestimated by forecasts at the longest lead times (e.g., warmer colors representing 4–6-day lead times). The bimodal maximum is unique to cool season 2017, as the mean time-integrated IVT of the reforecast dataset is maximized at ∼43°N and decays southward until the cross section reaches the moisture-laden equatorial tropical bands. The spread of IVT at individual latitudes (e.g., σ = 30% of the mean value at 40°N), shown in Fig. 11c, indicates that seasonal patterns of IVT can differ significantly from one year to the next. This analysis can help future studies by identifying high-frequency areas of AR activity during different cool seasons and can serve as a target for relating the underestimation of forecast AR intensity to precipitation forecast error.
Fig. 11. (a) Latitude–longitude band used for identification of IVT frequency on or near the U.S. and Canadian West Coast, (b) distribution of cool season (1 Dec–31 Mar) 2017 total IVT summed from daily snapshots at 0000 UTC from the reforecast analysis (black) and 1–7-day lead time forecasts (from blue to red), and (c) mean time-total IVT (thick black line) and individual cool-season time-total IVT (gray lines) of the reforecast (1986–2016). The IVT is summed between 6 Dec and 31 Mar of each cool season.
This dataset can also be used to better understand the precipitation distributions during key historical extreme events within the domain. For example, in February 1986, extreme rainfall occurred in the Russian watershed and produced the largest flood on record for the Russian River near Guerneville (Blodgett and Lucas 1988). The duration of IVT above 250 kg m−1 s−1 (the AR threshold) was 186 h, and the maximum observed IVT was 854 kg m−1 s−1. Based on the AR scale (Ralph et al. 2019), this event was categorized as an AR 4 (“mostly hazardous”). The observed 6-h MAP values over a 5-day period for the Russian watershed are shown in Fig. 12, along with the difference relative to the 3-km West-WRF reforecast. For the reforecast, we analyze individual 5-day forecasts issued at 0000 UTC and valid at the times shown. There is no clear trend in forecast error as a function of forecast lead time. There is a positive correlation between the observed 6-h MAP and the magnitude of the negative error between the observations and the 3-km reforecast, strongest at day 2 with 60% variance explained, and averaging 42% across all forecast lead times. However, there is a stronger relationship with the absolute error, with 70% variance explained on day 2 and an average of 51% across all lead times. The relationship is therefore not simply that the reforecast dry bias grows as the MAP increases: errors of either sign tend to grow in magnitude with increasing MAP and shrink with decreasing MAP.
Fig. 12. West-WRF 3-km reforecast 6-h MAP error (bars) for the February 1986 landfalling AR as a function of forecast lead time, and the observed 6-h MAP (red line).
The relatively long duration (34 years) and high resolution of the West-WRF reforecast allow its use as a training dataset in machine learning applications. The West-WRF reforecast has already been used as a training dataset in machine learning studies to improve upon existing model representations of IVT (Chapman et al. 2022) and precipitation (Badrinath et al. 2023). The consistent low bias in precipitation demonstrated by the West-WRF reforecasts for the Russian watershed and for strong landfalling ARs, together with the high bias observed in the Yuba watershed, provides strong motivation to apply machine learning and other bias correction methods to West-WRF to improve model performance. Further applications of machine learning techniques, utilizing the West-WRF reforecast, to improve the representation of precipitation in the near-real-time West-WRF simulations that CW3E runs over the U.S. West Coast every wet season are ongoing, and preliminary results are encouraging.
Another use of the West-WRF reforecast is to quantify the performance of the NWP system and identify avenues for model improvement. While the configuration of the near-real-time West-WRF simulations produced by CW3E no longer matches the reforecast, the similarities are sufficient to still gain insight into model deficiencies. The availability of a long-period reforecast makes it possible to stratify model performance as a function of the synoptic-scale weather pattern, which can be objectively defined using self-organizing maps (SOMs), for example (Easthom et al. 2022). The homogeneous model configuration used across the numerous extreme precipitation events over California and the U.S. West Coast throughout the reforecast allows for process-based studies aiming to improve model parameterization schemes (in particular, microphysics and planetary boundary layer).
5. Summary and conclusions
This paper introduces the West–Weather Research and Forecasting (West-WRF) reforecast and presents its verification against a variety of datasets on several temporal and spatial scales, highlighting the improved skill relative to GEFS, from which it was downscaled. First, domain-wide verification is presented, using temperature, humidity, and wind speed observations near the surface and vertical profiles of geopotential height. Each metric considers a different aspect of skill: systematic errors (bias), random errors and conditional biases (bias-corrected root-mean-square error), and the ability to capture changes in the quantity considered (correlation). For the 3-km West-WRF reforecast near-surface fields, the bias, BCRMSE, and correlation are all better than those of GEFS, with three exceptions where the differences are negligible. Whereas all correlations are better in the 3-km reforecast than in GEFS, all correlations in the 9-km reforecast are worse. The improvements consistent across both the 3- and 9-km reforecasts are in temperature BCRMSE and wind speed bias. For geopotential height, the differences between the reforecast and GEFS in BCRMSE and correlation are negligible, but the reforecast improves the bias, especially in the lower and middle troposphere and in the first 24 h.
The ERA5 reanalysis is used to examine AR intensity and landfall using an IVT threshold of 500 kg m−1 s−1 and a minimum length of 1500 km. Results show that the 9-km reforecast AR intensity and landfall position have smaller errors across all forecast lead times relative to GEFS, with statistically significant differences for several lead times. The error in AR location is evaluated using the Measure of Effectiveness (MoE; DeHaan et al. 2021) and results show that the reforecast also has smaller MoE errors across all forecast lead times relative to GEFS, equivalent to an average improvement in location of the whole AR of 1% of the area of the AR. The verification of ARs demonstrates a consistent improvement when using the West-WRF reforecast in comparison with GEFS.
Focusing on finer-spatial-scale reforecast performance, we analyzed mean areal precipitation (MAP) within two California watersheds, the Russian and the Yuba. For both watersheds, the 3- and 9-km West-WRF reforecast 72-h MAP has statistically significantly higher coefficient of determination values than GEFS at all lead times, with negligible differences between the 9- and 3-km reforecasts. The 3- and 9-km reforecasts for the Russian watershed have significantly lower BCRMSE than GEFS for days 1–3, 2–4, and 3–5. In the Yuba watershed there are no statistically significant differences between the reforecasts and GEFS, although both reforecasts have lower BCRMSE for days 1–3 and the 9-km reforecast has higher BCRMSE for days 4–6 and 5–7. The 3- and 9-km reforecasts have a statistically significant reduction in negative bias relative to GEFS at all lead times for the Russian watershed. For the Yuba watershed, both the 3- and 9-km reforecasts show a consistent high bias, while GEFS shows a negative bias. These West-WRF reforecast results are consistent with other studies showing high-resolution numerical guidance overestimating precipitation in the Sierra Nevada and underestimating it in the Bay Area (Caldwell et al. 2009; Hughes et al. 2020; English et al. 2021).
In terms of extreme precipitation (90th percentile), in both the Russian and Yuba watersheds, the West-WRF reforecast (3 and 9 km) tends to outperform GEFS based on SEDI, but differences are not statistically significant. Both the 3- and 9-km reforecasts show statistically significant improvement in Heidke skill score over GEFS for days 1–3 in the Russian watershed, and for days 2–4 in the Yuba watershed, and additionally in the 9-km reforecast for days 3–5. The consistent low bias observed in the Russian watershed and the high bias observed in the Yuba watershed for the reforecasts are contributing to the lower scores in the Russian watershed and the higher scores in the Yuba watershed for these more extreme events. For GEFS, the magnitude of the bias in the Yuba watershed is less than in the Russian watershed, which may contribute to the higher scores in the Yuba watershed than in the Russian watershed. This consistent low bias was also found when analyzing landfalling ARs in the Russian watershed above an intensity scale of 2 in both the 3- and 9-km reforecasts for 6-h MAPs. This analysis shows that at the basin scale, the reforecast shows improvement in MAP relative to GEFS.
The day-1–5-lead-time seasonal precipitation in California from the 3-km West-WRF reforecast is compared with the CNRFC QPE gridded dataset for cool seasons 2012–15. Wet biases of ∼400 mm of precipitation are noted in the northern Sierra Nevada along the California–Nevada border while dry biases of ∼200 mm are located in the northern Central Valley and Coast Ranges. The seasonal precipitation spatial bias in the reforecast is consistent with other operational high-resolution regional models and can be leveraged for in-depth and/or localized physical process investigation and further postprocessing techniques.
There are challenges associated with modeling and verifying precipitation, especially for extreme events and at the basin scale. The representation of terrain in numerical weather prediction models, and the associated modulation of the atmospheric state that produces precipitation, are often a key source of model error. Further complicating matters, the provision of QPE in complex terrain is difficult. Gridded precipitation datasets, which use remotely sensed observations and spatial interpolation methods (such as Mountain Mapper) to “fill in the gaps” between relatively sparse gauge observations, also have significant uncertainty in areas of complex terrain. Further efforts to improve forecasts of precipitation in complex terrain using the West-WRF reforecast will need to be accompanied by multiple verification datasets and metrics, including those used in this study. The quantification of ARs and their characteristics also carries uncertainties, and several methods and datasets exist for this purpose. Further analysis of the representation of ARs in the West-WRF reforecast, beyond that presented here, could be done with these alternative datasets and methods.
The extreme year-to-year variability of West Coast precipitation is a challenge for water resource managers, who must balance flood protection against maintaining sufficient water supplies. There is high demand from different communities for long-term atmospheric reforecasts performed with high-resolution model configurations for the northeast Pacific and U.S. West Coast, where mesoscale and submesoscale processes are highly relevant. Capturing the impacts of landfalling ARs at watershed scales requires high-resolution spatial domains, given the complex topography of California. This dataset allows users to examine AR and precipitation patterns over a multidecadal period. The principal motivation for creating the West-WRF high-resolution multiyear reforecast was to examine U.S. West Coast hydrological extremes, and the positive verification results show that it can be leveraged for scientific studies of ARs and extreme precipitation, machine learning, and model evaluation.
Acknowledgments.
Thanks are given to Dr. Kristen Guirguis of Scripps Institution of Oceanography, who provided the SIO_R1-Catalog for categorizing the scale of landfalling ARs, and to Shawn Roj for reproducing two figures for this study. This work was supported by the California Department of Water Resources Ph3 AR research program (Award 4600014294) and the Forecast Informed Reservoir Operations Award (USACE W912HZ1920023). Computing resources and support were provided by the DoD High Performance Computing Modernization Program.
Data availability statement.
The West-WRF reforecast output is too large to be publicly archived with available resources. Access to model output, along with documentation and methods used to support this study, is available from author Daniel Steinhoff at CW3E/UCSD (dsteinhoff@ucsd.edu). Near-surface and upper-air meteorological station observations are obtained from the Meteorological Assimilation Data Ingest System (MADIS; https://madis.ncep.noaa.gov/index.shtml). Global Ensemble Forecast System (GEFS), version 10, reforecast output is obtained from the National Energy Research Scientific Computing (NERSC) Center (https://portal.nersc.gov/project/refcst/v2/). The California Nevada River Forecast Center (CNRFC) Quantitative Precipitation Estimates (QPE) are obtained from the CNRFC webportal archive (https://www.cnrfc.noaa.gov/arc_search.php). ERA5 is from the Copernicus Climate Change Service (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview).
REFERENCES
Badrinath, A., L. Delle Monache, N. Hayatbini, W. Chapman, F. Cannon, and F. M. Ralph, 2023: Improving precipitation forecasts with convolutional neural networks. Wea. Forecasting, 38, 291–306, https://doi.org/10.1175/WAF-D-22-0002.1.
Blodgett, J. C., and J. B. Lucas, 1988: Profile of Sacramento River, Freeport to Verona, California, flood of February 1986. USGS Open-File Rep. 88-82, 20 pp., https://pubs.usgs.gov/of/1988/0082/report.pdf.
Brown, J. D., L. Wu, M. He, S. Regonda, H. Lee, and D. Seo, 2014: Verification of temperature, precipitation, and streamflow forecasts from the NOAA/NWS Hydrologic Ensemble Forecast Service (HEFS): 1. Experimental design and forcing verification. J. Hydrol., 519, 2869–2889, https://doi.org/10.1016/j.jhydrol.2014.05.028.
Bullock, R. G., B. G. Brown, and T. L. Fowler, 2016: Method for object-based diagnostic evaluation. NCAR Tech. Note NCAR/TN-532+STR, 84 pp., https://opensky.ucar.edu/islandora/object/technotes%3A546.
Caldwell, P., H.-N. S. Chin, D. C. Bader, and G. Bala, 2009: Evaluation of a WRF dynamical downscaling simulation over California. Climatic Change, 95, 499–521, https://doi.org/10.1007/s10584-009-9583-5.
Cannon, F., and Coauthors, 2020: Observations and predictability of a high-impact narrow cold-frontal rainband over Southern California on 2 February 2019. Wea. Forecasting, 35, 2083–2097, https://doi.org/10.1175/WAF-D-20-0012.1.
Chapman, W. E., L. Delle Monache, S. Alessandrini, A. C. Subramanian, F. M. Ralph, S.-P. Xie, S. Lerch, and N. Hayatbini, 2022: Probabilistic predictions from deterministic atmospheric river forecasts with deep learning. Mon. Wea. Rev., 150, 215–234, https://doi.org/10.1175/MWR-D-21-0106.1.
Chen, F., and J. Dudhia, 2001: Coupling an advanced land surface–hydrology model with the Penn State–NCAR MM5 modeling system. Part I: Model implementation and sensitivity. Mon. Wea. Rev., 129, 569–585, https://doi.org/10.1175/1520-0493(2001)129<0569:CAALSH>2.0.CO;2.
Cobb, A., L. Delle Monache, F. Cannon, and F. M. Ralph, 2021: Representation of dropsonde‐observed atmospheric river conditions in reanalyses. Geophys. Res. Lett., 48, e2021GL093357, https://doi.org/10.1029/2021GL093357.
Daly, C., R. P. Neilson, and D. L. Phillips, 1994: A statistical-topographic model for mapping climatological precipitation over mountainous terrain. J. Appl. Meteor., 33, 140–158, https://doi.org/10.1175/1520-0450(1994)033<0140:ASTMFM>2.0.CO;2.
Davis, C., A. Brown, and R. Bullock, 2006: Object-based verification of precipitation forecasts. Part I: Methodology and application to mesoscale rain areas. Mon. Wea. Rev., 134, 1772–1784, https://doi.org/10.1175/MWR3145.1.
DeFlorio, M. J., D. E. Waliser, B. Guan, D. A. Lavers, F. M. Ralph, and F. Vitart, 2018: Global assessment of atmospheric river prediction skill. J. Hydrometeor., 19, 409–426, https://doi.org/10.1175/JHM-D-17-0135.1.
DeHaan, L. L., A. C. Martin, R. R. Weihs, L. Delle Monache, and F. M. Ralph, 2021: Object-based verification of atmospheric river predictions in the northeast Pacific. Wea. Forecasting, 36, 1575–1587, https://doi.org/10.1175/WAF-D-20-0236.1.
Delaney, C. J., and Coauthors, 2020: Forecast informed reservoir operations using ensemble streamflow predictions for a multipurpose reservoir in Northern California. Water Resour. Res., 56, e2019WR026604, https://doi.org/10.1029/2019WR026604.
Dettinger, M. D., F. M. Ralph, T. Das, P. J. Neiman, and D. R. Cayan, 2011: Atmospheric rivers, floods and the water resources of California. Water, 3, 445–478, https://doi.org/10.3390/w3020445.
Dougherty, K. J., J. D. Horel, and J. E. Nachamkin, 2021: Forecast skill for California heavy precipitation periods from the high-resolution rapid refresh model and the coupled ocean–atmosphere mesoscale prediction system. Wea. Forecasting, 36, 2275–2288, https://doi.org/10.1175/WAF-D-20-0182.1.
Easthom, G. K., G. M. Lackmann, M. J. Molina, and L. DeHaan, 2022: Evaluating variations in atmospheric river predictability as a function of pattern. 31st Conf. on Weather Analysis and Forecasting (WAF)/27th Conf. on Numerical Weather Prediction (NWP), Houston, TX, Amer. Meteor. Soc., 10.5, https://ams.confex.com/ams/102ANNUAL/meetingapp.cgi/Paper/399158.
English, J. M., D. D. Turner, T. I. Alcott, W. R. Moninger, J. L. Bytheway, R. Cifelli, and M. Marquis, 2021: Evaluating operational and experimental HRRR model forecasts of atmospheric river events in California. Wea. Forecasting, 36, 1925–1944, https://doi.org/10.1175/WAF-D-21-0081.1.
Ferro, C. A., and D. B. Stephenson, 2011: Extremal dependence indices: Improved verification measures for deterministic forecasts of rare binary events. Wea. Forecasting, 26, 699–713, https://doi.org/10.1175/WAF-D-10-05030.1.
Fundel, F., A. Walser, M. A. Liniger, C. Frei, and C. Appenzeller, 2010: Calibrated precipitation forecasts for a limited-area ensemble forecast system using reforecasts. Mon. Wea. Rev., 138, 176–189, https://doi.org/10.1175/2009MWR2977.1.
Gavrikov, A., and Coauthors, 2020: RAS-NAAD: 40-yr high-resolution North Atlantic atmospheric hindcast for multipurpose applications (new dataset for the regional mesoscale studies in the atmosphere and the ocean). J. Appl. Meteor. Climatol., 59, 793–817, https://doi.org/10.1175/JAMC-D-19-0190.1.
Gershunov, A., T. Shulgina, F. M. Ralph, D. A. Lavers, and J. J. Rutz, 2017: Assessing the climate-scale variability of atmospheric rivers affecting western North America. Geophys. Res. Lett., 44, 7900–7908, https://doi.org/10.1002/2017GL074175.
Grell, G. A., and S. R. Freitas, 2014: A scale and aerosol aware stochastic convective parameterization for weather and air quality modeling. Atmos. Chem. Phys., 14, 5233–5250, https://doi.org/10.5194/acp-14-5233-2014.
Guan, H., B. Cui, and Y. Zhu, 2015: Improvement of statistical postprocessing using GEFS reforecast information. Wea. Forecasting, 30, 841–854, https://doi.org/10.1175/WAF-D-14-00126.1.
Hagedorn, R., T. M. Hamill, and J. S. Whitaker, 2008: Probabilistic forecast calibration using ECMWF and GFS ensemble reforecasts. Part I: Two-meter temperatures. Mon. Wea. Rev., 136, 2608–2619, https://doi.org/10.1175/2007MWR2410.1.
Hamill, T. M., and J. S. Whitaker, 2006: Probabilistic quantitative precipitation forecasts based on reforecast analogs: Theory and application. Mon. Wea. Rev., 134, 3209–3229, https://doi.org/10.1175/MWR3237.1.
Hamill, T. M., J. S. Whitaker, and X. Wei, 2004: Ensemble reforecasting: Improving medium-range forecast skill using retrospective forecasts. Mon. Wea. Rev., 132, 1434–1447, https://doi.org/10.1175/1520-0493(2004)132<1434:ERIMFS>2.0.CO;2.
Hamill, T. M., J. S. Whitaker, and S. L. Mullen, 2006: Reforecasts: An important dataset for improving weather predictions. Bull. Amer. Meteor. Soc., 87, 33–46, https://doi.org/10.1175/BAMS-87-1-33.
Hamill, T. M., R. Hagedorn, and J. S. Whitaker, 2008: Probabilistic forecast calibration using ECMWF and GFS ensemble reforecasts. Part II: Precipitation. Mon. Wea. Rev., 136, 2620–2632, https://doi.org/10.1175/2007MWR2411.1.
Henkel, A., and C. Peterson, 1996: Can deterministic quantitative precipitation forecasts in mountainous regions be specified in a rapid, climatologically-consistent manner with Mountain Mapper functioning as the tool for mechanical specification, quality control, and verification. Extended Abstracts, Fifth National Heavy Precipitation Workshop, State College, PA, NWS/NOAA, 31 pp.
Herman, G. R., and R. S. Schumacher, 2016a: Using reforecasts to improve forecasting of fog and visibility for aviation. Wea. Forecasting, 31, 467–482, https://doi.org/10.1175/WAF-D-15-0108.1.
Herman, G. R., and R. S. Schumacher, 2016b: Extreme precipitation in models: An evaluation. Wea. Forecasting, 31, 1853–1879, https://doi.org/10.1175/WAF-D-16-0093.1.
Hersbach, H., and Coauthors, 2020: The ERA5 global reanalysis. Quart. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803.
Hong, S.-Y., Y. Noh, and J. Dudhia, 2006: A new vertical diffusion package with an explicit treatment of entrainment processes. Mon. Wea. Rev., 134, 2318–2341, https://doi.org/10.1175/MWR3199.1.
Hughes, M., J. D. Lundquist, and B. Henn, 2020: Dynamical downscaling improves upon gridded precipitation products in the Sierra Nevada, California. Climate Dyn., 55, 111–129, https://doi.org/10.1007/s00382-017-3631-z.
Iacono, M. J., J. S. Delamere, E. J. Mlawer, M. W. Shephard, S. A. Clough, and W. D. Collins, 2008: Radiative forcing by long-lived greenhouse gases: Calculations with the AER radiative transfer models. J. Geophys. Res., 113, D13103, https://doi.org/10.1029/2008JD009944.
Jasperse, J., and Coauthors, 2020: Lake Mendocino forecast informed reservoir operations final viability assessment. Tech. Rep., University of California, San Diego, 141 pp., https://cw3e.ucsd.edu/FIRO_docs/LakeMendocino_FIRO_FVA.pdf.
Jolliffe, I. T., and D. B. Stephenson, 2011: Forecast Verification: A Practitioner’s Guide in Atmospheric Science. 2nd ed. Wiley, 296 pp.
Jones, C., A. Hazra, and L. M. V. Carvalho, 2015: The Madden–Julian oscillation and boreal winter forecast skill: An analysis of NCEP CFSv2 reforecasts. J. Climate, 28, 6297–6307, https://doi.org/10.1175/JCLI-D-15-0149.1.
Jörg-Hess, S., S. B. Kempf, F. Fundel, and M. Zappa, 2015: The benefit of climatological and calibrated reforecast data for simulating hydrological droughts in Switzerland. Meteor. Appl., 22, 444–458, https://doi.org/10.1002/met.1474.
Kim, J., D. E. Waliser, P. J. Neiman, B. Guan, J.-M. Ryoo, and G. A. Wick, 2013: Effects of atmospheric river landfalls on the cold season precipitation in California. Climate Dyn., 40, 465–474, https://doi.org/10.1007/s00382-012-1322-3.
Li, F., W. D. Collins, M. F. Wehner, D. L. Williamson, J. G. Olson, and C. Algieri, 2011: Impact of horizontal resolution on simulation of precipitation extremes in an aqua-planet version of Community Atmospheric Model (CAM3). Tellus, 63A, 884–892, https://doi.org/10.1111/j.1600-0870.2011.00544.x.
Li, W., Z. Wang, and M. S. Peng, 2016: Evaluating tropical cyclone forecasts from the NCEP Global Ensemble Forecasting System (GEFS) reforecast version 2. Wea. Forecasting, 31, 895–916, https://doi.org/10.1175/WAF-D-15-0176.1.
Martin, A., F. M. Ralph, R. Demirdjian, L. DeHaan, R. Weihs, J. Helly, D. Reynolds, and S. Iacobellis, 2018: Evaluation of atmospheric river predictions by the WRF Model using aircraft and regional mesonet observations of orographic precipitation and its forcing. J. Hydrometeor., 19, 1097–1113, https://doi.org/10.1175/JHM-D-17-0098.1.
Miller, P. A., M. F. Barth, and L. A. Benjamin, 2005: An update on MADIS observation ingest, integration, quality control and distribution capabilities. 21st Int. Conf. on Interactive Information Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology/14th Symp. on Education, San Diego, CA, Amer. Meteor. Soc., J7.12, https://ams.confex.com/ams/pdfpapers/86703.pdf.
Nardi, K. M., E. A. Barnes, and F. M. Ralph, 2018: Assessment of numerical weather prediction model reforecasts of the occurrence, intensity, and location of atmospheric rivers along the West Coast of North America. Mon. Wea. Rev., 146, 3343–3362, https://doi.org/10.1175/MWR-D-18-0060.1.
Pantillon, F., P. Knippertz, and U. Corsmeier, 2017: Revisiting the synoptic-scale predictability of severe European winter storms using ECMWF ensemble reforecasts. Nat. Hazards Earth Syst. Sci., 17, 1795–1810, https://doi.org/10.5194/nhess-17-1795-2017.
Platonov, V., and M. Varentsov, 2021: Introducing a new detailed long-term COSMO-CLM hindcast for the Russian Arctic and the first results of its evaluation. Atmosphere, 12, 350, https://doi.org/10.3390/atmos12030350.
Powers, J. G., and Coauthors, 2017: The Weather Research and Forecasting Model: Overview, system efforts, and future directions. Bull. Amer. Meteor. Soc., 98, 1717–1737, https://doi.org/10.1175/BAMS-D-15-00308.1.
Ralph, F. M., and M. D. Dettinger, 2012: Historical and national perspectives on extreme West Coast precipitation associated with atmospheric rivers during December 2010. Bull. Amer. Meteor. Soc., 93, 783–790, https://doi.org/10.1175/BAMS-D-11-00188.1.
Ralph, F. M., P. J. Neiman, G. A. Wick, S. I. Gutman, M. D. Dettinger, D. R. Cayan, and A. B. White, 2006: Flooding on California’s Russian River: Role of atmospheric rivers. Geophys. Res. Lett., 33, L13801, https://doi.org/10.1029/2006GL026689.
Ralph, F. M., J. J. Rutz, J. M. Cordeira, M. Dettinger, M. Anderson, D. Reynolds, L. J. Schick, and C. Smallcomb, 2019: A scale to characterize the strength and impacts of atmospheric rivers. Bull. Amer. Meteor. Soc., 100, 269–289, https://doi.org/10.1175/BAMS-D-18-0023.1.
Roulin, E., and S. Vannitsem, 2012: Postprocessing of ensemble precipitation predictions with extended logistic regression based on hindcasts. Mon. Wea. Rev., 140, 874–888, https://doi.org/10.1175/MWR-D-11-00062.1.
Rutz, J. J., W. J. Steenburgh, and F. M. Ralph, 2014: Climatological characteristics of atmospheric rivers and their inland penetration over the western United States. Mon. Wea. Rev., 142, 905–921, https://doi.org/10.1175/MWR-D-13-00168.1.
Scher, S., and G. Messori, 2018: Predicting weather forecast uncertainty with machine learning. Quart. J. Roy. Meteor. Soc., 144, 2830–2841, https://doi.org/10.1002/qj.3410.
Shonk, J. K., E. Guilyardi, T. Toniazzo, S. J. Woolnough, and T. Stockdale, 2018: Identifying causes of western Pacific ITCZ drift in ECMWF System 4 hindcasts. Climate Dyn., 50, 939–954, https://doi.org/10.1007/s00382-017-3650-9.
Skamarock, W. C., and Coauthors, 2019: A description of the Advanced Research WRF Model version 4.3. NCAR Tech. Note NCAR/TN-556+STR, 148 pp., https://doi.org/10.5065/1dfh-6p97.
Thompson, G., P. R. Field, R. M. Rasmussen, and W. D. Hall, 2008: Explicit forecasts of winter precipitation using an improved bulk microphysics scheme. Part II: Implementation of a new snow parameterization. Mon. Wea. Rev., 136, 5095–5115, https://doi.org/10.1175/2008MWR2387.1.
Tripathi, O. P., and F. Dominguez, 2013: Effects of spatial resolution in the simulation of daily and subdaily precipitation in the southwestern US. J. Geophys. Res. Atmos., 118, 7591–7605, https://doi.org/10.1002/jgrd.50590.
Van den Bergh, J., and E. Roulin, 2016: Postprocessing of medium range hydrological ensemble forecasts making use of reforecasts. Hydrology, 3, 21, https://doi.org/10.3390/hydrology3020021.
Vannitsem, S., and Coauthors, 2021: Statistical postprocessing for weather forecasts: Review, challenges, and avenues in a big data world. Bull. Amer. Meteor. Soc., 102, E681–E699, https://doi.org/10.1175/BAMS-D-19-0308.1.
Vitart, F., 2004: Monthly forecasting at ECMWF. Mon. Wea. Rev., 132, 2761–2779, https://doi.org/10.1175/MWR2826.1.
Wandel, J., J. F. Quinting, and C. M. Grams, 2021: Toward a systematic evaluation of warm conveyor belts in numerical weather prediction and climate models. Part II: Verification of operational reforecasts. J. Atmos. Sci., 78, 3965–3982, https://doi.org/10.1175/JAS-D-20-0385.1.
Warner, S., N. Platt, and J. F. Heagy, 2004: User-oriented two-dimensional measure of effectiveness for the evaluation of transport and dispersion models. J. Appl. Meteor., 43, 58–73, https://doi.org/10.1175/1520-0450(2004)043<0058:UTMOEF>2.0.CO;2.
Wehner, M. F., R. L. Smith, G. Bala, and P. Duffy, 2010: The effect of horizontal resolution on simulation of very extreme US precipitation events in a global atmosphere model. Climate Dyn., 34, 241–247, https://doi.org/10.1007/s00382-009-0656-y.
Yang, X., S. Sharma, R. Siddique, S. J. Greybush, and A. Mejia, 2017: Postprocessing of GEFS precipitation ensemble reforecasts over the U.S. mid-Atlantic region. Mon. Wea. Rev., 145, 1641–1658, https://doi.org/10.1175/MWR-D-16-0251.1.
Zsoter, E., C. Prudhomme, E. Stephens, F. Pappenberger, and H. Cloke, 2020: Using ensemble reforecasts to generate flood thresholds for improved global flood forecasting. J. Flood Risk Manage., 13, e12658, https://doi.org/10.1111/jfr3.12658.
Zuo, Z., S. Yang, Z.-Z. Hu, R. Zhang, W. Wang, B. Huang, and F. Wang, 2013: Predictable patterns and predictive skills of monsoon precipitation in Northern Hemisphere summer in NCEP CFSv2 reforecasts. Climate Dyn., 40, 3071–3088, https://doi.org/10.1007/s00382-013-1772-2.