A key aim of observational campaigns is to sample atmosphere–ocean phenomena to improve understanding of these phenomena and, in turn, numerical weather prediction. In early 2018 and 2019, the Atmospheric River Reconnaissance (AR Recon) campaign released dropsondes and radiosondes into atmospheric rivers (ARs) over the northeast Pacific Ocean to collect unique observations of temperature, winds, and moisture in ARs. These narrow regions of water vapor transport in the atmosphere—like rivers in the sky—can be associated with extreme precipitation and flooding events in the midlatitudes. This study uses the dropsonde observations collected during the AR Recon campaign and the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS) to evaluate forecasts of ARs. Results show that ECMWF IFS forecasts 1) were colder than observations by up to 0.6 K throughout the troposphere; 2) had a dry bias in the lower troposphere, which, together with weaker winds below 950 hPa, resulted in weaker horizontal water vapor fluxes in the 950–1000-hPa layer; and 3) were underdispersive in the water vapor flux, a deficit that largely arises from the representativeness error of comparing grid-scale model fields with point dropsonde observations. Four U.S. West Coast radiosonde sites confirm the IFS cold bias throughout winter. These issues are likely to affect the model’s hydrological cycle and hence precipitation forecasts.
Observational campaigns use a range of airborne and surface-based instruments to probe atmosphere–ocean phenomena to improve understanding of these phenomena and, in turn, numerical weather prediction (NWP) models. A campaign, for example, can have a research aircraft to deploy dropsondes to measure atmospheric properties (Ralph et al. 2017), extra radiosondes to better sample weather systems or climate zones (Schäfler et al. 2018), or a research vessel for ocean measurements (Ralph et al. 2016). The observations taken may then be assimilated into NWP models, first to provide a more accurate estimate of the initial state, and second, to be compared with short-range NWP forecasts prior to their assimilation to identify model errors and behavior. In recent years, there has been a range of missions, for example, covering tropical regions (e.g., Doyle et al. 2017), extratropical regions (Geerts et al. 2017; Ralph et al. 2016; Schäfler et al. 2018), polar regions (e.g., Uttal et al. 2002), and cloud processes (Flamant et al. 2018).
In January and February 2018, there was an observational campaign called Atmospheric River Reconnaissance (AR Recon) in which research aircraft released dropsondes into atmospheric rivers (ARs; Ralph et al. 2018) and other dynamically active regions across the eastern North Pacific Ocean, along with radiosondes from sites in California. ARs are important because they are responsible for much of the water vapor flux across the midlatitudes (Ralph et al. 2005, 2017) and because they can be associated with extreme precipitation, flooding, and adverse socioeconomic effects, especially in coastal mountainous regions (Ralph et al. 2006; Lavers et al. 2011; Neiman et al. 2011; Ramos et al. 2015). The aim of AR Recon in 2018 was to provide research measurements and added information to better inform decision-makers and forecasters on AR impacts. In particular, the campaign afforded the opportunity to do diagnostic studies on the capability of NWP systems to model ARs (e.g., Lavers et al. 2018; Stone et al. 2020). Lavers et al. (2018) found that the largest uncertainties in the magnitude of the AR water vapor flux in the ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS) originated from the 850-hPa winds, a standard pressure level typically above the planetary boundary layer (PBL). The specific humidity was also erroneous in the ECMWF IFS forecasts examined, but was subject to less uncertainty. These uncertainties in the water vapor flux can affect the forecasts for high-impact extreme precipitation events driven by ARs and also the location and magnitude of the latent heat release, which has implications for the atmospheric dynamics and predictability (e.g., Berman and Torn 2019). Strong diabatic forcing over the central United States has also been identified as a common precursor six days prior to large forecast busts over Europe (Rodwell et al. 2013).
The AR Recon campaign conducted six intensive observation periods (IOPs) in February and March 2019 in which dropsondes and radiosondes were launched across the northeast Pacific Ocean and California, respectively. These unique dropsonde profiles increase the available data sample on ARs and together with those obtained in 2018 provide an opportunity to further the findings of Lavers et al. (2018). Specifically, one limitation of the previous research was the predominant use of three standard pressure levels (925, 850, and 700 hPa) to calculate the water vapor flux; this was because these are the only levels archived in the ECMWF ensemble system and a consistent assessment was required between the observations and medium-range forecasts. This study builds upon the previous results by considering all available dropsonde pressure levels that were assimilated into the ECMWF IFS, thus allowing for a more complete analysis that includes data at a much higher vertical resolution than considered by Lavers et al. (2018). In so doing, the following questions are addressed. First, what model errors exist in the specific humidity, temperature, and winds during these AR events? Second, in which layers do the largest water vapor flux and its errors occur? And third, using the 925-, 850-, and 700-hPa surfaces, what is the spread–error relationship of lower-tropospheric water vapor flux in ECMWF IFS forecasts?
2. Data and methods
a. IOPs and U.S. West Coast land-based radiosonde sites
There were six IOPs in AR Recon 2019: 2 February, 11 February, 13 February, 24 February, 26 February, and 1 March 2019 (all centered at 0000 UTC). Figure 1 shows the magnitude of the vertically integrated horizontal water vapor flux (integrated vapor transport, IVT), mean sea level pressure, and dropsonde locations during the IOPs. The dropsondes were deployed by research aircraft (average flight time of 8 h) mostly in regions of intense water vapor flux. These observations were transferred to the World Meteorological Organization Global Telecommunications System (GTS) and ingested into operational NWP systems including the ECMWF IFS. During the six IOPs, 259 dropsondes were released, and together with the 326 dropsondes released in the five IOPs in AR Recon 2018 [Fig. 1 in Lavers et al. (2018)], there were 585 dropsondes available, with 571 that include the 925-, 850-, and 700-hPa surfaces. In the IOPs in 2018, Vaisala RD94 dropsondes were used and their accuracy for pressure, temperature, and relative humidity is 0.4 hPa, 0.2 K, and 2%, respectively (Vaisala 2010). In the 2019 IOPs, Vaisala RD41 dropsondes were used, with an improved temperature accuracy of 0.1 K (Vaisala 2018). Jensen et al. (2016) indicate the high performance of the humidity sensors, and the wind observations were reported to the nearest 1 m s−1 (due to the alphanumeric data reports).
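The IVT shown in Fig. 1 is the magnitude of the mass-weighted vertical integral of the horizontal water vapor flux, IVT = (1/g)|∫ q V dp|. As a minimal illustration (the function name and the sounding values below are invented for the example, not campaign data), a single-profile estimate can be sketched as:

```python
import numpy as np

G = 9.81  # gravitational acceleration (m s-2)

def _trapz(y, x):
    """Trapezoidal integral of y over x (abs handles descending pressure)."""
    return abs(float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))

def ivt_magnitude(p_hpa, q_kgkg, u_ms, v_ms):
    """Magnitude of the integrated vapor transport, (1/g)|int q V dp|,
    in kg m-1 s-1, from a single profile ordered from the surface upward."""
    p_pa = np.asarray(p_hpa, float) * 100.0
    qu = _trapz(np.asarray(q_kgkg) * np.asarray(u_ms), p_pa) / G
    qv = _trapz(np.asarray(q_kgkg) * np.asarray(v_ms), p_pa) / G
    return float(np.hypot(qu, qv))

# Hypothetical sounding: moist low-level southwesterly flow decaying with height
p = np.array([1000, 925, 850, 700, 500, 300])        # hPa
q = np.array([10.0, 8.0, 6.0, 3.0, 1.0, 0.1]) / 1e3  # kg kg-1
u = np.array([15.0, 20.0, 18.0, 14.0, 10.0, 8.0])    # m s-1
v = np.array([10.0, 15.0, 14.0, 10.0, 6.0, 4.0])     # m s-1
ivt = ivt_magnitude(p, q, u, v)
```

Values of several hundred kg m−1 s−1, as produced by this moist, windy profile, are characteristic of AR conditions.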
To assess the representativeness of the dropsondes, the 0000 UTC ascents at four U.S. West Coast land-based radiosonde sites were assessed during winter [December–January–February (DJF)] 2017/18 and 2018/19 and summer [June–July–August (JJA)] 2018 and 2019. The sites used were at Salem, Oregon (ID: 72694; 44.9°N, 123.0°W); Medford, Oregon (ID: 72597; 42.4°N, 122.9°W); Oakland, California (ID: 72493; 37.7°N, 122.2°W); and San Diego, California (ID: 72293; 32.8°N, 117.1°W). These four radiosonde sites are shown as magenta dots in Fig. 1.
b. ECMWF ensemble of data assimilations and forecasts
To characterize the initial atmospheric flow uncertainty in the ECMWF IFS, an ensemble of data assimilations (EDA; Isaksen et al. 2010) is employed. During this period, the EDA consisted of one control member and 25 perturbed members in which the first-guess (background; 3–15-h lead time) forecasts and observations (including the dropsonde data) were combined (using four-dimensional variational data assimilation) to produce 26 new analyses. In this framework, the variance of the first-guess forecasts indicates the initial flow uncertainty and the observation errors are derived from the different observation perturbations estimated and applied in each of the 25 perturbed EDA members. The analyses attempt to describe the remaining atmospheric uncertainty after assimilation. Following the data assimilation procedure, the 50 ensemble forecast members (ENS) that run out to 15 days are produced by a symmetric combination of 6-h forecasts from the EDA analyses, the latest single high-resolution forecast, and singular vectors (Lang et al. 2015). Stochastic perturbations (Leutbecher et al. 2017) are applied within the background forecasts and ENS. The EDA in the 0000 UTC window and ENS data from 0000 UTC were interpolated to the release point of the dropsonde observations.
c. Forecast evaluation and water vapor flux calculation
In data assimilation, it is common practice to assess the model fit to observations using observation-minus-background differences (O − B; Desroziers et al. 2005), referred to in this study as departures. Herein, using all dropsonde locations, we calculate O − B in the EDA control member for specific humidity, temperature, and wind speed on all assimilated pressure levels from 1000 to 200 hPa and average these departures in 50-hPa layers. This approach allows for the identification of potential model biases in different layers. Note that each dropsonde report received over the GTS for use in this study typically contained 15–40 levels and thus they have fewer pressure levels than those measured by the dropsondes on their descent to the ocean surface. These pressure levels consist of both standard and significant levels (most of them being significant) and the average O − B departures were evaluated from these levels.
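The averaging of O − B departures into 50-hPa layers can be sketched as follows; this is a minimal illustration (the function name, bin handling, and sample values are assumptions for the example, not the operational diagnostic code):

```python
import numpy as np

def layer_mean_departures(p_obs_hpa, obs, background, p_top=200, p_bot=1000, dp=50):
    """Average O - B departures in 50-hPa layers between 1000 and 200 hPa.

    p_obs_hpa, obs, background: 1-D arrays over all report levels
    (observations and collocated first-guess values). Returns the layer
    centers and the mean departure per layer (NaN where a layer is empty).
    """
    departures = np.asarray(obs, float) - np.asarray(background, float)
    edges = np.arange(p_top, p_bot + dp, dp)      # 200, 250, ..., 1000 hPa
    centers = 0.5 * (edges[:-1] + edges[1:])
    means = np.full(centers.size, np.nan)
    idx = np.digitize(p_obs_hpa, edges) - 1       # layer index per level
    for k in range(centers.size):
        sel = idx == k
        if sel.any():
            means[k] = departures[sel].mean()
    return centers, means

# Hypothetical departures: a dry bias (positive O - B) at low levels only
p = np.array([980, 960, 940, 920, 870, 820, 760, 640, 540, 410, 300, 240])
o_minus_b = np.array([0.2, 0.15, 0.18, 0.12, 0.1, -0.05, -0.1, -0.08,
                      0.0, 0.0, 0.0, 0.0])
centers, means = layer_mean_departures(p, o_minus_b, np.zeros_like(o_minus_b))
```

Using all report levels (standard and significant) in this way avoids restricting the evaluation to the few standard levels archived in the ensemble system.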
The horizontal pressure-level water vapor flux magnitudes were determined as the product of specific humidity and the horizontal wind speeds and then evaluated in 50-hPa layers. These fluxes were assessed by linearly interpolating specific humidity onto wind levels; similar fluxes were found when interpolating the winds onto the specific humidity levels. This approach, without considering the vertical integral of the water vapor flux, guards against higher weighting being given to any particular pressure level.
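A minimal sketch of this flux calculation, interpolating specific humidity onto the wind levels before taking the product (the function name and profile values are invented for illustration):

```python
import numpy as np

def pressure_level_vapor_flux(p_q, q, p_wind, wspd):
    """Pressure-level water vapor flux: interpolate specific humidity onto
    the wind levels, then take the product q * |V| at each wind level.

    Pressures in hPa (any order), q in g kg-1, wspd in m s-1.
    Returns (p_wind, flux) with flux in g kg-1 m s-1.
    """
    # np.interp requires ascending abscissae, so sort the humidity levels.
    order = np.argsort(p_q)
    q_on_wind = np.interp(p_wind, np.asarray(p_q, float)[order],
                          np.asarray(q, float)[order])
    return np.asarray(p_wind, float), q_on_wind * np.asarray(wspd, float)

# Hypothetical profile: significant levels differ between humidity and wind
p_q = [1000, 950, 900, 850, 700]
q = [10.0, 9.0, 7.5, 6.0, 3.0]     # g kg-1
p_wind = [975, 925, 800]
wspd = [22.0, 25.0, 18.0]          # m s-1
p_out, flux = pressure_level_vapor_flux(p_q, q, p_wind, wspd)
```

The same result (to within interpolation error) is obtained by interpolating the winds onto the humidity levels instead, which is the consistency check mentioned above.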
A reliable ensemble forecasting system should have the property where over many forecasts the mean ensemble variance and error of the ensemble mean are the same (Leutbecher and Palmer 2008). For example, the ensemble variance (EnsVar) can be compared with the variance of the error of the ensemble mean (Error2); the standard deviation of these quantities can also be compared. Rather than discussing errors, however, we discuss the departures of forecasts from observations because each has associated errors and biases. Following Rodwell et al. (2016), we estimate observation uncertainty (ObsUnc2) consistent with the EDA’s observation perturbations and calculate the squared bias between the forecast and observation (Bias2). Writing Depar2 for the mean squared departure and DepVar for the variance of the departures (once the bias has been removed), we obtain the following:

Depar2 = Bias2 + DepVar, (1)

DepVar ≈ EnsVar + ObsUnc2, (2)

where (2) holds for a statistically reliable ensemble.
This modified spread–error relationship is investigated in water vapor fluxes on the 925-, 850-, and 700-hPa pressure surfaces at 24-h intervals out to 120 h (day 5).
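To illustrate the terms involved, the sketch below estimates the squared bias, debiased departure variance, and mean ensemble variance from synthetic paired forecasts and observations, and checks that for a constructed reliable ensemble the departure variance approximately equals the ensemble variance plus the squared observation uncertainty (neglecting the finite-ensemble-size correction). All names and numbers are assumptions for the example, not the operational diagnostics:

```python
import numpy as np

def spread_error_terms(ens, obs, obs_unc):
    """Terms of the modified spread-error relationship.

    ens: (n_cases, n_members) ensemble forecasts of, e.g., water vapor flux
    obs: (n_cases,) verifying observed values
    obs_unc: assumed observation-error standard deviation (including
             representativeness), taken here as a single given number.
    """
    ens = np.asarray(ens, float)
    dep = np.asarray(obs, float) - ens.mean(axis=1)  # ens-mean departures
    return {
        "Bias2": dep.mean() ** 2,
        "DepVar": dep.var(),                          # bias removed
        "EnsVar": ens.var(axis=1, ddof=1).mean(),     # mean ensemble variance
        "ObsUnc2": obs_unc ** 2,
    }

# Synthetic consistency check: build a 50-member ensemble in which the truth
# is statistically indistinguishable from a member, so the budget
# DepVar ~ EnsVar + ObsUnc2 should approximately close.
rng = np.random.default_rng(0)
n, m = 5000, 50
center = rng.normal(100.0, 20.0, n)                    # predictable signal
ens = center[:, None] + rng.normal(0.0, 5.0, (n, m))   # spread sd = 5
truth = center + rng.normal(0.0, 5.0, n)
obs = truth + rng.normal(0.0, 3.0, n)                  # obs-error sd = 3
terms = spread_error_terms(ens, obs, obs_unc=3.0)
```

When the ensemble is underdispersive, as diagnosed in section 3c, the estimated departure variance exceeds the sum of ensemble variance and observation uncertainty.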
3. Results and discussion
a. Forecast evaluation with dropsonde observations
To illustrate the AR Recon dropsonde data used to calculate the O − B departures, Fig. 2 shows scatterplots of observations versus background (EDA control) forecasts for specific humidity, temperature, wind speed, and water vapor fluxes in the 950–1000-hPa layer. First, as expected given the short forecast range, all variables have a strong linear correlation, ranging from 0.86 for water vapor flux (Fig. 2d) to 0.97 for temperature (Fig. 2b). Second, for this particular layer, the mean O − B shows that the observed temperature is 0.23 K warmer (Fig. 2b) and the specific humidity is 0.15 g kg−1 moister (Fig. 2a) than the model, as highlighted by the location of many of the points below the 1:1 lines in Figs. 2a and 2b. Third, the clouds of points for wind speed (Fig. 2c) and water vapor flux (Fig. 2d) have a similar shape, implying that the model has a tendency to underestimate strong low-level wind speeds (>20 m s−1) and consequently to underestimate low-level water vapor fluxes. This likely causes the positive O − B values of 0.41 m s−1 and 6.06 g kg−1 m s−1 for the wind speed and vapor fluxes, respectively (Figs. 2c,d).
The average O − B departures (in the EDA control member) in 50-hPa layers for the specific humidity, temperature, wind speed, and water vapor fluxes are shown in Fig. 3. Figure 3a reveals that the model is drier than observations by ~0.1–0.2 g kg−1 below 850 hPa and moister than observations by ~0.1 g kg−1 between 600 and 850 hPa. The dry departure below 850 hPa would reduce the lower-tropospheric water vapor flux in the model and potentially affect IFS precipitation. The model moist bias at midlevels has less impact on the water vapor fluxes as it is above the low-level jet core; this is confirmed by assessment of the observed and background fluxes in Fig. 3d. For the temperature in Fig. 3b, the model mostly has a cold bias throughout the troposphere when compared with the dropsonde profiles, with the largest departure of about 0.6 K found from 900 to 950 hPa. Over the ocean this layer includes, on average, the PBL top, which is calculated here using the bulk Richardson number (e.g., Lavers et al. 2019). Figure 4 shows scatterplots of the observations versus background for the height and temperature at the PBL top. There is generally good agreement found in the scatterplots and the linear correlation has values of 0.79 and 0.96 for the height and temperature of the PBL top, respectively; the average observed PBL height is 734.4 m (928.0 hPa) and the background PBL height is 743.6 m (926.4 hPa). These findings for the PBL height are in line with those for the midlatitudes in Lavers et al. (2019) and the mean O − B for temperature at the PBL top of 0.44 K corroborates the O − B for temperature in Fig. 3b. Furthermore, the temperature departures of all layers in Fig. 3b have mean values that are significantly different from zero at the 90% level, as shown by the black error bars that do not cross zero, suggesting that the signal of a cold temperature bias in the model is robust.
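The bulk Richardson number diagnosis of the PBL top can be sketched as below. This is a generic formulation with a commonly used critical value of 0.25; the exact formulation followed here (Lavers et al. 2019) may differ in detail, and the function name and sounding values are invented for the example:

```python
import numpy as np

RI_CRIT = 0.25  # commonly used critical bulk Richardson number (assumption)
G = 9.81

def pbl_height_bulk_richardson(z, theta_v, u, v):
    """First height where the bulk Richardson number exceeds RI_CRIT.

    z: height above the surface (m, ascending); theta_v: virtual potential
    temperature (K); u, v: wind components (m s-1). Linear interpolation
    between the bracketing levels refines the crossing height.
    """
    z, theta_v, u, v = (np.asarray(a, float) for a in (z, theta_v, u, v))
    du2 = (u - u[0]) ** 2 + (v - v[0]) ** 2
    with np.errstate(divide="ignore", invalid="ignore"):
        ri = G * (theta_v - theta_v[0]) * (z - z[0]) / (theta_v[0] * du2)
    for k in range(1, z.size):
        if np.isfinite(ri[k]) and ri[k] > RI_CRIT:
            r0 = ri[k - 1] if np.isfinite(ri[k - 1]) else 0.0
            w = (RI_CRIT - r0) / (ri[k] - r0)
            return float(z[k - 1] + w * (z[k] - z[k - 1]))
    return float("nan")

# Hypothetical marine sounding: well-mixed layer capped near ~750 m
z = np.array([10, 100, 300, 500, 700, 800, 900, 1100])
theta_v = np.array([288.0, 288.1, 288.2, 288.3, 288.5, 290.0, 291.5, 293.0])
u = np.array([8.0, 10.0, 12.0, 13.0, 14.0, 15.0, 15.0, 16.0])
v = np.array([4.0, 6.0, 7.0, 8.0, 9.0, 9.0, 10.0, 10.0])
h = pbl_height_bulk_richardson(z, theta_v, u, v)
```

The stable capping layer in the example profile sharply increases the Richardson number, placing the diagnosed PBL top just above the last well-mixed level.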
These model cold biases are similar to previous findings of an approximate 0.5 K bias in the tropics (20°S–20°N) and somewhat smaller biases in the northern midlatitudes at low levels (Ingleby 2017).
For wind speed, there is evidence that the observations are stronger than the background in the 950–1000-hPa layer and in the upper troposphere (Fig. 3c), with the latter perhaps suggesting a model underestimation of the jet stream. In most layers, however, there is little sign of a clear bias. The pressure-level background and observed water vapor fluxes averaged in 50-hPa layers are plotted in Fig. 3d. In the model and observations, the 900–950- and 950–1000-hPa layers have the peak water vapor fluxes, where observed values of 109 and 118 g kg−1 m s−1 are found, respectively. The low-altitude nature of this flux in the PBL may not be adequately captured by the commonly archived 925- and 1000-hPa levels, suggesting that forecast studies of ARs using a limited number of pressure levels may not provide accurate flux estimates. With the moisture content of the atmosphere decreasing with height, water vapor fluxes also typically decrease with increasing height, with a notable drop-off in water vapor fluxes above 850 hPa in the present study (Fig. 3d). A comparison of the observed (gray bars) and model (red bars) fluxes highlights that the model underestimates water vapor fluxes below 850 hPa, with Fig. 3e showing that the O − B in the 950–1000-hPa layer is 6.06 g kg−1 m s−1, meaning that the forecast flux is 5.2% below the observed flux. These results indicate that the modeled transport of water vapor through the atmospheric branch of the water cycle within ARs (over the ocean) is insufficient in the IFS forecasts, which has important ramifications for precipitation forecasting and for the correct positioning of moisture for latent heat release (e.g., Reynolds et al. 2019).
To ascertain whether there was dependence between the low-altitude forecast departures (of specific humidity, temperature, winds, and the water vapor flux) and their location relative to the AR, we also assessed the relationship between the root-mean-square departures (RMSD) in the EDA control member and the IVT at the 0000 UTC analysis at the closest model grid point. We use the IVT because it acts as a proxy for AR location, with larger values being mostly situated near the core of the AR. While there is some positive association between RMSD and IVT, there is not a particularly strong correlation evident (not shown). This aspect of the forecast evaluation will be tested further with more dropsonde observations that will become available with future campaigns.
b. Forecast evaluation with U.S. West Coast radiosonde observations
We now consider the O − B departures (in the EDA control member) at the four U.S. West Coast land-based radiosonde sites in Figs. 5 and 6. For the specific humidity during winters (DJF) 2017/18 and 2018/19 (Figs. 5a,d,g,j), there are primarily negative O − B values suggesting the model has an overall wet bias. This model wet bias is somewhat different from the dropsonde results, where a model dry bias was found at low altitudes. For specific humidity during summers (JJA) 2018 and 2019 (Figs. 6a,d,g,j), the model bias in the lower troposphere changes sharply with height, with dry biases found below 950 hPa and wet biases above 950 hPa at Salem and San Diego, and a dry bias found between 950 and 800 hPa in Oakland and Medford (with a wet bias in Oakland below 950 hPa). These results illustrate how, over land, biases can vary significantly based on the model’s ability to represent the local climate, effects of topography, and land–sea differences at the subgrid scale. For example, the model grid cells over land do not accurately represent subgrid-scale urban areas, which would neglect the urban heat island effect thus potentially leading to temperature biases and moisture content issues. Note also that the O − B departures are mostly larger in JJA owing to the higher moisture content present in the summer (cf. Figs. 5a,d,g,j and 6a,d,g,j).
In terms of temperature, positive O − B departures or model cold biases are generally seen in the radiosonde profiles and these departures are broadly similar between DJF (Figs. 5b,e,h,k) and JJA (Figs. 6b,e,h,k), particularly above the PBL. This corroborates the dropsonde results discussed in section 3a and the cold biases presented in Ingleby (2017). In particular, the largest temperature biases throughout the troposphere are found at the southernmost site assessed herein at San Diego, which agrees with the results for the tropics in Ingleby (2017). It is hypothesized that these biases are linked to low-level humidity and cloud problems from several interrelated issues, such as challenges posed by low-vertical-resolution satellite data to the data assimilation system and forecast model errors. For the wind speed (Figs. 5c,f,i,l and 6c,f,i,l), the high-altitude winds are generally underestimated by the model, a similar finding to the dropsonde results (Fig. 3c). In the lower troposphere (except Medford; Figs. 5f and 6f) the O − B wind speed departures are negative, and hence different from the dropsondes, meaning that the model winds are mostly too large, which potentially indicates an issue with model roughness and drag at these locations. The different average wind profile at Medford relates to its higher altitude (~400 m) and surrounding complex terrain. Comparing DJF (Figs. 5c,f,i,l) with the results of JJA (Figs. 6c,f,i,l) shows that the sign of the model wind speed bias at different heights is broadly consistent in both seasons, although the magnitude of the bias may change between summer and winter.
c. Spread–error relationship of water vapor flux at the dropsonde locations
A forecast evaluation of the square root of the modified spread–error relationship in (2) at the dropsonde profiles is now undertaken for water vapor fluxes on the 925-, 850-, and 700-hPa levels and the results are presented in Fig. 7. In general, the standard deviation of the departures (solid lines) increases with lead time, with the 925-hPa surface having the largest departures (Fig. 7a). Note that an evaluation of the modified spread–error relationship with a smaller sample size showed that the shape of the error lines depended somewhat on the sample used. Thus, the smallest departures found at T + 48 in Fig. 7 are hypothesized to result from the relatively small sample considered and do not indicate that T + 48 is less subject to errors. The ensemble standard deviation (dashed lines) also grows with lead time, but it is not enough to explain the standard deviation of the departures. The observation uncertainty is also an important component, since by considering it, the total draws closer to the departures (cf. dotted and solid lines in Fig. 7). In this case, the observation uncertainty is thought to largely account for representativeness errors, whereby the model, which resolves processes on a grid, cannot adequately represent the point observation from the dropsonde. However, this is still not enough to match the standard deviation of the departures, which implies that the forecasts for this quantity are underdispersive, and this may reflect the need for improved stochastic perturbations to better represent model uncertainty (e.g., of large horizontal gradients in ARs) and ultimately increase the dispersiveness of the ensemble. An alternative approach to address the representativeness issue is through a downscaling of the model fields to the point observation (which would further increase EnsVar). However, this is outside the scope of this study.
4. Conclusions
This investigation has used unique dropsonde observations collected during AR Recon 2018 and 2019 to evaluate the forecasts of ARs in the ECMWF EDA and ENS. First, the O − B departures calculated in the AR Recon IOPs indicate that the EDA control member had a cold bias throughout the troposphere, reaching a maximum value of 0.6 K in the 900–950-hPa layer. This cold bias was also found at four U.S. West Coast radiosonde sites across both winter and summer seasons, indicating that the bias is independent of AR occurrence. Furthermore, the most noticeable cold bias was found at the southernmost radiosonde site (San Diego), consistent with the cold bias identified in the tropics in previous studies. Second, the EDA control member at the dropsonde profiles is 0.1–0.2 g kg−1 drier below 850 hPa and has weaker winds below 950 hPa, which results in an O − B of 6.06 g kg−1 m s−1 for water vapor flux in the 950–1000-hPa layer, or 5.2% of the average observed flux of 118 g kg−1 m s−1. This implies that the forecast moisture flux within ARs is insufficient, thus potentially affecting the downstream precipitation forecasts and the location of moisture within the model. Note that the largest errors herein are found at a lower altitude than in Lavers et al. (2018); this results from the use of pressure-level water vapor fluxes, rather than the three-level vertical integral of the water vapor transport used by Lavers et al. (2018), which gave a higher weight to 850 hPa. Third, the evaluation of the modified spread–error relationship revealed the IFS to be underdispersive. This is partly due to representativeness error, whereby the model resolves processes on the model grid scale of 18 km, which is a larger scale than the point observation given by the dropsonde. Even when an attempt is made to account for the representativeness issue through an estimate of the observation uncertainty, we found that the IFS still had too little spread; it is also possible that not enough model uncertainty is accounted for in the forecasts.
Further research on this topic could follow multiple directions and here we highlight three possibilities. First, as model improvements are generally implemented annually at NWP centers, the future deployment of dropsondes in observational campaigns could be used to assess whether model upgrades are improving the forecasts of ARs and precipitation. Second, data denial experiments (i.e., IFS model forecasts run without these dropsonde observations) could be performed and evaluated to determine the impact of better representing the water vapor flux on precipitation forecasts. Third, a potential limitation of dropsondes is that they are usually available for specific storms during IOPs, which results in a dropsonde sample that is biased toward storms. Herein, we addressed this by using U.S. West Coast radiosonde sites, which show that the cold bias is more widespread than just within storms and ARs. However, the dry bias at low levels was not present at these land stations, which suggests that moisture biases are more dependent on the storms or the surface characteristics. This could be further investigated by using the synoptic radiosonde network in other midlatitude regions that are affected by ARs to ascertain if the forecast departures and uncertainties uncovered herein are found in other locations. This type of diagnostic study would then lead to improved understanding of model behavior and errors.
The authors acknowledge financial support from the European Union Horizon 2020 IMPREX project (Grant 641811). We are deeply thankful to the NOAA and U.S. Air Force flight crews for undertaking the missions to provide these dropsonde observations. JDD and CAR acknowledge the support of the Chief of Naval Research through the NRL Base Program, PE 0601153N. ACS acknowledges the support of U.S. Army Corps of Engineers (USACE)-Cooperative Ecosystem Studies Unit (CESU) as part of Forecast Informed Reservoir Operations (FIRO), Grant W912HZ-15-2-0019 and the California Department of Water Resources Atmospheric River Program, grant 4600010378 TO#15 Am 22. The authors are grateful to the three anonymous reviewers whose comments helped to clarify and improve the paper.
Data availability statement: The data used are available through the ECMWF archive (https://www.ecmwf.int/en/forecasts/datasets/archive-datasets).