Observations across the North Atlantic jet stream with high vertical resolution are used to explore the structure of the jet stream, including the sharpness of vertical wind shear changes across the tropopause and the wind speed. Data were obtained during the North Atlantic Waveguide and Downstream Impact Experiment (NAWDEX) by an airborne Doppler wind lidar, dropsondes, and a ground-based stratosphere–troposphere radar. During the campaign, small wind speed biases throughout the troposphere and lower stratosphere of only −0.41 and −0.15 m s−1 are found, respectively, in the ECMWF and Met Office analyses and short-term forecasts. However, this study finds large and spatially coherent wind errors up to ±10 m s−1 for individual cases, with the strongest errors occurring above the tropopause in upper-level ridges. ECMWF and Met Office analyses indicate similar spatial structures in wind errors, even though their forecast models and data assimilation schemes differ greatly. The assimilation of operational observational data brings the analyses closer to the independent verifying observations, but it cannot fully compensate for the forecast error. Models tend to underestimate the peak jet stream wind, the vertical wind shear (by a factor of 2–5), and the abruptness of the change in wind shear across the tropopause, which is a major contribution to the meridional potential vorticity gradient. The differences are large enough to influence forecasts of Rossby wave disturbances to the jet stream with an anticipated effect on weather forecast skill even on large scales.
The existence and behavior of the North Atlantic jet stream are central to the weather experienced across Europe in all seasons. Weather systems having major impacts on surface conditions, such as midlatitude cyclones, the fronts embedded within them, and mesoscale convective systems, are all influenced strongly by interaction with the jet stream. Their structure and evolution are affected by the location of strong vertical wind shear, as well as by wave and vortex disturbances at tropopause level that develop as the jet stream meanders and contorts. Meandering jet streams coincide with strong gradients of potential vorticity (PV) along the isentropic surfaces intersecting the tropopause. These gradients serve as a waveguide for propagating Rossby waves (Hoskins and Ambrizzi 1993; Schwierz et al. 2004; Martius et al. 2010). Disturbances to the waveguide at the entrance (western) end of the storm track can have a major effect on surface weather thousands of kilometers downstream through the propagation of disturbance energy in the form of Rossby wave packets [see recent review by Wirth et al. (2018)]. Therefore, a detailed representation of the jet stream structure is important not only locally for forecasting upper-tropospheric winds but also, through its far-reaching consequences, for predicting surface weather system development.
Accurate prediction of Rossby waves is sensitive to the representation of the jet stream structure and associated PV gradient, even though their wavelength exceeds the width of the strongest PV gradient regions by several orders of magnitude. This introduces a resolution dependence to jet stream prediction. It has been demonstrated that global numerical weather prediction (NWP) models fail to maintain sufficiently sharp PV gradients at the tropopause, and Rossby wave amplitude decreases with lead time (Gray et al. 2014; Saffin et al. 2017). If the PV gradient is too smooth in a model, then advection of disturbances by the jet stream and counterpropagation of Rossby waves against the zonal flow are both expected to be too weak. Harvey et al. (2016) showed analytically that although these effects on Rossby wave phase speed cancel to first order, in more accurate estimates phase speed must always decrease (slower eastward). Harvey et al. (2018) used wave activity theory to show that when the PV gradient is too smooth in a model, then Rossby wave amplitude is also predicted to decay. The lead-time dependence of the PV gradient forecast error, both in horizontal gradient along an isentropic surface (Gray et al. 2014) and in vertical gradient (Saffin et al. 2017), indicates that the NWP models struggle to represent the tropopause, an issue that is expected to be even more prominent in climate prediction models due to their lower spatial resolution. Davies and Didone (2013) showed how forecast errors of PV propagate and amplify along the jet stream waveguide, and Baumgart et al. (2018) have quantified the extent to which different dynamical mechanisms contribute to the growth of PV forecast error from uncertainty in the initial conditions.
In this study, we examine high-resolution observations of the jet stream (detailed in section 2) and compare them with the representation of jet stream winds in meteorological analyses and short-term forecasts. It is an open question to what extent they are able to represent the observed wind speed distribution, especially the strength of the vertical wind shear on either side of the tropopause, which is of crucial importance for an accurate representation of the meridional PV gradient and Rossby wave evolution.
In the 1990s and early 2000s, several studies that used in situ observed winds on board commercial airliners to validate NWP winds reported significant wind speed biases in meteorological analyses (Tenenbaum 1991, 1996; Rickard et al. 2001; Cardinali et al. 2004). Multicase averaging revealed wind speed biases increasing with observed wind speeds and reaching values of up to 5%–10% (Rickard et al. 2001). Cardinali et al. (2004) found that jet streak winds are too weak by 2%–5% in data-dense regions over the United States and by 5%–9% in data-sparse regions over Canada. The continuous increase of vertical and horizontal resolution in NWP models, the continuous increase in quality, amount, and resolution of aircraft and satellite observations, and their improved application have led to a substantially improved representation of winds in NWP analyses. As depicted by Petersen (2016), Northern Hemispheric wind errors decreased by about 40% for 24-h forecasts between 1984 and 2004. Houchi et al. (2010) compared winds in different climate regions using high-vertical-resolution radiosondes from 85 stations and ECMWF short-term forecasts in the year 2006. They found qualitative agreement of observed and modeled wind distributions at all levels. However, they note a substantial underestimation of vertical wind shear and its variability associated with small-scale vertical wind gradients that are not well represented by ECMWF short-term forecasts, particularly due to the limited vertical resolution of the model. Based on multimonth analysis differences between ECMWF and the National Centers for Environmental Prediction (NCEP), Baker et al. (2014) estimate an uncertainty of winds at 300 hPa on the order of 2–3 m s−1 over the northern North Atlantic.
More recently, Belmonte Rivas and Stoffelen (2019) compared surface winds represented by ERA5 with Advanced Scatterometer (ASCAT) observations and found systematic circulation errors in the sense that surface winds are too cyclonic across ocean basins in the reanalysis and meridional winds are too weak in midlatitudes. These surface wind errors were attributed to underestimation in directional wind turning (the Ekman spiral) across the boundary layer of the ECMWF model. Therefore, it can be anticipated that errors at tropopause level will not have the same characteristics as surface wind errors.
In this study we compare operational meteorological analyses and short-term forecasts of two global NWP centers, the ECMWF and the Met Office, with a unique set of wind profile observations across the tropopause that was obtained during the North Atlantic Waveguide and Downstream Impact Experiment (NAWDEX). NAWDEX was conducted in autumn 2016 with the aim to examine the structure of the jet stream, the impact of diabatic processes on the jet stream disturbances, and their influence on high-impact weather downstream (Schäfler et al. 2018). For the first time, an established Doppler wind lidar payload on board the research aircraft DLR Falcon performed dedicated observations of the jet stream winds providing both high vertical and horizontal resolution, which is not available from other observational sources. Additionally, the wind lidar dataset is supplemented by dropsonde and ground-based wind profiler observations to provide a wider coverage and to investigate the observational reliability of the wind lidar.
In section 2, we provide an overview of the observation and model data and the methods applied to validate analyses and short-term forecasts of ECMWF and Met Office. In section 3, a case study is presented with coordinated wind lidar and dropsonde observations of a jet stream near Iceland on 23 September 2016. Section 4 contains a statistical evaluation of the horizontal wind and vertical wind shear representation during the NAWDEX field phase based on the wind lidar dataset and wind profiler observations. Discussion of the results and conclusions are given in section 5. The implications of the findings are presented in section 6.
2. Data and methods
a. Airborne observations: Doppler wind lidar and dropsondes
During NAWDEX, wind observations on board the DLR Falcon were obtained by two Doppler wind lidar systems: the ALADIN Airborne Demonstrator (A2D; Reitebuch et al. 2009; Lux et al. 2018; Marksteiner et al. 2018) and the 2-μm Doppler wind lidar system (Weissmann et al. 2005; Witschas et al. 2017). In this study we rely on observations of the horizontal wind vector measured by the 2-μm Doppler wind lidar (DWL). Additionally, we use wind observations measured by in situ sensors in the nose-boom of the aircraft and by dropsondes that were released during coordinated flights with the High Altitude and Long Range Research Aircraft (HALO; Schäfler et al. 2018).
The coherent, heterodyne-detection DWL measures range-resolved profiles of the horizontal wind vector beneath the aircraft by detecting frequency shifts between the emitted and received laser signals. The DWL uses a wavelength of 2022.54 nm in an atmospheric window with low water vapor absorption, enabling wind measurements up to the maximum flight altitude of ~12 km, depending on the aerosol load beneath the aircraft. The DWL transmits short laser pulses with a length of 400–500 ns, a repetition rate of 500 Hz, and an energy of 1–2 mJ into the atmosphere beneath the aircraft. The signal is partly scattered back to the aircraft by aerosols and cloud particles, where it is received by a telescope and analyzed for the frequency shift Δf, which is proportional to the wind speed υLOS in the line of sight (LOS) according to Δf = (2f0 × υLOS)/c, where f0 is the laser frequency, c is the speed of light, and λ0 = c/f0 = 2022.54 nm is the laser wavelength. To derive a horizontal wind vector from LOS measurements, the DWL uses a double-wedge scanner to measure LOS winds at different pointing directions. A conical step-and-stare scan pattern [velocity–azimuth display (VAD) technique] around the vertical axis with an off-nadir angle of 20° provides 21 LOS observations per scanner revolution. A mean wind vector in the measurement volume is derived by combining these 21 LOS velocities from different viewing directions. A wind profile is derived every 42 s, that is, the time required for one complete scanner revolution with 21 LOS observations, including an averaging time of 1 s per LOS position and the scanner movement. Wind vectors are derived at a vertical resolution of 100 m. A more detailed description of the DWL and the wind retrieval algorithms can be found in Witschas et al. (2017).
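The two steps described above, converting a Doppler shift into an LOS velocity and combining the LOS velocities of one scanner revolution into a wind vector, can be illustrated with a minimal sketch. It is not the retrieval of Witschas et al. (2017), which also has to account for aircraft motion and attitude. The beam geometry, the azimuth convention, the sign convention, and the function names are assumptions made for illustration only.

```python
import numpy as np

WAVELENGTH_M = 2022.54e-9          # laser wavelength quoted in the text
OFF_NADIR_RAD = np.deg2rad(20.0)   # off-nadir scan angle quoted in the text
N_LOS = 21                         # LOS positions per scanner revolution


def los_velocity(delta_f_hz):
    """Invert Δf = 2 f0 v_LOS / c; with λ0 = c / f0 this gives v_LOS = λ0 Δf / 2."""
    return 0.5 * WAVELENGTH_M * np.asarray(delta_f_hz, dtype=float)


def beam_matrix(azimuth_rad):
    """Rows are downward-looking beam unit vectors in (east, north, up) axes.

    Assumed convention: azimuth is measured clockwise from north and the
    platform is level and at rest (no aircraft motion correction).
    """
    az = np.asarray(azimuth_rad, dtype=float)
    return np.column_stack([
        np.sin(OFF_NADIR_RAD) * np.sin(az),
        np.sin(OFF_NADIR_RAD) * np.cos(az),
        -np.cos(OFF_NADIR_RAD) * np.ones_like(az),
    ])


def fit_wind_vector(v_los, azimuth_rad):
    """Least-squares estimate of (u, v, w) from one revolution of LOS winds."""
    design = beam_matrix(azimuth_rad)
    wind, *_ = np.linalg.lstsq(design, np.asarray(v_los, dtype=float), rcond=None)
    return wind


# Synthetic check: project a known wind onto the 21 beams and recover it.
azimuth = np.linspace(0.0, 2.0 * np.pi, N_LOS, endpoint=False)
true_wind = np.array([40.0, 15.0, 0.0])      # hypothetical (u, v, w) in m/s
v_los = beam_matrix(azimuth) @ true_wind
recovered = fit_wind_vector(v_los, azimuth)
```

With 21 beams and three unknowns the system is overdetermined, so in real data the least-squares fit averages out part of the noise in the individual LOS measurements.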
During NAWDEX, the DLR Falcon successfully observed approaching cyclones and evolving jet streams surrounding Iceland. Eight flights were performed with the DWL between 17 September and 9 October 2016 [see Fig. 1a and overview in Schäfler et al. (2018)] corresponding to a total measurement time of 22 h 55 min and a total distance of ~17 000 km. In a total of 1922 measurement profiles between 0 and 12 km altitude, 77 541 horizontal wind measurements were obtained, which corresponds to a total data availability of about 33.8% resulting from low concentration of the required aerosol or cloud scatterers in the frequently sampled clean and dry tropospheric and lower-stratospheric air at high latitudes. However, the NAWDEX dataset provides a maximum in data availability where the average wind shows a maximum, between 8 and 10 km altitude (Fig. 1b). The maximum data availability of 80% at 9.4 km altitude corresponds to ~18 h 20 min of observations and a flight distance of 13 500 km. The mean profile separation, that is, the horizontal resolution, which depends on the speed of the aircraft and the time for one scanner revolution (~42 s), is approximately 8.6 km. The distribution of all observations shows that winds up to 91 m s−1 were sampled, which represents the highest wind speeds that have been observed by the DWL since its first airborne deployment in 2001.
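The sampling figures quoted above can be cross-checked with simple arithmetic. The short sketch below uses only numbers from the text; the variable names are illustrative, and the implied ground speed is a derived estimate rather than a value reported in the study.

```python
SCAN_PERIOD_S = 42.0           # time for one scanner revolution (from the text)
PROFILE_SEPARATION_KM = 8.6    # mean profile separation (from the text)

# Ground speed implied by one profile every 42 s at 8.6 km separation (about 205 m/s).
implied_ground_speed = PROFILE_SEPARATION_KM * 1000.0 / SCAN_PERIOD_S

# 80% availability applied to the total measurement time of 22 h 55 min.
total_minutes = 22 * 60 + 55
peak_minutes = round(0.80 * total_minutes)
hours, minutes = divmod(peak_minutes, 60)    # 18 h 20 min, as quoted

# 80% of the ~17 000 km total distance, quoted in the text as 13 500 km.
peak_distance_km = 0.80 * 17_000
```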
To assess the accuracy (systematic error) and precision (random error) of the DWL during the campaign, its winds are compared with independent observations. During three DLR Falcon research flights (RF02, RF03, and RF04) on 17, 21, and 23 September, coordinated flights with HALO provide 15 dropsondes that are used for a comparison with DWL winds. Dropsondes are small instrument carriers consisting of temperature, pressure, and humidity sensors as well as a GPS receiver. They transmit their data to the Airborne Vertical Atmospheric Profiling System (AVAPS; UCAR/NCAR–Earth Observing Laboratory 1993; Hock and Franklin 1999) on board the aircraft, which consists of a data acquisition and processing unit. AVAPS is a well-established dropsonde system that provides high-quality and high-resolution profile data from the flight altitude down to the ground (e.g., Wang et al. 2015). During NAWDEX the Vaisala dropsonde, version RD94, was used (Vaisala 2017) and the data were quality controlled using the automatic postprocessing Earth Observing Laboratory (EOL) Atmospheric Sounding Processing Environment (ASPEN; https://www.eol.ucar.edu/software/aspen) software. Wind speed accuracy is on the order of 0.2–0.3 m s−1 (H. Vömel 2019, personal communication).
The dropsonde wind observations were vertically interpolated to the DWL vertical resolution of 100 m and after accounting for the drift of the dropsonde, the spatially closest DWL observation was used for comparison. Figure 1c shows a scatterplot for 529 pairs of wind observations from the DWL and dropsondes ranging between 4 and 55 m s−1. Although the mean horizontal distance between sets of the compared observations is 10.8 km and maximum distances up to 29 km are reached, no dependence on the distance difference between both observations is discernible. The good agreement is reflected by a high correlation coefficient of 0.99. A linear fit reveals a slope value of 0.99 and an intercept of −0.004 m s−1. The mean bias is 0.05 m s−1 and the standard deviation is 1.87 m s−1. A more restrictive selection of data points, with a maximum horizontal distance between dropsonde and DWL of 10 km, leads to a reduced number of 245 observations for the comparison and a reduced standard deviation of 1.50 m s−1. These results are in agreement with earlier findings that are summarized in Table 1 following Witschas et al. (2020). Slight differences between the different campaigns may arise from different weather situations and related wind variability and aerosol loads resulting in different signal-to-noise ratios, differences in the retrieval algorithms and quality-control thresholds, or differences in the spatial–temporal collocation. Nevertheless, these results demonstrate the high accuracy and precision of the DWL.
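The statistics quoted in this comparison (mean bias, standard deviation, correlation, and linear fit, optionally restricted by collocation distance) could be computed along the following lines. This is a sketch under stated assumptions: the bias is taken as DWL minus dropsonde, the fit regresses DWL on dropsonde winds, and the function and argument names are hypothetical. The study does not state these conventions explicitly.

```python
import numpy as np


def compare_winds(dwl, sonde, distance_km=None, max_distance_km=None):
    """Summary statistics for collocated DWL and dropsonde wind speeds."""
    dwl = np.asarray(dwl, dtype=float)
    sonde = np.asarray(sonde, dtype=float)
    if max_distance_km is not None:
        # Optional stricter collocation, e.g. max_distance_km=10 as in the text.
        keep = np.asarray(distance_km, dtype=float) <= max_distance_km
        dwl, sonde = dwl[keep], sonde[keep]
    diff = dwl - sonde                          # assumed sign: DWL minus dropsonde
    slope, intercept = np.polyfit(sonde, dwl, 1)
    return {
        "n": int(diff.size),
        "bias": float(diff.mean()),
        "std": float(diff.std(ddof=1)),
        "corr": float(np.corrcoef(sonde, dwl)[0, 1]),
        "slope": float(slope),
        "intercept": float(intercept),
    }


# Synthetic example: a DWL that reads a constant 0.5 m/s too high.
sonde_speed = np.linspace(4.0, 55.0, 50)
dwl_speed = sonde_speed + 0.5
separation = np.linspace(0.0, 29.0, 50)         # hypothetical distances in km
stats_all = compare_winds(dwl_speed, sonde_speed)
stats_near = compare_winds(dwl_speed, sonde_speed, separation, max_distance_km=10.0)
```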
b. Wind profiler data at South Uist
In addition to the airborne observations described above, the stratospheric–tropospheric wind profiler (STP) located on the island of South Uist in the Outer Hebrides, Scotland (Winston 2004; location indicated in Fig. 1a), provides an overview of the wind conditions during the extended NAWDEX campaign period (10 September–20 October 2016). The ATRAD STP installed at the site has an operating frequency centered at 64 MHz and is able to provide wind measurements up to an altitude of 20 km with a vertical resolution of 500 m. It runs continuously, providing data to European meteorological services through the EUMETNET E-PROFILE Program (http://eumetnet.eu/activities/observations-programme/current-activities/e-profile/). Very high frequency (VHF) radio waves are generated by a 12 × 12 antenna array. The directional beams are partially scattered off irregularities in the atmospheric refractive index, and the LOS winds are derived from the Doppler-shifted return frequency. Horizontal wind components are constructed from a cyclic sequence of 5 vertical and near-vertical beam pointing directions, a technique known as Doppler beam swinging. The dwell time for each direction is 1 min, giving a maximum temporal resolution of 5 min; however, to reduce measurement errors, the data transmitted on the Global Telecommunication System (GTS) via the E-PROFILE network are averaged over 30-min periods, and these data are utilized here (data are available for download from the Met Office 2008). Typical measurement areas at ~10 km altitude are 5 km × 5 km. The STP data were assimilated at ECMWF and Met Office.
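As an illustration of Doppler beam swinging, the sketch below combines one vertical and four tilted beams into a wind vector using the textbook difference formulas. The tilt angle of 15°, the choice of north, east, south, and west beams, and the sign convention (radial velocity positive away from the radar) are assumptions; the actual beam configuration of the South Uist profiler is not given in the text.

```python
import numpy as np

ZENITH_RAD = np.deg2rad(15.0)   # hypothetical off-vertical tilt of the oblique beams


def radial_velocity(u, v, w, azimuth_rad, zenith_rad):
    """Wind projected onto a beam (positive away from the radar, by assumption)."""
    horizontal = u * np.sin(azimuth_rad) + v * np.cos(azimuth_rad)
    return horizontal * np.sin(zenith_rad) + w * np.cos(zenith_rad)


def dbs_wind(v_vertical, v_north, v_east, v_south, v_west, zenith_rad=ZENITH_RAD):
    """Closed-form (u, v, w) from one five-beam cycle.

    Opposite beams share the same vertical-wind contribution, so their
    difference isolates the horizontal component along that axis.
    """
    u = (v_east - v_west) / (2.0 * np.sin(zenith_rad))
    v = (v_north - v_south) / (2.0 * np.sin(zenith_rad))
    return u, v, v_vertical


# Synthetic check with a hypothetical jet-level wind.
u0, v0, w0 = 30.0, -10.0, 0.2
beams = {
    "vertical": radial_velocity(u0, v0, w0, 0.0, 0.0),
    "north": radial_velocity(u0, v0, w0, 0.0, ZENITH_RAD),
    "east": radial_velocity(u0, v0, w0, 0.5 * np.pi, ZENITH_RAD),
    "south": radial_velocity(u0, v0, w0, np.pi, ZENITH_RAD),
    "west": radial_velocity(u0, v0, w0, 1.5 * np.pi, ZENITH_RAD),
}
u_est, v_est, w_est = dbs_wind(
    beams["vertical"], beams["north"], beams["east"], beams["south"], beams["west"]
)
```

The difference formulas assume that the wind is horizontally uniform across the beams during one cycle, which is one reason why averaging over longer periods reduces the measurement error.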
The accuracy of the current configuration of the South Uist wind profiler has not been assessed systematically against independent high-resolution observations; however, a number of similar STP systems from the same manufacturer located in Australia have recently been evaluated against collocated radiosonde observations by Dolman et al. (2018). They find the slope of the line of best fit between the individual wind components measured by the two techniques to be in the range 0.93–0.97. Earlier STP systems have been systematically evaluated by Dibbern et al. (2001), who found typical mean wind speed biases relative to radiosonde measurements of order 0.09 m s−1 with a standard deviation of 1.5 m s−1.
c. Modeled winds
For the comparison, we use ECMWF operational analysis and short-term forecast fields from the atmospheric high-resolution model (HRES; IFS cycle 41r2) with spectral truncation TCo1280 (Malardel et al. 2016). The data were retrieved from ECMWF’s Meteorological Archival and Retrieval System (MARS) and interpolated to a 0.125° × 0.125° longitude–latitude grid (~14 km). The IFS is a hydrostatic atmospheric model that uses a hybrid-pressure vertical coordinate with 137 levels that transition from terrain-following surfaces into pressure surfaces with increasing altitude (Simmons and Burridge 1981). To compare with wind observations, the pressure at each level is first calculated from the surface pressure; the geopotential height is then derived by integrating the hydrostatic equation using the pressure and temperature profiles. Details on the vertical discretization and altitude calculation can be found in the IFS documentation in Part III: Dynamics and Numerical procedures (available at https://www.ecmwf.int/en/forecasts/documentation-and-support). We use 6-h analysis fields (0000, 0600, 1200, and 1800 UTC) in combination with hourly forecasts initialized at 0000 and 1200 UTC for the intermediate time steps (e.g., Schäfler et al. 2010), as higher temporal frequency reduces the error in interpolating model data to observation points. This strategy is used by many authors, for example, for airmass trajectory calculations, despite the differences between analyses and short-range forecasts, because the reduced interpolation error has been shown to reduce net trajectory error (e.g., Stohl et al. 2001).
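The altitude calculation for a hybrid-pressure model can be sketched in two steps: level pressure from the surface pressure, followed by a hypsometric integration of the hydrostatic equation. The version below is simplified and makes several assumptions: it works on full levels only (the IFS documentation defines the scheme on half and full levels), it ignores moisture (no virtual temperature), it returns geopotential height with a constant gravity value, and the coefficients `a` and `b` in the example are made up.

```python
import numpy as np

R_DRY = 287.05    # gas constant of dry air (J kg-1 K-1)
G0 = 9.80665      # standard gravity (m s-2)


def level_pressure(a_pa, b, surface_pressure_pa):
    """Hybrid-pressure coordinate: p = a + b * p_surface."""
    return np.asarray(a_pa, dtype=float) + np.asarray(b, dtype=float) * surface_pressure_pa


def hydrostatic_height(pressure_pa, temperature_k, surface_pressure_pa, surface_height_m=0.0):
    """Integrate dz = -(R T / g) dln(p) upward from the surface.

    Levels must be ordered from the surface upward (decreasing pressure).
    The lowest level temperature is reused for the surface (an assumption).
    """
    p = np.concatenate(([surface_pressure_pa], np.asarray(pressure_pa, dtype=float)))
    t = np.asarray(temperature_k, dtype=float)
    t = np.concatenate(([t[0]], t))
    layer_mean_t = 0.5 * (t[:-1] + t[1:])
    thickness = R_DRY / G0 * layer_mean_t * np.log(p[:-1] / p[1:])
    return surface_height_m + np.cumsum(thickness)


# Illustrative three-level column with made-up coefficients.
a = [0.0, 5000.0, 20000.0]
b = [0.95, 0.60, 0.0]
p_surface = 100000.0
pressure = level_pressure(a, b, p_surface)               # 950, 650, and 200 hPa
height = hydrostatic_height(pressure, [250.0, 250.0, 250.0], p_surface)
```

Note how the uppermost level has `b = 0` and is therefore a pure pressure surface, while the lowest level follows the surface pressure closely; this is the transition from terrain-following to pressure surfaces described above.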
The NAWDEX wind observations are also compared with operational analyses and forecasts from the Met Office using the Met Office Unified Model (MetUM). The MetUM is a nonhydrostatic fully compressible model with deep atmosphere dynamics. The model version in use in 2016 was the GA6.1/GL6.1 science configuration (Walters et al. 2017) operating with a horizontal N768 grid (~17 km grid-spacing in midlatitudes), with 70 vertical levels on a terrain-following hybrid-height Charney–Phillips grid. Since this model is formulated in hybrid-height coordinates, no vertical integration is required to derive altitude values. To compare with the observations, the wind components are output on model levels and simply interpolated in the horizontal and vertical to the coordinates of the observations using linear interpolation in space and time. Forecasts are initialized from analyses at 6-h intervals (0000, 0600, 1200, and 1800 UTC) with data output at 1-h intervals.
Please note that the DWL profile data are an independent dataset, meaning that they were not assimilated by the IFS or MetUM data assimilation systems. In contrast, all dropsondes released during NAWDEX (Schäfler et al. 2018) and the STP data were distributed on the GTS and assimilated in the ECMWF (Schindler et al. 2020) and the Met Office prediction systems.
Figure 2 shows the distribution of IFS and MetUM model levels between ground and 15 km altitude in comparison with the vertically constant resolution of 100 m for the DWL and 500 m for the STP at South Uist. In the region 8–14 km where the jet stream is typically observed, the IFS provides 19 vertical levels with a mean vertical distance of ~300 m ranging from 290 to 310 m. The MetUM provides 11 levels at a mean vertical separation of ~550 m ranging from 460 to 630 m in this region. As we are interested in the model capability to capture the observed sharp gradients at the tropopause, we perform the comparisons at the vertical resolution of the DWL by linearly interpolating the model data in the vertical to the observation location. Likewise, the 1-hourly model data are bilinearly interpolated in the horizontal to the profile location and linearly in time to the observation time (Schäfler et al. 2010). Please note that for the dropsondes, the model data were interpolated to the location along the fall trajectory of each dropsonde (tracked by GPS). In the case of the wind profiler, we used data at a 6-hourly time resolution and only compare profiles at the time of the analysis to avoid an influence of short-term forecast error.
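The interpolation chain described here (bilinear in the horizontal, linear in time, linear in the vertical) can be sketched with NumPy alone. The array layout `(time, level, lat, lon)`, the function names, and the synthetic test field are assumptions for illustration; the order in which the three interpolations are applied in the study is not stated.

```python
import numpy as np


def bilinear(field, lats, lons, lat, lon):
    """Bilinear interpolation of field[..., lat, lon] on a regular ascending grid."""
    i = int(np.clip(np.searchsorted(lats, lat) - 1, 0, len(lats) - 2))
    j = int(np.clip(np.searchsorted(lons, lon) - 1, 0, len(lons) - 2))
    wy = (lat - lats[i]) / (lats[i + 1] - lats[i])
    wx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    return ((1 - wy) * (1 - wx) * field[..., i, j]
            + (1 - wy) * wx * field[..., i, j + 1]
            + wy * (1 - wx) * field[..., i + 1, j]
            + wy * wx * field[..., i + 1, j + 1])


def model_profile_at_obs(wind, height, times, lats, lons, t_obs, lat, lon, z_obs):
    """Model wind at the observation time, position, and heights.

    wind and height have shape (time, level, lat, lon), levels ordered bottom-up.
    Heights outside the model column are clamped by np.interp.
    """
    k = int(np.clip(np.searchsorted(times, t_obs) - 1, 0, len(times) - 2))
    wt = (t_obs - times[k]) / (times[k + 1] - times[k])
    columns = []
    for var in (wind, height):
        before = bilinear(var[k], lats, lons, lat, lon)
        after = bilinear(var[k + 1], lats, lons, lat, lon)
        columns.append((1 - wt) * before + wt * after)
    wind_column, height_column = columns
    return np.interp(z_obs, height_column, wind_column)


# Synthetic field that is linear in every coordinate, so interpolation is exact.
times = np.array([8.0, 9.0, 10.0])                    # hours UTC
levels = np.array([8000.0, 8300.0, 8600.0, 8900.0])   # ~300 m spacing
lats = np.arange(60.0, 62.0, 0.125)
lons = np.arange(-20.0, -18.0, 0.125)
T, Z, LA, LO = np.meshgrid(times, levels, lats, lons, indexing="ij")
wind = 2.0 * T + 0.001 * Z + 0.5 * LA + 0.25 * LO
z_obs = np.array([8100.0, 8450.0])                    # two 100-m DWL gates
profile = model_profile_at_obs(wind, Z, times, lats, lons, 8.4, 60.3, -19.1, z_obs)
```

Linear interpolation of a model with ~300–550 m level spacing onto a 100-m grid does not add information, so sharp observed gradients between model levels appear as differences in the comparison.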
3. Case study
a. Synoptic overview
First, a case study on NAWDEX intensive observation period (IOP) 3 on 23 September 2016 is presented that comprises HALO (RF 03), DLR Falcon (RF 04), and the Facility for Airborne Atmospheric Measurements (FAAM) BAe 146 (RF 01) flights that observed ascending air masses within Cyclone Vladiana (Schäfler et al. 2018). In this paper the focus is on the flight of the DLR Falcon southeast of Iceland between 0710 and 1017 UTC (Fig. 3) that was coordinated with HALO between 0800 and 0900 UTC. After the joint leg, the DLR Falcon returned to Keflavik, Iceland, and HALO turned southwestward to observe a strong warm conveyor belt (WCB) related to Cyclone Vladiana (Oertel et al. 2019). At 0900 UTC, the center of Cyclone Vladiana (V) was located south of Iceland, and a second low was located to the west (Fig. 3a). The occluded frontal system related to Vladiana is visible in the increased relative humidity at 700 hPa north and west of the cyclone center and in the clouds along the cold and warm fronts in the eastern and southeastern sector of the cyclone. In the upper-level outflow of the WCB, which can be seen from the approaching high-level clouds (Fig. 4), a weak ridge has formed with its axis from northwestern Scotland toward Iceland (Fig. 3b). On their coordinated leg, the DLR Falcon and HALO entered a region of increased jet stream winds along the northeast flank of the ridge (Fig. 3b). Increased jet stream winds follow the 2 PVU contour on the 320 K isentropic surface (cf. Figs. 3a,b) and a second wind speed maximum occurred along the western flank of the ridge. On the coordinated leg dropsonde observations were made by the HALO aircraft (see colored dots in Fig. 3b). The aircraft were separated by only 50-km horizontal distance along the coordinated flight leg. Additionally, the flight was located relatively close to the wind profiler in South Uist (Fig. 3b) that was observing the jet stream while it moved over the station.
b. Observations and model evaluation
Figure 5a shows DWL wind speed observations along the entire 2340-km-long flight between 0710 and 1017 UTC (see track in Fig. 3a). After take-off at Keflavik, the Falcon initially loitered near Iceland between 0710 and 0800 UTC to wait for the HALO aircraft to join the coordinated flight leg between 0800 and 0900 UTC toward the southeast and after that returned along the same track to Iceland. In the first part of the flight leg, the data coverage in clean and dry air is low and restricted to a band extending from 1000 m to about 1500 m beneath the aircraft and to the lowest ~2 km above the ocean. In the upper band, the signal intensity is high near the aircraft, whereas an increased load of sea salt aerosol and low-level clouds increases the atmospheric return near the surface (cf. low-level clouds northeast of the WCB-induced cirrus in Fig. 4). The data coverage improves and the observed wind speeds increase up to a maximum of 58 m s−1 when both aircraft approached the upper-level cirrus clouds at about 0825 UTC and entered the region of the jet stream. The return along the same flight track causes the symmetry in the wind field in Fig. 5a. The following discussion concentrates on the coordinated part and the return flight with increased upper-level winds between 5 and 12 km altitude (gray box in Fig. 5a). The DWL observations in this subset and the complementary in situ and dropsonde observations (Fig. 5b) depict the jet stream. Dropsonde winds above and below the DWL observations confirm that, despite the limited data coverage, the DWL captured the entire vertical extent of the jet stream. Maximum wind speeds follow the dynamical tropopause with increased static stability above, as visible from the large vertical gradient of potential temperature. In the following we use the term tropopause as a synonym for the dynamical tropopause, where PV equals 2 PVU. 
North of Cyclone Vladiana, a colder Arctic air mass was advected beneath the ascending warm air and formed a tropopause fold structure along the transect that was also intersected on the return flight. The ascending warm air mass with elevated tropopause altitude can be characterized by two separate regions. The first part with tropopause altitudes of about 9 km (~0812–0826 and 0948–1000 UTC) features low data coverage in the tropospheric air mass indicating a lack of cirrus clouds, while the second region with the tropopause located at about 10 km altitude (~0826–0948 UTC) is characterized by increased returns from the DWL due to the cirrus clouds.
Figures 5c and 5d show differences of horizontal wind speed between ECMWF IFS and Met Office MetUM forecasts (using +8, +9, and +10 h forecasts for the IFS and +2, +3, and +4 h for the MetUM) and DWL observations, respectively. The IFS shows coherent areas of increased negative wind speed differences above and below the tropopause corresponding to underestimated winds with peak values of up to −17 m s−1. The MetUM wind speed differences are slightly weaker and feature positive and negative regions that range between −10.5 and 9.5 m s−1. Please note that the depicted error structures are mirrored on the return flight toward Iceland. The consistency of the wind speed differences derived from the three measurement types (DWL, in situ, and dropsondes) highlights the reproducibility and representativeness of the measurements. The dropsonde profiles suggest that the largest differences occurred near the tropopause. The IFS and MetUM wind speed differences differ substantially, although it can be noted that the most negative differences in the MetUM tend to occur at approximately the same location as in the IFS. Interestingly, the IFS and MetUM tropopause altitudes differ, as can be seen from the PV distribution in Fig. 6. The tropopause fold and leading edge of the tropospheric air mass appear earlier along the section in the MetUM, which corresponds to a northwestward shift. Similarly, the second increase in tropopause altitude, that is, the region of low PV values, was approached at about 0820 UTC in the MetUM (Fig. 6a) and is located farther northwest along the flight track than in the IFS (Fig. 6b). Toward the southeast of the flight section, MetUM overestimates the jet stream wind (Fig. 5d); this is most likely caused by a different representation between the models of the dynamics associated with the WCB outflow of Vladiana, which is suggested by the higher diagnosed tropopause in the MetUM compared to the IFS in this region.
Although this indicates the importance of a correct representation of the tropopause altitude, a vertical shift would be expected to show up as a vertical dipolelike structure in the wind speed differences, while this is not the structure found.
To investigate the representation of winds near the tropopause in more detail, observed and modeled wind profiles at the location of the six dropsondes are examined (Fig. 7). The close correspondence of DWL measurements (dots) and dropsonde winds (color lines) for these six profiles is consistent with the general statistical comparison shown in Fig. 1c. The maximum wind speed was observed by the DWL at the location of the easternmost dropsonde with 57.5 m s−1 at 10.1 km altitude. Unfortunately, the associated dropsonde was launched at a lower altitude of 8.6 km (after HALO descended to a lower flight level) and therefore did not capture this wind maximum (Fig. 5b). A qualitative comparison of the observations (Fig. 7a) and the IFS profiles interpolated to the observation points (Fig. 7b) shows that the altitude of the wind maxima coincides well, while both the strength of the wind maximum and the vertical gradients are underestimated resulting in increased negative wind speed differences in the jet stream above 9 km (Fig. 7c). The observations exhibit a step-like change in vertical wind shear at ~10 km altitude, which is not represented in the IFS. The MetUM forecasts (Fig. 7e) show a more realistic representation of the peak wind speeds. However, the strong vertical gradients are underestimated especially above the wind maximum where the observed step-like change in wind speed with height is not represented correctly, which results in increased wind speed differences (Fig. 7f).
To account for the variability in tropopause altitude along the flight and the height of the wind maximum that differs between the dropsonde locations, wind speeds are displayed with respect to their vertical distance to the tropopause identified by 2 PVU (1 PVU = 10−6 K kg−1 m2 s−1) (Figs. 7g–l). Using the tropopause as a reference is an established approach to investigate tropopause sharpness and related chemical gradients (e.g., Birner 2006; Pan et al. 2004). In tropopause-relative coordinates, the observed wind profiles transecting the jet stream (sondes 2–6) collapse on each other showing that the observed peak wind speed and abrupt change in vertical wind shear is approximately collocated with the dynamic tropopause defined in terms of simulated PV. However, there are differences using the tropopause of the IFS (Fig. 7g) and the MetUM (Fig. 7j). For example, the maximum wind in DWL observations at the easternmost dropsonde profile (dots in Fig. 7g) is situated less than 300 m above the IFS tropopause, while the MetUM tropopause is only 100 m above this DWL wind maximum (Fig. 7j). These displacements are less than the model level spacing in the IFS and MetUM and therefore better correspondence cannot be expected. Although the tropopause location has some inherent uncertainty, difference features from multiple profiles are more coherent in the tropopause-relative framework. The distributions of modeled wind speeds (Figs. 7h,k) and respective differences (Figs. 7i,l) emphasize the finding that the IFS underestimates the wind maxima and tropopause sharpness and that the MetUM performs better in terms of wind speeds and gradients in this particular case. Note also that the observations are compared with longer lead time forecasts for the IFS than for the MetUM (due to the operational forecast frequency). 
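A tropopause-relative coordinate of this kind could be constructed as sketched below: locate the altitude where the model PV profile crosses 2 PVU and subtract it from the observation heights. The handling of multiple crossings (as occur in tropopause folds) is an assumption; this sketch takes the uppermost upward crossing, which may differ from the choice made in the study. The function names and example values are hypothetical.

```python
import numpy as np

PV_TROPOPAUSE = 2.0   # dynamical tropopause threshold in PVU


def tropopause_height(z, pv, threshold=PV_TROPOPAUSE):
    """Height of the uppermost upward crossing of the PV threshold.

    z and pv are ordered bottom-up. Returns NaN if no crossing is found.
    """
    z = np.asarray(z, dtype=float)
    pv = np.asarray(pv, dtype=float)
    crossing = np.nonzero((pv[:-1] < threshold) & (pv[1:] >= threshold))[0]
    if crossing.size == 0:
        return np.nan
    k = crossing[-1]
    frac = (threshold - pv[k]) / (pv[k + 1] - pv[k])   # linear between levels
    return z[k] + frac * (z[k + 1] - z[k])


def tropopause_relative(z_obs, z, pv, threshold=PV_TROPOPAUSE):
    """Observation heights expressed as distance to the model tropopause."""
    return np.asarray(z_obs, dtype=float) - tropopause_height(z, pv, threshold)


# Illustrative model column and three 100-m observation gates.
z_model = np.array([8000.0, 9000.0, 10000.0, 11000.0])
pv_model = np.array([0.5, 1.0, 3.0, 6.0])
z_trop = tropopause_height(z_model, pv_model)          # 9500 m
rel = tropopause_relative([9400.0, 9500.0, 9600.0], z_model, pv_model)
```

Because the crossing is interpolated between model levels, the diagnosed tropopause height carries an uncertainty comparable to the level spacing, in line with the displacements discussed above.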
Nevertheless, this analysis shows that the wind speed differences are influenced by diverse uncertainties related to the representation of the peak winds, the strength of vertical wind shear on the stratospheric and tropospheric sides of the tropopause and uncertainty in tropopause altitude.
Figure 7 shows that the vertical gradient of wind speed is underrepresented on both sides of the tropopause over a considerable distance (more than 1 km), which spans several model levels in both the IFS and MetUM. To further investigate the structure of vertical wind shear, Fig. 8a shows the magnitude of the vertical shear in the vector wind, calculated at points along the cross section, as derived from the DWL and dropsonde observations. Thin, but horizontally extended, layers of high vertical wind shear are observed along the tropopause and also ~1 km above it. Although each layer is too thin to be resolved in the NWP data (Figs. 8b,c), both models indicate increased vertical shear above the tropopause. The important question for Rossby wave propagation is whether the vertical wind shear above and below the tropopause is too weak in the models on average, since this would imply a weaker PV gradient.
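The magnitude of the vertical shear in the vector wind can be computed from a single profile by finite differences between adjacent levels. The sketch below assumes a profile given as altitude plus zonal and meridional wind components; the function name and the sample values are hypothetical, and the actual processing of the DWL and dropsonde data may differ.

```python
import math


def shear_magnitude(z, u, v):
    """Magnitude of the vector wind shear |dV/dz| between adjacent levels.

    z in m, u and v in m/s; returns layer mid-heights (m) and shear (1/s).
    """
    z_mid, shear = [], []
    for k in range(len(z) - 1):
        dz = z[k + 1] - z[k]
        du = u[k + 1] - u[k]
        dv = v[k + 1] - v[k]
        z_mid.append(0.5 * (z[k] + z[k + 1]))
        shear.append(math.hypot(du, dv) / dz)
    return z_mid, shear


# Hypothetical profile at 100-m vertical spacing.
z = [9800.0, 9900.0, 10000.0, 10100.0]
u = [50.0, 52.0, 55.0, 52.0]
v = [10.0, 10.0, 14.0, 14.0]

z_mid, shear = shear_magnitude(z, u, v)
print(list(zip(z_mid, shear)))
```

Using the vector difference rather than the difference in wind speed means that directional turning with height contributes to the shear even when the speed is unchanged.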
For a quantitative comparison, Fig. 9 shows horizontal averages of wind speeds and vertical shear in a tropopause-relative framework for this flight. Figures 9a and 9b reiterate the finding of increased wind errors above the tropopause in the IFS compared to MetUM (see also Figs. 5c and 5d). Vertical wind shear is higher on the stratospheric side of the tropopause in both models (Figs. 9c,d); however, it is clearly underestimated compared to the observations. The higher spread in the observed vertical shear is dominated by the small-scale layers (Fig. 8a) that cannot be represented at the current model resolution. The maximum vertical shear observed by the DWL with a 100-m vertical resolution is 0.23 s−1, which certainly is a local extreme. For this case study, the median observed vertical shear is 0.031 s−1 above and 0.013 s−1 below the tropopause. Corresponding median values are 0.018/0.010 s−1 for the IFS and 0.021/0.013 s−1 for the MetUM, which indicates a significant underestimation of shear, especially above the tropopause, in this case.
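The case-study medians quoted above can be turned into underestimation factors (observed divided by modeled shear) with simple arithmetic. The numbers are those given in the text; the dictionary layout and function name are only illustrative.

```python
# Median vertical shear (1/s) above and below the tropopause for the case study.
observed = {"above": 0.031, "below": 0.013}
models = {
    "IFS": {"above": 0.018, "below": 0.010},
    "MetUM": {"above": 0.021, "below": 0.013},
}


def underestimation_factor(obs, mod):
    """Ratio of observed to modeled shear; values above 1 mean underestimation."""
    return obs / mod


factors = {
    name: {side: underestimation_factor(observed[side], vals[side]) for side in observed}
    for name, vals in models.items()
}

for name, vals in factors.items():
    print(name, {side: round(x, 2) for side, x in vals.items()})
```

For these medians the factor is largest above the tropopause in the IFS and equals one below the tropopause in the MetUM, in line with the statement that the underestimation is most pronounced on the stratospheric side.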
4. Statistical assessment of wind speed differences
Section 3 focused on the structure of the observed wind speeds and vertical shear for one case study and gave an indication of significant uncertainties in the representation of jet stream winds in global NWP models, especially at the level of the midlatitude tropopause. To investigate whether these uncertainties were systematically occurring features during NAWDEX, the following section addresses campaign statistics based on the entire DWL dataset and the wind profiler data at South Uist (location in Fig. 1).
a. Wind lidar dataset
Frequency distributions for all DWL wind speed observations from NAWDEX in tropopause-relative coordinates make use of the IFS definition of the tropopause in Fig. 10a and the MetUM tropopause in Fig. 10b. Both the wind distributions and the mean and median wind curves look similar. Small differences between the two can be explained by slightly different tropopause altitudes, as discussed in section 3b. Average winds peak around the tropopause with a maximum median (mean) wind speed of ~41 m s−1 (~38 m s−1), which is found in the 500 m below the tropopause. Above and below the tropopause, winds quickly decline. The altitude range from 1 km above to 2 km below the tropopause provides slightly weaker maxima in the frequency distributions, indicating broader distributions and thus more variability in the winds. The highest data coverage from the DWL is found around the tropopause, which is a result of the chosen flight altitude. Some increased frequencies above the tropopause appear at high wind speeds and are related to situations where the tropopause altitude rapidly decreases in the stratospheric air, that is, on the cyclonic shear side of the jet stream, for example at ~0810 UTC in Fig. 5b. In such situations high wind speeds are attributed to low tropopause altitudes.
The median (mean) wind speed difference of −0.41 m s−1 (−0.68 m s−1) for the IFS and −0.15 m s−1 (−0.28 m s−1) for the MetUM derived from the 77 541 modeled and observed wind speeds is small. Frequency distributions of the differences for 1 km altitude bins relative to the tropopause provide information on the vertical distribution of biases in the IFS (Fig. 10c) and MetUM (Fig. 10d). Generally, the median (mean) differences are small at all altitudes ranging between −1.54 m s−1 (−1.72 m s−1) and 0.38 m s−1 (0.30 m s−1) in the IFS and −0.9 m s−1 (−1.0 m s−1) and 0.36 m s−1 (0.22 m s−1) in the MetUM. Please note that most of the wind speed differences are found to be statistically significant based on the 95% confidence interval that was calculated from 1000 bootstrap samples. Interestingly, the highest variability in the differences is visible in the altitude bin directly above the tropopause in both models indicating increased uncertainty in the representation of the winds at this location. This is particularly striking when viewing individual frequency curves for each range bin (Fig. 11). The differences in the first kilometer above the tropopause provide a significantly broader distribution (standard deviation of 3.98 m s−1 for the IFS and 3.82 m s−1 for the MetUM) compared to the mean curve (standard deviation of 3.23 m s−1 for the IFS and 3.17 m s−1 for the MetUM).
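A bootstrap confidence interval for the median difference, as mentioned above, can be sketched as follows. The percentile method, the function name, and the synthetic Gaussian sample are assumptions for illustration; the study does not specify which bootstrap variant was used, and the real differences are not reproduced here.

```python
import random
import statistics


def bootstrap_ci_median(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(values)
    medians = []
    for _ in range(n_boot):
        resample = rng.choices(values, k=n)  # draw n values with replacement
        medians.append(statistics.median(resample))
    medians.sort()
    lo_idx = round((alpha / 2) * n_boot)
    hi_idx = round((1 - alpha / 2) * n_boot) - 1
    return medians[lo_idx], medians[hi_idx]


# Synthetic model-minus-observation differences (m/s); NOT the NAWDEX data.
rng = random.Random(42)
diffs = [rng.gauss(-0.4, 3.2) for _ in range(500)]

ci_low, ci_high = bootstrap_ci_median(diffs)
sample_median = statistics.median(diffs)
# The bias is called significant here if the interval excludes zero.
significant = not (ci_low <= 0.0 <= ci_high)
print(sample_median, (ci_low, ci_high), significant)
```

With tens of thousands of samples, as in the DWL dataset, the interval becomes narrow, which is why even biases of a few tenths of a meter per second can be statistically significant.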
Figures 10e and 10f show the magnitude of vertical shear for the DWL dataset. The vertical distribution of median and mean vertical shear using IFS and MetUM is remarkably similar around the tropopause. Observed median (mean) values in the troposphere range from 0.01 s−1 (0.013 s−1) to 0.016 s−1 (0.02 s−1) with values decreasing with height toward the tropopause. Above the tropopause vertical shear values jump to 0.021 s−1 (0.023 s−1) before they again decrease to ~0.014 s−1 (0.017 s−1). The increased difference between mean and median levels relates to the skewed distributions at all altitudes. The vertical shear differences between the models and the DWL observations for the IFS (Fig. 10g) and the MetUM (Fig. 10h) show an underestimation at all levels with the smallest errors in the 2 km below the tropopause. This is in agreement with the case study presented in Fig. 9. Expressed as a ratio of observed and modeled vertical shear, the factor of underestimation ranges between 1.3 and 5 for the median in both models. The underestimation is lower (factor of 1.5 to 2) in the upper troposphere where observed vertical shear is small and directly above the tropopause where the simulated vertical shear shows a maximum (cf. Figs. 10e,f).
One could ask to what extent this result is reproducible in a different year or season. Therefore, we repeated the statistical comparison for the WindVAL-I campaign that was conducted from Iceland in the period 11 to 29 May 2015 and that used the same DWL instrument to measure horizontal wind speed (Reitebuch et al. 2017; Marksteiner et al. 2018). The appendix (Fig. A1a) again shows increased data coverage around the tropopause. Although the mean winds are smaller than during NAWDEX and almost constant with altitude for this campaign (Fig. A1a), the largest variability in the wind speed differences again occurs in the altitude bin directly above the tropopause (Fig. A1b). Vertical wind shear (Fig. A1c) also shows a comparable distribution with weakest differences in the upper troposphere. As during NAWDEX, the vertical shear in the IFS (Fig. A1d) is too weak at all altitudes, with underestimation ratios ranging between 2 and 3.5 and the higher ratios found in the lower troposphere.
b. Ground-based wind profiler dataset
To investigate the representativeness of the DWL comparison with NWP data, the ECMWF and Met Office analysis data are additionally compared with STP wind profiles at South Uist providing a continuous time series in the NAWDEX observation area. During the NAWDEX period the wind situation above South Uist is characterized by large variability (Fig. 12a). Especially in the first half of the period, repeated passages of strong wind events accompanied by increased tropopause variability are noticeable. The tropopauses in the MetUM and IFS are located at similar altitudes with a mean difference of approximately 100 m. Jet stream observations are related to IOP 1 (Tropical Cyclone Ian) on 17 September, IOP 2 (Cyclone Ursula) on 22 September, IOP 3 (Vladiana) from 23 to 25 September, and IOP 4 (Tropical Storm Karl) from 27 to 29 September. Increased winds on 3 and 7 October can be related to IOP 6 (the Stalactite Cyclone) and IOP 8, respectively. In the second half of the time series, upper-level wind speeds, as well as the variability of the tropopause, become lower as a block established over Europe (Schäfler et al. 2018).
Figure 12b shows 6-h forecasts from the Met Office that correspond to the background forecasts in the data assimilation process. In the one-month period, two types of situation stand out with increased wind speed differences. First, frontal passages, which can be identified from tilted isentropes, most often feature overestimated wind speeds in the lower troposphere. Second, situations with strong upper-level winds, elevated tropopause altitudes, and sharp vertical gradients in winds and static stability predominantly feature underestimated wind speeds in the first 2 km above the tropopause. Figure 12c shows the Met Office analysis profiles compared with the STP observations. Clearly, the data assimilation of the STP observations reduces the errors in the background field. However, negative analysis differences remain in situations of increased errors in the 6-h forecast, for example, on 12, 17, and 24–25 September. The comparison of ECMWF analysis profiles with the STP observations (Fig. 12c) reveals very similar errors, even in situations of large tropopause variability, which is remarkable as both forecasting systems use different data assimilation schemes and models. Consistent with the DWL observations, the diagnosed wind speed errors show increased uncertainty of the winds above the tropopause with a tendency toward underestimation, especially above tropopause ridges.
A unique set of comprehensive airborne and ground-based wind profile observations was used to characterize the structure of the jet stream and to evaluate the representation of winds across the tropopause in the two state-of-the-art global operational NWP forecasting systems of the ECMWF and the Met Office. The study covers the high-latitude North Atlantic Ocean where conventional data sources for winds are sparse. The NAWDEX period was characterized by high wave activity and variable predictability (Schäfler et al. 2018).
The independent (not assimilated) DWL dataset features 1922 wind profiles at high horizontal (8.6 km profile spacing) and vertical resolution (100 m) during eight flights. Comparison of DWL wind profiles with dropsondes demonstrates the low measurement error, which is needed to quantify meteorological analysis errors. Although NWP models are characterized by lower horizontal and vertical resolution than the DWL data, the average representation of the winds is remarkably good. Statistical assessment using the DWL dataset provided median (mean) biases of −0.41 m s−1 (−0.68 m s−1) for the IFS and −0.15 m s−1 (−0.28 m s−1) for the MetUM. The comparison with temporally continuous lidar profiles requires a temporal interpolation from NWP analysis and forecast data, so forecast errors have likely affected the differences with NWP data. The longer forecast intervals that were used for the ECMWF data (forecasts initialized at 0000 and 1200 UTC) compared to the MetUM (initialized at 0000, 0600, 1200, and 1800 UTC) may have caused slightly higher average negative wind speed differences in the IFS. NWP profiles were found to be smoother and less detailed for the IFS compared to the MetUM. Diagnosed average biases are smaller at all altitudes than those reported in studies up to the early 2000s, which were on the order of 5%–10% (Tenenbaum 1991, 1996; Rickard et al. 2001; Cardinali et al. 2004). This study corroborates that recent advances in NWP connected to improved data assimilation methods, improved data quality and availability, and increased model resolution and better formulation have led to a significant improvement of the wind analysis quality in the midlatitudes. However, Horányi et al. (2015) have shown that even small-scale systematic observational wind errors on the order of 1 m s−1 can significantly degrade forecast quality after 24 h.
This study also shows that wind errors still reach values exceeding ±10 m s−1 (i.e., about 3σ of the difference distributions) for individual cases and that error structures are of large extent and spatially correlated (up to ~500 km in the horizontal and 1–2 km in the vertical) in the analyses and short-range forecasts of ECMWF and Met Office. DWL measurement errors are found to be smaller than the errors in NWP data and typically uncorrelated. Forecast and analysis error structures are most prominent immediately above the tropopause on the flanks of upper-level ridges where strongest vertical wind shear occurs (e.g., Fig. 5). The same wind error structures are found in the comparison of modeled profiles with the STP radar profiler data over a 6-week period (Fig. 12). The spatial structure of near-tropopause errors is similar in ECMWF and Met Office short-range forecasts and analyses, even though the forecast models and data assimilation schemes differ greatly. Moreover, increased wind uncertainty directly above the tropopause could be confirmed for the WindVAL-I campaign in 2015.
The different observation types, used in this study, have very different sampling characteristics. The DWL observations represent samples from 8.6 km line segments, the STP profiler measurements represent a volume of size 5 km × 5 km × 500 m (at 10 km) averaged over 30 min, while the dropsondes are effectively point measurements along the sonde trajectory. These are compared with winds from NWP models represented on a grid with an approximate horizontal spacing of 15 km and vertical level spacing of 300 m in the IFS and 17 km and 550 m in the MetUM (see Fig. 2). Therefore, such a validation of NWP data will inevitably be affected by a representation (sampling) error (e.g., Janjić et al. 2017). For this reason, data assimilation uses an assigned observation error that is a combination of instrument and representation error. Weissmann et al. (2005) estimate the representation error to range between 1.5 m s−1 for a point measurement in a 40 km grid box and 0.15 m s−1 for a line measurement through that box. They argue that typical assigned observation errors of 2–3 m s−1 may be too high. To account for the difference in the representation of the data, the observations could be averaged before comparing. However, this study aimed at investigating how far the models deviate from “nature” as observed by the DWL and STP. The large horizontal and vertical scales of the correlated wind error structures (several hundred kilometers horizontally and 1–2 km vertically) can be represented on the grids used by the NWP models. Furthermore, error features persisted for extended periods of time (hours to several days) in the time series of the STP (Fig. 12). The magnitude of the errors (up to 10 m s−1) and the systematic occurrence at the flank of and above ridges indicates that these structures cannot be explained by representation and measurement error alone.
The analysis of vertical wind shear revealed that observed values rapidly increase above the tropopause and that median vertical shear is underestimated in both models at all altitudes by a factor of 1.5 to 5. This is in line with Houchi et al. (2010), who found an underestimation by a factor of 2.5 to 3 for vertical shear of the zonal and meridional wind and illustrated that most of the missing vertical shear can be explained by the lower vertical resolution of the model profiles. By vertically averaging winds they estimate an effective vertical resolution for wind shear of 1.7 km for the IFS version in 2006 with 91 model levels. Furthermore, the missing small-scale variability of vertical wind shear that was demonstrated along the DWL cross section (Fig. 8) is in line with their findings.
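The effect of vertical resolution on diagnosed shear can be illustrated with a toy profile. The sketch below averages a 100-m profile over coarser layers and recomputes the maximum shear; it is an illustration of the general idea only, not the procedure of Houchi et al. (2010), and the profile values and function names are invented.

```python
def block_average(values, n):
    """Average consecutive groups of n samples, dropping any incomplete tail."""
    return [sum(values[i:i + n]) / n for i in range(0, len(values) - n + 1, n)]


def max_abs_shear(z, u):
    """Largest absolute wind speed shear between adjacent levels (1/s)."""
    return max(abs((u[k + 1] - u[k]) / (z[k + 1] - z[k])) for k in range(len(z) - 1))


# Toy profile at 100-m spacing: a 10 m/s jump across one thin layer at 10 km.
z_fine = [9000.0 + 100.0 * k for k in range(21)]
u_fine = [30.0 if zk <= 10000.0 else 40.0 for zk in z_fine]

# Average over 5 levels, i.e. onto layers about 500 m thick.
z_coarse = block_average(z_fine, 5)
u_coarse = block_average(u_fine, 5)

shear_fine = max_abs_shear(z_fine, u_fine)
shear_coarse = max_abs_shear(z_coarse, u_coarse)
print(shear_fine, shear_coarse, shear_fine / shear_coarse)
```

In this toy case the thin shear layer is smeared over the coarser layers, so the peak shear drops by a factor of several even though the total wind change across the layer is preserved.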
6. Implications of the findings
Underestimation of vertical shear by models has implications locally for the nature and intensity of turbulence and the parameterization of subgrid-scale processes (Houchi et al. 2010), for example, by changing the bulk Richardson number used in parameterization. In addition, the underestimation of the change in vertical shear across the tropopause that has been discovered here has a nonlocal, large-scale consequence: the dynamics of Rossby wave propagation depend on the meridional gradient in the PV distribution, which is dominated by the change in vertical shear. Direct calculation of Ertel PV and its gradient across the jet stream from observations requires measurements of horizontal wind and temperature with high resolution in both the vertical and horizontal. This is very difficult to achieve, although Harvey et al. (2020) present an example from a high density dropsonde section crossing the jet stream in NAWDEX IOP4. However, the meridional gradient in quasigeostrophic PV q across a zonal flow u (see Hoskins and James 2014) can be estimated using the DWL wind data (without coincident high-resolution temperature profile data):
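A sketch of this estimate is given here in LaTeX. The first equality is the standard quasigeostrophic PV gradient for a zonal flow; the finite-difference approximation on the right is written in an assumed form chosen to match the symbols defined in the following paragraph, and may differ in notation from the authors' expression:

```latex
\frac{\partial q}{\partial y}
  = \beta - \frac{\partial^{2} u}{\partial y^{2}}
    - \frac{1}{\rho_R}\frac{\partial}{\partial z}
      \left( \rho_R \frac{f^{2}}{N^{2}} \frac{\partial u}{\partial z} \right)
  \approx \beta + \frac{2\,(u_J - u_e)}{L^{2}}
    - \frac{f^{2}}{\Delta z}
      \left( \frac{\Lambda_s}{N_s^{2}} - \frac{\Lambda_t}{N_t^{2}} \right)
```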
where ρR(z) is a reference density profile (assumed to vary less quickly with z than u(z) to derive the right side approximation), f is the Coriolis parameter, β is its meridional gradient, Nt and Ns are the Brunt–Väisälä frequencies for troposphere and stratosphere, and Λt and Λs are the respective vertical wind shears separated by a specified distance Δz across the tropopause zone. The horizontal curvature term is estimated by centered difference over cross-jet scale L, where uJ represents the jet core speed and ue is the environmental wind speed at distance L from the core. At 62°N, f = 1.3 × 10−4 s−1 and β = 1.1 × 10−11 m−1 s−1. Using numbers from the observed cross section in Fig. 5b, it is estimated that the meridional wind curvature term is approximately 8–12β (using L = 600 km, uJ = 50 m s−1 and ue = 30 m s−1) and the vertical wind curvature term is as much as 2000–2500β (using Δz of 100 m, Ns = 2 × 10−2 s−1, Nt = 10−2 s−1, Λs = −3 × 10−2 s−1, Λt = 10−2 s−1), illustrating how dominant the change in vertical wind shear is in the estimate of meridional PV gradient in the regions where errors are observed. If the same change in vertical shear in the model is spread over 1 km (cf. profiles in observations and analyses in Fig. 7), this term would be 10 times smaller in the model (although still dominant).
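The order-of-magnitude estimate above can be checked numerically. The sketch below assumes that the horizontal curvature term is 2(uJ − ue)/L² and that the vertical term is −(f²/Δz)(Λs/Ns² − Λt/Nt²), with the negative shear value taken as the stratospheric one; these forms are assumptions consistent with the symbol definitions, not a confirmed transcription of the equation.

```python
# Parameter values quoted in the text (SI units).
f = 1.3e-4        # Coriolis parameter at 62N (1/s)
beta = 1.1e-11    # meridional gradient of f (1/(m s))
L = 600e3         # cross-jet scale (m)
u_jet = 50.0      # jet core speed (m/s)
u_env = 30.0      # environmental wind speed at distance L (m/s)
dz = 100.0        # depth of the tropopause zone (m)
N_s = 2e-2        # stratospheric Brunt-Vaisala frequency (1/s)
N_t = 1e-2        # tropospheric Brunt-Vaisala frequency (1/s)
shear_s = -3e-2   # assumed stratospheric shear, negative above the jet (1/s)
shear_t = 1e-2    # tropospheric shear (1/s)


def horizontal_term(u_jet, u_env, L):
    """Centered-difference estimate of minus the meridional wind curvature."""
    return 2.0 * (u_jet - u_env) / L**2


def vertical_term(f, dz, shear_s, N_s, shear_t, N_t):
    """Assumed finite-difference form of the vertical curvature (stretching) term."""
    return -(f**2 / dz) * (shear_s / N_s**2 - shear_t / N_t**2)


h_in_beta = horizontal_term(u_jet, u_env, L) / beta
v_in_beta = vertical_term(f, dz, shear_s, N_s, shear_t, N_t) / beta
# Same change in shear spread over 1 km instead of 100 m.
v_1km_in_beta = vertical_term(f, 1000.0, shear_s, N_s, shear_t, N_t) / beta

print(round(h_in_beta, 1), round(v_in_beta), round(v_1km_in_beta))
```

With these inputs the horizontal term comes out at about 10β and the vertical term at roughly 2700β, the same order as the range quoted above; spreading the shear change over 1 km reduces the vertical term tenfold while leaving it dominant.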
Background forecasts (+6 h) for the atmospheric column above the STP profiler at South Uist showed similar wind error structures above the tropopause with higher amplitude than seen in the analyses. This indicates that data assimilation reduces the background forecast model error but cannot eliminate it. Future work is needed to evaluate whether assimilated wind profiles tend to improve near-tropopause wind fields through sharpening the gradients. Pilch Kedzierski et al. (2016) found that static stability increments tend to strengthen the tropopause gradients. Schindler et al. (2020) demonstrate an overall positive impact of additional wind information from NAWDEX radiosonde and dropsonde observations on the midtropospheric flow.
Additional research is needed to quantify errors of other quantities across the tropopause and how these uncertainties relate to our findings. Pilch Kedzierski et al. (2016) indicate an excessively diffuse tropopause in terms of temperature gradients as verified by radio-occultation observations. Another important quantity is water vapor providing a tropopause-based step change in concentration. The resulting sharp peak in longwave radiative cooling at the tropopause is able to strengthen the positive Ertel PV anomaly above, and negative PV anomaly below, the tropopause (Chagnon et al. 2013; Spreitzer et al. 2019), thus increasing tropopause sharpness (Ferreira et al. 2015). Saffin et al. (2017) used the MetUM with PV tracers to show that diabatic processes, including longwave cooling, microphysics, and the turbulent mixing parameterization all act to increase the tropopause PV contrast, while the nonconservative numerical effects associated with the dynamical core of the model compete, acting to reduce the PV contrast. In forecasts, the PV anomalies associated with these tendencies saturate in about 24 h, indicating that the model has found its own climatological balance of processes at the tropopause. However, the true balance affecting tropopause structure in the atmosphere, where numerical effects are absent and the tropopause is typically much sharper, is not known. Furthermore, the NAWDEX observations show that a major increase in model vertical resolution near the tropopause (by at least a factor of 3) would be required to resolve the abrupt change in both vertical wind shear and static stability there, indicating scope to increase forecast skill through better representation of the tropopause and its influence on the propagation of Rossby waves.
In August 2018, the European Space Agency (ESA) Aeolus satellite mission was launched, carrying the first wind lidar in space. It is expected to contribute significantly to improved representation of the winds in global analyses and forecasts (e.g., Stoffelen et al. 2005; ESA 2008; Reitebuch 2012). It will be interesting to evaluate to what extent a large number of observations from Aeolus in oceanic regions with hitherto sparse wind data coverage will impact winds in the midlatitudes and, more specifically, at the tropopause.
The DLR Falcon contribution to NAWDEX received funding from DLR, the Naval Research Laboratory (NRL), the European Space Agency (ESA) within the WindVal-II project (Contract 4000114053/15/NL/FF/gp), and the European Facility for Airborne Research (EUFAR; project NAWDEX-Influence). The authors thank the German Science Foundation (DFG) for supporting the HALO contribution to the NAWDEX campaign within the priority program SPP1294 HALO. The authors are grateful to the HALO and Falcon pilots, who did a fantastic job coordinating both aircraft for several joint flight legs, which allowed us to compare the different datasets. Additionally, we thank the Met Office and the University of Manchester, especially Professor Geraint Vaughan, for operating the ST Wind Profilers during NAWDEX, including the one on South Uist. We thank Dr. Florian Ewald for providing the SEVIRI satellite image for this publication. In addition, we thank ECMWF for providing data access in the framework of the Support Tool for HALO Missions (SPDEHALO) project. B. Harvey is funded through the National Centre for Atmospheric Science National Capability Programme. J. Doyle acknowledges support from the Chief of Naval Research through the NRL Base Program, PE 61153N. The authors thank Dr. Sonja Gisinger for her valuable comments on the manuscript. We thank two anonymous reviewers for their helpful suggestions.
In 2015, the WindVAL-I campaign was conducted from Iceland using the same set of instruments on board the Falcon. Unlike NAWDEX, this campaign focused on preparing the Aeolus calibration and validation in various wind and cloud scenes rather than on specifically observing the jet stream (Reitebuch et al. 2017). Figure A1 shows all 141 906 DWL wind observations in tropopause-relative coordinates that were measured during 14 research flights in the vicinity of Iceland.
This article is included in the Waves to Weather (W2W) Special Collection.