A global observing system simulation experiment (OSSE) was used to assess the potential impact of a proposed Global Navigation Satellite System (GNSS) radio occultation (RO) constellation on tropical cyclone (TC) track, maximum 10-m wind speed (Vmax), and integrated kinetic energy (IKE) forecasts. The OSSE system was based on the 7-km NASA nature run and simulated RO refractivity determined by the spatial distribution of observations from the original planned (i.e., including both equatorial and polar orbits) Constellation Observing System for Meteorology, Ionosphere, and Climate-2 (COSMIC-2). Data were assimilated using the NOAA operational weather analysis and forecasting system. Three experiments generated global TC track, Vmax, and IKE forecasts over 6 weeks of the North Atlantic hurricane season in the North Atlantic, east Pacific, and west Pacific basins. Confidence in our results was bolstered because track forecast errors were similar to those of official National Hurricane Center forecasts, and Vmax errors and IKE errors showed similar results. GNSS-RO assimilation did not significantly impact global track forecasts, but did slightly degrade Vmax and IKE forecasts in the first 30–60 h of lead time. Global forecast error statistics show adding or excluding explicit random errors to RO profiles made little difference to forecasts. There was large forecast-to-forecast variability in RO impact. For two cases studied in depth, track and Vmax improvements and degradations were traced backward through the previous 24 h of assimilation cycles. The largest Vmax degradation was traced to particularly good control analyses rather than poor analyses caused by GNSS-RO.
As new observing systems have become available, their data have been incorporated into global weather models. In many cases, new global observing systems have improved model forecasts over the past several decades (Kalnay 2002). Despite these advances, further forecast improvements using new observing systems are not guaranteed. The decision to invest the resources required to build new observing systems should be well informed. For example, stakeholders should weigh the system’s expected forecast impact against its expected economic cost, and compare the resulting cost-benefit ratio with those of other proposed systems.
A good method to determine the forecast impact of proposed observations is through observing system simulation experiments (OSSEs). An OSSE is designed to mimic real-data denial studies [i.e., observing system experiments (OSEs)] with data from observing systems that do not yet exist (e.g., Atlas 1997; Hoffman and Atlas 2016). Data denial studies compare two experiments: a control generating forecasts from analyses made by assimilating all observations and a replica experiment excluding one observation type from assimilation. Because OSSEs apply this same comparison to simulated observations, they are particularly suited to help stakeholders and forecasters anticipate forecast impacts of proposed observing systems.
In 2017, the U.S. Congress mandated NOAA to use the OSSE framework to assess the impact of Global Navigation Satellite System radio occultation (GNSS-RO; e.g., Businger et al. 1996; Kursinski et al. 1997; Hajj et al. 2002) data assimilation on global weather forecasts (Weather Research and Forecasting Innovation Act 2017, H.R. 353). Improvements in global forecasting were found during the experiments (Cucurull et al. 2018), supporting previous evidence of the positive impact of GNSS-RO observations on operational weather forecast skill (e.g., Cucurull et al. 2007; Anthes et al. 2008; Buontempo et al. 2008; Healy 2008; Cucurull 2010; Anlauf et al. 2011; Cucurull and Anthes 2014; Cucurull et al. 2017).
Since tropical cyclones (TCs) are a leading cause of death and property damage, improving TC track and intensity forecasts is an area of active research. Previous studies exploring the impact of GNSS-RO data on TC analyses and forecasts have used the OSE framework with regional models (e.g., Huang et al. 2005; Kuo et al. 2009; Huang et al. 2010; Liu et al. 2012; Anisetty et al. 2014; Chen et al. 2015; Phunthirawurthi et al. 2016). All but two of these papers looked at case studies of 1–2 TCs and their conclusions are therefore limited. The exceptions, Huang et al. (2010) and Chen et al. (2015), tested the impact of GNSS-RO data on track forecasts of several TCs.
Huang et al. (2010) used the Weather Research and Forecasting (WRF) limited-area model with nested 45–15–5–km grids to examine the impact of GNSS-RO observations from the Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) satellites on TC track forecasts. They used different model domains for different TCs in the western Pacific and Indian Oceans, so the analysis and forecasting system was not standardized across cases. The study yielded mixed results with some cases showing large impacts with only a few GNSS-RO profiles and some cases with statistically insignificant impacts assimilating more profiles. The number of GNSS-RO profiles assimilated during the study was 2 to 56 per cycle over cases spanning 3 years and 13 forecasts. Their study of six typhoons in 2008 showed neutral impacts on track forecasts on the 45 km grid through 72-h lead time. Only 12 to 43 GNSS-RO profiles per cycle were assimilated during these six storms. Chen et al. (2015) used a limited area Typhoon WRF (TWRF) system at 45-km grid spacing to study GNSS-RO impact for 11 typhoons over the western Pacific Ocean from 2008 to 2010. Each assimilation cycle ingested about 100 GNSS-RO profiles. Results indicated that the GNSS-RO sounding assimilation reduced 72-h track errors by around 12 km, a modest but statistically significant improvement. This improvement was credited to an improvement in the environmental steering current around the western Pacific subtropical high. Neither study evaluated any TC intensity metric (e.g., maximum wind speed; Vmax). A comparison of these findings to the findings of our study can be found in section 5.
There are three key reasons why GNSS-RO observations may be particularly beneficial for TC forecasts (e.g., Bauer et al. 2014; Ho et al. 2014; Vergados et al. 2014; Biondi et al. 2015; Li et al. 2015; Murphy et al. 2015). First, since GNSS-RO observations are satellite based, profiles exist over open ocean where other vertical profiles are rare. Second, the GNSS-RO radio signals are minimally attenuated by clouds and precipitation; thus, GNSS-RO observations can potentially be acquired very near to the surface in and below dense cloud cover. TCs often develop over oceanic data voids and are characterized by dense cloud cover, so GNSS-RO observations may provide unparalleled sampling of the TC environment. Even the TC itself, including its dynamically significant inner core, could be observed with a sufficiently large constellation of RO receivers (e.g., Leidner et al. 2017). Third, since GNSS-RO observations are dependent on atmospheric density, they provide important information for understanding atmospheric thermodynamics. In practice, GNSS-RO retrievals or data assimilation infer high vertical (~100 m) resolution profiles of temperature, water vapor, and pressure. Nevertheless, the impact of GNSS-RO observations on TC forecasts may be limited by the coarse horizontal (~300 km) resolution of the observations relative to the finescale processes important during the life cycle of a TC. In addition, relatively coarse model resolution may limit the impact of GNSS-RO observations on analyses and forecasts of important TC structures if the model is incapable of resolving those structures.
To the authors’ knowledge, this is the only study to evaluate the impact of GNSS-RO data on TC forecasts within a global OSSE. Therefore, the purpose of this study is to 1) evaluate the ability of such a global OSSE to reliably assess the impact of GNSS-RO observations on TC forecasts, 2) assess the impact of GNSS-RO assimilation on global model TC forecasts to the extent that the OSSE allows, and 3) provide suggestions for potential improvements in future similar studies.
This study is an extension of Cucurull et al. (2018), which is an examination of the impact of the Constellation Observing System for Meteorology, Ionosphere, and Climate-2 (COSMIC-2; Cucurull et al. 2017) observations on global and hemispheric weather pattern forecasts. The present study is unique in that it evaluates RO impacts on TC forecasts from a global forecast model system cycling over a 6-week period.
In section 2, the OSSE configuration and experimental design are described in detail. Section 3 presents global track and Vmax error statistics along with an assessment of the reliability of the results in this OSSE. The Vmax statistics are compared to integrated kinetic energy statistics to provide an assessment of intensity and physical size and to evaluate the reliability of using Vmax as a key metric in this OSSE with relatively coarse resolution. Section 4 provides an investigation of two analyses, representing an improvement and degradation due to RO assimilation. Finally, section 5 provides a conclusion that evaluates our results in the context of previous studies and offers recommendations for future studies regarding implementation and interpretation of TC impacts in global model OSSEs.
a. OSSE background
There are three key components of OSSEs: 1) a nature run (NR), a free-running dynamical simulation of the atmosphere that stands in for the real atmosphere; 2) simulated observations obtained from the NR using observation operators; and 3) an experimental analysis and forecasting system.
A reliable OSSE system should be validated in multiple ways. First, the NR climatology should be similar to that of the real atmosphere. Second, the differences between the modeling system that produces the NR and the analysis and forecasting system should approximate the differences between the real atmosphere and operational modeling systems. This requires that the NR be from a state-of-the-art numerical model and be of higher resolution than the numerical model used in the data denial experiments. Thus, while the experimental modeling system should be similar to an operational model, it should not be identical (e.g., the forecast model typically has a lower resolution than a true state-of-the-art operational model since the NR is “lower resolution” than the real atmosphere). Third, the system must be calibrated properly for the simulated observations. This means that observation errors, counts, and geographic and vertical distribution should be similar to operations and that their impact on analyses and forecasts be similar to their real-world counterparts. Further information on OSSEs can be found in Hoffman and Atlas (2016) and its supplemental material.
This study uses the Community Global OSSE Package (CGOP; Boukabara et al. 2016) in concert with NCEP’s Global Data Assimilation System/Global Forecast System (GDAS/GFS) analysis and forecasting system. Precise details of our OSSE system follow.
b. Nature run
The global OSSE experiments in this study use the National Aeronautics and Space Administration (NASA) Global Modeling and Assimilation Office (GMAO) Goddard Earth Observing System Model, version 5 (GEOS-5) global mesoscale nature run (G5NR; Putman et al. 2015). The G5NR is a ~2-yr dataset on a 1/16° (~7-km) horizontal grid and a 72-layer hybrid sigma-pressure vertical grid. The free-running nonhydrostatic GEOS-5 atmospheric general circulation model starts from real-world initial conditions in May 2005. Unconstrained to reality except through sea surface temperatures and sea ice from the daily 0.05° Operational Sea Surface Temperature and Sea Ice Analysis product (Donlon et al. 2012), the G5NR quickly develops a highly realistic state that is distinct from the real atmosphere. This study uses data 15 months into the free run, so there is no one-to-one relationship between individual real-world storms and those simulated in the G5NR.
The G5NR has been extensively documented and verified against the known climatology of the real atmosphere (Gelaro et al. 2014). While G5NR has been noted for realistic TC frequencies and tracks (e.g., Gelaro et al. 2014; Reale et al. 2017), the algorithm used by Reale et al. (2017) to generate the G5NR TC climatology used strict warm-core and vertical-structure thresholds for determining the existence of a TC. Therefore, some weaker storms (e.g., tropical depressions) were missed, which skews the distribution of TC strength toward stronger storms. To somewhat alleviate this issue, we applied a TC genesis and lysis extension to each track from the original climatology; however, this did not apply to storms that were missed entirely by the Reale et al. (2017) climatology.
c. Simulated observations
Simulated control observations for this study (surface pressure, temperature, zonal and meridional wind, specific humidity, and satellite radiances) were generated using appropriate forward operators following the procedures of Boukabara et al. (2016). Spatial and temporal coverage for conventional and radiance observations was based on real-world coverage from August to September 2014 (Table 1). Simulated conventional observations were generated by interpolating the nature run to the geographical locations of the corresponding real-world observations. Simulated clear sky satellite radiances were generated using the Community Radiative Transfer Model (CRTM), version 2.1.3. Geographic and temporal distribution of clear sky radiances were derived from real-world observations and those locations that coincided with the G5NR clouds were rejected by the assimilation system’s quality control checks. This impacted infrared radiances more than microwave radiances.
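The interpolation step for conventional observations can be illustrated with a minimal sketch. The text states only that the nature run was interpolated to the real-world observation locations; the bilinear scheme, the regular latitude-longitude grid layout, and the function name `bilinear` below are assumptions made for illustration, and the vertical and temporal interpolation that a full observation simulator would need is omitted.

```python
def bilinear(grid, lat0, lon0, dlat, dlon, lat, lon):
    """Bilinearly interpolate a regular lat-lon field to one location.

    grid[i][j] holds the field value at (lat0 + i*dlat, lon0 + j*dlon).
    The target point is assumed to lie inside the grid (hypothetical helper).
    """
    # Fractional grid indices of the observation location.
    fi = (lat - lat0) / dlat
    fj = (lon - lon0) / dlon
    # Lower-left corner of the enclosing cell, clamped so i+1 and j+1 exist.
    i = min(int(fi), len(grid) - 2)
    j = min(int(fj), len(grid[0]) - 2)
    wi, wj = fi - i, fj - j
    # Interpolate along longitude on both bounding rows, then along latitude.
    low = grid[i][j] * (1 - wj) + grid[i][j + 1] * wj
    high = grid[i + 1][j] * (1 - wj) + grid[i + 1][j + 1] * wj
    return low * (1 - wi) + high * wi


# Toy 2 x 2 "nature run" field sampled at a station in the cell centre.
field = [[0.0, 1.0],
         [2.0, 3.0]]
simulated_ob = bilinear(field, 0.0, 0.0, 1.0, 1.0, 0.5, 0.5)
```

Under these assumptions a simulated observation is simply the nature run value at the station location, which is why it carries no instrument error until errors are added explicitly.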
Simulated COSMIC-2 GNSS-RO refractivity profiles were produced using the forward operator described by Cucurull (2010). GNSS-RO observation coverage was calculated based on realistic orbits of the 12 originally planned COSMIC-2 satellites, with six receiving satellites (COSMIC-2B) in polar orbits and six (COSMIC-2A) in equatorial orbits. COSMIC-2A and COSMIC-2B were each expected to provide ~6000 refractivity profiles per day. Figure 1 shows the assimilated RO coverage from COSMIC-2 over two 6-h time windows, with higher data density between 45°N and 45°S due to COSMIC-2A. This represents a large increase in profiles to ~3000 profiles per 6-h assimilation cycle compared to the most comparable studies discussed in section 1 (Huang et al. 2010: 2 to 56 profiles; Chen et al. 2015: ~100 profiles).
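The per-cycle profile count quoted above follows from simple arithmetic, sketched here. The daily counts and the four 6-h cycles per day come from the text; the function names are illustrative, and the calculation assumes profiles are spread evenly over the day, which the actual orbits only approximate.

```python
# Expected daily profile counts per sub-constellation, as quoted in the text.
PROFILES_PER_DAY = {"COSMIC-2A": 6000, "COSMIC-2B": 6000}
CYCLES_PER_DAY = 4  # 6-h assimilation windows at 0000, 0600, 1200, 1800 UTC


def profiles_per_cycle(per_day=PROFILES_PER_DAY, cycles=CYCLES_PER_DAY):
    """Average number of RO profiles per 6-h cycle, assuming even spacing."""
    return sum(per_day.values()) / cycles


def ratio_to_previous(previous_per_cycle):
    """Factor by which the simulated constellation exceeds an earlier study."""
    return profiles_per_cycle() / previous_per_cycle


print(profiles_per_cycle())    # 3000.0
print(ratio_to_previous(100))  # 30.0, relative to ~100 profiles per cycle
```

The count actually assimilated in a given cycle may be lower once quality control is applied.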
Since real-world observations have errors (e.g., instrumental and representativeness), a credible OSSE must ensure the error statistics associated with the simulated observations closely match those of a corresponding real-world OSE. When simulating observations from a nature run, there naturally will not be any instrument error since there are no real instruments. Thus, these are known as “perfect observations.”
To make observations more realistic, random Gaussian errors were added to all observations based on an iterative process described by Errico et al. (2013). Satellite radiances also have appropriate biases added as described by Boukabara et al. (2016). The process of adding random errors involves performing a “perfect observation” OSSE (i.e., one that assimilates observations with no errors added) and comparing observation increment (observation minus background and observation minus analysis; O − B and O − A) statistics for each observation type to those of a real-world OSE. Variances from those increments are used to update an estimate of added observation errors. Note that by comparing the perfect-observation OSSE to the OSE, we account for all error sources (e.g., instrument and representativeness) as one all-inclusive error. All observation errors in this study were simulated using this method. Note that the simulated observation error statistics produced by this process are unique to the OSSE system being used. More detailed information on this method can be found in Errico et al. (2013, their appendix B).
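The idea behind the error calibration can be sketched as follows. This is a deliberately simplified illustration, not the Errico et al. (2013) algorithm as published: it assumes the added error variance is nudged by the gap between the OSE and OSSE innovation variances at each iteration, and the function names `update_added_variance` and `add_gaussian_error` are hypothetical.

```python
import math
import random


def update_added_variance(added_var, ose_innov_var, osse_innov_var):
    """One tuning step (assumed form): raise or lower the added error
    variance by the gap between OSE and OSSE innovation (O - B) variances.
    The result is clamped at zero because a variance cannot be negative."""
    return max(added_var + (ose_innov_var - osse_innov_var), 0.0)


def add_gaussian_error(values, added_var, seed=0):
    """Perturb perfect observations with unbiased Gaussian noise."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sigma = math.sqrt(added_var)
    return [v + rng.gauss(0.0, sigma) for v in values]


# First iteration starts from the perfect-observation OSSE (no added error).
# The variances below are made-up numbers, not values from the study.
added_var = update_added_variance(0.0, ose_innov_var=4.0, osse_innov_var=1.0)
noisy_obs = add_gaussian_error([300.0, 295.0, 290.0], added_var)
```

Because the target is the total innovation variance of the OSE, the added error lumps instrument and representativeness errors together, consistent with the all-inclusive error described above.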
This process was conducted for the 2014 operational configuration of the first COSMIC constellation. Increment variance statistics for COSMIC refractivity observations in the preliminary initial perfect observation OSSE and corresponding OSE are presented in Cucurull et al. (2018, Fig. 2a). The Gaussian random errors calculated for COSMIC are here added to COSMIC-2. This is justified because the instrument observation error differences between COSMIC and COSMIC-2 are expected to have very little if any impact. Also, measurement error is dwarfed by representativeness error in an OSSE configuration. The added errors are shown to be random and unbiased at all levels through an elevation of 30 km at four different latitudes in Cucurull et al. (2018, Fig. 3).
d. Analysis and forecast system
The global experiments used the Q1FY15 version of the National Centers for Environmental Prediction’s Global Data Assimilation System (GDAS)/Global Forecast System (GFS) (NWS 2014). The analysis was created by a hybrid Gridpoint Statistical Interpolation (GSI)/ensemble Kalman filter (EnKF) configuration described by Wang et al. (2013) and Kleist and Ide (2015) (hybrid 3DEnVar). The GFS forecasts were run at reduced T670L64 resolution (~27 km at the equator). Eighty reduced-resolution T254L64 (~77 km at the equator) ensemble members were run to help estimate background error covariance via the EnKF (Hamill et al. 2011; Wang 2010; Wang et al. 2013). Final background error covariances used by GSI were generated by linearly combining the EnKF estimate and the static GSI estimate (Boukabara et al. 2016). Horizontal localization scales vary from 700 km in the troposphere to 1300 km toward the model top.
We conducted an 8-week (1 August–30 September of year 2 of the nature run simulation) cycling period with GDAS analyses produced four times per day at 0000, 0600, 1200, and 1800 UTC. The first two weeks of cycling (1–14 August) were used for model spinup and the remaining 6 weeks for verification purposes. GFS produced once-daily 168-h global forecasts initialized at 0000 UTC, yielding a total of 47 global forecasts. This once-daily (every 4 GDAS cycles) spacing helps mitigate forecast autocorrelation that arises due to the large influence of the previous analysis on the current one (Aberson and DeMaria 1994).
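The forecast and cycle counts can be checked with a short calendar calculation. The calendar year used below is a placeholder assumption (the counts do not depend on it for August and September), and the helper name `daily_dates` is illustrative.

```python
from datetime import date, timedelta

YEAR = 2006  # placeholder for "year 2 of the nature run"; an assumption


def daily_dates(first, last):
    """All calendar days from first to last, inclusive."""
    n_days = (last - first).days + 1
    return [first + timedelta(days=k) for k in range(n_days)]


cycling_days = daily_dates(date(YEAR, 8, 1), date(YEAR, 9, 30))
spinup_days = daily_dates(date(YEAR, 8, 1), date(YEAR, 8, 14))
forecast_dates = daily_dates(date(YEAR, 8, 15), date(YEAR, 9, 30))

n_gdas_cycles = 4 * len(cycling_days)  # four analyses per day
n_forecasts = len(forecast_dates)      # one 0000 UTC forecast per day
```

With one 0000 UTC initialization per verification day, the 15 August to 30 September window gives the 47 forecasts stated above.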
e. Experimental design
To examine the impact of GNSS-RO observations on TCs, we use the three global experiments conducted by Cucurull et al. (2018): 1) A control that assimilated all error-added conventional and satellite radiance observations (i.e., surface pressure, temperature, specific humidity, u- and υ-wind, and satellite radiance) (CTL), 2) An experiment that assimilated all the observations used in CTL and perfect (i.e., error-free) GNSS-RO refractivity profiles (RO_perf), and 3) An experiment that assimilated all the observations used in CTL and error-added GNSS-RO refractivity profiles (RO_err). These global experiments tested the accumulated impact of 6-hourly GNSS-RO assimilation on TC track and Vmax forecasts. Supplemental “single-analysis” forecasts were run for select forecasts chosen for in-depth study (see Table 2 for details on all tests).
f. Analysis and forecast verification
We assessed the average TC track and Vmax errors of all forecasts globally. Storm track and Vmax errors were computed using “best tracks” derived from 7-km G5NR data (Figs. 2a–c). (Errors are defined as the difference between the model state and the G5NR state.) The best tracks were based on those published by Reale et al. (2017), extended to better capture genesis and lysis. Any G5NR TCs that began prior to or ended after the experimental period (15 August–30 September) were discarded as truncated events, yielding a total of 17 TCs for investigation. We used the Geophysical Fluid Dynamics Laboratory (GFDL) vortex tracker on 0.25° × 0.25° GFS forecast data to provide track and Vmax forecasts for each forecast TC. Both G5NR and GFS tracks and minimum sea level pressure (MSLP) were cross checked against MSLP plots and both proved to be accurate. Both trackers have increments of 0.1° latitude/longitude for position and 1 kt (1 kt ≈ 0.51 m s−1) for Vmax.
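A track error of this kind is commonly computed as the great-circle distance between the forecast and best-track centers. The sketch below uses the haversine formula on a spherical Earth; the formula choice, the Earth radius, and the function names are assumptions for illustration, since the text does not state how distances were computed.

```python
import math

EARTH_RADIUS_KM = 6371.0    # mean radius; spherical Earth is an assumption
KT_TO_MS = 1852.0 / 3600.0  # 1 kt = 1852 m per hour, about 0.51 m/s


def track_error_km(lat_fcst, lon_fcst, lat_truth, lon_truth):
    """Great-circle (haversine) distance between two TC centers in km."""
    phi1, phi2 = math.radians(lat_fcst), math.radians(lat_truth)
    dphi = phi2 - phi1
    dlam = math.radians(lon_truth - lon_fcst)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def vmax_error_kt(vmax_fcst_kt, vmax_truth_kt):
    """Signed Vmax error (model minus nature run), in knots."""
    return vmax_fcst_kt - vmax_truth_kt


# Distance spanned by one 0.1-degree tracker increment in latitude.
one_increment_km = track_error_km(0.0, 0.0, 0.1, 0.0)
```

One tracker increment in latitude is roughly 11 km, and an increment in longitude shrinks with the cosine of latitude, which is useful context when judging mean track differences of only a few kilometers.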
Since Vmax occurs in a small part of a TC and since the experimental analysis and forecasting system is coarse compared to the nature run, we also used a metric that assesses both maximum intensity and physical size. Integrated kinetic energy (IKE; Powell and Reinhold 2007) is defined as

$$\mathrm{IKE} = \int_V \frac{1}{2}\,\rho\,U^{2}\,dV,$$
where ρ is air density of 1.15 kg m−3, U is wind magnitude in excess of 34 kt, dV is the volume of the air under consideration (1 m3), and IKE is expressed in terms of terajoules. IKE is the sum of kinetic energy of all unit volumes associated with the TC in which wind exceeds 34 kt. IKE is used to evaluate the impact of COSMIC-2 radio occultation and to provide a baseline of comparison for the Vmax metric. To provide a direct comparison both G5NR and experiment IKE values were computed on the GFS native 0.25° grid.
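On a discrete grid the IKE integral reduces to a sum over cells. The sketch below is one reading of the definition: it uses the full wind speed in every cell where the wind exceeds 34 kt and skips all other cells. The function name `ike_terajoules`, the cell-volume inputs, and the sample values are assumptions for illustration only.

```python
KT_TO_MS = 1852.0 / 3600.0  # knots to metres per second
RHO = 1.15                  # air density in kg m-3, as given in the text
THRESHOLD_KT = 34.0         # tropical-storm-force threshold


def ike_terajoules(wind_ms, cell_volume_m3, rho=RHO, threshold_kt=THRESHOLD_KT):
    """Sum 0.5 * rho * U^2 * dV over cells whose wind exceeds the threshold.

    wind_ms        : wind speed of each grid cell (m/s)
    cell_volume_m3 : volume represented by each grid cell (m^3)
    Returns IKE in terajoules (1 TJ = 1e12 J).
    """
    threshold_ms = threshold_kt * KT_TO_MS
    joules = 0.0
    for u, dv in zip(wind_ms, cell_volume_m3):
        if u > threshold_ms:  # cells at or below the threshold contribute nothing
            joules += 0.5 * rho * u * u * dv
    return joules / 1.0e12


# Two hypothetical cells: one above and one below the 34-kt threshold.
sample_ike = ike_terajoules([20.0, 10.0], [1.0e12, 1.0e12])
```

Because every cell above the threshold contributes, IKE responds to the size of the wind field as well as its peak, which is why it complements Vmax on a coarse grid.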
To quantify the error differences of storm track and intensity, a paired Student’s t test is performed, which seeks to determine whether the observed differences in the experiments, on average, are significantly different (Hamill 1999; Gilleland et al. 2018). (Sample sizes for these t tests are given in the middle of Figs. 4, 5, and 6.) From this, the statistical significance of the mean differences between the experiments is determined using a 95% confidence interval, as in Kren et al. (2018).
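A paired t test on forecast errors can be sketched in a few lines. The version below is a plain paired Student's t test on homogeneous samples, with the critical value supplied by the caller; it omits any adjustment for serial correlation, and the function names and sample errors are made up for illustration.

```python
import math
import statistics


def paired_t(errors_exp, errors_ctl):
    """Mean paired difference, its standard error, and the t statistic.

    Differences are experiment minus control, so a positive mean is a
    degradation relative to the control."""
    diffs = [e - c for e, c in zip(errors_exp, errors_ctl)]
    mean = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, std_err, mean / std_err


def confidence_interval(mean, std_err, t_crit):
    """Two-sided interval; t_crit must match n - 1 degrees of freedom."""
    return mean - t_crit * std_err, mean + t_crit * std_err


# Hypothetical Vmax errors (kt) for five paired forecasts.
ro_err = [12.0, 15.0, 11.0, 14.0, 13.0]
ctl = [10.0, 12.0, 10.0, 11.0, 12.0]
mean, std_err, t_stat = paired_t(ro_err, ctl)
low, high = confidence_interval(mean, std_err, t_crit=2.776)  # 95%, 4 dof
significant = low > 0.0 or high < 0.0  # interval excludes zero
```

Pairing matters here because both experiments forecast the same storms from the same initial times, so the storm-to-storm variability cancels in the differences.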
3. Impact of COSMIC-2 on global track and intensity forecasts
a. Global impact on TC analyses
Globally, there were 17 TCs identified in the Reale et al. (2017) climatology dataset during the experimental period investigated in this study. There were 132 individual storm initializations within the 47 168-h GFS forecasts made during the study period. (This is possible because many of the global forecasts contained multiple TCs.) Figure 3 shows histograms of the difference between CTL and RO_err track and Vmax error magnitudes at the initial time of these storm forecasts. The histograms show that RO_err track analyses skew slightly toward degradations, with a mean 4-km degradation (Fig. 3a). This falls well within the increment of the TC tracker (0.1° or 9–12 km).
RO_err Vmax analyses skew more convincingly toward degradations, with a mean degradation of 2 kt (Fig. 3b). The TC tracker has Vmax resolution of 1 kt. For context, official TC Vitals information such as that from the HURDAT2 Atlantic hurricane database (Landsea and Franklin 2013) is published with an increment of 5 kt. Out of 132 analyses, only 25 (~19%) had impacts of 5 kt or larger, but those that did were overwhelmingly degraded (24 analyses) rather than improved (1 analysis).
b. Global impact on TC forecasts
Global analyses yield CTL track forecast errors that are very similar to those of the National Hurricane Center (NHC) for storms in the Atlantic basin during the 2014 hurricane season (Figs. 4a,c; NHC statistics can be found at NOAA 2019a,b). This enhances our confidence that the OSSE system used here adequately represents the impact of the operationally assimilated observations on TC track forecasts. Both CTL and RO_err track errors grow over time at a realistic rate through 120 h (the end of the NHC statistics).
The impact of COSMIC-2 RO refractivity profiles on track forecasts is less than 15 km at most lead times. Although the impacts are degradations at most lead times, only lead times 0, 6, and 24 h are statistically significant (4–10 km degradations). The magnitude of these degradations is smaller than the smallest possible difference between two individual TC positions due to tracker position increments (0.1° or 9–12 km). Furthermore, the tracker is applied on a GFS forecast grid of 0.25° (~27 km). Thus, RO assimilation yields neutral track forecast impacts in this global model OSSE.
Global analyses yield CTL Vmax forecast errors that are 3–5× larger than those of the NHC in the Atlantic basin during the 2014 hurricane season (Figs. 4b,d). There are a few likely reasons for this: 1) the G5NR TC tracker was strict and therefore stronger TCs are overrepresented in the sample; 2) the Vmax in our experiments is precisely known from the nature run, while it can only be approximated in real storms; 3) the experimental analysis and forecast modeling system has a lower-than-operational resolution (~77 and ~27 km at the equator, respectively); and 4) NHC forecasts are not model products and are based in part on observations that sample scales of motion not represented in the GDAS, GFS, or G5NR. Ultimately, this reduces confidence that the Vmax metric presented here can be used as a direct comparison to future real-world data impact experiments; however, there is value in documenting these impacts in the global OSSE system. Specifically, it is valuable to provide future studies with a point of comparison for their own results, and to act as an indicator of structural changes in the storm. To further enhance our analysis, we also calculated integrated kinetic energy (IKE) as an intensity and physical size metric and compared IKE and Vmax results.
Average Vmax errors of 35–45 kt characterized both CTL and RO_err. The degradations seen in RO_err analyses extended through 60 h forecast lead time, characterized by 1–2 kt statistically significant degradations. Note in Fig. 4d that the degradation at 0 h does not grow; instead it remains ~2 kt until 48 h, at which point it begins to decrease. This strongly suggests the source of the errors in the first 60 h is not in the broader environment. Rather, the source of the Vmax degradation seems to be in the analysis of the storm itself. Section 4 explores this in more detail.
Global IKE errors are similar to those of Vmax (Figs. 5a,b). Statistically significant degradations of around 1 TJ occur during the first 30 h of lead time. As with Vmax, the degradation does not grow, and the impact becomes neutral after 30 h. Compared to Vmax, significant degradations last only half the time (30 versus 60 h). Unlike Vmax, the IKE errors for both tests grow throughout the 168-h forecasts. Since this metric, which accounts for the physical size of the TC, supports the results found with Vmax, RO assimilation yields small degradations in the first 30–60 h of forecasts in this global model OSSE. Furthermore, this agreement increases confidence that the conclusions drawn from Vmax results are appropriate for assessing impact in this global model OSSE.
Adding simulated observation errors does not alter RO impacts in a statistically significant way, though RO_perf tends to produce smaller errors than RO_err (Fig. 6). It was expected and encouraging that RO_perf produces smaller forecast errors than RO_err. The lack of statistically significant differences suggests adding random errors is a second-order impact and is not vital to producing realistic TC forecast errors in a global model OSSE. The obvious exception to this is significant degradation due to adding observational errors in Vmax forecasts at 78 and 114 h, which slightly exceed the 95% confidence interval.
As in the Huang et al. (2010) experiments discussed in the Introduction, there was strong forecast dependency (i.e., some of the 132 forecasts were improved and others degraded). An attempt was made to identify storm attributes that distinguish forecasts that benefitted from RO assimilation from those that did not. We binned TCs at initialization by basin, latitude band, Vmax, minimum sea level pressure, and RO observation coverage in the vicinity of the TC core (not shown). This binned approach necessarily yielded smaller samples and therefore few statistically significant differences between CTL and RO_err. Where significant differences were found, small forecast sample size hindered the ability to determine conclusively whether those differences were caused by the attribute under study or whether they were the result of a deficiency of the sample. No overall distinguishing attribute was found through binning the results of this study.
4. GNSS-RO impact on two forecasts
In light of these global results and the degradation of the Vmax and IKE forecasts, in-depth study of two individual forecasts was conducted. The forecasts were chosen based on the accumulated impact of RO assimilation on analyses. Accumulated impact is due to the combined effect of 1) the current-cycle impacts and 2) the impact of RO on the background field acquired over all previous cycles. This is a confounding factor for any realistic cycling data impact study, but especially one that investigates impact on individual weather systems (such as TCs) which can be more sensitive to initial conditions than time-averaged global metrics.
To partially address this complication and determine which impacts are due to current-cycle observations and which are due to the accumulation of impacts in the background field, we ran single-cycle tests assimilating control observations and an RO_err background (ROerr_1cyc_ctl). This forecast is compared to CTL and RO_err to determine the contribution of 1) accumulated impact in the background field and 2) single cycle impacts due to RO profiles withheld in the current 6-h window. Note that ROerr_1cyc_ctl provides an imperfect picture of RO impact in a given assimilation window because not assimilating RO changes how the other observations are assimilated. Therefore, the impact seen in these comparisons may not be the same as the impact seen through a more direct comparison of RO_err background and analyses. This second approach was also employed in this study and is presented later.
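The comparison among CTL, RO_err, and ROerr_1cyc_ctl amounts to splitting the total RO impact into two differences, as sketched below. The function name `decompose_impact` and the sample error values are hypothetical, and the split inherits the caveat stated above: withholding RO changes how the other observations are assimilated, so the two parts are only approximate attributions.

```python
def decompose_impact(ctl_error, ro_err_error, ro_1cyc_ctl_error):
    """Split the RO impact on an error metric into two parts.

    accumulated   : effect of RO carried in the background from earlier
                    cycles (ROerr_1cyc_ctl minus CTL)
    current_cycle : effect of RO profiles in the current 6-h window
                    (RO_err minus ROerr_1cyc_ctl)
    Positive values are degradations (larger error than the reference).
    """
    accumulated = ro_1cyc_ctl_error - ctl_error
    current_cycle = ro_err_error - ro_1cyc_ctl_error
    return {
        "total": ro_err_error - ctl_error,
        "accumulated": accumulated,
        "current_cycle": current_cycle,
    }


# Hypothetical Vmax analysis errors in kt (not values from the study).
parts = decompose_impact(ctl_error=52.0, ro_err_error=70.0,
                         ro_1cyc_ctl_error=70.0)
```

By construction the two parts sum to the total, so a case in which RO_err and ROerr_1cyc_ctl differ little from each other is one where the accumulated background impact dominates.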
The assimilation itself is evaluated using statistics generated by single-cycle tests assimilating control and COSMIC-2 observations and a CTL background (CTL_1cyc_ro). Since cycling RO_err is more representative of how RO would be used in operations, CTL_1cyc_ro is not meant to show realistic RO impact.
a. Forecast and assimilation overview
To produce a clear contrast between forecasts, we chose the most extreme Vmax analysis impacts (see Fig. 3): 6 September forecast of EP11 (improvement) and 22 September forecast of AL04 (degradation). These cases also had track analysis impacts of the same sign. IKE calculations for these two analyses and forecasts were similar to Vmax results described in this section (not shown).
The track forecasts for EP11 show CTL, RO_err, and ROerr_1cyc_ctl move to the right of the G5NR track through 96 h, resulting in landfall along the southwestern coast of Mexico (Fig. 7a). In RO_err, there was modest improvement in TC center initialization, followed by steady improvement of 30–50 km after 12 h (Fig. 7c). ROerr_1cyc_ctl was similar to RO_err as expected, showing that accumulation of impacts in the background state was considerably larger than the current cycle. Due to model resolution limitations, Vmax forecasts for each test fell 20–25 kt short of G5NR (Fig. 7b). RO_err and ROerr_1cyc_ctl performed best at Vmax initialization and for much of the first 72 h. Analysis improvements did not increase during the forecast.
The CTL_1cyc_ro analysis was very similar to CTL (expected because CTL_1cyc_ro used CTL background), but consistently produced much better track forecasts than the other three tests through 84 h. CTL_1cyc_ro Vmax impact through 48 h was small, but it produced the best (strongest Vmax) forecast thereafter likely due to less land interaction allowed by its superior track forecast. Overall, the assimilation of RO into the CTL background did not appear to degrade the forecast in a way that would suggest poor assimilation.
The track forecasts for AL04 show CTL, RO_err, and ROerr_1cyc_ctl move to the right of G5NR through 72 h (Fig. 8a). RO_err and ROerr_1cyc_ctl initialize the TC position poorly (see Fig. 12) and that degradation grows steadily over the first 48 h. The degradation in RO_err and ROerr_1cyc_ctl was 100–250 km from 12 to 96 h. After 96 h, the G5NR storm curves sharply to the northeast. ROerr_1cyc_ctl did not meet TC criteria at 120 h and was therefore terminated in the tracker. RO_err track error grows rapidly to 1700 km by 132 h, a result of failure to adequately curve. CTL forecasted the curve better, producing 1400 km improvements at 132 h. Again, Vmax initialized poorly in the coarse global model (50–70 kt underestimate) in all three tests (Fig. 8b). RO_err and ROerr_1cyc_ctl each degraded Vmax by 18 kt, suggesting the accumulated impact of RO assimilation in the analysis and modeling system was much larger than the immediate impact of the current cycle of RO assimilation. Forecast error declined for all three tests as the G5NR TC weakened.
CTL_1cyc_ro analysis and forecast errors again were quite similar to CTL, as expected. The main difference was that the CTL_1cyc_ro track curved more sharply, leading to 100–200 km improvements over CTL after 114 h. As in the 6 September EP11 analysis and forecast, there was no indication assimilating RO into the CTL background caused degradations indicative of poor assimilation.
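The track errors and impacts quoted above are distances between forecast and G5NR TC centers. As a minimal sketch of how such numbers can be computed (not the tracker or verification code used in this study), the great-circle distance can be obtained with the haversine formula. The function names, the mean Earth radius, and the sign convention (positive impact means the experiment is closer to G5NR than CTL) are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def track_error_km(lat_fcst, lon_fcst, lat_ref, lon_ref):
    """Great-circle distance (km) between a forecast and a reference TC center."""
    phi1 = math.radians(lat_fcst)
    phi2 = math.radians(lat_ref)
    dphi = phi2 - phi1
    dlam = math.radians(lon_ref - lon_fcst)
    # Haversine formula
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def track_impact_km(ctl_pos, exp_pos, ref_pos):
    """CTL error minus experiment error; positive means the experiment improved the track."""
    return track_error_km(*ctl_pos, *ref_pos) - track_error_km(*exp_pos, *ref_pos)
```

Each position is a `(latitude, longitude)` pair in degrees, so `track_impact_km(ctl, ro_err, g5nr)` returns the improvement (positive) or degradation (negative) at one lead time.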
Before evaluating how RO assimilation caused these impacts, it was vital to further ensure GSI assimilated RO well at these cycles. Assimilated RO observations that are plotted in blue represent O − A bias that was smaller than O − B bias (Figs. 7a and 8a) in CTL_1cyc_ro. Observations plotted in red represent O − A bias that was larger than O − B bias in CTL_1cyc_ro. Over 70% of RO observations in CTL_1cyc_ro had a better fit to the model after assimilation both globally and in the regions defined in Figs. 7a and 8a. Assimilated RO counts, bias, and standard deviation for both CTL_1cyc_ro and RO_err are shown in Table 3. Table 3 shows reductions in bias and standard deviation during assimilation, a sign that GSI was using RO observations to successfully minimize O − A during these cycles (statistics for all cycles in RO_err were similar after the 1–14 August spinup period). As expected, the RO_err O − B bias and standard deviation were smaller than those of CTL_1cyc_ro because the RO_err background included RO impact from previous cycles.
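The fit statistics described above can be illustrated with a short sketch. It assumes three equal-length lists holding, for each assimilated RO observation, the observed value and the model equivalents from the background and the analysis; these inputs and the function name are hypothetical and do not reflect GSI's diagnostic file format. An observation is counted as improved when the magnitude of O − A is smaller than that of O − B, which is one plausible reading of the blue/red classification.

```python
import statistics


def fit_statistics(obs, background, analysis):
    """Bias and standard deviation of O-B and O-A, plus the fraction of improved fits."""
    omb = [o - b for o, b in zip(obs, background)]  # observation minus background
    oma = [o - a for o, a in zip(obs, analysis)]    # observation minus analysis
    # An observation is "improved" if the analysis fits it more closely than the background
    improved = sum(abs(a) < abs(b) for a, b in zip(oma, omb))
    return {
        "count": len(obs),
        "omb_bias": statistics.fmean(omb),
        "omb_std": statistics.pstdev(omb),
        "oma_bias": statistics.fmean(oma),
        "oma_std": statistics.pstdev(oma),
        "fraction_improved": improved / len(obs),
    }
```

A well-behaved assimilation would be expected to give `oma_std` smaller than `omb_std` and a `fraction_improved` well above 0.5.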
Further analysis shows background errors for individual profiles (denoted in Figs. 7a and 8a) in the vicinity of the TCs under study (Figs. 9 and 10). Profile A for each case is the profile that samples closest to the TC core. All profiles correct misfits between observation and background, especially in the mid- to upper troposphere (~2–10 km), and show no indication of abnormal assimilation near the TCs.
b. Impact on TC track and Vmax analyses
A closer view of sea level pressure and 10-m wind speed analyses shows the impact of accumulated and single-cycle RO assimilation (Figs. 11 and 12). GFS TCs are weaker than G5NR storms and are generally not as symmetrical. While GFS TC cores are much weaker, the GFS storms are just as broad. For instance, the AL04 tropical storm force wind field is comparable in size for G5NR and CTL, but the core of the storm is much less compact in the GFS tests. Thus, these GFS storms do not feature hurricane force winds. RO_err AL04 is displaced northward from G5NR and produces a much less symmetrical storm than CTL (degradation), while RO_err EP11 is displaced eastward toward G5NR and has a stronger wind field than CTL (improvement). ROerr_1cyc_ctl is slightly weaker than RO_err for AL04 (i.e., the TC was slightly improved due to RO assimilation at that cycle) and slightly stronger for EP11 (i.e., the TC was slightly degraded due to RO assimilation at that cycle). Note that the single-cycle impacts seen by comparing RO_err and ROerr_1cyc_ctl are not the same as comparing RO_err background and RO_err analysis because the difference between assimilating control observations plus RO (as in RO_err) and assimilating only control observations (as in ROerr_1cyc_ctl) is not only in the RO observations. The “control observations” in RO_err and ROerr_1cyc_ctl are slightly different because adding RO impacts how the control observations are assimilated. The more direct analysis of comparing RO_err background to RO_err analysis is presented later.
The origins of the accumulated impacts can be traced back through previous cycles of analyses and short-term forecasts. Note again that this is different from the RO_err/ROerr_1cyc_ctl comparison because it is comparing the impact of RO observations on the RO_err background to the impact of control observations on the CTL background. Hence, this allows us to follow the evolution of the differences between CTL and RO_err analyses leading up to the 0000 UTC forecasts of AL04 and EP11. Time series of analysis impacts for both storms reveal little correlation of impact sign or magnitude (Fig. 13). For example, Figs. 13c and 13d show abrupt changes in sign and/or magnitude in the cycles immediately preceding the outlier Vmax analyses.
Assimilation cycles occurred every 6 h, presenting an opportunity to show how the RO_err analyses and short forecasts evolved over the previous day in comparison to that of CTL. Track forecast improvement for EP11 can be attributed to the background state at 0000 UTC 6 September since track error was not reduced in RO_err through data assimilation at that cycle (Fig. 14a). This improvement can be traced back to good analyses at 1200 and 1800 UTC 5 September. While CTL analyses were also improved through assimilation, the 6-h GDAS forecasts yielded very poor background states. Thus, the 27 km track improvement for RO_err was largely caused by a series of comparatively good 6-h forecasts. The track forecast degradation for AL04 can be attributed to a series of very poor 6-h forecasts at 0600, 1200, and 1800 UTC 21 September (Fig. 14b). Although the assimilation at 1200 and 1800 UTC 21 September and 0000 UTC 22 September greatly reduced the error, it was not enough to overcome the aforementioned poor forecasts. Meanwhile, the 6-h forecast initialized at 1200 UTC in CTL reduced the error from 46 to 25 km. Thus, the 25 km degradation was caused by poor 6-h forecasts in RO_err and one particularly good 6-h forecast in CTL. Note that while the improved EP11 analysis did not feature an improvement in the 0000 UTC 6 September assimilation cycle, the degraded AL04 analysis did feature an improvement in the 0000 UTC 22 September assimilation cycle. It is important to remember in a cycling experiment that a degradation at a particular cycle does not imply that cycle’s assimilation caused the degradation.
For the two forecasts investigated here, G5NR Vmax was always higher than the GFS experiments, so any increase in GDAS winds constituted an improvement. The 8 kt improvement of EP11 Vmax analysis at 0000 UTC 6 September developed during the 6-h forecast from 1800 UTC 5 September to 0000 UTC 6 September (Fig. 15a). The assimilation at the 1800 UTC cycle actually degraded Vmax by 3 kt. During the subsequent 6-h GDAS forecasts, RO_err Vmax increased 7 kt and CTL Vmax decreased 3 kt. Together with a 3 kt weakening in the CTL assimilation and 1 kt strengthening in the RO_err assimilation, this led to the observed 8 kt improvement. The 18 kt degradation in the AL04 analysis originated at the 1200 UTC cycle on 21 September (Fig. 15b). Two lackluster RO_err analysis cycles followed from that time with Vmax unable to increase above 42 kt. Meanwhile, CTL analyses and 6-h GDAS forecasts generated progressively increasing Vmax to 60 kt, accounting for the 18 kt degradation in RO_err.
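The cycle-by-cycle tracing used for Figs. 14 and 15 amounts to splitting the evolution of a metric (track error or Vmax) into alternating assimilation steps (analysis minus background at one cycle) and 6-h forecast steps (next background minus current analysis). The following sketch illustrates that bookkeeping under stated assumptions: the `Cycle` container and function names are hypothetical, and any values passed in would have to be read from the experiment output.

```python
from dataclasses import dataclass


@dataclass
class Cycle:
    label: str         # e.g. "1200 UTC 21 Sep"
    background: float  # metric in the 6-h forecast valid at this cycle
    analysis: float    # metric after assimilation at this cycle


def decompose(cycles):
    """Return (label, kind, change) steps, alternating assimilation and forecast."""
    steps = []
    for i, cyc in enumerate(cycles):
        # Change introduced by assimilation at this cycle
        steps.append((cyc.label, "assimilation", cyc.analysis - cyc.background))
        if i + 1 < len(cycles):
            nxt = cycles[i + 1]
            # Change that develops during the 6-h forecast to the next cycle
            steps.append((f"{cyc.label} -> {nxt.label}", "forecast",
                          nxt.background - cyc.analysis))
    return steps


def totals(steps):
    """Sum the changes attributed to assimilation and to the 6-h forecasts."""
    out = {"assimilation": 0.0, "forecast": 0.0}
    for _, kind, change in steps:
        out[kind] += change
    return out
```

Running `decompose` separately for CTL and RO_err and differencing the step-wise changes shows at which cycle, and in which kind of step, the two experiments diverge. By construction, the two totals sum to the final analysis value minus the first background value.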
While several studies have explored the impact of real RO profiles on TC forecasts in limited area models (e.g., Huang et al. 2010; Chen et al. 2015), to the authors’ knowledge a similar investigation of simulated refractivity profiles in a global modeling system has not been published. Hence, this study was designed to 1) evaluate the ability of a global model OSSE to reliably assess the impact of GNSS-RO refractivity assimilation on TC forecasts, 2) assess the impact of GNSS-RO assimilation on global model TC forecasts to the extent that the OSSE allows, and 3) provide suggestions to improve future similar studies where necessary.
The OSSE system included a high-resolution, nonhydrostatic nature run (G5NR), an operational analysis and forecast model system (GDAS/GFS), a hybrid ensemble data assimilation system (3DEnVar GDAS), RO observations based on the original planned COSMIC-2 constellation, and simulated observation errors. The study was based on two 6-week experiments: One without RO assimilation (CTL) and one with RO assimilation with realistic random errors added to the RO profiles (RO_err). A third experiment (RO_perf) assimilated RO profiles without added errors. Supplemental one-cycle forecasts were conducted to determine 1) the impact of a specific set of RO profiles on assimilation minimization statistics and analyses (CTL_1cyc_ro), and 2) the extent of impact originating from the background field and from the RO profiles in the current assimilation window (ROerr_1cyc_ctl).
In general, analysis and forecast impacts are neutral to slightly negative. The major findings of the study are:
The RO refractivity assimilation produced consistent reductions in the refractivity bias and spread globally, regionally around TCs, and for the individual profiles closest to TCs. Assimilation statistics near TCs appeared to be very similar to those away from TCs and globally. Thus, there is high confidence RO is assimilated well (though not necessarily optimally) in the vicinity of TCs.
Adding simulated refractivity errors in the global OSSE reduced the accuracy of both TC track and Vmax forecasts (as expected and desired), but impacts generally did not exceed the 95% confidence interval. Thus, in this study, added errors were not critical for interpretation of TC impacts.
Overall, TC track forecasts are not significantly impacted over the 168 h forecast; however, there is large variability on a forecast-by-forecast basis (i.e., some track forecasts are improved, while others are degraded). There is high confidence that this OSSE is capable of adequately assessing track impact because GFS track errors closely matched those documented at the National Hurricane Center.
Overall, TC Vmax forecasts are degraded by ~1–2 kt over the first 60 h of the forecast. Degradations do not increase over time, suggesting RO is not degrading the near-TC environment in which the storms develop. Rather, the degradations are introduced in the analysis of the storm itself and begin to fade after 48 h. Confidence in the metric itself is low due to large CTL and RO_err errors compared to official NHC errors (i.e., Vmax errors are too large in part due to coarse model resolution). The value of this result is that it suggests small systematic degradation in the structure of the storms. Further analysis confirmed corresponding impact in TC structure.
Integrated kinetic energy (IKE) forecasts were slightly degraded in the first 30 h of lead time and impacts were neutral thereafter. The initial degradation did not grow with time, similar to Vmax. The similarity of IKE and Vmax results increases our confidence in concluding early small degradations for storm intensity due to refractivity assimilation.
Track and Vmax forecast impacts were insensitive to TC geographic location, TC intensity at initialization, and the number of assimilated RO observations in the near-storm environment.
For the two largest Vmax analysis impacts, most of the impact originates in the background field, not the observations in the current assimilation cycle. While this is not surprising, it complicates the search for a definitive cause of RO impacts. In such cases, it is valuable to evaluate the series of preceding cycles to identify how the impact originated. Closer evaluation of analysis cycles immediately preceding two cases reveals the importance of the propagation of assimilation impact in the 6 h GDAS forecast.
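Several of the findings above rest on whether a mean impact exceeds its 95% confidence interval. A minimal sketch of such a test for paired forecasts is given below. It assumes homogeneous (matched) CTL and experiment errors, independent samples, and a normal approximation for the interval; the procedure actually used in this study may differ (for example, by using a t distribution or accounting for serial correlation), and the function names are hypothetical.

```python
import math
import statistics


def paired_impact_ci(ctl_errors, exp_errors, confidence=0.95):
    """Mean impact (CTL error minus experiment error) and its confidence interval."""
    diffs = [c - e for c, e in zip(ctl_errors, exp_errors)]
    mean = statistics.fmean(diffs)
    # Standard error of the mean difference (sample standard deviation)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    # Two-sided critical value from the standard normal distribution (assumption)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, (mean - z * sem, mean + z * sem)


def is_significant(interval):
    """An impact is treated as significant when the interval excludes zero."""
    low, high = interval
    return low > 0 or high < 0
```

With this sign convention, a positive mean indicates improvement by the experiment, and an interval that straddles zero corresponds to a neutral result.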
It is important to consider the real-world context in which this study was conducted: Official forecasts and TC best track databases report track in 0.1° increments and Vmax in 5 kt increments. The average statistically significant impacts of RO found in this study are smaller than these increments. Thus, RO from COSMIC-2 (used this way in a global model) is not likely to have a consistent impact on forecasts issued by the NHC and consumed by the public.
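The effect of reporting increments can be illustrated with a short sketch. It assumes round-half-up to the nearest 5 kt for Vmax and an approximate 111.2 km per degree of latitude for position; both are simplifications introduced here for illustration, not a description of NHC procedures.

```python
import math

KM_PER_DEG_LAT = 111.2  # approximate length of one degree of latitude (assumption)


def reported_vmax(vmax_kt, increment_kt=5):
    """Round Vmax to the nearest reporting increment (half rounds up)."""
    return increment_kt * math.floor(vmax_kt / increment_kt + 0.5)


def position_increment_km(increment_deg=0.1):
    """Approximate meridional distance spanned by one position increment."""
    return increment_deg * KM_PER_DEG_LAT
```

Under these assumptions, raw values of 61 and 62 kt are both reported as 60 kt, so a 1 kt impact would usually be invisible in the reported value, although a value near a rounding boundary (e.g., 62 versus 63 kt) can still change it.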
As discussed in section 1, Huang et al. (2010) and Chen et al. (2015) discovered largely neutral impacts on track and did not evaluate any TC intensity metric. Chen et al. (2015) found a statistically significant improvement of 12 km at 72 h. This study generally agrees with these neutral results. The lack of improvement in track forecasts is not unique to RO assimilation studies—Landsea and Cangialosi (2018) documented stagnation in track forecast improvements during the 2010s. As they mention, if a deterministic predictability limit has been reached, future improvements will likely come from probabilistic forecasts.
Forecast improvements for Vmax have been even more difficult to achieve. Previous studies that have documented RO impact on TC intensity metrics have used case studies in regional limited area models and are difficult to compare to this study. Vmax is also a difficult metric to interpret in a global OSSE because of the coarse model involved. In a real TC, Vmax occurs in a very small part of the storm (i.e., the eyewall for more developed storms), driven by processes not resolved in the model used here. Future data impact studies using global models could evaluate intensity by 1) running the global modeling system at its operational resolution, 2) using spatially integrated metrics like IKE, or 3) using the global model output as boundary conditions for higher-resolution regional models designed for convection and TCs. The first option may not be feasible depending on computing resources, and OSSE studies should use caution with the third option since the experimental forecast model resolution should never exceed that of the nature run. For that reason, special OSSE configurations have been developed specifically for TCs, based on nature runs with much higher resolution (e.g., Nolan et al. 2013; Atlas et al. 2015).
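To illustrate why a spatially integrated metric such as IKE is less dependent on a single grid point than Vmax, the sketch below sums the kinetic energy of the 10-m wind over all grid cells at or above a wind threshold. The layer depth of 1 m, the air density of 1 kg m−3, and the function name are assumptions chosen for illustration; this is not the IKE code used in this study.

```python
def integrated_kinetic_energy_tj(wind_ms, cell_area_m2, threshold_ms=0.0,
                                 rho=1.0, depth_m=1.0):
    """Sum 0.5 * rho * U^2 * dV over cells with U >= threshold; result in terajoules.

    wind_ms and cell_area_m2 are 2D lists (rows of grid cells) of matching shape.
    rho (kg m^-3) and depth_m (m) are assumed values, not taken from this study.
    """
    joules = 0.0
    for wind_row, area_row in zip(wind_ms, cell_area_m2):
        for speed, area in zip(wind_row, area_row):
            if speed >= threshold_ms:
                joules += 0.5 * rho * speed * speed * area * depth_m
    return joules / 1e12
```

Because every cell above the threshold contributes, a broad but weak-cored storm (as described for the GFS TCs) can have substantial IKE even when its Vmax is far below that of the nature run.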
It is important to note that the way observations impact the analyses and forecasts is dependent upon the data assimilation method used. In this study, we used the 3DEnVar GDAS. Further study is needed to determine how different methods may handle GNSS-RO data differently. Further OSSEs adding an economic component to TC impact are also recommended because TCs often cause extensive economic damage.
Furthermore, in this study the data assimilation system closely parallels the system that was used operationally a few years ago. That does not guarantee that the data assimilation system was correctly tuned to maximize the benefit of RO observations for TC forecasting. While the analysis of the O − B and O − A statistics (Figs. 7 and 8) shows the data assimilation system is better fitting the RO observations, it does not prove the data assimilation is optimal for these observations. For example, the localization scales were tuned by NOAA’s Environmental Modeling Center, not specially designed for our configuration. This may have degraded the analysis of other fields. These issues should be the subject of future studies.
To best confirm this study’s conclusions, a cycled global model OSE with real-world COSMIC-2 data will need to be conducted. This will soon be possible because the equatorial component of the constellation (COSMIC-2A) launched in June 2019. It is also necessary to evolve OSSE systems with the most recent modeling systems and applications. To that end, a new FV3-GFS-based OSSE system is being developed.
The authors thank Ruifang Li for code to plot RO profile locations, Jack Woollen for providing templates to simulate COSMIC-2 observations, Lisa Bucci for access to her integrated kinetic energy code, and the rest of the NOAA OSSE team for their help simulating observations from the nature run. We obtained the real-world observation system templates for August–September 2014 from the NOAA/Environmental Modeling Center. The authors also thank Altug Aksoy and four anonymous reviewers for providing helpful comments that greatly improved the manuscript. This study was conducted by NOAA in response to H.R. 353, the “Weather Research and Forecasting Innovation Act of 2017” (Public Law 115-25, https://www.congress.gov/bill/115th-congress/house-bill/353/text). According to section 107(d)(1) of the Act: “Not later than 30 days after the date of the enactment of this Act, the Assistant Administrator for Oceanic and Atmospheric Research shall complete an Observing System Simulation Experiment [OSSE] to assess the value of data from Global Navigation Satellite System Radio Occultation.” Michael J. Mueller was supported by funding from NOAA Award Number NA17OAR4320101.