1. Introduction
For over 30 years, reconnaissance aircraft have been used to obtain measurements within the tropical cyclone (TC) inner core (e.g., Zawislak et al. 2022; Rogers et al. 2006, 2013). Though these platforms have proven extremely valuable for both research and forecast improvement (Zawislak et al. 2022 and references therein), they are limited by safety and operational constraints. As a result, observations near the ocean surface are mainly limited to sparse dropwindsonde (Hock and Franklin 1999) profiles and Stepped Frequency Microwave Radiometer (SFMR) surface wind speed retrievals (Uhlhorn et al. 2007). Small uncrewed aircraft systems (sUASs) capable of operating in this hazardous region provide a variety of atmospheric measurements with greater flexibility at a lower cost than conventional aircraft. Unlike crewed platforms, they can be remotely guided to observe small-scale features in highly turbulent regimes. A comprehensive review of sUAS platforms is given in Elston et al. (2015).
One such sUAS, which has been deployed and controlled from the NOAA WP-3D (P-3) aircraft, is the Coyote (Cione et al. 2016). The Coyote is capable of remaining airborne for up to one hour while providing high-resolution pressure, temperature, humidity, and wind observations. It was first deployed during two flights into Hurricane Edouard (2014), sampling the eye and eyewall region on the first flight and the boundary layer inflow on the second. The quality of the Coyote observations was confirmed in Cione et al. (2016), which compared the Coyote data with nearby dropsondes and the P-3 tail Doppler radar (TDR).
Following these first two flights, a total of seven successful Coyote missions were flown in 2017 and 2018, including six over a series of three NOAA P-3 flights into Hurricane Maria (2017; Pasch et al. 2023) and a seventh into Hurricane Michael (2018; Beven et al. 2019). The flight patterns were designed to sample the region of maximum near-surface inflow to quantify vertical fluxes of mass and momentum (“inflow” module) and the magnitude and extent of the strongest winds at multiple levels around the storm center (“eyewall” module). An overview of these missions and the data collected was presented in Cione et al. (2020). Preliminary results of assimilating these observations in an experimental ensemble data assimilation (DA) framework were discussed in Aksoy et al. (2022).
Aksoy et al. (2022), hereafter A22, demonstrated the potential benefit of assimilating Coyote data using the Hurricane Ensemble Data Assimilation System (HEDAS; Aksoy et al. 2012; Aksoy 2013) for the flights centered on 1800 UTC 23 September 2017 in Hurricane Maria. One advantage of HEDAS is the flexibility to test new DA configurations to maximize data impact, a difficult task in an operational system. A22 found that the impact of assimilating inner-core TC observations could be incrementally improved by applying a novel online quality control (QC) technique in addition to cycling more frequently. Assimilating full-resolution Coyote data in this optimal configuration resulted in a better representation of the location and extent of the strongest wind and the structure of the secondary circulation when compared with the National Hurricane Center (NHC) best track (best track hereafter) and nearby observations.
Building upon the promising results of A22, the current study expands the assimilation of Coyote observations with a first-time demonstration in the operational Hurricane Weather Research and Forecasting (HWRF) system with the goal of guiding future work on the subject. This study also expands on A22 by assimilating Coyote observations for multiple Maria (2017) flights, considering the impact on both the analyses and subsequent forecasts. This approach provides insight into the cumulative effect of multiple cycles with Coyote data as well as the indirect impact on intermediate cycles with no Coyote observations. Moreover, this study explores additional strategies for enhancing the impact of these unique observations within current operational constraints. The ultimate goal of this study is to inform the operational use of this type of data from Coyote and other sUAS platforms that are expected to be available soon (Cione et al. 2020; Pinto et al. 2021).
The paper is structured as follows: section 2 gives an account of the evolving structure and intensity of Hurricane Maria at the observing times and provides details about the quantities measured by the Coyote as well as the timing and location of these and other reconnaissance observations. Section 3 describes the numerical model, data assimilation, and verification systems. The experimental design is explained in section 4, and section 5 describes the outcome of the experiments. A brief summary of key findings and their implications is given in the final section.
2. Background
a. Evolution of Hurricane Maria over the Coyote sUAS missions
At the time of the first Coyote flight (0000 UTC 23 September 2017), Maria had been slowly reintensifying after making landfall in Puerto Rico as a major hurricane. Prior to landfall, Maria completed an eyewall replacement cycle, which nearly tripled the eye diameter (Pasch et al. 2023). Though Maria weakened over Puerto Rico, it strengthened again after emerging over water, attaining a maximum 10-m wind speed of 57 m s−1 (intensity; hereafter referred to as VMAX) with a 953-hPa minimum sea level pressure (PMIN) by 1800 UTC 22 September. Maria began slowly weakening around the time of the first Coyote flight, and VMAX estimates decreased from 0000 UTC 23 September onward. During the P-3 flights centered at 0000 UTC and 1800 UTC 23 September, the TDR measured winds of 50 m s−1 or greater at 500-m altitude, extending outward more than 50 km from Maria’s center. The horizontal wind field analyzed from the TDR data suggested a secondary wind maximum in the east-southeast and east-northeast quadrants (Fig. 1). The Coyote observations also indicated wind speeds of at least 50 m s−1 in the inner core. Between the 1800 UTC 23 September and 1800 UTC 24 September flights, Maria slowly weakened, and its wind field expanded as the storm moved northward. During this time, PMIN continued to drop even as the strongest winds decreased.
b. Coyote observations
The Coyote was deployed from the P-3 aircraft during three flights into Hurricane Maria between 22 and 24 September 2017. To more clearly show the Coyote data in context, Fig. 2 overlays GOES-13 visible satellite images with Coyote wind measurements. Both Figs. 1 and 2 show that the flights concentrated on Maria’s inner core region. As shown in Fig. 3, the Coyote data generally focused on the lowest 1 km, well below the altitudes of available high-density flight-level observations (HDOBs hereafter). All Coyotes measured pressure, temperature, relative humidity, wind speed and direction, and GPS-based position and ground speed (Cione et al. 2016, 2020). The data were transmitted in real time to the P-3 aircraft and postprocessed by the manufacturer. The final quality-controlled data were provided at 10-Hz temporal resolution, which corresponds to a horizontal resolution of about 3 m assuming an average flight speed of 30 m s−1.
The first flight (hereafter denoted C1), which sampled the southern half of the eyewall, was conducted at 2200 UTC 22 September (Fig. 2a). The Coyote was released in the eye and then circled cyclonically outward into the eyewall, where it flew at constant altitude levels in a stepwise pattern from the P-3 altitude down to the surface. This flight collected data for 40.9 min, the longest flight duration.
Three more Coyotes (C2, C3, and C4) were subsequently deployed within a 6-h window centered at 1800 UTC 23 September. In C2, the pressure, temperature, and humidity sensor failed to acquire data. Since pressure data are needed for the height assignment in HWRF DA, these data were not used in this study. The subsequent Coyotes (C3 and C4 in Fig. 2b) carried out 31.6- and 32.4-min eyewall-sampling modules, jointly obtaining nearly symmetric coverage at the radial location of the strongest winds. C3 was released in the eye region and directed out into the eyewall in the northwest quadrant, and C4 was released inside the southwest eyewall as C3 was terminating. As in the earlier flights, these Coyotes sampled at multiple constant-altitude intervals.
Two more Coyotes (C5 and C6 in Fig. 2c) flew 24 h later at 1800 UTC 24 September. Both were released in the northwest sector of the storm, the first near the center and the second radially outward toward the eyewall. These two Coyotes glided slowly to the surface after their engines failed to start, remaining in flight for 6.1 and 7.0 min, respectively.
c. Conventional reconnaissance observations
Conventional P-3 reconnaissance data were collected concurrently with the Coyote observations and throughout each mission. At flight altitude, the aircraft reported wind, temperature, and humidity observations every 30 s, which were transmitted at 10-min intervals. Surface wind measurements from the SFMR were included with the flight-level data. Note that the P-3 flights concurrent with Coyote missions were not operationally tasked, and the crew was sufficiently occupied monitoring the Coyote that TDR and much of the dropsonde data were not transmitted for assimilation into HWRF in real time during the Coyote missions. To maintain consistency with operations, only the observations that were available in real time were assimilated in this study. Following each of the Coyote flights, the TDR data were used to construct horizontal wind analyses at 500-m height intervals, as in Gamache (1997). Coyote wind data are overlaid upon these TDR analyses in Fig. 1 to provide clearer context for where the Coyote flew relative to the wind structures within Maria.
3. Methods
a. HWRF Model
The experiments were conducted with the 2021 version of operational HWRF (H221; Biswas et al. 2018), which consists of the WRF atmospheric model with the Nonhydrostatic Mesoscale Model (NMM) dynamic core coupled to the Princeton Ocean Model (POM; Blumberg and Mellor 1987) and the HWRF Data Assimilation System (HDAS; Biswas et al. 2018). The three nested model domains have sizes of 77.2°, 17.8°, and 5.9°, respectively, with the two innermost domains configured to follow the storm. The horizontal grid spacing is 13.5, 4.5, and 1.5 km, respectively, and there are 75 vertical levels.
b. HDAS and GSI
HDAS was used here consistently with the real-time operational configuration. In particular, the system uses a hybrid ensemble three-dimensional variational (3DEnVar) DA system based on Gridpoint Statistical Interpolation (GSI; Shao et al. 2016). Observations were only assimilated over the two moveable nests. No DA was performed in the outer domain (D01), which instead used the global analysis for initial and boundary conditions. For the outer moveable nest (D02), flow-dependent covariances were obtained from the previous cycle’s 80-member NCEP Global Data Assimilation System (GDAS) ensemble 6-h forecasts, as described in Tong et al. (2018). The ensemble covariances for the innermost nest (D03) were derived from a 40-member ensemble, initialized from perturbations to D02 and updated every 6 h using an ensemble Kalman filter (EnKF) similar to Lu et al. (2017). This configuration mitigates the tendency in stronger storms for the short-term intensity forecast to be too weak when GDAS covariances are used during DA (Tong et al. 2018; Lu et al. 2017).
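For reference, the hybrid EnVar approach can be summarized schematically with a cost function of the standard hybrid form (generic notation; this sketch is illustrative and is not taken from the HWRF or GSI documentation):

$$
J(\mathbf{x}'_s,\mathbf{a}) =
\frac{\beta_s}{2}\,\mathbf{x}'^{\mathrm T}_s\mathbf{B}^{-1}\mathbf{x}'_s
+ \frac{\beta_e}{2}\sum_{k=1}^{K}\mathbf{a}_k^{\mathrm T}\mathbf{C}^{-1}\mathbf{a}_k
+ \frac{1}{2}\left(\mathbf{d}-\mathbf{H}\mathbf{x}'\right)^{\mathrm T}\mathbf{R}^{-1}\left(\mathbf{d}-\mathbf{H}\mathbf{x}'\right),
\qquad
\mathbf{x}' = \mathbf{x}'_s + \sum_{k=1}^{K}\mathbf{a}_k\circ\mathbf{x}^{e}_k ,
$$

where x′_s is the increment associated with the static background error covariance B, the a_k are extended control variables for the K ensemble perturbations x^e_k, C is the localization matrix, d is the vector of innovations (OMB), H is the linearized observation operator, R is the observation error covariance, and the weights β_s and β_e control the relative contributions of the static and ensemble covariances.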
The present study takes advantage of several user-configurable capabilities in GSI. The spatial resolution of specified observation types can be modified by applying vertical and horizontal thinning to map the data to a specified grid spacing. When the thinning option is used, the minimum horizontal grid spacing is constrained by that of the innermost model domain. Static observation errors are assigned in GSI according to platform and pressure levels. For our experiments, the Coyote data were given the same observation error values as flight-level observations, though the actual error characteristics warrant further investigation in future studies. GSI applies a gross error check to remove observations based on differences between the observations and the model first guess [i.e., the innovations or observation minus background (OMB)]. As part of the QC, data are rejected if the innovations exceed a user-defined multiple of the observation error. The maximum allowable ratio of innovation to observation error (hereafter the gross error ratio) can be adjusted to be more or less strict (Hu et al. 2016, 2018). This approach differs from A22 in that they evaluated the Coyote observations with respect to the other available datasets. In that framework, Coyote observations that are consistent with other data sources can be retained despite potentially large innovations.
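To make the roles of these two user-configurable parameters concrete, the following Python sketch illustrates the logic of grid-based thinning followed by an innovation-based gross error check. It is a minimal illustration of the concepts described above, not the GSI source code; the function names, grid mapping, and numerical values (observation error, gross error ratio) are assumptions chosen for demonstration only.

```python
import numpy as np

def thin_to_grid(lats, lons, mesh_km):
    """Keep at most one observation per thinning box of width mesh_km (illustrative)."""
    km_per_deg = 111.0  # approximate spacing of one degree of latitude
    keep, seen = [], set()
    for i, (lat, lon) in enumerate(zip(lats, lons)):
        box = (int(lat * km_per_deg // mesh_km),
               int(lon * km_per_deg * np.cos(np.radians(lat)) // mesh_km))
        if box not in seen:          # first observation in each box is retained
            seen.add(box)
            keep.append(i)
    return np.array(keep)

def gross_error_check(obs, background, obs_error, max_ratio):
    """Accept observations whose |OMB| does not exceed max_ratio * obs_error."""
    omb = obs - background           # innovation (observation minus background)
    return np.abs(omb) <= max_ratio * obs_error

# Hypothetical Coyote-like wind speeds thinned to 1.5 km with a relaxed ratio,
# roughly mimicking the relaxed configurations explored later in the text.
rng = np.random.default_rng(0)
lats = 22.0 + 0.5 * rng.random(500)
lons = -64.0 + 0.5 * rng.random(500)
obs = 50.0 + 5.0 * rng.standard_normal(500)    # m/s, hypothetical observations
bkg = obs - rng.normal(2.0, 3.0, 500)          # hypothetical first guess

idx = thin_to_grid(lats, lons, mesh_km=1.5)
passed = gross_error_check(obs[idx], bkg[idx], obs_error=3.0, max_ratio=6.0)
print(f"{passed.sum()} of {idx.size} thinned wind observations pass the check")
```

In GSI itself, the analogous settings are applied per observation type through its configuration tables; the sketch only conveys how thinning reduces data density while the gross error ratio controls how large an innovation can be before an observation is rejected.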
c. Verification
Verification against the best track was performed according to standard NHC procedures following Rappaport et al. (2009). Cases were verified only if they were designated as tropical or subtropical and had an intensity of tropical depression strength or greater. Verification was not performed if the GFDL TC tracker (Marchok 2002) was unable to find a vortex associated with a given system or, likewise, if there was no best track entry associated with the model vortex. The latter may occur when the model produces false alarms or continues to forecast a TC beyond its dissipation in the best track. The statistics shown below are homogeneous, meaning that every experiment compared must have had a verifiable forecast for a given cycle for that cycle to be included in the sample.
Verification metrics shown here include the mean absolute error (MAE; the standard NHC verification metric), mean error, and the percent improvement (i.e., skill) of the MAE. The mean error (hereafter, intensity bias) is positive if the forecast is too strong on average and negative if the forecast is too weak on average. Among best track metrics, this paper focuses on verification of track as well as the radius of maximum wind (RMW) and two measures of TC strength (VMAX and PMIN). Though the forecasts of significant wind radii [i.e., the extent of 34-, 50-, and 64-kt winds (1 kt ≈ 0.51 m s−1)] did vary among the experiments, the differences were typically less than 5 km with mixed results. As such, the differences were not sufficiently consistent to warrant further discussion. As in Rappaport et al. (2009), track error is defined as the great-circle distance between the forecast and best track positions, and intensity error is defined as the difference between the observed and tracker-calculated VMAX.
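As a concrete illustration of these definitions, the short Python sketch below computes track error as a great-circle distance and the MAE, bias, and percent improvement for a hypothetical set of intensity forecasts. It is a schematic of the definitions given above rather than the operational verification code, and all numbers and function names are illustrative.

```python
import numpy as np

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between forecast and best track positions."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2.0 * 6371.0 * np.arcsin(np.sqrt(a))

def intensity_stats(fcst_vmax, bt_vmax):
    """MAE and mean error (bias) of VMAX; positive bias = forecast too strong."""
    err = np.asarray(fcst_vmax) - np.asarray(bt_vmax)
    return np.mean(np.abs(err)), np.mean(err)

def percent_improvement(mae_exp, mae_control):
    """Skill of the experiment relative to the control (positive = better)."""
    return 100.0 * (mae_control - mae_exp) / mae_control

# Hypothetical 48-h verification of two experiments against the best track
bt_vmax = np.array([52.0, 50.0, 49.0])                 # m/s
mae_ctl, bias_ctl = intensity_stats([49.0, 47.0, 45.0], bt_vmax)
mae_exp, bias_exp = intensity_stats([51.0, 49.0, 47.0], bt_vmax)
print(f"control MAE = {mae_ctl:.1f} m/s, bias = {bias_ctl:+.1f} m/s")
print(f"experiment skill = {percent_improvement(mae_exp, mae_ctl):.0f}%")
```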
In addition to the MAE and mean skill, we used a “consistency metric” (Ditchek et al. 2023) to identify lead times with consistent forecast improvement between the mean skill, median skill, and frequency of superior performance (FSP; Velden and Goldenberg 1987). FSP is determined by the percentage of times one set of forecasts is superior to another in terms of absolute error. The consistency metric provides a means for evaluating the robustness of results when dealing with smaller sample sizes, as is the case in the current study. Possible outcomes of consistent improvement (dark green), marginally consistent improvement (light green), no consistency (gray), marginally consistent degradation (light red), or consistent degradation (dark red) are indicated in the relevant figures for each 6-h forecast lead time. A consistent designation was given if all three metrics exceeded their given thresholds for improvement. Marginal consistency was assigned if any two metrics passed the threshold while the third was neutral. The improvement thresholds were the same as in Ditchek et al. (2023): 1% for the skill metrics, while the FSP threshold was adjusted based on sample size and exceeded 80% for the number of cases evaluated here. For more details, see Ditchek et al. (2023).
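The logic of the consistency metric can be sketched as follows. This is a simplified Python illustration of the rules described above, not the Ditchek et al. (2023) code; in particular, the mirrored treatment of the degradation thresholds is an assumption based on symmetry.

```python
import numpy as np

def fsp(abs_err_exp, abs_err_ctl):
    """Frequency of superior performance (%): share of non-tied cases in which
    the experiment has smaller absolute error than the control."""
    a, b = np.asarray(abs_err_exp), np.asarray(abs_err_ctl)
    decided = a != b
    return 100.0 * np.sum(a[decided] < b[decided]) / max(np.sum(decided), 1)

def _state(value, improve_thresh, degrade_thresh):
    """+1 improvement, -1 degradation, 0 neutral."""
    if value > improve_thresh:
        return 1
    if value < degrade_thresh:
        return -1
    return 0

def consistency(mean_skill, median_skill, fsp_value,
                skill_thresh=1.0, fsp_thresh=80.0):
    """Classify one lead time: all three improved -> consistent; two improved
    and the third neutral -> marginally consistent; mirrored for degradation."""
    states = [_state(mean_skill, skill_thresh, -skill_thresh),
              _state(median_skill, skill_thresh, -skill_thresh),
              _state(fsp_value, fsp_thresh, 100.0 - fsp_thresh)]
    pos, neg = states.count(1), states.count(-1)
    if pos == 3:
        return "consistent improvement"
    if pos == 2 and neg == 0:
        return "marginally consistent improvement"
    if neg == 3:
        return "consistent degradation"
    if neg == 2 and pos == 0:
        return "marginally consistent degradation"
    return "no consistency"

# Hypothetical lead time: both skill metrics improved and the experiment beat
# the control in 8 of 10 cases (FSP = 80%, which is neutral here).
err_exp = np.array([20.0, 35.0, 50.0, 15.0, 60.0, 40.0, 25.0, 30.0, 55.0, 45.0])
err_ctl = err_exp + np.array([5.0, 3.0, -2.0, 6.0, 4.0, 2.0, 7.0, 1.0, -1.0, 3.0])
print(consistency(mean_skill=6.0, median_skill=4.0,
                  fsp_value=fsp(err_exp, err_ctl)))
```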
Verification with respect to observations was also performed by comparing the short-term (6-h) forecasts with flight-level, SFMR and TDR data. The root-mean-square (RMS) difference between the model state prior to data assimilation (background) and reconnaissance observations (OMB) was computed for each 6-h cycle beginning with the first Coyote flight and ending 24 h after the final Coyote flight for experiments which either assimilated or excluded Coyote data. Verifying against these in situ observations provides additional, valuable information that cannot be captured by verifying against the best track alone.
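The background-fit statistic used for this comparison is simply the RMS (and mean) of the innovations accumulated over each 6-h cycle; a minimal sketch of the calculation, with hypothetical values, is given below.

```python
import numpy as np

def omb_rms_and_bias(obs, background_at_obs):
    """RMS and mean of observation-minus-background differences for one cycle."""
    omb = np.asarray(obs) - np.asarray(background_at_obs)
    return float(np.sqrt(np.mean(omb ** 2))), float(np.mean(omb))

# e.g., flight-level wind speeds (m/s) vs. the 6-h forecast interpolated to them
rms, bias = omb_rms_and_bias([48.0, 52.5, 61.0], [45.0, 50.0, 55.0])
print(f"OMB RMS = {rms:.1f} m/s, bias = {bias:+.1f} m/s")   # ~4.1 and +3.8
```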
4. Experiment design
a. Control experiment
The first cycle for CONTROL began at 1800 UTC 15 September, and cycling continued at 6-h intervals through 0000 UTC 29 September. CONTROL assimilated all available conventional and reconnaissance observations as well as clear-sky satellite radiance data, similar to that outlined in Tong et al. (2018). Reconnaissance observations include flight-level, SFMR, TDR Doppler velocity and dropsonde observations with the dropsonde location estimated by the algorithm described in Aberson et al. (2017). As explained before, TDR and dropsonde data that were not transmitted in real time were not assimilated for the three cycles with Coyote data in any of the experiments here, though sensitivity tests (not shown) revealed that the Coyote data had a similar relative impact when the additional data were assimilated. Finally, a 126-h forecast was initialized with each analysis, and the 6-h forecast was taken as the first guess for the subsequent DA cycle.
b. Single-cycle experiments
The sensitivity experiments first focus on single-cycle tests with the addition of Coyote data to examine the impact of the Coyote with different GSI configurations. The single-cycle experiments were conducted for the 1800 UTC 23 September case, which provided the most complete coverage of Coyote observations. The initial conditions for the single-cycle experiments were provided by the 6-h forecast from the 1200 UTC 23 September cycle of CONTROL. All single-cycle experiments assimilated the same data as CONTROL in addition to Coyote data in four different configurations described below and summarized in Table 1.
Table 1. GSI configuration for the various experiments.
Configuration choices for the single-cycle experiments were motivated by both similarities to and differences between the Coyote and other flight-level data. Since Coyote observations consist of the same measured quantities as HDOBs, the first experiment (SINGLE-1) was configured to assimilate Coyote data in the same manner as HDOBs. SINGLE-1 used the same gross error ratio for Coyote data as for HDOBs, and it applied horizontal thinning of 3 km to yield a horizontal resolution similar to that of HDOBs. Subsequent experiments SINGLE-2 through SINGLE-4 explored how differences between Coyote and flight-level observations might translate to differences in assimilation approach. In particular, Coyote observations are reported at higher temporal frequency and are concentrated near the strongest winds and across large horizontal gradients, which suggests that the Coyote data might be assimilated at greater horizontal resolution with relaxed QC to account for the greater number and variability of these data. Experiment SINGLE-2 was identical to SINGLE-1 except that the horizontal thinning was changed to 1.5 km, allowing more observations to be assimilated. SINGLE-3 retained the 3-km thinning from SINGLE-1 but doubled the gross error ratio, allowing observations with OMB values up to twice as large to be assimilated. The fourth experiment, SINGLE-4, applied both the 1.5-km thinning and the doubled gross error ratio.
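For quick reference, the four single-cycle configurations described above can be summarized schematically as follows (a hypothetical Python representation of the settings, not actual GSI namelist syntax; see also Table 1):

```python
# Schematic summary of the single-cycle GSI configurations for Coyote data.
single_cycle_configs = {
    "SINGLE-1": {"thinning_km": 3.0, "gross_error_ratio": "default"},
    "SINGLE-2": {"thinning_km": 1.5, "gross_error_ratio": "default"},
    "SINGLE-3": {"thinning_km": 3.0, "gross_error_ratio": "2x default"},
    "SINGLE-4": {"thinning_km": 1.5, "gross_error_ratio": "2x default"},
}
```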
c. Fully cycled experiments
Following the single-cycle test case, two fully cycled experiments that assimilated Coyote observations were conducted. These experiments were initialized at 0000 UTC 23 September, the first cycle with Coyote data. The initial conditions for the first cycle of the fully cycled experiments were taken from the 1800 UTC 22 September forecast of CONTROL. These experiments continued cycling until 1800 UTC 26 September, 48 h after the final Coyote mission, yielding a total of 16 cycles. The two experiments, CYCLED-1 and CYCLED-4, respectively, followed the SINGLE-1 and SINGLE-4 configurations. In particular, CYCLED-1 assimilated the Coyote data at 3-km horizontal grid spacing using the standard gross error ratio for flight-level data, and CYCLED-4 assimilated the data at 1.5-km grid spacing with double the gross error ratio.
5. Results
This section first presents results from single-cycle tests that were used to inform the modified configurations for the cycled experiments. We examine how changes to user-defined parameters in GSI impact the number of Coyote observations assimilated, the fit to observations, and the impact on the analysis and subsequent forecast. We then expand this work with cycling experiments as described above. From this set of experiments, we examine bulk error statistics as well as impacts of the Coyote data on the forecasts.
a. Single-cycle experiments
One of the most noticeable results in the SINGLE-1 experiment is that a considerable number of the horizontal wind observations were rejected by the DA system (Figs. 4b and 5a). These rejections, which occurred when the analysis innovation exceeded the maximum allowed, were concentrated mainly to the northeast of the center when the Coyote was in the vicinity of the maximum winds (Fig. 4b). Observations near the strongest winds west of the center were also initially rejected but were eventually assimilated in subsequent iterations of the GSI algorithm. Meanwhile, all temperature and humidity observations passed this QC check and were retained. In the authors’ experience, HDAS tends to reject more wind than thermodynamic observations in the TC inner core, so this result is not unexpected. One possible explanation is that model biases result in more extreme errors in the inner-core winds than in thermodynamic variables. In addition, small errors in storm size or position may contribute to large wind innovations in high-gradient regions.
The single-cycle sensitivity experiments here focused on modifying the QC parameters in GSI so that more wind data could be assimilated. As described above, the two parameters adjusted were the thinning and the gross error ratio. About the same percentage of wind observations was rejected in SINGLE-2 as in SINGLE-1, though about 75% more wind data were assimilated because of the reduced thinning (Fig. 5b). Alternatively, doubling the gross error ratio (SINGLE-3) decreased the rejection rate to about 3%, so that about 12% more wind data were assimilated (Fig. 5c). SINGLE-4 had the lowest rejection rate (about 2%) and likewise the highest number of assimilated wind observations (double the number assimilated in SINGLE-1; see Fig. 5d). Note that SINGLE-4 rejected none of the observations originally rejected near the RMW northeast of the TC center in SINGLE-1 (Figs. 4b,c). In fact, the only wind observation rejected in SINGLE-4 was an apparent outlier at the very beginning of the C4 flight (Figs. 4a,c). Of the observations rejected in SINGLE-1 but accepted in SINGLE-4, nearly all had positive innovations, suggesting that including more data allowed for a better representation of the observed, stronger mean vortex structure.
Figure 5 also explores how well the single-cycle analyses fit the Coyote data in each of the experiments. First, changing the thinning (e.g., SINGLE-1 versus SINGLE-2 and SINGLE-3 versus SINGLE-4) did not meaningfully alter the root-mean-square error (hereafter RMSE) or the mean error (hereafter bias) of the OMB. This suggests that thinning did not substantially impact the characteristics of the data passed to GSI. Meanwhile, increasing the gross error ratio, as in SINGLE-3 and SINGLE-4, increased the OMB RMSE on average because observations with larger innovations were included in the calculation. Despite the larger OMB RMSE, the RMSE of the analysis departures (observation minus analysis, hereafter OMA) was reduced to a greater degree when more wind observations were assimilated, and the smallest RMSE was realized for the SINGLE-4 experiment. Also, the OMA distribution became more Gaussian when the QC was relaxed. Finally, the OMA bias was reduced to near zero in all the single-cycle experiments.
Assimilating the Coyote data strengthened the analysis vortex, most notably when the largest number of Coyote observations was assimilated. To demonstrate this, Fig. 6 shows analyses of wind speed at the lowest model level in CONTROL (Fig. 6a) and SINGLE-4 (Fig. 6b) as well as the difference between the two (Fig. 6c). Though the Coyote data strengthened low-level winds around much of the vortex, the biggest impact was on the northeast side, where the Coyote observed the strongest winds (cf. Figs. 5a and 6c). The resulting analysis compares better qualitatively with the TDR observations in Fig. 1, which also show an extension of the strongest winds in this region.
As shown in Table 2, analysis differences for best track metrics were generally small and fell within the range of uncertainty1 for each metric. This is partly due to the initialization procedure in HWRF, which adjusts the TC location and intensity prior to inner-core DA (see Tong et al. 2018). All four of the experiments that assimilated Coyote data had larger initial position and PMIN errors than CONTROL, but the errors were commensurate with best track uncertainty. The initial VMAX in all experiments differed by much less than typical uncertainty. For the RMW, SINGLE-2 and SINGLE-4 were superior to the other experiments, though uncertainty estimates for that parameter are unavailable.
Table 2. Analysis errors for the single-cycle experiments. The lowest error for each metric is indicated in bold font.
Figure 7 explores the impacts of the Coyote data in 126-h forecasts initialized from the four different analyses. Figures 7a–c illustrate how the Coyote data impact the track and intensity forecasts for each of the four configurations compared with CONTROL and the best track, while Figs. 7g–i show the forecast errors for each experiment. Comparing like colors illustrates the impact of increasing the gross error ratio, and comparing solid and dashed lines shows the impact of data resolution. There is some indication that increasing either the horizontal resolution or the gross error ratio results in better forecasts and that increasing both, as in SINGLE-4, might be optimal.
To better highlight the performance of SINGLE-4, which produced the best forecast over the first few days, Figs. 7d–f and 7j–l show the same information as Figs. 7a–c and 7g–i but compare only CONTROL and SINGLE-4. One obvious result in Fig. 7 is that the forecast storm in SINGLE-4 was somewhat stronger than that in CONTROL out to about 48 h, so that the intensity in SINGLE-4 verified better against the best track. This is consistent with the stronger analysis vortex in Fig. 6. Further, the track error in SINGLE-4 tended to be the lowest among all experiments assimilating Coyote data and generally was commensurate with that in CONTROL.
Given the variability of errors across lead times, the time average of forecast errors is useful to examine. As such, Fig. 8 shows the MAE of track and intensity metrics, where the mean was taken across all lead times. For all three metrics in Fig. 8, SINGLE-1 tended to have larger errors than CONTROL, though adding more Coyote data in the remaining three experiments improved the verification so that errors were comparable to or lower than those in CONTROL, particularly in SINGLE-4. This result suggests additive improvements from increasing data resolution and increasing the gross error check threshold. These results guided the choice of the configuration for the fully cycled experiments.
b. Fully cycled analyses
The fully cycled experiments were carried out every 6 h between 0000 UTC 23 September and 1800 UTC 26 September. As explained above, the CYCLED-1 experiment used the SINGLE-1 configuration (3-km thinning of Coyote observations and the default gross error ratio). CYCLED-4 used the SINGLE-4 configuration (1.5-km thinning and the doubled gross error ratio) that resulted in the smallest analysis and forecast errors.
The impact of the Coyote data on the analyses depended on how well they sampled the inner core. To show this, Figs. 9 and 10, respectively, explore the innovations for the Coyote data and the commensurate analysis increments of wind speed at the lowest model level. The C1 flight at 0000 UTC 23 September (Fig. 9a) as well as the C5 and C6 flights at 1800 UTC 24 September (Fig. 9c) asymmetrically sampled the inner core, which apparently resulted in asymmetric wind speed analysis increments (Figs. 10g,i). In particular, C1 had almost no Coyote observations north of the center, and most observations were concentrated to the southwest of the center (Fig. 9a). Innovations varied strongly along the flight path, showing that the first-guess wind speeds were too weak southwest of the center but too strong to the southeast (Fig. 9d). As a result, analysis increments (Fig. 10g) southwest of the center locally strengthened the first-guess winds, while those southeast of the center weakened them. On top of this was a signal consistent with eastward movement of the vortex (e.g., the tripole in wind speed increments in Figs. 10g–i). The analysis increments north of the center were small, and the resulting analysis had a somewhat unrealistic structure with a strong wavenumber-2 component (Fig. 10d). There was no evidence of this feature in the corresponding TDR analysis (Figs. 1b,e). The 1800 UTC 24 September cycle, when the Coyote data covered an even smaller region, had similar issues. There were large positive innovations associated with the limited area of Coyote coverage to the west and northwest of the storm center (Fig. 9f) that approximately coincided with a strong positive increment to wind speed (Fig. 10i). Again, the analysis had a somewhat unrealistic wavenumber-2 structure (Fig. 10f), which is inconsistent with the TDR data. This result suggests that the model covariances may be suboptimal for assimilating asymmetrically distributed observations such as these.
The fundamentally different data distribution in C3 and C4 at 1800 UTC 23 September resulted in a different distribution of wind speed analysis increments. Nearly every quadrant of the storm around or within the RMW was sampled by the two Coyotes that flew during that cycle (Fig. 9b), and the two Coyote paths overlapped in an area to the south of the center. Generally speaking, the innovations were more strongly positive when the Coyotes were farthest from the center, and they were negative near the center (Fig. 9e). The commensurate structure of analysis increments was generally symmetric, with a ring of positive speed increments surrounding a region of negative increments near the center (Fig. 10h). The result was a much more symmetric structure (Fig. 10e) than at the other two analysis times (Figs. 10d,f).
Despite the variability in data distribution, assimilating the Coyote data in CYCLED-4 clearly drew the analyses closer to observations from the Coyote. In Fig. 9, the number of large-magnitude innovations for Coyote-observed wind speed decreased substantially from the first guess (Figs. 9d–f) to the analysis (Figs. 9g–i) in all three cases (see also Table 3). The RMS of OMA for wind speed was consistently about half of that for the OMB, and the bias was also substantially reduced. Likewise, the CYCLED-4 analyses better fit the HDOB flight-level winds than did the CONTROL analyses (Table 3).
Table 3. OMB and OMA RMS error and bias (m s−1) for Coyote and HDOB U/V wind components and wind speed. Note that these statistics were computed from all available observations (i.e., both those assimilated and those rejected by the DA system). Wind speed statistics were computed from the original U and V observations. The lowest value for each statistic is indicated in bold font.
Meanwhile, the CYCLED-1 experiment demonstrated a worse fit to observations than either CYCLED-4 or CONTROL (Table 3). Note that the RMS of OMB was very similar in the first cycle of CYCLED-1 and CYCLED-4, which again suggests that the characteristics of the denser Coyote data available to CYCLED-4 did not substantially differ from those in CYCLED-1. Nevertheless, the RMS and bias of the Coyote wind OMA in CYCLED-1 were much higher than in CYCLED-4, starting with the first cycle and continuing through subsequent analyses. Similarly, the CYCLED-1 analyses generally did not fit HDOB winds as well as CONTROL did.
The impact of the Coyote data on NHC-verified metrics varied strongly by analysis cycle. To demonstrate this, Table 4 compares the best track verification metrics with a focus on the inner core. In particular, VMAX, PMIN, RMW, and location are shown for CYCLED-1, CYCLED-4, and CONTROL. Generally speaking, adding Coyote data increased the analysis intensity for the first two cycles with Coyote data, and impacts were mixed for the third cycle. For the first cycle in particular, the changes imparted by assimilating the Coyote data made an already stronger-than-observed analysis VMAX in CONTROL even stronger, so that the difference in VMAX between CYCLED-4 and the best track far exceeded average best track errors (Landsea and Franklin 2013). This issue might be related to the asymmetric analysis increments shown in Fig. 10. Meanwhile, the CYCLED-4 RMW was consistently the smallest among all experiments and therefore agreed best with the best track. These differences appear to have important implications for the forecasts, as described next.
Table 4. Analysis values and errors for standard verification metrics. For track, the direction of the position error is included.
c. Fully cycled forecasts
The impact of the Coyote observations was assessed in a series of sixteen 126-h forecasts that cycled every 6 h from 0000 UTC 23 September to 1800 UTC 26 September. Although only three cycles contained Coyote observations, the residual impact of assimilating these data would be expected to propagate through subsequent cycles. To quantify the full impact of the Coyote data, we continued to cycle 48 h past the last cycle with Coyote observations. We mostly focus on the results from CYCLED-4, though some contrast with CYCLED-1 is also offered. First, we evaluate the short-term (6-h) forecasts in terms of the OMB of various reconnaissance observations. We also examine differences in track, VMAX, PMIN, and RMW between the various experiments and the best track, first focusing on the cycles where Coyote data were available. We conclude with an assessment of the average forecast skill with respect to the best track using the consistency metric.
The RMS OMB for all available flight-level wind and temperature, SFMR surface wind speed, and TDR radial wind speed is shown in Fig. 11. All calculations were performed over the innermost HWRF domain, D03. Flight-level and SFMR data exist for all the 6-h cycles, but TDR was not assimilated until later cycles and only when the Coyote was not flying, hence the missing values for these observations. The number of observations per cycle ranged from approximately 600 to 3200 for flight level and SFMR and between 10 000 and 25 000 for TDR.
Figure 11 indicates that the short-term forecasts agreed better with in situ reconnaissance observations when the Coyote data were assimilated in the CYCLED-4 configuration but not in the CYCLED-1 configuration. The largest improvement in the CYCLED-4 OMB was in the winds at flight level, though the mean OMB for both SFMR wind speed and TDR radial velocity was also smaller for CYCLED-4. The smaller improvements for TDR and SFMR might be because those model quantities are derived values and, in the case of SFMR, dependent on the surface-layer physics. The biggest improvement in any variable was not realized immediately but instead in the forecasts initialized 12 to 18 h after the assimilation of Coyote data. In the CYCLED-1 experiment, only the TDR OMB was reduced, and there were slight increases in the OMB for HDOB U and V and SFMR wind speed. As for thermodynamic observations, there was practically no difference in the OMB for HDOB temperature in either CYCLED-1 or CYCLED-4.
In addition to producing generally better short-term forecasts as evaluated against reconnaissance observations, CYCLED-4 also verified better against the best track in longer-term forecasts. In the analysis below, a theme emerges that CYCLED-4 produced generally better forecasts than CONTROL for track and intensity metrics, whereas this was not the case for CYCLED-1. This is consistent with the single-cycle tests, where SINGLE-1 was somewhat worse than CONTROL, but SINGLE-4 was better.
For the cycles when Coyote data were available, the added data had mixed impacts in CYCLED-1, but they more consistently benefited CYCLED-4. For example, Fig. 12 reveals that the track errors in CYCLED-1 were larger than those in CONTROL for two of the three cycles with Coyote data. The VMAX errors in CYCLED-1 were commensurate with those in CONTROL, though PMIN errors generally improved upon CONTROL for the first 48 h. Meanwhile, track errors in CYCLED-4 improved on those in CONTROL for two of the three cycles, and they were comparable for the third. Finally, although VMAX errors in CYCLED-4 fluctuated, PMIN errors were consistently lower than those in CONTROL for the first 72 h.
The varying performance of the forecasts again hints at the importance of a symmetric data distribution around the TC center. Among the three cycles that assimilated Coyote data, clearly the most positive impact of the data occurred at 1800 UTC 23 September, which was when the C3 and C4 flights nearly encircled Maria’s center. The intensity forecast in CYCLED-4 at that time agreed remarkably well with the best track for much of the duration of the forecast (Fig. 13). The VMAX error in CYCLED-4 was lower than in CONTROL at most lead times, and the PMIN error was substantially lower than in CONTROL for about the first 48 h. Further, both the CYCLED-1 and CYCLED-4 tracks improved more on CONTROL in this cycle than in the others. In contrast, the unreasonably strong intensity analyses in the first cycle that assimilated Coyote data (Table 4, Fig. 10d) were followed by generally degraded VMAX forecasts (Fig. 12d). This is especially true for CYCLED-4, which had the highest analysis VMAX and also a high bias in the VMAX forecast at many lead times through 72 h.
In terms of average errors evaluated over all available cycles, CYCLED-4 had notably better track forecasts than the other experiments (Fig. 14). The track MAE in CYCLED-4 was lower than in CONTROL at all lead times by about 10% on average, and the improvement was marginally consistent out to 72 h. The improvement in the track forecast was statistically significant with 95% confidence at 60 h. Although the track MAE in CYCLED-4 was skillful at longer forecast lead times, the median error was less skillful (not shown), and the improvement was therefore not assessed as consistent. Meanwhile, the track error in CYCLED-1 was similar to or somewhat worse than that in CONTROL.
Despite very low intensity forecast errors in CONTROL, the Coyote data further improved the intensity forecast in CYCLED-4, on average. Figure 15 shows that the MAE for VMAX and PMIN was, respectively, about 2.5 m s−1 and 5 hPa in CONTROL for the first 96 h. Assimilating Coyote observations reduced PMIN forecast errors with marginal consistency at most lead times. There was also marginally consistent improvement in VMAX at most lead times after 24 h for CYCLED-4 (Fig. 15a, top panel), and on average the CYCLED-4 VMAX forecasts improved upon CONTROL by 3.8%. The larger errors at the initial time were dominated by the first case, when the analyzed storm was much stronger than the intensity reported in the best track. The improvement in VMAX MAE in CYCLED-4 was associated with a reduction of the weak intensity bias (Fig. 15a, bottom panel), which was consistent with the improved PMIN bias. This appears to be partly a result of analyses with a stronger VMAX and smaller RMW, as discussed above. The sign of the PMIN bias reversed at later lead times in both the CYCLED-4 and CONTROL experiments, but the PMIN bias in CYCLED-4 remained improved relative to CONTROL.
The impacts on the CYCLED-1 intensity forecast were mixed. For PMIN, the CYCLED-1 forecast improved upon CONTROL out to 72 h (Fig. 15b), though the improvement was smaller than for CYCLED-4. Meanwhile, CYCLED-1 had marginally consistent VMAX degradation in the early forecasts and mixed results after 24 h (Fig. 15a). In contrast to CYCLED-4, the average VMAX skill in CYCLED-1 was about 3.8% worse than in CONTROL. The VMAX bias in CYCLED-1 did not meaningfully differ from that in CONTROL, on average.
6. Summary and conclusions
This study summarizes the results from the first-ever attempt to assimilate sUAS observations into an NCEP operational model. Coyote sUAS observations were assimilated into the operational HWRF model in a series of experiments intended to guide future efforts to assimilate sUAS observations operationally. A single-cycle test case and fully cycled experiments examined the impact of varying both the spatial density of observations and the stringency of DA quality control. In particular, experiments varying the observation density explored whether Coyote observations should be assimilated at a higher horizontal resolution than flight-level data from conventional reconnaissance aircraft. Initial testing revealed a high rejection rate of Coyote wind observations compared with thermodynamic data, so experiments also explored the impacts of relaxing quality control for wind observations. This approach seems reasonable given the greater variability and high temporal frequency of these observations in the hurricane planetary boundary layer, where sUASs typically operate.
Results from both sets of experiments suggest that Coyote data can improve TC forecasts if they are assimilated at a higher resolution than currently operational flight-level reconnaissance data and if the quality control is relaxed. The cycled experiment that assimilated the most Coyote data (CYCLED-4) generally produced better analyses and forecasts than both an experiment that assimilated less Coyote data (CYCLED-1) and an experiment that assimilated no Coyote data (CONTROL). The CYCLED-4 analyses had better OMA agreement with the Coyote data themselves and with other reconnaissance data. Further, the OMB in CYCLED-4 tended to be lower than in the other experiments for wind-related variables (i.e., flight-level wind, SFMR wind speed, and TDR radial velocity). This suggests that the short-term forecasts in CYCLED-4 were more skillful, at least in the regions sampled by the reconnaissance aircraft. CYCLED-4 also produced improved track and intensity forecasts when evaluated against the NHC best track, though the sample size was fairly small. The intensity improvement in CYCLED-4 was generally associated with reduced bias.
Though the sample size here is small, experience with assimilating TC inner-core data suggests that results such as these can set reasonable expectations for positive impact. Perhaps the best example of this came from Zhang et al. (2009), the first study to examine the impact of assimilating high-density inner-core observations. In particular, they evaluated the impact of assimilating high-density Doppler velocity observations from the NEXRAD network on a coastal TC. Their sample size was likewise very small; in fact, all forecast evaluations were limited to a single day. Nevertheless, their experiments assimilating the additional data showed remarkable improvement and encouraged a long series of subsequent studies to examine the impact of assimilating inner-core reconnaissance data. Most of the subsequent airborne studies that added high-density, inner-core data also showed improvement, and reconnaissance data are now known to improve operational intensity forecasts by at least 10%–15% (Zawislak et al. 2022).
One issue to consider for future assimilation of sUAS data is that analyses benefited more when observations were symmetrically distributed around the storm center. When this is not the case, DA relies more heavily on the background covariance information at locations azimuthally or radially displaced from the assimilated observations, which can result in suboptimal updates to the model background. This may be less of a factor if TDR data are assimilated concurrently with the sUAS data, particularly in unobserved regions of the TC. Symmetry of TDR data is already taken into account in the design of the NOAA aircraft flight patterns, as outlined in the National Hurricane Operations Plan (NOAA 2022). Storm-relative observations (Aksoy 2013) have also been shown to allow for more symmetric observation coverage, especially when shorter assimilation windows are used. A more sophisticated DA system such as 4DEnVar may also help to reduce the need for symmetric data coverage (e.g., Davis et al. 2021).
More evaluation is needed to determine how the assimilation of sUAS observations can be further optimized. An important factor to account for here is the intrinsically different nature of these observations compared with those typically collected by crewed aircraft. The sUASs typically fly below crewed-aircraft altitudes and at much slower speeds, sampling the more turbulent parts of the boundary layer at much higher rates. It is therefore necessary to further investigate how the thinning or superobbing (i.e., combining observations within a volume to produce a single “super observation”) of sUAS observations can be optimized, including sophisticated superobbing techniques that also consider the variance of measurements within a given superobbing volume, as sketched below. Another avenue of exploration is more frequent cycling, which was shown to be beneficial in a research DA system (Aksoy et al. 2022). While this is not an option in the current HWRF operational system, the next-generation Hurricane Analysis and Forecast System (HAFS) is expected to have this capability. More frequent cycling should improve analyses by better mapping of the observations to the model background at times and locations where the vortex may be rapidly evolving and by providing updated covariances.
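As one example of what such an approach might look like, the Python sketch below bins observations into hypothetical horizontal and vertical volumes, averages the members of each bin into a single super observation, and inflates the assigned error using the within-bin scatter. The bin sizes, the error model, and all values are assumptions for illustration only and do not reflect a tested configuration.

```python
import numpy as np
from collections import defaultdict

def superob(lats, lons, heights, values, obs_err, dxy_km=1.5, dz_m=100.0):
    """Average observations within each horizontal/vertical bin into one
    super observation; inflate its error with the within-bin scatter."""
    km_per_deg = 111.0
    bins = defaultdict(list)
    for lat, lon, z, v in zip(lats, lons, heights, values):
        key = (int(lat * km_per_deg // dxy_km),
               int(lon * km_per_deg * np.cos(np.radians(lat)) // dxy_km),
               int(z // dz_m))
        bins[key].append(v)
    superobs = []
    for members in bins.values():
        members = np.asarray(members)
        # Error accounting for both the base observation error and the
        # representativeness implied by the variance within the bin
        err = np.sqrt(obs_err ** 2 + members.var() / len(members))
        superobs.append((float(members.mean()), float(err), len(members)))
    return superobs

# Hypothetical 10-Hz wind speeds collapsed onto 1.5-km / 100-m bins
rng = np.random.default_rng(1)
n = 2000
lats = 22.0 + 0.02 * rng.random(n)
lons = -64.0 + 0.02 * rng.random(n)
hts = 600.0 * rng.random(n)
wspd = 50.0 + 5.0 * rng.standard_normal(n)
print(f"{n} raw obs -> {len(superob(lats, lons, hts, wspd, obs_err=3.0))} superobs")
```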
A caveat of this work is that the sample size is very small. Coyote data were only assimilated in three cycles, and forecast error statistics were evaluated over 16 cycles. This is far too small a sample from which to draw general conclusions, and many more similar sUAS flights are needed for a more robust assessment of forecast impact.
1 For major hurricanes with reconnaissance, the uncertainties for position, VMAX, and PMIN are, respectively, about 20 km, 4–6 m s−1, and 2–4 hPa (Torn and Snyder 2012; Landsea and Franklin 2013).
Acknowledgments.
This research was supported by the NOAA Uncrewed Systems Research Transition Office in the Office of Oceanic and Atmospheric Research. Partial funding support was also provided through the Cooperative Agreement NA20OAR4320472 between NOAA and the University of Miami. Appreciation goes out to all NOAA Aircraft Operations Center (AOC) staff who supported P-3 sUAS Hurricane Hunter missions into Maria (2017). Without their dedication these missions would not have been possible. The Coyote observations used here were the cumulative result of years of development spearheaded by Joseph Cione of NOAA’s Hurricane Research Division. The authors acknowledge the NOAA Research and Development High Performance Computing Program for providing computing and storage resources that have contributed to the research results reported within this paper (https://rdhpcs.noaa.gov). Internal reviewers Andy Hazelton and Peter Dodge as well as several anonymous reviewers provided insightful comments to improve the final manuscript.
Data availability statement.
The experimental data are archived on the NOAA Jet research and development high-performance computing system and are available from the authors upon request. Hurricane reconnaissance observations are publicly accessible through the NOAA/AOML/HRD data portal at https://www.noaa.gov/hrd/Storm_pages/maria2017.
REFERENCES
Aberson, S. D., K. J. Sellwood, and P. A. Leighton, 2017: Calculating dropwindsonde location and time from TEMP-DROP messages for accurate assimilation and analysis. J. Atmos. Oceanic Technol., 34, 1673–1678, https://doi.org/10.1175/JTECH-D-17-0023.1.
Aksoy, A., 2013: Storm-relative observations in tropical cyclone data assimilation with an ensemble Kalman filter. Mon. Wea. Rev., 141, 506–522, https://doi.org/10.1175/MWR-D-12-00094.1.
Aksoy, A., S. Lorsolo, T. Vukicevic, K. J. Sellwood, S. D. Aberson, and F. Zhang, 2012: The HWRF Hurricane Ensemble Data Assimilation System (HEDAS) for high-resolution data: The impact of airborne Doppler radar observations in an OSSE. Mon. Wea. Rev., 140, 1843–1862, https://doi.org/10.1175/MWR-D-11-00212.1.
Aksoy, A., J. J. Cione, B. A. Dahl, and P. D. Reasor, 2022: Tropical cyclone data assimilation with Coyote uncrewed aircraft system observations, very frequent cycling, and a new online quality control technique. Mon. Wea. Rev., 150, 797–820, https://doi.org/10.1175/MWR-D-21-0124.1.
Beven, J. L., II, R. Berg, and A. Hagen, 2019: National Hurricane Center Tropical Cyclone Report: Hurricane Michael (7–11 October 2018). NOAA/NHC Tech. Rep. AL142018, 86 pp., https://www.nhc.noaa.gov/data/tcr/AL142018_Michael.pdf.
Biswas, M. K., and Coauthors, 2018: Hurricane Weather Research and Forecasting (HWRF) Model: 2018 Scientific documentation. Scientific Doc. HWRF v4.0a, 112 pp., https://dtcenter.org/sites/default/files/community-code/hwrf/docs/scientific_documents/HWRFv4.0a_ScientificDoc.pdf.
Blumberg, A. F., and G. L. Mellor, 1987: A description of a three-dimensional coastal ocean circulation model. Three-Dimensional Coastal Models, N. Heaps, Ed., Amer. Geophys. Union, 1–16.
Cione, J. J., E. A. Kalina, E. W. Uhlhorn, A. M. Farber, and B. Damiano, 2016: Coyote unmanned aircraft system observations in Hurricane Edouard. Earth Space Sci., 3, 370–380, https://doi.org/10.1002/2016EA000187.
Cione, J. J., and Coauthors, 2020: Eye of the storm: Observing hurricanes with a small unmanned aircraft system. Bull. Amer. Meteor. Soc., 101, E186–E205, https://doi.org/10.1175/BAMS-D-19-0169.1.
Davis, B., X. Wang, and X. Lu, 2021: A comparison of HWRF six-hourly 4DEnVar and hourly 4DEnVar assimilation of inner core tail Doppler radar observations for the prediction of Hurricane Edouard (2014). Atmosphere, 12, 942, https://doi.org/10.3390/atmos12080942.
Ditchek, S. D., J. A. Sippel, P. J. Marinescu, and G. J. Alaka Jr., 2023: Improving best track verification of tropical cyclones: A new metric to identify forecast consistency. Wea. Forecasting, 38, 817–831, https://doi.org/10.1175/WAF-D-22-0168.1.
Elston, J., B. Argrow, M. Stachura, D. Weibel, D. Lawrence, and D. Pope, 2015: Overview of small fixed-wing unmanned aircraft for meteorological sampling. J. Atmos. Oceanic Technol., 32, 97–115, https://doi.org/10.1175/JTECH-D-13-00236.1.
Gamache, J. F., 1997: Evaluation of a fully three-dimensional variational Doppler analysis technique. Preprints, 28th Conf. on Radar Meteorology, Austin, TX, Amer. Meteor. Soc., 422–423.
Hock, T. F., and J. L. Franklin, 1999: The NCAR GPS dropwindsonde. Bull. Amer. Meteor. Soc., 80, 407–420, https://doi.org/10.1175/1520-0477(1999)080<0407:TNGD>2.0.CO;2.
Hu, M., H. Shao, D. Stark, K. Newman, C. Zhou, and X. Zhang, 2016: Gridpoint Statistical Interpolation Advanced User’s Guide Version 3.5. Developmental Testbed Center, 148 pp., https://www.dtcenter.org/com-GSI/users/docs/index.php.
Hu, M., G. Ge, C. Zhou, D. Stark, H. Shao, K. Newman, J. Beck, and X. Zhang, 2018: Gridpoint Statistical Interpolation (GSI) User’s Guide Version 3.7. Developmental Testbed Center, 147 pp., https://www.dtcenter.org/com-GSI/users/docs/index.php.
Landsea, C. W., and J. L. Franklin, 2013: Atlantic hurricane database uncertainty and presentation of a new database format. Mon. Wea. Rev., 141, 3576–3592, https://doi.org/10.1175/MWR-D-12-00254.1.
Lu, X., X. Wang, M. Tong, and V. Tallapragada, 2017: GSI-based, continuously cycled, dual-resolution hybrid ensemble–variational DA system for HWRF: System description and experiments with Edouard (2014). Mon. Wea. Rev., 145, 4877–4898, https://doi.org/10.1175/MWR-D-17-0068.1.
Marchok, T. P., 2002: How the NCEP tropical cyclone tracker works. Conf. on Hurricanes and Tropical Meteorology, San Diego, CA, Amer. Meteor. Soc., P1.13, https://ams.confex.com/ams/pdfpapers/37628.pdf.
NOAA, 2022: National hurricane operations plan. Office of the Federal Coordinator for Meteorological Services and Supporting Research (OFCM) Doc. FCM-P12-2022, 186 pp., https://www.weather.gov/media/nws/IHC2022/2022_NHOP_June_1.pdf.
Pasch, R. J., A. B. Penney, and R. Berg, 2023: National Hurricane Center Tropical Cyclone Report: Hurricane Maria (16–30 September 2017). NOAA/NHC Tech. Rep. AL152017, 48 pp., https://www.nhc.noaa.gov/data/tcr/AL152017_Maria.pdf.
Pinto, J. O., and Coauthors, 2021: The status and future of small uncrewed aircraft systems (UAS) in operational meteorology. Bull. Amer. Meteor. Soc., 102, E2121–E2136, https://doi.org/10.1175/BAMS-D-20-0138.1.
Rappaport, E. N., and Coauthors, 2009: Advances and challenges at the National Hurricane Center. Wea. Forecasting, 24, 395–419, https://doi.org/10.1175/2008WAF2222128.1.
Rogers, R., and Coauthors, 2006: The intensity forecasting experiment: A NOAA multiyear field program for improving tropical cyclone intensity forecasts. Bull. Amer. Meteor. Soc., 87, 1523–1538, https://doi.org/10.1175/BAMS-87-11-1523.
Rogers, R., and Coauthors, 2013: NOAA’s hurricane intensity forecasting experiment: A progress report. Bull. Amer. Meteor. Soc., 94, 859–882, https://doi.org/10.1175/BAMS-D-12-00089.1.
Shao, H., and Coauthors, 2016: Bridging research to operations transitions: Status and plans of community GSI. Bull. Amer. Meteor. Soc., 97, 1427–1440, https://doi.org/10.1175/BAMS-D-13-00245.1.
Tong, M., and Coauthors, 2018: Impact of assimilating aircraft reconnaissance observations on tropical cyclone initialization and prediction using operational HWRF and GSI ensemble-variational hybrid data assimilation. Mon. Wea. Rev., 146, 4155–4177, https://doi.org/10.1175/MWR-D-17-0380.1.
Torn, R. D., and C. Snyder, 2012: Uncertainty of tropical cyclone best-track information. Wea. Forecasting, 27, 715–729, https://doi.org/10.1175/WAF-D-11-00085.1.
Uhlhorn, E. W., P. G. Black, J. L. Franklin, M. Goodberlet, J. Carswell, and A. S. Goldstein, 2007: Hurricane surface wind measurements from an operational Stepped Frequency Microwave Radiometer. Mon. Wea. Rev., 135, 3070–3085, https://doi.org/10.1175/MWR3454.1.
Velden, C. S., and S. B. Goldenberg, 1987: The inclusion of high-density satellite wind information in a barotropic hurricane-track forecast model. Preprints, 17th Conf. on Hurricanes and Tropical Meteorology, Miami, FL, Amer. Meteor. Soc., 90–93.
Willoughby, H. E., and M. B. Chelmow, 1982: Objective determination of hurricane tracks from aircraft observations. Mon. Wea. Rev., 110, 1298–1305, https://doi.org/10.1175/1520-0493(1982)110<1298:ODOHTF>2.0.CO;2.
Zawislak, J., and Coauthors, 2022: Accomplishments of NOAA’s airborne hurricane field program and a broader future approach to forecast improvement. Bull. Amer. Meteor. Soc., 103, E311–E338, https://doi.org/10.1175/BAMS-D-20-0174.1.
Zhang, F., Y. Weng, J. A. Sippel, Z. Meng, and C. H. Bishop, 2009: Cloud-resolving hurricane initialization and prediction through assimilation of Doppler radar observations with an ensemble Kalman filter. Mon. Wea. Rev., 137, 2105–2125, https://doi.org/10.1175/2009MWR2645.1.