  • Abbey, R. F., Jr., L. M. Leslie, and G. J. Holland, 1995: Estimates of the inherent and practical limits of mean forecast errors of tropical cyclones. Preprints, 21st Conf. on Hurricanes and Tropical Meteorology, Miami, FL, Amer. Meteor. Soc., 201–203.

  • Aberson, S. D., 1997: The prediction of the performance of a nested barotropic hurricane track forecast model. Wea. Forecasting, 12, 24–30.

  • Carr, L. E., and R. L. Elsberry, 1994: Systematic and integrated approach to tropical cyclone track forecasting. Part I. Approach overview and description of meteorological basins. Tech. Rep. NPS-MR-94-002, Naval Postgraduate School, Monterey, CA, 273 pp. [Available from Naval Postgraduate School, Monterey, CA 93943-5114.]

  • Fiorino, M., J. Goerss, J. Jensen, and E. Harrison, 1993: An evaluation of the real-time tropical cyclone forecast skill of the Navy Operational Global Atmospheric Prediction System in the western North Pacific. Wea. Forecasting, 8, 3–24.

  • Goerss, J., and R. Jeffries, 1994: Assimilation of synthetic tropical cyclone observations into the Navy Operational Global Atmospheric Prediction System. Wea. Forecasting, 9, 557–576.

  • Guard, C. P., 1995: Wind-pressure relationships to determine hurricane intensity. Minutes, 49th Interdepartmental Hurricane Conf., Silver Spring, MD, OFCM.

  • JTWC, 1997: Tropical cyclone summary 1996. U.S. Naval Pacific Meteorology and Oceanography Center West/Joint Typhoon Warning Center, 116 pp. [Available from JTWC, FPO AP 96536-0051, Guam.]

  • Kurihara, Y., M. A. Bender, R. E. Tuleya, and R. J. Ross, 1990: Prediction experiments of Hurricane Gloria (1985) using a multiply nested movable mesh model. Mon. Wea. Rev., 118, 2186–2198.

  • ——, ——, and R. J. Ross, 1993: An initialization scheme of hurricane models by vortex specification. Mon. Wea. Rev., 121, 2030–2045.

  • ——, ——, R. E. Tuleya, and R. J. Ross, 1995: Improvements in the GFDL Hurricane Prediction System. Mon. Wea. Rev., 123, 2791–2801.

  • ——, R. E. Tuleya, and M. A. Bender, 1998: The GFDL Hurricane Prediction System and its performance in the 1995 hurricane season. Mon. Wea. Rev., 126, 1306–1322.

  • Neumann, C., 1992: Final report, Joint Typhoon Warning Center (JTWC92) Model. SAIC Contract Rep. N 00014-90-C-6042 (Part 2), 83 pp. [Available from SAIC, 550 Camino El Estero, #205, Monterey, CA 93940.]

  • OFCM, 1997: National Plan for Tropical Cyclone Research and Reconnaissance (1997–2002), FCM-P25-1997, Washington, DC, 137 pp. [Available from Office of the Federal Coordinator for Meteorological Services and Supporting Research, 8455 Colesville Road, Suite 1500, Silver Spring, MD 20910.]

  • Pike, A. C., and C. J. Neumann, 1987: The variation of track forecast difficulty among tropical cyclone basins. Wea. Forecasting, 2, 237–241.

Fig. 1. GFDN detection ability at different forecast periods. Percentages on the right are probability of detection for tropical cyclones with verifying best track greater than tropical depression intensity. See text for a full explanation of bar positions.

Fig. 2. (a) Mean track errors for GFDN forecasts. (b) Number of cases. Errors are stratified according to the intensity of the forecast tropical cyclone.

Fig. 3. The 72-h forecast mean track error distribution and accumulated percentage of all forecasts.

Fig. 4. Mean GFDN position bias with respect to best track for each forecast period. Ellipses indicate position errors associated with half of all forecasts, centered about the mean.

Fig. 5. GFDN intensity bias vs forecast period for different verifying intensities.

Fig. 6. Skill relative to CLIPER for GFDN, NOGAPS, and the official JTWC warnings for 1996 WPAC tropical cyclones.

Fig. 7. JTWC skill relative to CLIPER for warnings with and without GFDN guidance.

Fig. 8. Mean difference between NOGAPS and GFDN track errors for the homogeneous forecast set. Error bars indicate the standard deviation of the error difference.

Fig. 9. Mean, maximum, and minimum separation between NOGAPS and GFDN positions using lagged–unlagged forecast pairs.

Fig. 10. Mean track errors for forecasts with verifying intensity of tropical storm strength or greater for GFDN, NOGAPS, and the ensemble mean, GFNG. Numbers along the top are correlation coefficients between the GFNG error and the ensemble spread.

Fig. 11. Mean track errors for GFDN and NOGAPS when two or more systems of tropical storm strength or greater were active in the western North Pacific. Numbers along the top are the number of cases at each forecast period.

Performance of the Navy’s Tropical Cyclone Prediction Model in the Western North Pacific Basin during 1996

M. A. Rennick

Fleet Numerical Meteorology and Oceanography Center, Monterey, California

Abstract

The U.S. Navy’s operational implementation of the hurricane prediction system developed at the National Oceanic and Atmospheric Administration’s Geophysical Fluid Dynamics Laboratory is described, and the performance of the model during the 1996 western North Pacific tropical cyclone season is analyzed.

The model was highly reliable in maintaining and tracking tropical cyclones, retaining 96%, 93%, and 93% of all tropical storms and typhoons at the 24-, 48-, and 72-h forecast periods. Subsequent model improvements raised these percentages to 96%, 93%, and 100%. Overall track errors were 176, 316, and 466 km for the same periods. Errors for tropical storms and typhoons were 75–150 km smaller than those for tropical depressions. The difference generally grew with forecast length. Large track errors were generally associated with a sheared environment, spurious interactions with elevated terrain, or poorly timed recurvature. On average, the model slightly underforecast intensity, but intense systems were significantly underforecast due to the inability of the model to resolve the eyewall.

For the entire season, tropical cyclone track errors are very similar to those of the navy’s global model, the Navy Operational Global Atmospheric Prediction System. However, significant differences are found in individual forecasts. Further study is required to identify environmental features that lead to systematic differences in model performance.

Corresponding author address: Dr. M. A. Rennick, FLENUMMETOCCEN, 7 Grace Hopper Ave., Monterey, CA 93943.

Email: mrennick@fnmoc.navy.mil

1. Introduction

Accurate and timely forecasts of tropical cyclone track and intensity are important in support of U.S. Navy activities in the Pacific basin and throughout the world. The Fleet Numerical Meteorology and Oceanography Center (FNMOC) has traditionally provided the Joint Typhoon Warning Center (JTWC) forecasters with numerical guidance products for their forecasts. In the past, these were primarily based on climatological, statistical, or very simple dynamical models.

In recent years, sophisticated dynamical models of the atmosphere have become important tools for tropical cyclone prediction. Since June 1990, FNMOC has included synthetic observations in the vicinity of tropical cyclones in their Navy Operational Global Atmospheric Prediction System (NOGAPS; Fiorino et al. 1993; Goerss and Jeffries 1994) and has developed an automated tracking system. As these systems have become more refined, JTWC forecasters have come to rely on NOGAPS forecasts as a primary guidance tool.

The hurricane prediction system developed by the National Oceanic and Atmospheric Administration’s Geophysical Fluid Dynamics Laboratory (GFDL) has added sophisticated limited area models to the mix of tropical cyclone forecast guidance products. Starting with the 1995 Atlantic hurricane season, the GFDL model has run operationally at the National Centers for Environmental Prediction (NCEP) to provide guidance for the National Hurricane Center forecasters (Kurihara et al. 1998). In late May 1996, FNMOC began to use the same basic model to provide guidance to JTWC forecasters.

The purpose of this paper is to describe the operational implementation of the GFDL model at FNMOC and to analyze the performance of the model during the summer and fall of 1996 on tropical cyclones originating in the western North Pacific basin (specifically, the region bordered by equator–50°N; 100°E–180°, hereinafter referred to as WPAC). In addition, it compares the performance of the model to the navy’s other principal tropical cyclone track aid, NOGAPS, and to the official JTWC warnings.

2. Model implementation

The navy implementation of the GFDL model, GFDN, is almost identical to that which runs operationally at NCEP. It is a triple-nested movable mesh model, including initialization, forecast, and diagnostic sections. It has been fully documented in a series of papers, notably Kurihara et al. (1990, 1993, 1995, 1998). Some key features include three computational nests with resolutions of 1°, 1/3°, and 1/6°; convective adjustment; surface fluxes; second-order turbulence; infrared and solar radiation; a bulk subsurface layer; and parameterization of surface features by vegetation type.

The model is initialized from a global analysis and an initialization message, specifying the observed structure of the cyclone. The tropical cyclone component is removed from the global analysis, and replaced by a synthetic vortex generated by an axisymmetric version of the forecast model constrained by the structure indicated by the initialization message. An asymmetric (beta advection) component is also added to the synthetic vortex (Kurihara et al. 1993; Kurihara et al. 1995). Boundary conditions are updated periodically from forecast fields generated by the global forecast model.

GFDN is identical to the NCEP version, except that the analysis section has been modified to use NOGAPS global analyses and forecasts as initial and boundary conditions and a message originated by JTWC as the initialization message. Other changes accommodate the FNMOC operational environment and do not affect the model results.

The strategy for running GFDN is closely tied to JTWC procedures. Whenever one or more tropical cyclones are active within their area of responsibility, JTWC issues a tropical cyclone bogus message (TCBOGUS). This message lists all active tropical cyclones, ordered according to JTWC’s current operational priorities. For each cyclone, the TCBOGUS specifies (among other things) the location, the central pressure, the radius and speed of maximum wind, the radius and pressure of the last closed isobar, and (if appropriate) the radii of 35- and 50-kt winds. All radii are measured with respect to the cyclone location. The TCBOGUS is valid at the synoptic hour (0000, 0600, 1200, and 1800 UTC) and is issued at least 30 min prior to the hour. This is the primary source of initialization data with which GFDN constrains its synthetic vortex.

A secondary source of initialization data is the official warning JTWC issues during the first 2.5 h following the synoptic hour. This warning includes the current and forecast (to 72 h) positions of each active tropical cyclone. Since the warning is based on later, more thoroughly analyzed data, GFDN will override information from the TCBOGUS with that from the warning, if it is available.

Starting 22 May 1996, FNMOC ran GFDN at 0600 and 1800 UTC whenever a TCBOGUS valid at that time included a WPAC tropical cyclone. The model used the TCBOGUS information for the highest priority tropical cyclone and the current (0600/1800 UTC) NOGAPS analyses (available ∼1015/2215 UTC) for initialization and the forecast fields from the previous 0000/1200 UTC NOGAPS for boundary conditions. The GFDN forecasts were available to JTWC by about 1130/2330 UTC, in time for their subsequent 1200/0000 UTC warnings. Beginning 11 September 1996, if the TCBOGUS included more than one WPAC cyclone, the second priority message was used in conjunction with the preliminary (based on limited observations) NOGAPS 0600/1800 UTC analyses (available ∼0745/1945 UTC). The resulting GFDN forecasts were available by about 0845/2045 UTC. This arrangement provided JTWC with numerical forecast guidance for two storms. The late execution time of the first priority forecast gave it the benefit of the additional data available for the NOGAPS analysis from which it was initialized.

3. 1996 WPAC season overview

The western North Pacific basin was very active during 1996. JTWC issued a total of 883 warnings on 43 different tropical cyclones, one fewer than the record set in 1964 (JTWC 1997). Table 1 lists the mean forecast position errors at 24, 48, and 72 h for all JTWC WPAC warnings in 1996, and for the 5-yr period 1991–95. The 1996 errors at 48 and 72 h were smaller than those of any other season to date, except for 1994. The variability in tropical cyclone behavior from year to year can lead to misleading interpretations of single season error statistics (Pike and Neumann 1987). A simple correction can be applied to the mean errors by comparing them to those generated by CLIPER (Neumann 1992), a statistical model based on climatology and persistence. JTWC’s skill, as a percentage improvement over CLIPER, is also listed in Table 1. The skill scores suggest that the 1996 forecasts at 48 and 72 h were more skillful than the 5-yr mean. However, as may be expected with only four degrees of freedom, this increase in skill is significant at only the 60% level according to a Student’s t test.
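The skill measure used here, percentage improvement over CLIPER, is commonly computed as the relative reduction in mean track error. The following is a minimal sketch assuming that definition; the function name and the sample errors are illustrative and are not values from Table 1.

```python
def skill_vs_cliper(forecast_error_km, cliper_error_km):
    """Percentage improvement of a forecast over CLIPER.

    Positive values mean the forecast beat CLIPER; zero or negative
    values mean it showed no skill relative to CLIPER.
    """
    if cliper_error_km <= 0:
        raise ValueError("CLIPER error must be positive")
    return 100.0 * (cliper_error_km - forecast_error_km) / cliper_error_km


# Illustrative errors only (not taken from the paper)
print(skill_vs_cliper(300.0, 400.0))  # 25.0
```

Because the score is normalized by the CLIPER error for the same cases, it partly compensates for seasons in which the storms were intrinsically easier or harder to forecast.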

4. Basic GFDN performance

GFDN was initiated 265 times on a total of 31 tropical cyclones during 1996. Two hundred and ten of these were first priority cases, using the NOGAPS 0600/1800 UTC analyses. The remaining 55 were second priority cases and used the preliminary NOGAPS 0600/1800 UTC analyses. No significant variations were found in the performance of the two sets of forecasts; they are all treated together in the following analysis.

a. Detection ability

One of the first requirements of a tropical cyclone model is that it must depict and maintain existing systems, ideally, for as long as they remain active. By definition, GFDN ran only when a tropical disturbance of at least 13 m s⁻¹ was identified by JTWC. The actual distribution of initial intensity, as specified by the TCBOGUS, is indicated in Table 2, as is the number of cases in which the model was able to track the tropical cyclone for at least 12 h. In general, the 27 initial detection failures were associated with weak systems. In particular, 16 were initialized as tropical depressions. The standard of comparison is the best track position. JTWC issues best track positions for any system of at least tropical depression (13 m s⁻¹) intensity that has not made an extratropical transition. For five cases (three tropical depressions and two tropical storms) JTWC did not issue a best track position 12 h after the initial time, indicating that the system actually dissipated and these were not failures at all. The two missing supertyphoon cases and four of the seven remaining tropical storm cases were actually forecast by the model, but lost due to postprocessing problems. Two of the other three initial tropical storm tracking failures verified with a best track of tropical depression intensity.

The detection ability for all forecast periods is shown in Fig. 1. The top (solid) bar indicates the total number of GFDN forecasts attempted. For each forecast period, the upper (horizontal hatch) bar shows the number of forecasts in which the tropical cyclone could be identified, regardless of its intensity. The length of the middle (vertical hatch) bar shows the number of verifying best track positions. The displacement of the bar to the right of the axis indicates the number of GFDN forecasts with an identifiable cyclone for which no verifying best track position was issued. For example, of the 265 total cases, 225 model forecasts were able to track the tropical cyclone for 72 h. JTWC issued a verifying 72-h best track position for 189 of the 265 cases. There were 49 cases (white space to the left of the best track bar) for which no best track was issued, but a cyclone was present in the model. There were 13 cases (overhang of the best track bar with respect to GFDN) for which the best track was issued, but the model failed to depict it. The former were not necessarily false alarms (although some were) because no attempt was made to filter the model forecasts with respect to intensity or extratropical transitions. The latter represent missed forecasts. The lower (diagonal hatch) bar of each triplet is the same as the middle bar, except it counts only those best track positions with tropical storm (18 m s⁻¹) strength or greater. The percentages at the far right of the figure represent the probability of detection, based on best track intensities of at least tropical storm strength.
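The 72-h counts quoted above fit together as simple bookkeeping, which the sketch below makes explicit. The function and variable names are illustrative. Note that the ratio computed here uses best track positions of all intensities, whereas the percentages in Fig. 1 count only positions of tropical storm strength or greater, so the two need not agree exactly.

```python
def detection_counts(model_tracked, model_without_best_track, best_track_missed):
    """Combine the bar lengths described for Fig. 1 into hits and a detection ratio.

    model_tracked            -- forecasts in which the model kept a cyclone
    model_without_best_track -- of those, cases with no verifying best track
    best_track_missed        -- best track positions the model failed to depict
    """
    hits = model_tracked - model_without_best_track
    verifying = hits + best_track_missed
    return hits, verifying, hits / verifying


# 72-h values quoted in the text
hits, verifying, ratio = detection_counts(225, 49, 13)
print(hits, verifying, round(100 * ratio, 1))  # 176 189 93.1
```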

As discussed in detail for the initial tracking ability, there are several reasons why the model failed to track some systems. At early forecast periods, the inability of the model to maintain or intensify a weak system is the dominant factor. At later times, false weakening accounts for a few cases, but a majority of lost forecasts were due to problems encountered with Supertyphoon Dale (36W) and Tropical Storm Ernie (37W) in mid-November. At this time, a strong midlatitude jet developed in the northern part of the model domain. An inconsistency in the outflow boundary condition caused the model to fail, thereby losing all GFDN forecasts for 3 days. When these cases were rerun with corrected source code supplied by GFDL, all tracked successfully, raising the probability of detection for 72-h forecasts to 100%.

b. Track errors

Overall track errors for all GFDN forecasts were 121, 176, 241, 316, and 466 km at 12, 24, 36, 48, and 72 h, respectively. These errors are shown in Fig. 2a, stratified by intensity, as forecast by the model. The number of forecasts of each intensity is shown in Fig. 2b. While there is not much difference between the errors demonstrated for forecast tropical storms and typhoons, the errors for tropical depressions are 75–150 km larger than for stronger systems at all forecast periods. The separation between tropical depressions and more intense systems is even more striking when they are stratified with respect to verifying intensity (not shown), rather than forecast intensity. The model forecast intensity is used here, because it is known by the operational forecaster at the time the official forecast is issued. It can be used to assign a measure of confidence to the track forecast. In this regard, it is gratifying to note that the model track forecasts become more reliable as the intensity of the tropical cyclone (and usually the urgency of the forecast) increases.

It is important to know the distribution of model error, as well as its mean value. For the 72-h track errors, this distribution is shown in Fig. 3. Clearly, this is not a normal distribution. While the overall average error is 466 km, the median error is only 370 km. Ten forecasts had very large errors (>1200 km), while about one-third of all forecasts had track errors less than 300 km. This value is significant because a 72-h mean track error of 280 km was identified by Abbey et al. (1995) as a practical limit of forecast skill in WPAC.

There are, of course, several sources for the observed model track errors. Most of the worst errors (>1200 km) were associated with one of two tropical cyclones: Beth (32W) and Fern (42W). After crossing the northern tip of the Philippines, the model representation of Beth moved northwestward, making landfall along the southern Chinese coast, west of Hong Kong. In fact, Beth executed a slow turn to the left, eventually turning southwestward and dissipating over Vietnam. The model Fern propagated rapidly toward the northeast when it should have proceeded slowly eastward and then made a right turn to the west. It is believed that these problems were due to deficiencies of the model in a sheared environment.

Other significant model errors were caused by the failure of the model to properly track a tropical cyclone as it crossed terrain, and failure to properly time recurvature. Particularly at 12 and 24 h, some model errors were associated with large errors in the initial fix position.

The model track was systematically biased toward the left of and behind the best track position. This bias is shown in Fig. 4. The left bias is largely (but not entirely) due to a failure of the model to adequately predict recurvature. The ellipses in the figure indicate the range of cross- and alongtrack errors required to account for half of all forecasts at each forecast period. Although the distribution of the error was such that these ellipses are rather large (460 km in the alongtrack direction for 72-h forecasts), the bias toward the left rear was significant according to a χ² test at better than the 98% level for all except the 12-h forecasts.

c. Intensity forecasts

One of the potential advantages of GFDN over tropical cyclone guidance derived from global models such as NOGAPS is that it provides a forecast of intensity, as well as position. The resolution of the innermost nest (1/6°) allows the model to provide useful estimates of tropical cyclone intensity, at least for moderate systems.

Although there is significant uncertainty in many of the best track intensity estimates (Guard 1995), the GFDN intensity bias with respect to the best track was computed and is shown in Fig. 5. Two points are immediately apparent. First, for any given verifying intensity, there is little variation in model bias with respect to forecast period. Second, there is a strong stratification of intensity bias with respect to verifying intensity. The significant underforecasting of strong tropical cyclones is largely due to the inability of the model to resolve the eyewall, resulting in weaker than observed pressure gradients and maximum wind speeds. The sign of the intensity error was very consistent from case to case. The mean absolute error (not shown) was within 1 m s⁻¹ of the absolute value of the bias for tropical cyclones of typhoon strength or greater, at all forecast periods. For weaker systems, it was within 3 m s⁻¹.

5. Comparisons to other forecasts

An overall view of GFDN’s performance compared to NOGAPS and JTWC official warnings is given by Fig. 6. Since GFDN and NOGAPS were each run twice per day at different synoptic times and JTWC warnings were issued four times per day, these statistics do not represent a homogeneous set of forecasts. Therefore, the forecast improvement over the “no skill” climate-persistence (CLIPER) forecast is shown, rather than the actual errors. The skill scores were averaged over all forecasts made by each source for the 31 tropical cyclones forecast by GFDN. The JTWC forecasts were skillful at all forecast periods, whereas the two dynamical models were skillful only for forecast periods of 24 h and longer. Especially at the later forecast periods (48, 72 h) the skill of all three forecasts is quite similar.

a. JTWC official warnings

JTWC forecasters synthesize guidance from a large number of numerical forecast and observational aids in order to generate their official warnings. Therefore, it is difficult to assess the impact of any given aid, such as GFDN, on seasonal forecast statistics. The subjective judgment of the forecasters is that GFDN made a “significant contribution” to their long-term trend in improving their error statistics (JTWC 1997). Another indication of the impact of GFDN guidance is given by Fig. 7, which shows the relative skill of the JTWC forecasts (as percent improvement over CLIPER) for cases for which GFDN forecasts were or were not available and the number of forecasts in each category. JTWC’s improved skill when GFDN guidance was available may be due to any of several causes, including a tendency to run the model on relatively well-defined (less error prone) tropical cyclones. However, it is reasonable to assume that it is at least partially a reflection of the value of the GFDN track guidance.

b. NOGAPS

During the 2 yr prior to the implementation of GFDN, the most reliable track guidance came from NOGAPS. Preliminary tests of GFDN during July–November 1995 suggested that GFDN guidance was at least as good as NOGAPS and perhaps better. Statistics for the 1996 season (Fig. 6) also support this conclusion.

It is difficult to compare NOGAPS and GFDN errors directly, because the two models were never run at the same time. NOGAPS was always run at 0000 or 1200 UTC whereas GFDN was run at 0600 or 1800 UTC. This arrangement ensured that JTWC would usually have 6-h-old guidance from one of the two models at each of their warning times, but it complicates model comparison.

Skill scores relative to CLIPER were computed for GFDN and NOGAPS forecasts in order to compare the two models in the absence of a homogeneous set of forecasts. As shown in Fig. 6, the two models had similar skill scores beyond 24 h. GFDN had somewhat greater skill at 24–48 h, while NOGAPS had a slightly higher skill at 72 h. If only systems of tropical storm strength or greater are considered, the skill of GFDN forecasts increases by about five percentage points, while that of NOGAPS remains unchanged except at 72 h when it decreases by almost five points.

In order to address the question of whether there is independently useful information in the two model forecasts, or if the track errors are virtually identical on an individual as well as an average basis, a series of lagged forecasts was employed. Following each model run, a lagged forecast was made using the 18–78-h tropical cyclone positions and labeled as though it had been made 6 h later than the actual model run time. These lagged forecasts were made for both NOGAPS and GFDN, starting in late July, although the lagged GFDN forecast tracks ended at hour 48 until mid-October.

A homogeneous set of NOGAPS and GFDN forecasts was constructed from the lagged and unlagged forecasts such that at 0000/1200 UTC, an unlagged NOGAPS forecast was paired with a 6-h lagged GFDN forecast, while at 0600/1800 UTC, a 6-h lagged NOGAPS forecast was paired with an unlagged GFDN forecast. Table 3 indicates the number of forecasts for which each model had the smaller track error at each forecast period. As would be expected, the errors for the unlagged model are generally smaller than those for the lagged model. However, especially at the longer forecast periods, there are a significant number of cases (according to a χ² test) for which the lagged model has smaller errors than the unlagged model.
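The lagging and pairing procedure can be sketched as follows. This is an illustration under stated assumptions, not the operational code: tracks are represented as dictionaries mapping forecast hour to a (latitude, longitude) pair, the function names are invented for the example, and the positions are hypothetical.

```python
from datetime import datetime, timedelta

LAG_HOURS = 6


def make_lagged(init_time, track):
    """Relabel the 18-78-h positions of a run as a forecast made 6 h later.

    track maps forecast hour -> (lat, lon). The 18-h position becomes the
    12-h position of the lagged forecast, and so on out to 72 h.
    """
    lagged_track = {
        lead - LAG_HOURS: pos for lead, pos in track.items() if 18 <= lead <= 78
    }
    return init_time + timedelta(hours=LAG_HOURS), lagged_track


def pair_forecasts(nogaps_runs, gfdn_runs):
    """Pair each unlagged run with the other model's lagged run at the same
    nominal time. Returns (nominal_time, nogaps_track, gfdn_track) tuples."""
    lagged_gfdn = dict(make_lagged(t, trk) for t, trk in gfdn_runs.items())
    lagged_nogaps = dict(make_lagged(t, trk) for t, trk in nogaps_runs.items())
    pairs = []
    for t, trk in nogaps_runs.items():   # 0000/1200 UTC: NOGAPS unlagged
        if t in lagged_gfdn:
            pairs.append((t, trk, lagged_gfdn[t]))
    for t, trk in gfdn_runs.items():     # 0600/1800 UTC: GFDN unlagged
        if t in lagged_nogaps:
            pairs.append((t, lagged_nogaps[t], trk))
    return sorted(pairs, key=lambda p: p[0])


# Hypothetical positions, for illustration only
gfdn = {datetime(1996, 8, 1, 6): {18: (15.0, 135.0), 30: (16.0, 133.5), 78: (20.0, 128.0)}}
nogaps = {datetime(1996, 8, 1, 12): {12: (15.2, 134.8), 24: (16.1, 133.0), 72: (21.0, 127.0)}}
pairs = pair_forecasts(nogaps, gfdn)
print(len(pairs), sorted(pairs[0][2]))  # 1 [12, 24, 72]
```

Each pair then contains one unlagged and one lagged member valid at the same times, which is what allows the track errors of the two models to be compared case by case.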

The average difference between the NOGAPS and GFDN track errors is shown in Fig. 8 for each forecast period. The error bars in the figure represent the standard deviation of the difference between the two models. Through hour 48 there were an approximately equal number of forecasts made at 0000/1200 and 0600/1800 UTC, and it is not surprising that the mean error difference is small. If the two models performed comparably on a case by case basis, then the standard deviation of the difference between the two model errors would also be small, on the order of the mean degradation of either forecast over a 6-h lag period (40 km). Instead, it ranges from 85 km at 12 h to over 250 km at 48 h. This indicates that the difference in model errors is primarily due to model differences, rather than the comparison of a lagged and unlagged forecast. At 72 h the unlagged GFDN forecasts outnumbered the unlagged NOGAPS forecasts by almost four to one. The mean error difference (55 km) is only slightly greater than would be expected from the lagged–unlagged error difference, alone. The standard deviation of the difference (over 400 km) is nearly as large as the error in either forecast, indicating that even with this skewed sample, there is great variability between the performance of the two models.

In an attempt to evaluate the independence of the two model forecasts at any given time, correlation coefficients between the NOGAPS and GFDN mean track errors were calculated. Separate calculations were made for the 0000/1200 UTC and 0600/1800 UTC forecasts at each forecast period. By treating the two analysis times separately, the inherent superiority of the unlagged member of the forecast pair is negated, and the relative performance of the two models is compared. The correlation coefficients, always somewhat greater for the 0000/1200 UTC forecasts than for the 0600/1800 UTC forecasts, range from a maximum of 0.54 at 12 h to a minimum of 0.12 at 72 h. These correlations are significant at greater than the 98% level for all forecast periods up to 48 h. However, the proportion of variability of one model’s error that is explained by a linear relationship to that of the other model ranges from about 30% at hour 12 to less than 5% at hour 72. Thus, while the two models may not be considered independent (except at hour 72), their variability cannot be explained by their dependence, alone. A remaining challenge is to identify those other features, whether in the environment or the tropical cyclone vortex structure, that account for the remaining variability between the two model forecasts.
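The step from correlation coefficient to explained variability is the usual squaring of r. A short check using the two extreme coefficients quoted above (the function name is illustrative):

```python
def explained_variance_pct(r):
    """Percentage of one variable's variance explained by a linear
    relationship with the other, given correlation coefficient r."""
    return 100.0 * r * r


for lead_h, r in [(12, 0.54), (72, 0.12)]:
    print(f"{lead_h} h: r = {r:.2f}, explained = {explained_variance_pct(r):.1f}%")
# 12 h: r = 0.54, explained = 29.2%
# 72 h: r = 0.12, explained = 1.4%
```

These values are consistent with the "about 30%" and "less than 5%" figures given in the text.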

As might be expected from the large standard deviation and weak correlation between forecast errors, the separation between corresponding forecast positions can be quite large. The mean, maximum, and minimum separations are shown in Fig. 9. For some forecasts, the tracks were nearly identical, as shown by the minimum separation curve. Other forecasts were highly divergent, as evidenced by the maximum separation of almost 2200 km at 72 h. The mean separation of about 560 km at 72 h is greater than the mean unlagged error of either model for a pure 72-h forecast.

A two-member ensemble was created, and a mean track was obtained by averaging the tropical cyclone positions from the two forecasts at each forecast period. The mean track errors for all such ensemble mean forecasts (labeled GFNG) are shown in Fig. 10. At all forecast periods, the ensemble mean errors are smaller than those of either individual model, but the differences are quite small. The numbers along the top of the figure are the correlation coefficients between the ensemble mean error and the separation between the positions of the two ensemble members. Each of these coefficients is significant at the 99% level, indicating that the composite track is most reliable when the two original tracks are in relatively good agreement.
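
The ensemble mean position, the member separation, and the ensemble mean error can be sketched as follows. This is a sketch under stated assumptions, not the procedure used in the study: positions are averaged arithmetically in latitude and longitude, distances are haversine great-circle distances on a sphere of radius 6371 km, and all positions shown are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def ensemble_mean_position(pos_a, pos_b):
    """Arithmetic mean of two (lat, lon) positions.

    Assumption: both longitudes use the same convention and the pair does not
    straddle the date line, where a plain average would be wrong.
    """
    return ((pos_a[0] + pos_b[0]) / 2, (pos_a[1] + pos_b[1]) / 2)

# Hypothetical 72-h positions (deg N, deg E); illustrative only.
gfdn_pos = (20.0, 130.0)
nogaps_pos = (22.0, 134.0)
best_track = (21.5, 131.0)

gfng_pos = ensemble_mean_position(gfdn_pos, nogaps_pos)
separation_km = great_circle_km(gfdn_pos, nogaps_pos)
gfng_error_km = great_circle_km(gfng_pos, best_track)
print(gfng_pos, round(separation_km), round(gfng_error_km))
```

Collecting `separation_km` and `gfng_error_km` over many forecasts would give the two series whose correlation is reported along the top of Fig. 10.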

A further basis for comparing GFDN and NOGAPS forecasts is their performance in multiple-storm environments. The ability of GFDN to forecast multiple storms has been questioned (OFCM 1997) because its initialization and inner mesh can follow only one cyclone at a time. NOGAPS, on the other hand, initializes and forecasts all existing systems at once. Therefore, NOGAPS might be expected to perform better when two or more systems are close enough together to interact, either directly or indirectly.

In order to test this hypothesis, the number of active tropical cyclones of tropical storm strength or greater was determined for each model verifying time. The mean track errors for GFDN and NOGAPS when two or more systems were active simultaneously and separated by no more than 3000 km are shown in Fig. 11. The number of cases at each forecast period is shown along the top of the figure. Contrary to what may have been expected, GFDN errors were smaller than NOGAPS errors in multiple-storm cases. This may be due to the ability of GFDN to maintain a sufficient signal from the “other” cyclones in the outer mesh region. It may also reflect a known tendency of NOGAPS to generate false interactions between nearby cyclones (L. E. Carr 1997, personal communication). Results are qualitatively similar if the analysis is restricted to systems separated by less than 1500 km, although the sample size is smaller. In any case, further study is required to understand the impact of interactions among tropical cyclones on GFDN forecasts.
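
The case selection described above can be sketched as a simple distance filter. The sketch assumes haversine great-circle distances on a sphere of radius 6371 km and hypothetical storm positions; the text does not say how separation was measured, and the function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def is_multiple_storm_case(storm, others, max_km=3000.0):
    """True if any other active cyclone lies within max_km of the given storm."""
    return any(great_circle_km(storm, other) <= max_km for other in others)

# Hypothetical positions (deg N, deg E) of cyclones active at one verifying time.
storm = (15.0, 130.0)
nearby = (15.0, 140.0)   # roughly 1100 km away
distant = (15.0, 180.0)  # more than 5000 km away

print(is_multiple_storm_case(storm, [nearby]))   # True
print(is_multiple_storm_case(storm, [distant]))  # False
```

Setting `max_km=1500.0` gives the more restrictive selection mentioned at the end of the paragraph.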

6. Summary

GFDN has been shown to provide high-quality tropical cyclone track guidance in the western North Pacific basin. It has excellent detection capability, especially for cyclones of tropical storm strength or greater. Mean track error decreases with cyclone intensity, while the forecast intensity error increases. Due to its perceived positive impact on the official warnings, GFDN is now being run in all basins within JTWC’s area of responsibility.

Some systematic problems with the model have been identified. These include the occasional inability to track an initial tropical depression, poor tracking performance over elevated terrain, erratic timing of recurvature, and unreliable performance in a sheared environment. Model modifications that address the first two problems have been put in place. Kurihara et al. (1998) describe a number of model improvements currently under development at GFDL to address the others.

GFDN and NOGAPS track forecasts were shown to be quite similar in the mean, but significantly different in individual forecasts. Systematic studies along the lines suggested by Carr and Elsberry (1994) or Aberson (1997) are required in order to identify features in the environment and/or tropical cyclone vortex that account for this behavior. A two-member ensemble comprising GFDN and NOGAPS forecasts was constructed using quite similar initial environmental fields (NOGAPS analyses) but different vortex specifications and forecast model formulations (GFDN and NOGAPS). The ensemble mean performed only slightly better than the individual members, but there was a significant correlation between ensemble mean track error and track separation.

Acknowledgments

The author thanks Y. Kurihara, M. Bender, and R. Tuleya of GFDL for providing the model source code, and for subsequent helpful discussions and correspondence.

REFERENCES

  • Abbey, R. F., Jr., L. M. Leslie, and G. J. Holland, 1995: Estimates of the inherent and practical limits of mean forecast errors of tropical cyclones. Preprints, 21st Conf. on Hurricanes and Tropical Meteorology, Miami, FL, Amer. Meteor. Soc., 201–203.

  • Aberson, S. D., 1997: The prediction of the performance of a nested barotropic hurricane track forecast model. Wea. Forecasting, 12, 24–30.

  • Carr, L. E., and R. L. Elsberry, 1994: Systematic and integrated approach to tropical cyclone track forecasting. Part I. Approach overview and description of meteorological basins. Tech. Rep. NPS-MR-94-002, Naval Postgraduate School, Monterey, CA, 273 pp. [Available from Naval Postgraduate School, Monterey, CA 93943-5114.].

  • Fiorino, M., J. Goerss, J. Jensen, and E. Harrison, 1993: An evaluation of the real-time tropical cyclone forecast skill of the Navy Operational Global Atmospheric Prediction System in the western North Pacific. Wea. Forecasting, 8, 3–24.

  • Goerss, J., and R. Jeffries, 1994: Assimilation of synthetic tropical cyclone observations into the Navy Operational Global Atmospheric Prediction System. Wea. Forecasting, 9, 557–576.

  • Guard, C. P., 1995: Wind-pressure relationships to determine hurricane intensity. Minutes, 49th Interdepartmental Hurricane Conf., Silver Spring, MD, OFCM.

  • JTWC, 1997: Tropical cyclone summary 1996. U.S. Naval Pacific Meteorology and Oceanography Center West/Joint Typhoon Warning Center, 116 pp. [Available from JTWC, FPO AP 96536-0051, Guam.].

  • Kurihara, Y., M. A. Bender, R. E. Tuleya, and R. J. Ross, 1990: Prediction experiments of Hurricane Gloria (1985) using a multiply nested movable mesh model. Mon. Wea. Rev., 118, 2186–2198.

  • ——, ——, and R. J. Ross, 1993: An initialization scheme of hurricane models by vortex specification. Mon. Wea. Rev., 121, 2030–2045.

  • ——, ——, R. E. Tuleya, and R. J. Ross, 1995: Improvements in the GFDL Hurricane Prediction System. Mon. Wea. Rev., 123, 2791–2801.

  • ——, R. E. Tuleya, and M. A. Bender, 1998: The GFDL Hurricane Prediction System and its performance in the 1995 hurricane season. Mon. Wea. Rev., 126, 1306–1322.

  • Neuman, C., 1992: Final report, Joint Typhoon Warning Center (JTWC92) Model. SAIC Contract Rep. N 00014-90-C-6042 (Part 2), 83 pp. [Available from SAIC, 550 Camino El Estero, #205, Monterey, CA 93940.].

  • OFCM, 1997: National Plan for Tropical Cyclone Research and Reconnaissance (1997–2002), FCM-P25-1997, Washington, DC, 137 pp. [Available from Office of the Federal Coordinator for Meteorological Services and Supporting Research, 8455 Colesville Road, Suite 1500, Silver Spring, MD 20910.].

  • Pike, A. C., and C. J. Neumann, 1987: The variation of track forecast difficulty among tropical cyclone basins. Wea. Forecasting, 2, 237–241.


Fig. 1.

GFDN detection ability at different forecast periods. Percentages on the right are probability of detection for tropical cyclones with verifying best track greater than tropical depression intensity. See text for a full explanation of bar positions.

Citation: Weather and Forecasting 14, 3; 10.1175/1520-0434(1999)014<0297:POTNST>2.0.CO;2

Fig. 2.

(a) Mean track errors for GFDN forecasts. (b) Number of cases. Errors are stratified according to the intensity of the forecast tropical cyclone.

Fig. 3.

The 72-h forecast mean track error distribution and accumulated percentage of all forecasts.

Fig. 4.

Mean GFDN position bias with respect to best track for each forecast period. Ellipses indicate position errors associated with half of all forecasts, centered about the mean.

Fig. 5.

GFDN intensity bias vs forecast period for different verifying intensities.

Fig. 6.

Skill relative to CLIPER for GFDN, NOGAPS, and the official JTWC warnings for 1996 WPAC tropical cyclones.

Fig. 7.

JTWC skill relative to CLIPER for warnings with and without GFDN guidance.

Fig. 8.

Mean difference between NOGAPS and GFDN track errors for the homogeneous forecast set. Error bars indicate the standard deviation of the error difference.

Fig. 9.

Mean, maximum, and minimum separation between NOGAPS and GFDN positions using lagged–unlagged forecast pairs.

Fig. 10.

Mean track errors for forecasts with verifying intensity of tropical storm strength or greater for GFDN, NOGAPS, and the ensemble mean, GFNG. Numbers along the top are correlation coefficients between the GFNG error and the ensemble spread.

Fig. 11.

Mean track errors for GFDN and NOGAPS when two or more systems of tropical storm strength or greater were active in the western North Pacific. Numbers along the top are the number of cases at each forecast period.

Table 1.

JTWC mean track errors and skill scores relative to CLIPER for 24-, 48-, and 72-h forecasts.

Table 2.

Number of GFDN forecasts initiated for each level of tropical cyclone intensity.

Table 3.

Frequency of superior performance at each forecast period. Cases are separated into those made from 0000/1200 UTC (unlagged NOGAPS) and 0600/1800 UTC (unlagged GFDN) analyses.