## 1. Introduction

In 1997, the Japan Meteorological Agency (JMA) started to provide public users with 3-day track forecasts of tropical cyclones (TCs) in the western North Pacific Ocean and South China Sea based on numerical weather prediction [NWP; Regional Specialized Meteorological Center (RSMC) Tokyo-Typhoon Center 1997]. Since then, the remarkable progress of the JMA NWP system has brought considerable improvement in the track forecasts. According to the verification of the JMA global NWP system, the 3-yr running mean of position errors of 5-day predictions in 2007 (451 km, the average of 2005, 2006, and 2007) is smaller than that of 3-day predictions in 1997 (472 km, the average of 1995, 1996, and 1997), indicating that we have gained 2 days of lead time in deterministic TC track predictions over the past 11 yr (see Fig. 1).

While the accuracy of TC track forecasts has improved significantly, ensemble techniques have been attracting much attention because they are expected to improve deterministic TC track forecasts and also provide the uncertainty information (WMO 2008a), based on ensemble mean and ensemble spread, respectively (e.g., Jeffries and Fukada 2002; Vijaya Kumar and Krishnamurti 2003; Sampson et al. 2006; Goerss 2007). For the western North Pacific basin, Goerss et al. (2004) have shown that the consensus of three models, the Navy Operational Global Atmospheric Prediction System (NOGAPS; Hogan and Rosmond 1991; Goerss and Jeffries 1994), the Met Office (UKMO) global model (Cullen 1993; Heming et al. 1995), and the JMA typhoon model (TYM; Kuma 1996), has reduced forecast errors with respect to the best of the individual models. For 2002, for example, the improvement rate of the consensus with respect to the best of the individual models is 17%, 14%, and 11% at 24, 48, and 72 h, respectively. Goerss (2000) has also shown the effectiveness of consensus forecasts for both the western North Pacific and the Atlantic basin: using the JMA global model (GSM; Kuma 1996), NOGAPS, and UKMO for the western North Pacific basin; and the Geophysical Fluid Dynamics Laboratory Hurricane Prediction System (GFDL; Kurihara et al. 1993, 1995, 1998), NOGAPS, and UKMO for the Atlantic basin. For the western North Pacific during 1997, the improvement rate with respect to the best of the individual models is 16%, 13%, and 12% at 24, 48, and 72 h, and for the 1995–96 Atlantic hurricane seasons, the improvement rate is 16%, 20%, and 23% at 24, 48, and 72 h, respectively. Furthermore, Komori et al. (2007) have verified the ensemble mean track of three global models from operational NWP centers, the European Centre for Medium-Range Weather Forecasts (ECMWF), JMA, and UKMO. 
In the verification of 48- and 96-h predictions for the western North Pacific basin from 1991 to 2005, and for the Atlantic basin from 1999 to 2005, the ensemble mean outperforms the best of the individual models, with few exceptional years. As for the evaluation of the ensemble spread of TC track predictions, Goerss (2000) and Elsberry and Carr (2000) have demonstrated the benefit of the ensemble spread as a measure of confidence in ensemble TC predictions: a small spread of tracks is often an indication of a small error of the ensemble mean track prediction. These arguments form the rationale for the development of the ensemble prediction system (EPS).

Recently, JMA has developed a new Typhoon EPS (TEPS). Unlike the multimodel EPS mentioned above, TEPS is based on a single NWP model with perturbed initial conditions. To evaluate whether the ensemble mean and ensemble spread of TEPS have characteristics similar to those of the multimodel EPS used in previous studies, we conducted a series of quasi-operational runs of TEPS from May to December 2007. This paper describes the specifications of TEPS and its verification results during the quasi-operational period: first, the skill of ensemble mean track predictions is verified; and second, the effectiveness of confidence information on track predictions using the ensemble spread is verified. Moreover, the verification of confidence information is performed by dividing the ensemble spread into along-track and cross-track directions since the prediction of uncertainty in timing and location of TC impacts is essential (Aberson 2001). Section 2 describes the specifications of TEPS: the NWP system and the method to create initial perturbations. Section 3 describes the performance of TEPS, and section 4 discusses the performance. The summary and conclusions are presented in section 5.

## 2. Specifications of TEPS

### a. Overview of TEPS

In February 2008, JMA began operating TEPS (Fig. 2). TEPS uses a lower-resolution version of the JMA/GSM. The model resolution of TEPS is TL319L60, that is, the spectral triangular truncation TL319 (the “L” indicating the linear grid option) and 60 vertical levels, while that of JMA/GSM is TL959L60 (Iwamura and Kitagawa 2008). As TEPS does not have its own data assimilation system, a global analysis for JMA/GSM, which is based on a four-dimensional variational data assimilation system (4DVAR; Kadowaki 2005; JMA 2007), is used as the initial condition of the nonperturbed run of TEPS after interpolating it from TL959L60 to TL319L60.

Forecasters on duty at JMA initiate the operation of TEPS when the following two conditions are satisfied:

- A TC is located in the RSMC Tokyo-Typhoon Center’s responsibility area of 0°–60°N, 100°E–180°, or is expected to move into the area within 24 h.
- The TC’s maximum sustained wind speed exceeds 34 kt or is expected to reach the threshold within 24 h (in that sense, the name of TEPS might not be appropriate because we call TCs with a maximum sustained wind of 64 kt or more “typhoon”).
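The two activation conditions can be sketched as a small predicate. This is only an illustration, not JMA's operational logic: the `TCState` container, the function names, the use of a single 24-h forecast state to stand for "expected within 24 h", and the inclusive comparisons at the boundaries are all assumptions.

```python
from dataclasses import dataclass

# Values quoted in the text; inclusive boundary comparisons are an assumption.
AREA_LAT = (0.0, 60.0)      # degrees north
AREA_LON = (100.0, 180.0)   # degrees east
WIND_THRESHOLD_KT = 34.0


@dataclass
class TCState:
    """Hypothetical container for an analysed or 24-h forecast TC state."""
    lat: float            # degrees north
    lon: float            # degrees east (0-360)
    max_wind_kt: float    # maximum sustained wind speed


def in_area(tc: TCState) -> bool:
    """True when the TC centre lies inside the responsibility area."""
    return (AREA_LAT[0] <= tc.lat <= AREA_LAT[1]
            and AREA_LON[0] <= tc.lon <= AREA_LON[1])


def should_run_teps(now: TCState, fcst_24h: TCState) -> bool:
    """Both conditions must hold, each for the current or the 24-h forecast state."""
    area_ok = in_area(now) or in_area(fcst_24h)
    wind_ok = (now.max_wind_kt >= WIND_THRESHOLD_KT
               or fcst_24h.max_wind_kt >= WIND_THRESHOLD_KT)
    return area_ok and wind_ok
```

A TC that is currently below 34 kt but forecast to intensify inside the area therefore satisfies both conditions, whereas a strong TC that stays east of 180° does not.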

TEPS runs 4 times a day, initiated at 0000, 0600, 1200, and 1800 UTC with a prediction range of 132 h. The ensemble size is 11. A singular vector (SV) method (Buizza 1994; Molteni et al. 1996; Puri et al. 2001) is employed to create initial perturbations (see section 2b for more details). A method dealing with the uncertainty of an NWP model itself such as a stochastic physics method (Buizza et al. 1999) is yet to be considered.

### b. Singular vector method

The growth rate of a perturbation $\mathbf{x}$ is defined as shown in Eq. (1):

$$\lambda = \frac{\|\mathbf{x}(t=t_a)\|}{\|\mathbf{x}(t=t_0)\|}, \tag{1}$$

where $\mathbf{x}(t=t_0)$ is a perturbation at a base time $t_0$, $\mathbf{x}(t=t_a)$ is one at an optimization time $t_a$ ($t_a > t_0$), and $\|\cdot\|$ denotes the norm associated with the Euclidean inner product. The growth rate of a perturbation given by Eq. (1) changes into Eq. (2), using a tangent forward propagator $\mathsf{M}$:

$$\lambda^2 = \frac{[\mathsf{T}\mathsf{M}\mathbf{x},\ \mathsf{E}_f\mathsf{T}\mathsf{M}\mathbf{x}]}{[\mathbf{x},\ \mathsf{E}_i\mathbf{x}]}, \tag{2}$$

where $\mathsf{E}_i$ and $\mathsf{E}_f$ are norm operators at $t_0$ and $t_a$, respectively. The local projection operator $\mathsf{T}$ sets all elements of a vector to 0 outside a prescribed domain, which enables the calculation of perturbations with maximum amplitude at $t_a$ over the targeted area. Here $[\,,]$ denotes the Euclidean inner product. The growth rate equation is further transformed from Eq. (2) into Eqs. (3) and (4).

In TEPS, $\mathsf{M}$ and $\mathsf{M}^{*}$ are the tangent-linear and adjoint models used for the 4DVAR, which has been in operation since February 2005 (Kadowaki 2005; JMA 2007). While their resolutions are T159L40 for the 4DVAR as of February 2009, TEPS uses the lower-resolution version, T63L40. They consist of dynamics based on Eulerian integrations and physical processes containing representations of vertical diffusion, gravity wave drag, large-scale condensation, longwave radiation, and deep cumulus convection. The SVs based on the tangent-linear and adjoint models including the full physical processes (the simplified physical processes without moist processes) are called moist (dry) SVs. In TEPS, dry SVs are calculated targeting a midlatitude area in the RSMC Tokyo-Typhoon Center’s responsibility area, aiming to identify the dynamically most unstable mode of the atmosphere like the baroclinic mode (Buizza and Palmer 1995). Moist SVs are calculated targeting TC surroundings where moist processes are crucial (Barkmeijer et al. 2001).
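As an illustration of what an SV calculation does, the following toy sketch maximizes a ratio of the form $[\mathsf{T}\mathsf{M}\mathbf{x}, \mathsf{E}_f\mathsf{T}\mathsf{M}\mathbf{x}]/[\mathbf{x}, \mathsf{E}_i\mathbf{x}]$ with a dense singular value decomposition. Everything here is an assumption made for the example: the 6-element state, the random matrix standing in for the propagator, and the diagonal norm and projection operators. An operational system works with the tangent-linear and adjoint models and an iterative eigensolver rather than an explicit matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                    # toy state dimension (assumption)

M = rng.normal(size=(n, n))              # stand-in for the tangent-linear propagator
e_i = rng.uniform(0.5, 2.0, size=n)      # diagonal of the initial norm operator E_i
e_f = rng.uniform(0.5, 2.0, size=n)      # diagonal of the final norm operator E_f
t = np.array([1, 1, 1, 0, 0, 0], float)  # diagonal of T: 1 inside the target domain


def growth_sq(x):
    """Squared growth rate [TMx, E_f TMx] / [x, E_i x] for diagonal E_i, E_f, T."""
    y = t * (M @ x)
    return (y @ (e_f * y)) / (x @ (e_i * x))


# Scaled propagator K = E_f^{1/2} T M E_i^{-1/2}; its right singular vectors
# are the optimal perturbations in the scaled variable x_hat = E_i^{1/2} x.
K = (np.sqrt(e_f) * t)[:, None] * M / np.sqrt(e_i)[None, :]
_, sigma, vt = np.linalg.svd(K)

# Map back to physical space: x = E_i^{-1/2} x_hat. Row k is the k-th SV.
svs = vt / np.sqrt(e_i)[None, :]
leading = svs[0]
```

The squared growth rate of `leading` equals the square of the largest singular value, and no other perturbation grows faster in the chosen norms. Because `t` keeps only three elements, at most three singular values are nonzero here.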

TEPS can target up to three TCs at the same time. If more than three TCs exist, three TCs are selected in order of concern to the RSMC Tokyo-Typhoon Center. The targeted area of dry SV calculations is fixed at 20°–60°N, 100°E–180°, and that of moist SV calculations is a rectangle, 10° in latitude and 20° in longitude, centered on the TC central position of the 24-h forecast. The evaluation time interval is 24 h for both dry and moist SVs.
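The target-area rules above can be written down in a few lines. This is a sketch under assumptions: the function and constant names are hypothetical, the rectangle is read as a total extent of 10° by 20° (half-widths of 5° and 10°), and the priority ordering is taken as given by the caller.

```python
# Fixed dry-SV target area quoted in the text.
DRY_TARGET = {"lat": (20.0, 60.0), "lon": (100.0, 180.0)}
MAX_TARGETED_TCS = 3


def moist_target(center_lat, center_lon):
    """Rectangle of 10 deg latitude x 20 deg longitude centred on the 24-h forecast position."""
    return {"lat": (center_lat - 5.0, center_lat + 5.0),
            "lon": (center_lon - 10.0, center_lon + 10.0)}


def target_areas(tcs_by_priority):
    """tcs_by_priority: list of (lat, lon) 24-h forecast positions, most important first."""
    selected = tcs_by_priority[:MAX_TARGETED_TCS]
    return [DRY_TARGET] + [moist_target(lat, lon) for lat, lon in selected]
```

With four TCs present, only the first three receive a moist target area, giving one dry and three moist areas in total.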

The terms $\zeta_x$, $D_x$, $T_x$, $q_x$, and $P_x$ are the vorticity, divergence, temperature, specific humidity, and surface pressure components, respectively, of a vector $\mathbf{x}$. The temperature lapse rate $\Gamma$ is taken into consideration in an available potential energy term (Lorenz 1955). Here $g$ is the acceleration of gravity, $c_p$ is the specific heat of dry air at constant pressure, $L_c$ is the latent heat of condensation, and $R_d$ is the gas constant for dry air. Here $p(\eta)$ is the pressure at eta levels. $T_r = 300$ K is a reference temperature, $P_r = 800$ hPa is a reference pressure, and $w_q$ is a constant ($w_q = 1$ in this study). For $\Gamma$, its representative value, $\tfrac{2}{3}\Gamma_d$, is used, and then an available potential energy term is written as $3c_p T_x T_x / T_r$.

In Eq. (6), the vertical integration of the kinetic energy term and the available potential energy term is performed with an initial and final norm set to zero above 100 hPa (model level 26). Similarly, the vertical integration of the specific humidity term is performed with both norms set to zero above 500 hPa (model level 15). Otherwise, as is the case with Barkmeijer et al. (2001), SVs have a shallow vertical structure in the upper troposphere or have a large specific humidity amplitude in the upper troposphere where the amount of specific humidity is relatively small. Through some numerical experiments, it is found that those SVs have little impact on an ensemble of TC track predictions, and therefore, we have adopted the above conditions in the vertical integration.
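The factor of 3 in the available potential energy term can be checked with a short derivation. It assumes that the term has the common lapse-rate-dependent form $g\,T_x T_x / [T_r(\Gamma_d - \Gamma)]$, with $\Gamma_d = g/c_p$ the dry adiabatic lapse rate; this form is an assumption, chosen because it reproduces the expression quoted in the text.

```latex
\frac{g}{T_r\,(\Gamma_d-\Gamma)}\,T_x T_x
\;\xrightarrow{\ \Gamma=\frac{2}{3}\Gamma_d\ }\;
\frac{3g}{T_r\,\Gamma_d}\,T_x T_x
= \frac{3c_p}{T_r}\,T_x T_x ,
\qquad \Gamma_d=\frac{g}{c_p}.
```

Under the same assumption, setting $\Gamma = 0$ gives $c_p T_x T_x / T_r$, which is the form of the term when the lapse rate is not considered.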

Figures 3a,b show the vertical distributions of the total energy components of moist SVs at an initial and evaluation time, respectively. These are the composites of all moist SVs computed during the quasi-operational period (the total number of samples is 3077). In both figures, the energy is normalized by the peak value of a certain component at a certain model level. As Fig. 3a shows, the initial SVs are mostly explained by the kinetic energy component centered at about 650 hPa (model level 11). This kinetic energy component is largely explained by the temperature lapse rate term in Eq. (6). In the experimental SV calculations not considering the lapse rate, where the available potential energy term is written as $c_p T_x T_x / T_r$, the available potential energy is dominant in the calculated initial SVs. Through some numerical experiments, it is found that those SVs need larger initial amplitudes of total energy in order to get the same amount of ensemble spread in terms of TC tracks with respect to SVs as shown in Fig. 3a. This result would indicate that wind perturbations cause a larger ensemble spread of tracks than temperature perturbations. Final SVs are explained by the specific humidity energy in the lower troposphere and kinetic energy from the lower to upper troposphere. These distributions are in good agreement with those of total energy of forecast errors by JMA/GSM in tropical regions and TC surroundings (figures not shown).

The composite of the vertically accumulated total energy of the initial moist SVs calculated during the quasi-operational period is shown in Fig. 4a for the leading moist SVs, and Fig. 4b for all moist SVs. Just like Figs. 3a,b, the energy is normalized by the peak value of the energy in the 4000 km × 4000 km domain centered on the analyzed TC position. As Fig. 4a shows, the distribution is asymmetric, and a sensitive region spreads from northwest to east relative to the storm center. The peak point is approximately 500 km away from the center in both Figs. 4a,b, which is consistent with the results of Reynolds et al. (2009) and Peng and Reynolds (2006) that suggest that the high sensitivity appears in the region where the potential vorticity gradient of the vortex first changes sign.

### c. Generation of initial perturbations

Initial perturbations are generated by linearly combining the SVs as in Eq. (7):

$$\mathbf{p}_i = \sum_{j=1}^{n} r_{ij}\,\hat{\mathbf{x}}_j, \tag{7}$$

where $\mathbf{p}_i$ is the $i$th initial perturbation, $r_{ij}$ are the coefficients of the linear combination, $n$ is the number of computed SVs, and $\hat{\mathbf{x}}_j$ is an SV ($1 \le j \le n$). The linear combination gives spatially spreading initial perturbations because each SV tends to be confined to a local area. In each SV calculation, up to 10 SVs can be computed depending on an operationally allocated calculation time, which means we have up to 40 SVs ($n \le 40$), that is, 10 dry SVs and 30 moist SVs, at one prediction event. Note that there is a possibility that the calculated SVs are similar to each other. For example, if two TCs exist at the same time, and the targeted areas for the two TCs overlap, an SV for one TC can be similar to the SV for the other TC. To avoid duplication, if the value of the inner product of any two SVs is 0.5 or more [$(\hat{\mathbf{x}}_i, \hat{\mathbf{x}}_j) \ge 0.5$], one of the two SVs is eliminated from the SVs used in Eq. (7).
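The duplicate check and the linear combination can be sketched as follows. The function names are hypothetical, the SVs are assumed to be stored as unit-length rows, and keeping the earlier of two similar SVs is an assumption, since the text does not say which one is eliminated.

```python
import numpy as np


def drop_similar(svs, threshold=0.5):
    """Keep SVs (unit-length rows) whose inner product with every kept SV is below threshold."""
    kept = []
    for v in svs:
        if all(np.dot(v, k) < threshold for k in kept):
            kept.append(v)
    return np.array(kept)


def combine(svs, coeffs):
    """Initial perturbations as linear combinations of SVs.

    svs: array of shape (n_svs, m); coeffs: array of shape (n_perts, n_svs).
    Row i of the result is sum_j coeffs[i, j] * svs[j].
    """
    return coeffs @ svs
```

For three toy SVs `[1, 0, 0]`, `[0.8, 0.6, 0]`, and `[0, 0, 1]`, the second has an inner product of 0.8 with the first and is dropped, so two SVs remain for the combination.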

To determine the coefficients, we first form an $m \times n$ matrix $\mathsf{A}$ whose columns are composed of each SV, where $m$ is the dimension of an SV and $n$ is the number of columns of $\mathsf{A}$. Then, we compute an $n \times n$ orthogonal matrix $\mathsf{R}$, which determines the coefficients, using $V$ defined in Eq. (12), where $b$ is a factor of $\mathsf{B}$ and $V_i$ represents the deviation of the square of factors of $\hat{\mathbf{p}}_i$. A small value of $V_i$ means small deviation of factors of $\hat{\mathbf{p}}_i$, indicating $\hat{\mathbf{p}}_i$ can be an initial perturbation with a large spatial spread. Therefore, we can obtain a set of initial perturbations with a spatial spread by minimizing the sum of $V_i$. For the 10 perturbed runs, the first to fifth column vectors of $\mathsf{B}$ are selected, and each is added to and subtracted from the analysis field, with its amplitude set in terms of wind speed (6.0 m s$^{-1}$).
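The idea behind minimizing the sum of $V_i$ can be illustrated with a toy calculation. Two things are assumed here and are not taken from the source: that $V_i$ is the variance of the squared elements of a unit-length column, and that the rotated matrix is obtained as $\mathsf{B} = \mathsf{A}\mathsf{R}$. The sketch only evaluates the measure for one fixed rotation; it does not search for the minimizing $\mathsf{R}$.

```python
import numpy as np


def spread_measure(p):
    """Variance of the squared elements of a unit-normalised vector (smaller = more spread out)."""
    p = np.asarray(p, dtype=float)
    p = p / np.linalg.norm(p)
    return np.var(p ** 2)


localized = np.array([1.0, 0.0, 0.0, 0.0])    # all amplitude in one element
spread_out = np.array([0.5, 0.5, 0.5, 0.5])   # amplitude shared evenly

# Two perfectly localized toy "SVs" as the columns of A, rotated by 45 degrees.
A = np.eye(4)[:, :2]
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
B = A @ R                                      # assumed relation between A, R, and B

total_before = sum(spread_measure(A[:, i]) for i in range(2))
total_after = sum(spread_measure(B[:, i]) for i in range(2))
```

The rotation mixes the two localized columns, so each column of `B` has amplitude in two elements instead of one and the summed measure decreases.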

As the kinetic energy is dominant in the initial SVs, and the ensemble spread of tracks is found to be sensitive to the wind perturbations, we selected wind speed as the variable to control the amplitudes of the initial perturbations. The value of 6.0 m s^{−1} was obtained through the preliminary tests of TEPS, where the ensemble spread of tracks was statistically verified. However, as the value is arbitrary and totally dependent on the NWP system, we will need to adjust it again when the system is changed.

Table 1 summarizes the specifications of TEPS. Note that JMA also operates a one-week EPS (WEPS; WMO 2008b). While WEPS is designed for medium-range forecasts, TEPS and WEPS have similar characteristics: they use the same NWP model, the same initial condition for the nonperturbed run, and a similar procedure to make initial perturbations based on an SV method. This is for the efficient development and maintenance of the operational NWP systems. In principle, we could merge those two systems, considering that other NWP centers like ECMWF are successful in both medium-range forecasts and TC track forecasts with a single EPS, and that TEPS and WEPS follow the ECMWF’s EPS (e.g., Buizza 1994; Buizza and Palmer 1995; Palmer et al. 1998; Barkmeijer et al. 2001; Puri et al. 2001) in many aspects, especially in the calculation of initial perturbations based on an SV method. The merger of the systems might yield further benefits through saving computer resources.

## 3. Performance of the TEPS

### a. Case studies

Figure 5 shows examples of TEPS predictions. The top panels are for Typhoon Maria, initiated at 1200 UTC 6 August 2006, and the bottom panels are for Typhoon Chaba, initiated at 1200 UTC 28 August 2004. The left panels show the track prediction by JMA/GSM (solid) with the best track (dashed), and the right panels show all tracks by TEPS (these are the results of the preliminary test of TEPS before the quasi-operational TEPS). In Maria’s case, a large ensemble spread exists: some of the ensemble members predict the same scenario as JMA/GSM, in which Maria heads toward the west of Japan, whereas the other members recurve and head for the Kanto region. In reality, as the best track shows, Maria recurved and hit the southern part of the Kanto area. It is noteworthy that TEPS captured the possibility of the observed track. From the perspective of disaster prevention or mitigation, it is of great importance to identify all possible scenarios in advance and take measures as needed. TEPS is expected to capture such a potential spread of tracks. In contrast to Maria’s case, Chaba’s case shows a quite small ensemble spread, which means the confidence of the prediction is relatively high. In fact, the deterministic prediction by JMA/GSM was almost perfect. As these two cases illustrate, TEPS is expected to provide confidence information on track prediction based on the ensemble spread, which can vary from TC to TC and from initial time to initial time.

### b. Quasi operation

To evaluate the performance of TEPS, we conducted quasi-operational runs of TEPS from May to December 2007 and verified the ensemble mean track predictions and a relationship between the ensemble spread of tracks and the position error of ensemble mean. Note that the specifications of the quasi-operational TEPS are different from those of the operational TEPS in several points. For example, as Fig. 2 shows, the analysis field for the quasi-operational TEPS comes from a global analysis for JMA/GSM with a resolution of TL319L40 because TL959L60 JMA/GSM was implemented on 21 November 2007 (note that TL319L40 JMA/GSM had been operated even after 21 November 2007 in order to provide the analysis field for the quasi-operational TEPS, which means the quasi-operational TEPS uses the global analysis of TL319L40 JMA/GSM throughout the quasi-operational period).

### c. Ensemble mean track predictions

Figure 6a shows the position error of ensemble mean track predictions for TCs of tropical-storm-or-greater intensity. Figure 6b includes the extratropical-transition stages of the verifying TCs. The verifications are based on the best-track data produced by the RSMC Tokyo-Typhoon Center. The position error of the ensemble mean (MEAN) is smaller than that of the nonperturbed run (CTL) in 4- and 5-day predictions. For predictions up to 3 days, they have almost the same performance. The error reduction in 5-day predictions shown in Fig. 6a is 40 km. A *t* test performed with a confidence level of 95% shows that the difference between MEAN and CTL is significant for 5-day predictions (Figs. 6c,d). JMA plans to provide public users with 5-day track forecasts. TEPS is expected to improve the deterministic TC track forecasts over the extended forecast range (note that JMA/GSM is operated 4 times a day, at 0000, 0600, 1200, and 1800 UTC, but the forecast range is 84 h except at 1200 UTC, where it is 216 h).
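A position error of the kind verified here can be computed as the great-circle distance between a predicted position and the best-track position. The sketch below is illustrative only: the function names, the spherical-earth radius, the arithmetic averaging of member latitudes and longitudes, and the toy numbers are assumptions, and the averaging would need care for members on either side of 180°.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def ensemble_mean_position(lats, lons):
    """Arithmetic mean of member positions (adequate away from the poles and the date line)."""
    return float(np.mean(lats)), float(np.mean(lons))


# Toy example with made-up numbers: three members and one best-track position.
member_lats, member_lons = [20.0, 21.0, 22.0], [130.0, 131.0, 132.0]
best_lat, best_lon = 21.0, 131.5
mean_lat, mean_lon = ensemble_mean_position(member_lats, member_lons)
mean_error_km = great_circle_km(mean_lat, mean_lon, best_lat, best_lon)
```

Applying `great_circle_km` to the nonperturbed run in the same way gives the CTL error, and the two samples of errors can then be compared with a paired significance test.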

### d. Confidence information

Any variable that is correlated with the position error of the ensemble mean can be used as an indicator of the position error. Taking the sum of the 6-hourly spread as such a variable, we verified the relationship between the 5-day prediction error of the ensemble mean and the spread of tracks of the 5-day prediction (Fig. 7). Here, the ensemble spread is defined as the sum of the spread at every 6 h from the initial time to 5 days so that the spread represents the uncertainty of tracks, not positions. The verifying TCs are exactly the same as those for Fig. 6b, and each dot gives the verification result of each prediction event. When the ensemble spread is relatively small, the position error of the corresponding prediction event is also small and, more importantly, there is no case that has a large position error. When the ensemble spread is relatively large, there is a possibility that the position error becomes large. Note that a large ensemble spread does not guarantee a large position error, only the possibility of one.
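The accumulated spread can be sketched as below. The text does not define the spread at a single time, so the definition used here (root-mean-square distance of members from their mean position, on a local flat-earth approximation) is an assumption, as are the function names.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def spread_km(lats, lons):
    """RMS distance (km) of members from their mean position at one forecast time."""
    lats, lons = np.asarray(lats, float), np.asarray(lons, float)
    mean_lat, mean_lon = lats.mean(), lons.mean()
    dy = np.radians(lats - mean_lat) * EARTH_RADIUS_KM
    dx = np.radians(lons - mean_lon) * EARTH_RADIUS_KM * np.cos(np.radians(mean_lat))
    return float(np.sqrt(np.mean(dx ** 2 + dy ** 2)))


def accumulated_spread_km(track_lats, track_lons):
    """Sum of the spreads over all forecast times; inputs have shape (n_times, n_members)."""
    return sum(spread_km(la, lo) for la, lo in zip(track_lats, track_lons))
```

For a 5-day prediction sampled every 6 h, `track_lats` and `track_lons` would each hold 21 rows (0 to 120 h), one column per ensemble member.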

While Fig. 7 is the verification only for 5-day predictions, the same relationship can be seen in verifications for other prediction times. Based on this relationship, we classify the confidence of TC track prediction and assign a confidence index, A, B, or C, every 6 h from the initial time to 5 days, where A, B, and C represent categories of the highest, the middle, and the lowest confidence, respectively, and the frequency of each category is set to 40%, 40%, and 20%, respectively. As shown in Fig. 8, when all predictions judged as A (ensemble spread is relatively small) are collected and the position errors of the ensemble mean are verified, the average position errors prove to be considerably smaller than those of B. In contrast, when predictions judged as C (ensemble spread is relatively large) are collected and verified in the same way, the average position errors are larger than those of B. The position error of A is smaller than that of B by 14%, 25%, 34%, 40%, and 39% at 24, 48, 72, 96, and 120 h, respectively. On the other hand, the position error of C is larger than that of B by 96%, 98%, 93%, 55%, and 28% at 24, 48, 72, 96, and 120 h, respectively. Another way to explain Fig. 8 is that the solid line in Fig. 6b showing the position errors of the ensemble mean is divided into three lines based only on the information of ensemble spread. All cases at a certain prediction time in Fig. 6b are divided into three categories based on the ensemble spread, and the position error of the ensemble mean is verified for each category. Therefore, the three lines that are clearly separated in Fig. 8 imply that the ensemble spread of TEPS can be an indicator of position error. Figure 9 gives an example of a possible application using the above confidence information. JMA plans to use the ensemble spread of TEPS to optimize the size of the probability circle, which has been used in JMA’s operational track forecasts to express forecast uncertainty, and whose size is currently based on the statistics of the recent years’ verifications of JMA/GSM.
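Assigning the confidence index amounts to thresholding the spread at fixed quantiles of a reference sample. The sketch below assumes that the thresholds come from the empirical 40% and 80% quantiles of past spreads at a given forecast time; the function names and the use of `np.quantile` with its default interpolation are choices made for the illustration.

```python
import numpy as np


def confidence_thresholds(spreads, freqs=(0.4, 0.4, 0.2)):
    """Spread thresholds separating A|B and B|C so that the categories have the given frequencies."""
    a_frac, b_frac, _ = freqs
    return np.quantile(spreads, [a_frac, a_frac + b_frac])


def confidence_index(spread, thresholds):
    """'A' for the smallest spreads (highest confidence), 'C' for the largest."""
    lo, hi = thresholds
    if spread <= lo:
        return "A"
    return "B" if spread <= hi else "C"
```

Because the index is assigned every 6 h, a separate pair of thresholds would be derived for each forecast time.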

The frequencies of the categories are set to 40%, 40%, and 20%, rather than 33%, 33%, and 33%, in order to separate the three lines in Fig. 8 as much as possible. Figure 10 shows the position error of each 3-day prediction in 2007 by JMA/GSM, sorted in ascending order. As Fig. 10 shows, the distribution is not uniform, and the number of cases that have relatively large position errors is about 10%–20% of the total number of cases. Therefore, we make the rate of category C smaller than that of categories A and B.

A position error can also be broken into along-track (AT) and cross-track (CT) components (e.g., Goerss 2000; Aberson 2001). Accordingly, an ensemble can be used to predict the spread in each of these components around the ensemble mean. Here, we verify whether a relationship exists between the ensemble spread in the AT and CT directions and their associated errors. First, the AT direction is defined at a given time T as the vector difference between ensemble mean forecasts at T and T − 1 day. Then, the CT direction is set to the direction orthogonal to the AT direction. Figures 11a,b show the AT and CT direction verifications. As in Fig. 8, we prepare three confidence indices, A, B, and C, with frequencies of 40%, 40%, and 20%, respectively. As both figures show, the three lines are clearly separated, indicating that confidence information would be applicable to the uncertainty forecasts in the AT and CT directions.
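The AT and CT decomposition can be sketched with a local flat-earth projection. The function names are hypothetical, and taking the CT axis as pointing to the right of the motion is an assumed sign convention. The sketch also assumes that the ensemble mean has moved during the preceding day; a stationary mean would leave the AT direction undefined.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def to_local_km(lat, lon, ref_lat, ref_lon):
    """East/north offsets (km) from a reference point, flat-earth approximation."""
    dx = np.radians(lon - ref_lon) * EARTH_RADIUS_KM * np.cos(np.radians(ref_lat))
    dy = np.radians(lat - ref_lat) * EARTH_RADIUS_KM
    return np.array([dx, dy])


def along_cross_track(mean_prev, mean_now, point):
    """AT and CT components (km) of `point` relative to the ensemble mean at time T.

    mean_prev, mean_now: (lat, lon) of the ensemble mean at T - 1 day and at T.
    point: (lat, lon) of a member position or of the best-track position at T.
    """
    motion = to_local_km(*mean_now, *mean_prev)      # displacement over the last day
    at_unit = motion / np.linalg.norm(motion)        # along-track unit vector
    ct_unit = np.array([at_unit[1], -at_unit[0]])    # 90 degrees to the right of the motion
    offset = to_local_km(*point, *mean_now)
    return float(offset @ at_unit), float(offset @ ct_unit)
```

Applied to the best-track position, the two components give the AT and CT errors of the ensemble mean; applied to each member, they give the offsets from which AT and CT spreads can be computed.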

## 4. Discussion

Through the quasi-operational TEPS, we found two things: one is that the position error of the ensemble mean is smaller than that of the nonperturbed run for predictions longer than three days; the other is that the ensemble spread can be used as an indicator of position error. As we mentioned in section 3d, optimizing the size of the probability circle based on TEPS is now under development. Quantitative verification with respect to the current system is left for future work. However, considering the verification results shown in Figs. 7, 8, and 11, the ensemble spread of TEPS shows promise in providing uncertainty forecasts. For example, consider a probability circle, which represents that a TC is expected to move into the circle with a probability of 70% at a certain forecast time. Based on the ensemble mean of TEPS in 2007, the radius of the circle is 335 km for 3-day predictions and 489 km for 5-day predictions on average. In the case of A, where the ensemble spread is relatively small, however, the radius is 207 and 347 km for 3- and 5-day predictions, which is smaller than the average radius by 38% and 29%, respectively. On the other hand, in the case of C, where the ensemble spread is relatively large, the radius is 527 and 675 km for 3- and 5-day predictions, which is larger than the average radius by 57% and 38%, respectively, while in the case of B, the radius is almost the same as the average radius, 348 and 499 km for 3- and 5-day predictions.
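The percentages quoted above follow directly from the radii, as the short check below shows. The second function illustrates one way a 70% radius could be derived from a sample of position errors, namely as their 70th percentile; that derivation, like the function names, is an assumption and is not stated in the text.

```python
import numpy as np

average_radius_km = {72: 335.0, 120: 489.0}          # values quoted in the text
category_radius_km = {
    "A": {72: 207.0, 120: 347.0},
    "B": {72: 348.0, 120: 499.0},
    "C": {72: 527.0, 120: 675.0},
}


def relative_change_percent(category, lead_h):
    """Signed change of the category radius relative to the average radius, in percent."""
    avg = average_radius_km[lead_h]
    return 100.0 * (category_radius_km[category][lead_h] - avg) / avg


def circle_radius_km(position_errors_km, probability=0.7):
    """Radius that contains the given fraction of the position errors (assumed definition)."""
    return float(np.quantile(position_errors_km, probability))
```

Rounding `relative_change_percent` for categories A and C reproduces the reductions of 38% and 29% and the increases of 57% and 38% given in the text.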

As for deterministic predictions, on the other hand, the following two questions still remain. First, how does the skill of the ensemble mean of TEPS compare with that of JMA/GSM, and second, how does the skill of the ensemble mean of TEPS compare with that of the multimodel EPS? Figure 12a shows the preliminary verification result of the ensemble mean of the operational TEPS as compared with JMA/GSM. The verification is based on the 1st (NEOGURI) to 18th (BAVI) TC in 2008 analyzed by the RSMC Tokyo-Typhoon Center, and as in Fig. 6b, the extratropical-transition stages are included in the verification. Note that the number of cases decreases after 90 h because JMA/GSM predicts beyond 90 h only at 1200 UTC. A *t* test shows that the ensemble mean of TEPS is significantly worse than JMA/GSM up to 84 h (Fig. 12b). Considering the differences in specifications between the quasi-operational and operational TEPS, the difference in the analysis field might cause the decline in performance. Whereas the horizontal resolutions of the model and data assimilation are the same in the quasi-operational TEPS, that is, TL319, we interpolate the analysis field from TL959 to TL319 in the operational TEPS. As we have not seen such a decline in performance in the quasi-operational TEPS, and it would be ideal for the model and data assimilation scheme to have the same resolution, we believe that there is room for improvement in the interpolation technique. When the ensemble mean of TEPS is compared with that of the multimodel EPS, the improvement is limited to predictions beyond 3 days. In the multimodel EPS presented by Goerss (2000), Goerss et al. (2004), and Komori et al. (2007), the ensemble mean of the multimodel EPS is better than the best of the individual models from the early prediction time, say 24 h. This result may imply that dealing with the uncertainty of the NWP model and accounting for model errors in ensemble techniques would result in the improvement of ensemble mean tracks in TEPS, especially for the early prediction time.

## 5. Summary and conclusions

JMA began operating a new EPS, the Typhoon EPS (TEPS), in February 2008, aiming to improve both deterministic and probabilistic TC track forecasts. TEPS is composed of 11 integrations with a TL319L60 global model and is operated for TCs in the western North Pacific and South China Sea. The method to create initial perturbations is based on a singular vector (SV) method, but a method to deal with the uncertainty of the NWP model itself is yet to be considered.

The SVs are calculated targeting both TCs (up to three TCs at one prediction event) and a midlatitude area in the RSMC Tokyo-Typhoon Center’s responsibility area. For the 10 perturbed runs, 5 initial perturbations, which are added to and subtracted from an analysis field, are created by linearly combining the SVs. Coefficients are determined so that the spatial distributions of the perturbations are sufficiently spread. Moist SVs are calculated for TCs, and dry SVs are calculated for the midlatitude. The growth rate of both moist and dry SVs is based on a moist total energy norm. The evaluation interval time is 24 h for both types of SVs. The composites of vertically accumulated total energy of moist initial SVs have an asymmetric pattern and the most sensitive area is approximately 500 km away from the center of TCs. The total energy of moist initial SVs is mainly explained by the kinetic energy component, which leads to large ensemble spreads in TC track predictions.

Through the verifications of the quasi-operational TEPS, we found two benefits. First, the position error of the ensemble mean is smaller than that of the nonperturbed run for predictions beyond 3 days. Second, the ensemble spread can be used as an indicator of position error. For 2008, when TEPS was in operational use, however, it was also found that the ensemble mean was significantly worse than the deterministic model (JMA/GSM) out to 84 h. Beyond 84 h, JMA/GSM was still better than the ensemble mean of TEPS, though the sample size was small.

Future work includes the development of an application for uncertainty forecasts using the ensemble spread of TEPS. The interpolation technique of the analysis field from TL959L60 to TL319L60 and the introduction of model errors into the ensemble perturbation technique are also issues to be addressed. In addition, the prediction range of TEPS, 132 h, allows us to construct a lagged average forecast (LAF); hence, verifying the impact of the LAF technique is also a task for the future.

## Acknowledgments

Special thanks are given to Sharan Majumdar of the University of Miami for a thorough and constructive review of the manuscript, and to Edward Fukada of the Joint Typhoon Warning Center and Buck Sampson of the Naval Research Laboratory for their helpful comments on multimodel EPS. We also thank Hirokatsu Onoda, Toshiyuki Ishibashi, Ko Koizumi, and Masashi Nagata of JMA for their helpful comments and review of the manuscript. This work was supported by Grant-in-Aid for Scientific Research (A) 19201037. The manuscript was revised and completed under the support of the Office of Naval Research Grant N0000140810250.

## REFERENCES

Aberson, S. D., 2001: The ensemble of tropical cyclone track forecasting models in the North Atlantic basin (1976–2000). *Bull. Amer. Meteor. Soc.*, **82**, 1895–1904.

Barkmeijer, J., R. Buizza, T. N. Palmer, K. Puri, and J-F. Mahfouf, 2001: Tropical singular vectors computed with linearized diabatic physics. *Quart. J. Roy. Meteor. Soc.*, **127**, 685–708.

Buizza, R., 1994: Sensitivity of optimal unstable structures. *Quart. J. Roy. Meteor. Soc.*, **120**, 429–451.

Buizza, R., and T. N. Palmer, 1995: The singular vector structure of the atmospheric global circulation. *J. Atmos. Sci.*, **52**, 1434–1456.

Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic simulation of model uncertainties. *Quart. J. Roy. Meteor. Soc.*, **125**, 2887–2908.

Cullen, M. J. P., 1993: The Unified Forecast/Climate Model. *Meteor. Mag.*, **122**, 81–122.

Elsberry, R. L., and L. E. Carr III, 2000: Consensus of dynamical tropical cyclone track forecasts—Errors versus spread. *Mon. Wea. Rev.*, **128**, 4131–4138.

Goerss, J. S., 2000: Tropical cyclone track forecasts using an ensemble of dynamical models. *Mon. Wea. Rev.*, **128**, 1187–1193.

Goerss, J. S., 2007: Prediction of consensus tropical cyclone track forecast error. *Mon. Wea. Rev.*, **135**, 1985–1993.

Goerss, J. S., and R. A. Jeffries, 1994: Assimilation of synthetic tropical cyclone observations into the Navy Operational Global Atmospheric Prediction System. *Wea. Forecasting*, **9**, 557–576.

Goerss, J. S., C. R. Sampson, and J. M. Gross, 2004: A history of western North Pacific tropical cyclone track forecast skill. *Wea. Forecasting*, **19**, 633–638.

Heming, J. T., J. C. L. Chan, and A. M. Radford, 1995: A new scheme for the initialization of tropical cyclones in the UK Meteorological Office global model. *Meteor. Appl.*, **2**, 171–184.

Hogan, T. F., and T. E. Rosmond, 1991: The description of the Navy Operational Global Atmospheric Prediction System’s spectral forecast model. *Mon. Wea. Rev.*, **119**, 1786–1815.

Iwamura, K., and H. Kitagawa, 2008: An upgrade of the JMA Operational Global NWP Model. *CAS/JSC WGNE Res. Act. Atmos. Oceanic Modell.*, **38**, 603–604.

Jeffries, R. A., and E. J. Fukada, 2002: Consensus approach to tropical cyclone forecasting. *Proc. Fifth Int. Workshop on Tropical Cyclones*, Cairns, Australia, WMO, Topic 3.2. [Available online at http://www.aoml.noaa.gov/hrd/iwtc/index.html].

JMA, 2007: Outline of the operational numerical weather prediction at the Japan Meteorological Agency. Appendix to WMO Numerical Weather Prediction Progress Rep., Japan Meteorological Agency, Tokyo, Japan, 194 pp. [Available online at http://www.jma.go.jp/jma/jma-eng/jma-center/nwp/outline-nwp/index.htm].

Kadowaki, T., 2005: A 4-dimensional variational assimilation system for the JMA Global Spectrum Model. *CAS/JSC WGNE Res. Act. Atmos. Oceanic Modell.*, **34**, 117–118.

Komori, T., M. Yamaguchi, R. Sakai, and Y. Takeuchi, 2007: WGNE intercomparison of tropical cyclone forecasts with operational global models: Quindecennial report. Science Highlights, WCRP, 4 pp. [Available online at http://wcrp.wmo.int/documents/WGNE_TC_Intercomparison_Quindecennial_Quicklook15Anniversary.pdf].

Kuma, K., 1996: NWP activities at Japan Meteorological Agency. Preprints, *11th Conf. on Numerical Weather Prediction*, Norfolk, VA, Amer. Meteor. Soc., J15–J16.

Kurihara, Y., M. A. Bender, and R. J. Ross, 1993: An initialization scheme of hurricane models by vortex specification. *Mon. Wea. Rev.*, **121**, 2030–2045.

Kurihara, Y., M. A. Bender, R. E. Tuleya, and R. J. Ross, 1995: Improvements in the GFDL hurricane prediction system. *Mon. Wea. Rev.*, **123**, 2791–2801.

Kurihara, Y., R. E. Tuleya, and M. A. Bender, 1998: The GFDL hurricane prediction system and its performance in the 1995 hurricane season. *Mon. Wea. Rev.*, **126**, 1306–1322.

Lorenz, E. N., 1955: Available potential energy and the maintenance of the general circulation. *Tellus*, **7**, 157–167.

Lorenz, E. N., 1965: A study of the predictability of a 28-variable atmospheric model. *Tellus*, **17**, 321–333.

Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Meteorology and validation. *Quart. J. Roy. Meteor. Soc.*, **122**, 73–120.

Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza, 1998: Singular vectors, metrics, and adaptive observations. *J. Atmos. Sci.*, **55**, 633–653.

Peng, M. S., and C. A. Reynolds, 2006: Sensitivity of tropical cyclone forecasts as revealed by singular vectors. *J. Atmos. Sci.*, **63**, 2508–2528.

Puri, K., J. Barkmeijer, and T. N. Palmer, 2001: Ensemble prediction of tropical cyclones using targeted diabatic singular vectors. *Quart. J. Roy. Meteor. Soc.*, **127**, 709–731.

Reynolds, C. A., M. S. Peng, and J-H. Chen, 2009: Recurving tropical cyclones: Singular vector sensitivity and downstream impacts. *Mon. Wea. Rev.*, **137**, 1320–1337.

RSMC Tokyo-Typhoon Center, 1997: Annual Report on Activities of the RSMC Tokyo-Typhoon Center. RSMC Tokyo-Typhoon Center, 115 pp.

Sampson, C. R., J. S. Goerss, and H. C. Weber, 2006: Operational performance of a new barotropic model (WBAR) in the western North Pacific basin. *Wea. Forecasting*, **21**, 656–662.

Strang, G., 1986: *Introduction of Applied Mathematics*. Wellesley-Cambridge Press, 758 pp.

Vijaya Kumar, T. S. V., and T. N. Krishnamurti, 2003: Multimodel superensemble forecasting of tropical cyclones in the Pacific. *Mon. Wea. Rev.*, **131**, 574–583.

WMO, 2008a: Guidelines on communicating forecast uncertainty. WMO/TD-1422, PWS-18, WMO, 25 pp. [Available online at http://www.wmo.int/pages/prog/amp/pwsp/documents/TD-1422.pdf].

WMO, 2008b: Joint WMO technical progress report on the Global Data Processing and Forecasting System and Numerical Weather Prediction research activities for 2007. WMO–JMA, 40 pp. [Available online at http://www.wmo.int/pages/prog/www/DPFS/ProgressReports/2007/Japan_2007.pdf].

Time line of the upgrade of JMA NWP systems: JMA/GSM, TYM, WEPS, and TEPS.

Citation: Monthly Weather Review 137, 8; 10.1175/2009MWR2697.1

Vertical distributions of total energy components, kinetic energy (dashed line), available potential energy (thick line), and specific humidity energy (thin line), of moist SVs at (a) an initial time and (b) the evaluation time. The distributions are the results of averaging all initial and final moist SVs during the quasi-operational period of TEPS (the total number is 3077). Vertical levels 10, 15, 20, and 25 nearly correspond to 700-, 500-, 300-, and 150-hPa heights, respectively.

Composites of vertically accumulated total energy of (a) the first and (b) all initial SVs during the quasi-operational period of TEPS. The energy is normalized by its peak value in the 4000 km × 4000 km domain centered on the analyzed TC position.

Example of TEPS. (top) Typhoon Maria, initiated at 1200 UTC 6 Aug 2006 and (bottom) Typhoon Chaba, initiated at 1200 UTC 28 Aug 2004. (left) The track prediction by JMA/GSM (solid) with the best track (dashed) and (right) all tracks by TEPS.

(a),(b) Position errors (km) of ensemble mean track predictions (thick line) as compared with those of control runs (thin line). Dots correspond to the *y* axes on the right, which represent the number of verification samples. Both (a) and (b) are the results of verifying TCs during the quasi-operational period of TEPS with an intensity of tropical storm or more, but (b) includes the extratropical-transition stages of the verifying TCs. (c),(d) The result of a *t* test performed on the differences between MEAN and CTL in (a),(b) at a confidence level of 95%.

Relationship between the spread of tracks of 5-day predictions and the 5-day prediction error of the ensemble mean. The ensemble spread is defined by the sum of the spread of every 6 h from the initial time to 5 days. The verification is based on the cases of 5-day predictions in Fig. 6b. The total number of cases is 149.
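The accumulated spread described in this caption can be illustrated with a minimal Python sketch. The caption states only that the spread is summed every 6 h from the initial time to 5 days; the per-time spread used here (the mean great-circle distance of the members from the ensemble mean position) is an assumption made for illustration, and the function names, the spherical Earth radius, and the simple latitude/longitude averaging are likewise hypothetical choices rather than the TEPS implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance (km) between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_position(points):
    """Ensemble mean position as a plain average of latitude and longitude.

    Assumption: the members do not straddle the date line.
    """
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def spread_at_time(points):
    """Assumed per-time spread: mean distance (km) of members from the ensemble mean."""
    mlat, mlon = mean_position(points)
    return sum(great_circle_km(lat, lon, mlat, mlon) for lat, lon in points) / len(points)


def accumulated_spread(tracks):
    """Sum the per-time spread over all forecast times.

    tracks[m][k] is the (lat, lon) of member m at the k-th 6-hourly time,
    so a 5-day forecast including the initial time has 21 entries per member.
    """
    n_times = len(tracks[0])
    return sum(spread_at_time([member[k] for member in tracks]) for k in range(n_times))
```

With this definition, an ensemble whose members all follow the same track has an accumulated spread of zero, and the value grows both with the separation of the members and with the length of time over which they stay apart.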

Verification result of confidence indices on TC track predictions. Based on the ensemble spread, a confidence index (A, B, or C) is given to the ensemble mean track prediction at each prediction time of each prediction event (A represents the highest confidence). The thin line is the position error of the ensemble mean by all A cases. The thin and dashed lines are for B and C cases, respectively.

Example of an application of TEPS. Based on the ensemble spread, the confidence of the track predictions is presented. (left) Typhoon Usagi, initiated at 1200 UTC 29 Jul 2007 and (right) Typhoon Fitow, initiated at 1800 UTC 2 Sep 2007.

Position error (km) of each 3-day prediction initiated at 0000 UTC by JMA/GSM in 2007. The errors are sorted in ascending order. The total number of cases is 163.

As in Fig. 8, but the spreads and position errors are divided into (a) AT and (b) CT directions.

As in Figs. 6b,d, but the ensemble mean is compared with JMA/GSM, and the verification is based on the operational TEPS in 2008.

Specifications of the TEPS as compared with WEPS as of February 2009.