## Abstract

This study provides a statistical review on the forecast errors of tropical storm tracks and suggests a Bayesian procedure for updating the uncertainty about the error. The forecast track errors are assumed to form an axisymmetric bivariate normal distribution on a two-dimensional surface. The parameters are a mean vector and a covariance matrix, which imply the accuracy and precision of the operational forecast. A Bayesian method improves quantifying the varying parameters in the bivariate normal distribution. A normal-inverse-Wishart distribution is employed to determine the posterior distribution (i.e., the weights on the parameters). Based on the posterior distribution, the predictive probability density of track forecast errors is obtained as the marginal distribution. Here, “storm approach” is defined for any location within a specified radius of a tropical storm. Consequently, the storm approach probability for each location is derived through partial integration of the marginal distribution within the forecast storm radius. The storm approach probability is considered a realistic and effective representation of storm warning for communicating the threat to local residents since the location-specific interpretation is available on a par with the official track forecast.

## 1. Introduction

Tropical storms regularly impose enormous threats to maritime countries and coastal regions in the Pacific (Mendelsohn et al. 2012; Houser et al. 2014). Prevention of the damages from the storms is an imminent issue in our society at all times. The global warming environment forces nations to prepare for more uncertain situations given likelihood of intensified storms (Elsner et al. 2008; Kang and Elsner 2018, 2019). National meteorological centers provide forecast information on storm track, intensity, and size for people to avoid risks and dangers. Mostly, however, they are derived from a deterministic forecast scenarios and do not represent how probable the influence will be. The storm forecast can be supported by a location-specific probability of a storm approach, based on the uncertainty of forecast tracks. From the point of view of a local resident, the idea of the possible attack is called “strike probability.” In general, storm strike probability is defined as the probability that a storm will fall somewhere within a particular distance close enough to a location (i.e., 140 km) (Jarrell and Brand 1983). This is why the potential area of the storm wind occurrence is more useful information than track uncertainty itself since a track is only the center of danger. Successful representation of the forecast uncertainty has been available in operations, using Monte Carlo sampling from past operational forecasts (DeMaria et al. 2009). Owing to the growth of computing resources, wind speed probability is also available in conjunction with numerical weather prediction products (DeMaria et al. 2013; Goerss 2007). There are some other representations of wind warnings where a forecast storm wind speed is taken into account. The U.S. Joint Typhoon Warning Center (JTWC) presents the area of potential “gale force winds,” which is the addition of expected radial distance of gale force winds to the 5-yr running mean official forecast track error at each corresponding forecast time (Sampson et al. 2017). Gale-force winds define 34 kt (1 kt ≈ 0.51 m s^{−1}) or higher wind speeds. The Japan Meteorological Agency (JMA) provides a similar form of the radius of storm wind area, whose uncertainty solely relies on the probability of storm track forecast errors (Fukuda 2018). Overall, the probabilistic warning of storm approach is partially or entirely related to the errors of the storm track forecast.

This study tries to suggest a probability map showing how probable the storm influence will be by a Bayesian procedure for updating the inherent forecast uncertainty. The author takes note of a common assumption in most information sources that the track errors are normally distributed on a two-dimensional surface. It is, however, considered rather a poor assumption for a large number of error samples. The normality of probability distribution may not be guaranteed in spite of a Gaussian form since the normality test result depends on the number of samples (Razali and Wah 2011). An error sample could be a realization from a certain normally distributed population, but what if the parameters of the distribution vary? The crucial point is that the probability distribution for each realization could be influenced by other factors such as climate variability and improving forecast techniques. This implies the parameters (i.e., mean and variance) are not fixed and themselves have certain probability distributions. Using an example set of operational track forecast errors, the current study shows how to overcome the fixed normal distribution issue, and presents a procedure of consequential storm approach probability. Here, “storm approach” is defined by the inclusion of a specific location under the influence of a tropical storm. A Bayesian method is utilized to effectively update the distribution of the parameters, after the earlier work for univariate track forecast errors (Kang et al. 2016). For a bivariate normal distribution, a normal-inverse-Wishart distribution is used to fit the parameters. Based on the posterior probabilities of the parameters, the predictive probabilities of track forecast errors are obtained as marginal distributions. Then, the location-specific probability of storm approach is available given a specified radius of a tropical storm, hereafter called storm warning radius. Section 2 describes the data, and section 3 reviews the bivariate distribution of the errors on the two-dimensional surface. Section 4 introduces a Bayesian procedure for updating the parameters of the normal distribution, and section 5 shows an example map of storm approach probability distribution. Section 6 provides a summary of the results and a discussion. Statistics and figures are made using the R programming language (www.r-project.org).

## 2. Data

Following the classification rule, tropical cyclones are referred to as tropical storms when the maximum sustained wind speed near the center exceeds 17 m s^{−1} (World Meterological Organization 2017). Here, the radial distance of 15 m s^{−1} wind speed from the storm center is used for indicating the storm warning radius, and the “storm approach” is defined by the inclusion of a specific location within the storm warning area. Then, the storm approach probability relies on the two forecast uncertainties such as storm position and wind warning radius. Due to the inhomogeneous pattern of storm structure and the observational deficiency, the reliable size forecast error is still hard to evaluate. In this study, the storm approach probability is derived from track forecast errors, given a forecast storm warning radius at the forecast time.

A representation of the forecast uncertainty is motivated by the official storm warnings. Most operational forecast agencies may exploit numerical model simulations to support the storm warning information in real time. This study, on the other hand, tries to deal with only the final forecast errors. For the following experiment, a set of operational track forecast errors from the Korea Meteorological Administration (KMA) is utilized. Considering that the Bayesian statistics involves all uncertainties during the data period, the inclusion of long-past years might not be effective to evaluate the current credibility of track forecast uncertainty. A certain data range such as 3-yr period is taken into account for an adequate extent to demonstrate current forecast skills. Then, the data consist of 72-h forecast storm positions (longitude, latitude), and associated observations over the 3-yr period 2016–18. From the data, error distances (km) at 6-h intervals are selected and analyzed. KMA’s observation is used for the evaluation of the error distance. Here, the error distance is calculated by an ellipsoidal method (Vincenty 1975). For the ellipsoid, the lengths of equatorial and polar axes are configured as 6 378 137 and 6 356 752 m, respectively. Different from a great circle distance on an assumed spherical Earth, the ellipsoidal distance would help this study find more accurate results. The example data are only for demonstration of the analysis procedure, and similar results are expected for the observations from other operational centers.

## 3. Spatial distribution of error distances

A histogram shows the distribution of track error distances at 72-h forecast time (Fig. 1a). A Gaussian kernel smoother is used to delineate the density distribution with 100-km bandwidth (dark gray line). As having continuous and nonnegative values, the errors are consistent with a gamma distribution (Kang et al. 2016). Operational centers utilize the quantile at a cumulative probability density as a probability circle radius. For example, the 0.7 quantile of the error distribution in Fig. 1a appears as 255 km. Then the distance is considered as the radius of a circle within which 70% of the forecast storm positions have been observed. The historical errors are interpreted directly as the uncertainty of future forecasts. The area is typically understood as the circular range that a tropical cyclone is expected to move in at 70% probability (Fig. 1b). The track uncertainty is only the uncertainty of storm centers indicated by the lowest air pressure. Given a storm warning radius, the area of possible approach could be extended as much. Practically, JTWC provides an official wind warning circle by the addition of storm warning radius (34-kt wind speed) to the track uncertainty, while JMA applies the radius of 50-kt wind speed for the storm warning area. For a local resident, however, the format of the information might not be easy to understand, as it covers a somewhat large area. For example, a person (blue dot) located at a distance from the forecast position finds that the location is out of the 70% probability circle (dashed inner circle), but at the same time, the storm approach thereof is possible within the forecast storm warning radius (outer circle). Definitely 70% is not a probability for the local resident. The storm warning area does not represent a probability that the storm can actually touch the location but is rather a matter of the forecaster’s concern for how far the storm approach can occur from the forecast position. The storm approach probability for the local resident will be available only when the probability densities of track forecast errors within forecast storm warning radius (shaded in blue) are integrated. For this, a two-dimensional probability density distribution should be an essential reference. The distribution of the errors (shaded in red around a red dot, the mean error) is not necessarily centered at the forecast position (black dot).

Two-dimensional error points in meridional–right angle coordinates are scattered in Fig. 2a. The vertical axis indicates north and south, while the horizontal axis shows a direction at the right angle. The horizontal axis does not mean a latitudinal line. The scattered 458 error points show a significant correlation (*r* = +0.34; [0.26, 0.42], 95% CI) between the movements in the horizontal and vertical directions. The values in the parentheses represent the correlation coefficient, and the lower and upper limits of the 95% confidence interval, respectively. The result is interpreted as significant as the interval does not include zero (no correlation). In other words, northward (southward) errors are likely to occur with the errors to the right (left). These errors might result from fast-moving storms after recurving, which needs further analysis and is out of scope of the present research. The tilt of the distribution is removed (*r* = +0.03; [−0.06, 0.12], 95% CI) when the along- and cross-track coordinates are employed (Fig. 2b). Here, the vertical (horizontal) axis indicates the along (cross) track direction, where along track is determined by 24-h movement of a storm. The mean error position (red dot) is shown slightly negative, which may be associated with some conservative human forecasts for fast-moving storms. The blue circle indicates the 70% cumulative density level. The distribution of most error positions looks axisymmetric. Then, axisymmetric bivariate normal distribution could be modeled by along- and cross-track error distances. The axisymmetry implies the same variance in the along-track and cross-track errors, and at the same time zero correlation coefficient between the two.

“Bivariate normal distribution” is satisfied when both the along-track errors and the cross-track errors take the form of a normal distribution (Balakrishnan and Lai 2009). It is noted, on the other hand, that error distributions in both directions (i.e., the along-track errors and cross-track errors) are seen to violate the normality through the Shapiro–Wilk test (Royston 1982). The violation against the bivariate normality is also confirmed through the tests such as Anderson–Darling test (Paulson et al. 1987), Doornik–Hansen test (Doornik and Hansen 2008), and Royston test (Royston 1992). The Shapiro–Wilk test is used for testing univariate normality, while the others for testing multivariate normality. Each tests the null hypothesis that an error sample came from a normally distributed population. The *p* values appear less than 0.05 for all the above tests, and the null hypothesis is rejected at the 95% confidence level. The nonnormality is led by the increasing number of samples (Razali and Wah 2011). Here, it is assumed that an error sample is a realization of each normal distribution at the moment when the forecast is issued, and the parameters of each normal distribution vary in time. The parameters of a bivariate normal distribution consist of two means and covariance matrix. Implications of the mean and variance are the accuracy and inverse precision of forecasting, respectively. Varying accuracy and precision could be influenced by climate variability and improving forecast techniques. Then, the distribution of error samples in Fig. 2b is regarded as a consequence of the superimposed normal distributions with varying parameters. A normal distribution with some fixed parameters might not successfully describe the accuracy and precision of the errors at all.

## 4. Bayesian inference of error distribution

A Bayesian method is utilized to effectively update the probability distribution of the parameters (i.e., mean and variance). The updating procedure follows the previous work done for univariate track forecast errors (Kang et al. 2016). The current approach improves the preceding work by applying two-dimensional error values. Parametric rather than a nonparametric approach to the distribution of the parameters is also an improvement in terms of generalizability. In addition, the research focuses on the probability of individual forecast errors rather than the mean error per storm. For the posterior distribution of the parameters from the Bayesian procedure, the sample mean and variance are updated on an annual basis. An effective time period reflecting current forecast skill would not include the long-past years, which is subjectively determined by a forecaster. For an example, this study uses a 3-yr period for the Bayesian updating of the parameters.

The probability density of bivariate normal distribution for individual forecast errors is expressed as

where *x*_{1} and *x*_{2} are the error distances at along-track and cross-track directions, respectively; (*x*_{1}, *x*_{2}) is a point on two-dimensional surface, and **x** is the vector of the point. The mean and covariance matrix are given by ** μ** and

**Σ**. Given data, a likelihood function is the bivariate normal distribution conditional on the parameters. The prior probability distribution of unknown mean and variance can be modeled by a normal-inverse-Wishart distribution (Gelman et al. 2004). Then, the posterior distribution is calculated by the joint of the likelihood and the prior probabilities. Since the prior and the posterior probabilities are in the same distribution family, the prior distribution is called the conjugate prior. The probability density of a normal-inverse-Wishart distribution is expressed as

where $N$ and $W\u22121$ represent the normal distribution and the inverse-Wishart distribution, respectively. The terms *μ*_{0} and (1/*λ*)**Σ** are the mean and the covariance matrix of ** μ**;

*λ*is the number of samples. The terms

**Ψ**and

*ν*are the scale matrix and the degrees of freedom, respectively. Without loss of generality and to help with the clearer interpretation of the procedure and results, a forecast track error is assumed to come out of an axisymmetric bivariate normal distribution. Axisymmetry implies the same variance in the along-track and cross-track errors, and at the same time zero correlation coefficient between the two. For the study, the procedure is applied to 458 error samples over the 3-yr period 2016–18, which are 129, 133, and 196 for each year, sequentially. The number and the covariance matrix of the annual samples update the normal-inverse-Wishart distribution. Due to the nature of the conjugate prior in a Bayesian procedure, the posterior can be simply updated by adding new information to the hyperparameters. Here, hyperparameters mean the parameters for the normal-inverse-Wishart distribution of unknown mean and variance. Given new data, the hyperparameters are modified as

where *n* is the number of new samples, the covariance matrix of which is denoted as $S$. The term $x\xaf$ is the mean of the samples, and the superscript T denotes the transpose of a matrix. The initial priors of *λ*, *ν*, *μ*_{0}, and $S$ come from the first year values, and Eqs. (3)–(6) are sequentially updated by each set of subsequent year values. By construction, the posterior is the joint density function of the bivariate normal and the inverse-Wishart distributions. Applying these posterior weights to the parameters, the predictive probability for an individual track forecast error is obtained as the marginal distribution. The marginal distribution is the probability of an error distance, which is not conditional on the parameters.

Figure 3a shows the probability distribution of the parameters. Here, only the along-track mean is demonstrated, and the cross-track mean has the same feature due to the assumption of axisymmetric error distribution. In addition, the same variance is applied to both directions. Multiple circles represent the overlapping probabilities at variable cross-track mean per each along-track mean. Based on these posterior densities, the predictive probabilities of individual track forecast errors are obtained as the marginal distribution. Now, the directional profile of density distribution is available (Fig. 3b). The abscissa indicates the distance of a location from the mean error position. Then it shows the half-cross section of the two-dimensional probability distribution along a certain direction. The black dot represents probabilities at 20-km intervals. Since an axisymmetric distribution is assumed, this probability distribution should be the same for every radial direction from the mean error position.

## 5. Storm approach probability

Storm approach probability showing how probable the storm influence will be, can be calculated at a specific location given a forecast storm warning radius. Based on the above probability density distribution, a probability matrix of the location distance and the storm warning radius is available (Fig. 4a). Here, the distance is between the specific location and mean error position. The relatively large probability density is confirmed for the shorter distance and the smaller storm warning radius. Referenced by the location distance and the storm warning radius, a storm approach probability can be assigned to specific locations. A real storm case, Typhoon Mitag in 2019, is used for a demonstration. Figure 4b shows the KMA’s 120-h forecast issued at 1200 UTC (coordinated universal time) 29 September 2019. The track forecast is shown as a blue line linking 24-h forecast positions (blue dots). Verification of the forecast with observation is not within the scope of this study. The 72-h forecast is utilized for a Bayesian representation of the probability map. The forecast storm wind warning radius is given as 300 km, and the storm approach probability at a location refers to the probability value in Fig. 4a. The intersection of the two dashed lines indicates the center of the probability distribution, which is the mean error of the 72-h track forecasts. It is confirmed that the bivariate distribution is not necessarily centered at the forecast position. The contour indicates the same probability of a storm approach. For example, 0.3 can be directly interpreted as 3 out of 10 storms in the same situation can approach the location. As the maximum probability depends on the given forecast storm warning radius (see Fig. 4a), the maximum probability on the map does not reach 1.0. This, on the other hand, implies that the probability may increase at all locations if the actual storm area gets bigger than forecast.

The study result can be compared with the representation of the conventional wind warning area (Fig. 5). Two colored circles are overlaid on the same contours in Fig. 4b. The 70% probability circle empirically driven from the cumulative error distribution is colored in blue (consistent with Figs. 1a and 2b). The green circle is the addition of the forecast radial distances of 15 m s^{−1} to the blue circle. This is similar to and a bit larger than the JTWC’s wind warning area of 34 kt (≃ 17.5 m s^{−1}) or higher wind speeds. The green circle can be compared to the 70% probability circle that the current study result suggests (black contour labeled 0.7). Both provide the wind warnings by a storm. The green circle is interpreted as the 15 m s^{−1} or higher wind speeds are likely to be included within the area at 70% probability level. Its probability distribution is unrealistically centered at the forecast location and covers a very large area, which makes the information less useful to local residents. The 70% storm approach probability at a location, on the other hand, implies that the location is likely to encounter 15 m s^{−1} or higher storm wind speeds at the same probability level. Compared to the conventional warning method based on a histogram of errors, the map of storm approach probability is considered more effective for local residents, by providing a much smaller warning area with a more realistic two-dimensional probability distribution.

## 6. Summary and discussion

This study tries to suggest a probability map showing how probable the storm influence will be by a Bayesian procedure for updating the inherent forecast uncertainty. For a demonstration of the statistical procedure, a set of operational track forecast errors from KMA is used. In the study, the following assumptions are proposed:

A forecast track error is a realization from a separate normal distribution at the moment when the forecast is issued.

The parameters for each normal distribution vary with time.

The distribution of the errors is axisymmetric.

For the assumption of axisymmetric distribution, equal variance and zero correlation between the along-track and cross-track errors are prescribed. The overall steps for the Bayesian updating and storm approach probability are summarized in Fig. 6. A preceding Bayesian method by Kang et al. (2016) is improved to update the posterior probability of the varying parameters of the bivariate normal distribution. The weights on the parameters are determined by the posterior probability distribution of the normal-inverse-Wishart distribution. For the distribution of parameters, sample mean and variance are updated on an annual basis. Based on the weights, the predictive probability density of track forecast errors is obtained as the marginal distribution. In this study, the radial distance of 15 m s^{−1} wind speed from the storm center is used for indicating the storm warning radius, and the “storm approach” is defined by the inclusion of a specific location within the storm warning area. The location-specific probability can be obtained if a forecast storm warning radius is provided. It is calculated by partial integration of the marginal distribution within the radius from the location. As a consequence, a map of storm approach probability is available to local residents.

In the beginning part of the manuscript, other representation methods such as strike probability, wind speed probability, and the two wind warning areas from JTWC and JMA were introduced. The former two are dependent on numerical model simulations, and the latter two are not based on location-specific track error density. The improvement of the current storm approach probability lies in the first location-specific representation of the pure operational forecast uncertainty through a Bayesian procedure. Compared to the conventional warning method based on a histogram of errors, the map of storm approach probability is considered more effective to local residents, by providing a much smaller warning area with a more realistic two-dimensional probability distribution. Future studies on the storm approach probability may include the uncertainty of storm size and the variance of numerical model outputs. For those, the present study is expected to contribute as meaningful research basis.

## Acknowledgments

This research was supported by Kyungpook National University Research Fund, 2019.

## REFERENCES

*Continuous Bivariate Distributions*

*Wea. Forecasting*

*Wea. Forecasting*

*Oxf. Bull. Econ. Stat.*

*Nature*

*Bayesian Data Analysis*

*Mon. Wea. Rev.*

*Bull. Amer. Meteor. Soc.*

*Wea. Forecasting*

*npj Climate Atmos. Sci.*

*Environ. Res. Lett.*

*Nat. Climate Change*

*J. Stat. Comput. Simul.*

*J. Stat. Model. Anal.*

*Appl. Stat.*

*Stat. Comput.*

*Wea. Forecasting*

*Surv. Rev.*