Abstract

To assist in thunderstorm warning, automated nowcasting systems have been developed that detect thunderstorm cells in radar images and propagate them forward in time to generate forecasted threat areas. Current methods, however, fail to quantify the probabilistic nature of the error structure of such forecasts. This paper introduces the Thunderstorm Environment Strike Probability Algorithm (THESPA), which forecasters can use to provide probabilistic thunderstorm nowcasts for risk assessment and emergency decision making. This method accounts for the prediction error by transforming thunderstorm nowcasts into a strike probability, or the probability that a given location will be impacted by a thunderstorm in a given period, by specifying a bivariate Gaussian distribution of speed and direction errors. This paper presents the development and analysis of the THESPA method and verifies performance using experimental data. Results from a statistical analysis of Thunderstorm Identification, Tracking, Analysis, and Nowcasting (TITAN) tracking errors of nowcasts made near Sydney, Australia, were used to specify the distribution, which was then applied to data collected from the World Weather Research Programme (WWRP) Beijing 2008 Forecast Demonstration Project. The results are encouraging and show Brier skill scores between 0.36 and 0.44 with respect to a deterministic advected threat area forecast.

1. Introduction

Providing public warnings of impending thunderstorms is an important role of weather services. By observing the motion and evolution of thunderstorms using radar data, forecasters can reasonably predict the location and severity of thunderstorms up to 1 h into the future. This is an example of what is widely known as “nowcasting.”

To help objectify and automate the nowcasting process, many radar-based systems have been developed in recent years. Among these are Thunderstorm Identification, Tracking, Analysis and Nowcasting (TITAN; Dixon and Wiener 1993), the Warning Decision Support System (WDSS; Eilts et al. 1996), the Short-Term Ensemble Prediction System (STEPS; Bowler et al. 2006), Short-Range Warnings of Intense Rainstorms in Localized Systems (SWIRLS; Wong et al. 2006), the Canadian Radar Decision System (CARDS; Joe et al. 2003), the Global/Regional Assimilation and Prediction System-Severe Weather Integrated Forecast Tool (GRAPES-SWIFT; Feng et al. 2007; Hu et al. 2007), the Nowcasting and Initialisation for Modelling Using Regional Observation Data System (Nimrod; Golding 1998), National Convective Weather Forecast version 2 (NCWF-2; Megenhardt et al. 2004), and the system described by Schmeits et al. (2008).

Two of these systems, TITAN and CARDS, classify thunderstorms in radar data by detecting connected regions of high-intensity reflectivity then track these regions through time to generate size, location, and velocity estimates of the thunderstorm. WDSS is similar but also uses Doppler radar data to detect regions of vorticity associated with tornadoes. The nowcast that is generated from these systems is usually of a deterministic nature, often with an implicit threat area indicated by velocity vectors and detected storm shapes. Forecasters must then interpret these indicators to make public warnings, acknowledging that the nowcasts usually contain significant errors in position, size, and intensity, owing to uncertainties in predicted storm motion and evolution. The usual approach is to hand draw a threat area that is somewhat broader than the implicit threat area.

STEPS, GRAPES-SWIFT, and Nimrod all blend radar-based nowcasting with numerical weather prediction (NWP), and, in general, forecast rainfall rather than track thunderstorms. STEPS is an ensemble precipitation forecasting system that combines advection forecasts with downscaled NWP rainfall forecasts. Similarly, Nimrod also uses advection in its nowcasting component. SWIRLS makes use of both radar and rain gauge data to monitor and predict local rainfall distribution trends within the next couple of hours, using correlation to track radar echoes. The system described by Schmeits et al. (2008) makes probabilistic thunderstorm forecasts over a 6-h period based on blending NWP and advected lightning and radar storm detections but, unlike the thunderstorm nowcasting systems described earlier, tracks thunderstorm-prone regions rather than individual storms.

Another system that generates probabilistic thunderstorm location output is NCWF-2. This system uses both advection and extrapolation of filtered radar reflectivity fields, combined with a probability distribution function (PDF) based on a “local Lagrangian” probability distribution (Germann and Zawadzki 2004) around a point. This system generates probability fields based on the radar reflectivity field, in contrast to our system (described shortly), which generates probability fields from individual thunderstorms detected by other systems, for example, TITAN.

In this paper we present the Thunderstorm Environment Strike Probability Algorithm (THESPA) to generate probabilistic nowcasts. THESPA uses the concept of a “strike probability,” originally used in cyclone forecasting, which gives the probability a given location will be affected by a storm over a nominated period. The two main approaches for computing cyclone strike probability are based on NWP, which models the cyclones explicitly. One approach employs an ensemble of numerical models or runs from a single model, where the weighted percentage of the ensemble members that put the cyclone within a set distance of a given point is the strike probability for that point (Weber 2003; van der Grijn 2002). Another approach (Templeton and Keenan 1982) calculates the strike probability by time-interpolating the forecast cyclone position in a single NWP model run, assuming the cyclone path is given by a polynomial, the shape of the cyclone is circular, the probability distribution at each step is given by an a priori bivariate normal distribution, and there is statistical independence between the time steps. THESPA computes strike probabilities also using an assumed bivariate normal distribution, but here the approach assumes a linear path, a storm shape given by the initial detection, and complete dependence along potential storm tracks with mutual exclusivity between tracks.

THESPA can improve the capabilities of current nowcasting systems by producing automated objective probabilistic threat areas based on the error characteristics of the estimated storm motion from one or more sources. This probabilistic threat area can be combined with probabilistic products from other automated systems. Additionally, forecasters can manipulate storm motion, size, and other properties to issue better warnings to provide more effective risk assessment and decision making. Such an approach is being pursued by the U.S. National Weather Service (Kuhlman et al. 2008).

In section 2 we explore the errors in predicted thunderstorm motion that lead to an empirical strike probability. An analysis of the variability of thunderstorm motion dependent on the nowcast lead time and other storm properties is included. In section 3 we develop the THESPA method for generating thunderstorm strike probability. Section 4 shows examples of THESPA strike probability nowcasts generated from TITAN nowcasts and verifies their accuracy using data collected during the World Weather Research Programme (WWRP) Beijing 2008 Forecast Demonstration Project (B08FDP) (Yu et al. 2005). In section 5 we give an example of the use of THESPA in B08FDP, where strike probabilities from three independent nowcast systems were combined to create a consensus strike probability. Conclusions are given in section 6.

2. Errors in nowcast thunderstorm motion

Automated nowcast systems such as TITAN attempt to provide an optimal estimate for thunderstorm motion, based on the most recent set of detections for individual storms. The usual approach is to detect thunderstorm cells in 2D radar constant altitude plan position indicator (CAPPI) images or 3D radar volumes using specified reflectivity and size thresholds, then “track” them in subsequent radar images using cross-correlation techniques. Each storm’s speed and direction can be estimated from its position in two or more recent detections. Some algorithms adjust the storm motion using information about the motion of surrounding storms, or model-based wind fields (Feng et al. 2007; Hu et al. 2007). Even when using such sophisticated techniques, errors in the predicted storm location may be 5–10 km after 30 min (Ebert et al. 2004). If the track errors can be estimated, they can be exploited to construct a probabilistic storm nowcast that accounts for the uncertainty in storm motion.

To investigate the statistics of predicted thunderstorm motion a database of 1682 thunderstorm tracks was collected from the Kurnell radar near Sydney, Australia. This included 8302 TITAN nowcasts up to 1 h, which were verified against actual motion (Ebert et al. 2005). The goal was to determine how the errors in nowcast thunderstorm position (determined by speed and direction) vary as a function of lead time, storm speed, storm strength, and other factors. These TITAN storm tracks were produced in an operational setting in the Sydney office of the Bureau of Meteorology and used various reflectivity thresholds to detect the thunderstorms. The 35-dBZ threshold was used most often, based on experimentation (J. Bally 2008, personal communication), although thresholds of 40 and 45 dBZ were also used. The radar scan interval was 5 min.

The first step in developing THESPA was understanding what the strike probability should look like. Intuitively, we expect it to be high near the detected storm and in the direction of the predicted motion, and fan out with decreasing probability as the distance from the detected storm increases. To make an “empirical” thunderstorm strike probability chart from the Sydney data, we use information about the storm’s predicted trajectory during a period of time, for example, 1 h, as well as the error in its predicted motion. Each storm’s velocity vector was normalized by aligning its predicted hourly trajectory onto the x axis with a (nondimensional) length of 1 and plotting its subsequent detected positions relative to the predicted trajectory. For a 5-min scan interval a perfectly predicted storm would yield 12 detections evenly spaced along the x axis between 0 and 1; imperfectly predicted tracks will generally lie off the x axis (due to errors in predicted direction) and may have unevenly spaced detections (due to errors in predicted speed). The detected relative tracks were accumulated for all nowcasts to build a two-dimensional histogram representing the empirical strike probability, shown in Fig. 1.

Fig. 1.

Histogram of normalized detected thunderstorm positions (from 1682 Kurnell thunderstorm tracks) relative to their nowcast trajectories. Each detected thunderstorm position was recorded in polar coordinates [rd /(tnsn), θdθn] where θd is the detected angular position, θn is the nowcast angular position, rd is the detected radial position, tn is the nowcast period (lead time), and sn is the nowcast speed; the result is converted to (dimensionless) Cartesian coordinates. This value was then used to increment a corresponding bin in this 2D histogram. Counts for a given bin are indicated by the scale on the right.

Fig. 1.

Histogram of normalized detected thunderstorm positions (from 1682 Kurnell thunderstorm tracks) relative to their nowcast trajectories. Each detected thunderstorm position was recorded in polar coordinates [rd /(tnsn), θdθn] where θd is the detected angular position, θn is the nowcast angular position, rd is the detected radial position, tn is the nowcast period (lead time), and sn is the nowcast speed; the result is converted to (dimensionless) Cartesian coordinates. This value was then used to increment a corresponding bin in this 2D histogram. Counts for a given bin are indicated by the scale on the right.

It can be seen in Fig. 1 that the empirical strike probability has an elliptical core and roughly triangular envelope, in agreement with our expectations. Few thunderstorms lasted the full hour; in fact most lasted only a short time or were slow compared with their initial speed, as evident from the large spike in the histogram less than 0.1 from the origin. This may be related to the use of a 35-dBZ detection threshold, which would include weak storms with a short life.

To build the error information into a mathematical model for strike probability, we next investigated the statistics of the nowcast position errors as a function of lead time and detected storm properties. The position errors were separated into speed and direction components, as earlier studies have indicated that the error in predicted storm direction generally outweighs the error in predicted speed (Bellon and Austin 1978).

Figure 2 shows distance (s), direction (θ ), and speed (V ) error standard deviations as a function of nowcast lead time. Standard deviations of the error will be used to specify the bivariate normal probability distributions that underpin THESPA. Figure 2a shows a roughly linear increase in the standard deviation of the Cartesian distance errors with time. When viewed in polar coordinates, Fig. 2b shows that the standard deviation for both speed and direction errors (σV and σθ ) are roughly constant with time, at around 10 km h−1 and 30°, respectively. This is consistent with the findings of Bellon and Austin (1978) and suggests that we can indeed use these standard deviations for predicting the variability of thunderstorm motion.

Fig. 2.

Thunderstorm nowcast verification statistics (from 1682 Kurnell thunderstorm tracks) as a function of lead time t (min) showing std devs of errors in (a) s distance (km), s/t, and s/t; and (b) V speed (km h−1) and θ direction (°).

Fig. 2.

Thunderstorm nowcast verification statistics (from 1682 Kurnell thunderstorm tracks) as a function of lead time t (min) showing std devs of errors in (a) s distance (km), s/t, and s/t; and (b) V speed (km h−1) and θ direction (°).

The results in Fig. 2 were for all storms considered together. There may be further dependencies of speed and direction error on other storm characteristics such as the height of maximum radar reflectivity, vertically integrated liquid (VIL), and detected speed. Figures 3 and 4 show that for the detected storm height of maximum reflectivity and VIL, there is rough consistency of σV and σθ through the lead times (the curves overlay each other). Moreover, no functional dependence was found on lead time. Stronger storms, that is, those with larger values of height of maximum reflectivity and VIL, tended to have smaller errors in storm direction. The speed errors were not sensitive to storm intensity except for very low values of VIL (Fig. 4a).

Fig. 3.

Thunderstorm nowcast verification statistics as a function of height of maximum reflectivity showing (a) std devs of errors in speed [σV (km h−1)] and (b) std devs of errors in direction [σθ (°)], with lead times 5–60 min shown with gray lines, and the average over all lead times shown in black. No systematic variation was found with respect to lead time; the gray lines are shown to indicate the scatter about the mean.

Fig. 3.

Thunderstorm nowcast verification statistics as a function of height of maximum reflectivity showing (a) std devs of errors in speed [σV (km h−1)] and (b) std devs of errors in direction [σθ (°)], with lead times 5–60 min shown with gray lines, and the average over all lead times shown in black. No systematic variation was found with respect to lead time; the gray lines are shown to indicate the scatter about the mean.

Fig. 4.

As in Fig. 3, for verification statistics as a function of VIL. Not all nowcasts in the Sydney dataset reported VIL, so here we show only the 10-min lead times (gray lines) and the average over all lead times (black).

Fig. 4.

As in Fig. 3, for verification statistics as a function of VIL. Not all nowcasts in the Sydney dataset reported VIL, so here we show only the 10-min lead times (gray lines) and the average over all lead times (black).

Figure 5 shows σV and σθ versus initial detection speed. Figure 5a shows somewhat erratic behavior of σV at low speed, but it becomes fairly consistent at a detected speed greater than about 15 km h−1, with a steady increase of σV from 6 to 10 km h−1 as the detected speed increases to 50 km h−1. The direction can have quite large uncertainties (σθ ≥ 50°) for the slowest thunderstorms, decreasing to about 15° for storms moving at 30 km h−1 or more, as one might expect for faster storms, the motion of which is easier to diagnose (Fig. 5b).

Fig. 5.

As in Fig. 3, for verification statistics as a function of initial detected speed V. Shown are lead times 5–60 min (gray) and the average over all lead times (black).

Fig. 5.

As in Fig. 3, for verification statistics as a function of initial detected speed V. Shown are lead times 5–60 min (gray) and the average over all lead times (black).

In summary, the speed and direction errors of estimated thunderstorm motion appear to be nearly constant with lead time, which simplifies the mathematical model used by THESPA. Of the thunderstorm measures analyzed, detected speed appears to be the best candidate for functional dependence of speed and direction error standard deviations, compared with height of maximum reflectivity and VIL. However, initial tests indicated that little improvement was likely to be gained from functional dependence. In this study, we assumed constant error statistics for all thunderstorms.

3. The THESPA method

We wish to calculate the probability that a thunderstorm passes over some point (x, y) at some time (t) in the nowcast period, given an existing thunderstorm defined by its shape, size, speed (V ), and direction (θ ) at time t = 0. Here, without loss of generality, we assume its velocity is oriented along the x axis. It is assumed that the thunderstorm moves at constant velocity, lasts for the full nowcast period, and does not change size or shape. Of course, these are not completely realistic assumptions but are necessary so that the assumption of Lagrangian steady state is justifiable over the relatively short lead times we are considering.

In section 2 it was shown that the distribution of thunderstorm speed and direction errors had standard deviations σV and σθ that were roughly constant over the nowcast period. We assume that at any instant in time t the central position, which we will call the center of gravity (CoG), of the thunderstorm is given by a bivariate normal (2D Gaussian) distribution in Cartesian space centered at x = Vt, and whose standard deviations are σx = V and σy = Vt tan(σθ). The formula for the 2D Gaussian distribution of thunderstorm CoG positions is

 
formula

which has a dimension of area−1, and is normalized so that integrating over all x and y gives a probability of 1. Figure 6 illustrates the ellipse of one standard deviation for the position of the CoG after a time t.

Fig. 6.

Schematic showing the original thunderstorm (TS) and the one-std-dev ellipse for its CoG after time t given speed V, with velocity and direction std dev of σV, σθ, respectively.

Fig. 6.

Schematic showing the original thunderstorm (TS) and the one-std-dev ellipse for its CoG after time t given speed V, with velocity and direction std dev of σV, σθ, respectively.

To derive the probability that a point (x, y) will be affected by any part of a thunderstorm over the nowcast period T, we start by computing the probability that (x, y) will coincide with the storm’s CoG, then expand the scope to include any part of the storm. This probability involves summing the probability that the point coincides with the CoG at time T together with all CoGs that coincide with the point at earlier times, that is, faster storms. This is equivalent to summing along a radial from the point to infinity due to the assumption of thunderstorm linear motion. To make the derivation more tractable, we first convert the Cartesian coordinate variables (x, y) in Eq. (1) to polar coordinate variables (r, θ ).1 We then set t = T to use the distribution at the end of the nowcast period and integrate over radial distances from r to infinity with each point weighted by the area r dr dθ. Thus the probability that a point (r, θ ) will coincide with the CoG of the thunderstorm over the nowcast period is given by

 
formula

where s and ds are radial distance measures. Note that probCoG(r, θ ) has the dimension of angle−1. Completing the integration gives

 
formula

where η = sinθ2σx2 + cosθ2σy2, and erf is the error function (Fristedt and Gray 1997).

The next step in the development of the algorithm is to derive a distribution that characterizes the locations that will be affected by any part of the thunderstorm over the nowcast period, assuming the size and shape of the storm remains unchanged during this period. To compute the required probability given a specific thunderstorm, we need for each point (x, y) to combine the probabilities of all possible future positions of the thunderstorm that overlay (x, y). This set of possible future thunderstorms is dubbed the thunderstorm overlay set, and the set of CoGs of these possible storms is dubbed the thunderstorm area (TSA), as shown in Fig. 7. The CoGs for all possible thunderstorms overlaying (x, y) encompass the TSA, the shape of which is a 180° rotation of the original thunderstorm centered at (x, y). Thus, for each such possible thunderstorm position overlaying (x, y), we find the probability at its CoG using Eq. (3), and, to account for the whole TSA, we combine such probabilities for all storm footprints within the thunderstorm overlay set.

Fig. 7.

A set of possible future positions of the original thunderstorm (shaded) that overlay a specific point (x, y) (thunderstorm overlay set). Note that those shown (light curves) are examples of the extremes of the thunderstorm overlay set, with (x, y) being right on their boundaries, and the collection of their CoGs laying on the dark solid curve (the TSA boundary). The thunderstorm overlay set also includes all those potential thunderstorms whose CoGs lie within the TSA boundary.

Fig. 7.

A set of possible future positions of the original thunderstorm (shaded) that overlay a specific point (x, y) (thunderstorm overlay set). Note that those shown (light curves) are examples of the extremes of the thunderstorm overlay set, with (x, y) being right on their boundaries, and the collection of their CoGs laying on the dark solid curve (the TSA boundary). The thunderstorm overlay set also includes all those potential thunderstorms whose CoGs lie within the TSA boundary.

Since the probabilities over the TSA are spatially correlated, they cannot simply be added. The correlation results from the fact that some potential positions of the CoG encompass other positions, namely those of thunderstorm trajectories in the same direction but faster, that is, those that have passed over the same point earlier and are now farther from the origin. These “downstream” points should not be included in the combination. Since the probability densities of the CoG positions are in units of angle−1, we can find the total probability [strikeProb(x, y)] that the point (x, y) was affected by any part of the thunderstorm by integrating along the “trailing edge” of the TSA for that point. This trailing edge is seen in Fig. 8 as points of the TSA closest to the origin. Let TExy(θ ) (a mapping from polar angle to radial distance) describe the shape of the trailing edge for the given TSA with θ1 to θ2 being the angular limits of the trailing edge. Computing the strike probability involves integrating probCoG(r, θ ) [Eq. (3)] along the trailing edge, giving

 
formula

where r has been replaced by TExy(θ ). Note that the relatively small number of points along the trailing edge compared with the full area of the TSA leads to more efficient computation. One then repeats this operation for each point (x, y) in the geographic region to generate the complete probability field.

Fig. 8.

Plan in Cartesian coordinates of the original thunderstorm (in gray), the TSA for a position (x, y), and the “trailing edge” (thick line) that contributes all the probability to the sum for that position; θ1 and θ2 are the angular limits of the trailing edge.

Fig. 8.

Plan in Cartesian coordinates of the original thunderstorm (in gray), the TSA for a position (x, y), and the “trailing edge” (thick line) that contributes all the probability to the sum for that position; θ1 and θ2 are the angular limits of the trailing edge.

To deal with the case of multiple thunderstorms in the same region, the strike probability distribution can be calculated for each thunderstorm separately and then pointwise combined with the usual rules of probability, assuming independence:

 
formula

for J thunderstorms affecting a point (x, y). Of course, storms are not necessarily statistically independent (Grecu and Krajewski 2000); however, by assuming they are we create probabilities that are upper bounds of the real values. An alternative approach would be to take the average probability, or the maximum. In a thunderstorm warning environment, potentially overstating the threat by using an upper bound may be prudent.

THESPA strike probability nowcasts for some typical thunderstorm parameters are shown in Fig. 9. The original thunderstorm shape is shown schematically as an ellipse, the interior of which of course has a probability of 1 since those points have already been affected by the thunderstorm. For high-speed thunderstorms relative to speed standard deviation (σV), the probability field shows a smoothed “triangular” envelope centered around the velocity vector (Fig. 9a). For slower-moving thunderstorms there is a non-negligible chance that the thunderstorm will move opposite to its initial velocity vector, which shows up as finite probability on the back side of the thunderstorm (with respect to velocity) (Fig. 9b).

Fig. 9.

Examples of output from THESPA with feasible storm-tracking values; (a) θ = 115°, σθ = 30°, V = 75 km h−1, σV = 10 km h−1; (b) θ = 115°, σθ = 30°, V = 4 km h−1, σV = 10 km h−1, with probability contours shown at 0.1 intervals. The original storms are shown as ellipses with nowcast velocity as an arrow. A scale marker shows 25 km. Black shading represents a strike probability of 1, and white shading represents a strike probability of 0, as indicated by the grayscale bar on the right.

Fig. 9.

Examples of output from THESPA with feasible storm-tracking values; (a) θ = 115°, σθ = 30°, V = 75 km h−1, σV = 10 km h−1; (b) θ = 115°, σθ = 30°, V = 4 km h−1, σV = 10 km h−1, with probability contours shown at 0.1 intervals. The original storms are shown as ellipses with nowcast velocity as an arrow. A scale marker shows 25 km. Black shading represents a strike probability of 1, and white shading represents a strike probability of 0, as indicated by the grayscale bar on the right.

4. Accuracy of strike probability nowcasts

We assess the quality of the strike probabilities by examining their reliability, that is, the agreement between the predicted probabilities and the corresponding observed detection frequency in storm datasets from Sydney and Beijing.

For each storm track, a 60-min strike probability nowcast at 1-km spatial resolution was generated from the first detection that included a nowcast velocity (this is the second thunderstorm detection, as TITAN requires two detections to compute velocity). The values for speed and direction uncertainty were set to σV = 10 km h−1 and σθ = 30° in accordance with the results from section 2. For verification, the observed area affected by the storm was defined as the convex hull of each storm observation (which were at 5- or 6-min intervals) and its predecessor in its track (if any), as shown in Fig. 10. For all forecast pixels with a given probability value the number of pixels containing observed thunderstorm occurrences and non-occurrences were counted, giving an observed relative frequency. These observed relative frequencies were then plotted against probability value, giving a reliability diagram (Wilks 1995). Since the algorithm will return nonzero probability values for points far from the initially detected storm, a bounding box is drawn around the storm based on its size and velocity. Not calculating a probability beyond this box improves computational efficiency of the algorithm. The ignored points had less than 0.030, and usually less than 0.005, probability.

Fig. 10.

Schematic diagram showing a pair of successive storm detections (dark gray) at time steps t = n and t = n + 1. For verification we wish to include not only the points under detected storms (dark gray) but also the points between detections that were presumably under the storm between radar scans (light gray). Hence, for verification we include all points in the convex hull (shown in both shades of gray) of the two detections.

Fig. 10.

Schematic diagram showing a pair of successive storm detections (dark gray) at time steps t = n and t = n + 1. For verification we wish to include not only the points under detected storms (dark gray) but also the points between detections that were presumably under the storm between radar scans (light gray). Hence, for verification we include all points in the convex hull (shown in both shades of gray) of the two detections.

We also calculate the Brier skill score (BSS) (Wilks 1995) as an overall measure of the system’s performance. The Brier score (BS) gives the magnitude of the probability forecast errors and is defined as

 
formula

where pi is the ith probability forecast of some event occurring in a series of N trials, and oi is the subsequent observation of that event, set to 1 if it occurs and zero otherwise. The Brier score measures the mean-squared probability error and ranges from 0 to 1, with a perfect score at 0. The BSS measures the relative skill of a probabilistic forecast over that of some reference forecast and is defined as

 
formula

ranging from minus infinity to 1. A BSS of 0 indicates no skill when compared to the reference forecast, and a perfect score is 1.

For a reference forecast we use a variation of the current manual forecast method, which is to project the detected storm forward at its detected velocity for the nowcast period and to define the probability of the swept-out area to be one and the rest zero (Fig. 11). We refer to this as the “advected threat area forecast.”

Fig. 11.

Schematic diagram showing the advected threat area forecast modeled on current manual practice that is used as the reference forecast in calculating the BSS. Here a detected thunderstorm is swept forward along the detected velocity vector to a new position at the end of the nowcast period, with the swept out area defined as having a thunderstorm strike probability of 1, and the rest 0.

Fig. 11.

Schematic diagram showing the advected threat area forecast modeled on current manual practice that is used as the reference forecast in calculating the BSS. Here a detected thunderstorm is swept forward along the detected velocity vector to a new position at the end of the nowcast period, with the swept out area defined as having a thunderstorm strike probability of 1, and the rest 0.

To get an initial assessment of the performance of THESPA to forecast thunderstorm strike probability, we examined the aforementioned Sydney dataset. Since this dataset was also used to determine the speed and direction standard deviations for THESPA, this initial study is not as useful a guide to the performance as an independent dataset but can be viewed as an upper bound on system performance. Overall, higher nowcast strike probabilities were associated with higher observed frequency of thunderstorm detection, which means THESPA was working correctly. When verified for all storms THESPA had a tendency to overestimate the probabilities, particularly at the highest values (Fig. 12a). These higher probabilities, in general over 0.8, were for regions close to the original storm. The drop in observed frequency for these regions was due to weak storms that were not subsequently detected, that is, having a storm lifetime of one radar scan cycle (hereafter called a “storm lifetime of one step”). When these weak storms were filtered out using VIL greater than 30 kg m−2 as a criterion, the reliability improved substantially (Fig. 12b). To confirm this explanation, it was found that the number of storms with a lifetime of one step in the dataset was 175, or about 10% of the total, but when storms with VIL under 30 kg m−2 were filtered out, the number of storms with a lifetime of one step was 21, or about 1.2%. The smaller number of cases at these higher probabilities (around 100 pixels per probability increment) also gave noisy or erratic results due to limited statistical sampling. Interestingly, these results are not reflected in the BSS, which shows little change between filtered and nonfiltered data. Even though the BS is halved, that is, improved, for the filtered data (Table 1), filtering the data also improved the advected forecast.

Fig. 12.

Reliability diagrams for thunderstorm strike probability computed from the original 1682 Sydney TITAN storm tracks with σV = 10 km h−1 and σθ = 30° for (a) all thunderstorms and (b) only those storms with VIL greater than 30 kg m−2. The diagonal line indicates perfect reliability. The sharpness diagram at upper left on each chart shows the number of times each probability value (horizontal axis, linear scale) was predicted (vertical axis, logarithmic scale). The BSS for (a) was 0.31 and for (b) was 0.29, with respect to the advected threat area forecast.

Fig. 12.

Reliability diagrams for thunderstorm strike probability computed from the original 1682 Sydney TITAN storm tracks with σV = 10 km h−1 and σθ = 30° for (a) all thunderstorms and (b) only those storms with VIL greater than 30 kg m−2. The diagonal line indicates perfect reliability. The sharpness diagram at upper left on each chart shows the number of times each probability value (horizontal axis, linear scale) was predicted (vertical axis, logarithmic scale). The BSS for (a) was 0.31 and for (b) was 0.29, with respect to the advected threat area forecast.

Table 1.

BS for THESPA (BS Th), BS for advected forecast (BS ad), resulting BSS, and percentage of short-lived storms, (i.e., lifetimes of fewer than four steps) for cases of filtering and not filtering on VIL less than 30 kg m−2 for datasets of Beijing data thresholded on 45, 40, and 35 dBZ and the Sydney data. In general, the smaller the percentage of short-lived storms, the better the BSS.

BS for THESPA (BS Th), BS for advected forecast (BS ad), resulting BSS, and percentage of short-lived storms, (i.e., lifetimes of fewer than four steps) for cases of filtering and not filtering on VIL less than 30 kg m−2 for datasets of Beijing data thresholded on 45, 40, and 35 dBZ and the Sydney data. In general, the smaller the percentage of short-lived storms, the better the BSS.
BS for THESPA (BS Th), BS for advected forecast (BS ad), resulting BSS, and percentage of short-lived storms, (i.e., lifetimes of fewer than four steps) for cases of filtering and not filtering on VIL less than 30 kg m−2 for datasets of Beijing data thresholded on 45, 40, and 35 dBZ and the Sydney data. In general, the smaller the percentage of short-lived storms, the better the BSS.

Next, THESPA was evaluated using TITAN thunderstorm nowcasts from B08FDP. In this experiment forecasters used a storm detection threshold of 45 dBZ, somewhat greater than that used in the Sydney calibration dataset. This provided a dataset of 2394 tracks, which was processed with the same standard deviations as determined from the Sydney data. The climate of Beijing, China, is different than that of Sydney, but one would expect the higher threshold to select more stable, predictable storms, and hence THESPA should be more reliable, especially at higher probability, than for the Sydney data. This was found to be the case, as is shown next.

The reliability for the THESPA strike probabilities in B08FDP can be seen in Fig. 13. Figure 13a shows results for all storms, and Fig. 13b shows results for the same storms also filtered on VIL greater than 30 kg m−2. Here we also include the “no resolution” (climatology) and the “no skill” (halfway between climatology and a perfect forecast) lines in the reliability diagram. The climatological strike frequency was calculated from the 51 days of B08FDP, 1 August–20 September 2008. This value for B08FDP at 45 dBZ is 0.003. The THESPA strike probabilities were quite reliable for the Beijing data, with better reliability for high-probability forecasts, compared with Sydney. The BSS for Beijing varied between 0.36 and 0.44, depending on the storm detection threshold and whether or not the storms were filtered by VIL (Table 1). The improved behavior of the VIL-filtered data (Fig. 13b) at high probability can be explained in terms of filtering out of numerous short-lived low-VIL “pulse” storms observed during B08FDP (P. Joe 2008, personal communication). We verified this hypothesis by counting the number of short-lived storms (storm lifetimes of three or fewer steps, or 12 min, which is 20% of the nowcast period) in the database, which amounted to 830, or about 33% of all storms. These short-lived storms would mostly affect higher-strike-probability regions. This was then compared with the same count after filtering out storms with VIL less than 30 kg m−2, and, indeed, the number of short-lived storms dropped to 228 (or about 9%).

Fig. 13.

As in Fig. 12, for THESPA strike probabilities generated from TITAN nowcasts collected during the WWRP Beijing 2008 Forecast Demonstration Project using a storm detection threshold of 45 dBZ (2394 thunderstorm tracks). This diagram includes the “no skill” and “no resolution” lines calculated from the climatology of thunderstorm occurrence. The BSS for (a) was 0.39 and for (b) was 0.41, with respect to the advected threat area forecast.

Fig. 13.

As in Fig. 12, for THESPA strike probabilities generated from TITAN nowcasts collected during the WWRP Beijing 2008 Forecast Demonstration Project using a storm detection threshold of 45 dBZ (2394 thunderstorm tracks). This diagram includes the “no skill” and “no resolution” lines calculated from the climatology of thunderstorm occurrence. The BSS for (a) was 0.39 and for (b) was 0.41, with respect to the advected threat area forecast.

To test the sensitivity of THESPA performance to the assigned parameters, namely thunderstorm detection threshold, σV and σθ, the B08FDP radar data were reprocessed to generate new TITAN nowcasts and new THESPA strike probabilities. Figure 14 shows the reliability when the detection threshold was 45 dBZ, the standard deviation for speed was decreased to 7 km h−1 (Fig. 14a), and the standard deviation for direction was reduced to 20° (Fig. 14b). Reducing the specified uncertainty in the direction error reduced the angular spread of the probability forecasts so that the probabilities were increased overall. This increase in overprediction was slightly detrimental to the forecast. On the other hand, the reduction in speed error uncertainty had less effect on the reliability. This is because the speed error uncertainty mainly affects the “leading edge” of the probability distribution, with fewer storms surviving long enough to influence the statistics there, whereas the direction error uncertainty affects the distribution for storms of any lifetime. Neither of these effects was reflected in the BSS, suggesting that THESPA is not very sensitive to σV and σθ.

Fig. 14.

As in Fig. 13a, with smaller assumed (a) speed error 7 km h−1 and (b) direction error 20°. The BSS for (a) was 0.40 and for (b) it was 0.41, with respect to the advected threat area forecast.

Fig. 14.

As in Fig. 13a, with smaller assumed (a) speed error 7 km h−1 and (b) direction error 20°. The BSS for (a) was 0.40 and for (b) it was 0.41, with respect to the advected threat area forecast.

Using a lower detection threshold of 40 dBZ (generating a database of 4244 storm tracks) appeared to have little effect on the reliability of the strike probabilities, as seen in Fig. 15. The BSS values were also little affected, with 0.38 for 40 dBZ data and 0.39 for 45 dBZ data. Similarly, when filtered on VIL there was little or no difference, with BSS of 0.41 for both datasets. This is likely due to the small change in the dataset size when filtered on 40 versus 45 dBZ, whereas the next step in dBZ, explained below, appears to have a larger effect.

Fig. 15.

As in Fig. 13a, with a storm detection threshold set to 40 dBZ (4244 storm tracks) for (a) all storms and (b) only those storms with VIL greater than 30 kg m−2. The BSS for (a) was 0.38 and for (b) was 0.41, with respect to the advected threat area forecast.

Fig. 15.

As in Fig. 13a, with a storm detection threshold set to 40 dBZ (4244 storm tracks) for (a) all storms and (b) only those storms with VIL greater than 30 kg m−2. The BSS for (a) was 0.38 and for (b) was 0.41, with respect to the advected threat area forecast.

Using a 35-dBZ threshold (resulting in a database of 12 597 storm tracks) had a negative impact with a BSS of 0.36. And unlike the 40- and 45-dBZ TITAN nowcasts, restricting the storms to high-VIL cases improved the reliability for the 35-dBZ threshold case (Fig. 16b), with an improvement in the BSS to 0.44. This is due to filtering out short-lived storms, with total short-lived storms in the database numbering 5521, or about 36%, and short-lived storms with VIL greater than 30 kg m−2 numbering 376, or about 2.5%. Table 1 gives a a summary of these results. In fact, the TITAN storm detection threshold was raised from 35 to 40 dBZ early in B08FDP, when it was seen that heavy stratiform rain was being misdiagnosed as thunderstorms (J. Bally 2008, personal communication). In general, the smaller the percentage of short-lived storms, the better the BSS, suggesting that THESPA could be improved if it took into account the distribution of storm lifetimes for the location and season.

Fig. 16.

As in Fig. 15, with a storm detection threshold set to 35 dBZ (12 597 storm tracks). The BSS for (a) was 0.36 and for (b) was 0.44, with respect to the advected threat area forecast.

Fig. 16.

As in Fig. 15, with a storm detection threshold set to 35 dBZ (12 597 storm tracks). The BSS for (a) was 0.36 and for (b) was 0.44, with respect to the advected threat area forecast.

5. Real-time example

The computational efficiency of THESPA allowed it to be run in the Thunderstorm Interactive Forecast System (TIFS; Bally 2004) in real time during B08FDP, as strike probability calculations for several storms in a 500-km square domain took only 1–2 s when run on a Linux workstation. In real-time operation in B08FDP strike probabilities were calculated not only for TITAN thunderstorm nowcasts but also from SWIRLS (Wong et al. 2006) and CARDS (Joe et al. 2003) nowcasts. In the absence of advance knowledge of typical track errors for SWIRLS and CARDS nowcasts, the strike probability calculations for these nowcasts used the same values for σV and σθ as were determined for the Sydney TITAN cases. The strike probability grids from each individual nowcast system were then averaged to create consensus strike probability nowcasts. In principle the input nowcasts could be weighted according to their recent accuracy; however, this was not done during B08FDP. These consensus strike probability grids formed the basis for integrated objective thunderstorm guidance used by operational forecasters at the Beijing Meteorological Bureau (BMB) during the Olympic Games (Bally et al. 2009).

A sample strike probability chart is shown in Fig. 17 for nowcasts generated on 6 September 2008 when thunderstorms were moving though Beijing from the southwest. TITAN, SWIRLS, and CARDS all detected the presence of a large thunderstorm over the city center but estimated somewhat different sizes and velocities for the storm. The consensus strike probability map shows a strong chance of the thunderstorm affecting a southwest–northeast region across central Beijing, with probabilities falling off toward the northeast. The two smaller target regions in the northeastern part of the strike probability map reflect the prediction of thunderstorms by only one of the three input systems (SWIRLS) and therefore have a central probability of 0.33. These display the characteristic fan-shaped probability pattern predicted by THESPA (see Fig. 9).

Fig. 17.

Thunderstorm nowcasts from the TITAN, SWIRLS, and CARDS nowcasting systems, and consensus strike probability (lower right) for the hour beginning at 2324 UTC 6 Sep 2008. The arrows show the predicted movement of the thunderstorms during the next 60 min.

Fig. 17.

Thunderstorm nowcasts from the TITAN, SWIRLS, and CARDS nowcasting systems, and consensus strike probability (lower right) for the hour beginning at 2324 UTC 6 Sep 2008. The arrows show the predicted movement of the thunderstorms during the next 60 min.

This is the first time probabilistic thunderstorm predictions from different nowcasting systems have been combined. BMB forecasters found them very useful, according to surveys conducted during the Olympics (Y. Duan 2008, personal communication). Verification of the real-time consensus strike probability nowcasts produced during B08FDP showed a high degree of reliability and skill (Bally et al. 2009).

6. Conclusions

In this paper we have presented a new algorithm for producing thunderstorm strike probabilities for thunderstorms detected with nowcast systems. A brief review of existing thunderstorm detection and nowcast systems showed that most such systems generate output consisting of shape, size, and velocity of thunderstorms, but not probabilities that can used by other systems. Some nowcast systems do generate thunderstorm probabilities, but these in general do not nowcast individual storms but rather the likelihood of thunderstorms occurring over regions, often blending NWP output with radar or other observational techniques.

We next investigated the statistics of thunderstorm motion using a dataset from a radar near Sydney, Australia, and found that the standard deviation of errors in speed and direction of motion were roughly constant over the lifetime of the storms. This provided the basis for the subsequent development of THESPA.

THESPA was designed for mathematical tractability and algorithmic simplicity. Verifying its output using datasets from the original Sydney radar, and also data from the Beijing 2008 Forecast Demonstration Project (B08FDP), showed the reliability to be quite good, with BSS between 0.36 and 0.44 with respect to an advected threat area forecast. The results were not very sensitive to the chosen values of the parameters, that is, detection threshold, σV and σθ. Likewise, favorable performance was shown when the algorithm parameters determined using the Sydney data were applied to data collected during B08FDP. The hypothesis that these parameters provide similar results in other locations should be substantiated in a future study. Looking more closely at the results, we also found that the smaller the percentage of short-lived storms, that is, those with a lifetime of three or fewer steps, the better the skill. This indicates that a fruitful line of enquiry would be to further develop THESPA to take into account the statistics of storm lifetime.

Some additional calibration of the probabilities, for example, as a postprocessing step, could improve their reliability still further. Including a functional dependence of the speed and direction error standard deviations on detected storm properties, namely VIL and thunderstorm speed, may also be tested.

The algorithm was embedded in the Thunderstorm Interactive Forecast System used by forecasters in the B08FDP, who found THESPA strike probabilities to be very useful. THESPA will be migrated into an operational environment in the Australian Bureau of Meteorology in the near future.

Acknowledgments

Tony Bannister and John Bally provided many useful comments during the course of this work. The authors thank the anonymous reviewers, especially reviewer A, for their very useful contributions to this paper.

REFERENCES

REFERENCES
Bally
,
J.
,
2004
:
The thunderstorm interactive forecast system: Turning automated thunderstorm tracks into severe weather warnings.
Wea. Forecasting
,
19
,
64
72
.
Bally
,
J.
,
D.
Scurrah
,
E.
Ebert
, and
D.
Su
,
2009
:
Composite products and nowcast decision support for the Beijing 2008 forecast demonstration project.
Proc. WMO Symp. on Nowcasting, Whistler, BC, Canada, WMO, 1.6. [Available online at http://www.nowcasting2009.ca/images/stories/Presentations/WSN09_Presentations.htm]
.
Bellon
,
A.
, and
G. L.
Austin
,
1978
:
The evaluation of two years of real-time operation of a short-term precipitation forecasting procedure (SHARP).
J. Appl. Meteor.
,
17
,
1778
1787
.
Bowler
,
N.
,
C.
Pierce
, and
A.
Seed
,
2006
:
STEPS: A probabilistic precipitation forecasting scheme which merges an extrapolation nowcast with downscaled NWP.
Quart. J. Roy. Meteor. Soc.
,
132
,
2127
2155
.
Dixon
,
M.
, and
G.
Wiener
,
1993
:
TITAN: Thunderstorm identification, tracking, analysis and nowcasting—A radar-based methodology.
J. Atmos. Oceanic Technol.
,
10
,
785
797
.
Ebert
,
E.
,
L.
Wilson
,
B.
Brown
,
P.
Nurmi
,
H.
Brooks
,
J.
Bally
, and
M.
Jaeneke
,
2004
:
Verification of nowcasts from the WWRP Sydney 2000 Forecast Demonstration Project.
Wea. Forecasting
,
19
,
73
96
.
Ebert
,
E.
,
T.
Keenan
,
J.
Bally
, and
S.
Dance
,
2005
:
Verification of operational thunderstorm nowcasts.
Proc. World Weather Research Program Symp. on Nowcasting and Very Short Range Forecasting, Toulouse, France, WMO WWRP, 8.12. [Available online at http://www.meteo.fr/cic/wsn05/DVD/index.html]
.
Eilts
,
M. D.
, and
Coauthors
,
1996
:
Severe weather warning decision support system.
Preprints, 18th Conf. on Severe Local Storms, San Francisco, CA, Amer. Meteor. Soc., 536–540
.
Feng
,
Y.
,
Y.
Wang
,
T.
Peng
, and
J.
Yan
,
2007
:
An algorithm on convective weather potential in the early rainy season over the Pearl River Delta in China.
Adv. Atmos. Sci.
,
24
,
101
110
.
Fristedt
,
B.
, and
L.
Gray
,
1997
:
A Modern Approach to Probability Theory.
Birkhäuser Boston, 756 pp
.
Germann
,
U.
, and
I.
Zawadzki
,
2004
:
Scale dependence of the predictability of precipitation from continental radar images. Part II: Probability forecasts.
J. Appl. Meteor.
,
43
,
74
89
.
Golding
,
B.
,
1998
:
Nimrod: A system for generating automated very short range forecasts.
Meteor. Appl.
,
5
,
1
16
.
Grecu
,
M.
, and
W. F.
Krajewski
,
2000
:
A large-sample investigation of statistical procedures for radar-based short-term quantitative precipitation forecasting.
J. Hydrol.
,
239
,
69
84
.
Hu
,
S.
,
S.
Gu
,
X.
Zhuang
, and
H.
Luo
,
2007
:
Automatic identification of storm cells using Doppler radars.
Acta Meteor. Sinica
,
21
,
353
365
.
Joe
,
P.
,
M.
Falla
,
P. V.
Rijn
,
L.
Stamadianos
,
T.
Falla
,
D.
Magosse
,
L.
Ing
, and
J.
Dobson
,
2003
:
Radar data processing for severe weather in the national radar project of Canada.
Preprints, 21st Conf. on Severe Local Storms, San Antonio, TX, Amer. Meteor. Soc., P4.13. [Available online at http://ams.confex.com/ams/pdfpapers/47421.pdf]
.
Kuhlman
,
K. M.
,
T. M.
Smith
,
G. J.
Stumpf
,
K. L.
Ortega
, and
K. L.
Manross
,
2008
:
Experimental probabilistic hazard information in practice: Results from the 2008 EWP spring program.
Preprints, 24th Conf. on Severe Local Storms, Savannah, GA, Amer. Meteor. Soc., 8A.2. [Available online at http://ams.confex.com/ams/pdfpapers/142027.pdf]
.
Megenhardt
,
D. L.
,
C.
Mueller
,
S.
Trier
,
D.
Ahijevych
, and
N.
Rehak
,
2004
:
NCWF-2 probabilistic forecasts.
Preprints, 11th Conf. on Aviation, Range, and Aerospace, Hyannis, MA, Amer. Meteor. Soc., 5.2. [Available online at http://ams.confex.com/ams/pdfpapers/81993.pdf]
.
Schmeits
,
M. J.
,
K. J.
Kok
,
D. H. P.
Vogelezang
, and
R. M. V.
Westrhenen
,
2008
:
Probabilistic forecasts of (severe) thunderstorms for the purpose of issuing a weather alarm in the Netherlands.
Wea. Forecasting
,
23
,
1253
1267
.
Templeton
,
J. I.
, and
T. D.
Keenan
,
1982
:
Tropical cyclone strike probability forecasting in the Australian region.
Bureau of Meteorology Tech. Rep. 49, 18 pp. [Available from Bureau of Meteorology, GPO Box 1289K, Melbourne, VIC, 3001, Australia]
.
van der Grijn
,
G.
,
2002
:
Tropical cyclone forecasting at ECMWF: New products and validation.
ECMWF Tech. Memo. 386, 13 pp
.
Weber
,
H. C.
,
2003
:
Hurricane track prediction using a statistical ensemble of numerical models.
Mon. Wea. Rev.
,
131
,
749
770
.
Wilks
,
D. S.
,
1995
:
Statistical Methods in Atmospheric Sciences.
Academic Press, 467 pp
.
Wong
,
M.
,
W.
Wong
, and
E.
Lai
,
2006
:
From SWIRLS to RAPIDS: Nowcast applications development in Hong Kong.
Proc. PWS Workshop on Warnings of Real-Time Hazards by Using Nowcasting Technology, Sydney, Australia, WMO, 9–13
.
Yu
,
X.
,
T.
Keenan
,
P.
Joe
,
J.
Wilson
,
J.
Bally
, and
Y. C.
Wang
,
2005
:
Overview of Beijing 2008 Olympics WWRP nowcast FDP scientific and implementation plan.
Proc. World Weather Research Program Symposium on Nowcasting and Very Short Range Forecasting, Toulouse, France, WMO WWRP, 7.34. [Available online at http://www.meteo.fr/cic/wsn05/DVD/index.html]
.

Footnotes

Corresponding author address: Sandy Dance, CAWCR, GPO Box 1289K, Melbourne, VIC 3001, Australia. Email: s.dance@bom.gov.au

1

To convert from Cartesian to polar coordinates, use x = r cosθ and y = r sinθ.

* A partnership between the Australian Bureau of Meteorology and CSIRO.