VOL. 10, NO. 6 JOURNAL OF ATMOSPHERIC AND OCEANIC TECHNOLOGY DECEMBER 1993

TITAN: Thunderstorm Identification, Tracking, Analysis, and Nowcasting--A Radar-based Methodology

MICHAEL DIXON AND GERRY WIENER

Research Applications Program, National Center for Atmospheric Research, Boulder, Colorado

(Manuscript received 18 May 1992, in final form 25 January 1993)

ABSTRACT

A methodology is presented for the real-time automated identification, tracking, and short-term forecasting of thunderstorms based on volume-scan weather radar data. The emphasis is on the concepts upon which the methodology is based. A "storm" is defined as a contiguous region exceeding thresholds for reflectivity and size. Storms defined in this way are identified at discrete time intervals. An optimization scheme is employed to match the storms at one time with those at the following time, with some geometric logic to deal with mergers and splits. The short-term forecast of both position and size is based on a weighted linear fit to the storm track history data. The performance of the detection and forecast was evaluated for the summer 1991 season, and the results are presented.

1. Introduction

In convective situations, the forecasting problem encompasses storm initiation, evolution, and movement. For storm initiation, progress has been made in the use of data from sensitive Doppler radars to detect those boundary-layer features that are important for the forecast (Wilson and Schreiber 1986). Meanwhile, forecasters have identified the need for an objective procedure for detecting existing storms and extrapolating their evolution and movement (Wilson and Mueller 1993). This paper deals with the development of such a procedure.

Many of the techniques developed for the short-term forecasting of convective activity have incorporated some form of tracking, such as the extrapolation of the movement of features in the data.
Pattern recognition (Austin 1985) uses the similarity of patterns in the data fields at successive times to deduce movement. The cross-correlation technique (Rinehart and Garvey 1978; Tuttle and Foote 1990) partitions the data fields into blocks or features, and identifies the movement vector that maximizes the correlation between a feature in the latest data field and the corresponding, but translated, feature in the previous data field. Both the pattern recognition and cross-correlation techniques treat the data as a two-dimensional field from which the movement of features may be inferred.

An alternative approach is to consider storms to be distinct three-dimensional entities that may be identified and for which physically based properties may be computed. These entities are then tracked by matching the storms at one time to their counterparts at a later time. This is referred to as "centroid tracking" (Austin and Bellon 1982). The advantage of this approach is that it makes more complete use of the information available; therefore, if done correctly, centroid tracking should produce better forecasts than the techniques based on two-dimensional data. In addition, this method provides a tool for the scientific analysis of storms as three-dimensional entities.

Corresponding author address: Dr. Michael J. Dixon, Research Applications Program, NCAR, RAP/FW, P.O. Box 3000, Boulder, CO 80307-3000.

Crane (1979) presents a method in which two-dimensional cells are identified as regions around local maxima in the reflectivity field for a given PPI (plan position indicator). These cells are grouped into "volume cells" through the vertical association of cells in successive PPIs. The volume cells are tracked by estimating their velocity from past and present centroid locations, using this velocity to forecast the new position, and searching for cells close to the forecast position. For a recently formed storm, the steering-level wind is used as the forecast velocity.
Witt and Johnson (1993) and Rosenfeld (1987) detail similar methods. In Witt's method, cells are defined as regions with reflectivity exceeding a given threshold, and the forecast velocity is based on a linear fit to the recent centroid history of the storm. The starting storm velocity is an operator-input value, which presumably would be set to the best guess for storm movement. Rosenfeld's method uses a reflectivity threshold to delineate a storm, and then identifies cells within the storm as regions around local maxima. The tracking technique is similar to Crane's but is more complicated and includes a check for overlap between the actual cell positions and their corresponding forecast locations.

© 1993 American Meteorological Society

The method presented here is similar, in some respects, to those referred to above, in that storms are defined as three-dimensional regions of reflectivity exceeding a threshold and are logically matched from one scan time to the next. However, the identification method is much simpler and is based on radar data remapped into Cartesian coordinates, whereas the methods just referred to use data in radar coordinates, with the associated geometrical complexity. The tracking component is based on an optimal solution to the matching problem, and no assumption is made about initial storm movement. Mergers and splits are identified through geometric logic about the storm positions and shapes. Forecasts are based on a weighted linear fit to the history of both the position and size of the storm. Throughout, the emphasis is on simplicity, because many similar methods have a tendency to become overcomplicated.

The system is designed to keep pace with real-time radar data and to provide analysis and forecast results within 10 s or so of the end of a volume scan. The storm and track data are maintained in a database that permits analysis of the storm and track properties.
The capability to display these properties during real-time data operations has been demonstrated. In addition, the track data are available for postanalysis of storm properties and the accuracy of the forecasts.

This methodology was originally developed, albeit in a somewhat simpler form, for the objective evaluation of a rain-augmentation experiment (Dixon and Mather 1986). Since mid-1990, the technique has been refined and enhanced as part of an effort to improve and automate convective nowcasting in the Denver region of Colorado.

2. Storm identification

a. Storm definition

The experimental unit is defined here as a contiguous region, all of which exhibits reflectivities above a given threshold (Tz), and the volume of which exceeds a threshold (Tv).

Clearly, the value of Tz determines the type of storm that will be identified. Some possibilities are

- individual convective cells, Tz = 40-50 dBZ,
- convective storms, Tz = 30-40 dBZ,
- mesoscale convective complexes, Tz = 25-30 dBZ,
- snow bands, Tz = 15-25 dBZ.

For this study, the experimental unit is the convective storm. To investigate the sensitivity of the method to the reflectivity threshold, Tz was set to 30, 35, and 40 dBZ. For all three cases the identification and tracking technique worked well. Obviously, the 40-dBZ storms were smaller and more intense (on average) than the 30-dBZ storms, and the lower the threshold the greater the number of apparent mergers.

Based on the results of the initial experiments, Tz was set to 35 dBZ for the final phase of the study. It should be borne in mind that the subject of this paper is the methodology, and that 35 dBZ was chosen as a threshold suitable for developing and evaluating the method. We are not suggesting that this is the correct or only threshold for studying convective storms. (Incidentally, the system was tested on snow band storms, at thresholds ranging from 15 to 25 dBZ, with promising results.)
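As an illustration of this storm definition, the sketch below identifies contiguous regions exceeding Tz and discards those smaller than Tv. It is a minimal 2-D example (as in Fig. 1), with hypothetical names, and uses a simple 4-connected flood fill in place of the run-grouping procedure described in section 2c; the two are equivalent for identification purposes, since diagonal contact does not join regions in either case.

```python
import numpy as np

def identify_storms(refl, t_z=35.0, t_v=50.0, cell_km3=1.0):
    """Label contiguous regions with reflectivity above t_z (dBZ), then
    discard regions whose volume falls below t_v (km^3). Adjacency is
    edge-sharing only, so diagonal contact does not join regions."""
    above = refl > t_z
    ny, nx = above.shape
    labels = np.zeros(above.shape, dtype=int)
    next_label = 0
    for y in range(ny):
        for x in range(nx):
            if above[y, x] and labels[y, x] == 0:
                # start a new region and flood-fill its 4-connected neighbors
                next_label += 1
                labels[y, x] = next_label
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        py, px = cy + dy, cx + dx
                        if (0 <= py < ny and 0 <= px < nx
                                and above[py, px] and labels[py, px] == 0):
                            labels[py, px] = next_label
                            stack.append((py, px))
    storms = []
    for lab in range(1, next_label + 1):
        pts = np.argwhere(labels == lab)
        if len(pts) * cell_km3 >= t_v:   # volume threshold Tv
            storms.append(pts)
    return storms
```

In three dimensions the same logic applies with a third loop index and two extra neighbor offsets; the run-based formulation of section 2c achieves the same result more efficiently by grouping whole runs instead of single cells.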
The use of the volume threshold Tv is necessary to prevent the tracking of noise or small regions of residual ground clutter, and to keep the number of identified storms within reasonable limits. For this study, Tv was set to 50 km3.

b. Data preparation

The storm identification technique may be applied to data in the radar coordinate system of range, azimuth, and elevation. In fact, the methodology was originally developed on radar-space data (Dixon and Mather 1986). The geometry of a Cartesian coordinate system is much simpler, however, and this both aids one's conceptual understanding of the procedure and simplifies the computations of storm properties. Therefore, the radar coordinate data are transformed into Cartesian coordinates, and noise and ground clutter are filtered out. The details of these operations are given in appendixes A and B.

c. The identification method

To identify "storms" according to the definition given in section 2a, we need to find contiguous regions that have reflectivities above Tz. For clarity, this will be described in two dimensions (x, y); the concept is readily extended to incorporate the third (z) dimension. Consider Fig. 1. Assume that all of the shaded squares represent Cartesian grid locations with reflectivity in excess of Tz. There are two steps to the procedure:

1) Identify contiguous sequences of points (referred to as runs) in one of the principal directions (in this case the x direction) for which the reflectivity exceeds Tz. There are 15 such runs in Fig. 1.

2) Group runs that are adjacent. A group of runs should contain all of the points in one storm. In this example, storm 1 comprises runs 1-6; storm 2 comprises runs 7, 8, and 10; storm 3 comprises runs 9, 11, 13, and 14; storm 4 contains only run 12; and storm 5, only run 15. Note that runs 5 and 7 are not considered adjacent, since they only touch along a diagonal. The same applies to runs 12 and 15. It is likely that storms 4 and 5 would be rejected because they are small.

FIG. 1. Example of storm data runs (2D case). Shading indicates grid points where the reflectivity exceeds Tz. Different shades indicate different storms.

In the three-dimensional case, step 2 searches for adjacent runs in both y (alongside) and z (above or below). The advantage of this two-step approach is that it reduces the dimensionality of the problem, making it more efficient computationally. In the case of a three-dimensional grid, once the runs have been found, the identification procedure reduces to a two-dimensional problem.

d. Storm analysis

For the purposes of the nowcasting experiment and storm analysis, a large number of storm properties were computed. These are listed in appendix C. However, only the following storm properties are relevant to this methodology:

- the reflectivity-weighted centroid (x̄, ȳ, z̄), using Z to weight the centroid computations;
- the volume V;
- the size and shape of the area of the storm projected onto a horizontal plane (i.e., the area as it would appear from directly above the storm). The shape is approximated by an ellipse that best fits the projected area, as suggested by Zittel (1976) (Fig. 2). The ellipse properties are the centroid (x̄e, ȳe), the major and minor radii (r_major, r_minor), and the orientation of the major axis relative to the x axis (θ).

The computation of the reflectivity-weighted centroid and volume is straightforward. The computation of the ellipse properties is based on a principal component transformation of the (x, y) data in the projected area. This is a rotational transform that yields axes along the principal components of the data (Richards 1986). In the two-dimensional case, these are the major and minor axes, which is why the transform is appropriate for the ellipse computations. The details are given in appendix E.

FIG. 2. Computation of projected-area ellipse parameters.

3. Tracking

a.
Matching storm sets using combinatorial optimization

Figure 3 depicts the centroids and projected areas of two sets of storms, one set at time t1 and the other at time t2, the difference Δt being the time taken to collect a single volume scan (about 5-10 min). There are not necessarily the same number of storms present at each time; in this example, there are 4 storms at t1 and 5 storms at t2. The figure also shows the possible paths the storms may have taken during the period between t1 and t2.

The problem is to match the t1 storms with their t2 counterparts, or equivalently to decide which set of logically possible paths is most likely the true one. If this is done for successive time intervals, the storms may be tracked for their entire duration. Considering the figure, one may make the following intuitive assumptions:

1) The correct set will include paths that are shorter rather than longer. This is true for thunderstorms that are observed frequently (Δt ≈ 5 min) because the ratio of the size of the storm (~3-10-km diameter) to the distance moved in Δt (~1-10 km) is such that it is unlikely that a storm will move well away, only to have its former position (or one close to it) occupied by a different storm. Therefore, given a set of possible alternatives as shown in Fig. 3, the shorter the path the more likely it is to be a true one.

2) The correct set will join storms of similar characteristics (size, shape, etc.).

3) There is an upper bound to the distance a storm will move in Δt, governed by the maximum expected speed of storm movement (advection plus lateral development). In the figure, the paths that exceed this upper bound are drawn as faint lines.

FIG. 3. Possible paths between storms at consecutive time intervals.
The problem of determining the true set of storm paths may be posed and solved as one of optimization. We search for the optimal set of paths, where this set minimizes the cost function defined below, and we assume that the optimal set and the true set are the same.

Suppose that a storm i at t1 has state S1i = (x̄1i, ȳ1i, V1i), and storm j at t2 has state S2j = (x̄2j, ȳ2j, V2j). Suppose too that there are n1 storms at t1 and n2 storms at t2. We may define the "cost" Cij (in units of distance) of changing state S1i to state S2j as

Cij = w1*dp + w2*dv, (1)

where

dp = [(x̄1i − x̄2j)² + (ȳ1i − ȳ2j)²]^(1/2) (assumption 1), (2)

and

dv = |V1i^(1/3) − V2j^(1/3)| (assumption 2). (3)

Here dp is a measure of the difference in position (i.e., the distance moved), dv is a measure of the difference in volume (also in units of distance, because of the cube root), and w1 and w2 are weights (both set to 1.0 for this study).

Let the maximum expected storm speed be Smax (60 km h−1 was used). This is a constraint on the system, and may be incorporated by setting

Cij = a large number if dp/Δt > Smax (assumption 3). (4)

This will ensure that the apparent cost of such a path is so high that it will not be included in the optimal set.

We wish to find the match that minimizes the objective function Q = Σ Cij, where i refers to the start point of a path and j the corresponding end point, and the summation is performed over all the paths in a candidate set. The number of paths in the match will be less than or equal to the minimum of n1 and n2.

A problem posed in this manner can be transformed into a weighted matching or optimal assignment problem and can be solved using techniques from the field of combinatorial optimization. The transformed space has size n × n, where n is the maximum of n1 and n2. The method has order O(n³). (The order of a method is a constant multiple of the maximum number of iterations required for solution.)
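The cost function and constraint of Eqs. (1)-(4) map directly onto a standard assignment solver. The sketch below (function and variable names are illustrative) builds the cost matrix Cij and solves the assignment with SciPy's linear_sum_assignment, used here as a stand-in for the Hungarian method of the paper; it handles rectangular matrices (n1 ≠ n2) directly, and matches whose cost equals the "large number" are discarded as forbidden paths.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

BIG = 1.0e6  # the "large number" that forbids a path

def match_storms(s1, s2, dt_h, w1=1.0, w2=1.0, smax=60.0):
    """Match storms between consecutive scans by minimizing total cost.
    s1, s2: rows of (x, y, volume) at t1 and t2 (km, km, km^3).
    dt_h: scan interval in hours; smax: maximum storm speed (km/h).
    Returns (i, j) index pairs; unmatched storms end or start tracks."""
    s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
    # eq (2): distance between centroids, for every (i, j) pair
    dp = np.hypot(s1[:, None, 0] - s2[None, :, 0],
                  s1[:, None, 1] - s2[None, :, 1])
    # eq (3): difference of cube roots of volume, also in units of distance
    dv = np.abs(s1[:, None, 2] ** (1 / 3) - s2[None, :, 2] ** (1 / 3))
    cost = w1 * dp + w2 * dv                 # eq (1)
    cost[dp / dt_h > smax] = BIG             # eq (4): speed constraint
    rows, cols = linear_sum_assignment(cost)  # optimal assignment
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < BIG]
```

For two storms 6 min apart that each moved about 1 km, with a third new echo appearing far away, the solver extends the two existing tracks and leaves the new echo unmatched, so a new track would be started for it.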
The optimal assignment problem can be stated in the following manner. Given an n × n matrix Cij, find an n × n matrix Xij such that the following hold:

1) In any given row or column, Xij has exactly one nonzero element, and that element has the value 1.

2) The sum of Cij*Xij over all i, j is a minimum.

We decided to use the Hungarian method for the solution, since this method is relatively easy to implement and has order

O(p²q), where p = min(n1, n2), q = max(n1, n2), (5)

since in our case Cij may have different numbers of rows and columns.

The Hungarian algorithm first locates the largest set of zero elements in Cij, no two of which lie in the same row or column. If there are n such zero elements, then the positions of the Xij can be set to match these zero elements in Cij, and the solution is complete. If not, the algorithm transforms the matrix Cij by adding values to appropriate rows and columns to produce a new matrix Dij. It turns out that a solution to the optimal assignment problem for Dij also solves the optimal assignment problem for Cij. If the matrix Dij does not have the n zero elements in differing rows and columns, the algorithm will then continue to transform Dij. The algorithm guarantees that a solution will be found after a finite number of transformations.

The Hungarian method is both complicated and subtle. The short explanation given here is intended merely to introduce the reader to the method. Roberts (1984) gives a simple introduction to the method, and Lawler (1976) provides more detail, including information on the computer application of the method.

b. Handling mergers and splits

Quite frequently two or more convective storms will merge to form a single storm, and somewhat less frequently a single storm will split into two or more storms. This is particularly true of "storms" as they are defined in this paper. If the region between two storms exceeds the reflectivity threshold for a short time, the storms will appear to merge and then split soon thereafter.
The result of applying the matching scheme just described to mergers and splits is as follows:

- Merger: a maximum of one track will be extended, and the remainder will be terminated.
- Split: a maximum of one track will be extended, and new tracks will be created for the unmatched storms.

It is necessary to enhance the tracking scheme to handle these situations correctly. For example, consider Fig. 4, which shows the merger of three storms. The first step is to apply the matching algorithm as detailed above. Perhaps one track will be extended, and it may happen that none is extended. The latter occurs when the apparent movement of the centroid from the unmerged to merged situation is so great that the maximum speed constraint is violated. Then, we search through the storms at t1 for those storms that were terminated at t2 by the matching algorithm. For each of these tracks we are able to make a forecast of the centroid position at t2 using the technique detailed in section 4. If this forecast position falls within the projected area of a storm at t2, we conclude that the t1 storm did not terminate but rather merged to form the t2 storm.

FIG. 4. Storm merger.

The splitting situation is treated similarly (Fig. 5). In this case, for all the storms at t1, we forecast the position, shape, and size of the projected-area ellipse at t2. Then we consider all those storms at t2 that are apparently new tracks, that is, have no history. If such a storm has a centroid located within the forecast projected-area ellipse of a t1 storm, we conclude that a split has taken place.

4. Short-term forecasting

a. Methodology

In considering how to formulate the storm forecast algorithm, we make the following assumptions:

- A storm tends to move along a straight line.
- Storm growth or decay follows a linear trend.
- Random departures from the above behavior occur.

Forecasts are made for a number of parameters (listed in appendix D).
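The linear-trend forecasting that these assumptions lead to can be sketched as a least-squares fit with exponentially decaying weights a^i, one plausible reading of the smoothing model described in section 4a; the function name and the newest-last data layout are illustrative.

```python
def trend_forecast(history, times, dt, a=0.5, nt=6):
    """Forecast a storm parameter from a weighted linear fit to its history.
    history, times: newest-last values of the parameter and their times (s),
    weighted by a**i where i counts time steps into the past (i = 0 is now).
    Returns the value forecast dt seconds ahead of the latest observation."""
    h, t = history[-nt:], times[-nt:]
    n = len(h)
    if n < 2:
        return h[-1]  # no usable history: persistence forecast
    w = [a ** (n - 1 - k) for k in range(n)]  # weight a**i, newest gets 1.0
    # weighted least-squares slope: minimize sum w * (p - (c + m*t))**2
    sw = sum(w)
    swt = sum(wi * ti for wi, ti in zip(w, t))
    swp = sum(wi * pi for wi, pi in zip(w, h))
    swtt = sum(wi * ti * ti for wi, ti in zip(w, t))
    swtp = sum(wi * ti * pi for wi, ti, pi in zip(w, t, h))
    slope = (sw * swtp - swt * swp) / (sw * swtt - swt * swt)
    # eq (6): trust the current value, extrapolate with the fitted rate
    return h[-1] + slope * dt
```

Because the forecast is anchored at the current value (only the rate of change comes from the fit), a brand-new storm reduces naturally to persistence, as the paper specifies.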
The forecasts of importance to the tracking technique are the reflectivity-weighted centroid, the storm volume, and the parameters of the projected-area ellipse.

When a storm is observed for the first time, it has no history from which to make a forecast. In this case, all rates of change are assumed to be zero, and a persistence forecast is made. Forecasts at all later times are based on a linear trend model with double exponential smoothing (Abraham and Ledolter 1983). Simply stated, this is a linear regression model in which the past values are weighted with exponentially decreasing weights.

Consider the time series pi for a given storm parameter p, where (i = 0) is the present, (i = 1) is one time step in the past, and so on, and i ranges from 0 to nt − 1, where nt is the maximum number of time points considered relevant to the forecast. Let ti be a measure of the time, for example, the number of seconds since the start of operations, and wi be a weight associated with time step i. For the exponentially smoothed model with parameter a, wi = a^i, where 0 < a ≤ 1. A linear regression is performed between wi*pi and ti. Figure 6 presents a simple example. The linear fit yields the equation of a straight line; the slope of the line is the forecast rate of change for the parameter p. It is assumed that the current value is correct, and the forecast is based on the current value and the forecast rate of change. So if p0 is the current value and dp/dt is the estimated rate of change, then

pt = p0 + (dp/dt)Δt. (6)

FIG. 5. Storm split.

FIG. 6. Forecast based on weighted history.

For the forecast of the projected-area ellipse, it is assumed that the aspect ratio r_major/r_minor and the orientation θ remain constant. The forecast of the area A is based on the rate of change of volume rather than area, since the volume varies more smoothly with time than does the area, and therefore provides a less erratic forecast. So,

At = A0 + (dA/dt)Δt, (7)

where the rate of change dA/dt is derived from the forecast rate of change of the volume. For this study, the parameters used were nt = 6 and a = 0.5, and Δt was typically 6 min. For the type of storms studied, the forecast accuracy proved to be insensitive to a (Table 5).

b. Handling mergers and splits

The forecast depends on the recent storm history. Therefore, when a merger or split occurs, the history must be combined or split accordingly.

Let us first deal with the positional history. Consider the merger depicted in Fig. 4. The positional history of the merged track is a combination of the histories of the three parent tracks. First, the parent track histories are translated in (x, y) so that their forecast positions coincide with the centroid after the merger (Fig. 7). These translated histories are then combined as a weighted average, where the weights are the ratio of the storm volume for each parent to the sum of the volumes of all parents. Clearly the weights change at each time in the history, depending on the size history of each of the parents. In the case of a split, the history of each child is a copy of the history of the parent, translated to coincide with the centroid of that child (Fig. 8).

FIG. 7. Positional history translation for storm merger.

Next consider the storm size parameters, such as area, volume, and mass. In the merger case, the history of a parameter is computed as the sum of the histories of the parents. In the split case, the history for a child is computed as the history for the parent scaled by the ratio of the volume of that child storm to the sum of the volumes of all of the children.

c. Evaluation

To evaluate the forecasts, both the forecast storm position (ellipse) and the "truth" (the actual radar echoes at the forecast time) are mapped onto a 5-km × 5-km grid. A grid point is considered "active" if any radar point in the area around that grid point exceeds the storm reflectivity threshold Tz.

The contingency table approach (Donaldson et al. 1975; Stanski et al. 1989) is used. The following definitions apply:

- Success: both truth and forecast grid points active.
- Failure: truth grid point active and forecast grid point inactive.
- False alarm: truth grid point inactive and forecast grid point active.

The probability of detection (POD), false-alarm ratio (FAR), and critical success index (CSI) are computed as follows:

POD = n_success / (n_success + n_failure), (8)

FAR = n_false_alarm / (n_success + n_false_alarm), (9)

CSI = n_success / (n_success + n_failure + n_false_alarm). (10)

FIG. 8. Positional history translation for storm split.

The forecast results are presented in section 5.

5. Discussion and results

The data presented in this section are intended to show that the method works and to give the reader a feel for the type of results produced by the forecasting system. The analyses include data from all radar ranges (0-150 km), and no attempt was made to discriminate between storms in the mountains and those on the plains. A more detailed treatment of the results will be the subject of a later paper.

The system was run using real-time data from the Mile High Radar near Denver for the summer of 1991, from 29 May to 29 August. Operations were generally limited to the hours from 1100 to 1900 MDT. The radar is a prototype similar to the WSR-88D (Pratte et al. 1991).

Figure 9 presents a typical plot of a storm track, showing the recent storm history, the present position, and the forecast.

A total of almost 4100 tracks were identified.
Of these, 2000 were discarded for one or more of the following reasons:

- The storm existed at the beginning or end of radar operations, and therefore either the beginning or the end of the track data was missing.
- The storm came too close to the radar for complete observation of the tops, or moved out of radar range (>150 km).
- The storm was observed during only one volume scan, resulting in a trivial track. This was the most common reason for rejection.

Of the 2100 or so "good" tracks remaining, 12% contained mergers or splits. Figure 10 presents the relationship between the duration and mean volume for each track, and indicates a positive correlation between duration and volume.

A few tracks in Fig. 10 have short durations and large mean volumes; these result from those cases in which the tracking algorithm fails to detect a merger or split and therefore discontinues the track for a mature storm and starts a new track for the same mature storm. If such a split follows a merger (or vice versa), the result may be a short track with a large mean volume. These failures occur for storms of very irregular shape, because an ellipse does not fit the boundary well and the search for mergers and splits is confined to the interior of the ellipse. An improved algorithm that will describe the storm shape as a polygon rather than an ellipse is being tested, and indications are that this will do better for the identification of mergers of irregularly shaped storms.

FIG. 9. Example of track plot.

FIG. 10. Scatterplot of mean storm volume versus track duration.
Figures 11 and 12 show truncated histograms of the duration and mean volume of the storm tracks. The upper tails of the distributions, which have small values, have been truncated. From the figures it is clear that a large percentage of the storms are small, with a mean volume of less than 400 km3 and a duration of less than 2 h. The zero entry for the first histogram interval in Fig. 12 is caused by the volume threshold Tv, which was set to 50 km3.

FIG. 11. Truncated histogram of storm-track duration.

FIG. 12. Truncated histogram of mean storm volume per track.

The evaluation of the forecast accuracy was performed in a number of different ways, with minor variations between each. The simplest and most stringent approach is to consider the data on a volume-by-volume basis and compare the forecast positions of all of the storms with their actual positions as detected later. Table 1 presents the results of this analysis. Note the POD value of 0.91 at the forecast time of 0 min (i.e., storm detection). This indicates that on average 91% of the radar projected area falls inside the ellipse, and 9% outside, since the ellipse area is set equal to the radar-measured projected area. It was found that the daily mean for this POD was reasonably constant at about 90%.

TABLE 1. Forecast evaluation--volume-by-volume analysis.

Forecast lead time (min)   POD    FAR    CSI
 0                         0.91   0.13   0.80
 6                         0.76   0.28   0.59
12                         0.64   0.40   0.45
18                         0.55   0.48   0.36
24                         0.48   0.56   0.30
30                         0.42   0.62   0.25

By way of comparison, Table 2 shows the 30-min forecast evaluation results from a nowcasting experiment performed in the Denver region of Colorado for the 1989 and 1990 summer seasons (Wilson and Mueller 1993). The evaluation is also based on a 5-km × 5-km grid. The "human" forecasts incorporate both the extrapolation of existing storms and the forecast initiation of new storms.

TABLE 2. Forecast evaluation--Denver nowcasting experiment; forecast lead time 30 min.

Forecast type and date      POD    FAR    CSI
Human, 1989                 0.62   0.68   0.27
Persistence, 1989           0.27   0.63   0.19
Human, 1990                 0.55   0.85   0.14
Extrapolation only, 1990    0.15   0.75   0.10
Persistence, 1990           0.11   0.84   0.07

TABLE 4. Forecast evaluation--track-by-track analysis, with minimum required history.

Forecast lead time (min)   Minimum history (min)   POD    FAR    CSI
 6                          3                      0.83   0.27   0.64
12                          6                      0.78   0.39   0.52
18                          9                      0.73   0.50   0.42
24                         12                      0.68   0.56   0.36
30                         15                      0.63   0.62   0.31

The data in Tables 1 and 2 show that the accuracy of the automated forecasts is comparable with that of human forecasts. The following differences between the analyses should be noted:

- This method excludes all storms with a volume of less than 50 km3, while the nowcasting experiment included all detectable storms.
- This method used a 35-dBZ threshold for storm definition. The nowcasting experiment threshold was 30 dBZ for 1989 and 40 dBZ for 1990.
- This method only extrapolates existing storms, whereas the human forecasters attempted to handle initiation as well.

A problem with the preceding evaluation is that the analysis on a volume-by-volume basis includes all storms, even if they are too young for a forecast to be applicable to them. For example, if a storm is 15 min old, it is not possible for a 30-min forecast to have predicted its existence, since initiation is not dealt with. This complicates the evaluation by including cases for which the technique is not designed. If the aim is to analyze the method with a view to improvement, a more relevant approach is to perform the analysis on a track-by-track basis and include only those cases in which the storms are old enough to have been forecast, that is, for which the history exceeds the forecast lead time.
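The grid-point bookkeeping behind the POD, FAR, and CSI scores of Eqs. (8)-(10) reduces to a three-way count over the 5-km grid. A minimal sketch follows (the function name is illustrative; it assumes at least one active truth point and one active forecast point, so the denominators are nonzero).

```python
def contingency_scores(truth, forecast):
    """POD, FAR, and CSI from gridded truth/forecast activity masks,
    following eqs (8)-(10): success = both active, failure = truth
    active but forecast not, false alarm = forecast active but truth not."""
    n_success = n_failure = n_false = 0
    for t, f in zip(truth, forecast):
        if t and f:
            n_success += 1
        elif t:
            n_failure += 1
        elif f:
            n_false += 1
    pod = n_success / (n_success + n_failure)
    far = n_false / (n_success + n_false)
    csi = n_success / (n_success + n_failure + n_false)
    return pod, far, csi
```

Grid points that are inactive in both fields (the vast majority, for isolated convection) do not enter any of the three scores, which is what makes CSI a stricter measure than simple accuracy here.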
Table 3 presents the results of the track-by-track analysis.

TABLE 3. Forecast evaluation--track-by-track analysis.

Forecast lead time (min)   POD    FAR    CSI
 6                         0.83   0.30   0.61
12                         0.76   0.43   0.48
18                         0.70   0.53   0.39
24                         0.64   0.62   0.32
30                         0.59   0.68   0.26

As a further condition on the analysis, one may evaluate only those cases in which the storms have sufficient history for one to reasonably expect the forecast to be accurate. The criterion used here is that the storm history must be at least half as long as the forecast lead time. Table 4 presents the results for this analysis.

Table 5 presents an analysis of the sensitivity of the forecast accuracy to the parameter a. Clearly, the accuracy is not very sensitive to the value of a, and the optimum lies between 0.25 and 0.75. An investigation was also carried out to determine the optimum value for nt, the number of history volumes used in the forecast. An nt value of 6 seems suitable for all values of a.

Tables 4 and 5 summarize the statistics for all of the tracks together. It is also useful to consider the forecast accuracy for individual tracks. We therefore analyzed each track, comparing the forecast storm locations with those observed for that track, and computed the POD, FAR, and CSI values averaged over the lifetime of the storm. Figures 13, 14, and 15 present scatterplots of POD, FAR, and CSI versus storm-track duration for a forecast lead time of 30 min and a minimum history of 15 min. Because of these constraints, the minimum duration for any track in the plots is 45 min. These plots show the range of forecast results that occur for storms of different durations. The scatter is large, indicating that the behavior of storms varies significantly and that the statistical forecasting model presented here performs much better in some cases than in others.

6. Future enhancements

The method would benefit from the improvements discussed in this section.

a.
Storm identification and tracking at multiple threshold levels

Some of the features that could be tracked using multiple threshold levels include
- individual cells within a convective storm,
- individual convective storms within a squall line,
- cellular features within a snow band.

The identification of features using a higher reflectivity threshold, Tz+, is relatively simple once the identification at Tz is complete. The higher-intensity echo will be contained within the runs identified for the larger, less-intense feature, and the search may be confined to those runs. Therefore the Tz+ search does not add significantly to the computations.

TABLE 5. Sensitivity analysis for a--track-by-track analysis, with minimum required history, nt = 6.

Forecast lead time (min)   Minimum history (min)     a     POD     FAR     CSI
30                         15                      0.25   0.617   0.620   0.307
30                         15                      0.50   0.630   0.620   0.310
30                         15                      0.75   0.628   0.630   0.303
30                         15                      1.00   0.625   0.641   0.298

FIG. 13. Scatterplot of mean track POD versus storm-track duration; forecast lead time 30 min, minimum history 15 min.

FIG. 14. Scatterplot of mean track FAR versus storm-track duration; forecast lead time 30 min, minimum history 15 min.

FIG. 15. Scatterplot of mean track CSI versus storm-track duration; forecast lead time 30 min, minimum history 15 min.

b. Incorporation of a more detailed shape representation for the projected area

The possibility of using shapes other than ellipses to depict the projected area is currently under investigation. The ellipse was chosen as the initial candidate because of its simplicity. Other potential shape categories include
- arbitrary curvilinear shapes,
- convex polygons,
- arbitrary polygons.

c. Using additional storm properties to sharpen the forecast

The storms and their tracks have well-defined properties. It seems probable that some of these contain information that could be used to amend the forecast. For example, a multivariate correlation analysis between storm volume and other properties, the volume being lagged in time, could yield a lagged linear model for volume forecasts.

7. Hardware requirements

This system is data driven in that it needs to keep pace with a real-time radar data stream. It was found that all of the processing required, including display, can be adequately performed on two workstations in the 10-15-MIPS (million instructions per second) class, and from initial tests it appears that a single workstation in the 20-MIPS class would be able to perform all of the functions.

8. Conclusions

This methodology provides the framework necessary to identify storms within three-dimensional radar data and to track them as physical entities. The storm and track data are suitable for scientific analysis, for the purposes of both understanding and forecasting the physics of storm development and movement.

The method was successfully applied during real-time operations for a summer season in Colorado, and the human observers felt that typically it performed well. The accuracy of the forecasts is encouraging and is comparable with that of human-based forecasts that take additional factors such as low-level convergence and storm initiation into account. The intention is to use the system to assist forecasters in future field projects, who will in turn provide feedback on how well the system performs, and what enhancements should be made.

APPENDIX A

Cartesian Transformation and Noise Filtering

The Cartesian transformation is performed using the "nearest neighbor" principle with no interpolation. It is assumed that the radar will be operated using a fixed scan strategy.
For each point in the target Cartesian grid, the coordinates of the closest radar point are computed and stored in a table. This table is then inverted, so that the target Cartesian locations for each radar point are known. When a radar beam is processed, the gate data are placed directly into the appropriate place in the Cartesian grid. Typically, there are radar points to which multiple Cartesian points correspond (long ranges, radar undersampling) and radar points with no corresponding Cartesian point (short ranges, radar oversampling).

During the transformation, noise suppression is carried out. A signal-to-noise threshold Tsn (dB) is set. Any radar point with a signal-to-noise value below Tsn is considered to contain missing data. This removes much of the noise, but leaves some "spotty" areas where the noise spikes exceed Tsn. Such spikes could be caused by receiver noise, or point targets such as birds, aircraft, and ground targets. Removal of these spots is accomplished in a second step. Let Lmin be the minimum dimension for any feature considered valid. The data from each beam are searched for runs of data above Tsn but with a length less than Lmin. Such runs are flagged as missing data. For this study, Tsn was set to 10.0 dB, and Lmin to 1.2 km.

APPENDIX B

Clutter Removal

Ground clutter presents a problem if the size of a region of clutter exceeds Vmin. In the case of the Mile High Radar near Denver, which provided the data for this study, the Rocky Mountains cause significant clutter regions.

The clutter map is computed from a number of Cartesian volumes (at least 20) sampled during a period of no significant weather. The signal-to-noise threshold Tsn_clutter applied to these volumes is set somewhat lower than Tsn. The reason for this is to include clutter points that are borderline, and that may exceed Tsn only some of the time. The clutter value for a Cartesian grid point is computed as the median of the reflectivity values at that grid point for all of the clear-air scans.
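The median-based clutter map, together with the margin test that is applied at the transformation stage, can be sketched as follows. This is a minimal numpy-based illustration under our own naming, not the operational code; the 6.0-dB default mirrors the clutter margin quoted in this appendix:

```python
import numpy as np

def build_clutter_map(clear_air_volumes):
    """Median reflectivity at each Cartesian grid point over a set of
    clear-air scans (the study used at least 20 such volumes)."""
    stack = np.stack(clear_air_volumes, axis=0)
    return np.median(stack, axis=0)

def flag_clutter(reflectivity, clutter_map, margin_db=6.0):
    """A grid point is treated as clutter unless its reflectivity
    exceeds the clutter-map value plus the clutter margin."""
    return reflectivity <= clutter_map + margin_db

# Illustrative 2x2 grids: three clear-air scans, then one weather scan.
vols = [np.full((2, 2), 20.0), np.full((2, 2), 22.0), np.full((2, 2), 24.0)]
cmap = build_clutter_map(vols)   # median is 22.0 dBZ everywhere
refl = np.array([[30.0, 25.0], [22.0, 40.0]])
mask = flag_clutter(refl, cmap)  # True where the echo is judged clutter
```

Only echoes at least 6 dB above the clear-air median survive, so borderline returns near persistent ground targets are discarded along with the targets themselves.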
This clutter map is then used to filter out clutter during the Cartesian transformation stage. A grid point is considered to contain clutter if the reflectivity does not exceed the clutter-map value plus some clutter margin. The relevant values used for this study were Tsn_clutter = 4.0 dB with a clutter margin of 6.0 dB.

This clutter removal methodology is based on Hynek (1990), and this reference should be consulted for further details.

APPENDIX C

Computed Storm Parameters

The following storm parameters were computed:
- centroid for whole storm and for each plane,
- reflectivity-weighted centroid for whole storm and for each plane,
- top,
- base,
- volume,
- area for each plane, and mean area,
- mass of precipitation for whole storm and for each plane (based on Z-M relationship),
- rain flux (based on Z-R relationship),
- angle and direction of tilt,
- max and mean reflectivity for whole storm and for each plane,
- height of max reflectivity,
- estimate of vorticity about a vertical axis through the storm centroid (based on circular storm model) for whole storm and for each plane,
- mean and standard deviation of velocity for whole storm and for each plane,
- mean and standard deviation of spectral width for whole storm and for each plane,
- position, size, and shape of rain region (lowest plane),
- position, size, and shape of projected area,
- histogram of reflectivity as function of volume,
- histogram of reflectivity as function of area.

APPENDIX D

Forecast Parameters

The following forecast parameters were computed:
- centroid,
- reflectivity-weighted centroid,
- top,
- base,
- volume,
- mean area,
- mass of precipitation (based on Z-M relationship),
- rain flux (based on Z-R relationship),
- rain area,
- projected area.
APPENDIX E

Computation of Rotated Ellipse Parameters

The parameters of the rotated ellipse fitted to the projected area are derived from the parameters of a principal component transformation applied to the (x, y) data pairs that make up the projected area.

Refer again to Fig. 2. Suppose there are n (x, y) pairs, each of which represents a grid point in the projected area of the storm. Then

  \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i.  (E1)

An estimate of the covariance matrix of the (x, y) data is given by

  \mathrm{cov}_{xy} = \begin{bmatrix} d & e \\ e & f \end{bmatrix},  (E2)

where

  d = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,  (E3)

  e = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),  (E4)

  f = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2.  (E5)

The principal component transformation is based on an eigenvalue-eigenvector analysis of the covariance matrix. For computer implementations, the best approach is to use a general-purpose numerical eigenvector solver, because these take care of all of the special cases that may arise, and that are data dependent. However, for completeness, we will include the equations that describe the two-dimensional analysis as it is applied to the ellipse problem.

The eigenvalues of the covariance matrix are given by

  \lambda_1, \lambda_2 = \frac{(d + f) \pm [(d + f)^2 - 4(df - e^2)]^{1/2}}{2},  (E6)

where \lambda_1 is the larger of the two eigenvalues. Here \lambda_1 represents the variance of the data in the u direction, and \lambda_2 the variance in the v direction. Therefore,

  \sigma_{major} = \lambda_1^{1/2}, \qquad \sigma_{minor} = \lambda_2^{1/2},  (E7)

where \sigma_{major} and \sigma_{minor} are the standard deviations of the data in the u and v directions, respectively.

The normalized (u, v) eigenvector associated with \lambda_1 is given by

  u = (1 + g^2)^{-1/2}, \qquad v = -gu,  (E8)

where

  g = \frac{d + e - \lambda_1}{f + e - \lambda_1}.  (E9)

The ellipse properties are computed as follows. The centroid position is given by the mean of the (x, y) data:

  (\bar{x}_e, \bar{y}_e) = (\bar{x}, \bar{y}).  (E10)

The rotation \theta of the ellipse major axis relative to the x axis is given by

  \theta = \tan^{-1}(v/u).  (E11)

The area of the storm, A, is given by

  A = n\, dx\, dy,  (E12)

where dx and dy are the Cartesian grid spacings in x and y, respectively.
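As the appendix notes, a general-purpose eigensolver is the practical route in an implementation. A minimal numpy sketch of the analysis in Eqs. (E1)-(E12) (the function name and interface are ours, for illustration):

```python
import numpy as np

def fit_ellipse_orientation(x, y, dx=1.0, dy=1.0):
    """Principal-component fit to the (x, y) grid points of a projected
    storm area: returns the standard deviations along the major and
    minor axes, the rotation of the major axis relative to the x axis,
    and the storm area A = n dx dy."""
    pts = np.column_stack([x, y])
    cov = np.cov(pts, rowvar=False)         # 2x2 matrix [[d, e], [e, f]]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    lam2, lam1 = eigvals                    # lam1 is the larger
    sigma_major, sigma_minor = np.sqrt(lam1), np.sqrt(lam2)
    u, v = eigvecs[:, 1]                    # eigenvector for lam1
    theta = np.arctan2(v, u)                # rotation of the major axis
    area = len(pts) * dx * dy
    return sigma_major, sigma_minor, theta, area

# A 4x2 block of grid points: the long axis lies along x.
x = np.array([0., 1., 2., 3., 0., 1., 2., 3.])
y = np.array([0., 0., 0., 0., 1., 1., 1., 1.])
s_major, s_minor, theta, area = fit_ellipse_orientation(x, y)
```

Delegating to `eigh` avoids the degenerate cases (e.g., e = 0, or equal eigenvalues) that the closed-form (E6)-(E9) would otherwise have to special-case.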
We set the ellipse area to be equal to the storm area. The major and minor radii of the ellipse are therefore given by

  r_{major} = \sigma_{major} \left[ \frac{A}{\pi \sigma_{major} \sigma_{minor}} \right]^{1/2}, \qquad r_{minor} = \sigma_{minor} \left[ \frac{A}{\pi \sigma_{major} \sigma_{minor}} \right]^{1/2}.  (E13)

REFERENCES

Abraham, B., and J. Ledolter, 1983: Statistical Methods for Forecasting. Wiley, 445 pp.

Austin, G. L., 1985: Application of pattern-recognition and extrapolation techniques to forecasting. Eur. Space Agency J., 9, 147-155.

----, and A. Bellon, 1982: Very-short-range forecasting of precipitation by the objective extrapolation of radar and satellite data. Nowcasting, K. A. Browning, Ed., Academic Press, 177-190.

Crane, R. K., 1979: Automatic cell detection and tracking. IEEE Trans. Geosci. Electron., GE-17, 250-262.

Dixon, M. J., and G. K. Mather, 1986: Radar evaluation of a randomized rain-augmentation experiment--Some preliminary results. Preprints, 10th Conf. on Planned and Inadvertent Weather Modification, Arlington, Virginia, Amer. Meteor. Soc., 139-141.

Donaldson, R. J., R. M. Dyer, and M. J. Kraus, 1975: An objective evaluation of techniques for predicting severe weather events. Preprints, 9th Conf. on Severe Local Storms, Norman, Oklahoma, Amer. Meteor. Soc., 321-326.

Hynek, D. P., 1990: Use of clutter residue editing maps during the Denver 1988 Terminal Doppler Weather Radar (TDWR) tests. Project Rep. ATC-169, MIT Lincoln Laboratory, Lexington, MA, 65 pp.

Lawler, E. L., 1976: Combinatorial Optimization: Networks and Matroids. Holt, Rinehart and Winston, 201-207.

Pratte, J. F., J. H. Van Andel, D. G. Ferraro, R. W. Gagnon, S. M. Maher, and G. L. Blair, 1991: NCAR's mile high meteorological radar. Preprints, 25th Int. Conf. on Radar Meteorology, Paris, France, Amer. Meteor. Soc., 863-866.

Richards, J. A., 1986: Remote Sensing Digital Image Analysis--An Introduction. Springer-Verlag, 127-142.

Rinehart, R. E., and E. T. Garvey, 1978: Three-dimensional storm motion detection by conventional weather radar.
Nature, 273, 287-289.

Roberts, F. S., 1984: Applied Combinatorics. Prentice-Hall, 565-568.

Rosenfeld, D., 1987: Objective method for analysis and tracking of convective cells as seen by radar. J. Atmos. Oceanic Technol., 4, 422-434.

Stanski, H. R., L. J. Wilson, and W. R. Burrows, 1989: Survey of common verification methods in meteorology. World Weather Watch Tech. Rep. No. 8, World Meteorological Organization, Geneva, Switzerland, 114 pp.

Tuttle, J. D., and G. B. Foote, 1990: Determination of the boundary layer airflow from a single Doppler radar. J. Atmos. Oceanic Technol., 7, 218-232.

Wilson, J. W., and W. E. Schreiber, 1986: Initiation of convective storms at radar-observed boundary-layer convergence lines. Mon. Wea. Rev., 114, 2516-2536.

----, and C. K. Mueller, 1993: Nowcasts of thunderstorm initiation and evolution. Wea. Forecasting, 8, 113-131.

Witt, A., and J. T. Johnson, 1993: An enhanced storm cell identification and tracking algorithm. Preprints, 26th Int. Conf. on Radar Meteorology, Norman, Oklahoma, Amer. Meteor. Soc., in press.

Zittel, W. D., 1976: Computer applications and techniques for storm tracking and warning. Preprints, 17th Conf. on Radar Meteorology, Seattle, Amer. Meteor. Soc., 514-521.