## Introduction

The term “genetic algorithm” (GA) is applied to any search or optimization algorithm that is based on Darwinian principles of natural selection. A key concept in genetic algorithms is that of a chromosome. A chromosome contains a group of numbers that completely specifies a candidate during the optimization process. For example, when finding the root of a polynomial, the candidates are complex numbers. One choice of chromosome could consist of two numbers—the real part and the imaginary part—to specify completely a candidate.^{1} One can “cross over” two chromosomes (the “parents”) to get one or more “children.” The numbers that specify a child are then taken from one or the other parent. For example, a child of two parents 3 + 2*i* and 5 + 7*i* could be 3 + 7*i.* A single chromosome can be cloned, a process known as reproduction. In this case, the numbers that specify a child are identical to those of the parent; 3 + 2*i* after reproduction yields another chromosome that represents the candidate root 3 + 2*i.* A chromosome can also be “mutated.” In this process, the numbers corresponding to the chromosome are perturbed slightly, to yield somewhat different numbers. A chromosome representing 2 + 3*i,* after mutation, could represent 2 + 3.07*i.* Typically, genetic algorithms use crossover, mutation, and reproduction to provide structure to a random search.
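The three operators described above can be sketched for the two-number (real part, imaginary part) chromosome. This is only an illustration: the function names, the per-number random choice in the crossover, and the mutation scale of 0.1 are assumptions made for the example, not details taken from the algorithm used in this paper.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Each number of the child is taken from one parent or the other."""
    return tuple(rng.choice(pair) for pair in zip(parent_a, parent_b))

def reproduce(parent):
    """Cloning: the child is identical to the parent."""
    return tuple(parent)

def mutate(chromosome, rng, scale=0.1):
    """Perturb each number slightly (scale is an illustrative choice)."""
    return tuple(g + rng.gauss(0.0, scale) for g in chromosome)

rng = random.Random(0)
a, b = (3.0, 2.0), (5.0, 7.0)      # candidates 3 + 2i and 5 + 7i
child = crossover(a, b, rng)       # e.g. (3.0, 7.0), i.e. 3 + 7i
clone = reproduce(a)               # (3.0, 2.0)
mutant = mutate(a, rng)            # close to, but not exactly, (3.0, 2.0)
print(child, clone, mutant)
```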

Genetic algorithms use randomization heavily in choosing individuals (or “genetic” parts of individuals) who will be propagated to future generations. In the work described in this paper, we used a modified version of the simple genetic algorithm described in Goldberg (1989).

### The bounded weak echo region algorithm

Bounded weak echo regions (BWERs) are radar features that are often associated with supercell thunderstorms. A BWER is a region of relatively low radar reflectivity that extends upward into, and is surrounded by, higher reflectivities aloft. This radar signature is usually indicative of a high-speed updraft. For a detailed description of the BWER algorithm including the features considered and how they are computed, the reader is directed to Lakshmanan and Witt (1997).

The BWER algorithm first computes both two-dimensional and three-dimensional features of candidate regions by filtering the radar elevation scans. The features associated with a candidate are used as inputs to a fuzzy rule base. The rule base then provides a confidence estimate that the candidate region is a BWER. In the work described in this paper, the second part of the BWER algorithm—obtaining a confidence estimate of a candidate BWER from a set of features using a fuzzy rule base—is optimized.

The first stage of the BWER algorithm (see Fig. 1) is an image processing stage in which the radar elevation scans are preprocessed and filtered to yield regions of low reflectivity adjoining regions of high reflectivity. The regions in each radar elevation scan are labeled at this stage. Labeled regions from successive elevation scans of the radar are stacked vertically together to form a three-dimensional (3D) set of regions. This 3D set along with the 3D set of original Cartesian grids similarly stacked is used for further processing.

Various measures are computed from the properties of these 3D sets. For example, the degree to which a region is capped is obtained from four fuzzy sets that are evaluated for the region:

- the degree to which there are many pixels above this region with reflectivities greater than 45 dB*Z*,
- the degree to which there are fewer pixels with reflectivities greater than 45 dB*Z* below this region than there are above it,
- the degree to which the average reflectivity above the region is higher than the average reflectivity within the region, and
- the degree to which the average reflectivity below the region is lower than the average reflectivity above this region.

The regions referred to in computing the capping extent correspond either to the region itself or to one of the regions above or below it, from which it can inherit attributes (see Fig. 2).

All computed attributes of a region are carried over to the optimization stage. It is over this second stage (the second row of diagrams in Fig. 1) that we will optimize the membership functions of the fuzzy sets corresponding to each of the attributes.

## The genetic algorithm

A typical GA does its tuning in stages called generations. Usually, the average fitness of individuals will increase with each generation, through the process of natural selection. We start with a random collection of individuals. In each successive generation, individuals with bad genes are weeded out while those with good genes propagate their genetic code. The genetic code that determines the fitness of an individual is termed, logically enough, the chromosome of that individual. Given a chromosome, the GA should be able to ascertain its fitness. In our case, this is done by performing the BWER detection analysis on all the truthed cases in the verification database using the chromosome and finding the skill score of the resulting detections.

In our GA, each chromosome consists of a fixed number of genes. Although there are applications that use chromosomes of varying length (see, e.g., Koza 1992), it is overkill in weather detection algorithms where we already know the form of the solution or, at least, the shape of the membership functions of all the fuzzy sets. Since we can identify the closed list of features that we wish to utilize in our solution, a chromosome that assigns a gene to each feature will suffice.

### Fuzzy membership functions

The “features” in the weather detection algorithm or its rule base translate into the GA’s genes. Thus, each gene represents a fuzzy set. For example, the first gene in the BWER algorithm represents the attribute of a candidate region that the reflectivity value within a BWER is low. The algorithm designer does not need to specify hard thresholds. It is the job of the GA to sort out how low “low” really is. Naturally, the fuzzy membership functions constrain the GA—the GA cannot change the *form* of the solution, only its degree. Thus, this method of combining a fuzzy rule base with a genetic algorithm is useful when there is a lot of empirical knowledge that can go into the algorithm, but no extensive statistical information.

Associated with each gene is a membership function based on the value of a single feature. There are three kinds of membership functions in the BWER algorithm: an attribute of a candidate should be low, should be high, or should be about some value. All the empirical knowledge about BWERs can be described using this family of membership functions. For each of the function types, two parameters need to be specified. These parameters, *x*_{1} and *x*_{2} in Fig. 3, along with the shape of the membership function, describe the fuzzy set completely. For any value of the attribute, the degree to which the property holds can be obtained by looking at the graph. If *x*_{1} and *x*_{2} change, the extent to which the property holds for various candidates will change.

For example, if the minimum radar reflectivity value within a candidate region is 30 dB*Z,* then if *x*_{1} and *x*_{2} are 10 and 40 dB*Z,* respectively, the minimum reflectivity value within the region is low to the extent 0.33, as can be ascertained from Fig. 3a. Had *x*_{1} and *x*_{2} changed, to 20 and 30 dB*Z,* then the same property for the same region would be satisfied to a 1.0-degree extent. Had the membership function corresponded to the idea that the minimum reflectivity value is high (with *x*_{1} and *x*_{2} being 10 and 40 dB*Z*, respectively), then the rule would have been satisfied to a 0.67-degree extent (from Fig. 3b). If the property were that the minimum reflectivity value is neither too high nor too low (an “is about” condition; see Fig. 3c), it would have been satisfied to a 0.5-degree extent.
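The 0.33 and 0.67 values quoted above are consistent with piecewise-linear ramps between *x*_{1} and *x*_{2}. A minimal sketch under that assumption follows; the exact shapes are those of Fig. 3, the function names are illustrative, and the "is about" function is omitted because its shape cannot be inferred from the text alone.

```python
def is_low(value, x1, x2):
    """Membership in 'low': 1 at or below x1, 0 at or above x2, linear between.

    Assumes a linear ramp; with x1 == x2 this becomes a crisp threshold.
    """
    if value <= x1:
        return 1.0
    if value >= x2:
        return 0.0
    return (x2 - value) / (x2 - x1)

def is_high(value, x1, x2):
    """Membership in 'high', taken here as the mirror image of 'low'."""
    return 1.0 - is_low(value, x1, x2)

# Minimum reflectivity of 30 dBZ with x1 = 10 dBZ and x2 = 40 dBZ.
print(round(is_low(30.0, 10.0, 40.0), 2))   # about 0.33
print(round(is_high(30.0, 10.0, 40.0), 2))  # about 0.67
```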

The genetic algorithm will change the values of *x*_{1} and *x*_{2} associated with each gene while keeping the forms of the membership functions for each feature constant to come up with different chromosomes. The better fit chromosomes will be those that have values of *x*_{1} and *x*_{2} that correspond to the BWERs in the verification database. Of course, the GA will attempt to fit the *x*_{1}s and *x*_{2}s for all the fuzzy sets together in an attempt to raise the skill of the algorithm.

For each candidate region that the BWER detection algorithm should classify, the value of each feature is computed. Using that value, the extent to which each of the various fuzzy sets holds is computed. First, fuzzy sets that depend on a single feature are computed. Then, fuzzy rules that use several features are evaluated. Fuzzy values are combined using the standard methods and operators introduced by Zadeh (1965) and studied by Voxman and Goetschel (1983). The inverse of a set *A*, written *Ā*, has the membership function *f*_{Ā}(*a*) = 1 − *f*_{A}(*a*), where *f*_{A}(*a*) is the membership of element *a* in the set *A*. The membership of an element in the union of two sets (*A* ∪ *B*) is given by the maximum of memberships in the two sets (*f*_{A} ∨ *f*_{B}). This represents a logical OR. The membership of an element in the intersection of two sets (*A* ∩ *B*) is given by the minimum of the two sets’ memberships (*f*_{A} ∧ *f*_{B}) and represents a logical AND.

For example, one of the rules in the BWER rule base is that the candidate region is said to have a good two-dimensional profile if the minimum reflectivity value in the region is low or the bounding pixels have a high reflectivity value. If the candidate region has a minimum reflectivity value inside of 25 dB*Z* and is bounded by pixels having an average bounding value of 45 dB*Z,* then if the *x*_{1}s and *x*_{2}s for the two rules are 20, 40, 40, and 60 dB*Z,* respectively, the first rule (that the reflectivity value inside the region is low) is satisfied to the extent 0.75 and the second rule is satisfied to the extent 0.25. Since a logical OR is given by the maximum of the individual fuzzy values, the combination of the two rules holds to the extent 0.75. Both the rules (a logical AND) would hold to only a 0.25-degree extent.
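The combination in this example can be written out with the standard fuzzy operators (complement, maximum, minimum). The function and variable names below are illustrative only.

```python
def fuzzy_not(a):
    """Complement: 1 - membership."""
    return 1.0 - a

def fuzzy_or(*memberships):
    """Union: maximum of the memberships."""
    return max(memberships)

def fuzzy_and(*memberships):
    """Intersection: minimum of the memberships."""
    return min(memberships)

inside_is_low = 0.75      # 25 dBZ inside, with x1 = 20 and x2 = 40 dBZ
boundary_is_high = 0.25   # 45 dBZ on the boundary, with x1 = 40 and x2 = 60 dBZ

good_2d_profile = fuzzy_or(inside_is_low, boundary_is_high)   # 0.75
both_rules = fuzzy_and(inside_is_low, boundary_is_high)       # 0.25
print(good_2d_profile, both_rules)
```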

The fuzzy rules are aggregated and a confidence estimate that the candidate region is a BWER is arrived at. This is the output of the BWER algorithm for that candidate region. This is done for each candidate region in the verification database. The set of candidates that have been deemed BWERs is then compared with the truthed BWER set.

### Chromosome generation

The first generation of the GA consists entirely of randomly generated chromosomes. Every randomly generated gene in each of the randomly generated chromosomes consisted of an *x*_{1} generated with uniform probability in the range [0, 1) and *x*_{2} generated similarly but set to *x*_{1} if *x*_{2} < *x*_{1}.

We considered two choices for the generation of *x*_{2} (subject to the condition that *x*_{1} ⩽ *x*_{2}).

- Generating *x*_{2} uniformly in the range [*x*_{1}, 1). If *x*_{2} were generated this way, the median value of *x*_{2} would be 0.8 and its probability distribution function would be given by^{2} $P_{x_2}(x) = -\log(1 - x)$, $0 \le x < 1$.
- Generating *x*_{2} uniformly in the range [0, 1) but setting it equal to *x*_{1} if the generated *x*_{2} is less than *x*_{1}. The probability distribution function of *x*_{2}, whose median value is 0.7, is then given by $P_{x_2}(x) = 2x$, $0 \le x < 1$.

With the second method, which is the one we used, the probability that a randomly generated gene has a crisp transition (*x*_{1} = *x*_{2}) is 0.5. On average, half the transitions in a randomly generated individual are crisp ones. In the BWER detection algorithm, however, noncrisp transitions create individuals that are more fit. After several generations, few crisp transitions survive in the population, a rather roundabout affirmation of our fuzzy logic approach.
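The two ways of generating *x*_{2} can be compared empirically. The sketch below draws many genes with each method and estimates the median of *x*_{2} and the fraction of crisp genes; the function names, the seed, and the sample size are arbitrary choices for illustration.

```python
import random
import statistics

def random_gene_rescaled(rng):
    """First choice: x2 drawn uniformly from [x1, 1)."""
    x1 = rng.random()
    x2 = x1 + rng.random() * (1.0 - x1)
    return x1, x2

def random_gene_clamped(rng):
    """Second choice: x2 drawn from [0, 1) and raised to x1 if it falls below."""
    x1 = rng.random()
    x2 = max(x1, rng.random())
    return x1, x2

def random_chromosome(n_genes, rng):
    """A first-generation chromosome: one (x1, x2) pair per fuzzy set."""
    return [random_gene_clamped(rng) for _ in range(n_genes)]

rng = random.Random(42)
n = 100_000
rescaled = [random_gene_rescaled(rng) for _ in range(n)]
clamped = [random_gene_clamped(rng) for _ in range(n)]

median_rescaled = statistics.median(x2 for _, x2 in rescaled)  # roughly 0.8
median_clamped = statistics.median(x2 for _, x2 in clamped)    # roughly 0.7
crisp_fraction = sum(x1 == x2 for x1, x2 in clamped) / n       # roughly 0.5
print(median_rescaled, median_clamped, crisp_fraction)
```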

The random numbers generated above are only normalized stand-ins for *x*_{1} and *x*_{2}. The actual values for *x*_{1} and *x*_{2} are determined by the real-world values that the fuzzy set depends on. In addition, we discretize the search space into bins that are not so small that the problem becomes overfit and not so large that optimization does not happen. In the BWER algorithm, we used a quantization level *x*_{quant} of 100. For example, if our fuzzy set relies on a radar reflectivity value, we will set the minimum value of *x*_{1} and *x*_{2} to be −5 dB*Z* and the maximum to 60 dB*Z* and discretize the values that *x*_{1} and *x*_{2} can take in intervals of 0.65 dB*Z*. This does not mean that the underlying variable cannot take a value of, say, 65 dB*Z*; it simply means that the fuzzy set will treat all values of radar reflectivity above 60 dB*Z* the same way it treats a radar reflectivity value of 60 dB*Z*.^{3} Therefore, the actual values of *x*_{1} and *x*_{2} that are used to decide the extent to which a fuzzy set holds are determined by scaling the random number, *x*_{rnd}:

$$x = x_{\min} + \lfloor x_{\mathrm{rnd}}\, x_{\mathrm{quant}} \rfloor \, \frac{x_{\max} - x_{\min}}{x_{\mathrm{quant}}},$$

where $\lfloor x \rfloor$ is the largest integer that is not greater than *x*.
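The scaling and quantization step can be sketched as follows. The exact formula is an assumption inferred from the description (a floor operation, a quantization level of 100, and 0.65-dB*Z* steps between −5 and 60 dB*Z*); the function and argument names are illustrative.

```python
import math

def scale_parameter(x_rnd, x_min, x_max, x_quant=100):
    """Map a random number in [0, 1) onto one of x_quant evenly spaced values.

    Assumed form: x_min + floor(x_rnd * x_quant) * (x_max - x_min) / x_quant.
    """
    step = (x_max - x_min) / x_quant
    return x_min + math.floor(x_rnd * x_quant) * step

# Reflectivity example: -5 to 60 dBZ in steps of 0.65 dBZ.
print(scale_parameter(0.0, -5.0, 60.0))    # smallest value, -5 dBZ
print(scale_parameter(0.5, -5.0, 60.0))    # mid-range value
print(scale_parameter(0.999, -5.0, 60.0))  # largest value, one step below 60 dBZ
```

Because the random number never reaches 1, the largest value produced is one step below *x*_{max}.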

### Fitness measure

The number output by the BWER algorithm is a real number in the range [0, 1]; that is, it is a confidence estimate with 1.0 signaling that the algorithm is extremely confident that the candidate region is a BWER and 0.0 that it is confident that it is not a BWER. Pragmatically, we drop all candidates that receive confidence estimates of less than 0.3 as having weak endorsements.^{4} The truthed cases do not have such a fine grain of detail—the BWERs identified are either “strong” or “marginal.” Our philosophy in devising a fitness measure for the chromosome is that it should be rewarded more for correctly identifying a strong BWER with a 1.0 estimate than it should for identifying it with a 0.4 estimate. Similarly, it should be punished less for creating a false alarm with a 0.4 estimate than it should for creating one with a 1.0 estimate.

Let *d* be the distance^{5} between the detection of interest and the truth-BWER. Let *D* be the maximum distance within which we can accept that the truth and the BWER correspond to the same feature. If *c* is the confidence estimate of the detection and *a* is the confidence estimate of the truth-BWER (assigned as 0.5 for marginal and 1.0 for strong), then the validity *υ* of the match between a detection and the truth-BWER is given by

[equation missing]

[...] *υ*_{max}. The BWER has been missed to the extent “miss” given by

[equation missing]

[...] *x*_{1} and *x*_{2} values for each of the single-feature fuzzy rules.

[...] *υ* of each truth-BWER given by *υ*_{max}. Each detection is a false alarm “fa” to the extent

[equation missing]

The numbers we have identified as hit, miss, and fa are really just constructs—they do not represent the real-world skill measure. In weather detection algorithms, a detection can either be a hit or a false alarm. It cannot be both. So, we redo the computation, this time thresholding the BWER detections. Detections with strong endorsements, defined as those with confidences greater than 0.75, are retained. These detections are either hits or false alarms depending on whether there is a truth-BWER within a distance *D* of the detection. Similarly, truth-BWERs that have not been matched within a distance *D* by a detection are counted as misses. Let the hits, misses, and false alarms obtained by this either–or logic be given by hit_{r}, miss_{r}, and fa_{r}, with the subscript denoting that these numbers denote the real measure of skill.
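The either/or counting described in this paragraph can be sketched as below. The data layout (positions as coordinate pairs, detections paired with a confidence) and the function names are assumptions for illustration. The distance is the sum of absolute coordinate differences, and the CSI is the standard hits / (hits + misses + false alarms).

```python
def manhattan(p, q):
    """Distance as the sum of absolute coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def score_thresholded(detections, truths, max_dist, threshold=0.75):
    """Count hits, misses, and false alarms with crisp, either/or logic.

    detections: list of ((x, y), confidence); truths: list of (x, y).
    """
    strong = [pos for pos, conf in detections if conf > threshold]
    hits = sum(any(manhattan(d, t) <= max_dist for t in truths) for d in strong)
    false_alarms = len(strong) - hits
    misses = sum(not any(manhattan(d, t) <= max_dist for d in strong)
                 for t in truths)
    return hits, misses, false_alarms

def csi(hits, misses, false_alarms):
    """Critical success index."""
    total = hits + misses + false_alarms
    return hits / total if total else 0.0

detections = [((10, 10), 0.9), ((50, 50), 0.8), ((12, 11), 0.5)]
truths = [(11, 10), (30, 30)]
hit_r, miss_r, fa_r = score_thresholded(detections, truths, max_dist=5)
print(hit_r, miss_r, fa_r, csi(hit_r, miss_r, fa_r))
```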

Because these fitness values, in spite of the graduated measure of skill we provide, tend to lie very closely together for randomly chosen chromosomes, we scale the fitness of a chromosome based on the raw fitness values of the other chromosomes in the generation using sigma truncation followed by linear scaling (Goldberg 1989). This scaled fitness is used for probabilistic selection.
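A sketch of sigma truncation followed by linear scaling, in the spirit of Goldberg (1989), is given below. The constants (a truncation multiple of 2 and a scaling multiple of 2) are commonly used values assumed here for illustration; the values actually used in this work are not stated.

```python
import statistics

def sigma_truncate(raw, c=2.0):
    """Subtract (mean - c * sigma) from each fitness, clipping negatives to 0."""
    mean = statistics.fmean(raw)
    sigma = statistics.pstdev(raw)
    return [max(0.0, f - (mean - c * sigma)) for f in raw]

def linear_scale(values, c_mult=2.0):
    """Scale linearly so the mean is kept and the best maps to c_mult * mean.

    If that would push the worst value below zero, map the worst value to
    zero instead while still preserving the mean.
    """
    avg = statistics.fmean(values)
    hi, lo = max(values), min(values)
    if hi == lo:
        return list(values)
    if lo > (c_mult * avg - hi) / (c_mult - 1.0):
        slope = (c_mult - 1.0) * avg / (hi - avg)
        offset = avg * (hi - c_mult * avg) / (hi - avg)
    else:
        slope = avg / (avg - lo)
        offset = -lo * avg / (avg - lo)
    return [slope * v + offset for v in values]

raw = [0.50, 0.51, 0.52, 0.53, 0.60]   # raw fitness values that lie close together
truncated = sigma_truncate(raw)
scaled = linear_scale(truncated)
print(scaled)
```

After scaling, the spread between the best and worst chromosomes is much wider relative to the mean than in the raw values, which gives probabilistic selection something to work with.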

The fitness measure that we have used could be used directly to score the resulting algorithm. Therein lies an important advantage of genetic algorithms. The entire analysis is carried out in the space of the original problem. We have not had to deal with gradients or any attribute that the run-time algorithm does not deal with. In most other search and optimization methods, it is necessary to compute (or approximate) such attributes that are not part of the run-time algorithm. This is particularly useful because it is not easy to describe the BWER algorithm as a closed-form function, so as to be able to take partial derivatives.

### Optimization parameters

A modified version of the simple genetic algorithm described by Goldberg (1989) was used in this work. Each generation had 200 chromosomes. The first generation consisted of 200 chromosomes chosen randomly. Successive generations were composed of chromosomes that were chosen through crossover, mutation, and reproduction. The crossover probability was 0.7, which means that, on the average, we choose 70% of the chromosomes in a generation through crossover. The remainder are simply clones of chromosomes that existed in the previous generation.

To implement a crossover, we need to choose two chromosomes from the previous generation. For reproduction (cloning), we need to choose one chromosome from the previous generation. We choose these chromosomes through probabilistic selection; that is, a better fit chromosome has a better likelihood of being selected. As explained in section 2c, the raw fitness values are scaled using sigma truncation and linear scaling to obtain fitness values that can be used to provide different survival characteristics for different chromosomes. A chromosome with a scaled fitness value of 0.3 is twice as likely as a chromosome with a fitness value of 0.15 to be chosen as one of the pair for a crossover. It can be shown (see, e.g., Goldberg 1989; Holland 1975) that using this kind of probabilistic selection ensures that the “schemata” or parts of chromosomes that are good solutions to the problem are chosen with exponentially increasing frequency, ensuring that after a few generations, the regions of the search space with the best chromosomes are identified by the GA.
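Probabilistic selection and the 0.7 crossover probability can be sketched as follows. Chromosomes are represented as lists of genes. The single-point crossover is an assumption borrowed from the simple genetic algorithm, since the crossover operator is not spelled out here, and all function names are illustrative.

```python
import random

def select(population, scaled_fitness, rng):
    """Pick one chromosome with probability proportional to its scaled fitness."""
    return rng.choices(population, weights=scaled_fitness, k=1)[0]

def single_point_crossover(mother, father, rng):
    """Swap the tails of two parents at a random cut point."""
    point = rng.randrange(1, len(mother))
    return mother[:point] + father[point:], father[:point] + mother[point:]

def next_generation(population, scaled_fitness, rng, p_crossover=0.7):
    """Build a generation of the same size by crossover or cloning."""
    children = []
    while len(children) < len(population):
        if rng.random() < p_crossover:
            mother = select(population, scaled_fitness, rng)
            father = select(population, scaled_fitness, rng)
            children.extend(single_point_crossover(mother, father, rng))
        else:
            children.append(list(select(population, scaled_fitness, rng)))
    return children[:len(population)]

rng = random.Random(1)
population = [[(i, i)] * 20 for i in range(200)]   # 200 chromosomes of 20 genes
scaled_fitness = [1.0 + i for i in range(200)]     # placeholder fitness values
new_population = next_generation(population, scaled_fitness, rng)
print(len(new_population))
```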

[...] *p*_{m} to calculate the standard deviation, *σ*, of the normal function of *x* that satisfies

[Eq. (12) missing]

[...](*x*) will then give us the number of discrete mutation steps such that *x* ≠ 0 with probability *p*_{m}. We solve Eq. (12) through numerical integration^{6} and use the *σ* so obtained to get a random Gaussian process with zero mean and standard deviation *σ* [see Sedgewick (1990) for a way to do this]. The step size of the mutation is determined by this normally distributed random variable and *x*_{quant}. We used a *p*_{m} of 0.005 in the BWER algorithm. We generate one such step size for each *x*_{i} of each gene of the chromosome and increment the *x*_{i} by that amount.
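One way to realize this kind of mutation is sketched below. It rests on several assumptions that are not confirmed by the text: the number of steps is taken as the Gaussian variable rounded to the nearest integer, so that it is nonzero when |*x*| ≥ 0.5; *σ* is obtained from the inverse normal distribution in the Python standard library rather than by numerical integration; parameters are stored as bin indices in [0, *x*_{quant}); and the pair is re-sorted after mutation to keep *x*_{1} ≤ *x*_{2}.

```python
import random
from statistics import NormalDist

def mutation_sigma(p_m):
    """Sigma such that round(N(0, sigma)) is nonzero with probability p_m."""
    # |x| >= 0.5 must have probability p_m, so 0.5 is the (1 - p_m / 2) quantile.
    z = NormalDist().inv_cdf(1.0 - p_m / 2.0)
    return 0.5 / z

def mutate_gene(bins, sigma, rng, x_quant=100):
    """Shift each parameter (stored as a bin index) by a rounded Gaussian step."""
    stepped = [b + round(rng.gauss(0.0, sigma)) for b in bins]
    clipped = [min(x_quant - 1, max(0, b)) for b in stepped]
    b1, b2 = sorted(clipped)   # assumption: keep x1 <= x2 after mutation
    return b1, b2

rng = random.Random(7)
sigma = mutation_sigma(0.005)
trials = 200_000
nonzero = sum(round(rng.gauss(0.0, sigma)) != 0 for _ in range(trials))
print(sigma, nonzero / trials)   # the observed rate should be close to 0.005
```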

### Hybrid genetic algorithm

Unlike other search and optimization techniques, a genetic algorithm promises convergence but not optimality, not even that it will find local maxima. This implies that the choice of when to stop the genetic algorithm is not well defined. We stop the genetic algorithm process when 50 generations have gone by with no better chromosome identified. Since there is no guarantee of optimality, successive runs of the GA will provide different chromosomes with varying fitness measures. This is one of the drawbacks of using a genetic algorithm for optimization—since there is no guarantee of optimality, there is always the chance that there is a better chromosome lurking somewhere in the search space.

Although there is no guarantee of optimality, we are assured of exponential convergence. If we run the GA several times, it will converge each time, possibly at different optimal chromosomes. The schemata that promise convergence are actually indicative of the regions in the search space where good chromosomes may be found. Typically, the GA is coupled with a local search mechanism to find the optimal chromosome in a region. So, if we use a hybrid algorithm, the problem reduces to ensuring that we run the GA as many times as is needed to pick out all the good regions. If we know beforehand the shape of the search space, we can estimate the number of regions we expect to find. We can then repeatedly run the GA until these regions have been found. In most practical problems, however, the shape of the search space is not known beforehand. The systematic approach is then to repeat GA runs until the best chromosomes that are found start to repeat with some regularity.

Genetic algorithms are not good at identifying the optimal value of a chromosome for a problem but do very well in identifying the regions where those optima lie. Therefore, we use a hybrid GA—every 10 generations, we anneal the best 10% of the population. This has the effect of moving the top chromosomes in that generation (which are the result of exponential convergence toward the best regions) to the local maximum in their region. A discussion of simulated annealing is beyond the scope of this paper; the interested reader is directed to Metropolis et al. (1953) and Press et al. (1988).
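The overall control flow of such a hybrid can be sketched as below: stop after 50 generations without improvement, and anneal the best 10% of the population every 10 generations. Everything else is a simplifying assumption for illustration: chromosomes are flat lists of bin indices, the annealing schedule and its step count are arbitrary, breeding uses fitness-proportional selection with single-point crossover on an even-sized population, and mutation is left out.

```python
import math
import random

def anneal(chromosome, fitness, rng, steps=50, temperature=1.0,
           cooling=0.9, x_quant=100):
    """Local search by simulated annealing over integer bin indices."""
    current, current_fit = list(chromosome), fitness(chromosome)
    best, best_fit = list(current), current_fit
    for _ in range(steps):
        candidate = list(current)
        i = rng.randrange(len(candidate))
        # Move one parameter by a single quantization step, staying in range.
        candidate[i] = min(x_quant - 1, max(0, candidate[i] + rng.choice((-1, 1))))
        cand_fit = fitness(candidate)
        delta = cand_fit - current_fit
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_fit = candidate, cand_fit
            if current_fit > best_fit:
                best, best_fit = list(current), current_fit
        temperature *= cooling
    return best

def breed(population, scores, rng, p_crossover=0.7):
    """Fitness-proportional selection followed by single-point crossover."""
    worst = min(scores)
    weights = [s - worst + 1e-9 for s in scores]
    parents = rng.choices(population, weights=weights, k=len(population))
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        if rng.random() < p_crossover:
            point = rng.randrange(1, len(a))
            a, b = a[:point] + b[point:], b[:point] + a[point:]
        children += [list(a), list(b)]
    return children

def run_hybrid_ga(fitness, n_genes, rng, pop_size=200, patience=50,
                  anneal_every=10, x_quant=100):
    """Run until `patience` generations pass without a better chromosome."""
    population = [[rng.randrange(x_quant) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best, best_fit, stale, generation = None, -math.inf, 0, 0
    while stale < patience:
        generation += 1
        if generation % anneal_every == 0:
            population.sort(key=fitness, reverse=True)
            top = max(1, pop_size // 10)
            population[:top] = [anneal(c, fitness, rng, x_quant=x_quant)
                                for c in population[:top]]
        scores = [fitness(c) for c in population]
        leader = max(range(pop_size), key=scores.__getitem__)
        if scores[leader] > best_fit:
            best, best_fit, stale = list(population[leader]), scores[leader], 0
        else:
            stale += 1
        population = breed(population, scores, rng)
    return best, best_fit

# Toy fitness, for demonstration only: prefer every parameter close to bin 40.
toy_fitness = lambda c: -sum(abs(g - 40) for g in c)
best, best_fit = run_hybrid_ga(toy_fitness, n_genes=4, rng=random.Random(3),
                               pop_size=20, patience=5)
print(best, best_fit)
```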

## Results and discussion

The BWER algorithm has a rule base consisting of 20 fuzzy sets. Each of these sets, as described in section 2a, is parameterized by two numbers, *x*_{1} and *x*_{2}, which are subject to tuning by the genetic algorithm. These numbers are determined by *x*_{min} and *x*_{max}, the extremes that the underlying value can take in the real world; the quantization, *x*_{quant}, we choose; and on the values that the GA assigns to *x*_{1} and *x*_{2}. The form of the membership function will never get changed as a result of optimization. The GA can choose whether low should mean 35 or 25 dB*Z* in the context in which it is used but cannot change the word to high.

[...] The skill of the algorithm in producing high-confidence detections is given more weight when the end user is a forecaster than when the end user is a neural network.^{7}

The BWER algorithm was tuned over five cases containing 200 volumes of radar reflectivity data. There were a total of 131 BWERs in the five cases. The rapid improvement in the average fitness of a chromosome in the first few generations can be clearly seen in Fig. 4. The fitness measure reported is as calculated in Eq. (11).

Are the chromosomes generated by the GA general? To test the generality of the chromosomes output by the tuning process, we split the set of 200 volumes into 170 volumes for tuning and 30 for testing. We then ran the GA on the tuning cases and tested the best chromosome reported by the GA on the test cases. The results obtained when the experiment was performed five times (with the same 170 volumes for tuning and the same 30 volumes for testing) are shown in Table 1. One test of whether the chromosomes produced by the GA are general is whether the performance of a chromosome on the training cases is positively correlated with the performance of the chromosome on the test cases. The correlation coefficient can be computed from Table 1 to be 0.82 for the Heidke skill score (HSS) and 0.85 for the critical success index (CSI). Since the correlation coefficients are close to 1.0, we can deduce that the chromosomes generated by the GA are general.

Are the performances of the best chromosomes in each run of the GA consistent? No. Again from Table 1, we can calculate that the standard deviation in the training HSS is about 7% of its mean while that in the CSI is about 11% of its mean. For the test cases, these go up to 16% and 36%, respectively. The GA does not promise optimality and that lack of a guarantee is seen in the high variability of the results that are obtained with successive runs of the GA. Each run of the GA will give you a *good* chromosome but you can never be sure that the next run will not give you an even better one.

A more-detailed table describing the performance of the BWER detection algorithm, including the number of hits, misses, and false alarms on individual storm days, can be found in Lakshmanan and Witt (1997).

## Conclusions

The paradigm of using a genetic algorithm to tune a weather detection algorithm is useful because genetic algorithms work in the same space as the original problem, removing the need to compute attributes such as gradients that are needed for optimizing by other means. It also provides a convenient way to transfer the results of optimization to a run-time algorithm since all the optimizable information is in a chromosome. Chromosomes tuned to a particular location or types of weather events can be swapped for one another, leading to more customizable algorithms. For example, the BWER detection algorithm could be customized very easily for the differing needs of forecasters and a neural network by simply changing the fitness function and retuning the algorithm. The run-time algorithm is the same; only the chromosome that it uses for the two end users is different.

## Acknowledgments

Thanks to Arthur Witt of NSSL for performing the truthing on the test and training cases. Arthur was the one who suggested many of the rules for the rule base. Thanks also to Gregory Stumpf and Caren Marzban, both of NSSL, who incorporated the BWER output into the neural network for vortex detection.

## REFERENCES

Donaldson, R., R. Dyer, and M. Krauss, 1975: An objective evaluator of techniques for predicting severe weather events. Preprints, *Ninth Conf. on Severe Local Storms,* Norman, OK, Amer. Meteor. Soc., 321–326.

Goldberg, D., 1989: *Genetic Algorithms in Search, Optimization, and Machine Learning.* Addison-Wesley, 432 pp.

Heidke, P., 1926: Berechnung des erfolges und der gute der windstarkvorhersagen im sturmwarnungsdienst. *Geogr. Ann.,* **8,** 301–349.

Holland, J., 1975: *Adaptation in Natural and Artificial Systems.* The University of Michigan Press, 228 pp.

Koza, J., 1992: *Genetic Programming: On the Programming of Computers By Means of Natural Selection.* The MIT Press, 800 pp.

Lakshmanan, V., and A. Witt, 1997: A fuzzy logic approach to detecting severe updrafts. *AI Appl.,* **11,** 1–12.

Marzban, C., 1998: Scalar measures of performance in rare-event situations. *Wea. Forecasting,* **13,** 753–763.

Metropolis, N., A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, 1953: Combinatorial minimization. *J. Chem. Phys.,* **21,** 1087–1092.

Press, W., B. Flannery, S. Teukolsky, and W. Vetterling, 1988: *Numerical Recipes in C: The Art of Scientific Computing.* Cambridge University Press, 1020 pp.

Sedgewick, R., 1990: *Algorithms in C++.* Addison-Wesley, 656 pp.

Voxman, W., and R. Goetschel, 1983: A note on the characterization of the max and min operators. *Inf. Sci.,* **30,** 5–10.

Zadeh, L., 1965: Fuzzy sets. *Inf. Control,* **8,** 338–353.

Table 1. Performance of the chromosome obtained by tuning the BWER algorithm using the GA. The numbers are traditional CSIs and HSSs obtained by thresholding the detections at a 0.75 confidence level.

^{1}

A better representation, from the point of view of optimization using GAs, is each digit of the real and imaginary parts. Instead of using decimal numbers, one could also use the digits of the numbers to a base 2, i.e., the bits with which they are stored in computer memory.

^{2}

The probability density of *x*_{2} given a value of *x*_{1} is 1/(1 − *x*_{1}) in the range [*x*_{1}, 1) and zero everywhere else. So, the probability that both *x*_{1} and *x*_{2} are less than a number *t* is given by

$$P(x_1 < t,\, x_2 < t) = \int_0^t \frac{t - x_1}{1 - x_1}\,dx_1 = t + (1 - t)\log(1 - t).$$

Using Bayes’s theorem and *P*(*x*_{1} < *t* | *x*_{2} < *t*) = 1, we can evaluate *P*(*x*_{2} < *t*). The median value of *x*_{2} is the value of *x* for which *P*(*x*_{2} < *x*) = 0.5, and the probability distribution can be obtained from *P*(*x*_{2} < *x*) using

$$P_{x_2}(x) = \frac{d}{dx}\,P(x_2 < x).$$

^{3}

Since we generate random numbers in the range [0, 1), *x*_{max} should really be one *x*_{quant} past the largest number that the underlying value can take.

^{4}

We also peg BWERs with confidences above 0.75 as having strong endorsements. These numbers are ad hoc and could as well be 0.33 and 0.67. The GA will do its tuning according to whichever numbers are used. The important thing is that some detections are discarded using a crisp threshold and that the crisp threshold is consistent between the run-time and tuning cases.

^{5}

We define the distance *d* between two points (*x*_{1}, *y*_{1}) and (*x*_{2}, *y*_{2}) as *d* = |*x*_{1} − *x*_{2}| + |*y*_{1} − *y*_{2}| rather than using the computationally more expensive Euclidean distance.

^{6}

The value of *σ* can be obtained from the appendix of any engineering or statistics text if the value of *p*_{m} is known beforehand.

^{7}

There was no particular rationale in deciding to use the Heidke skill score rather than the critical success index; either one could be used, so we used both. For rare-event situations, Marzban (1998) derives the result that the CSI and HSS are optimized simultaneously.