## Abstract

An important focus of research in the forecast verification community is the development of alternative verification approaches for quantitative precipitation forecasts, as well as for other spatial forecasts. The need for information that is meaningful in an operational context and the importance of capturing the specific sources of forecast error at varying spatial scales are two primary motivating factors. In this paper, features of precipitation as identified by a convolution threshold technique are merged within fields and matched across fields in an automatic and computationally efficient manner using Baddeley’s metric for binary images.

The method is carried out on 100 test cases, and 4 representative cases are shown in detail. Results of merging and matching objects are generally positive in that they are consistent with how a subjective observer might merge and match features. The results further suggest that the Baddeley metric may be useful as a computationally efficient summary metric giving information about location, shape, and size differences of individual features, which could be employed for other spatial forecast verification methods.

## 1. Introduction

A growing interest in quantitative precipitation forecasts (QPF) from industry, agriculture, government, and other sectors has created a demand for more detailed rainfall predictions. Rainfall is one of the most difficult weather elements to predict correctly (Ebert et al. 2003). Traditional verification scores can give misleading or uninformative results because of their inability to distinguish sources of error and their high sensitivity to errors caused by even minor spatial (or temporal) displacements of precipitation areas, magnitude differences, and other distortions (Brown et al. 2007). Numerous methods have subsequently been proposed to better characterize forecast performance of QPF and other high-resolution spatial forecasts (e.g., convection, reflectivity, strong winds, etc.). Browning et al. (1982) established the importance of identifying sources of error through a subjective verification approach. The technique of Hoffman et al. (1995) decomposes the forecast error into displacement, amplitude, and residual components, and Ebert and McBride (2000) followed with an entity-based verification approach that identifies displacement vectors and decomposes the error statistics into their sources, such as displacement and pattern errors. Micheas et al. (2007) extend the concepts of Hoffman et al. (1995) and Ebert and McBride (2000) to identify errors resulting from propagation, dilation, intensity, rotation, and shape through Procrustes shape analysis methods; this approach is similar to the fuzzy logic approach of Davis et al. (2006a), but is set in the framework of traditional analysis of variance methods. Marzban and Sandgathe (2006) use statistical cluster analysis to identify *objects* of intense precipitation at different scales.

Other types of methods for verifying spatial forecasts such as QPF have been proposed (e.g., Briggs and Levine 1997; Casati et al. 2004; Harris et al. 2001; Ebert 2007). Of interest for the present paper are approaches based on the creation of objects (e.g., Du and Mullen 2000; Brown et al. 2007; Marzban and Sandgathe 2006, 2008; Davis et al. 2006a). In each approach, it is necessary to determine the *distance* between two objects of precipitation, which is not always straightforward because of the possibility for widely varying shapes and sizes of objects. Furthermore, once objects have been identified it is often necessary to subsequently merge some objects together that are part of the same weather system. Similarly, for verification it is usually necessary to then match objects in one field to objects in the other, which again relies on the use of some type of distance measurement.

The strategies of Marzban and Sandgathe (2006, 2008) encompass the merging and matching steps. In particular, one begins with each pixel as an object, and the pixels are joined iteratively using statistical cluster analysis until only one object remains. At each step, a distance measure (possibly involving a meteorological covariate) is used to determine which clusters of pixels to merge. Marzban and Sandgathe (2008) apply the procedure to a composite of the two fields together, keeping track of which pixels belong to each field to subsequently calculate a traditional verification score at each step. There are similarities with the method proposed here, but it is important to point out the differences. First, the method proposed here relies on predefined fields of binary objects, whereas the methods of Marzban and Sandgathe (2006, 2008) iteratively define new objects at each step. Second, in order to alleviate the computational burden of the iterative procedure, Marzban and Sandgathe (2008) consider subsets of the clusters chosen randomly. The approach proposed here handles the computational challenge by choosing a subset of possible object merges and matches based on optimized values of the chosen distance metric. Finally, to the best of our knowledge, the metric employed here has not previously been used for forecast verification, though it could easily be employed as the distance metric in the approaches of Marzban and Sandgathe (2006, 2008).

We investigate the use of this metric proposed by Baddeley (1992a) in measuring the distance between two binary images. Additionally, we propose a technique for merging and matching objects that makes repeated use of this metric.

We begin with background on the data used here in the next section, followed by a brief introduction to the Baddeley delta metric in section 3. Section 4 discusses an algorithm based on this metric for matching objects between images and merging objects within images. Section 5 presents the results for four test cases, and section 6 provides a summary and discussion.

## 2. Binary image data

The main objective of this paper is to make comparisons between two binary images, so the focus is not on what these images actually represent. However, because this work is motivated by the need to better verify QPFs, we give some background on the data here. We refer the reader to Brown et al. (2007) for a more detailed description of the datasets.

The forecast image is created from a convolution threshold technique (Davis et al. 2006a) applied to the Weather Research and Forecasting (WRF) model for precipitation (e.g., Fig. 1, left column). The technique identifies areas of intense precipitation by defining objects that cover areas where precipitation intensity is above a specified threshold after having smoothed the field by a convolution technique. The result is a binary image defining contiguous objects of intense precipitation.

The WRF model is a mesoscale numerical weather prediction system designed for use by both operational forecasters and atmospheric researchers. [See the WRF Web site online at http://wrf-model.org/ for more information on the WRF model and related publications (e.g., Skamarock et al. 2005; Wicker and Skamarock 2002).] Henceforth, these binary images will be referred to as the *forecast* images.

The same convolution threshold technique is then applied to a stage-IV analysis dataset for verifying the WRF QPF model (e.g., Fig. 1, right column). The NCEP stage IV is a mosaic of regional multisensor analysis produced with a manual quality control step by the National Weather Service (NWS) River Forecast Centers (RFCs; Lin and Mitchell 2005). These images will be referred to here as the *analysis* images.

Attention is given to making comparisons between these resulting pairs of binary images. Generally, the images consist of several large objects that may be similar to each other in shape, number, and placement. It is possible for two separate objects in one image to closely match a single object in the other image. Therefore, it is desirable to find the best object mergings within an image, as well as the best object matches between images. Once the best mergings and matchings have been found, the next step is to compare the two images.

## 3. Baddeley metric for comparing binary images

Our proposed method for merging objects within each image and matching the objects across images makes repeated use of the Baddeley delta metric (Baddeley 1992a, b). Therefore, we summarize this metric here, beginning with a brief discussion of metrics and distances [see Baddeley (1992b) for more on metrics and, in particular, the Hausdorff metric discussed below].

A metric, Δ, between two sets of pixels *A* and *B* contained in a pixel raster *X* satisfies the following axioms:

$$\Delta(A, B) \ge 0; \quad \Delta(A, B) = 0 \iff A = B; \quad \Delta(A, B) = \Delta(B, A); \quad \Delta(A, B) \le \Delta(A, C) + \Delta(C, B). \qquad (1)$$

Similarly, a metric between two pixels **x** and **y**, say *ρ*(**x**, **y**), in a raster of pixels can be defined by replacing Δ with *ρ* and *A*, *B* with **x**, **y** in Eq. (1). In the present context, the sets of pixels *A* and *B* represent objects as defined in section 2.

Next, let *d*(**x**, *A*) denote the shortest distance from pixel **x** ∈ *X* to the set of pixels *A* ⊆ *X*. That is,

$$d(\mathbf{x}, A) = \inf\{\rho(\mathbf{x}, \mathbf{a}) : \mathbf{a} \in A\}, \qquad (2)$$

with *d*(**x**, ∅︀) ≡ ∞ and *ρ*(·, ·) a metric.^{1} Because images can be relatively large, it is important to consider methods that can be computed rapidly; the distance transform algorithm (Borgefors 1986; Rosenfeld and Pfaltz 1966, 1968) computes pixel distances quickly, and is used in the analyses here.

One method for comparing binary images, and the one that motivates the Baddeley metric, is the Hausdorff metric; it has also been applied to verification problems by others (e.g., Venugopal et al. 2005).

Let *A*, *B* ⊆ *X*, with *X* a raster of pixels. The Hausdorff distance is given by

$$H_\rho(A, B) = \max\left\{\sup_{\mathbf{x} \in A} d(\mathbf{x}, B),\ \sup_{\mathbf{x} \in B} d(\mathbf{x}, A)\right\}, \qquad (3)$$

with *d*(**x**, ·) as defined in Eq. (2). That is, *H*_{ρ}(*A*, *B*) is the maximum distance from a point in one set to the nearest point in the other set. Because the sets *A* and *B* considered here are finite binary sets, Eq. (3) can be written as

$$H_\rho(A, B) = \sup_{\mathbf{x} \in X} |d(\mathbf{x}, A) - d(\mathbf{x}, B)|. \qquad (4)$$

Note that this second definition for *H* involves pixels from the entire raster, *X*, instead of simply those in *A* and *B* as in the first representation.

There are some important problems with Eq. (4). In particular, *H* has a high sensitivity to noise because a single error pixel can cause elevation of *H* to its maximum possible value because of the supremum in its definition. See Baddeley (1992a, b) for more on the drawbacks of the Hausdorff metric as an error measure for images.

The Baddeley delta metric replaces the supremum in (4) with an *L*_{p} norm to stabilize the measure, and applies a further transformation to *d*(**x**, ·) to ensure that the result is a metric. Specifically,

$$\Delta_w^p(A, B) = \left[\frac{1}{N} \sum_{\mathbf{x} \in X} \left|w(d(\mathbf{x}, A)) - w(d(\mathbf{x}, B))\right|^p\right]^{1/p}, \qquad (5)$$

where *N* is the total number of pixels in the raster *X*; *p* is chosen a priori; and *w* is a concave continuous function that is strictly increasing at zero. For applications, Baddeley (1992a) suggests using the cutoff transformation

$$w(t) = \min(t, c), \qquad (6)$$

for a fixed *c* > 0.

For *p* → ∞ in (5), Δ tends toward the maximum difference in distances between the two sets, and would then be equivalent to (4). For *p* → 0, Δ tends toward the minimum; *p* = 1 gives the usual arithmetic average of the differences in distance, and *p* = 2 gives the average of the common Euclidean norm for each difference [see, e.g., Nychka and Saltzman (1998) or Johnson et al. (1990) for more on *L*_{p} norms]. A wide range of choices for *p* and *c* will work, and for our purposes we use *p* = 2 for computational convenience and efficiency.

Note that Δ^{p}_{w}(*A*, *A*) = Δ^{p}_{w}(∅︀, ∅︀) = 0, and unlike the Hausdorff metric Δ^{p}_{w}(*A*, ∅︀) = Δ^{p}_{w}(∅︀, *A*) < ∞.
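Eqs. (5) and (6) translate directly into code. The following is our own minimal sketch (hypothetical names; a real raster would call for the distance transform rather than brute force); note how the cutoff keeps the comparison with an empty set finite:

```python
import numpy as np

def dist_to_set(mask):
    # d(x, A) for every raster pixel; infinity when the set is empty.
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return np.full(mask.shape, np.inf)
    yy, xx = np.indices(mask.shape)
    d2 = (yy[..., None] - rows) ** 2 + (xx[..., None] - cols) ** 2
    return np.sqrt(d2.min(axis=-1))

def baddeley_delta(A, B, p=2, c=5.0):
    """Eq. (5) with the cutoff transformation w(t) = min(t, c) of Eq. (6).
    The cutoff bounds each term by c, so Delta(A, empty set) stays finite,
    unlike the Hausdorff metric."""
    wA = np.minimum(dist_to_set(A), c)
    wB = np.minimum(dist_to_set(B), c)
    return np.mean(np.abs(wA - wB) ** p) ** (1.0 / p)

A = np.zeros((10, 10), bool); A[2:5, 2:5] = True
B = np.zeros((10, 10), bool); B[3:6, 3:6] = True
empty = np.zeros((10, 10), bool)
d_AB = baddeley_delta(A, B)
d_A_empty = baddeley_delta(A, empty)
```

Because |*w*(·) − *w*(·)| never exceeds *c*, dividing the result by *c* (as done in section 5) standardizes Δ to the interval [0, 1].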

Qualitatively, Eq. (5) gives an average of the difference in position of two sets *A* and *B* relative to each point **x** ∈ *X*. We show next that, with *w*(·) as defined in Eq. (6) and *p* = 2, the metric in (5) is a type of average cluster distance.

Three principal cases can be identified within the metric in (5) for two binary image objects *A* and *B* and a given pixel **x** ∈ *X*, pertaining to the difference *w*[*d*(**x**, *A*)] − *w*[*d*(**x**, *B*)] when *w*(·) is defined by (6). Namely, (i) *d*(**x**, *A*) ≤ *c* and *d*(**x**, *B*) ≤ *c*, (ii) only one of *d*(**x**, *A*) and *d*(**x**, *B*) is less than or equal to *c*, or (iii) *d*(**x**, *A*) > *c* and *d*(**x**, *B*) > *c*. Using the law of cosines on the first case, it is easy to see that for *p* = 2 the squared difference {*w*[*d*(**x**, *A*)] − *w*[*d*(**x**, *B*)]}^{2} can be written as

$$\{d(\mathbf{x}, A) - d(\mathbf{x}, B)\}^2 = \rho(\mathbf{x}_A, \mathbf{x}_B)^2 - 2\, d(\mathbf{x}, A)\, d(\mathbf{x}, B)\, (1 - \cos\theta), \qquad (7)$$

where *ρ*(**x**_{A}, **x**_{B}) is the distance between the point {**x**_{A} : **x**_{A} ∈ *A* and **x** ∈ *X*, *ρ*(**x**, **x**_{A}) = *d*(**x**, *A*)} and the point {**x**_{B} : **x**_{B} ∈ *B* and **x** ∈ *X*, *ρ*(**x**, **x**_{B}) = *d*(**x**, *B*)}, and *θ* is the angle between the line segments adjoining the point **x** ∈ *X* and the two points **x**_{A} and **x**_{B}, respectively (Fig. 2). That is, **x**_{A} and **x**_{B} are the points of *A* and *B* nearest to **x**.

For the second case, the contributions and reductions to the overall metric are relative to the constant term, *c*, and a (smaller) value relating to the distance from **x** ∈ *X* to the nearer object. For the third case, there is no contribution or reduction to the metric because both transformed distances equal *c* and their difference is zero. Therefore, it can be seen from the form in (7) that Δ of Eq. (5) with *p* = 2 yields a type of average pixel distance between the sets (objects) *A* and *B*.
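The law-of-cosines identity behind Eq. (7), {*d*(**x**, *A*) − *d*(**x**, *B*)}² = *ρ*(**x**_{A}, **x**_{B})² − 2 *d*(**x**, *A*) *d*(**x**, *B*)(1 − cos *θ*), holds for any triangle formed by **x** and the two nearest points, and can be checked numerically. This is our own sketch, treating the three points as arbitrary planar locations:

```python
import math
import random

random.seed(1)
# A pixel x and the nearest set points x_A and x_B, taken here as arbitrary
# planar points purely to check the identity.
x = (random.uniform(-5, 5), random.uniform(-5, 5))
xA = (random.uniform(-5, 5), random.uniform(-5, 5))
xB = (random.uniform(-5, 5), random.uniform(-5, 5))

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

dA, dB, rho = dist(x, xA), dist(x, xB), dist(xA, xB)
# Law of cosines recovers the angle theta at x between segments x-x_A and x-x_B.
cos_theta = (dA**2 + dB**2 - rho**2) / (2 * dA * dB)

lhs = (dA - dB) ** 2                          # squared difference in Eq. (7)
rhs = rho**2 - 2 * dA * dB * (1 - cos_theta)  # right-hand side of Eq. (7)
```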

## 4. Merging and matching binary image objects

After initially identifying objects within the analysis and forecast images as in Fig. 1 (bottom row), it is necessary to find the optimal mergings within each image, as well as which objects to match from one image to the other. Ideally, the Baddeley metric (henceforth, Δ) would be computed for all possible mergings, and each merging compared to each object of the other image. However, if there are *m* forecast objects and *n* analysis objects, then there are 2^{m} × 2^{n} comparisons to make, which would generally be too computationally intensive to be performed operationally. In this section, we propose a method for finding a reasonable subset of the possible mergings.

The proposed technique computes Δ between each of the original objects from one image to the other, and then merges objects in each image based on a ranking of the initial Δs. Specifically, let *i* = 1, . . . , *m* denote the *i*th forecast object, and *j* = 1, . . . , *n* the *j*th analysis object.

1. Compute Δ for each object from the forecast image with each object from the analysis image. For convenience, these Δ values are stored in an *m* × *n* matrix, **ϒ**.
2. Rank the values from step 1. For the *i*th forecast object, let *j*_{k}, *k* = 1, . . . , *n* denote the analysis objects ordered from lowest to highest Δ between forecast object *i* and each analysis object *j*_{k}. Similarly, for the *j*th analysis object, denote by *i*_{ℓ}, ℓ = 1, . . . , *m* the forecast objects ordered from lowest to highest Δ when comparing analysis object *j* to each forecast object *i*_{ℓ}.
3. Compute Δ between the *i*th forecast object and analysis object *j*_{1}, then between *i* and *j*_{1} merged together with *j*_{2} (i.e., *j*_{1} ∪ *j*_{2}), then *i* and *j*_{1} ∪ *j*_{2} ∪ *j*_{3}, and so on until object *i* is compared to the merging of all *n* objects from the analysis image. For convenience, these values are stored in an *m* × *n* matrix, **Ψ**.
4. Perform step 3 in the other direction. That is, compute Δ between object *j* and *i*_{1}, *j* and *i*_{1} merged with *i*_{2}, etc. Again, for convenience, we store these values in an *m* × *n* matrix, **Ξ**.
5. Merge and match objects by comparing the resulting Δ values in the matrices **ϒ**, **Ψ**, and **Ξ**. Optionally, a threshold, *u*, may be employed so that objects are not matched when Δ > *u*. This final step is described more explicitly in the text that follows.

For step 5 of the above algorithm, let *Q* = {**ϒ**, **Ψ**, **Ξ**}. Merge and match objects by comparing the resulting Δ values in *Q* in the following manner. First, those objects that lead to the smallest Δ value in *Q* are merged and matched together to form the first group of matching objects. To prevent the same object from being matched more than once, the Δ values that were computed with any of this first group of objects are removed from *Q*. Then the smallest Δ value is located from those remaining Δ values in *Q*. Objects that give rise to this second smallest Δ value are merged and matched to form the second group. This process continues in a similar manner until all objects are exhausted.
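The greedy selection just described can be sketched as follows. This is our own illustration with hypothetical names; we assume the candidate pairings and their Δ values (drawn from **ϒ**, **Ψ**, and **Ξ**) have already been collected into a single dictionary keyed by the object sets involved:

```python
def greedy_match(candidates, u=None):
    """Greedily select disjoint (forecast, analysis) pairings in order of
    increasing Delta. `candidates` maps (frozenset of forecast object ids,
    frozenset of analysis object ids) -> Delta. Objects used by an accepted
    pairing are excluded from later picks; an optional threshold u rejects
    any pairing with Delta > u."""
    matches = []
    used_f, used_a = set(), set()
    for (fs, an), delta in sorted(candidates.items(), key=lambda kv: kv[1]):
        if u is not None and delta > u:
            break  # all remaining candidates have even larger Delta
        if fs & used_f or an & used_a:
            continue  # some object here is already merged/matched
        matches.append((fs, an, delta))
        used_f |= fs
        used_a |= an
    return matches

# Toy example: forecast objects {1, 2}, analysis objects {1, 2}. The merged
# candidate {1, 2}->{1} loses because the single match 1<->1 is picked first.
cands = {
    (frozenset({1}), frozenset({1})): 0.02,
    (frozenset({2}), frozenset({2})): 0.05,
    (frozenset({1, 2}), frozenset({1})): 0.04,
}
result = greedy_match(cands, u=0.10)
```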

Note that the above algorithm will allow merged objects from one field to match with single objects from the other field, but will not allow for merged objects in one field to match merged objects in the other field. A further iterative step could be added to allow for such matchings, but at the cost of higher computational burden, and the present study does not explore this possibility.^{2}

The next section will begin with an example of the above algorithm for clarification of the procedure.

## 5. Results from four test cases

Here, the strategy outlined in the previous section is demonstrated with four test cases. These represent typical scenarios from the 100 test cases inspected in this study to give an impression of the strengths and weaknesses of the approach. Note that there are not many objects in these cases, so one could compare all 2^{m} × 2^{n} combinations, but cases do exist where there are enough objects to make all comparisons expensive. Furthermore, even when 2^{m} × 2^{n} is not prohibitively large for any given field, the computational burden can still be large when comparing a large number of fields.

The first of the four test cases is graphed in Fig. 1, which shows both the raw data (top row) and the binary images with objects labeled with numbers (bottom row) for each of the WRF forecast (left column) and stage-IV analysis (right column), as defined by the convolution threshold technique of Brown et al. (2007) and Davis et al. (2006a); this is the same test case as shown in the example of Fig. 2 in Davis et al. (2006a). The binary images for the other three cases are shown in Fig. 3.

As mentioned in section 3, a wide range of choices for *p* and *c* will work. We use *p* = 2 in Eq. (5) for each of these test cases for computational convenience and efficiency. From subjective exploration, we choose *c* = 100 pixels, corresponding to about 400 km, as this value consistently yielded reasonable results. Furthermore, we standardize Δ to be in the interval [0, 1] by dividing by *c*.

For illustration of the merging and matching procedure detailed in the previous section, we step through the algorithm for the first test case. Table 1 gives the matrix **ϒ** from step 1. Note that although the number of objects in each of these fields is the same, this is not necessary for the procedure (the number of objects in either field is arbitrary). Step 2 requires that these values be ranked, and subsequent analyses are based on mergings that progress through these rankings in order to lessen the computational burden in a reasonable manner. The best matches of each forecast object compared with each analysis object are shown in boldface, and the best match of each analysis object compared to each forecast object is indicated by a superscripted asterisk. Table 2 shows the matrix, **Ψ**, from step 3 of the algorithm, and Table 3 shows the matrix, **Ξ**, from step 4.

Note that the best overall match from Table 1 matches object 3 from the forecast image to object 3 from the analysis image (Δ ≈ 0.02). Because Δ does not decrease with any of the mergings (Tables 2 and 3), neither of these objects is merged with any other object. Subjective inspection of Fig. 1 (bottom row) suggests that this result is reasonable, as both objects are of roughly the same size, shape, and location.

The next best match of single objects matches forecast object 5 with analysis object 4. Again, no better values are obtained by merging objects in the analysis image (see Table 2, row 5), nor are any better values found from merging objects in the forecast image (see Table 3, column 4). Therefore, forecast object 5 is matched to analysis object 4 without any mergings. Examination of Fig. 1 (bottom row) shows that both objects are small, with forecast object 5 the smaller. There also appears to be a slight spatial (or possibly temporal) displacement, with the forecast object lying a little too far to the east compared with analysis object 4. Nevertheless, both objects are relatively small and similar in location, though forecast object 5 is enough smaller than analysis object 4 to make comparison of their shapes implausible.

Moving on to the next best single object match, we would compare forecast object 5 to analysis object 3; however, forecast object 5 has already been matched to analysis object 4 with a better Δ value (cf. ≈0.08 and ≈0.06), and because no further mergings improve this value, this matching is not accepted by the procedure. Although the two objects are in the same general area, it is clear that the former match is superior to the latter, indicating that the procedure made a correct decision here. The next best match is between forecast object 8 and analysis object 5, and no further mergings improve the Δ value, so this matching is accepted without any mergings. The forecast object here is displaced spatially to the west of this analysis object, and is also a bit smaller. Subjective judgment would likely prefer this match over the other possibility of matching forecast object 8 with analysis object 6, indicating that the choice made by the procedure is reasonable. One might subjectively match forecast object 6 with analysis object 5, or possibly merge forecast objects 6 and 8 and subsequently match them to analysis object 5. Nevertheless, the selection by the procedure to match the single forecast object 8 with the single analysis object 5 is consistent with the choice a subjective observer might make.

Object 1 from each image shows the next best match. However, careful inspection of matrices **Ψ** and **Ξ** (Tables 2 and 3, respectively) shows that analysis object 2, when matched with the merging of forecast objects 1, 2, and 4 yields a lower Δ than is obtained when simply matching forecast object 1 with analysis object 1. Again, from a subjective point of view, this is a reasonable choice, though the inclusion of analysis object 1 might be desirable (i.e., the merging of analysis objects 1 and 2). This demonstrates a limitation of the approach that merged objects from one field cannot be matched to merged objects from the other field. Nevertheless, the result is reasonable in that a subjective observer could argue against such a merging; analysts might have quite varied interpretations.

Objects 7 from the forecast and 6 from the analysis yield the next best single object match from the matrix, **ϒ** (Table 1). Note that the second best match for forecast object 7 is with analysis object 7. In fact, when these two analysis objects are merged, the match with forecast object 7 provides a much lower Δ than the unmerged case. As there are no better mergings or matchings for these objects, the analysis objects 6 and 7 are merged and together are matched with forecast object 7. Again, this appears reasonable from a subjective standpoint as the forecast object covers much of central Texas beginning on the eastern border with Mexico and ending shortly before the border with Louisiana. Although the analysis objects do not cover much of central Texas, there are no other analysis objects in this area, and there appear to be (spatial) overlaps with these objects and forecast object 7. This may indicate an error in forecast intensity, and one could investigate this by obtaining binary images from the convolution threshold algorithm using different thresholds. Finally, analysis object 8 is in the vicinity of forecast object 7, but is clearly displaced spatially. Inspection of the original raw field (Fig. 1, top row) indicates that the forecast may have indeed overforecast the precipitation intensity over central Texas, and that analysis object 8 appears to be part of the same weather system as objects 6 and 7. This result highlights the important point that this procedure does not account for information in the raw field. Ideally, meteorological information should be used to inform the merging process as important information about the forecast can still be lost in this type of verification scheme, such as spatial displacement, intensity, etc.

Table 4 summarizes the results discussed above for the first test case. The Δ values for each of these matches are displayed in the last column. Note that analysis objects 1 and 8 are not merged with any other objects, nor are they matched to any objects in the forecast image. As discussed, a subjective observer looking only at the binary object images may or may not merge these objects with other nearby objects. Inspection of the raw fields, however, suggests that both of these objects are likely part of the same weather systems as nearby matched objects, so that perhaps they should have been merged in these cases.

Table 5 shows the final results for the second test case (Fig. 3, top row). The procedure merges objects 1, 2, and 4 of the analysis image and matches this to object 2 of the forecast image. This is found to be the best match for this test case. Note that the Δ value for this match is relatively higher than those for test case 1 (Δ ≈ 0.12 relative to Δ ∈ [0.02, 0.08] for test case 1). Indeed, from subjective inspection of 100 test cases from 2 July 2001 to 10 July 2001, a reasonable threshold for Δ appears to be about 0.10. Clearly, analysis objects 1 and 2 are heavily displaced to the north of forecast object 2, though there is some overlap. It is not clear from the binary analysis image, or from its corresponding raw field (not shown), that analysis object 4 is part of the same weather system as objects 1 and 2. Nevertheless, the forecast object covers a large region with the southern and northern edges touching all of these analysis objects. Therefore, from a purely binary object viewpoint, the merging and matching is reasonable. Finally, objects 3 from both images are matched with Δ ≈ 0.29. Although the two objects are of about the same size and shape, and at about the same latitude, they are relatively far apart in the east–west direction. Note that Δ alone does not indicate whether two objects are different in size or location, but incorporates both criteria into the total. Furthermore, a threshold of 0.30 was used initially on the 100 cases, but inspection of these results suggests that a better threshold would be 0.10 for these data. If this threshold is enforced, then there would be no matches for this case.

Final results for test case 3 (Fig. 3, middle row) are given in Table 6. There is only one analysis object for this test case, and it is matched to the unmerged forecast object 1. The two objects overlap spatially, and although both are small, the analysis object is considerably smaller.

Finally, Table 7 gives results for test case 4 (Fig. 3, bottom row). This test case is similar to the previous case in that there are not many objects, and the objects are all relatively small in size. Subjectively, one would likely match forecast object 1 with analysis object 2, as these are relatively close in size, shape, and location, although the forecast object is slightly displaced to the east and covers slightly more area than the corresponding analysis object. One would also likely match forecast object 2 with analysis object 3. Again, the two objects are similar in size, shape, and location, with the forecast object displaced slightly to the south and west of the corresponding analysis object. The small analysis object 1 would likely not be matched to either of the forecast objects, as it is spatially very far away from both. The results from applying the merging and matching scheme of section 4 agree with the above subjective arguments. Note also that both Δ values are near 0.10, which, as mentioned above, was generally found to be indicative of poorer agreement between two objects across all of the 100 test cases. Values larger than this tend to represent greater discrepancies between images, while values approaching it tend to be associated with images that are very similar overall but show relatively serious departures from each other.

## 6. Summary and discussion

Many methods have been proposed for providing more useful verification of high-resolution spatial forecasts. One class of methods involves the identification of features or objects in each of the forecast and observed fields. For many of these methods it is necessary to somehow identify which, if any, objects within a field should be considered to belong to a larger group of objects within a field (merging). Furthermore, it is often necessary to identify which objects (or groups of objects) should be compared across fields (matching). An automatic and computationally efficient strategy for merging and matching is presented here that is shown to make reasonable merges and matches. The strategy makes heavy use of the Baddeley image metric, Δ, which is found to be a particularly useful summary metric that accounts for spatial location, coverage, and shape differences among identified features.

The Baddeley metric (Δ) provides a useful summary measure for comparing two sets of binary images because it is robust to small changes in location (displacement), orientation, amplitude, and distortion. This is an important feature for verifying forecasts where areas of precipitation may be slightly displaced in time (and subsequently space), but are otherwise relatively accurate. These characteristics are verified by the test cases investigated here. The measure does not, however, distinguish the type of errors that may exist. For example, the metric cannot discern if the differences in the images result from amplitude or distortion errors. For this, one would need to apply other techniques (e.g., Davis et al. 2006a, b; Brown et al. 2007; Ebert and McBride 2000; Hoffman et al. 1995; Micheas et al. 2007; Marzban and Sandgathe 2008).

For determining which objects to merge in one image, and which to match between images, the four test cases suggest our approach has great promise, as confirmed by subjective evaluation of 100 test cases. However, the method detailed here relies on using both the forecast and analysis images together in order to determine the best possible mergings. This is acceptable if it is desired to compare only one forecast to the analysis field, but frequently it is desired to compare multiple forecasts against a single reference, such as the analysis field. In such a case, it is generally best to make comparisons using the same analysis image (i.e., with the same object mergings) for all forecast images. Nevertheless, even with this objective in mind, this approach uses the same basic objects in the analysis image to compare against each forecast image. The images differ only in how the individual objects are merged, and in each case they are merged to minimize Δ, so that one merging does not bias results toward a particular forecast. This is appropriate here because, while the merged objects are found to be consistent with how a subjective evaluator might merge objects, they are not merged based on meteorological criteria.

It would be useful to allow meteorological covariates (e.g., rainfall regime, storm organization, etc.) to inform the mergings and matchings. Although it may be possible to incorporate such information into the distance metric *d*(**x**, *A*) (e.g., Marzban and Sandgathe 2006, 2008), such a scheme may nullify the computationally efficient distance transform method, making our procedure highly inefficient. Nevertheless, the procedure proposed here allows for a computationally efficient automated method for distinguishing between objects that are close or far apart from one field to another. As a result, the procedure may contribute to other approaches where an algorithm is needed to merge and match binary image objects.

## Acknowledgments

The authors thank the reviewers for their helpful and constructive comments that made this a better paper. We also thank Cindy Halley Gotway for her assistance with some of the figures.


## Footnotes

*Corresponding author address:* Eric Gilleland, Research Applications Laboratory, National Center for Atmospheric Research, 3450 Mitchell Lane, Boulder, CO 80301. Email: ericg@ucar.edu

^{1}

The inf in Eq. (2) stands for infimum, which is defined as the greatest lower bound of the set. Similarly, for the supremum (sup) of Eq. (3), which is the least upper bound of the set. For sets that contain the greatest lower bound (least upper bound), the infimum (supremum) is equivalent to the minimum (maximum) element of the set.

^{2}

Note that the examples shown in this study do not contain more than eight objects per field, so that all 2^{m} × 2^{n} combinations could be compared, and a more complicated procedure could be employed to allow merged objects from one field to match to merged objects in the other field. This is beyond the scope of this paper, as we are interested in an automatic and computationally efficient method that could be used for larger *m* and *n*.