Measuring Displacement Errors with Complex Wavelets

Sebastian Buschow, Institute of Geosciences, University of Bonn, Bonn, North Rhine-Westphalia, Germany

https://orcid.org/0000-0003-4750-361X
Open access

Abstract

When highly resolved precipitation forecasts are verified against observations, displacement errors tend to overshadow all other aspects of forecast quality. The appropriate treatment and explicit measurement of such errors remains a challenging task. This study explores a new verification technique that uses the phase of complex wavelet coefficients to quantify spatially varying displacements. Idealized and realistic test cases from the MesoVICT project demonstrate that our approach yields helpful results in a variety of situations where popular alternatives may struggle. Potential benefits of very high spatial resolutions can be identified even when the observational dataset is coarsely resolved itself. The new score can furthermore be applied not only to precipitation but also variables such as wind speed and potential temperature, thereby overcoming a limitation of many established location scores.

Significance Statement

One important requirement for a useful weather forecast is its ability to predict the placement of weather events such as cold fronts, low pressure systems, or groups of thunderstorms. Errors in the predicted location are not easy to quantify: some established quality measures combine location and other error sources in one score, others are only applicable if the data contain well-defined and easily identifiable objects. Here we introduce an alternative location score that avoids such assumptions and is thus widely applicable. As an additional benefit, we can separate displacement errors into different spatial scales and localize them on a weather map.

© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Sebastian Buschow, s6sebusc@uni-bonn.de


1. Introduction

Location errors are the main reason why simulated meteorological fields like precipitation can often not be directly compared to observations in a gridpoint-wise manner. If a forecast misplaces a rain field, large differences are seen both at the predicted and the true, observed location of the feature; possible similarities between the two images are not rewarded and the magnitude of the displacement is not quantified. Due to this double-penalty effect, quality measures like the pointwise RMSE prefer forecasts with large, smooth structures over finer-scaled, arguably more realistic ones. The need for informative, objective evaluation of high-resolution forecasting systems led to the development of numerous new “spatial” verification techniques, which were surveyed in the Spatial Forecast Verification Methods Intercomparison Project (ICP; Gilleland et al. 2009) and its successor project Mesoscale Verification Intercomparison over Complex Terrain (MesoVICT; Dorninger et al. 2018). Besides facilitating the development of new methods and providing standardized test cases for their comparison, these projects classified nearly all existing approaches into five classes. The common thread is that each class uses a specific abstract representation of the underlying fields: neighborhood methods apply smoothing filters to essentially compare the average rainfall characteristics around each location. Similarly, scale separation methods split forecast and observation into individual frequency components via bandpass filters. Binary distance measures instead rely on the computation of distance maps, which measure the distance from each pixel to the next rainy pixel in one of the images. Field deformation methods consider the optimization problem of transforming one image into the other, for example, via optical flow algorithms. The fifth and perhaps most popular group of methods decomposes the fields into discrete objects and compares their properties.

Many of these methods aim to isolate individual aspects of forecast quality in order to specify the nature of a forecast’s error and, ideally, hint at the reasons for the shortcoming. Such scores would be useful even in the absence of double penalties, since the realism of a complex simulated field can hardly be described by a single number. A prime example is the object-based structure, amplitude, and location (SAL) method of Wernli et al. (2008). Weniger and Friederichs (2016) pointed out that SAL can be highly sensitive to the specifics of the object identification procedure and may not be appropriate for variables other than precipitation. Motivated by this, a new structure verification method based on scale separation instead of object decomposition was developed (Kapp et al. 2018; Buschow et al. 2019). In its final form, presented in Buschow and Friederichs (2021), this approach compares forecasts to observations in terms of their spatial scale, anisotropy, and direction (SAD). The scale component of SAD is similar to SAL’s S, albeit more specific in its interpretation (predicted structure too small or too large) and more effective in detecting the correct correlation structure (Buschow et al. 2019). The other two components measure how strongly a field is directed and what its preferred orientation is, two aspects that are not considered by SAL.

Taking further inspiration from SAL, this study aims to define a location score based on the same scale decomposition used in SAD. On the one hand, a verification that considers only the correlation structure (and perhaps the marginal distribution) is incomplete since forecasts with no location information are useless for many applications. On the other hand, the complex wavelet transform, which forms the technical core of SAD, affords us the unique opportunity to extract location information from the phases of the complex coefficients. As a further motivation, our literature survey in section 2 below indicates that SAL’s location score is often used but rarely useful. Last, wavelet transforms require no strong assumptions on the structure of the underlying fields, such as the existence of discrete objects and meaningful thresholds. Our new methodology is therefore not limited to intermittent, precipitation-like data but can be applied to any meteorological field of interest.

The remainder of this paper is structured as follows: in section 2, we review some of the most popular location scores from the literature, including SAL. Section 3 briefly introduces the SAD structure verification and the wavelet transform on which it is based. We define the new, phase-based location score in section 4 and demonstrate its behavior in a series of idealized tests. Realistic test cases from the MesoVICT project are introduced in section 5 and verified using the new and old location scores in section 6. Here we focus mainly on precipitation; potential temperatures and wind speed are included as a proof of concept as well. Section 7 summarizes the outcomes of our study and discusses the merits, as well as limitations of all tested scores.

2. Established displacement measures

Perhaps the most widely used pure location score is the L component of the object-based SAL score (Wernli et al. 2008). Denoting the centers of mass of the observed and forecast fields by $\mathbf{r}^{(\mathrm{obs})}$ and $\mathbf{r}^{(\mathrm{forc})}$, respectively, one-half of L is defined as
$$L_1 = \frac{\left|\mathbf{r}^{(\mathrm{obs})} - \mathbf{r}^{(\mathrm{forc})}\right|}{L_{\max}},$$
where $L_{\max}$ is the longest distance between two grid points within the domain. For the other half of L, rain fields are decomposed into discrete objects. In this study, we use the standard object identification procedure advocated in Wernli et al. (2009) and implemented in the SpatialVx R library: 1) convert precipitation into binary fields by thresholding at 1/15 times the 95th percentile of nonzero values in the respective field, 2) smooth the binary mask with a disk kernel, and 3) group contiguous nonzero regions into individual objects. We compute the centers $\mathbf{r}_{1,\dots,N}$ and precipitation totals $R_{1,\dots,N}$ of all $N$ objects in one of the fields and define the scatter around the overall center $\mathbf{r}$ as $\Delta r = \sum_{i=1}^{N} R_i\,|\mathbf{r}_i - \mathbf{r}| \,/\, \sum_{i=1}^{N} R_i$. The second half of L is then given by
$$L_2 = 2 \times \frac{\left|\Delta r^{(\mathrm{obs})} - \Delta r^{(\mathrm{forc})}\right|}{L_{\max}}.$$

The overall location score L = L1 + L2 is in the interval [0, 2] and consists of equal contributions from the overall center of mass and the scattering around that center. Considering the continued popularity and widespread use of SAL, it is worth pointing out that L has repeatedly failed to produce useful information on forecast performance. Table A1 in appendix A summarizes the results of L in 20 verification studies. Only three of these authors obtain any interpretable information from the location component (Hanley et al. 2013; Navascués 2013; Davolio et al. 2017); the others either fail to mention it entirely (Früh et al. 2007; Zimmer et al. 2008; Zacharov 2013), or explicitly state that L remained uninformative (e.g., Wittmann et al. 2010; Lindstedt et al. 2015; Kann et al. 2015; Maurer et al. 2017). While this list is not exhaustive, it nonetheless demonstrates that 1) there is considerable interest in a pure location score and 2) SAL’s location component is frequently uninformative. An obvious explanation for this shortcoming is that the co-occurrence of multiple different location errors in one forecast may be handled incorrectly. L1 is invariant under any rearrangement of the fields that leaves the center of mass unchanged. The behavior of L2, which is supposed to compensate for such effects, is not obvious when the number, placement, and intensity of objects can differ between the two fields in a variety of ways.
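For concreteness, both halves of L can be sketched in a few lines of base R. This is a minimal illustration, not the SpatialVx implementation: it assumes that `obs` and `fc` are rainfall matrices and that `lab_obs` and `lab_fc` are integer matrices of object labels (0 = background) produced by the identification procedure described above.

```r
# Intensity-weighted center of mass of a rainfall matrix f
centroid <- function(f) {
  ij <- which(is.finite(f), arr.ind = TRUE)   # (row, col) of every pixel
  colSums(ij * f[ij]) / sum(f)
}

# Weighted scatter of the object centroids around the overall center of mass
scatter <- function(f, lab) {
  cen <- centroid(f)
  ids <- setdiff(unique(as.vector(lab)), 0)
  R   <- sapply(ids, function(k) sum(f[lab == k]))    # object rain totals
  d   <- sapply(ids, function(k)                      # object-center distances
    sqrt(sum((centroid(f * (lab == k)) - cen)^2)))
  sum(R * d) / sum(R)
}

sal_L <- function(obs, fc, lab_obs, lab_fc) {
  Lmax <- sqrt((nrow(obs) - 1)^2 + (ncol(obs) - 1)^2) # domain diagonal
  L1 <- sqrt(sum((centroid(obs) - centroid(fc))^2)) / Lmax
  L2 <- 2 * abs(scatter(obs, lab_obs) - scatter(fc, lab_fc)) / Lmax
  c(L1 = L1, L2 = L2, L = L1 + L2)
}
```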

Field deformation is another widely cited approach to the explicit measurement of location errors. By computing an optimal vector field that transforms one image into the other, these techniques account for varying displacements in different parts of an image. Since the optimal flow is generally not divergence free, such scores register not only displacements but also errors in the spatial structure and, in the case of precipitation, the rate of occurrence. While field deformation methods are frequently mentioned in lists of popular spatial verification techniques, they are comparatively rarely used. Of the three deformation approaches included in the original ICP (Gilleland et al. 2010; Marzban and Sandgathe 2010; Keil and Craig 2009), only the displacement and amplitude score (DAS) of Keil and Craig (2007, 2009) appears to have seen use in multiple later studies. Their “pyramid matching” algorithm performs an exhaustive search for the best shift of each individual pixel for a series of coarse-grained versions of the two fields and combines the resulting displacement vectors into an optical flow field (for a complete description see the papers cited above). While this pragmatic approach is not guaranteed to find the true optimum, it will always give a result after a predetermined number of steps (unlike other methods that may even fail to converge) and can be computed at reasonably low cost. Compared to SAL, DAS is far less widely used, likely because it is more difficult to understand and implement. In fact, most subsequent applications are either coauthored by one of the original authors (Tafferner et al. 2008; Craig et al. 2012; Lange and Craig 2014) or acknowledge them for providing the code and/or assisting with the implementation (Nan et al. 2010; Skinner et al. 2016; Han and Szunyogh 2016). For this study, we have developed an implementation of DAS based on the imager R library (Barthelme 2018). We will refer to the vector magnitude of the flow that transforms the forecast into the observation, averaged over all locations with observed precipitation, as DKC.

A third way of quantifying location errors is given by binary distance measures. The basis for these scores is the so-called distance map d(r, X), which measures the distance from an arbitrary location r to the nearest element of the set X of grid points where the binary field under consideration has the value 1. For a recent review and comparisons of distance measures used in forecast verification, we refer to Gilleland et al. (2020) and Gilleland (2021a). In this study, we use Baddeley’s delta metric (Gilleland 2011) as an example from this class. Denoting by A and B the sets of grid points where predicted and observed precipitation exceed 0.1 mm, it is defined as
$$\mathrm{BD} = \left\{ \frac{1}{N} \sum_{i=1}^{N} \left| w[d(\mathbf{r}_i, A)] - w[d(\mathbf{r}_i, B)] \right|^p \right\}^{1/p}.$$

Here, we use the default SpatialVx implementation where p = 2 and the weight function is just the identity w(d) = d. Thus, for each pixel in the domain, we compare the distance to the next rainy pixel in forecast and observation. BD rewards overlap and can measure displacement errors but also reacts to differences in the general shape and spatial distribution of the rain areas.
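The definition translates directly into code. The base-R sketch below uses a deliberately naive O(N²) distance map to keep the logic visible (SpatialVx relies on fast distance transforms instead) and assumes `A` and `B` are logical matrices of the thresholded fields.

```r
# Distance from every pixel to the nearest TRUE pixel of X (brute force)
dist_map <- function(X) {
  pts <- which(X, arr.ind = TRUE)
  out <- matrix(Inf, nrow(X), ncol(X))
  if (nrow(pts) == 0) return(out)             # empty field: all distances Inf
  for (i in seq_len(nrow(X))) for (j in seq_len(ncol(X)))
    out[i, j] <- sqrt(min((pts[, 1] - i)^2 + (pts[, 2] - j)^2))
  out
}

# Baddeley's delta with the defaults used here: p = 2 and w(d) = d
baddeley_delta <- function(A, B, p = 2) {
  mean(abs(dist_map(A) - dist_map(B))^p)^(1 / p)
}
```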

3. Wavelet-based structure verification (SAD)

Wavelets were among the first proposed solutions to the double-penalty problem in forecast verification. The basic concept is to represent an image (in our case a meteorological field) as a superposition of functions ψ_{j,d,l}, which are limited to a specific scale j, direction d, and location l.¹ Classically, a suitable set of these so-called daughter wavelets ψ_{j,d,l} is obtained by applying a rescaling, rotation, and shift to a single mother wavelet ψ. To qualify as a wavelet, ψ must integrate to zero (localization in space) and its Fourier transform must decay sufficiently quickly (localization in frequency). The expansion coefficient for a specific daughter is defined as the scalar product with the image I(x, y), i.e.,
$$c_{j,d,l} = \iint I(x, y)\, \psi_{j,d,l}(x, y)\; dx\, dy.$$

Most forecast verification approaches based on wavelets rely on the multiresolution analysis (MRA) algorithm of Mallat (1989). The MRA is a wavelet transform that allows only scales that are whole powers of two, i.e., ψ_j(x, y) = 2^{−j/2} ψ(2^{−j}x, 2^{−j}y), and shifts the daughter wavelets at scale j in increments of 2^j. In two dimensions, the transform is not implemented as an explicit convolution [as written in Eq. (4)] but by a series of discrete high- and low-pass filters applied recursively to the rows and columns of the image. This separable construction leads to an orthogonal decomposition with three directions, namely, horizontal (high pass on the rows, low pass on the columns), vertical (vice versa), and diagonal (high pass on rows and columns). The popular intensity-scale verification method (Casati et al. 2004) uses an MRA to split the overall MSE between two binary images into its components on the various spatial scales. The double-penalty effect is thereby limited to the small-scale side of the decomposition while skill on larger scales can still be rewarded.
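As a concrete illustration of this separable construction, the sketch below performs one level of a real-valued 2D Haar MRA in base R. It is an illustration of the row/column filter logic only; the dtcwt introduced next replaces these filters with complex pairs but reuses the same recursion.

```r
# One level of a separable 2D Haar MRA; `im` needs even numbers of rows/columns
haar_level <- function(im) {
  lo <- function(m) (m[, c(TRUE, FALSE)] + m[, c(FALSE, TRUE)]) / sqrt(2)
  hi <- function(m) (m[, c(TRUE, FALSE)] - m[, c(FALSE, TRUE)]) / sqrt(2)
  L <- lo(im)                                  # low pass along the rows
  H <- hi(im)                                  # high pass along the rows
  list(smooth     = t(lo(t(L))),               # low pass in both directions
       horizontal = t(lo(t(H))),               # high pass rows, low pass columns
       vertical   = t(hi(t(L))),               # low pass rows, high pass columns
       diagonal   = t(hi(t(H))))               # high pass in both directions
}
```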

In Buschow and Friederichs (2021), we pursued a different approach and used wavelets to isolate information on the spatial correlation structure of the images while ignoring location errors entirely. In principle, this could be achieved by summing up the squared MRA coefficients over all locations l to obtain a wavelet spectrum. In analogy to the Fourier spectrum, information on the correlation structure could consequently be inferred following Eckley et al. (2010). However, Mallat’s original MRA is ill-suited to this task for two main reasons: 1) the distribution of energy across scales and directions changes abruptly when the input image undergoes a small shift and 2) the diagonally oriented daughter wavelets are ambivalent in their orientation (±45°) and smaller in scale than their sisters. Both issues are resolved by switching to the so-called dual-tree complex wavelet transform (dtcwt; Kingsbury 1999): the real-valued mother ψ is replaced by a complex-valued function ψ_r + iψ_i, where the real and imaginary parts are out of phase by 90°. In two dimensions, this is realized by a suitable set of four separate MRAs, the coefficients of which are recombined into six uniquely oriented, complex daughter wavelets. The resulting coefficients can then be represented by their modulus and phase
$$\left|c_{j,d,l}\right| = \sqrt{\Re(c_{j,d,l})^2 + \Im(c_{j,d,l})^2}, \qquad \Phi_{j,d,l} = \operatorname{arctan2}\!\left[\Im(c_{j,d,l}),\, \Re(c_{j,d,l})\right],$$
where ℜ and ℑ indicate the real and imaginary part, respectively. For the details of the dtcwt algorithm, we refer to the helpful tutorial paper by Selesnick et al. (2005). Figure 1 shows the six different orientations at scale j = 6. As in the original MRA, wavelets at this scale are shifted in increments of 2^6, i.e., the region inside the white box is represented by one complex coefficient per direction. We note that the support of the wavelet (the image region where |ψ_{j,d,l}| > 0) is larger than the box it represents, which raises the question of boundary conditions. In this paper we will avoid this issue by 1) reflecting the fields at the edges² and 2) discarding all daughter wavelets that are either larger than the input image, centered outside of the input image, or touching the outer boundary. In accordance with Buschow and Friederichs (2021), the largest three scales are thereby removed entirely from the analysis. (The resulting grids at the largest scales can also be seen in Fig. 12 at the end of section 6.)
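Boundary treatment 1 amounts to mirror-extending the field at all four edges. A base-R sketch for a margin of width k (whether the edge pixel itself is repeated is an implementation detail assumed here):

```r
# Extend a matrix by reflecting k rows/columns at each edge
reflect_pad <- function(m, k) {
  m <- rbind(m[k:1, , drop = FALSE], m,
             m[nrow(m):(nrow(m) - k + 1), , drop = FALSE])
  cbind(m[, k:1, drop = FALSE], m,
        m[, ncol(m):(ncol(m) - k + 1), drop = FALSE])
}
```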
Fig. 1.

Complex daughter wavelets at scale j = 6 on a 200 × 200 domain in HCL color space: the phase is mapped to hue, while chroma and luminance correspond to the modulus. The white boxes encompass an area of 2^6 × 2^6 pixels, which is represented by one coefficient of the dtcwt.


The dtcwt’s quadruple redundancy (two complex numbers for each pixel in the input image) gives it near-perfect invariance under shifts: we can sum up the squared coefficients |c_{j,d,l}|² over all locations to obtain a J × 6 wavelet spectrum that changes only mildly when the underlying 2^J × 2^J image is shifted. From this spectrum, we obtain the central scale z ∈ [1, J], degree of anisotropy ρ ∈ [0, 1], and preferred direction φ ∈ [0, π] by treating the |c_{j,d,l}|² as point masses located along the edges of a hexagonal prism and computing the center of mass. These structural characteristics can then be compared between forecast and observation. For a detailed explanation of the SAD verification method, we refer to Buschow and Friederichs (2021). In this paper, we will use it only briefly to summarize the observed spatial structure of our test cases. Our goal in the next section is to derive a location score, which is exactly complementary to the SAD structure scores.
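In code, the reduction from coefficients to spectrum is a single summation. The sketch below assumes the dtcwt output is available as a list over scales j = 1, …, J, each element a complex array whose last dimension indexes the six directions; this layout is an assumption, and an actual dtcwt implementation (e.g., the author's R package) would supply the coefficients.

```r
# Collapse squared dtcwt coefficients into a J x 6 wavelet spectrum
wavelet_spectrum <- function(coeffs) {
  t(sapply(coeffs, function(cj)
    apply(Mod(cj)^2, length(dim(cj)), sum)))   # sum |c|^2 over all locations
}
```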

4. A wavelet-based location score

The dtcwt achieves near-perfect shift invariance thanks to its complex basis functions: shifts of the input image are encoded in the phase of the complex coefficients while the amplitudes, averaged over all locations, are almost invariant. To get an intuition for this behavior, consider again Fig. 1 and suppose that the image I(x, y) to be transformed consists of a single nonzero pixel located somewhere inside the white box. At scale j = 6, the entire box is represented by a single complex coefficient for each direction, namely, the scalar product between I and the daughter wavelet ψ centered on the box. Since I is zero everywhere except for one pixel, the resulting coefficient is simply the value of the daughter wavelet at that pixel. In other words, the complex number shown at any point inside the box in Fig. 1 is exactly the value of the cj,d,l we would obtain if I was 1 at that point and zero elsewhere. When we move the nonzero pixel around, the absolute value of the coefficient (the luminance and chroma in Fig. 1) remains nearly constant, but the phase Φ (the hue in our plot) changes. The basic idea of our location score is to use this change in phase to estimate the displacement between two images. This approach is particularly promising because the relationship between phase and displacement is approximately linear. For the Fourier transform of a shifted signal x(⋅ −τ), it is easy to show that
$$\mathcal{F}\{x(\cdot - \tau)\}(\omega) = e^{-2\pi i \omega \tau}\, \mathcal{F}\{x\}(\omega),$$
meaning that a time shift by τ results in a frequency-dependent phase shift by −2πωτ. Since the real and imaginary part of the dtcwt wavelets have the same 90° offset as the Fourier basis, their local phase behavior is similar and should thus be close to linear. We derive a location score as follows:
  1. Perform the dtcwt of forecast and observation.

  2. At every location, scale, and direction, compute the phase difference
    $$\Delta\Phi_{j,d,l} = \min\left[\left|\Phi_{j,d,l}^{(\mathrm{obs})} - \Phi_{j,d,l}^{(\mathrm{for})}\right|,\; 2\pi - \left|\Phi_{j,d,l}^{(\mathrm{obs})} - \Phi_{j,d,l}^{(\mathrm{for})}\right|\right] \Big/ \pi.$$
  3. Take a weighted average of ΔΦ over all locations and directions to obtain a scale-dependent phase error

$$\Delta\Phi(j) = \frac{\displaystyle\sum_{l} \sum_{d=1}^{6} w_{j,d,l}\, \Delta\Phi_{j,d,l}}{\displaystyle\sum_{d,l} w_{j,d,l}}.$$
The division by π gives us a score between 0 and 1, where ΔΦ(j) = 0 indicates that the coefficients for scale j are perfectly in phase and ΔΦ(j) = 1 is the largest possible phase shift of 180°. Intuitively, the worst possible location score should be assigned to a forecast that contains no information on the location of the observed features at all. In this case, the predicted phase Φ(for) can be modeled by a uniform random variable on the unit circle. Due to the rotational symmetry of the problem, we can set Φ(obs) = 0° without loss of generality and find for the expected phase error
$$E\left\{ \min\left[ \left|\Phi^{(\mathrm{obs})} - \Phi^{(\mathrm{for})}\right|,\; 2\pi - \left|\Phi^{(\mathrm{obs})} - \Phi^{(\mathrm{for})}\right| \right] \right\} = E\left[ \left|\Phi^{(\mathrm{for})}\right| \right] = 0.5\pi.$$
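This expected value is easily confirmed by simulation; a quick base-R check (illustration only):

```r
set.seed(1)
phi <- runif(1e6, -pi, pi)                     # random forecast phases, obs phase = 0
mean(pmin(abs(phi), 2 * pi - abs(phi))) / pi   # approximately 0.5
```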
To ensure that this remains the worst case for our verification score, we will therefore consider ΔΦ > 0.5 equally bad as ΔΦ = 0.5. This is also the value that materializes when the intensity in at least one of the images is zero, since the phase Φ is computed with finite precision as the arc tangent of the ratio between two small numbers [Eq. (5)]. It is therefore clear that the spatial average of phase differences should somehow be weighted by the modulus of the coefficients: without weighting, regions of correct negative forecasts would contribute ΔΦ = 0.5, i.e., the worst possible score! To prevent this, we weight the phase differences by the total observed and forecasted energy, i.e.,
$$w_{j,d,l} = \left|c_{j,d,l}^{(\mathrm{obs})}\right|^2 + \left|c_{j,d,l}^{(\mathrm{for})}\right|^2.$$

The resulting score used throughout this study is thus symmetrical with respect to exchanging forecast and observation, ignores featureless regions, and focuses on the most important parts of the two fields. If our main interest were in the presence or absence of features, we could alternatively set w_{j,d,l} = 1 wherever either of the coefficients is nonzero and 0 elsewhere. For fields like temperature or wind speed with variability in all locations, one could also remove the weighting entirely, thereby treating all local gradients as equally important regardless of their strength.
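Given the complex coefficients of both fields at one scale, steps 2 and 3 together with this weighting take only a few lines of base R. The arrays `co` and `cf` (observation and forecast, locations × 6 directions) are assumed to come from a dtcwt implementation such as the author's R package.

```r
# Weighted phase error at one scale j from complex coefficient arrays
phase_error <- function(co, cf) {
  dphi <- abs(Arg(co) - Arg(cf))           # raw phase difference
  dphi <- pmin(dphi, 2 * pi - dphi) / pi   # wrapped difference of step 2
  w    <- Mod(co)^2 + Mod(cf)^2            # energy weights defined above
  sum(w * dphi) / sum(w)                   # weighted average of step 3
}
```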

For a first impression of the phase-based location score, we perform a simple experiment and compare a logarithmized rain field from the MesoVICT dataset (see section 6a) to shifted versions of itself (Fig. 2). Figure 3a shows the resulting weighted average phase differences ΔΦ(j). As expected from the Fourier analogy, the phase shift is indeed initially linear, with a slope depending on the scale j. We observe that ΔΦ reaches 0.5, i.e., 90°, at shifts around 2^{j−1} (see the dashed, colored vertical lines) and then oscillates around that limit value. The oscillation, caused by random realignment of image features at large displacements,³ is a further reason to treat ΔΦ > 0.5 and ΔΦ = 0.5 equally.
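The Fourier analogy behind this linear regime can be checked exactly in base R: for the discrete transform, a circular shift by τ changes the phase at wavenumber k by −2πkτ/n while leaving the modulus untouched.

```r
set.seed(1)
n   <- 128; tau <- 5
x   <- rnorm(n)                             # arbitrary test signal
xs  <- c(tail(x, tau), head(x, n - tau))    # circular shift by tau
dph <- Arg(fft(xs) / fft(x))                # measured phase change per wavenumber
k   <- 0:(n - 1)
wrap <- function(a) Arg(exp(1i * a))        # wrap angles to (-pi, pi]
max(abs(wrap(dph + 2 * pi * k * tau / n)))  # ~ 0 up to rounding error
```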

Fig. 2.

Difference between shifted and original test image.


We can use the observation that ΔΦ(j) = 0.5 is attained after 0.5 × 2^j pixels to obtain a simple quantitative estimate of the displacement:
$$\Delta x(j) = \begin{cases} 2^j \times \Delta\Phi(j) & \text{for } \Delta\Phi(j) < 0.5 \\ 2^{j-1} & \text{otherwise.} \end{cases}$$

Figure 3b shows that this estimator works quite well for our example. All scales agree approximately on the correct result until the shift exceeds 2^{j−1}, at which point the corresponding estimate saturates. At very large displacements, we notice that the slope is not perfect, especially for j = 6, 7, but the deviations remain small. We confirmed that Eq. (10) is typically a good approximation across a large number of similar experiments (not shown).

Fig. 3.

(a) Scale-dependent phase shift ΔΦ(j) and (b) estimated shift in image space 2^j ΔΦ(j) as a function of the true shift applied to the input image. For the solid lines in (b), ΔΦ is cut off at 0.5.


Whenever a single summary measure of the overall displacement error is needed, we will use the maximum estimated displacement, henceforth called
$$d_\Phi = \max_j\left[\Delta x(j)\right].$$
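Both Eq. (10) and Eq. (11) are one-liners once the scale-dependent phase errors are available; a base-R sketch building on the phase_error() illustration above:

```r
# Displacement estimate per scale, Eq. (10), from phase errors for j = 1..J
dx_per_scale <- function(dphi) {
  j <- seq_along(dphi)
  ifelse(dphi < 0.5, 2^j * dphi, 2^(j - 1))  # saturate at half the support size
}

# Overall score, Eq. (11): the most severe location error across scales
d_phi <- function(dphi) max(dx_per_scale(dphi))
```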

For a single rigid displacement, dΦ is our best estimate of the true error. In this simple case, the displacement error is generally easy to estimate, for example, as the distance between the centroids [Eq. (1)]. Due to their localized nature, the wavelets should, however, also be able to correctly handle more difficult situations with multiple different displacements. The maximum estimated displacement dΦ then represents the most severe location error in the forecast. To demonstrate this capability, we consider some of the geometric test images from the MesoVICT project (Gilleland et al. 2020), shown in Fig. 4.

  • The true displacement of Δx = 40 is almost correctly identified in Figs. 4a–c. All scales smaller than j = 7 are saturated; the result depends only weakly on the position and orientation of the features.

  • Figures 4d and 4e are correctly identified as worse than Figs. 4a–c. Figures 4g and 4h are recognized as better; the two largest scales approximately agree on the result; the estimated values are reasonably close to the correct answer (57 in Figs. 4d and 4e, 20 in Figs. 4g and 4h).

  • The addition of further features around the observation in Figs. 4h and 4i is considered an improvement over Figs. 4a–c. These are two examples where the biggest error does not reside on the largest scale: with respect to j = 7, the placement of the features is decent; on smaller scales it is just as bad as in Figs. 4a–c.

  • Similarly, the additional hit in Fig. 4j leads to an overall improved score over the otherwise identical case (Fig. 4a).

  • Figure 4k is deemed worse than Fig. 4a on the largest scale but better on the small scales.

  • Figure 4l looks like a decent forecast on the largest scale while Fig. 4m is bad across almost all scales.

  • The displacement in Fig. 4n is recognized but the two largest scales do not agree on a value. The shift indicated by j = 6 is close to the correct answer, at j = 7 each of the four daughters likely sees part of the unrelated feature and interprets it as a miss.

  • Figure 4o is among the overall worst forecasts.

  • The correctly placed region of scattered pixels in Fig. 4p receives nearly perfect scores, the shifted region in Fig. 4q is maximally bad.

  • The scores are invariant under inversions of the image (cf. Fig. 4r to Fig. 4b).

  • Figures 4s and 4t demonstrate that the measured shift can be direction dependent if the patterns have a preferred orientation. In Fig. 4s, the true shift of 25 pixels is slightly overestimated. For Fig. 4t, which is not part of the “official” tests, the same shift is applied parallel to the object orientation, resulting in a much better score due to large portions of the feature edge remaining in phase. Here, the “true” shift cannot be recovered, but most users would likely consider Fig. 4t a better forecast than Fig. 4s.

Fig. 4.

Estimated location errors for the circular test images. Gray areas mark the observation and dashed contours the forecasts. From top to bottom, the numbers indicate Δx(j) for j = 1, …, 7. Green color marks scales with Δx(j) < 2^{j−1}; the overall estimate [Eq. (11)] is marked in bold. The distance between two grid lines is 32; labels “CXCY” correspond to those in Gilleland et al. (2020).


Table 1 shows that, for the circle comparisons (Figs. 4a–o), dΦ behaves overall similarly to most of the distance measures tested in Gilleland et al. (2020). Conversely, the newly introduced scores G and Gβ from Gilleland (2021a) (as well as ZHU, which is highly correlated with them) are almost entirely uncorrelated with dΦ. The main disagreement concerns, first, the cases C2C11, C1C9, and C1C10, which feature large, nonoverlapping regions centered around the observed object, leading to small or medium dΦ but very poor G and Gβ. Second, the large displacement in C13C14 is strongly punished by dΦ, whereas other scores effectively consider the small nonzero areas as insignificant noise and reward the forecast as correctly negative.

Table 1

Rank correlation between dΦ and several scores from Gilleland et al. (2020) and Gilleland (2021a) for the 15 circle comparisons (Figs. 4a–o). For the definitions of each score, see the respective references, particularly Table 2 in Gilleland et al. (2020).


Overall, these results confirm that sensible estimates of displacement errors can be extracted from the wavelet phase in scenarios that are more complex than a single, image-wide shift.

Last, the localization of the daughter wavelets also allows us to display spatially varying displacement errors on a map. In our example case below (Fig. 7), we will use this to draw contours around the regions with the largest contributions to the overall ΔΦ. At the end of section 6, we also discuss the option of averaging ΔΦ over time instead of space to obtain a map of scale-dependent mean displacements.

The computational effort due to the wavelet transform scales linearly with the number of pixels and remains moderate for typical field sizes. Our implementation in the R programming language (Buschow 2021) takes roughly 0.3 s to verify a single forecast on a 256 × 256 grid and 5 s for 1024 × 1024 on a single modern CPU.

5. Data

The MesoVICT project relies on the Vienna Enhanced Resolution Analysis (VERA; Bica et al. 2007) as the observational data against which all forecast models are verified. This model-independent dataset enhances interpolated station observations with thermal and dynamical fingerprints to produce maps of meteorological variables. In this study, we focus mainly on hourly precipitation sums, which are inferred from station observations alone. In addition, we explore the use of the novel score for absolute wind speed as well as potential and equivalent potential temperature. For the latter two variables, the fingerprint method was applied, thereby introducing information from a finer-scaled orography beyond the resolution of the station network (Dorninger et al. 2018). All data are interpolated to a regular 8-km grid covering central Europe (see maps in Fig. 7).

When the analysis domain is small compared to the typical features to be verified, displacement errors are hard to diagnose accurately because patterns are quickly displaced into or out of the domain. In the interest of avoiding such effects, as well as streamlining the experiment as a whole, we focus on two forecast models that cover the entire VERA domain: the hydrostatic Bologna Limited Area Model (BOLAM), run at 0.07° resolution, and the nonhydrostatic, convection-permitting Modello Locale (MOLOCH) with 0.0225° grid spacing, which receives its boundary conditions from BOLAM. Reforecasts with the 2015 operational version of this model chain were performed at ISPRA for the MesoVICT project (Mariani and Casaioli 2018). Both models were initialized at 1200 UTC each day and run for 84 h (MOLOCH) and 108 h (BOLAM). The first 12 h of each run were discarded as model spinup time.

Table 2 lists the dates of the six MesoVICT test cases. With the exception of number five, all cases cover multiple days, thereby giving us the opportunity to compare forecasts from the same model with different lead times. For the purpose of testing a new verification measure, this is convenient as it allows us to probe a wide range of error magnitudes and gives us a clear a priori expectation for which forecasts should, on average, be better than others. To take full advantage of this idea, we select those time steps for which three different forecasts from each of the two models are available. This leaves us with the last day of cases 1, 4, and 6 and the last two days of cases 2 and 3 (168 time steps in total). In the plots below, we will refer to the different forecasts as

  • BOLAM007_1, MOL00225_1 with lead times +13 h, …, +36 h,

  • BOLAM007_2, MOL00225_2 with lead times +37 h, …, +60 h, and

  • BOLAM007_3, MOL00225_3 with lead times +61 h, …, +84 h.

Table 2

Dates of the MesoVICT test cases with a list of dominant weather events and the number of time steps used in this study.


An overview of the synoptic situations in the different case studies is given in Fig. 5. As an objective measure of the spatial structure of the resulting rain fields analyzed in VERA, we also consider the degree of anisotropy ρ, dominant spatial scale z (Buschow and Friederichs 2021), and total rain area in Fig. 6. Without discussing each case in detail, we observe that cases 1 and 6 have the weakest synoptic-scale forcing, leading to mostly isotropic structures across a relatively wide range of small and intermediate scales, covering roughly a quarter of the analysis domain—the precipitation fields are mostly convective. In contrast, cases 2 and 4 have anisotropic patterns with smaller areas. Here, the main precipitation regions are aligned along airmass boundaries and related convergence lines. Case 4 is smaller in scale, indicating a more prominent role of convection. Last, case 3 sees the strongest synoptic forcing from a cutoff low centered over Germany and a related Genoa cyclone, resulting in rain areas covering up to 40% of the domain with a large, anisotropic pattern.

Fig. 5.

Representative synoptic analyses adapted from the KNMI archive (https://www.knmi.nl/nederland-nu/klimatologie/daggegevens/weerkaarten) for (a)–(f) MesoVICT test cases 1–6. Blue ellipses schematically indicate the main precipitation regions within the MesoVICT domain. Case 5 is included for completeness but not used in this study.


Fig. 6.

(left) Degree of anisotropy ρ, (center) central scale z, and (right) fraction of the domain with nonzero rain, all calculated from VERA data for the five cases considered in this study. Boxplots show the distribution over all time steps in each case.


6. Verification of the MesoVICT dataset

Based on the discussion above, we have a number of expectations for the outcome of the verification experiment:

  • Short range forecasts are likely better than those with longer lead times.

  • The synoptically driven case 3 should be more predictable over longer time ranges than the others.

  • The convective cases 1 and 6 are likely the most difficult to predict.

Whether the highly resolved MOLOCH will perform better or worse than BOLAM is unclear a priori, especially because the VERA analysis has a relatively coarse internal resolution and therefore produces smooth fields that look more similar to BOLAM. Scores that are sensitive to the spatial structure of the fields might therefore prefer the coarser forecasts. The high-resolution model could nonetheless be superior in terms of precipitation locations, especially in convectively driven situations.

In the next section, we focus on precipitation and compare the novel score to the established alternatives from section 2. The subsequent section briefly summarizes some of the results obtained for the other variables.

a. Precipitation

Before computing the various scores, we set all observed and predicted rainfall values below 0.1 mm to zero. For the wavelet-based score dΦ, rain intensities are replaced by their binary logarithm [setting log2(0) → log2(0.1)] to reduce the impact of localized extremes and focus on the spatial distribution of rainfall as a whole (see also Buschow and Friederichs 2020).
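In base R, this preprocessing is two lines:

```r
# Threshold at 0.1 mm, then take the binary logarithm with log2(0) -> log2(0.1)
rain_prep <- function(f, thresh = 0.1) {
  f[f < thresh] <- 0              # discard trace precipitation
  log2(pmax(f, thresh))           # zeros enter as log2(0.1)
}
```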

To give an impression of the new score’s behavior under realistic conditions, we present an example from the second MesoVICT case (Fig. 7). On 20 July 2007, the bulk of the observed precipitation field is linearly organized along an air mass boundary near the center of the domain. The +23-h BOLAM forecast predicts a visually similar linear feature with nearly correct placement and orientation. Here, dΦ registers a displacement of 16 pixels, which corresponds to a single cell of the background grid in the figure. To visualize the detected errors, we have added contours around the pixels with the largest contributions to ΔΦ(j) for j = 3, 4, 5. Focusing on scale 5 (shown in blue), we see that the leading edge of the simulated front lies roughly in the middle of the two squares whereas the observation aligns with the outer edges: a phase error of roughly half the support size at j = 5, i.e., 16 pixels or 128 km.

Fig. 7.

Phase verification for 1100 UTC 20 Jul 2007: rectangles enclose the regions with the 5% largest contributions to ΔΦ at scales 3 (red), 4 (green), and 5 (blue) for the (top) +23- and (bottom) +47-h BOLAM07 forecasts. Red markers show the fields’ centroids.


The previous day’s BOLAM forecast (bottom part of Fig. 7) also simulates the linear pattern but clearly rotated and strongly displaced to the southwest. This is again easily seen by comparing the patterns within the blue box. In addition, the forecast contains a relatively intense false alarm near the northeast corner of the domain that is picked up by ΔΦ(3) and ΔΦ(4) but not at larger scales because small-scale patterns have less impact there. Overall, we find dΦ = 32 (256 km), which is the largest possible value in our case study.

In Fig. 7, we have marked the locations of the observed and predicted centers of mass that form the basis of SAL’s location component L. We observe that the centroids are nearly identical in all three fields. For the +47-h forecast, the westward displacement of the front is compensated by the additional feature in the eastern part of the domain, leading to a centroid displacement close to zero. This is one of two common scenarios that can lead to substantial disagreement between dΦ and L: in a complex precipitation field with multiple objects, individual displacements (or misses and false alarms) can cancel out to create a centroid location near the center of the domain, potentially leading to dΦ ≫ L1. The opposite result can occur when forecast and observations contain precipitation regions at the same locations but with different relative intensities. In this scenario, L1 may be large since the centroid shifts toward the most intense feature, while dΦ would likely see only small phase errors at each precipitation location.

We now begin our systematic verification of the entire dataset with a look at the individual phase differences ΔΦ, separated by scale, forecast, and case number in Fig. 8. Recalling that the worst case is ΔΦ ≥ 0.5, we observe that none of the forecasts have any appreciable skill at scales smaller than j = 4; almost all are skillful at the largest scale j = 6. As expected, the forecasts started on the previous day are almost universally superior to those with longer lead times, the advantage being most evident on scale five. The overall quality of the predictions, as well as the range from best to worst forecast, differ substantially from case to case: a clear difference between the 2- and 3-day forecasts is evident at all scales j > 3 in cases 3 and 4. Only the most recent forecasts stand out in the first two cases.

Fig. 8.

Interquartile range (colored bars) and median (white gaps) of ΔΦ(j) for the different MesoVICT cases and forecasts.


In case 6, the remarkable lack of lead-time dependence, as well as the overall mediocre performance, is likely due to the dominant role of convective activity (smallest observed scales in Fig. 6), the precise timing and location of which is hardly predictable at lead times longer than a few hours. This is also the only case where the convection-permitting MOLOCH consistently outperforms the coarser BOLAM at all scales and lead times. At the shortest lead times, MOLOCH furthermore has slight advantages in cases 1 and 4, which exhibit relatively small-scaled structures as well.

Conversely, case 3 sees the strongest synoptic forcing and was overall forecast best. The pronounced lead-time dependence indicates some remaining difficulty in predicting the precise path of the cutoff low. The difference between lead times is even stronger in case 4 where the formation and movement of the linearly organized precipitation patterns proved difficult to forecast more than one day in advance.

While the difference in quality between the first-, second-, and third-day forecasts is thus often obvious, no clear winner emerges from the comparison between BOLAM and its finer-scaled sister model MOLOCH. This overall ranking is also reflected by the resulting displacement errors dΦ shown in Fig. 9f. Here we see that the median displacement across all cases is around 12 pixels (roughly 100 km) for the shortest lead times, 16 (128 km) for lead times greater than 36 h, and slightly below 20 (160 km) for the longest-range forecasts. The three ranges of lead times are clearly separated: values that fall in the upper quartile for day one are near the median of day two and the lower quartile of day three. The worst-case value of dΦ = 32 (256 km) is rare even on the third forecast day.

Fig. 9.

Distribution of (a) RMSE, (b) one minus linear correlation, (c) SAL’s location component, (d) BD, (e) DKC, and (f) dΦ for the six competing forecasts. Boxplots show distributions across all time steps in all cases.


Interestingly, each of the other five scores shown in Figs. 9a–e paints a different picture. As expected, RMSE (Fig. 9a) is a textbook example of the double-penalty issue with hardly any difference between lead times but a strong preference for the coarser resolved BOLAM model. The linear correlation coefficient, shown in Fig. 9b, mostly rewards overlap between forecast and observation and therefore naturally prefers BOLAM as well. In addition, most of the longer-ranged predictions hardly overlap the observed field at all, leading to near-zero correlations in most instances on days two and three. In stark contrast to the overall bad performance with respect to correlations, SAL’s location component (Fig. 9c) indicates low values for all forecasts with hardly any preference for either model and a very weak dependence on lead time. The reason for this behavior is explained by Fig. 10: due to the frequently complex and widespread nature of these precipitation fields, their centroids are usually concentrated near the center of the domain,⁴ leading to small values of L1. The other half of L measures the scattering around the centroid and is more a structural characteristic than a measure of displacement.

Fig. 10.

Average distance (in pixels) to the next rainy gridpoint for VERA and the most recent BOLAM and MOLOCH forecasts. White dots mark the position of the field’s center of mass for all cases.


The distance measure BD (Fig. 9d) shows perhaps the most surprising behavior: like RMSE, it registers almost no lead time dependence but instead of BOLAM, it clearly prefers the fine-scaled MOLOCH model. To understand this unexpected result, we must recall that distance measures are based on the distance from each rainy pixel in one field to the nearest rainy pixel in the other and therefore react sensitively to the presence or absence of small features in otherwise empty regions of the domain. Figure 10 reveals that the average distance to the nearest rain event is substantially too large in BOLAM, especially near the western and northern domain edge. While the model can decently simulate the main features of the precipitation field, it tends to neglect smaller-scaled showers throughout the rest of the domain. For the other scores, this effect is largely overshadowed by the displacement of the dominant precipitation systems.

The last score included in our comparison is the field deformation score DKC (Fig. 9e), which generally prefers MOLOCH as well but registers a decrease in forecast quality with lead time similar to that seen by dΦ. It is possible that this score also rewards the finer model for producing additional smaller-scaled precipitation cells in the general vicinity of the observed rain areas: the addition of hits or near misses on small scales will tend to reduce the overall mean displacement vector. Conversely, dΦ, as defined in Eq. (11), focuses on the most intense parts of the image due to the weighting and ignores small-scale displacements if a big displacement is present on larger scales.

b. Other variables

While the location scores from section 2, as well as most others in the literature, were designed specifically for precipitation or similarly intermittent fields, our approach makes no such assumptions. As a proof of concept, we now apply the exact same methodology used for precipitation to fields of absolute wind speed at 10-m height (V), 2-m potential temperature (θ), and 2-m equivalent potential temperature (θe).

Figure 11 summarizes the scale-dependent phase errors ΔΦ for all four variables. The most obvious difference between precipitation and the others is a substantial improvement of ΔΦ at small scales and long lead times. The increased small-scale skill is particularly obvious for θ where the phase errors at scale j = 3 are comparable to those seen at j = 5 for precipitation. Wind speed and equivalent potential temperature exhibit slightly larger phase errors than θ, all three show a weak but consistent increase with lead time. In comparing the two models, we observe that MOLOCH has some advantages for both θe (mostly on small scales) and θ (mostly on large scales). For wind speed, on the other hand, the coarser BOLAM model is slightly superior, especially on small scales.

Fig. 11.

Interquartile range (colored bars) and median (white gaps) of ΔΦ(j) for all variables and forecasts.


An obvious explanation for the smaller displacements compared to precipitation is the presence of stationary objects like coastlines and mountains. Depending on their representation in the model orography and land surface fields, such features allow the forecasts to predict the location of spatial gradients in near-surface fields with high accuracy, even on small scales and multiple days in advance. To understand the qualitative difference between θ and V, we must recall that VERA enhances temperature-related variables with thermal and dynamic “fingerprints.” The interpolated station data are thereby imbued with additional fine-scaled texture far beyond the spatial resolution of the station network. This method was not applied for wind and precipitation. As a result, the finer-scaled wind features of MOLOCH appear erroneous compared to the analysis, thereby increasing the average phase errors.

An inherent advantage of the wavelet approach is its natural capability to localize errors in space. To produce a map of average phase errors, we simply take the weighted mean over time instead of space. Figure 12 shows the results for all four variables but only one of the forecasts (images for all six forecasts look qualitatively similar). As expected, there is no coherent pattern for precipitation since the phases result from intermittent features materializing at various discrete locations across the domain. Only on the largest two scales do we see a slight tendency toward better forecast locations in the southwest and larger errors over Germany. In contrast, individual pixels with ΔΦ ≪ 0.5 can be seen even at j = 3 for the other three variables. The regions of improved localization are primarily aligned along the coastlines. For θ, the Alps appear as an additional source of consistent localization, which is reproduced by the model. On large scales (j = 5, 6), most of the pixels in the domain border on either a coast or a mountain range and consequently exhibit small ΔΦ.
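Producing such a map only requires swapping the averaging dimension. A base-R sketch, assuming the phase errors and energy weights at one scale are stored as nx × ny × ntime arrays:

```r
# Weighted time average of the phase error at every grid point
mean_dphi_map <- function(dphi, w) {
  apply(dphi * w, c(1, 2), sum) / apply(w, c(1, 2), sum)
}
```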

Fig. 12.

Weighted time average of ΔΦ(j) of the MOL00225_2 forecasts for all considered variables (from left to right) and selected scales (from top to bottom).


7. Conclusions

In this study, we have introduced a novel location score that exploits the phase information of the dual-tree complex wavelet transform. Idealized tests with simple geometric shapes demonstrate that this score generally works as intended. The easiest of these tests consider a single, rigidly displaced feature and ask the score to reconstruct the magnitude of the shift. Like most other scores, the wavelet approach can typically solve such problems. The localization property of the wavelets furthermore allows us to correctly assess more complex scenarios where multiple different displacements occur in different parts of the domain.

In a real-world verification setting, however, the problem is even more complicated because the existence of a well-defined location error is not guaranteed at all. This is particularly true when we consider fields like temperature and wind speed, which do not naturally separate into discrete objects.

As a realistic case study, we used a subset of the MesoVICT test cases and compared 1-, 2-, and 3-day forecasts of hourly precipitation from the BOLAM model and its higher-resolution sister model MOLOCH to the VERA analysis. Our first experiment focused on precipitation fields and compared several scores that are sensitive to displacement errors. As expected, the pointwise RMSE uniformly prefers BOLAM. The linear correlation suffers from the same double-penalty issue but is at least capable of identifying the advantages of the 1-day forecast over the others. More surprisingly, the distance measure BD exhibits the exact opposite behavior of RMSE, preferring MOLOCH irrespective of lead time. A look at the underlying distance maps shows that this score punishes BOLAM for neglecting small-scaled, scattered precipitation cells in parts of the domain. This result, while not uninformative, is more a property of the model climatology than of the day-to-day precipitation placement. In agreement with our literature survey, SAL’s L proved to carry very little information on forecast performance, as indicated by minuscule differences between both models and lead times. It should be mentioned that this score was originally defined and optimized with a much smaller domain in mind (a single river catchment), a setting where small numbers of features and single, well-defined displacements are generally more likely than on our map of western Europe.

With the novel wavelet-based score dΦ, we find a clear decrease of skill with lead time. In addition, advantages for MOLOCH are identified in the smaller-scaled, convectively driven cases 1, 4, and 6. These cases were also found to be more challenging to forecast than the synoptically dominated case 3. The most similar established score turned out to be the displacement component of DAS. The fact that this field deformation method showed a general preference for MOLOCH may be partially due to the same phenomenon as for BD. In contrast to dΦ, neither of these scores gives special weight to intense regions. We can furthermore conclude that, despite allowing divergent flow fields, DAS is not strongly sensitive to scale errors. Like dΦ, it can thus be recommended as a complement to pure structure verification techniques and pure comparisons of the marginal distribution. When the structure is verified via SAD, dΦ is a natural complementary score because it relies on the same wavelet transform (no additional computation needed) and utilizes exactly the information that SAD neglects. To enable further comparisons with existing and potential future scores, verification results for the original ICP test cases are included in appendix B.

Unlike the other scores in our intercomparison, dΦ requires no thresholding or object identification and can therefore verify any meteorological field of interest. As a first demonstration, we have applied our method to equivalent and dry potential temperature, as well as absolute wind speed. All of these variables were considered near the surface since a relatively dense station network is needed to produce a spatial analysis. The local phases of the resulting wavelet transforms are therefore strongly influenced by mountains and coasts. Phase errors at these spatially fixed locations are likely caused by errors in the strength of local gradients, rather than a spatial displacement. Our scores represent averages over both these stationary features and more transient phenomena related to, for example, fronts, cyclones, and convection. The localized nature of the wavelets furthermore allows us to study the distribution of consistent phase errors in space. For precipitation, this yielded little additional information due to the relatively small sample size. The other variables, however, exhibit well-localized regions of improved location skill, primarily along the coastlines.

While the option to apply the same location score to a variety of atmospheric variables is doubtlessly convenient, a number of limitations must be kept in mind. First, nonintermittent fields have variance in all parts of the domain, and our score represents an average. It is therefore not always easy to identify the meteorological sources of the measured errors. A human might, for example, focus on the displacement of a cold front while the strongest spatial gradients, which dominate the score, are actually located on the coast. An obvious solution is to move up into the free atmosphere where surface features, as well as diurnal cycles, have less impact. This, however, exacerbates the second main limitation, namely, the lack of spatial observations. Interpolated datasets like VERA can already be problematic near the surface since station networks are far coarser than the resolution of modern weather models like MOLOCH; at higher levels, spatial verification must either rely on reanalysis (which is not model independent) or novel remote sensing data from satellites or (clear-air) radar and lidar scans.

1. For the sake of simpler notation, we have implicitly assumed that the locations l can be counted by a scalar index; general wavelet transforms can allow arbitrary locations in ℝ².

2. Here, reflection is preferred over padding because the former is appropriate for both precipitation and other variables.

3. Large compared to the daughter wavelet.

4. Precipitation in and around the Alps was also a criterion for the selection of cases in the MesoVICT project (Dorninger et al. 2018).

Acknowledgments.

Eric Gilleland, Stefano Mariani, and Marco Casaioli are gratefully acknowledged for providing the example datasets used throughout this paper. Further thanks are due to Petra Friederichs, as well as three anonymous reviewers, for their helpful suggestions and encouragement. This work was funded by the DFG under Grant FR 2976/2-1.

Data availability statement.

VERA analysis data, as well as the geometric test images, are available from the ICP/MesoVICT project homepage at http://projects.ral.ucar.edu/icp/. The BOLAM and MOLOCH data used in this study are available from the authors of Mariani and Casaioli (2018) upon request. Software for computation of SAL and BD (as well as many other scores) is included in the SpatialVx R library (Gilleland 2021b). Code for our displacement score based on complex wavelets, as well as an implementation of the optical flow from Keil and Craig (2009), has been archived at https://doi.org/10.5281/zenodo.5665719 (Buschow 2021).

APPENDIX A

Use of SAL’s Location Component in the Literature

A survey of 20 verification studies using L was conducted; Table A1 summarizes the results.

Table A1. Use of SAL’s L component in the literature. Cases where useful information was obtained from L are marked in bold.

APPENDIX B

Verification of ICP Test Cases

To enable further comparisons with the existing (as well as potential future) spatial verification literature, we apply the newly developed displacement score dΦ to the realistic test cases from the original ICP project. This standard dataset consists of nine hourly precipitation forecasts from three competing WRF Model configurations, produced during the 2005 SPC/NSSL Spring Program and verified against Stage II analysis data. It was used by all studies within the original ICP, as well as several later additions (Han and Szunyogh 2016; Gilleland 2021a), and is currently available for download from http://projects.ral.ucar.edu/icp/ (last accessed 1 March 2022). For details on the models and selected cases, we refer to Ahijevych et al. (2009) and references therein.

The domain of the ICP dataset covers the central United States on a 4-km grid, resulting in 601 × 501 grid points. We extend the fields with reflective boundaries to 1024 × 1024 pixels, apply the same steps as before (cutoff at 0.1 mm, log transform), and compute dΦ. Here, we retain the smallest eight scales, resulting in a maximum possible displacement error of 128 grid points (the phase shift at each scale j is capped at half a period, so the largest retained scale j = 8 corresponds to 2^8 × 0.5 = 128 grid points; cf. Fig. 3). The values for each case and model are listed in Table B1. In addition to the various objective scores, these case studies were also verified by a panel of 26 experts who were asked to rate each forecast on a scale from 1 (poor) to 5 (excellent) (Ahijevych et al. 2009). Several studies, including Ahijevych et al. (2009), Keil and Craig (2009), Gilleland et al. (2010), and Gilleland (2021a), have compared the objective rankings of various scores to this subjective ranking. These authors have already pointed out several flaws in this approach to testing a verification method (small sample, lack of reproducibility, unclear and inconsistent notion of “good” forecasts), thereby making it clear that the subjective rating is by no means the ground truth. Since the comparison nonetheless holds some interest, we plot the expert rankings against those resulting from dΦ in Fig. B1. In contrast to the studies cited above (see, e.g., Fig. 4 in Gilleland et al. 2010), we find that dΦ corresponds moderately well with the subjective judgment, especially for the wrf4ncar model (rank correlation 0.78). The greatest disagreement appears for wrf4ncep on 4 June. The very poor expert rating on this day is likely related to the strongly biased precipitation total (the largest bias in the dataset), which is not measured by a displacement score like dΦ. In Table B2, we compare dΦ and the expert scores to some of the alternative scores used throughout this paper. The fact that dΦ has by far the closest relationship with the subjective results should not be taken as a measure of the score’s “quality” for the reasons mentioned above; in particular, the experts were not asked to specifically identify displacement errors.
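To make these steps concrete, the following minimal R sketch reproduces the preprocessing (0.1-mm cutoff, log transform, reflective extension of the 601 × 501 fields to 1024 × 1024 pixels) and the rank comparison. The helper names mirror_extend and preprocess are ours, the base of the log transform is an assumption, dtcwt_phase_displacement is a hypothetical stand-in for the routine archived in Buschow (2021), and the rank vectors at the end are placeholders, not the values behind Fig. B1.

# Reflective (mirror) extension of a matrix to a target size; consistent with
# footnote 2, reflection avoids the artificial edges that padding would create
# for fields that do not vanish at the domain boundary.
mirror_extend <- function(x, nrow_out, ncol_out) {
  reflect_index <- function(n, n_out) {
    pos <- (seq_len(n_out) - 1L) %% (2L * n)  # mirror pattern repeats every 2n
    ifelse(pos < n, pos + 1L, 2L * n - pos)   # fold indices back at the edge
  }
  x[reflect_index(nrow(x), nrow_out), reflect_index(ncol(x), ncol_out)]
}

# Preprocessing: remove drizzle below the 0.1-mm cutoff, then log-transform.
preprocess <- function(rain) {
  rain[rain < 0.1] <- 0
  log2(rain + 1)                              # base and offset are assumptions
}

set.seed(1)                                   # random stand-ins for real fields
fc  <- matrix(rexp(601 * 501), nrow = 601)
obs <- matrix(rexp(601 * 501), nrow = 601)

fc_big  <- mirror_extend(preprocess(fc), 1024, 1024)
obs_big <- mirror_extend(preprocess(obs), 1024, 1024)

# The dtcwt-based score itself lives in the archived code (Buschow 2021);
# this call is a hypothetical placeholder, not a real function:
# d_phi <- dtcwt_phase_displacement(obs_big, fc_big, scales = 1:8)

# Rank comparison as in Fig. B1 (illustrative ranks only):
expert_rank <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
dphi_rank   <- c(2, 1, 3, 5, 4, 6, 8, 7, 9)
cor(expert_rank, dphi_rank, method = "spearman")  # Spearman rank correlation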

Table B1. Values of dΦ (rounded to the nearest integer) for the nine original real ICP test cases.
Fig. B1. Ranking of the nine original ICP test cases from best (1) to worst (9) according to dΦ (y axis) and the subjective expert judgment (x axis).


Table B2. Overall rank correlations between dΦ, subjective expert ratings, and alternative scores for the original real ICP test cases. Expert values and DAS are taken from Table 3 in Keil and Craig (2009).

REFERENCES

• Ahijevych, D., E. Gilleland, B. G. Brown, and E. E. Ebert, 2009: Application of spatial verification methods to idealized and NWP-gridded precipitation forecasts. Wea. Forecasting, 24, 1485–1497, https://doi.org/10.1175/2009WAF2222298.1.
• Barthelme, S., 2018: imager: Image Processing Library Based on “CImg,” version 0.41.1. R package, https://CRAN.R-project.org/package=imager.
• Bica, B., R. Steinacker, C. Lotteraner, and M. Suklitsch, 2007: A new concept for high resolution temperature analysis over complex terrain. Theor. Appl. Climatol., 90, 173–183, https://doi.org/10.1007/s00704-006-0280-2.
• Buschow, S., 2021: Code for a location score based on complex wavelets. Zenodo, https://doi.org/10.5281/zenodo.5665720.
• Buschow, S., and P. Friederichs, 2020: Using wavelets to verify the scale structure of precipitation forecasts. Adv. Stat. Climatol. Meteor. Oceanogr., 6, 13–30, https://doi.org/10.5194/ascmo-6-13-2020.
• Buschow, S., and P. Friederichs, 2021: SAD: Verifying the scale, anisotropy and direction of precipitation forecasts. Quart. J. Roy. Meteor. Soc., 147, 1150–1169, https://doi.org/10.1002/qj.3964.
• Buschow, S., J. Pidstrigach, and P. Friederichs, 2019: Assessment of wavelet-based spatial verification by means of a stochastic precipitation model (wv_verif v0.1.0). Geosci. Model Dev., 12, 3401–3418, https://doi.org/10.5194/gmd-12-3401-2019.
• Casati, B., G. Ross, and D. Stephenson, 2004: A new intensity-scale approach for the verification of spatial precipitation forecasts. Meteor. Appl., 11, 141–154, https://doi.org/10.1017/S1350482704001239.
• Craig, G. C., C. Keil, and D. Leuenberger, 2012: Constraints on the impact of radar rainfall data assimilation on forecasts of cumulus convection. Quart. J. Roy. Meteor. Soc., 138, 340–352, https://doi.org/10.1002/qj.929.
• Davolio, S., F. Silvestro, and T. Gastaldo, 2017: Impact of rainfall assimilation on high-resolution hydrometeorological forecasts over Liguria, Italy. J. Hydrometeor., 18, 2659–2680, https://doi.org/10.1175/JHM-D-17-0073.1.
• Dorninger, M., E. Gilleland, B. Casati, M. P. Mittermaier, E. E. Ebert, B. G. Brown, and L. J. Wilson, 2018: The setup of the MesoVICT project. Bull. Amer. Meteor. Soc., 99, 1887–1906, https://doi.org/10.1175/BAMS-D-17-0164.1.
• Eckley, I. A., G. P. Nason, and R. L. Treloar, 2010: Locally stationary wavelet fields with application to the modelling and analysis of image texture. J. Roy. Stat. Soc., 59, 595–616, https://doi.org/10.1111/j.1467-9876.2009.00721.x.
• Früh, B., J. Bendix, T. Nauss, M. Paulat, A. Pfeiffer, J. W. Schipper, B. Thies, and H. Wernli, 2007: Verification of precipitation from regional climate simulations and remote-sensing observations with respect to ground-based observations in the upper Danube catchment. Meteor. Z., 16, 275–293, https://doi.org/10.1127/0941-2948/2007/0210.
• Gilleland, E., 2011: Spatial forecast verification: Baddeley’s delta metric applied to the ICP test cases. Wea. Forecasting, 26, 409–415, https://doi.org/10.1175/WAF-D-10-05061.1.
• Gilleland, E., 2021a: Novel measures for summarizing high-resolution forecast performance. Adv. Stat. Climatol. Meteor. Oceanogr., 7, 13–34, https://doi.org/10.5194/ascmo-7-13-2021.
• Gilleland, E., 2021b: SpatialVx: Spatial Forecast Verification, version 0.8. R package, https://CRAN.R-project.org/package=SpatialVx.
• Gilleland, E., D. Ahijevych, B. G. Brown, B. Casati, and E. E. Ebert, 2009: Intercomparison of spatial forecast verification methods. Wea. Forecasting, 24, 1416–1430, https://doi.org/10.1175/2009WAF2222269.1.
• Gilleland, E., J. Lindström, and F. Lindgren, 2010: Analyzing the ImageWarp forecast verification method on precipitation fields from the ICP. Wea. Forecasting, 25, 1249–1262, https://doi.org/10.1175/2010WAF2222365.1.
• Gilleland, E., G. Skok, B. G. Brown, B. Casati, M. Dorninger, M. P. Mittermaier, N. Roberts, and L. J. Wilson, 2020: A novel set of geometric verification test fields with application to distance measures. Mon. Wea. Rev., 148, 1653–1673, https://doi.org/10.1175/MWR-D-19-0256.1.
• Gofa, F., D. Boucouvala, P. Louka, and H. Flocas, 2018: Spatial verification approaches as a tool to evaluate the performance of high resolution precipitation forecasts. Atmos. Res., 208, 78–87, https://doi.org/10.1016/j.atmosres.2017.09.021.
• Haiden, T., A. Kann, C. Wittmann, G. Pistotnik, B. Bica, and C. Gruber, 2011: The Integrated Nowcasting through Comprehensive Analysis (INCA) system and its validation over the eastern Alpine region. Wea. Forecasting, 26, 166–183, https://doi.org/10.1175/2010WAF2222451.1.
• Han, F., and I. Szunyogh, 2016: A morphing-based technique for the verification of precipitation forecasts. Mon. Wea. Rev., 144, 295–313, https://doi.org/10.1175/MWR-D-15-0172.1.
• Hanley, K. E., D. J. Kirshbaum, N. M. Roberts, and G. Leoncini, 2013: Sensitivities of a squall line over central Europe in a convective-scale ensemble. Mon. Wea. Rev., 141, 112–133, https://doi.org/10.1175/MWR-D-12-00013.1.
• Hardy, J., J. J. Gourley, P.-E. Kirstetter, Y. Hong, F. Kong, and Z. L. Flamig, 2016: A method for probabilistic flash flood forecasting. J. Hydrol., 541, 480–494, https://doi.org/10.1016/j.jhydrol.2016.04.007.
• Kann, A., I. Meirold-Mautner, F. Schmid, G. Kirchengast, J. Fuchsberger, V. Meyer, L. Tüchler, and B. Bica, 2015: Evaluation of high-resolution precipitation analyses using a dense station network. Hydrol. Earth Syst. Sci., 19, 1547–1559, https://doi.org/10.5194/hess-19-1547-2015.
• Kapp, F., P. Friederichs, S. Brune, and M. Weniger, 2018: Spatial verification of high-resolution ensemble precipitation forecasts using local wavelet spectra. Meteor. Z., 27, 467–480, https://doi.org/10.1127/metz/2018/0903.
• Keil, C., and G. C. Craig, 2007: A displacement-based error measure applied in a regional ensemble forecasting system. Mon. Wea. Rev., 135, 3248–3259, https://doi.org/10.1175/MWR3457.1.
• Keil, C., and G. C. Craig, 2009: A displacement and amplitude score employing an optical flow technique. Wea. Forecasting, 24, 1297–1308, https://doi.org/10.1175/2009WAF2222247.1.
• Kingsbury, N., 1999: Image processing with complex wavelets. Philos. Trans. Roy. Soc., A357, 2543–2560, https://doi.org/10.1098/rsta.1999.0447.
• Lange, H., and G. C. Craig, 2014: The impact of data assimilation length scales on analysis and prediction of convective storms. Mon. Wea. Rev., 142, 3781–3808, https://doi.org/10.1175/MWR-D-13-00304.1.
• Lindstedt, D., P. Lind, E. Kjellström, and C. Jones, 2015: A new regional climate model operating at the meso-gamma scale: Performance over Europe. Tellus, 67A, 24138, https://doi.org/10.3402/tellusa.v67.24138.
• Mallat, S. G., 1989: A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell., 11, 674–693, https://doi.org/10.1109/34.192463.
• Mariani, S., and M. Casaioli, 2018: Effects of model domain extent and horizontal grid size on contiguous rain area (CRA) analysis: A MesoVICT study. Meteor. Z., 27, 481–502, https://doi.org/10.1127/metz/2018/0897.
• Marzban, C., and S. Sandgathe, 2010: Optical flow for verification. Wea. Forecasting, 25, 1479–1494, https://doi.org/10.1175/2010WAF2222351.1.
• Maurer, V., N. Kalthoff, and L. Gantner, 2017: Predictability of convective precipitation for West Africa: Verification of convection-permitting and global ensemble simulations. Meteor. Z., 26, 93–110, https://doi.org/10.1127/metz/2016/0728.
• Nan, Z., S. Wang, X. Liang, T. E. Adams, W. Teng, and Y. Liang, 2010: Analysis of spatial similarities between NEXRAD and NLDAS precipitation data products. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 3, 371–385, https://doi.org/10.1109/JSTARS.2010.2048418.
• Navascués, B., 2013: Long-term verification of HIRLAM and ECMWF forecasts over southern Europe: History and perspectives of numerical weather prediction at AEMET. Atmos. Res., 125–126, 20–33, https://doi.org/10.1016/j.atmosres.2013.01.010.
• Paulat, M., 2007: Verifikation der Niederschlagsvorhersage für Deutschland von 2001–2004. Ph.D. thesis, University of Mainz, 155 pp.
• Prein, A., A. Gobiet, M. Suklitsch, H. Truhetz, N. Awan, K. Keuler, and G. Georgievski, 2013: Added value of convection permitting seasonal simulations. Climate Dyn., 41, 2655–2677, https://doi.org/10.1007/s00382-013-1744-6.
• Schellander-Gorgas, T., Y. Wang, F. Meier, F. Weidle, C. Wittmann, and A. Kann, 2017: On the forecast skill of a convection-permitting ensemble. Geosci. Model Dev., 10, 35–56, https://doi.org/10.5194/gmd-10-35-2017.
• Schneider, S., Y. Wang, W. Wagner, and J.-F. Mahfouf, 2014: Impact of ASCAT soil moisture assimilation on regional precipitation forecasts: A case study for Austria. Mon. Wea. Rev., 142, 1525–1541, https://doi.org/10.1175/MWR-D-12-00311.1.
• Selesnick, I., R. Baraniuk, and N. Kingsbury, 2005: The dual-tree complex wavelet transform. IEEE Signal Process. Mag., 22, 123–151, https://doi.org/10.1109/MSP.2005.1550194.
• Skinner, P. S., L. J. Wicker, D. M. Wheatley, and K. H. Knopfmeier, 2016: Application of two spatial verification methods to ensemble forecasts of low-level rotation. Wea. Forecasting, 31, 713–735, https://doi.org/10.1175/WAF-D-15-0129.1.
• Sokol, Z., and P. Zacharov, 2012: Nowcasting of precipitation by an NWP model using assimilation of extrapolated radar reflectivity. Quart. J. Roy. Meteor. Soc., 138, 1072–1082, https://doi.org/10.1002/qj.970.
• Tafferner, A., C. Forster, M. Hagen, C. Keil, T. Zinner, and H. Volkert, 2008: Development and propagation of severe thunderstorms in the upper Danube catchment area: Towards an integrated nowcasting and forecasting system using real-time data and high-resolution simulations. Meteor. Atmos. Phys., 101, 211–227, https://doi.org/10.1007/s00703-008-0322-7.
• Vincendon, B., V. Ducrocq, O. Nuissier, and B. Vie, 2011: Perturbation of convection-permitting NWP forecasts for flash-flood ensemble forecasting. Nat. Hazards Earth Syst. Sci., 11, 1529–1544, https://doi.org/10.5194/nhess-11-1529-2011.
• Weniger, M., and P. Friederichs, 2016: Using the SAL technique for spatial verification of cloud processes: A sensitivity analysis. J. Appl. Meteor. Climatol., 55, 2091–2108, https://doi.org/10.1175/JAMC-D-15-0311.1.
• Wernli, H., M. Paulat, M. Hagen, and C. Frei, 2008: SAL—A novel quality measure for the verification of quantitative precipitation forecasts. Mon. Wea. Rev., 136, 4470–4487, https://doi.org/10.1175/2008MWR2415.1.
• Wernli, H., C. Hofmann, and M. Zimmer, 2009: Spatial Forecast Verification Methods Intercomparison Project: Application of the SAL technique. Wea. Forecasting, 24, 1472–1484, https://doi.org/10.1175/2009WAF2222271.1.
• Wittmann, C., T. Haiden, and A. Kann, 2010: Evaluating multi-scale precipitation forecasts using high resolution analysis. Adv. Sci. Res., 4, 89–98, https://doi.org/10.5194/asr-4-89-2010.
• Zacharov, P., 2013: Evaluation of the QPF of convective flash flood rainfalls over the Czech territory in 2009. Atmos. Res., 131, 95–107, https://doi.org/10.1016/j.atmosres.2013.03.007.
• Zimmer, M., 2010: Merkmalsbezogene Verifikation hochaufgelöster Niederschlagsvorhersagen für Deutschland. Ph.D. thesis, Johannes Gutenberg-Universität, 168 pp.
• Zimmer, M., H. Wernli, C. Frei, and M. Hagen, 2008: Feature-based verification of deterministic precipitation forecasts with SAL during COPS. Proc. MAP D-PHASE Scientific Meeting, Bologna, Italy, Institute of Atmospheric Sciences and Climate–ARPA-SIM, 116–121.