## Abstract

Orthonormal wavelet analysis (OWA) is a special form of wavelet analysis, especially suitable for analyzing spatial structures, such as atmospheric fields. For this purpose, OWA is much more efficient and accurate than the nonorthogonal wavelet transform (WT), which was introduced to the meteorological community recently and which is more suitable for time series analysis. Whereas the continuous WT is strictly correct only for infinite domains, OWA is derived from periodizing and discretizing the infinite-domain case and so is correct for periodic boundary conditions.

Unlike Fourier spectra, OWA is not shift invariant. Nor is it equivariant like the WT; that is, the OWA output does not shift as its input shifts. Two remedies are to combine all possible shifts, known as the overcomplete, nonorthogonal shift equivariant WT, or else to use a “best shift,” known as best shift wavelet analysis. Although shift invariant and orthonormal w.r.t. arbitrary inputs, the latter’s optimization generally depends nonlinearly on a reference structure.

OWA is a generalization of Fourier series, and the associated multiresolution analysis (MRA) generalizes Reynolds averaging. Like these, OWA/MRA on discrete and continuous domains satisfy analogous identities arithmetically exactly, unlike the WT. OWA/MRA is much more efficient than Fourier series for analyzing several examples, including nearly singular synthetic structures, and a 90-day Hovmöller diagram of observed geopotential height, containing five atmospheric blocking events. OWA’s variance compression is comparable to that of EOF analysis, but is much faster and less data-dependent.

Implementing the basic OWA operations is documented with WaveLab, a freely available package requiring MATLAB.

## 1. Introduction

Understanding atmospheric structure and dynamics requires both functional and statistical analysis. This paper is about relatively new approaches to these analyses and the situations in which they are preferable. Historically there have mainly been two approaches. One option is to use the observable field values at discrete locations. For example, Onsager (1949) introduced a statistical hydrodynamics based on the approximation of a continuous vorticity field by a set of discrete “point vortices” of constant circulation and variable location. Of course, another “spatial” representation is by values on a fixed grid, as in finite-difference numerical modeling. The alternative technique has been Fourier series, or similar orthogonal function series. Lorenz (1960) showed that analogs of the fields’ physical conservation laws were obeyed by truncations of Fourier series. Charney (1963) observed this to be the case also for general orthogonal function series and for point vortices, but *not* for fixed-grid discretization. He called the point-vortex and Fourier representations “dual” in the sense of the Heisenberg uncertainty principle, or wave particle duality. These terms usually refer to the reciprocal uncertainty of location and wavenumber information in quantum mechanics. Actually, an uncertainty principle applies to the statistical variances of spatial and wavenumber representations of *any* distribution (Chui 1992;Wickerhauser 1994; Prestin and Quak 1999). The implied trade-off of information can be a basis for choosing between the analysis methods that we will discuss.

The point-vortex technique is limited to relatively simple kinds of dynamics. Thus to maintain conservation laws, some kind of orthogonal function expansion is needed, in order to treat realistic dynamics. Because actual atmospheric fields may be spatially localized and can have large gradients, Fourier series and related global analyses are not ideally suited; some kind of compromise representation is called for.

### a. Spatial versus wavenumber compromises

#### 1) Before wavelets

The uncertainty principle suggests that there can be a mutual exchange of information content, or resolution, between the “extreme” cases of points or waves. In recent decades there have been various imperfect attempts to formulate compromise representations, which somehow combine localization in space and scale. Siggia (1977) introduced the use of “wave packets” to partially decouple position and scale in turbulent dynamics. Zimin (1981) introduced a disjoint wavenumber band representation, later identified by Zimin and Hussain (1995) as a nonorthogonal wavelet basis. Both these approaches required very complicated approximations. Besides these contributions from turbulence investigators, there was also an application to meteorology of bandpass filtering (BPF) designed for limited spatial domains, which had nonnegligible domain boundary effects (Bettge and Baumhefner 1980). These investigations, and those in other sciences, set the stage for the introduction of orthonormal wavelet analysis (OWA), as labeled, for example, by Mak (1995). There are already several reviews available; for example, Kumar and Foufoula-Georgiou (1997) review wavelet analysis for geophysical applications. Examples throughout broader areas of physics are collected by van den Berg (1999).

#### 2) Relevant wavelet applications

OWA is especially relevant when dominant spatially “local” structures are present in a synoptic field, for example, during a blocking situation. [Following Hansen and Sutera (1984), we define blocking, in the 55°–80°N average, by slowly propagating ridges in the 50-kPa height, which exceed the zonal mean by at least 250 m for at least seven days.] The present paper is the first in a sequence that argues that in the case of compactly organized structures such as blocking, wavelet-based analysis techniques are more appropriate than Fourier analysis, to evaluate the relative importance of different-scale effects. [This outstanding issue has been investigated by numerous authors, e.g., Tsou and Smith (1990), using part of the present dataset.] Although wavelet analysis has been applied in the time domain to atmospheric boundary layer turbulence (e.g., Foufoula-Georgiou and Kumar 1994) and climatic time series (e.g., Bolton et al. 1995; Lau and Weng 1995; Mak 1995; Torrence and Compo 1998; Park and Mann 2000), and has been applied in the space domain to observed (Zubair et al. 1992; Zubair 1993) and numerically simulated turbulence (Meneveau 1991; Farge et al. 1992; Fournier and Stevens 1996), there has been little application to observed global synoptic meteorological data. The present author has endeavored to remedy this situation in Fournier (1996, 1999) and Fournier (1999, three manuscripts submitted to *J. Atmos. Sci.,* hereafter F1, F2, and F3).

This paper is also designed as a complement to earlier tutorials by Lau and Weng (1995) and Torrence and Compo (1998), whose methods are suitable for analysis of time series, not spatial structures. The latter authors presented several formulas that should be identified as approximations to appropriate integrals, to lowest order in the time increment *δt* and/or the scale increment *δj*;they correctly emphasized the importance of choosing sufficiently small increments. [Shensa (1992) discusses exact methods for such integrals, in the course of his thorough discussion of the relationship of OWA to the continuous wavelet transform (CWT) and other related transforms.] This is adequate for sufficiently smooth signals but may lead to difficulties in reconstructing, filtering, decomposing the variance of, or otherwise analyzing less-smooth signals. By contrast, the methods described in the present paper are arithmetically exact, that is, accurate to computer precision, for the analysis of a uniform sampling of any spatial field. The analyses of the field’s samples arithmetically exactly satisfy equations analogous to those of the continuous field’s analyses.

### b. A motivating example

The following further motivates the hypothesis that OWA should be advantageous for atmospheric problems. Figure 1a shows an image of the earth. Cloud patterns suggest the complicated dynamics and structure of the underlying fields, and exhibit a broad range of scales, anisotropy, and organized structures.

Often in image processing or geoscience one needs to use a relatively small number of values to represent such an earth image. Say we want to reduce the 2^{18} pixel values to only 2^{12} values of some kind. One method, corresponding to a purely spatial viewpoint, would be to simply average the pixel values in 2^{3} × 2^{3} squares, yielding the piecewise uniform image in Fig. 1b. Obviously, much important detail structure has been lost.

A method corresponding to a purely wavenumber viewpoint could be to truncate wavenumbers ≥2^{5}. The image reconstructed from the remaining (2 × 2^{5})^{2} = 2^{12} Fourier–sine–transform coefficients is shown in Fig. 1c. Although the image is much smoother and more realistic, there is still very little detail left.

The alternative method is to *combine* some of the spatial and wavenumber information, using coefficients of expansion in a wavelet basis, which is simultaneously localized in both space and wavenumber. The reconstruction from the 2^{12} largest-magnitude coefficients is shown in Fig. 1d. Much of the structure lost by the previous two methods has been regained, such as the ITCZ and the vortices west of South America. This observation has been the main motivation to the investigations of Fournier (1995, 1996, 1999), F1, F2, F3, and Fournier and Stevens (1996) into how to apply OWA to atmospheric spatial structure and dynamics.

### c. Outline

An introduction to the basic elements of OWA follows a review of simple statistical ideas familiar to meteorologists, in section 2. Unlike most popular introductions, which were geared toward time series analysis, this includes a review of periodization, appropriate for analysis on latitude circles. Also briefly reviewed are extensions initiated by Beylkin (1992) and Shensa (1992) to compensate for the lack of shift invariance of the basic analysis. This important objection can be satisfactorily dealt with by either using all possible shifts simultaneously (with minor additional computation) or choosing a “best shift.” There are at this time few references to this subject available, and mostly they are aimed at an audience specializing in statistical mathematics or signal processing. The representation of blocking structures by best shift is investigated in F3 and summarized in section 2h. Section 3 contains a detailed survey of examples, directly comparing the accuracy of equally truncated Fourier and wavelet series, for analyzing various test signals. This tutorial is to familiarize the reader with OWA enough to understand its suitability for the physical problem treated in section 4, namely, multiresolution analysis (MRA) of Hovmöller diagrams of blocking.

To offer the reader “hands-on” practice, Table 1 summarizes the actual function syntaxes to implement the key algorithms. These functions are part of the free online MATLAB library, WaveLab (Buckheit et al. 1995).

## 2. Average and detailed components of structures

### a. Basic MRA concepts

Since the work of Reynolds, it has been a common practice in turbulence (large eddy simulation) and meteorological modeling to decompose a field *f*(*x*) into its average and deviation parts: *f* = 𝒲^{⊥}*f* + 𝒲*f.* Here 𝒲^{⊥}*f* is the result of some kind of averaging projection operation, 𝒲^{⊥} is a smoothing operator, and the residual 𝒲*f* defines a detailing operator, 𝒲 ≡ 1 − 𝒲^{⊥}. The label ⊥ will appear in different ways here, to signify some kind of orthogonality. For functions and operators these orthogonalities are expressed by 〈𝒲*f*𝒲^{⊥}*f*〉_{x} = 0 [where 〈*f*〉_{x} ≡ ∫^{1}_{0} *f*(*x*) *dx*] and 𝒲^{⊥}𝒲 = 0, respectively. For example, low-pass filtering (LPF) (Blackmon 1976;Bettge and Baumhefner 1980; Saltzman 1990) gives a 𝒲^{⊥} in the “ideal” LPF case, (𝒲^{⊥})^{2} = 𝒲^{⊥}, but the commonly used running average is not a projection: each application of running average further smoothes the field. For concreteness, henceforth 𝒲^{⊥}*f* and 𝒲*f* are taken to be 〈*f*〉_{x} and *f*★ ≡ *f* − 〈*f*〉_{x}, respectively.

This is generalized to MRA as follows. Begin with a trivial-looking “telescoping sum” decomposition:

Note that this expansion is true regardless of the symbols’ definitions. If *f*_{J} stands for a “fine approximation” of *f* at some “finescale” *j* = *J,* then one can always derive it from a “gross approximation” *f*_{0} at some “coarse scale” *j* = 0 by adding a series of “details,”

between successive approximations. Now let us combine these simple ideas with OWA.

We are interested in *f*s that depend on periodic coordinates such as

so in the following sections we survey periodic OWA;see, for example, Perrier and Basdevant (1989) for details. (Henceforth “periodic” shall be subsumed into OWA, etc.) First we review Fourier analysis. For ease of discussion, all formulas and notation are summarized in Table 2, arranged in columns by analysis method and in rows by the ingredients of the analyses. Additional notations are 〈*f*〉_{λ} ≡ (2*π*)^{−1 }∫^{π}_{−π} *f*(*λ*) *dλ* and 〈*a*〉_{n} ≡ Σ^{∞}_{n=−∞} *a*_{n}.

### b. Fourier analysis

If *f* is 2*π* periodic and either absolutely integrable and has bounded ∂_{λ}*f* at *λ,* or else square integrable and continuous at *λ,* then *f*(*λ*) may be expanded in the Fourier basis. In practice, the limit ∞ in the Table 2, column 1 formulas is replaced by the number 2^{J} of uniform function samples available; thus we get discrete analogs that are arithmetically exact.

### c. Wavelet analysis

More generally than for Fourier series, any 2*π* periodic, continuous or absolutely integrable function *f*(*λ*) may be expanded in a periodic orthonormal wavelet basis. The wavelet coefficients *f̃*_{j,k} are defined using 〈 〉_{λ} or, in Fourier space, 〈 〉_{n}. Again, in practice ∞ is replaced by *J* − 1 so that the coefficients (including 〈*f*〉_{λ}) total in number and are computed as described in section 2f. All the wavelet equations in Table 2 follow from section 9.3 of Daubechies (1992).

#### 1) Different bases and the meaning of the labels

Another way to see how the OWA basis compromises between point and wave representations is given by Fig. 2a. The left panel depicts basis functions taking the value 1 at each of 2^{J} longitudes,

and zero elsewhere. This basis underlies the standard uniform discretization of *f*(*λ*). It is entirely localized in *λ.* The middle panel depicts the sines and cosines with wavenumbers *n* from 0 to 2^{J−1}, as used in the FFT. The Fourier basis is perfectly localized in wavenumber, sometimes equated with scale, as defined ∝*n*^{−1}. In the right panel, obviously the basis functions are somewhat localized in space. The basis functions come in octaves or levels *j* = 0, . . . , *J* − 1, with 2^{j} functions at each level. The functions at lower *j* are clearly less localized in *λ,* so *j* stands for resolution. It turns out that they are *more* localized in scale (properly defined as intervals of *n*), as we will see in Fig. 5b. In some sense, some scale information has been traded off to regain lost location information. Such trade-offs must always satisfy the uncertainty principle w.r.t. scale and location. We can visualize this with a unit box, Fig. 3a, symbolizing a 2D location–wavenumber plane to be covered by 2^{J} = 8 basis elements. The space representation (no scale or wavenumber information), the wavenumber representation (no location information), and the wavelet representation (some of both kinds of information) are symbolized by the tilings in Figs. 3b–d, respectively. The uncertainty principle is expressed by each tile in every tiling having the same area, 2^{−J} = ^{1}/_{8}. As one tile dimension is reduced, to increase the resolution in that aspect, the other dimension must be increased, which reduces its resolution aspect.

#### 2) Wavelets on the real line

The periodic OWA basis functions *W*_{j,k} are obtained from those on the real line, *ψ*_{j,k}, by periodization (see appendix A). The *ψ*_{j,k} are generated by 2^{j} dilation and (*k* − 1) shift:

There are many possible “mother wavelets” (Meyer 1993), *W,* which generate such a representation. Most *W*s cannot be written in terms of explicit functions, but many can be defined by

where the scaling function *W*^{⊥} (Daubechies 1992), also called “wavelet father” (Meyer 1993), is the solution of the dilation equation (e.g., Kaiser 1994; Strang and Nguyen 1995):

Sometimes (2.3b) is called the “scaling identity” (e.g., Strichartz 1993), or the “two-scale equation” (e.g., Wickerhauser 1994), or (2.3a) and (2.3b) together are called the “two-scale relations” (e.g., Chui 1992). Ultimately, everything is determined by the two filter sequences and

For any , (2.3c) (even with any odd integer replacing 3) implies for any integer ℓ. This in turn implies as long as . Equations (2.2), (2.3a)–(2.3c) are derived from (1.3.4), (5.1.32), (5.1.15), (5.1.35), respectively, of Daubechies (1992). The significance of the cascade interpretation of (2.3a), (2.3b) to turbulence is discussed in appendix B.

There is one case for which *W*^{⊥} and *W* may be expressed explicitly. For = 2^{−1/2}(−1, 1) the solutions to (2.3a), (2.3b) are

(Daubechies 1992). These relations are schematically depicted in Figs. 4a,b. The orthonormal basis generated by these solutions via (A.1) and (2.2) was invented by Haar (1910).

Note that Haar’s solutions are compactly supported (CS, i.e., they vanish outside a closed, bounded set of *t*), discontinuous, and zero mean: . Daubechies (1988, 1992) generalized the solutions, showing that, with certain requirements on (reviewed by F3), solutions *W*^{⊥} exist that are CS and have smoothness and a number of vanishing moments that all increase roughly linearly with the support length 2*N* of . Figures 4c,d show the simplest (*N* = 2) Daubechies solutions, for

In Fig. 4 the wavelet and scaling-function graphs are horizontally aligned to accurately show how features (such as peaks, dips, and cusps) combine to construct the larger structure. The size (inversion) of the ℓth smaller scaling function indicates the magnitude (sign) of the coefficient or *w*_{ℓ}_{−2N+2} that multiplies it.

The investigator should choose a particular wavelet type according to the problem to be addressed. For example, F1 uses the Daubechies-14 wavelet, with only 2*N* = 14 nonzero *w*_{ℓ}. For the purpose of analyzing a partial differential equation solution, this wavelet was chosen to provide the most CS basis available that had continuous second derivatives. In the present section 4, Fournier (1999), and F2 and F3, the Daubechies-20 wavelet is used, with only 2*N* = 20 nonzero *w*_{ℓ}. This was the smoothest CS wavelet available, selected on the assumption that atmospheric structures are at least twice differentiable on the scale observed.

#### 3) Wavelet location and support

Figure 5a shows the *W*_{j,k}s (for *λ*_{ε} = 0 in Table 2, and *N* = 10) in an arrangement that sets up Fig. 9. The *W*_{j,k}s are centered approximately at the pyramidal set of longitudes *λ*_{j,k} = *λ*_{J,}_{ℓj,k}. Each *λ*_{j,k} is related to the grid (2.1) by the pyramidal index set

and constructs a grid from −*π* to *π* with step 2^{1−j}*π.* Pyramidal simply describes the constraint *k* = 1, . . . , 2^{j} as exemplified in Table 3. The support of *W*_{j,k} has length equal to 2*π* min[1, 2^{−j}(2*N* − 1)].

### d. Generalization to arbitrary curtailing

The functions *W*^{⊥}_{j,k} defined by (A.1), replacing *W* by *W*^{⊥} in (2.2), satisfy the limited orthogonalities in Table 2. The *W*^{⊥}_{R,k} are averaging functions, associated with LPF, and *f*_{0} = *f*_{0,1} = 〈*f*〉_{λ} since *W*^{⊥}_{0,k}(*λ*) = 1. This is called OWA “curtailed at level *R*” (Nason and Silverman 1995). Its physical interpretation is that the scaling coefficients *f*_{R,k} now contain all information corresponding to a certain low-wavenumber interval, bounded ∝2^{R.}

Figure 2b shows *J* = 5 possible curtailings. The leftmost panel copies the lower half (*j* < *J* − 1) of the right panel of Fig. 2a. Level *j* = *J* − 1 is unaffected by curtailing at levels *R* < *J.* Curtailing at level *R* = *J* corresponds to the identity transformation. From left to right, as *R* increases from 0 to *J* − 1 the first 2^{R} basis functions shown in the leftmost panel are replaced by the corresponding scaling functions *W*^{⊥}_{R,k}. The 2^{J} − 2^{R} wavelets above are unaffected. Each of these panels (taken together with the *j* = *J* − 1 half not shown) illustrates another possible orthonormal basis. For *J* = 3 the tilings corresponding to *R* = 1 and 2 are in Figs. 3e,f, respectively.

### e. Fourier picture of OWA

OWA can also be visualized in the Fourier domain; the are shown in Fig. 5b. From the *n*th Fourier coefficient of (A.1) we see that the magnitude || is independent of location index *k* for a given resolution index *j,* so it is only plotted once, for *k* = 1 (black curve). Also, the phase arg is just a shift by 2*π*(1 − *k*)2^{−j}*n* from the continuous Fourier transform of *W* at frequency 2^{−j}*n* (gray curve for each *n*). Thus for *n* = 2^{j} the phase shift is zero, while for *n* = 2^{j−1} every change of *k* by 2 sweeps a whole cycle.

Since both *W* and are localized in their respective spaces, *f̃*_{j,k} selects the contribution to *f* from location ∝2^{−j}*k* and scale ∝2^{−j}. The set defines an MRA of *f.* It is essentially a sophisticated kind of BPF technique, to use a term more familiar to climatologists (Blackmon 1976; Saltzman 1990). The are for BPF, while the are for LPF. The BPF bandwidth and band center are seen to increase with *j* as the spatial resolution increases (Fig. 5a), in accordance with the uncertainty principle (Meyer 1993; Wickerhauser 1994). Fourier and wavelet series are further compared in appendix C.

### f. Practical OWA

In practice, 2^{J} discrete uniform samples *f*(*λ*_{J,k}) ≈ 2^{J/2}*f*_{J,k} of *f* are given. Then discrete analogs of the 2^{R} scaling coefficients *f*_{R,k+1}, *k* = 0, . . . , 2^{R} − 1, and the 2^{J} − 2^{R} wavelet coefficients *f̃*_{j,k+1}, *k* = 0, . . . , 2^{j} − 1, *j* = *R,* . . . , *J* − 1, are computed in *O*(2^{J}) convolution–decimation calculations:

(Beylkin 1992; Nason and Silverman 1995, flipping the sign of *ε*_{j} to conform with WaveLab). (The decimation of *a*_{k} is *a*_{2k−ε}, *ε* = 0, 1.) For 2*π* periodic *f* and *w*^{⊥}_{ℓ<1} = *w*^{⊥}_{ℓ>2N} = 0 (the periodic CS case) these become

Equation (2.5) involves just , so *W*_{j,k} never needs to be evaluated. This efficient computation is due to the“self-similarity” (2.3a), (2.3b). Note that the corresponding FFT requires *O*(*J*2^{J}) calculations and is therefore slower. The reconstruction is the inverse of this orthogonal transformation:

In the periodic CS case this becomes

A visual pattern corresponding to (2.5a)–(2.5c) corresponds to ascending a pyramid resembling Table 3; likewise (2.5d), (2.5e) correspond to descending. This discrete wavelet transform (DWT) satisfies the discrete analogs of all the previous equations, just by replacing the operation 〈*f*〉_{λ} everywhere by 2^{−J }Σ^{2J}_{ℓ=1} *f*_{ℓ}.

The binary sequence *ε*_{j} = 0, 1 appearing in (2.5) determines whether even- or odd-indexed samples are used at level *j* of convolution–decimation, as introduced independently by Shensa (1992) and (in more algorithmic detail) by Beylkin (1992). The reference longitude *λ*_{ε} is related to in appendix D. Including all possible , the “overcomplete” set of corresponds to using *all* 2^{J} possible periodized shifts of 2^{j}-dilated *W,* instead of the orthonormal set of 2^{j} shifts indexed by *k* = 1, . . . , 2^{j}, but still only requires *O*(*J*2^{J}) calculations (Beylkin 1992), the same cost as the FFT. Appropriately ordered, this larger set of analysis functions generates the nonorthogonal shift-equivariant WT (SEWT), sometimes called “undecimated WT” (Shensa 1992), “stationary WT” (Nason and Silverman 1995), “shift-invariant WT” (Lang et al. 1995), “translation-invariant WT” (Coifman and Donoho 1995; Del Marco and Weiss 1997) or “maximal-overlap DWT” (MODWT; Percival and Mofjeld 1997). The most physical description is equivariant (e.g., von Sachs et al. 1998), meaning a shift of the function produces the same shift of the SEWT [as opposed either to the relations (D.3), wherein the “detail bits” of the shift are truncated, or to (D.4)]. This term is distinguished from “shift invariant,” describing the Fourier spectrum |*f̂*_{n}|^{2} but not the Fourier transform itself, which merely diagonalizes the shift operator.

The unrestricted domain of location index *k* “fills in the gaps” left from the orthonormal basis. Figure 6 illustrates this overcomplete, or redundant, representation. The standard OWA (SOWA, defined by *λ*_{ε} ≡ 0) basis functions are shown in heavy outline. The other 2^{J} − 1 possible orthonormal sets are all contained in the redundant set.

Among the first SEWT applications were denoising (Coifman and Donoho 1995; Lang et al. 1995) and estimating noisy transients and time delays (Pesquet 1996). Further applications of SEWTs and their wavelet-packet generalizations (Pesquet 1996; Del Marco and Weiss 1997) include examining magnetic field shock waves (Walden and Contreras Cristan 1998), assessing formula errors with observed turbulent heat fluxes (Qi and Neumann 1997), synthesizing multifractal anisotropic 2D cloud profiles (Davis et al. 1999), and analyzing river water levels (B. Whitcher et al. 1999, manuscript submitted to *Water Resour. Res.*). In particular, Percival and Mofjeld (1997) demonstrated the power of their MODWT for detecting and meaningfully alligning MRA features with the original data, and efficiently decomposing data variance, and explained MODWT’s advantages over both short-time Fourier transform and complex demodulation.

With a minor abuse of notation, the coefficients may be denoted *f*_{j,2j−Jℓ}, and *f̃*_{j,2j−Jℓ} where the fractional *k* index accurately reflects the location defined by *λ*_{j,k→2j−Jℓ} and allows ℓ = 1, . . . , 2^{J}. Or, generalizing Shensa’s (1992) Eq. (3.7), use (D.1) to define

which are easily shown to be shift equivariant, and by (D.4) reduce to tautologies of the form *A* = *A* as ℓ → 2^{J−j}*k.* Although not orthonormal, the SEWT still satisfies a Parseval identity,

which is essential for any kind of analysis of variance, for example, the wavelet energetics of F1, F2, and F3. The factors 2^{j} compensate for an emphasis of the larger scales in this representation, in a sense, the “price” of spreading the large-scale representation over a finer location grid. In a scale analysis of covariance, omitting the factors 2^{j} would cause larger-scale contributions to be emphasized, relative to the orthonormal analysis.

### g. Best shift of the wavelet basis

A particular orthogonal representation, corresponding to a particular

may be extracted from the overcomplete set by maximizing some measure of the representation’s efficiency, w.r.t. . Such a best-shift wavelet analysis (BSWA) was developed roughly contemporaneously by several investigators; see Liang and Parks (1996), Pesquet (1996), and their references. The *λ*^{+}_{ε} computation is shift equivariant but generally depends nonlinearly on one reference input. For the appropriate *λ*^{+}_{ε}, the BSWA is shift invariant and orthonormal w.r.t. arbitrary inputs.

One may find *λ*^{+}_{ε} in *O*(*J*2^{J}) calculations by minimizing a cost functional such as ℰ() [the analysis entropy of Coifman and Wickerhauser (1992)] w.r.t. , where

As shown by Wickerhauser (1994), ℰ is useful for at least two reasons. It is convex w.r.t. possible analyses (), that is,

Hence any weighted averaging of analyses increases ℰ, reducing information, so conversely reducing ℰ increases information, pointing to the most efficient representation. In the extreme cases, 0 ⩽ ℰ() ⩽ *J,* where ℰ() = 0 if and only if has at most one nonzero term (maximum representation efficiency), and ℰ() = *J* if and only if *a*_{ℓ} is independent of ℓ (worst case: no information). Wickerhauser (1994) also shows that OWAs optimized in similar ways to this approach the optimal EOF analysis at substantially reduced computational cost. Liang and Parks (1996) classify several alternative cost functionals and evaluate their suitability for fast algorithms and effectiveness for compression. Del Marco and Weiss (1997) extended the method from base-2 to base-*M* in (2.3a), (2.3b), and to extra flexibility in minimizing ℰ by simultaneously combining multiple *λ*_{ε}s.

### h. Application of BSWA

Let us briefly review some recent applications of BSWA that are closely related to problems investigated by traditional techniques. BSWA is applied by F3 to observed 50-kPa geopotential height data. In four out of five cases of time- and latitude-averaged atmospheric blocking, only two optimally shifted wavelets contribute more variance (at least 89%) than four or more SOWA wavelets. For certain *j* = 1 type blocks, the observed height distribution remarkably resembles the wavelet *W*_{1,k}(*λ*) itself. The study of F3 discusses physical reasons why this might be the case and other topics, for example, how to interpret physically the signal-processing motivations underlying wavelet design. Also, F3 derives a minimal Fourier coefficient model of blocking from *W*_{1,k}, relates it to wavelet design considerations, and interprets the minimal model with regard to spectral dynamics models of Platzman (1962) and Wiin Christensen and Wiin-Nielsen (1996).

## 3. Examples of wavelet analysis

Now we shall compare examples of various *N* = 2 analyses of several test functions *f* accompanying the free WaveLab software (Buckheit et al. 1995). Each *f* from Table 4 was first centered and normalized by *f* − 〈*f*〉_{x} → *f,* ‖*f*‖^{−1}*f* → *f.*

### a. Fourier analysis versus SOWA and BSWA

Except where noted, both the BSWA and Fourier series were truncated at the number of SOWA coefficients exceeding the magnitude shown, to illustrate the relative merits of the methods. At the left of Figs. 7e,f,k,z, the wavelets are vertically sorted by the variance percentage (VP) each contributes to the reconstruction, cumulatively labeled for each (*j, k*). After summing some (usually small) number of terms, for example, 3(8) for Fig. 7e (f), the eventual inferiority of the Fourier components (not graphed) is indicated by the right-hand labels’ (*n*) lower scores in the VP measure. The VP measure also indicates analysis of variance efficiency.

Figure 7a shows the “Ramp” function, such as might model a breaking wave. [In fact, *t*^{−1}*f*(*x*) solves the nonlinear advection equation *U*_{t} = −*UU*_{x}; see F1.] The truncated Fourier series exhibits the Gibbs phenomenon (Fig. 7b), while the truncated BSWA yields a much better approximation (Fig. 7c). For most *x,* the BSWA errors are much smaller than the Fourier series errors (Fig. 7d). Note in Fig. 7e how the smaller-scale (higher *j*) wavelets appear only at the discontinuity, and how the cusps of the larger-scale (lower *j*) wavelets align and sum to a smooth line. Figures 7e,f show the improvement of BSWA over SOWA by the VP measure. Note that VP compression does not necessarily follow visually perceived accuracy; pointwise SOWA errors (not shown) are less than BSWA’s shown in Fig. 7d, as indicated in row a of Table 4.

How well do the methods reconstruct a point singularity? Table 4, row b, shows the truncation errors for the Kronecker (smoothed Dirac) function. Graphically this is similar to the less singular *f* analyzed in row c and Figs. 7g–k. With only 23 coefficients the Fourier series has barely started to build up this singularity (Fig. 7h) and performs terribly w.r.t. VP (Fig. 7k). The OWA errors are significantly smaller (Fig. 7j). The Fourier series is better for Table 4, row c, than row b, reflecting the weaker localization in *x,* but in neither case does it perform nearly as well as OWA.

Further delocalizing in *x,* consider Table 4, row d. Again the Fourier series improves but is still inferior to OWA, especially for describing the cusp (not shown). In this case the OWA actually yields two more coefficients larger than the threshold (0.01), relative to the Fourier analysis.

Functions a–d might represent various kinds of *vortices,* examples of localized phenomena occurring in geophysical fluid dynamics. This suggests that wavelets are useful for representing vortices, as has been demonstrated by Farge (1992), Farge et al. (1992, 1999), and Schneider et al. (1997). Wavelets are also advantageous for representing topography-like structures such as shown in Figs. 7l–p. In order to capture the very localized “bumps,” Fourier analysis must include high wavenumbers, which contaminate the level “plains” (Fig. 7m) and increase the overall truncation error (Fig. 7o). The small-scale wavelet components only occur where needed in Fig. 7p. Similar behavior appears for the piecewise uniform function shown in Figs. 7q–u.

We now consider functions in which Fourier analysis has some advantage. Clearly for the purely harmonic cases, Table 4, rows g and h, the Fourier analysis does a somewhat better job. In these two analyses there were many more significant OWA coefficients compared to Fourier coefficients, so for the sake of comparison, truncation was made on the magnitude of the latter (as not in the other examples).

Suppose we add a little discontinuity to a sine wave. Figures 7w,x show the reconstructions using 35 coefficients for Fourier or BSWA analysis, determined from the BSWA coefficient magnitudes. Figure 7z shows better VP compression by the Fourier coefficients. As seen in Table 4, row i, Fourier analysis also has slightly smaller mean absolute error, because the wavelets in this case are visibly not smooth enough and so introduce some roughness (Fig. 7x). However, OWA reproduces the discontinuities more sharply and so has smaller rms and max errors. Note: in this case BSWA does not improve over SOWA, since the two discontinuities are separated in *x* by 0.42, which is nondyadic and so not amenable to a single best shift. We speculate that the multiple-*λ*_{ε} approach of Del Marco and Weiss (1997) may help here.

We conclude this survey with two cases related to harmonic functions, yet in a sense more amenable to OWA. Table 4, row j, shows the truncation errors for spatially varying wavenumber, *n*_{D}(*x*) ≈ 21[*x*(20*x* + 1)]^{−1} (away from *x* = 0, 1). For the low-wavenumber part Fourier analysis does better, but at high wavenumbers the truncation greatly increases the Fourier series error, and OWA does better. OWA also does better with the linearly varying wavenumber “chirp” function, Table 4, row k. OWA outperforms Fourier analysis in this case at both the extreme large and small wavenumbers.

### b. MRAs and SEWTs versus ideal BPF

Now we revisit some of the wavelet coefficients just discussed, combined scale-by-scale for *λ*_{ε} = 0 (SOWA case) to form MRAs, or sorted by each *λ*_{ε} to form SEWTs. Figure 8 shows three analyses of the functions in Table 4, rows e and j. The sum of the *f*^{★}_{j} (or the *f*^{bp}_{j} below, but not the SEWT) regains *f,* with the indicated error ɛ. These cases illustrate how MRA and SEWT isolate local features at the appropriate scale. Figures 8d–f exhibit *n*_{D}(*x*): observe how the “activated” wavelets proceed to smaller scales as *x* decreases. Given that *n* ∝ 2^{j} for a given wavelet level (see Fig. 5b), it is satisfying to observe that the maximum amplitude follows very closely to the indicated (gray) curve *D*_{s} ≡ {(*x, j*)|*j* − log_{2}*n*_{D}(*x*) = *s* = (small constant)}. (The *n*_{D} = 2^{j} curve *D*_{0} becomes less least squares fit in going from MRA through SEWT to BPF.) Similarly for Table 4, row k, the wavenumber varies as *n*_{c}(*x*) ≡ 2^{J−2}*x,* and the maximum amplitude mostly follows the curve (not shown) *j* − log_{2}*n*_{c}(*x*) = (small constant).

For comparison with traditional techniques, attempt the same effect with the ideal BPF achieving roughly the same dyadic partitioning in *n*: Compare *f*^{★}_{j} with

( being the integer part of *t*), using nonorthogonal wavelet

whose *n*′th Fourier coefficient − 1), 0 (otherwise). This filter is wavenumber orthogonal (〈*ψ*_{n}*ψ*_{n′}〉_{λ} = *δ*_{n,n′}) and zero mean (〈*ψ*_{n}〉_{λ} = 0), and decomposes the zonal deviation *f*★ = Σ^{∞}_{j=0} *f*^{bp}_{j} and zonal variance 〈*f*★^{2}〉_{λ} = Σ^{∞}_{j=0} (*f*^{bp}_{j})^{2}. The results of this BPF of the cases from Table 4, rows e and j, are shown in Figs. 8c,f. In every case some of the features revealed by SOWA–MRA and SEWT still appear, but with much poorer spatial resolution, especially at large scales. There are contaminating oscillations where the graphs should be flat. The *f*^{bp}_{j} are smoother, but the *f*^{★}_{j} may be made smoother by using smoother wavelets. In terms of analysis of variance (label to right of *j*th curve indicating contribution from *j*th component), SOWA–MRA achieves the most compression for both functions (Figs. 8a,d). SEWT has some variance compression advantage over BPF for the variable wavenumber function (Figs. 8e,f) but, as noted before, not for the function containing discontinuities with nondyadic separation (Figs. 8b,c).

## 4. Time-dependent analyses of geopotential height

### a. MRA

In these sections five blocking periods are analyzed, of 7–13-day duration (Hansen and Sutera 1984; Fournier 1996, 1999; F2; F3). After F2, *ϕ* = 65°N was judged to be a representative latitude to discriminate blocking from nonblocking, since it cuts through the block structure. It is standard to depict blocking event sequences with a Hovmöller (1949) diagram. This is shown in Fig. 9a for *Z*(*λ, t*) at 70 kPa, with *t* = 0 ≐ 0000 UTC 1 December 1978. The light regions of this figure clearly depict four blocking events in the time intervals *T* = [0, 6.5], [19, 26.5], [44, 56.5], [77, 83] days near the Atlantic and one in *T* = [28.5, 40] days near the Pacific. The third Atlantic blocking event is very strong and clearly moves in a retrograde (westward) manner. Apart from these features it is difficult to extract other insight from the complicated patterns. The time and zonal average phase velocity 〈(∂*λ*/∂*t*)_{Z}〉_{λt} does not consistantly show the slowing commonly associated with blocking. There is range of scales displayed, since there are both regions with many close contour curves and broad, more uniform areas.

A standard way to proceed is to filter the Hovmöller diagram in Fourier space for each *t* (Wiin-Nielsen and Chen 1993). For comparison, take the *N* = 10 MRA

shown in Figs. 9b–g. The sum of these six panels reproduces Fig. 9a. The *j* = 1 by itself clearly shows the same blocking ridge pattern as Fig. 9a. This suggests that information encoded in just the two numbers *Z̃*_{1,1} and *Z̃*_{1,2} may instantaneously describe the presence of blocking, at least for these data. Although *j* = 2 evidently partly contributes to the block, generally the intermediate scales 2^{−j} = ^{1}/_{4}, ^{1}/_{8}, ^{1}/_{16} capture patterns associated with synoptic cyclone waves, from Atlantic size (*j* = 2) down to Hudson Bay size (*j* = 4). Generally speaking, there are more prograde (eastward tending, dark curves) structures evident during nonblocking. For example, *Z*^{★}_{1} for |*λ*| ≲ 90 during the first nonblocking interval, [7, 18.5] days, and *Z*^{★}_{2} for 0 ≲ *λ* ≲ 100° (over Europe and Russia) during the fourth nonblocking interval, [57, 76.5] days; *Z*^{★}_{3} at these times and also during the third nonblocking interval, [40.5, 43.5] days. The observable tendencies are usually reflected by 〈(∂*λ*/∂*t*)_{Z}^{★}_{j}〉_{λt}.

It should be noted that there are exceptions to these observations. That is, occasionally _{Z}^{★}_{j} may appear steady during nonblocking or prograde during blocking. Likewise, strong local anomalies prevent 〈(∂*λ*/∂*t*)_{Z}^{★}_{j}〉_{λt} from being a foolproof indicator of blocking action. This is partly why the very definition of blocking is still a subject of much debate (Liu 1994), and why part of the sequence of studies by the present author is aimed at characterizing blocking using MRA techniques.

### b. Comparison with Fourier BPF

The apparent low-amplitude (a few meters), small-scale (*j* > 4) steady structures over the eastern United States and Europe in Figs. 9f,g seem unphysical. These are probably artifacts but not due to this analysis; almost identical patterns are seen in high-*j* Fourier BPF (3.1) of these data, shown in Figs. 9h–m. This figure also shows how blocks, belonging to the large-scale (low *j*) band, are more poorly represented by simple BPF than by MRA. For example, although *Z*^{bp}_{1} exhibits greater amplitude during some blocking events, it cannot indicate in which hemisphere the block lies. Figure 9n shows *Z*(*λ,* 44.5 day) and the surface topography at 65°N. This time was chosen from the high-*j* panels of previous figures as exhibiting relatively high-amplitude small-scale structure. It is clear that the high-*j* steady structures must be associated with orography, especially mountain ranges. The perturbations are probably smaller than the original measurement accuracy and so probably indicate mountain-generated artifacts from the National Meteorological Center (now the National Centers for Environmental Prediction) analysis procedure (K. E. Trenberth 1997, personal communication).

### c. Wavelet coefficient phase space

To investigate whether the presence of blocking might be described by just the two coefficients *Z̃*_{1,1} and *Z̃*_{1,2}, their time series are presented in Fig. 10a. For the most part, the curves tend to stay at large magnitudes, apart from each other during blocking, and to approach each other at small magnitudes during nonblocking. As might be expected, the Eastern and Western Hemisphere wavelet modes exchange signs during the Pacific blocking relative to the Atlantic blocking.

The evolution of *Z̃*_{1,k}(*t*), *k* = 1, 2, can also be described in a dynamical phase space; see Fig. 10b. Ordering the quadrants in the standard way indicated (Shapiro 1977), it may be seen that Atlantic blocking is roughly characterized by clockwise orbits, *Z̃*_{1,1} > 0 leading *Z̃*_{1,2} < 0, mostly in quadrant II, while Pacific blocking is characterized by counterclockwise orbits, *Z̃*_{1,1} < 0 lagging *Z̃*_{1,2}, in quadrants III and IV. Nonblocking points tend to cluster nearer to the origin in all quadrants. The empty regions enclosed by the orbits suggest centers of instability, although either longer time series or projection onto higher-dimensional phase space, or both, would be required to further support this. If such diagnostic qualities were robust for more data, they could be useful in predicting blocking, an outstanding problem in extended-range weather prediction.

## 5. Conclusions

Orthonormal wavelet analysis (OWA) yields the scale separation, or bandpass filtering, of Fourier analysis, while retaining spatial information. The combined accuracy of retained information is limited by the uncertainty principle. The examples in this paper, whether from images, explicit functions, or observed geopotential height, illustrate the greater efficiency of OWA over Fourier analysis, in many cases.

The author offers this tutorial and examples, including the guide (Table 1) to the use of freely available software, with three purposes in mind:

to supplement the well-established material reviewing wavelet analysis of time series (e.g., Lau and Weng 1995; Torrence and Compo 1998) with techniques that are suitable for analysis of spatial structures;

to help a broader part of the meteorological community to go beyond the limitations of pure location or wavenumber representations; and

to illustrate, using atmospheric data, the advantages and complexities that await the user of these OWA techniques, in regard to physical interpretation.

## Acknowledgments

I thank A. B. Davis, A. R. Lupo, and an anonymous reviewer for much useful input; B. Saltzman, R. B. Smith, K. R. Sreenivasan, R. R. Coifman, A. R. Hansen, P. D. M. Parker, J. C. van den Berg, and J. D. Berner for comments on earlier versions; and the National Center for Atmospheric Research for providing the observational data. This material is based upon work supported by the National Science Foundation under Grant 9420011 and through NCAR Project 36211014.

## REFERENCES

### APPENDIX A

#### Periodization

The periodization (e.g., Wickerhauser 1994) of the orthonormal wavelet basis functions *ψ*_{j,k} on the real line is given in Table 2, where the periodization operator ( )^{per} acts on a function *A* defined on the real line as

and *λ*_{ε} is a reference origin longitude that may be chosen to optimally align the OWA to the analyzed function (cf. section 2g).

### APPENDIX B

#### Significance of OWA for Turbulence Modelers

All of the graphical depictions of (2.3a), (2.3b) in section 2c(2) illustrate another property of OWA that has attracted researchers in turbulence: a fractal-like self-similarity. Notice in Fig. 4c how the scaling function reproduces itself as a linear combination of smaller copies of itself (2.3b). A related linear combination generates the mother wavelet in Fig. 4d (2.3a). Turbulence cascade modelers have invoked less explicit (i.e., only statistical) self-similarity in order to obtain the observed scaling in turbulence, based in part on Richardson’s (1922) cascade hypothesis. For instance, the ratio of turbulent kinetic energy dissipation on two scales (wavenumbers 2^{j}*n,* 2^{j−1}*n*) is modeled by a probability distribution that, if independent of *j,* yields (statistically) self-similar cascades (Sreenivasan 1991). The continuous Fourier transform of (2.3b) reveals a ratio of values at wavenumbers 2^{j}*n*′, 2^{j−1}*n*′ equal to 2^{−1/2}〈*w*^{⊥}F∗(2^{j}*πn*′)〉_{n} (discussed further by F3). Although this ratio is deterministic for a given wavelet, the fact that wavelets are generated by such a process has encouraged their application to turbulence studies, such as the CWT studies of Arneodo et al. (1999). Wavelets are explicitly self-similar by (2.2) and generated by a scaling process (2.3a), akin to a cascade, and have been conclusively shown to yield theoretical or numerical advantage when applied to the Navier–Stokes equations that govern turbulence (e.g., Federbush 1993; Schneider et al. 1997). Wavelets’ efficient representation of intermittent, fractal, or multifractal structures such as those observed in turbulent signals is another indication of their relevance to high Reynolds number flows. See Nelkin (1994) and Frisch (1995) for a review of turbulence theory and empirical results. Farge (1992) and Farge et al. (1999) review the application of wavelet analysis and synthesis to turbulence.

### APPENDIX C

#### Formal Comparison of Fourier and Wavelet Expansions

It is interesting to compare the completely expanded Fourier and SOWA expressions for *f*★(*x*), that is,

The top form is an expansion in amplitudes |*f̂*_{n}| times scalings by a factor *n* of special shifts by −arg*f̂*_{n} of the mother wave cos(2*πx*) (Chui 1992); the bottom form is an expansion in amplitudes 2^{j/2}*f̃*_{j,k} times scalings by a factor 2^{j} of general shifts by −2^{j}ℓ + *k* − 1 of the mother wavelet *W*(*x*). Thus in both cases scale is introduced as a multiplier of the argument *x,* and clearly the linear index *n* affords more scale resolution than the exponential index *j.* However, because of the global structure of cos, the individual shifts (phases) arg*f̂*_{n} do not localize structures in *x,* but must be taken all together to do so. Because *W* is CS, each individual location index *k* does localize structures in *λ,* although it can be seen in Fig. 5a that for low *j* the periodization (A.1) slightly diffuses this localization.

### APPENDIX D

#### Relation of the *ε*_{j} to the Choice of *λ*_{ε}

The *ε*_{j} are related to the choice of *λ*_{ε} and the shift equivariance (limited to a given resolution *j*) of the basis functions *W*_{j,k} as follows. Let

be a shift by *λ*_{ε}, where is used to represent *λ*_{ε} on the grid (2.1) as

The limiting cases correspond to = (0, . . . , 0) and = (1, . . . , 1), respectively. It has been shown (e.g., Nason and Silverman 1995) that the coefficients *f̃*_{j,k} and *f*_{j,k} are shifted in *k* by ℓ_{ε} with the “detail bits” *ε*_{j′≥j} set to zero, that is, by

from the coefficients and computed with = (0, . . . , 0). (Examples are provided by F1.) A corollary generalizes Shensa’s (1992) Eq. (3.6): for any ,

precisely expressing the shift variance of OWA. Ultimately this implies that there are only (*J* − *R*)2^{J} different *f̃*_{j,k}, and 2^{JsgnR} different *f*_{R,k}, for all possible , that is, all possible *λ*_{ε}.

## Footnotes

* Current affiliation: Department of Meteorology, College of Computer, Mathematical, and Physical Sciences, University of Maryland, College Park, Maryland.

*Corresponding author address:* Aimé Fournier, National Center for Atmospheric Research, Boulder, CO 80307-3000.

Email: fournier@ucar.edu