## 1. Introduction

Wind scatterometry is a widely used technique for measuring global ocean surface winds from space. Current operational applications include assimilation into global models for numerical weather prediction like that of the European Centre for Medium-Range Weather Forecasts (ECMWF) (Hersbach 2007) and detection of tropical and extratropical hurricane force cyclones for marine nowcasting (Sienkiewicz et al. 2007). Table 1 gives an overview of present operational scatterometers.

Scatterometers measure the radar cross section of the ocean surface. A geophysical model function (GMF) gives the radar cross section as a function of the wind vector at 10-m anemometer height, incidence angle, azimuth angle, radar frequency, and polarization (Wentz and Smith 1999; Hersbach et al. 2007). Numerical inversion of the GMF yields the scatterometer wind measurement. Because of the nature of radar backscatter from the ocean surface, this procedure generally yields more than one solution. These multiple solutions are referred to as ambiguities. If the scatterometer observations are to be assimilated in a numerical weather prediction (NWP) model, the ambiguities and their a priori probabilities can be fed into the variational data assimilation scheme of that model to be combined with other observations (Stoffelen and Anderson 1997). If, on the other hand, the scatterometer observations are intended as a stand-alone information source for nowcasting, it is necessary to select the solution that is most likely the correct one. This is done in the ambiguity removal (AR) step.

A number of ambiguity removal methods have been proposed. These methods can be divided into three groups: the naïve methods, the spatial filters, and the variational methods. Naïve methods are the first-rank method that selects the solution with the highest a priori probability and the closest-to-background method that selects the solution closest to a model prediction (background wind field). More sophisticated AR schemes are based on spatial filtering (see, e.g., Cavanié and Offiler 1986; Graham et al. 1989; Cavanié and Lecomte 1987; Stoffelen and Anderson 1997; Stiles et al. 2002).
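The two naïve methods can be sketched in a few lines. This is a minimal illustration and not operational code: the `Ambiguity` container, the function names, and the sample numbers are assumptions made for the example.

```python
from math import hypot
from typing import NamedTuple, Sequence


class Ambiguity(NamedTuple):
    """One wind solution from GMF inversion (hypothetical container)."""
    u: float      # zonal wind component (m/s)
    v: float      # meridional wind component (m/s)
    prob: float   # a priori probability of being the correct solution


def first_rank(ambiguities: Sequence[Ambiguity]) -> Ambiguity:
    """Select the solution with the highest a priori probability."""
    return max(ambiguities, key=lambda a: a.prob)


def closest_to_background(ambiguities, bg_u, bg_v):
    """Select the solution with the smallest vector distance to the background wind."""
    return min(ambiguities, key=lambda a: hypot(a.u - bg_u, a.v - bg_v))


# Illustrative values only: two nearly opposite solutions in one cell.
cell = [Ambiguity(8.0, 2.0, 0.55), Ambiguity(-7.5, -2.5, 0.45)]
picked_by_rank = first_rank(cell)
picked_by_background = closest_to_background(cell, bg_u=-6.0, bg_v=-1.0)
```

With these made-up numbers the two methods select different solutions, which is the kind of conflict the spatial and variational schemes are meant to resolve.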

The ambiguity removal problem can also be solved in two steps following a variational approach. The first step requires availability of a model prediction of the wind field (background). An analysis wind field is constructed from the observations and the background by minimizing a cost function, which may contain constraints on smoothness, statistical consistency, physical consistency, etc. In the second step, the solution closest to the analysis is selected (so such methods may as well be referred to as closest to analysis).

The Variational Ambiguity Removal for the Scatterometer Online Processing (VARscat) algorithm was developed for processing scatterometer measurements (Roquet and Ratier 1988; Leru 1999) and to improve the operational scheme used at the Institut Français de Recherche pour l’Exploitation de la Mer (IFREMER) (Quilfen and Cavanié 1991). It is a variational method minimizing a heuristic cost function. Another variational method is the successive corrections ambiguity removal (SCAR) developed at the Norwegian Meteorological Institute (DNMI). Hoffman et al. (2003) present a two-dimensional variational method with a cost function consisting of seven terms for filtering and dynamical consistency. This method can also take measured radar cross sections as input, so that the inversion is included in the method. It compares well to a median filter ambiguity removal technique when applied to data from the National Aeronautics and Space Administration (NASA) Scatterometer (NSCAT), as shown by Henderson et al. (2003).

In this paper, we present a two-dimensional variational ambiguity removal technique (2DVAR) developed at the Royal Netherlands Meteorological Institute (KNMI) from the mid-1990s onward. 2DVAR is already used in present operational wind products disseminated by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) through the Ocean and Sea Ice Satellite Application Facility (OSI SAF; see http://www.knmi.nl/scatterometer/osisaf). It provides a simplified framework to test improvements to more complete three- and four-dimensional variational data assimilations (3D- or 4DVAR) of ambiguous scatterometer data. 2DVAR may further be used to process winds from the forthcoming Indian and Chinese scatterometers (e.g., to aid in marine and coastal warnings).

Portabella and Stoffelen (2004) have shown that the distance between a scatterometer observation and a corresponding ambiguous solution on the GMF can be related to an a priori probability for that particular ambiguity being the correct solution. 2DVAR takes these a priori probabilities as well as the known error characteristics of observations and background into account. Therefore it leads to wind fields that are not only spatially consistent and meteorologically balanced, but also statistically consistent: an ambiguity with high a priori probability is more likely to be selected than one with low probability. The main differences of 2DVAR with respect to other similar methods are that

- the cost function contains two terms: an observational and a background term;
- minimization is performed in spectral space, thus optimizing all spatial scales simultaneously;
- the problem is preconditioned, so inversion of the background error correlation matrix is trivial;
- the a priori probabilities of the ambiguities are properly accounted for in the observation term of the cost function.

The observation geometry of SeaWinds changes along the swath. In the nadir part, this leads to broad minima when inverting the GMF (see, e.g., Fig. 1 from Hoffman et al. 2003). The minima are no longer good representations of the ambiguities, resulting in considerable noise in the final wind solution. The multisolution scheme (MSS) retains the local wind vector probability density function after inversion, rather than only a limited number of ambiguous solutions at the local minima (Portabella 2002). This yields a better representation of the ambiguities, and 2DVAR in combination with the MSS effectively reduces the noise in the SeaWinds measurements. The noise level is estimated quantitatively for each wind vector cell (WVC) by extrapolating the autocorrelation function. Without MSS the background has little effect, but with MSS it gets more weight. It will be shown here that this can best be mitigated by switching off variational quality control. Good results are obtained for a hurricane force cyclone in the northern Pacific with wind speeds over 40 m s^{−1}. The scatterometer measurements at 25-km resolution compare better to buoy observations than those at 100 km, while this is reversed for comparison with the ECMWF prediction. This shows that 2DVAR retains small-scale information from SeaWinds measurements that is present in buoy observations but absent in the ECMWF model.

The aim of the paper is twofold: presentation of the 2DVAR method and investigation of its behavior and sensitivity to changes in the underlying error model. The 2DVAR method is described in section 2, but more details can be found in Vogelzang (2007). Section 3 describes two tests for the correctness of the current 2DVAR implementation: the single observation test and the so-called edge analysis. Section 4 contains the statistical analysis of a dataset consisting of one month of SeaWinds data. Section 5 contains two case studies about the effect of the parameters in the 2DVAR error model. The paper ends with the conclusions in section 6.

## 2. 2DVAR

### a. Formulation of the problem

The basic idea behind 2DVAR is first to combine the scatterometer observations and a model prediction (background) in a weighted field (analysis), and then to select the local ambiguous solution that lies closest to the analysis. Such a procedure, basically following the approach of Daley (1991), requires knowledge of the error characteristics of observations (in terms of error variances) and background (in terms of full error covariances). The observation error has been treated by Stoffelen (1998). Moreover, Portabella and Stoffelen (2004) show how local scatterometer wind vector ambiguities can be assigned an a priori probability based on their distance to the GMF (inversion residual). The error characteristics of the background are known and monitored on a routine basis at centers for numerical weather prediction (NWP; see http://www.nwpsaf.org).

2DVAR operates on a so-called batch grid that encompasses a set of scatterometer measurements. The batch grid has its *x* axis perpendicular to the satellite-moving direction and its *y* axis parallel to it. The wind vector components perpendicular and parallel to the satellite direction are denoted by *t* and *l*, respectively. The local rotation angle of the 2DVAR batch grid can be found with sufficient precision from the known positions of the WVCs (Vogelzang 2006b).

Suppose that the scatterometer observations yield a set of ambiguous wind vector solutions **v**_{o}^{k} with ambiguity index *k*. Suppose also that the background information is contained in a state vector **x**_{b}. The analysis state vector **x** minimizes the cost function

$$J(\mathbf{x}) = J_o(\mathbf{x}) + J_b(\mathbf{x}) \tag{1}$$

with *J*_{o} as the observational term and *J*_{b} as the background term. For each scatterometer observation the background field is assumed to be known at the same position and time, if necessary from interpolation. Note that the situation here is opposite to that of assimilating data into a numerical weather model: here the abundant observations have the largest weight and define the grid on which the analysis is made.

Increments *δ***x** are used rather than the state vector **x** itself (incremental formulation):

$$\delta\mathbf{x} = \mathbf{x} - \mathbf{x}_b \tag{2}$$

### b. Definition of the cost function

In the observational term of the cost function, Eq. (3), (*i*, *j*) are the indices of the batch grid cell, *N*_{1} and *N*_{2} the number of batch grid cells perpendicular and parallel to the satellite-moving direction, respectively, and *M*_{ij} the number of ambiguities in cell (*i*, *j*). Further, *t*_{ij} and *l*_{ij} stand for the analysis wind components at cell (*i*, *j*) perpendicular and parallel to the satellite-moving direction, respectively. Similarly, the observed wind components of each ambiguity in cell (*i*, *j*) are taken perpendicular and parallel to the satellite-moving direction. *J*_{o} remains unchanged when the wind components are replaced by their increments. In (3), *σ*_{t} and *σ*_{l} stand for the expected standard deviation of the error in the scatterometer wind components. Both for SeaWinds and the Advanced Scatterometer (ASCAT), *σ*_{t} = *σ*_{l} = 1.8 m s^{−1}. The parameter *λ* is an empirical parameter that weights the different ambiguities. It gives optimal separation between multiple solutions for *λ* = 4. The reader is referred to Stoffelen and Anderson (1997) for the rationale behind the form of (3). Note that in the limit *λ* → ∞ only the smallest term contributes to the summation over *k*, making the expression act as an analytical “if statement.”

The inversion and quality control procedures give *P*_{k}, the a priori probability of ambiguity number *k* being the correct solution, through Eq. (4) (Portabella and Stoffelen 2004). In that expression, *R*_{MLE} is the distance from the ambiguity on the GMF to the scatterometer measurement in observation space, and the normalization factor *N* guarantees that the sum over the a priori probabilities of all ambiguities in a WVC equals one.
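The role of *λ* can be illustrated numerically. The sketch below assumes that the contribution of one cell combines positive per-ambiguity costs *K*_{k} as (Σ_{k} *K*_{k}^{−λ})^{−1/λ}, a form that is consistent with the limiting behavior described above. This form is an assumption made for illustration and is not a transcription of Eq. (3); the function name and the sample values are hypothetical.

```python
def combined_cost(costs, lam):
    """Combine positive per-ambiguity costs K_k as (sum_k K_k**-lam)**(-1/lam).

    ASSUMPTION: this functional form is chosen for illustration only.
    """
    if any(k <= 0.0 for k in costs):
        raise ValueError("per-ambiguity costs must be positive")
    return sum(k ** -lam for k in costs) ** (-1.0 / lam)


costs = [2.0, 5.0, 9.0]           # hypothetical costs of three ambiguities
soft = combined_cost(costs, 4)    # lambda = 4: smooth blend dominated by the minimum
hard = combined_cost(costs, 200)  # large lambda: effectively min(costs)
```

For *λ* = 4 the result lies slightly below the smallest cost, so the other ambiguities still influence the value and its gradient; for large *λ* the expression approaches the minimum and behaves like the "if statement" mentioned in the text.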

The background term of the cost function, Eq. (5), contains 𝗕_{t,l} as the matrix of background wind error covariances, the subscripts indicating that it is defined in terms of the wind components *t* and *l*. The superscript T indicates the transpose of a vector or matrix. Note that the transpose suffices here since *δ***x** is a real vector. In the general case the Hermitian conjugate (complex conjugate of the transpose) should be taken. Evaluation of (5) requires inversion of 𝗕_{t,l}, which may be time consuming since it is not diagonal. To circumvent this the background cost function is transformed to the spatial frequency domain with a Fourier transformation 𝗙 and is expressed in terms of velocity potential and streamfunction increments, *δχ̂* and *δψ̂*, using an inverse Helmholtz transformation 𝗛^{−1}. Introducing *δ***ξ** = 𝗛^{−1}𝗙*δ***x** as the transformed state vector, the cost function takes the form of Eq. (6), in which the background error covariance matrix is written as 𝗕_{χ̂,ψ̂} = Σ*^{T}𝗣Σ (Daley 1991). The submatrices Σ_{χ̂} and Σ_{ψ̂} of Σ are diagonal with the error standard deviations *σ*_{χ̂} and *σ*_{ψ̂} as components, while the submatrices *P*_{χ̂χ̂} and *P*_{ψ̂ψ̂} of 𝗣 contain the autocorrelations *ρ*_{χ̂χ̂} and *ρ*_{ψ̂ψ̂}, respectively. These are treated in more detail in section 2d. Since Σ and 𝗣 are real, it is possible to condition the problem by making the square root decomposition 𝗕_{χ̂,ψ̂} = Σ^{T}𝗣^{1/2}𝗣^{1/2}Σ and defining the preconditioned state vector as *δ***ς** = Σ^{−1}𝗣^{−1/2}*δ***ξ**. The relation between *δ***ς** and *δ***x** is now given by the conditioning transformation and its inverse,

$$\delta\boldsymbol{\varsigma} = \boldsymbol{\Sigma}^{-1}\mathsf{P}^{-1/2}\mathsf{H}^{-1}\mathsf{F}\,\delta\mathbf{x} \tag{8a}$$

$$\delta\mathbf{x} = \mathsf{F}^{-1}\mathsf{H}\,\mathsf{P}^{1/2}\boldsymbol{\Sigma}\,\delta\boldsymbol{\varsigma} \tag{8b}$$

This transformation reduces the background error covariance matrix of *δ***ς** to the identity matrix, and the background part of the cost function takes the simple form of Eq. (9).
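The effect of the preconditioning can be checked numerically in the simplest setting. The sketch below assumes that Σ and 𝗣 are both diagonal and real, so that each can be stored as a list of diagonal entries, and it uses a real vector for brevity. The numbers are arbitrary, the variable names are hypothetical, and the quadratic form is written without any overall constant factor.

```python
from math import sqrt, isclose

sigma = [1.5, 0.8, 2.0]      # diagonal of Sigma (error standard deviations)
corr = [1.0, 0.6, 0.25]      # diagonal of P (spectral autocorrelations, > 0)
xi = [0.3, -1.2, 0.7]        # transformed state vector increment

# Quadratic form xi^T B^{-1} xi with B = Sigma P^{1/2} P^{1/2} Sigma.
b_diag = [s * p * s for s, p in zip(sigma, corr)]
cost_direct = sum(x * x / b for x, b in zip(xi, b_diag))

# Preconditioned vector: zeta = Sigma^{-1} P^{-1/2} xi.
zeta = [x / (s * sqrt(p)) for x, s, p in zip(xi, sigma, corr)]
cost_preconditioned = sum(z * z for z in zeta)
```

Both expressions give the same value, but the second one needs no matrix inversion: in the preconditioned variable the background term is a plain sum of squares.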

### c. Minimization and gradient

The minimization is done with a limited-memory quasi-Newton routine named LBFGS written by J. Nocedal (Liu and Nocedal 1989). This routine proves to be fast and accurate. In 2DVAR, good results are obtained with an initial step size of 30*J*|**∇***J*|^{−1}, with *J* being the total initial cost function and **∇***J* being its gradient with respect to the control vector components. A typical batch requires fewer than 100 function evaluations to converge. See Vogelzang (2007) for detailed information. Note that 2DVAR uses the same minimization algorithm as Hoffman et al. (2003).

The minimization starts with *δ***ς** = 0, so the initial analysis equals the background, and uses the gradient of the cost function with respect to the control variables in *δ***ς**. For the background part the gradient follows directly from (9). For the observational part it is obtained by differentiating *J*_{o} with respect to *t* and *l*, packing the derivatives into a state vector, and transforming this state vector back to the spectral domain using the adjoint (complex conjugate of the transpose) of the inverse conditioning transformation.

At this point it should be remarked that the control vector used in the actual minimization procedure is not necessarily equal to the state vector (Hoffman et al. 2003). In the spatial domain, the state vector *δ***x** is real with 2*N*_{1}*N*_{2} components and equals the control vector. In the spectral domain, the state vector *δ***ς** is complex and has twice as many components. However, only components with nonnegative spatial frequency are independent (Press et al. 1988). The number of independent components remains 2*N*_{1}*N*_{2}, but an additional packing/unpacking transformation is needed to go from state vectors to control vectors and vice versa in the spectral domain.
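The counting argument can be verified on a one-dimensional analogue. The sketch below uses a plain discrete Fourier transform of an arbitrary real sequence of even length; it only illustrates the symmetry that makes the negative-frequency components redundant and is not the packing routine used in 2DVAR.

```python
import cmath


def dft(signal):
    """Plain O(N^2) discrete Fourier transform of a sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * m / n) for m, x in enumerate(signal))
        for k in range(n)
    ]


real_signal = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 3.0]  # arbitrary values
spectrum = dft(real_signal)
n = len(real_signal)

# Hermitian symmetry: the coefficient at frequency -k (index n - k) is the
# complex conjugate of the one at +k, so it carries no new information.
max_mismatch = max(
    abs(spectrum[n - k] - spectrum[k].conjugate()) for k in range(1, n)
)

# Independent real numbers for even n: frequencies 0 and n/2 are purely real
# (1 each); frequencies 1 .. n/2 - 1 are complex (2 each).
independent_reals = 2 + 2 * (n // 2 - 1)
```

The complex spectrum holds 2*n* real numbers, but only *n* of them are independent, which matches the number of real values in the spatial domain.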

### d. Error model

The background error autocorrelations, *ρ*_{χχ} and *ρ*_{ψψ}, are modeled as Gaussian functions following Daley (1991), Eq. (11). In the error model, *ν*^{2} sets the partition between the rotational and the divergent contribution to the wind field (*ν* = 0 corresponds to a purely rotational and *ν* = 1 to a purely divergent wind field), and *R*_{ψ} and *R*_{χ} are the correlation lengths, that is, the length scales that determine the extent of the error correlations. These parameters have a physical meaning and cannot be varied arbitrarily. Moreover, they are not independent, since the impact on the analysis is determined by the ratio of the error standard deviation and the correlation length (de Vries and Stoffelen 2000). The scaling parameters *L*_{ψ} and *L*_{χ} in (11) are defined as *L*_{ψ}^{2} = ½*R*_{ψ}^{2} and *L*_{χ}^{2} = ½*R*_{χ}^{2}.

The background error correlation model is readily Fourier transformed, either numerically or analytically, to the spectral domain where it remains Gaussian. The default values for the parameters were found by de Vries and Stoffelen (2000) and are listed in Table 2. These values were obtained after the intercomparison of 2DVAR with other variational methods mentioned before. The correlation length in the tropics is higher than in the extratropics to account for the general large-scale convective circulation structures around the equator. In the extratropics the circulation is more rotational, as reflected in the smaller value for *ν*^{2}.
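That a Gaussian correlation stays Gaussian in the spectral domain follows from a standard integral. As a worked example, assume an isotropic correlation of the form exp(−*r*²/2*L*²) on an unbounded plane. The normalization of the exponent and the Fourier convention are assumptions made here and may differ from those used in (11):

$$\hat{\rho}(\mathbf{k}) = \int_{\mathbb{R}^2} e^{-r^2/(2L^2)}\, e^{-i\,\mathbf{k}\cdot\mathbf{r}}\, \mathrm{d}^2 r = 2\pi L^2\, e^{-L^2 k^2/2}$$

The spectral width is proportional to 1/*L*, so a longer correlation length concentrates the background error at lower spatial frequencies.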

By choosing the incremental approach, the large-scale circulation patterns in the analysis are determined by the background while the small-scale patterns are fitted by the autocorrelation functions (11). Their form in terms of wind vectors is shown in section 3a.

### e. Ambiguity selection and variational quality control

A gross error probability *P*_{GE} is assumed over a finite domain with width *D* such that the a priori probabilities are modified according to Eq. (13) (Anderson and Järvinen 1999; Ingleby and Lorenc 1993), with *P*_{GE} = 0.0075 and *D* = 4. The gross error probability imposes a minimum value on the a priori probability, which implies that beyond a certain threshold the magnitude of *R*_{MLE} no longer matters. Equation (13) is often used as variational quality control to reduce the weight of observations inconsistent with the current estimate of the analysis. As will become clear in section 5, this is not a desirable property in 2DVAR.

In the present 2DVAR formulation the variational quality control (VQC) flag is set for each WVC where the contribution to *J*_{o} exceeds the threshold value of 12. This happens for WVCs where the a priori probability of each ambiguity equals the gross error probability.

## 3. Tests

In this section two tests are described that demonstrate the correctness of the current 2DVAR implementation. The first one is the single observation test. It involves only one observation and the minimization problem can be solved analytically. This test shows that all normalizations are correct. The second test is the edge analysis, showing that the analysis increments do not suffer from severe over- or underfitting.

### a. Single observation test

Suppose that in grid cell (*i*, *j*) there is one observation (*t*_{o}, *l*_{o}). Starting with zero background and analysis increments, the only contribution to the cost function and its gradient originates from this observation. This contribution follows from (6), with *σ*_{o} = *σ*_{t} = *σ*_{l}. Now the 2DVAR problem reduces to an optimal interpolation problem (Daley 1991), in whose solution *σ*_{b} = *σ*_{χ} = *σ*_{ψ} is the standard deviation of the background error. At the solution, the gradient of the total cost function should be zero, since the total cost function is minimal there; this condition leads to Eq. (18). The parameters *ν*, *R*_{χ}, and *R*_{ψ} do not enter (18) explicitly. However, they do appear in the full expression for the analysis wind field (Vogelzang 2007).

Figure 1 shows the resulting wind fields for (*t*_{o}, *l*_{o}) equal to (0, 1) m s^{−1} and *ν* equal to zero (purely rotational) or one (purely divergent). The observation is located in the center of the grid, *x* = *y* = 1600 km. The correlation lengths *R*_{ψ} and *R*_{χ} are both equal to 300 km. The standard deviation of both the observation error and the background error was set equal to 1.8 m s^{−1}. The wind speed at the observation point should equal half of the observed speed. This is satisfied with accuracy better than 2 × 10^{−5}. Figure 1 shows the impact of one observation on 2DVAR. The circulation patterns in Fig. 1 form the basic building blocks from which 2DVAR constructs its analysis increments.
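The expected value at the observation point can be reproduced with the textbook scalar optimal-interpolation weight. The sketch below is not the 2DVAR code: it covers only the analysis value at the observation location, not the spatial spreading shown in Fig. 1, and the function name is hypothetical.

```python
def analysis_at_observation(obs, sigma_o, sigma_b, background=0.0):
    """Scalar optimal-interpolation analysis at the observation location."""
    weight = sigma_b**2 / (sigma_b**2 + sigma_o**2)
    return background + weight * (obs - background)


# Values from the test case: (t_o, l_o) = (0, 1) m/s, both errors 1.8 m/s.
t_a = analysis_at_observation(0.0, sigma_o=1.8, sigma_b=1.8)
l_a = analysis_at_observation(1.0, sigma_o=1.8, sigma_b=1.8)
```

With equal observation and background errors the weight is one half, so the analysis wind at the observation point is half the observed wind, as stated in the text.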

### b. Edge analysis

A potential danger in making an analysis is over- or underfitting. Moreover, the analysis should go to zero at the edge of the batch grid in order to prevent numerical problems in the fast Fourier transform (FFT) used for the conditioning transformation and its inverse [Eqs. (8a) and (8b)]. Figure 2 shows the extreme (minimum and maximum) values of the analysis increments perpendicular and parallel to the satellite direction. The curves in Fig. 2 were obtained with the SeaWinds Data Processor (SDP) version 1.5 without applying the MSS at 25-km resolution with the ECMWF wind field as background, using all SeaWinds data from December 2004 (see section 4) and the default 2DVAR parameter settings listed in Table 2. Since the correlation length of the background error differs for the tropics (latitude between 20°N and 20°S) and the extratropics (latitude larger than 20°N or smaller than 20°S), the curves in Fig. 2 have been separated accordingly. The 2DVAR parameters have no seasonal dependence.

The batch grid has a width of 32 points. With a cell size of 100 km, the batch grid is 3200 km wide. The width of the free zone around the observations is defined as five cells or 500 km. Since the SeaWinds swath is 1800 km wide, there are four cells remaining. These are inserted at the right-hand side of the batch grid as an additional free zone. SDP version 1.5 does not process the outer swath, so there is an extra strip of 200 km without observations at each side. Therefore, the region with observations across the batch grid, marked with the vertical black dashed lines in Fig. 2, extends from cell 8 (*x* = 700 km) to cell 21 (*x* = 2000 km).
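The bookkeeping of the batch grid can be retraced step by step. The sketch below uses only the numbers quoted above; the constant names are hypothetical, and the convention that cell 1 sits at *x* = 0 is an assumption inferred from the quoted positions of cells 8 and 21.

```python
CELL_KM = 100
GRID_CELLS = 32
FREE_ZONE_CELLS = 5           # free zone on each side of the swath
SWATH_KM = 1800               # full SeaWinds swath
OUTER_STRIP_KM = 200          # outer swath on each side, not processed by SDP 1.5

grid_width_km = GRID_CELLS * CELL_KM
swath_cells = SWATH_KM // CELL_KM
# Cells left over after two free zones and the swath; added on the right-hand side.
spare_cells = GRID_CELLS - 2 * FREE_ZONE_CELLS - swath_cells

outer_cells = OUTER_STRIP_KM // CELL_KM
first_obs_cell = FREE_ZONE_CELLS + outer_cells + 1      # 1-based index
last_obs_cell = FREE_ZONE_CELLS + swath_cells - outer_cells


def cell_x_km(cell):
    """x coordinate of a 1-based cell, ASSUMING cell 1 sits at x = 0."""
    return (cell - 1) * CELL_KM
```

Evaluating `cell_x_km(first_obs_cell)` and `cell_x_km(last_obs_cell)` gives 700 and 2000 km, the limits marked in Fig. 2.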

Figure 2 shows no signs of over- or underfitting: the curves are not too smooth nor too wildly varying. The analysis increments go to zero at the edges, faster in the extratropics than in the tropics. This is according to expectation, since the background error correlation length is 300 km in the extratropics and 600 km in the tropics. The free zone should be 2 or 3 times larger than the background error correlation length, so it is large enough in the extratropics but a bit tight in the tropics. Application of the MSS and/or using the National Centers for Environmental Prediction (NCEP) wind field as background yields similar results and the same conclusion. In the next section the edge analysis is revisited.

## 4. Statistical validation

### a. Introduction

In this section, 2DVAR is tested statistically using SeaWinds data from December 2004 or January 2008. The datasets contain all complete orbits that started in this period. The National Oceanic and Atmospheric Administration (NOAA) Binary Universal Form for the Representation of Meteorological Data (BUFR) files were processed with SDP version 1.5 using as background the NCEP model wind field (which is available in the NOAA BUFR product) or the ECMWF wind field. The NCEP model wind field is a 24-h forecast of the 1000-mb wind; the NOAA BUFR files do not contain a wind forecast at 10-m anemometer height. The ECMWF wind field is a 3–15-h forecast of the wind speed at 10-m anemometer height, and is therefore expected to compare better with the scatterometer winds, which are also at 10 m.

Processing was done with and without application of the MSS. To investigate how the analysis increments behave at the edges of the batch grid, the analysis increments were stored.

As SeaWinds is a rotating fanbeam scatterometer, its observation geometry varies across the swath. At 25-km resolution there are 76 WVCs, and the swath is divided in three parts: the outer swath (WVC 1–10 and 67–76), the midswath or “sweet” swath (WVC 11–30 and 47–66), and the nadir swath (WVC 31–46).
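The division of the swath translates directly into a lookup on the WVC index. The function below is a small sketch using the index ranges quoted above; its name and the returned labels are hypothetical.

```python
def swath_region(wvc: int) -> str:
    """Classify a 25-km SeaWinds WVC index (1..76) by swath region."""
    if not 1 <= wvc <= 76:
        raise ValueError("WVC index must be in 1..76 at 25-km resolution")
    if wvc <= 10 or wvc >= 67:
        return "outer"
    if 31 <= wvc <= 46:
        return "nadir"
    return "sweet"


# Count how many WVCs fall in each region across the full swath.
counts = {}
for wvc in range(1, 77):
    region = swath_region(wvc)
    counts[region] = counts.get(region, 0) + 1
```

Across the 76 cells this gives 20 outer, 40 sweet, and 16 nadir WVCs.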

### b. Comparison with model winds

The datasets described above were intercompared by calculating the statistics of the differences in the zonal wind component, *u*, and the meridional wind component, *υ*. Starting with the NCEP wind field as background, SDP was applied with and without MSS. The differences with the ECMWF field were calculated, and the results are shown in Table 3 as the standard deviations of the differences in the zonal wind component, *σ*_{u}, and meridional wind component, *σ*_{υ}, respectively. Model wind vectors were only considered when the associated scatterometer wind vectors were valid in order to prevent contamination of the NCEP and ECMWF model wind vector comparison by land pixels.

Table 3 shows that without MSS the SDP result compares worse with the ECMWF model than the NCEP background that was used as the starting point. Most of the difference is in the zonal wind component *u*, which contains most of the observation noise as will be shown below. With MSS the SDP result lies closer to the ECMWF model than the NCEP background. The standard deviation of the error with respect to the ECMWF model reduces by [( … )^{2} − (1.52)^{2}]^{1/2} m s^{−1} for *u* and by [( … )^{2} − (1.50)^{2}]^{1/2} m s^{−1} for *υ*. This shows that 2DVAR retrieves useful information from the scatterometer observations.

Table 4 shows the effect of the background. SDP was run with and without MSS using either the NCEP field or the ECMWF field as background. The resulting wind fields were compared with the ECMWF field. Without MSS, the choice of background has little effect, indicating that the background only determines the large-scale circulation. However, the background influence becomes larger when MSS is applied. Now the 2DVAR method has more freedom in selecting the optimal wind vector, and the background field becomes more important.

The results in Tables 3 and 4 were obtained for those wind vectors for which the VQC flag (see section 2e) was not set. The number of valid vectors may therefore differ slightly for each of the sets, between 18.8 and 19.2 million, a variation of 2% that may influence the statistics. However, inspection of all histograms of the wind differences revealed that their distributions are well behaved without significant outliers.

### c. Noise estimation

Figure 3 shows the autocorrelation of the zonal wind component *u* at 25-km resolution obtained from the ECMWF field and from the SDP results with and without MSS. The left-hand panel shows the full curves, the right-hand panel an enlargement at short distances. The autocorrelation at zero distance equals one by definition. The SDP result without MSS (solid curve) shows a clear discontinuity at short distances, while the SDP result with MSS (dashed curve) and the ECMWF result (dotted–dashed curve) approach one continuously. This discontinuity is caused by an uncorrelated noise component adding only variance. The size of the discontinuity can be estimated by extrapolating the curve to zero distance (dotted curve). The extrapolated curve crosses the *y* axis at 1 − *a* rather than 1 for some value of *a* between 0 and 1. Following the approach of Hollingsworth and Lönnberg (1989) it is easily shown that the standard deviation of the noise, *σ*_{n}, satisfies

$$\sigma_n^2 = a\,\sigma_s^2$$

where *σ*_{s} is the standard deviation of the total signal (Vogelzang 2006a).

Figure 4 shows the standard deviation of the noise for the SDP wind components at 25- and 50-km resolution processed without MSS. At coarser resolutions the noise level reduces and the extrapolation distance increases, leading to larger uncertainties in the noise estimate. The extrapolation may even overshoot the autocorrelation, leading to an extrapolated autocorrelation larger than 1 at *x* = 0 and, hence, a negative noise variance estimate. This happens at 50-km resolution in the midswath and at 100-km resolution all over the swath. Such points have been excluded from Fig. 4.

Figure 4 shows that the noise level decreases as the resolution becomes coarser. At 100-km resolution the noise estimates are invalid, indicating negligible noise contribution. At 25-km resolution the standard deviation of the noise may exceed 1 m s^{−1} for *υ* and 1.5 m s^{−1} for *u*. These figures agree well with the overall reduction in standard deviation of the difference between the scatterometer winds and the ECMWF background when switching on MSS as presented in the previous section. When MSS is applied the noise component disappears and no valid noise estimates are obtained.
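The noise estimate can be written as a short function. The sketch assumes that the noise is uncorrelated with the signal and between cells, so that an extrapolated autocorrelation of 1 − *a* at zero distance implies a noise variance of *a* times the total signal variance. The function name and the input values are hypothetical.

```python
from math import sqrt
from typing import Optional


def noise_std(sigma_signal: float, intercept: float) -> Optional[float]:
    """Noise standard deviation from the extrapolated autocorrelation at zero distance.

    With intercept = 1 - a, the uncorrelated noise variance is a * sigma_signal**2.
    Returns None when the extrapolation overshoots (intercept > 1), because the
    implied noise variance would be negative and the estimate is invalid.
    """
    a = 1.0 - intercept
    if a < 0.0:
        return None
    return sqrt(a) * sigma_signal


valid = noise_std(sigma_signal=5.0, intercept=0.91)      # hypothetical inputs
overshoot = noise_std(sigma_signal=5.0, intercept=1.02)  # extrapolation above 1
```

Returning `None` for an overshooting extrapolation mirrors the exclusion of such points from Fig. 4.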

### d. Analysis statistics

Some statistics of the analysis increments were already presented in section 3b to show that 2DVAR exhibits no signs of severe overfitting and that the free edge is sufficiently large in the extratropics but rather tight in the tropics. Figure 5 shows the results for the average absolute value of the analysis increment across the 2DVAR batch grid. This figure shows that in all cases the average absolute analysis increment is smaller when applying MSS (dashed curves). This is no surprise, since 2DVAR is expected to find a solution with reasonable probability not too far from the background in this case. Without MSS (dotted curve) the ambiguities are farther away from each other, leading to larger analysis increments. A second reason lies in the more noisy character of the scatterometer winds without MSS.

Figure 5 also shows that the average absolute analysis increments in the tropics (left panels) do not approach zero properly, again indicating that the free edge is rather tight. In the extratropics (right panels) the free edge is sufficiently large. Note also that the average absolute analysis increments are higher with the NCEP wind field as background (bottom panels) than those with the ECMWF wind field as background (top panels). When using the NCEP background in the extratropics, *δt* shows some mild signs of overfitting around *x* = 400 and *x* = 2400 km.

### e. Selection probabilities

As mentioned before, 2DVAR takes the a priori probabilities of the ambiguities into account [see Eq. (4)]. Figure 6 shows distributions of the conditional probability *p*(Sel|*p*_{MLE} = *P*) that an ambiguity with a priori probability *P* is selected by 2DVAR given the probability that the a priori probability has this value. The condition is important because there are many ambiguities with small probability, especially with MSS, and very few with large probability. Since all distributions in Fig. 6 are normalized to 1, they may be interpreted as probability density functions. The left panel of Fig. 6 shows the results without MSS. The selection probabilities of 2DVAR (solid curve) and closest to background (dotted–dashed line) are very similar and lie close to perfect statistical consistency, *p*(Sel|*p*_{MLE} = *P*) ∝ *P* (dashed curve). The first rank result (dotted curve) deviates most from statistical consistency, because an ambiguity with a priori probability larger than 0.5 is automatically selected. The right panel of Fig. 6 shows the results with MSS. Because there are almost no ambiguities with a priori probability larger than 0.5, the range of *P* is restricted to 0.45. The histograms are renormalized to 1 in this restricted interval, causing the first rank result to lie close to perfect statistical consistency. Again 2DVAR and closest to background are similar, but 2DVAR chooses more ambiguities with a priori probability larger than 0.15 than closest to background, showing that 2DVAR indeed uses the probability information. For *P* > 0.35 the results become unreliable due to lack of data.

### f. Comparison with buoy measurements

The results of SDP were compared with all reliable observations by moored buoys in January 2008. About 140 buoys are considered here, located mainly in the tropical oceans and along the coast of North America and Europe. A total number of 3057 collocated observations was found. A buoy observation is considered as collocated if the recording time differs less than 30 min and the distance between buoy and WVC center is less than the WVC size divided by

Table 5 shows the results for SDP at 25-km resolution (with MSS, to remove observation noise) and at 100-km resolution (without MSS). Also, the comparison with the ECMWF prediction is included. The 25-km SDP product compares better with the buoys than with the ECMWF prediction, while for the 100-km SDP product it is just the other way around. This shows that the 25-km product reveals details that are present in the buoy measurements but not in the ECMWF model. These details are averaged out in the SDP 100-km product, which therefore compares better with the model.

## 5. Case studies

In the previous sections it was shown that 2DVAR performs well in a statistical sense. In combination with MSS it suppresses the noise component in SeaWinds data at high resolution, but at the cost of increased background influence. In this section, two cases will be studied in more detail in order to support the conclusions drawn from the previous section and to gain more insight in the role of the parameters in the error model, since they control the relative balance between observations and background in 2DVAR.

The first case is an observation of a cyclone with a strong front in the southern Pacific on 6 August 2006. There is a mismatch in position of the cyclone and the front between the observations and the NCEP background. The second case concerns a severe cyclone of hurricane force intensity over the northern Pacific on 30 December 2004 with wind speeds over 40 m s^{−1}. The effect of changing 2DVAR settings will be illustrated.

### a. Case Pacific cyclone and front

Figure 7 shows the NOAA result for SeaWinds measurements recorded on 6 August 2006 in the Pacific Ocean off the coast of Chile. A deep low pressure area located approximately at 45°S and 80°W is accompanied by an extended frontal area on its northern and northeastern side. The front has an irregular shape around 75°W in Fig. 7 where a large number of cells have their rain flag set (orange arrows). To the north of the front a few erroneous wind vectors can be seen. This shape is not present in the NCEP model field shown in Fig. 8. Moreover, the NCEP model locates the front more to the south (the grid point 30°S, 80°W is a suitable reference) and the center of the cyclone more to the west. In view of the abundant observational evidence, the correct positions of the front and the center of the low are those given by the NOAA results in Fig. 7. Note that the sharp front in Fig. 7 is accompanied by areas with homogeneous winds on both sides. This is an artifact of the AR method applied by NOAA.

The SDP wind field without MSS is shown in Fig. 9. The wind field is noisy and the eastern part of the front is not very clearly visible because many points there are flagged as rain points and therefore rejected for further processing by SDP. The VQC flag (see section 2e) is set in a number of WVCs along the front and near the center of the cyclone (purple arrows). The location of the cyclone agrees with the NOAA result (Fig. 7), while the location of the front agrees with the NCEP background (Fig. 8). Note the strong convergence in the region 30°–35°S, 75°–80°W. This structure seems unrealistic.

Figure 10 shows the result when MSS is applied. The wind field is now smooth, also north of the front, because the noise has been filtered out. The front appears more regular and extends farther eastward. No WVCs are flagged in the frontal zone. Southwest of the front line some wavy structures appear in the wind field. The convergence in the region 30°–35°S, 75°–80°W has disappeared, and the center of the cyclone has moved slightly to the west, indicating larger influence from the background.

To study the sensitivity of 2DVAR to its error model parameters, attention is focused on a 5° × 5° region around the cyclone. Figure 11 shows some results with various changes in the parameter settings relative to the standard values of Table 2. As a reference, the bottom right panel of Fig. 11 shows the result for the standard 2DVAR settings. The top panels show the effect of changing the background error standard deviation to 1 m s^{−1} (top left) or 3 m s^{−1} (top right). Decreasing the background error increases the background influence, and the center of the low moves slightly to the west. Increasing the background error has only a minor effect. The middle panels of Fig. 11 show the effect of the background error correlation length. Increasing the correlation length to 350 km (middle left) again increases the background influence, pushing the center of the low to the west. Decreasing the correlation length to 250 km (middle right) has relatively small impact. The bottom left panel of Fig. 11 shows the best result, obtained without gross error probability. This increases the impact of the observations and leaves the center of the low at the right position.
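The direction of the sensitivity to the background error can be illustrated with a scalar analogue of a variational analysis. The sketch below assumes a single observation, no ambiguities, and no spatial correlations, so it is far simpler than 2DVAR; the observation error of 2 m s^{−1} is a hypothetical value chosen only for the example.

```python
def observation_weight(sigma_b, sigma_o):
    """Weight of (observation - background) in a scalar analysis.

    sigma_b : background error standard deviation
    sigma_o : observation error standard deviation
    """
    return sigma_b ** 2 / (sigma_b ** 2 + sigma_o ** 2)


def scalar_analysis(background, observation, sigma_b, sigma_o):
    """Minimum-variance combination of one background and one observation."""
    weight = observation_weight(sigma_b, sigma_o)
    return background + weight * (observation - background)


SIGMA_O = 2.0  # hypothetical observation error (m/s), not taken from Table 2

# A smaller background error gives a smaller observation weight,
# i.e., a larger background influence.
weights = {sigma_b: observation_weight(sigma_b, SIGMA_O)
           for sigma_b in (1.0, 2.0, 3.0)}
```

In this simplified setting the observation weight rises from 0.2 to about 0.69 as the background error standard deviation goes from 1 to 3 m s^{−1}. The scalar model shows only the direction of the effect; it does not reproduce the magnitudes seen in Fig. 11.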

### b. Case Pacific cyclone

On 30 December 2004, a hurricane force cyclone raging over the northern Pacific was observed by SeaWinds. Figure 12 shows the SDP result with ECMWF background and MSS applied to reduce the noise. WVCs flagged as contaminated by rain have been left out of Fig. 12, while VQC flagged wind vectors are drawn in purple as before. Observations and background give the same position for the center of the cyclone, but the observed wind speeds around the center are higher than the modeled ones, especially to the south of the center where the VQC flags are set in Fig. 12. This is no surprise: it is well known that NWP models tend to underestimate the strongest winds in cyclones of hurricane force intensity.

Figure 13 shows the center of the cyclone. Now the VQC flagged wind vectors are in black, while the other arrow colors indicate the wind speed range. The left panel of Fig. 13 shows the same results as Fig. 12, but the right panel is obtained with the gross error probabilities set to zero. With the standard values for the GEPs (0.0075 for all WVCs; left panel of Fig. 13), some wind directions south of the center obviously do not fit into the overall circulation pattern. With GEPs equal to zero, 2DVAR is forced to fit the local wind vector solutions, and as a consequence the black arrows now fit well in the overall circulation pattern. The wind speeds south of the center of the cyclone exceed 40 m s^{−1}.

The GEPs effectively impose a maximum on the observation part of the cost function. In cases where ambiguities and background are far apart, the a priori probabilities become constant and 2DVAR can only distinguish the ambiguities by their distance to the background. As a result, 2DVAR starts behaving like closest to background and will select the ambiguity with its direction closest to the background (in MSS all ambiguities in a WVC have similar speeds). This is what happens in the left panel of Fig. 13. When the GEPs are set to zero, the a priori probabilities have more weight in 2DVAR’s selection process, resulting in selections that may deviate more from the background as in the right panel of Fig. 13. 2DVAR now selects ambiguities with high a priori probability that fit better into the general circulation pattern and have somewhat higher speed. The VQC flag is set more often because of the large difference with the background (note that the VQC flag is set when the observation cost in a WVC exceeds a threshold value of 12).
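The saturation caused by a nonzero gross error probability can be made concrete with a generic variational quality control cost, in which a flat gross-error term is mixed with the Gaussian term. The sketch below is an assumed textbook-style form and is not necessarily identical to the observation cost of Eq. (4); the parameter `gamma`, which lumps together the gross error probability and the width of the flat distribution, is hypothetical.

```python
import math


def qc_cost(normal_cost, gamma):
    """Observation cost with a flat gross-error term mixed in.

    normal_cost : cost without quality control (Gaussian part)
    gamma       : relative weight of the flat gross-error term;
                  gamma = 0 switches the quality control off
    """
    if gamma == 0.0:
        return normal_cost
    return -math.log((gamma + math.exp(-normal_cost)) / (gamma + 1.0))


def qc_cost_ceiling(gamma):
    """Limit of qc_cost for very large normal_cost (gamma > 0)."""
    return math.log(1.0 + 1.0 / gamma)


# For a large misfit the cost saturates when gamma > 0 ...
saturated = qc_cost(50.0, gamma=0.01)
# ... but keeps growing with the misfit when gamma = 0.
unbounded = qc_cost(50.0, gamma=0.0)
```

Once the cost has saturated, its gradient with respect to the analysis vanishes, so distant ambiguities no longer pull the analysis toward them. With `gamma` set to zero the full misfit is retained, and the observations keep their influence even far from the background.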

The cases shown here demonstrate that variational quality control does not necessarily identify wrong selections. It may just as well indicate errors in the background. In either case, flagged WVCs should be handled with care.

### c. Summary

Figure 4 and Tables 3 and 4 show that SDP results without MSS at 25-km resolution contain noise, notably in the azimuth part of the swath. Additional information is needed in order to arrive at reasonable wind estimates. For 2DVAR this consists of the background and the structure functions depicted in Fig. 1. Application of 2DVAR in combination with MSS filters out the noise, as shown by the results in Tables 3 and 4 and by the fact that no valid error estimate can be produced from autocorrelation plots like Fig. 3. Comparison with buoy observations in Table 5 shows that wind information at small scales (smaller than that of the background) is not filtered out but successfully retrieved. This cannot be caused by the background and is therefore due to the structure functions.

A disadvantage of MSS is an increased dependency on information from the background and the structure functions compared to traditional GMF inversion schemes. In some cases this may lead to errors on larger scales, as shown by the case studies (Figs. 11, 13). Such errors may be suppressed by relying more on the scatterometer observations, in particular by relaxing the variational quality control.

## 6. Conclusions

In this paper a new method for ambiguity removal named 2DVAR is presented and applied to SeaWinds data. 2DVAR constructs an incremental analysis from the background and the observations, taking the a priori probabilities of the latter into account. The minimization problem is fully conditioned and solved in the spectral representation of streamfunction and wind potential. The selected ambiguous solution is the one closest to the analysis. The present implementation satisfies the single observation test, a nontrivial case with analytical solution, and shows no clear signs of under- or oversampling.
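The final selection step can be sketched in a few lines. The example assumes that "closest" is measured by the magnitude of the vector difference between an ambiguity and the analysis wind in the same WVC; the function name and the sample winds are hypothetical.

```python
import math


def closest_to_analysis(ambiguities, analysis):
    """Index of the ambiguity with the smallest vector difference.

    ambiguities : list of (u, v) wind components in one WVC
    analysis    : (u, v) of the analysis wind in the same WVC
    """
    u_a, v_a = analysis
    return min(range(len(ambiguities)),
               key=lambda i: math.hypot(ambiguities[i][0] - u_a,
                                        ambiguities[i][1] - v_a))


# Three ambiguities of 5 m/s with different directions.
ambiguities = [(5.0, 0.0), (-5.0, 0.0), (0.0, 5.0)]
selected = closest_to_analysis(ambiguities, analysis=(4.0, 1.0))
```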

Without MSS, the choice of background has little effect, indicating that the background determines only the large-scale structure of the wind field. With MSS, 2DVAR proves effective in removing the noise in SeaWinds data at 25-km resolution. Especially in the nadir part of the SeaWinds swath, MSS allows 2DVAR to choose from more solutions with comparable a priori probability. As a result, MSS here increases the influence of neighboring WVCs and the background. The latter is not a desirable property when the background is in error. The noise level can be estimated using a simple and robust method based on extrapolation of the autocorrelation to zero distance. For SeaWinds at 25-km resolution, the noise standard deviation exceeds 1.5 m s^{−1} in the zonal wind component *u* and 1.0 m s^{−1} in the meridional component *υ*. Comparison with buoy measurements shows that 2DVAR with MSS at 25 km reveals details that are not visible in the models.
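The idea behind the noise estimate can be sketched as follows. White noise contributes to the autocovariance only at zero lag, so the autocovariance at small nonzero lags, extrapolated back to lag zero, approximates the variance of the true signal. The difference from the total variance is then the noise variance. The fit used here (linear in the squared lag, over lags 1 to 5) and the synthetic series are assumptions of this example and may differ from the procedure applied to the SeaWinds data.

```python
import math
import random


def autocovariance(values, lag):
    """Sample autocovariance of a 1-D series at a non-negative lag."""
    n = len(values)
    mean = sum(values) / n
    return sum((values[i] - mean) * (values[i + lag] - mean)
               for i in range(n - lag)) / (n - lag)


def noise_std_from_autocovariance(values, lags=(1, 2, 3, 4, 5)):
    """Estimate the white-noise standard deviation of a series.

    Assumption of this sketch: the signal autocovariance is fitted as
    c(k) = a + b * k**2 over the given lags, and `a` is taken as the
    signal variance at zero lag.
    """
    x = [k * k for k in lags]
    y = [autocovariance(values, k) for k in lags]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    signal_variance = y_mean - slope * x_mean  # extrapolation to lag 0
    noise_variance = autocovariance(values, 0) - signal_variance
    return math.sqrt(max(noise_variance, 0.0))


# Synthetic series: smooth signal plus white noise with std 1.0.
rng = random.Random(0)
series = [5.0 * math.sin(2.0 * math.pi * i / 200.0) + rng.gauss(0.0, 1.0)
          for i in range(4000)]
estimated_noise_std = noise_std_from_autocovariance(series)
```

For this synthetic series the estimate should lie close to the prescribed noise level of 1.0. The method relies on the noise being uncorrelated between neighboring samples.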

The influence of the background can be controlled by the parameters of the error model in 2DVAR. It is decreased by increasing the background error variance, decreasing the observation error variance, decreasing the background error correlation length, or decreasing the gross error probabilities. These parameters are not independent and have a physical meaning, so they cannot be varied arbitrarily. The best results are obtained when switching off the gross error probabilities, as shown by the two case studies. Good results are obtained even for a hurricane force cyclone with winds exceeding 40 m s^{−1}.

Further experimentation with 2DVAR to optimize these settings is recommended since it is likely that optimum settings depend on the meteorological situation. The current 2DVAR implementation is rather rigid with respect to the size and dimension of the grid on which the analysis increments are calculated. This can be improved by implementation of a mixed-radix Fourier transform. 2DVAR is used to retrieve the operational scatterometer wind products disseminated by EUMETSAT through the OSI SAF.

## Acknowledgments

The authors wish to thank their colleagues from KNMI, Jos de Kloe, Marcos Portabella, Jeroen Verspeek, and Gerrit Burgers, for their interest in this work, stimulating discussion, and helpful advice on software issues. This work has been funded by EUMETSAT in the context of the NWP SAF and OSI SAF parts of the Satellite Application Facility Network. The SDP software including 2DVAR can be obtained free of charge from the NWP SAF Web site (http://www.nwpsaf.org).

## REFERENCES

Anderson, E., and H. Järvinen, 1999: Variational quality control. *Quart. J. Roy. Meteor. Soc.*, **125**, 697–722.

Cavanié, A., and D. Offiler, 1986: ERS wind scatterometer wind extraction and ambiguity removal. *Proc. IGARSS ’86: Today’s Solutions for Tomorrow’s Information Needs*, Zurich, Switzerland, European Space Agency, 395–398.

Cavanié, A., and P. Lecomte, 1987: Study of a method to dealias winds from ERS-1 data. Vol. 1, ESA Contract 6874/87/CP-I(sc), European Space Agency, 53 pp.

Daley, R., 1991: *Atmospheric Data Analysis*. Cambridge University Press, Cambridge, 472 pp.

de Vries, J. C. W., and A. C. M. Stoffelen, 2000: 2D variational ambiguity removal. Tech. Rep. 226, Royal Netherlands Meteorological Institute (KNMI), 66 pp.

Errico, R. M., 1997: What is an adjoint model? *Bull. Amer. Meteor. Soc.*, **78**, 2577–2591.

Giering, R., and T. Kaminski, 1998: Recipes for adjoint code construction. *ACM Trans. Math. Software*, **24**, 437–474.

Graham, R., D. Anderson, A. Hollingsworth, and H. Böttger, 1989: Evaluation of ERS-1 wind extraction and ambiguity removal algorithms: Meteorological and statistical evaluation. ECMWF Rep., 147 pp.

Henderson, J. M., R. N. Hoffman, S. M. Leidner, R. Atlas, E. Brin, and J. V. Ardizzone, 2003: A comparison of a two-dimensional variational analysis method and a median filter for NSCAT ambiguity removal. *J. Geophys. Res.*, **108**, 3176, doi:10.1029/2002JC001307.

Hersbach, H., 2007: The preparation of the assimilation of ASCAT scatterometer data at ECMWF. *Joint 2007 EUMETSAT Meteorological Satellite Conf. and 15th Conf. on Satellite Meteorology and Oceanography*, Amsterdam, Netherlands, EUMETSAT/Amer. Meteor. Soc.

Hersbach, H., A. Stoffelen, and S. de Haan, 2007: An improved C-band scatterometer ocean geophysical model function: CMOD5. *J. Geophys. Res.*, **112**, C03006, doi:10.1029/2006JC003743.

Hoffmann, R. N., S. M. Leidner, J. M. Henderson, R. Atlas, J. V. Ardizzone, and S. C. Bloom, 2003: A two-dimensional variational analysis method for NSCAT ambiguity removal: Methodology, sensitivity, and tuning. *J. Atmos. Oceanic Technol.*, **20**, 585–605.

Hollingsworth, A., and P. Lönnberg, 1989: The verification of objective analyses: Diagnostics of analysis system performance. *Meteor. Atmos. Phys.*, **40**, 3–27.

Ingleby, N. B., and A. C. Lorenc, 1993: Bayesian quality control using multivariate normal distributions. *Quart. J. Roy. Meteor. Soc.*, **119**, 1195–1225.

Leru, M., 1999: Inversion des measures radars diffusiomètriques d’ERS-1 et ERS-2: Etude d’une nouvelle approche basée sur une méthode variationelle. IFREMER Tech. Rep. 99-05, 46 pp.

Liu, D. C., and J. Nocedal, 1989: On the limited memory BFGS method for large optimization methods. *Math. Prog.*, **45**, 503–528.

Portabella, M., 2002: Wind field retrieval from satellite radar systems. Ph.D. thesis, University of Barcelona, 199 pp.

Portabella, M., and A. Stoffelen, 2004: A probabilistic approach for SeaWinds data assimilation. *Quart. J. Roy. Meteor. Soc.*, **130**, 1–26.

Press, W. H., B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, 1988: *Numerical Recipes in C: The Art of Scientific Computing*. Cambridge University Press, 1020 pp.

Quilfen, Y., and A. Cavanié, 1991: A high precision wind algorithm for the ERS-1 scatterometer and its validation. *Proc. 1991 IEEE Geoscience and Remote Sensing Symp.*, Espoo, Finland, International Electrical and Electronics Engineers, 873–876.

Roquet, H., and A. Ratier, 1988: Towards direct variational assimilation of scatterometer backscatter measurements into numerical weather prediction models. *Proc. 1988 IEEE Geoscience and Remote Sensing Symp.*, Edinburgh, Scotland, International Electrical and Electronics Engineers, 257–260.

Sienkiewicz, J. M., J. M. Von Ahn, G. M. McFadden, and M. Stewart, 2007: Hurricane force extratropical cyclones as detected by QuikSCAT. *Joint 2007 EUMETSAT Meteorological Satellite Conf. and 15th Conf. on Satellite Meteorology and Oceanography*, Amsterdam, Netherlands, EUMETSAT/Amer. Meteor. Soc.

Stiles, B. W., B. D. Pollard, and R. S. Dunbar, 2002: Direction interval retrieval with thresholded nudging: A method for improving the accuracy of QuikSCAT winds. *IEEE Trans. Geosci. Remote Sens.*, **40**, 79–89.

Stoffelen, A., 1998: Scatterometry. Ph.D. thesis, The University of Utrecht, The Netherlands, 199 pp.

Stoffelen, A., and D. Anderson, 1997: Ambiguity removal and assimilation of scatterometer data. *Quart. J. Roy. Meteor. Soc.*, **123**, 491–518.

Vogelzang, J., 2006a: On the quality of high resolution wind fields. Tech. Rep. NWPSAF_KN_TR_002, EUMETSAT, 53 pp.

Vogelzang, J., 2006b: The orientation of SeaWinds wind vector cells. Tech. Rep. NWPSAF_KN_TR_003, EUMETSAT, 33 pp.

Vogelzang, J., 2007: Two dimensional variational ambiguity removal. Tech. Rep. NWPSAF_KN_TR_004, EUMETSAT, 62 pp.

Wentz, F. J., and D. K. Smith, 1999: A model function for the ocean-normalized radar cross section at 14 GHz derived from NSCAT observations. *J. Geophys. Res.*, **104** (C5), 11499–11514.

Operational scatterometers (May 2008). HH and VV stand for horizontally and vertically polarized emitted and received microwave radiation, respectively.

Default 2DVAR parameters for the background error correlation model.

Std devs of the differences in the zonal wind component, *σ*_{u}, and in the meridional component, *σ*_{υ}, with respect to the ECMWF model.

Comparison of the SDP results at 25-km resolution with MSS and 100-km resolution without MSS, with buoy observations and ECMWF model predictions.