## 1. Introduction

Because of their high spatial and temporal resolution, radar observations have great potential for improving atmospheric analyses and the ensuing forecasts. Despite 30 years (Lilly 1990; Sun et al. 1991) of ongoing research, our skill in forecasting mesoscale convection has remained modest. Over continental scales, radar data assimilation was shown to improve forecasts for periods not exceeding 6–8 h (Berenguer et al. 2012; Stratman et al. 2013). Over regional scales, a forecasting system intercomparison by Wilson et al. (2010) found no individual system capable of accurately forecasting convection a few hours into the future. In a context where the resolution of operational models is regularly increased, making the best of radar observations is as pertinent as ever.

The most common framework for assimilating radar data is to combine a first guess from a previously initiated model forecast (the background) with radar observations in order to obtain an analysis. An "optimal" analysis, the one with minimum error variance, can be found by minimizing a cost function in which the contributions of the background and observation estimates are weighted by the inverse of their variances and covariances.

Because of the lack of information and limited computational resources, the covariance (or, equivalently, the correlation) of background and observation errors can only be represented in a simplified form. A nonexhaustive list of methods for doing so would include the recursive filter for representing Gaussian correlations (Purser et al. 2003) along with various expressions representing convolutions with different correlation matrices (Oliver 1995; Gaspari and Cohn 1999) or their inverse (Oliver 1998; Xu 2005; Yaremchuk and Sentchev 2012).

In convective situations, radar observations are generally available over significant portions of the analysis domain at a spatial resolution comparable to that of convection-resolving models. While the instrumental errors of radar observations are not correlated (Keeler and Ellis 2000), the representativeness errors associated with the integration of the precipitating medium within a radar volume might be. Because radar integration is reflectivity weighted, gradients in the intensity of precipitation will be a source of errors (Zawadzki 1973). And because precipitation possesses a scaling structure (Fabry 1996), these errors may well be correlated.

Nevertheless, it is not uncommon to neglect the correlations of observation errors in radar data assimilation (see, e.g., Chung et al. 2009). To prevent the negative impact associated with this misrepresentation, it has been suggested to consider only observations sufficiently far apart so that they effectively become uncorrelated (Liu and Rabier 2002, 2003).

In this study, we investigate the impact of representing and misrepresenting the correlation of errors on the quality of analyses. Data thinning and the purposeful misrepresentation of variance to improve the quality of analyses are also examined.

Conceptually, the experiments presented here are similar to those of Liu and Rabier (2002, hereafter LR02). Several assimilation experiments are performed in an idealized context where the correlation of background and observation errors is prescribed. Both the background and observations are made available everywhere in the assimilation domain.

The problem under investigation, which differs considerably from the one examined by LR02, is specified in section 2, followed by theoretical considerations in section 3. Expressions for analysis errors are given for the cases where correlations are either entirely neglected or perfectly represented. Special attention is also given to the precision at which the standard deviation may be estimated from a sample of correlated errors. Analyses obtained with different combinations of background and observation errors are then examined in section 4.

In section 5 we compare the standard deviation of errors for analyses obtained with perfect representation of correlations to those obtained by neglecting correlations altogether. The computational costs of analyses are discussed in section 5a, followed by a short discussion, in section 5b, on cases where the correlation of errors may be neglected with little influence on the quality of analyses. We then consider analyses in which the correlation of only one term is neglected. This situation is examined first without (section 5c) then with (section 5d) data thinning.

In section 6, a few words are said on the correlation of analysis errors. Results are discussed in section 7, followed by conclusions in section 8.

## 2. Problem setup

In this section, we describe the framework within which we studied the impact of representing and misrepresenting the correlation of errors in data assimilation.

Let **x**_{t} represent the true value of a certain atmospheric variable over a discrete assimilation domain. To estimate **x**_{t} we are given a background **x**_{b} and a set of observations **y**. For simplicity, we consider the case where **x**_{b} and **y** are available in the same units and on the same grid as **x**_{t}. Background and observation errors are then given by

*ϵ*_{b} = **x**_{b} − **x**_{t},  (1)

*ϵ*_{o} = **y** − **x**_{t}.  (2)

If *ϵ*_{b} and *ϵ*_{o} are described by multivariate^{1} Gaussian distributions and are not correlated to each other, the optimal (minimum error variance with respect to **x**_{t}) analysis **x** may be found by minimizing the cost function:

*J*(**x**) = (**x** − **x**_{b})^{T}**B**^{−1}(**x** − **x**_{b}) + (**x** − **y**)^{T}**R**^{−1}(**x** − **y**),  (3)

with **B** and **R** the covariance matrices of background and observation errors, respectively.

The assimilation context we just described is much simpler than any realistic radar data assimilation. Only one variable is retrieved, we suppose that direct observations of **x**_{t} can be made, errors are unbiased, and temporal dependences (such as cycling and the influence of model equations) are not taken into account.

Even for this simple situation, it is very difficult to generalize the impact of representing correlations since assimilation domains usually contain thousands of data points, leading to large and possibly complex matrices **B** and **R**.

To reduce the dimensionality of the problem, we investigate the case where the errors *ϵ*_{b} and *ϵ*_{o} are homogeneous, isotropic, and spatially correlated following an exponential^{2} decay.

The standard deviations of background and observation errors are then given by the scalars *σ*_{b} = *σ*(*ϵ*_{b}) and *σ*_{o} = *σ*(*ϵ*_{o}).

The correlation between any two errors *ϵ*_{1} and *ϵ*_{2}, found at positions **d**_{1} and **d**_{2}, is only a function of the Euclidean distance, *γ* = ||**d**_{1} − **d**_{2}||, separating them and is described by

*ζ*(*γ*) = exp(−*γ*/*α*).  (4)

Again, we use the *b* and *o* subscripts to indicate the respective rate of decay *α* describing the correlation of background and observation errors. That is, *ζ*(*ϵ*_{b}) = *f*(*α*_{b}) and *ζ*(*ϵ*_{o}) = *f*(*α*_{o}).

The correlation of background and observation errors are then given by the scalars *α*_{b} and *α*_{o}, respectively.
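As an illustration, this exponential correlation model is straightforward to assemble numerically. The sketch below (a hypothetical helper, with grid and parameter values chosen for illustration only) builds the matrix whose entries are exp(−*γ*_{ij}/*α*) for a set of grid positions:

```python
import numpy as np

def exp_correlation_matrix(points, alpha):
    """Correlation matrix with zeta(gamma) = exp(-gamma/alpha),
    where gamma is the Euclidean distance between grid points."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    if pts.shape[0] == 1:          # 1D input given as a flat array
        pts = pts.T
    # Pairwise Euclidean distances gamma_ij = ||d_i - d_j||
    diff = pts[:, None, :] - pts[None, :, :]
    gamma = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-gamma / alpha)

# 1D grid with 1-km spacing, alpha_o = 5 km
x = np.arange(100.0)
C_o = exp_correlation_matrix(x, alpha=5.0)
```

The same function accepts 2D coordinates (one row per grid point), so it extends directly to the square domain used later.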

## 3. Theoretical background

Given the previous simplifications, only four parameters (*σ*_{b}, *α*_{b}, *σ*_{o}, and *α*_{o}) are required for a complete description of background, observation, and analysis errors on any assimilation domain. In this section, we give analytical expressions for the errors of analyses obtained with and without the representation of correlations in **B** and **R**.

Before these expressions are given, we briefly discuss the estimation of error statistics from a sample of correlated errors.

Consider

*s*(**ϵ**) = [ *n*_{eff}/(*n*_{eff} − 1) · (1/*n*) Σ_{i=1}^{n} (*ϵ*_{i} − ⟨*ϵ*⟩)^{2} ]^{1/2},  (5)

an unbiased estimator^{3} for the standard deviation of **ϵ**, a generic vector representing any of the background, observation, or analysis errors, with ⟨*ϵ*⟩ the mean of **ϵ**. In this equation, *n* is the length of **ϵ** while *n*_{eff} is the "effective sample size," the number of independent pieces of information contained in the *n* data points at our disposal (description by Thiébaux and Zwiers 1984).

Given the *n* × *n* correlation matrix **C**, representing the correlations between the different errors contained in **ϵ**, the effective sample size can be estimated using

*n*_{eff} = *n*^{2} / Σ_{i,j} *C*_{ij},  (6)

with *C*_{ij} representing individual entries in the matrix **C**.^{4} Compared to Bayley's expression, matrix notation provides a more intuitive expression for *n*_{eff}, which is also valid for nonhomogeneous correlations and easily extends to fields of more than one dimension.
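The matrix form of the effective sample size lends itself to a few lines of code. In this illustrative sketch (a 1D, 1-km grid rather than the paper's 2D domain), exponentially correlated errors carry far fewer independent pieces of information than the raw count *n* suggests:

```python
import numpy as np

def effective_sample_size(C):
    """Effective sample size n_eff = n^2 / sum_ij C_ij."""
    n = C.shape[0]
    return n**2 / C.sum()

# Uncorrelated errors (C = identity): every point is independent
n = 200
assert effective_sample_size(np.eye(n)) == n

# Exponentially correlated errors on a 1D, 1-km grid with alpha = 5 km
x = np.arange(float(n))
C = np.exp(-np.abs(np.subtract.outer(x, x)) / 5.0)
n_eff = effective_sample_size(C)   # roughly n/10 here: far fewer than n
```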

The estimate *s*(**ϵ**) may only approximate the true standard deviation *σ*(**ϵ**) to a certain precision. For different realizations of **ϵ**, *s*(**ϵ**) will vary about *σ*(**ϵ**). Throughout this article, we refer to this variability as the "sampling noise," represented as *σ*(*s*(**ϵ**)) or *σ*(*s*) for short, where parentheses represent "function of."

For Gaussian errors, the variance of the sample variance is related to the correlation of errors through

*σ*^{2}(*s*^{2}) = 2*σ*^{4}/*υ*_{eff},  (7)

where *υ*_{eff} is another effective sample size, which has been referred to as the number of "effective degrees of freedom" (Zieba 2010).^{5} Given **C**, *υ*_{eff} may be estimated exactly. An approximation to this formula may be obtained by keeping only the leading order terms (*n*^{3} in the numerator and *n*^{4} in the denominator) to obtain

*υ*_{eff} ≈ *n*^{2} / Σ_{i,j} *C*_{ij}^{2},  (8)

which was found to provide a good approximation^{6} to the exact formula for large *n* and decorrelation length scales smaller than the size of the fields under study.

In practice, we are interested in *σ*(*s*) rather than *σ*^{2}(*s*^{2}) given in Eq. (7). Using the "delta method" (see, e.g., appendix 1 of Hosmer et al. 2008) one obtains the following approximation for the variance of a function of a random variable:

*σ*^{2}(*f*(*s*)) ≈ [*f*′(*E*[*s*])]^{2} *σ*^{2}(*s*).  (9)

By letting *f*(*s*) = *s*^{2}, and noting that *E*[*s*] = *σ* in our case, we obtain

*σ*^{2}(*s*^{2}) ≈ 4*σ*^{2}*σ*^{2}(*s*).  (10)

Substituting the left-hand side of this expression by the one provided in Eq. (7), we obtain

*σ*(*s*) ≈ *σ*/(2*υ*_{eff})^{1/2},  (11)

where *σ*(*s*) is the theoretical precision at which *σ*(**ϵ**) can be estimated using Eq. (5). For Gaussian distributions, it is expected that *s*(**ϵ**) = *σ*(**ϵ**) ± 2*σ*(*s*) 95% of the time.

Given *N* realizations of **ϵ** with the same error statistics, the true sampling noise *σ*(*s*) can be estimated using

*s*(*s*) = [ (1/(*N* − 1)) Σ_{k=1}^{N} (*s*_{k} − ⟨*s*⟩)^{2} ]^{1/2},  (12)

with

⟨*s*⟩ = (1/*N*) Σ_{k=1}^{N} *s*_{k}.  (13)

While the errors in *ϵ*_{b} and *ϵ*_{o} are spatially correlated, the errors between different realizations of *ϵ*_{b} and *ϵ*_{o} are not. In Eq. (12), *s*(*s*) can then be estimated with the “traditional” formula for estimating the standard deviation.
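The prediction *σ*(*s*) ≈ *σ*/(2*υ*_{eff})^{1/2} can be checked with a small Monte Carlo sketch. The setup below is illustrative (a 1D grid, with correlated Gaussian noise generated through a Cholesky factorization, a generic alternative to the autoregressive and convolution methods used later in the text):

```python
import numpy as np

rng = np.random.default_rng(42)
n, N = 200, 2000                     # grid points per realization, realizations
sigma, alpha = 2.5, 5.0              # error std dev (m/s) and decay rate (km)

# Exponential correlation matrix on a 1D, 1-km grid
x = np.arange(float(n))
C = np.exp(-np.abs(np.subtract.outer(x, x)) / alpha)

# Correlated Gaussian errors via Cholesky factorization of C
L = np.linalg.cholesky(C)
eps = sigma * (L @ rng.standard_normal((n, N)))   # each column: one realization

# Spread of the estimated standard deviations = empirical sampling noise
s = eps.std(axis=0, ddof=1)
noise_mc = s.std(ddof=1)

# Theoretical sampling noise sigma/sqrt(2 nu_eff), nu_eff = n^2 / sum C_ij^2
nu_eff = n**2 / (C**2).sum()
noise_th = sigma / np.sqrt(2.0 * nu_eff)
```

With these parameters the two estimates agree to within a few percent, the residual difference reflecting the leading-order approximation and edge effects.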

To assess the impact of representing the correlation of errors, we compare analyses obtained by neglecting the correlation of errors (using the diagonal matrices **B**_{diag} and **R**_{diag}) to analyses obtained by perfectly representing the correlation of errors (using the true **B** and **R**).

### a. Analyses obtained by neglecting the correlation of errors

When the correlations of *ϵ*_{b} and *ϵ*_{o} are neglected, the analysis **x** that minimizes the cost function [Eq. (3)] is a weighted average between **x**_{b} and **y**:

**x**_{avg} = (*σ*_{o}^{2}**x**_{b} + *σ*_{b}^{2}**y**)/(*σ*_{b}^{2} + *σ*_{o}^{2}).  (14)

These analyses and their error statistics will be identified with the "avg" subscript. They can be obtained at very low computational costs but are not optimal following assimilation theory.

The standard deviation of the errors of these analyses,

*σ*_{avg} = *σ*_{b}*σ*_{o}/(*σ*_{b}^{2} + *σ*_{o}^{2})^{1/2},  (15)

is independent of the correlations of *ϵ*_{b} and *ϵ*_{o}.

The correlation of the errors *ϵ*_{avg} is described by the weighted average of the correlation of background and observation errors:

*ζ*(*ϵ*_{avg}) = [*σ*_{o}^{2}*ζ*(*ϵ*_{b}) + *σ*_{b}^{2}*ζ*(*ϵ*_{o})]/(*σ*_{b}^{2} + *σ*_{o}^{2}).  (16)

This result can be inferred by considering that the spectral density of a weighted average between two fields is itself a weighted average of their respective spectral densities. The correlation of analysis errors, which is the Fourier transform of the spectral density, is thus also a weighted average of the correlation of errors in *ϵ*_{b} and *ϵ*_{o}.
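A minimal numerical check of the weighted-average analysis, assuming (as in the experiments of section 4) *σ*_{b} = *σ*_{o} = 2.5 m s^{−1}:

```python
import numpy as np

sigma_b = sigma_o = 2.5  # m/s, background and observation error std dev

# Weights of the variance-only (diagonal B and R) analysis
w_b = sigma_o**2 / (sigma_b**2 + sigma_o**2)
w_o = sigma_b**2 / (sigma_b**2 + sigma_o**2)

def x_avg(x_b, y):
    """Weighted-average analysis obtained when all correlations are neglected."""
    return w_b * x_b + w_o * y

# Error std dev of the average of two independent, equal-variance estimates
sigma_avg = sigma_b * sigma_o / np.sqrt(sigma_b**2 + sigma_o**2)
print(round(sigma_avg, 2))  # -> 1.77
```

The 1.77 m s^{−1} value recurs in the figures discussed in section 5.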

### b. Analyses obtained by perfectly representing the correlation of errors

Analyses obtained by perfectly representing the correlation of errors in **B** and **R** minimize the variance of errors with respect to **x**_{t}. We identify these analyses using the "optim" subscript. Here **A**_{optim}, the variance–covariance matrix of analysis errors, can be obtained from

**A**_{optim} = (**B**^{−1} + **R**^{−1})^{−1}  (17)

[see Tarantola (2005), or any book discussing the variational method]. The analysis itself can be determined using

**x**_{optim} = **A**_{optim}(**B**^{−1}**x**_{b} + **R**^{−1}**y**).  (18)

Derivation for this solution is given in appendix A.

In general, the errors *ϵ*_{optim} are not exponentially correlated. In this case, one may compute **A**_{optim} numerically from Eq. (17) and estimate *σ*_{optim} using

*σ*_{optim} = [tr(**A**_{optim})/*n*]^{1/2}.  (19)

The functional form for the correlation of analysis errors, *ζ*(*ϵ*_{optim}), may be obtained from the correlation matrix of analysis errors,

*C*_{ij} = *A*_{ij}/(*A*_{ii}*A*_{jj})^{1/2},  (20)

by averaging its entries over all pairs of points separated by a given distance *γ*:

*ζ*(*ϵ*_{optim})(*γ*) = ⟨*C*_{ij}⟩_{||**d**_{i} − **d**_{j}|| = *γ*}.  (21)
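These expressions translate directly into a small numerical sketch. The setup below is illustrative (a 1D domain and arbitrary decay rates rather than the paper's 2D grid); the final comparison reflects the fact that the optimal analysis cannot have a larger error variance than the weighted average:

```python
import numpy as np

n, sigma_b, sigma_o = 80, 2.5, 2.5
x = np.arange(float(n))                    # 1D grid, 1-km spacing
gamma = np.abs(np.subtract.outer(x, x))    # pairwise distances (km)

B = sigma_b**2 * np.exp(-gamma / 10.0)     # alpha_b = 10 km
R = sigma_o**2 * np.exp(-gamma / 5.0)      # alpha_o = 5 km

Bi, Ri = np.linalg.inv(B), np.linalg.inv(R)
A_optim = np.linalg.inv(Bi + Ri)           # Eq. (17): analysis-error covariance
sigma_optim = np.sqrt(np.trace(A_optim) / n)

def x_optim(x_b, y):
    """Minimum-variance analysis with fully represented correlations."""
    return A_optim @ (Bi @ x_b + Ri @ y)

sigma_avg = sigma_b * sigma_o / np.sqrt(sigma_b**2 + sigma_o**2)
```

On larger domains the explicit inversions become the dominant cost, which foreshadows the computational discussion of section 5a.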

## 4. Methodology

In the previous sections, we have shown that the standard deviation, correlation, and sampling noise of analysis errors could be predicted from the errors of **x**_{b} and **y** (determined by *α*_{b}, *σ*_{b}, *α*_{o}, and *σ*_{o}), and the domain size (from which we get *n*).

The principal objective of this article is to assess the impact of representing and misrepresenting correlations in idealized assimilation experiments. To do so, we need to show the dependence of analysis errors on the five aforementioned parameters; this is something that cannot easily be represented in one or even a few plots.

Being mostly interested in the impacts of representing the correlation of errors, we chose to set *σ*_{b} = *σ*_{o} = 2.5 m s^{−1}, a plausible value for the errors of horizontal winds in **x**_{b} as well as Doppler velocity. By setting an equal value for the standard deviation of errors for the two sources of information, any difference in analysis errors will be attributable to correlations.

The analogy with radar data assimilation is useful as a reference point for determining the context in which experiments are performed and the general nature of errors to be tested. However, the experiments conducted here may be representative of any situation where two estimates with correlated errors are combined into an analysis.

The assimilation domain we chose for our experiments consisted of a 2D grid of 100 × 100 points representing a square domain with a side of 100 km at a resolution of 1 km. This fixes *n* = 10 000.

We wish to describe analysis errors for a wide variety of combinations of *α*_{b} and *α*_{o}. We thus fixed the decay rate for the correlation of observation errors to *α*_{o} = 5 km, a value that involves nonnegligible spatial correlations and yet is comparable to the typical length scales of wind field features in convection. The variable *α*_{b} was allowed to vary between 0 and 100 km. Note that because the errors of **x**_{b} and **y** were of the same nature, the same results would have been achieved by fixing *α*_{b} and allowing *α*_{o} to vary.

### Verification of assimilation system

Having derived theoretical expressions for analysis errors in the above, the impact of representing correlations could have been assessed without the actual computation of analyses. Analyses were nevertheless computed for verification purposes. By comparing the errors estimated from analyses to those expected from theory, we could verify that our assimilation system and verification procedure were free of errors. Having validated our assimilation system, we could then be confident in the results of experiments (shown in sections 5c and 5d) for which only numerical computations were available.

Here is the four-step procedure by which our assimilation system was tested:

1. For given values of *α*_{b} and *σ*_{b}, generate exponentially correlated noise and add it to a predefined truth **x**_{t} to obtain **x**_{b}. Using the same procedure, obtain simulated observations **y** from *α*_{o} and *σ*_{o}. Over 1D domains, exponentially correlated noise can be obtained through a first-order autoregressive process (Ward 2002). In 2D and 3D, exponentially correlated noise can be obtained by convolving fields of Gaussian white noise with the kernels provided in Tables 1 and 2 of Oliver (1995). The choice of **x**_{t} is unimportant since this "truth" is to be removed from **x** to obtain analysis errors [see Eqs. (22)–(23)].
2. Combine **x**_{b} and **y** to obtain an analysis **x**. Two different analyses were computed: **x**_{avg} was obtained by use of Eq. (14), and **x**_{optim} was obtained by minimizing the cost function [Eq. (3)] using the conjugate-gradient algorithm. Knowledge of **B**^{−1} and **R**^{−1} is required for the computation of the cost function and its gradient. Several authors have discussed the sparse nature of inverse exponential matrices. Analytical formulations for the inverse of exponential correlation matrices are provided by Tarantola (2005) in 1D and 3D. Approximations for inverse exponential correlations can also be obtained by use of Taylor expansions in spectral space, as demonstrated by Oliver (1998) and Xu (2005). These expressions, however, require special care near boundaries. For the experiments presented here, on a relatively small 2D domain with background and observation estimates available everywhere, direct numerical inversion was the simplest way to obtain **B**^{−1} and **R**^{−1}.
3. Estimate the standard deviation and correlation of analysis errors from

   *ϵ*_{avg} = **x**_{avg} − **x**_{t},  (22)

   *ϵ*_{optim} = **x**_{optim} − **x**_{t}.  (23)
4. Repeat steps 1 to 3 *N* times and obtain ⟨*s*(*ϵ*_{avg})⟩ and ⟨*s*(*ϵ*_{optim})⟩ using Eq. (13). As *N* increases, ⟨*s*(*ϵ*_{avg})⟩ should converge to *σ*_{avg} and ⟨*s*(*ϵ*_{optim})⟩ to *σ*_{optim}. The magnitude of the sampling noise *s*(*s*(*ϵ*_{avg})) and *s*(*s*(*ϵ*_{optim})) can also be estimated using Eq. (12). As *N* increases, *s*(*s*(*ϵ*_{avg})) should converge to *σ*(*s*(*ϵ*_{avg})) and *s*(*s*(*ϵ*_{optim})) to *σ*(*s*(*ϵ*_{optim})).

## 5. Impact of representing correlations on the standard deviation of analysis errors

In Fig. 1, we show the standard deviation of background, observation, and analysis errors as a function of *α*_{b}. The errors expected from theory are displayed by use of solid lines while the errors obtained from numerical estimations are indicated by use of dots and color shadings.

Observation errors *ϵ*_{o} are represented in light blue. Here *σ*(*ϵ*_{o}), the standard deviation of observation errors, was set to 2.5 m s^{−1} and is indicated by the horizontal dashed line [which also indicates the standard deviation of background errors *σ*(*ϵ*_{b}), in gray]. Thin blue lines indicating the sampling noise expected from Eq. (11) are plotted 2*σ*(*s*(*ϵ*_{o})) ≈ 0.2 m s^{−1} above and below *σ*(*ϵ*_{o}). These lines indicate the range within which we expect the estimated standard deviation of errors *s*(*ϵ*_{o}) to lie 95% of the time. Because *σ*_{o} and *α*_{o} were fixed, the expected sampling noise for observation errors does not vary.

We can verify that the simulated observations **y** had the expected error statistics by considering the blue dots, each indicating the estimated sampling noise *s*(*s*(*ϵ*_{o})) above and below the mean estimated standard deviation for the ensemble of **y** generated at each of the *α*_{b} being numerically tested. On average, *s*(*s*(*ϵ*_{o})) falls within the range *σ*(*s*(*ϵ*_{o})) expected from theory.

Background errors *ϵ*_{b} are displayed in light purple. Again, we can verify that *s*(*ϵ*_{b}) ≈ *σ*(*ϵ*_{b}) = 2.5 m s^{−1} independently of *α*_{b}.^{7} The magnitude of the sampling noise, *σ*(*s*(*ϵ*_{b})), increases with *α*_{b} as predicted by Eq. (11).

The error statistics for analyses obtained by a weighted average of **x**_{b} and **y** are shown in orange. Invariably, *s*(*ϵ*_{avg}) ≈ *σ*(*ϵ*_{avg}) = 1.77 m s^{−1}. The magnitude of the sampling noise *s*(*s*(*ϵ*_{avg})) also increases with *α*_{b}.

Statistics for the errors of optimal analyses, *ϵ*_{optim}, are indicated in dark purple. The difference between *σ*(*ϵ*_{optim}) and *σ*(*ϵ*_{avg}) represents the magnitude of improvements to the standard deviation of analysis errors brought by the perfect representation of correlations in **B** and **R**.

When the correlations of observation and background errors are equal, a case indicated by the vertical dashed line in black (in Fig. 1), optimal analyses were also obtained using simple weighted averages. This was expected: when *α*_{b} = *α*_{o}, the two terms of the cost function [Eq. (3)] have exactly the same structure, and optimal analyses are obtained by a simple weighted average between **x**_{b} and **y**. This is also demonstrated analytically in appendix A.
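This equivalence is easy to verify numerically. In the sketch below (an illustrative 1D setup), **B** and **R** share the same exponential correlation matrix, and the minimum-variance analysis coincides with the weighted average [Eq. (14)] to machine precision:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma_b, sigma_o, alpha = 60, 2.5, 2.5, 5.0
grid = np.arange(float(n))
gamma = np.abs(np.subtract.outer(grid, grid))
C = np.exp(-gamma / alpha)            # shared correlation structure
B, R = sigma_b**2 * C, sigma_o**2 * C

x_b, y = rng.standard_normal(n), rng.standard_normal(n)
Bi, Ri = np.linalg.inv(B), np.linalg.inv(R)
x_optim = np.linalg.inv(Bi + Ri) @ (Bi @ x_b + Ri @ y)
x_avg = (sigma_o**2 * x_b + sigma_b**2 * y) / (sigma_b**2 + sigma_o**2)

assert np.allclose(x_optim, x_avg)    # identical when alpha_b = alpha_o
```

Algebraically, (**B**^{−1} + **R**^{−1})^{−1} collapses to a scalar multiple of the shared correlation matrix, which then cancels, leaving only the variance weights.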

Another interesting feature of the errors of **x**_{optim} is the tapering off of the improvements brought by the representation of correlations for *α*_{b} ≲ 0.5 km. This tapering is a consequence of *α*_{b} becoming significantly smaller than the grid spacing, which was set to 1 km.

Since *s*(*s*) ≈ *σ*(*s*) for all errors being considered, we can attest that our assimilation system and verification procedure behave as expected.

### a. Impact of representing correlations on the computational costs of analyses

Here are some considerations with respect to the computational costs of **x**_{avg} versus those of **x**_{optim}:

All analyses represented in this study were computed on the same desktop computer. The computational cost of representing the correlations of errors could then be inferred by comparing the time necessary for the generation of different analysis ensembles.

The analyses **x**_{avg}, consisting of the weighted average of **x**_{b} and **y**, could be obtained in a few thousandths of a second. Neglecting the correlation of errors leads to virtually costless analyses.

By contrast, **x**_{optim}, obtained through the minimization of a cost function, required a significant amount of computer time. In Fig. 2, we plot the time required for the generation of analyses as a function of *α*_{b}. Each dot represents the average time required for the minimization of 100 cost functions. This cost is related to the condition number [the ratio of the largest to the smallest singular value in the *l*_{2} norm, as discussed in Trefethen and Bau (1997)] of the matrix (**B**^{−1} + **R**^{−1}), which is indicated in blue.

The exact amount of time required for convergence is not interesting in itself. What Fig. 2 shows is that optimal analyses are much more expensive to obtain than weighted averages. This is especially true for strongly correlated background and observation errors.

Research being our primary objective, the assimilation system that we used had not been optimized for computational speed. Such optimization (e.g., preconditioning, parallelization, etc.) is expected in operational systems and will reduce the time required for the generation of analyses. It appears reasonable, however, to assume that for any assimilation system, representing the correlation of both background and observation errors cannot be done at negligible cost.

### b. Sampling noise

If the observations are dense and the correlations of *ϵ*_{o} and *ϵ*_{b} are similar, we know from Fig. 1 that nearly optimal analyses may be obtained by neglecting the correlation of errors in **B** and **R**.

The errors found in any individual analysis **x**_{avg} are partly due to misrepresentations of correlations and partly due to sampling noise. If the errors caused by misrepresenting correlations are much larger than the variability due to sampling noise, then the extra expense of obtaining **x**_{optim} is justified. On the other hand, if sampling noise is expected to dominate the error, then the beneficial impacts of representing the correlations may well go unnoticed.

Consider the ratio *r* = (*σ*_{avg} − *σ*_{optim})/*σ*(*s*(*ϵ*_{avg})), which compares the error made by neglecting correlations to the expected sampling noise. A ratio *r* = 1, indicated by the horizontal dashed line, means that the errors made by neglecting correlations are of the same magnitude as sampling noise. For *α*_{b} ≲ 3 km and *α*_{b} ≳ 10 km, *r* > 1: the errors of individual analyses are dominated by the misrepresentation of error correlations. In these circumstances, representing the correlation of errors will noticeably improve the quality of analyses. For 3 ≲ *α*_{b} ≲ 10 km, *r* < 1: the error is dominated by sampling noise, and neglecting the correlation of errors will have little noticeable impact on the standard deviation of analysis errors.

### c. Neglecting the correlations of errors for only one of **B** or **R**

In data assimilation, it is not uncommon to represent the correlation of background errors but not those of radar errors. We now assess the impact of neglecting the representation of error correlations for only one term in the cost function.

In Fig. 4, we show the standard deviation of errors for analyses obtained by neglecting correlations in only one of **B** or **R**.

Analyses obtained with **B**_{diag} are shown in blue, and those obtained with **R**_{diag} in red; shadings indicate *s*(*s*), the estimated sampling noise for each analysis ensemble. For comparison with previous results, we also plotted *σ*_{avg} (in orange) and *σ*_{optim} (dark purple), which appeared in Fig. 1.

When only the correlation of background errors was omitted (in blue) and background errors were weakly correlated, nearly optimal analyses could be obtained. For *α*_{b} ≳ 2 km, however, the errors of these analyses were greater than *σ*_{avg} (orange line). When only the correlation of observation errors was neglected (in red), the resulting analyses always had errors larger than *σ*_{avg}.

The impact of neglecting the correlation of errors in only one term of the cost function is important. Figure 4 indicates that it is generally better to neglect the correlation of errors altogether than to only partially represent them.

Misrepresented correlations may be compensated by purposefully misrepresenting the variance of errors, as discussed in LR02; a technique that we will refer to as "variance compensation." In Fig. 4, dashed lines indicate the average standard deviation of errors for analyses in which the variance attributed to the term with neglected correlations was adjusted so as to minimize analysis errors.

From Fig. 4, we can observe that variance compensation may significantly improve the standard deviation of analysis errors. This is not, however, generally sufficient to yield analyses with errors smaller than *σ*_{avg}.

### d. Data thinning

LR02 concluded that the correlation of observation errors could be neglected with negligible impacts on analysis errors when the correlation between neighboring observations does not exceed 0.15. In the present context, where observations are spatially correlated following an exponential decay with *α*_{o} = 5 km, this criterion is met for observations that are approximately 10 km apart.
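The 10-km figure follows directly from the exponential model by solving exp(−*d*/*α*_{o}) = 0.15 for *d*:

```python
import math

alpha_o = 5.0          # km, e-folding distance of observation-error correlation
threshold = 0.15       # LR02 criterion on neighboring-observation correlation

# Distance at which exp(-d/alpha_o) drops to the threshold
d_min = alpha_o * math.log(1.0 / threshold)
print(round(d_min, 1))  # -> 9.5 (km), i.e. roughly 10 km
```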

In Fig. 5a, we show the errors for analyses obtained by considering only observations separated by a distance of 10 km or more. Out of the initial 100 × 100 observations, a grid of 10 × 10 was retained; only 1 out of every 100 observations was conserved. A simple selection matrix mapping the analysis grid to the retained observations played the role of the observation operator.

Because of data thinning, analysis errors were no longer homogeneous. Nevertheless, only average error statistics are presented here. This eases the representation of errors as a function of *α*_{b} and makes comparisons with previous experiments possible.

For each pair **x**_{b} and **y**, analyses were performed twice: once neglecting the correlation of observation errors in **R** (in green) and once representing it perfectly (in pink).

In Fig. 5a, we can observe an almost perfect overlap between the errors of the two sets of analyses. As observed by LR02, neglecting the correlation of errors had no impact on the quality of analyses.

For comparison with earlier results, we also plotted (in red) the errors of analyses obtained with **R**_{diag} but without thinning. These errors were shown in Fig. 4.

When background errors were weakly correlated, *α*_{b} ≲ 1 km, the reduction in observational information caused by data thinning significantly increased the standard deviation of analysis errors. In these conditions, the average standard deviation of analysis errors was approximately the same as that of background errors.

As *α*_{b} was increased, thinning became increasingly beneficial. For *α*_{b} ≳ 5 km, the errors of analyses for which thinning was used (in green and pink) were smaller than those for which it was not (in red). The improvements were small, however, when compared to the sampling noise observed for such correlations.

Keeping only 1 in 100 observations removed a lot of information. In Fig. 5b, we show the errors of analyses for which 1 in every 4 observations was conserved. This corresponds to a situation where the smallest distance between neighboring observations is 2 km.

This time, analyses obtained by neglecting the correlations of errors (in green) have significantly greater errors than those for which correlations are perfectly represented (in pink). This was expected since the errors of observations 2 km apart are still significantly correlated.

It is interesting to note that in case of strong background error correlations, *α*_{b} ≳ 8 km, when the correlation of errors was perfectly represented (in pink), analysis errors converged to those of optimal analyses obtained without thinning (in dark purple). This was also observed in LR02. When the correlations are strong and perfectly respected, increasing the density of observations does not significantly improve the quality of analyses. It is unfortunate that perfect representation of correlations is improbable in a more realistic context.

Used at its maximum potential, variance compensation (red dashed lines) yields analyses with smaller errors than those obtained with data thinning (green dots). Again, optimal application of variance compensation is unlikely in a realistic context.

When data thinning was used, nearly optimal analyses could only be obtained on the condition that 1) a modest amount of thinning be applied and 2) the correlations of errors be perfectly represented in both **B** and **R**.

Irrespective of variance compensation or data thinning, all analyses obtained by representing only the correlation of background errors (using **R**_{diag}) had larger errors than those obtained without data thinning and with no representation of correlations at all (using **B**_{diag} and **R**_{diag}).

In operational data assimilation, correlations are not expected to be perfectly represented. Our results suggest that in such cases, keeping all available information but neglecting correlations altogether may well be less damaging than thinning observations prior to their assimilation.

Another commonly used method for dealing with dense radar data is the averaging of a group of observations to obtain a smaller number of "superobservations." The averaging process by which superobservations are obtained affects the probability density of the errors to be represented in **R**. Depending on **x**_{t}, averaging will affect the correlation of errors and may also cause the expected value of superobservation errors to differ from zero (Zawadzki 1982). Because of the dependence on **x**_{t}, the errors of superobservations will necessarily be heterogeneous. The use of superobservations, even in the present idealized context, is therefore not straightforward and is left as future work.

## 6. Impact of representing the correlation of *ϵ*_{b} and *ϵ*_{o} on the correlation of analysis errors


So far, only the standard deviation of analysis errors has been discussed. We saw how representing the correlation of errors was truly beneficial only in cases where the spatial correlations of *ϵ*_{b} and *ϵ*_{o} were relatively different. Also, the quality of analyses obtained by neglecting correlations altogether was generally better than that of analyses with partial representation of correlations. In this section, we demonstrate that similar conclusions can be reached by examining the correlation of analysis errors.

While the standard deviation of analysis error could be represented by a scalar, the homogeneous and isotropic 2D correlation of analysis errors must be represented by a plot of correlation as a function of the distance, or lag, separating errors.

For the case where *α*_{b} = *α*_{o} = 5 km, **x**_{avg} = **x**_{optim} and the correlation of analysis errors follows an exponential decay with *α*_{avg} = *α*_{optim} = 5 km. This situation is not depicted here.

In Fig. 6, we show the correlation of analysis errors for 100 realizations of **x**_{avg} and **x**_{optim} obtained with *α*_{b} = 0.1, 1, and 10 km. Purple lines in Figs. 6a–c indicate *ζ*(*ϵ*_{optim}), the correlation of optimal analysis errors expected from Eq. (21); the average correlation of errors estimated from the analysis ensembles is overlaid in each panel.

The correlation of errors for analyses consisting of a weighted average between **x**_{b} and **y** (obtained using **B**_{diag} and **R**_{diag}) is displayed in orange in Figs. 6d–f. In red (Figs. 6g–i) is the correlation of errors for analyses obtained using the true **B** but **R**_{diag}. For reference, the correlations of observation and background errors, *ζ*(*ϵ*_{o}) and *ζ*(*ϵ*_{b}), were also plotted in each panel (in blue and gray, respectively).

For *α*_{b} = 0.1 km (Figs. 6a,d,g), perfect representation of correlations in **B** and **R** yielded analyses whose errors were only weakly correlated.

As *α*_{b} was increased, the difference between the correlation of errors in *ϵ*_{optim} and *ϵ*_{avg} (cf. Figs. 6b,e and 6c,f) became smaller. For *α*_{b} = 10 km, the correlation of errors for analyses obtained with and without the representation of correlations (Figs. 6c,f) are virtually indistinguishable. Neglecting the correlation of errors results in analyses with stronger error correlations but no real damage is done.

The opposite can be said about analyses for which only the correlation of observation errors was neglected. For increasing *α*_{b} (Figs. 6h,i), the correlation of analysis errors became more and more different from that of optimal analyses (Figs. 6b,c). For short lags, these correlations even exceeded the correlations of both the background and observation estimates.

## 7. Discussion

For increasing *α*_{b}, the variance of optimal analysis errors was shown to increase until *α*_{b} = *α*_{o}, at which point increasing *α*_{b} further leads to analyses with smaller errors. While this behavior could be explained mathematically, the result is not very intuitive. Why does the representation of error correlations improve the quality of analyses only on the condition that *α*_{b} ≠ *α*_{o}?

In an attempt to answer this question, let us consider the extreme situation where *α*_{b} tends to infinity. In the limit of infinitely long correlations, background errors would be exactly of the same sign and magnitude everywhere in the assimilation domain. In other words, background estimates **x**_{b} would “measure” **x**_{t} with great precision but not with great accuracy. Note that the error (singular form intended) of such background estimates would differ between realizations of **x**_{b} and have an expected value of zero.

In this situation, the respective “merits” of background and observation estimates are different. Background estimates provide very good information on the relative magnitude between different random variables in final analyses while observation estimates provide very good information on the average value of all random variables considered together.

Representing the correlation of errors “tells” the assimilation system how to best combine **x**_{b} and **y** in order to take advantage of the respective merits of each source of information. For infinitely long background error correlations, observations would be used solely for the purpose of adjusting the mean value of individual background estimates in order to generate analyses that are both precise *and* accurate.
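This limiting case can be illustrated numerically. In the sketch below (hypothetical values, not taken from the paper), the background error is a single shared offset of +2 m s^{−1}, the extreme of fully correlated errors, while observation errors are uncorrelated. For **B** = *σ*_{b}^{2} × (all-ones matrix) and **R** = *σ*_{o}^{2}**I**, the optimal update reduces, via the Sherman–Morrison identity, to shifting the background by a weighted mean innovation:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma_b, sigma_o = 200, 2.5, 2.5
x_t = np.zeros(n)                              # truth
x_b = x_t + 2.0                                # fully correlated background error
y = x_t + sigma_o * rng.standard_normal(n)     # uncorrelated observation errors

# Kalman update for B = sigma_b^2 * ones((n, n)) and R = sigma_o^2 * I:
# the gain collapses to a uniform weight applied to the mean innovation.
w = n * sigma_b**2 / (n * sigma_b**2 + sigma_o**2)
x_a = x_b + w * np.mean(y - x_b)

rmse_b = np.sqrt(np.mean((x_b - x_t) ** 2))    # = 2.0: precise but inaccurate
rmse_a = np.sqrt(np.mean((x_a - x_t) ** 2))    # observations remove the offset
```

Here the observations serve solely to adjust the mean value of the background estimates, exactly as described above.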

Representing the correlation of errors can only be beneficial in cases where the merits of the information contained in **x**_{b} and **y** are different. For *α*_{b} = *α*_{o}, the two estimates do not bring complementary information into the system, and analyses with the largest error variance are observed.

The results presented in this study may be considered as an extension to those of LR02. We now examine some of their conclusions in the context of our experiments. We also discuss the implications of our experiments to more realistic data assimilation situations.

LR02 concluded that when correlations are perfectly represented, increasing the observation density beyond a certain threshold yields little improvement in the quality of analyses. Our experiments demonstrate that the magnitude of the improvement caused by increasing the observation density is strongly dependent on the correlation of background errors. In Figs. 5a,b, we can observe the errors of optimal analyses obtained with (in pink) and without (in dark purple) data thinning. The difference between these two sets of analyses, which indicates the magnitude of the improvement associated with increasing the observation density, strongly depends on the correlation of background errors. In Fig. 5b, where moderate thinning was applied, optimal analyses obtained with and without thinning have virtually the same errors for *α*_{b} ≳ 10 km. For this specific experiment, we can therefore conclude that increasing the observation density beyond 2 km yields little improvement in the quality of analyses *on the condition that α*_{b} ≳ 10 km. To the conclusion of LR02, we add the requirement for background errors to be sufficiently correlated.
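The interplay between thinning and the analysis update can be sketched with a toy 1D setup (illustrative numbers, not the paper's configuration): a selection operator **H** keeps every fourth grid point, and the usual gain **K** = **BH**ᵀ(**HBH**ᵀ + **R**)^{−1} spreads the retained observations through the background error correlations:

```python
import numpy as np

def exp_corr(n, alpha, dx=1.0):
    """Exponential correlation matrix on a 1D periodic domain."""
    i = np.arange(n)
    d = np.abs(i[:, None] - i[None, :]).astype(float)
    d = np.minimum(d, n - d) * dx
    return np.exp(-d / alpha)

def analysis(x_b, y, B, R, H):
    """x_a = x_b + B H^T (H B H^T + R)^-1 (y - H x_b)."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return x_b + K @ (y - H @ x_b)

n, step = 64, 4
H_thin = np.eye(n)[::step]                 # keep every 4th observation

# Near-perfect thinned observations of a constant field: the analysis
# matches them at observed points and interpolates in between, with the
# spread controlled by the background error correlation length.
B = exp_corr(n, 10.0)
R = 1e-6 * np.eye(n // step)
x_a = analysis(np.zeros(n), np.ones(n // step), B, R, H_thin)
```

The longer the background error correlation, the better the thinned observations fill the gaps, which is the dependence on *α*_{b} discussed above.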

In a more realistic context, the structure of background and observation errors is likely to be much more complex than those examined here. Errors are not expected to be homogeneous and isotropic; their distributions may be poorly represented in terms of their mean and variance alone. There are also significant challenges associated with both the estimation of these errors and their representation in assimilation systems. For these reasons, misrepresenting the correlation of errors is unavoidable and obtaining truly optimal analyses is impossible. In this context, experiments where we neglected the correlation of observation errors are more relevant.

A second conclusion of LR02 is that when the correlations of observation errors are neglected, thinning data such that neighboring grid points have correlations no greater than 0.15 provides the best compromise between the error associated with correlation misrepresentations and the loss of information associated with thinning.

In Fig. 5, errors associated with correlation misrepresentations may be estimated by comparing analyses obtained with (in dark purple) and without (in red) the representation of observation error correlations. The magnitude of these errors shows only a weak dependence on the correlation of background errors.

Errors associated with information loss may be estimated by comparing optimal analyses obtained with (in pink) and without data thinning (in dark purple). The magnitude of these errors is strongly dependent on the correlation of background errors.

In green are the errors of analyses obtained by neglecting the correlation of observation errors and applying data thinning. These analyses thus suffer from the two aforementioned types of errors. Data thinning may only alleviate the errors caused by correlation misrepresentations *on the condition that background errors are sufficiently correlated.* In Fig. 5a, this happens for *α*_{b} ≳ 5 km; in Fig. 5b, for *α*_{b} ≳ 2 km. Again, to the conclusion of LR02, we add the requirement for background errors to be sufficiently correlated.

One notable difference between the experiments presented here and those of LR02 is the functional form of the correlations tested. We performed experiments with exponential correlations while they tested correlation functions closer to a Gaussian decay. With respect to the exponential function, the Gaussian decay imposes a more abrupt decrease of correlation with increasing distance between errors. There are two ways in which this is favorable to data thinning. First, less thinning will be necessary to obtain a correlation of 0.15 between neighboring data points, which diminishes the errors due to information loss. Second, background errors correlated following a Gaussian decay will better propagate observation information to neighboring and nonobserved grid points. We can therefore conclude that the 0.15 criterion found by LR02 depends on both the rate of decay and the functional form of background error correlations.

In any circumstances where dense observations are available, data thinning is probably not necessary. In our experiments, analyses obtained by neglecting correlations altogether proved systematically better than those with the representation of only the background error correlations. This is convenient, as weighted averages only require knowledge of the variance of errors, which may be more easily estimated and represented in data assimilation than correlations. Savings in computational costs are not to be neglected either.
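With **B**_{diag} and **R**_{diag}, the analysis reduces to an independent precision-weighted average at every grid point, which is what makes it so cheap. A minimal sketch (the function name is ours):

```python
import numpy as np

def weighted_average_analysis(x_b, y, sigma_b, sigma_o):
    """Analysis neglecting all error correlations (diagonal B and R):
    a gridpointwise average weighted by the inverse error variances."""
    w_b = 1.0 / sigma_b**2
    w_o = 1.0 / sigma_o**2
    return (w_b * x_b + w_o * y) / (w_b + w_o)
```

Only the two variances are needed, and the cost is O(n) instead of the O(n³) of a full matrix solve.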

Our experiments have shown that partial representation of correlation generally leads to analyses that are of poorer quality compared to simple weighted averages. This suggests that great care should be taken when it comes to representing the correlation of errors in a realistic context. In case of doubt on the magnitude of the improvements brought by the representation of correlations, neglecting correlations altogether should be considered as a safe option.

Of course, neglecting the correlation of errors is only possible where dense observations are available. If radar observations are spatially scattered, or not available, then the correlation of background errors should be represented to spread observation information to nonobserved areas. In a realistic assimilation context, dense radar measurements are usually available over limited portions of the assimilation domain. Determining where and when diagonal covariance matrices may be used with little negative impact on the quality of analyses will first require the identification of dense observation areas. Such systems with “mixed” representation of correlations remain to be studied.

In our experiments, we set *σ*_{b} = *σ*_{o} = 2.5 m s^{−1} for simplicity. In a more realistic assimilation context, it is likely that the variances of background and observation errors will differ. In experiments (not shown here) where we reduced *σ*_{o} to 1 m s^{−1} (a number often quoted for the errors of Doppler velocity), the standard deviation of analysis errors was reduced to ~0.9 m s^{−1} irrespective of which correlations were represented in the cost function.

In this respect, we have demonstrated the strong influence of error correlations on the precision at which the variance of errors may be estimated. In our experiments, precise estimates of the standard deviation of errors could be obtained by averaging several error fields with the same error statistics. In an atmospheric context, this may only be done in a climatological sense. The applicability of such estimates to convective situations remains to be determined.

The temporal dimension of representing correlations is yet another aspect of data assimilation that should be investigated. Typically, data assimilation consists of several cycles of analyses followed by periods of model integration. In principle, this should bring the model state closer to the truth and affect both the magnitude and the correlation of its errors. Potentially, this could make the correlation of background errors sufficiently different from that of observation errors to justify the representation of these correlations. If and how this happens at the convective scales remains to be documented.

As outlined above, there are a number of major differences between the idealized experiment presented here and the context of operational data assimilation. These experiments are nevertheless interesting as they help us understand the contribution of correlation to the quality of analyses in the ideal case where all the usual assimilation assumptions are fulfilled. It is probably safe to say that in an operational context, where assimilation is conducted in less than ideal conditions, improvements can only be harder to obtain.

The experiments presented here demonstrate that representing only the correlation of background errors may well end up degrading the quality of analyses compared with not representing correlations at all. This conclusion is important since most modern frameworks for performing data assimilation [e.g., the ensemble Kalman filter (EnKF) or hybrid EnKF-variational systems] are oriented toward better representation of background error correlations at the analysis step. We now know that, alone, improving the representation of background errors is not expected to improve the quality of observed variables in analyses.

## 8. Conclusions

The experiments presented in this study help in understanding the process of optimal estimation in the presence of multivariate and correlated estimates. For simplicity, we only considered the case where background and observation estimates were available everywhere in the assimilation domain with known errors represented by unbiased, homogeneous, and isotropic multivariate Gaussian distributions. Errors were correlated in space following an exponential decay, a “long tailed” correlation function similar to those often found in the atmosphere. With these simplifications, analysis errors could be expressed as a function of only four parameters: the standard deviation and correlation of background and observation errors.

In a first set of experiments, two situations were examined: one in which analyses were obtained by neglecting correlations altogether, a second in which correlations were perfectly represented. When the correlations of errors were neglected, analyses could be obtained at a very low computational cost by a weighted average between background and observation estimates. From a statistical perspective, these analyses are not optimal. They are not those with the smallest expected error variance with respect to the truth. Optimal analyses were obtained with perfect representation of correlations and at a considerably higher computational cost than simple weighted averages. We investigated whether the extra cost associated with the representation of correlations was justified.

By comparing these two sets of analyses we demonstrated that the more the correlation of background and observation errors differ, the more representing those correlations had a beneficial effect on the analyses. When the correlation of background and observation errors were equal, optimal analyses were obtained with or without the representation of correlations. For this special (and unlikely) situation, the costs associated with the representation of correlations are definitely not justified.

When the correlation of background and observation errors were different, perfectly representing the correlation of errors was always beneficial. The magnitude of the improvements, however, was shown to depend on the difference between the correlation of background and observation errors. When the correlation of background and observation errors did not differ significantly, perfect representation of correlations provided only small improvements compared to suboptimal analyses where correlations were neglected altogether.

Special attention was paid to the precision at which the average standard deviation of errors may be estimated. This second-order error statistic (which we referred to as the sampling noise) was shown to depend on the size of the assimilation domain, the standard deviation, and the correlation of the errors being estimated. To determine when correlations may be neglected, we suggested that one consider the ratio between the improvements brought by the representation of correlations and the expected magnitude of the sampling noise. If the improvements brought by the representation of correlations are small compared to the precision at which the standard deviation of errors may be estimated, then the beneficial effects of representing correlations are likely to go unnoticed. In this case, the computational resources associated with the representation of correlations may be allocated elsewhere.
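The sampling-noise argument rests on the effective number of independent samples in a correlated field. A sketch of the standard Bayley–Hammersley-type approximation, specialized to exponentially correlated errors (the factor (1 − *j*/*n*) is kept inside the summation, as discussed in the footnotes; the exact matrix forms of Eqs. (6) and (8) are not reproduced here):

```python
import numpy as np

def n_eff_exponential(n, alpha, dx=1.0):
    """Effective number of independent samples among n equally spaced
    points with exponentially correlated errors (e-folding scale alpha),
    using the Bayley-Hammersley-style sum n / (1 + 2*sum((1-j/n)*rho_j))."""
    j = np.arange(1, n)
    rho = np.exp(-j * dx / alpha)
    return n / (1.0 + 2.0 * np.sum((1.0 - j / n) * rho))
```

The precision of a variance estimate then scales roughly like sqrt(2/n_eff) rather than sqrt(2/n), so long correlations can make the sampling noise large enough to mask small analysis improvements.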

The presence of correlated errors is not necessarily a bad thing. In the situation where the errors of one estimate are strongly correlated while the errors of a second estimate are not, representing these correlations in the cost function does yield analyses of significantly better quality than simple weighted averages. Determining whether, and to what extent, this situation occurs in realistic conditions demands further investigation.

Analyses for which the correlations of errors were represented in only one of the two covariance matrices were generally of poorer quality than simple weighted averages.

Under ideal circumstances, partial representation of error correlations generally does more harm than good. In a more realistic context where misrepresenting the correlation of errors is unavoidable, great care should be taken to ensure that representing correlations will have a beneficial impact on analyses. In case of doubt, neglecting correlations altogether should be considered as a safe alternative. Of course, this is only possible in areas where dense observations are available.

The results presented in this article suggest that common practice in data assimilation, such as only representing the correlation of background errors and data thinning, may not have the expected beneficial impact on the quality of analyses. Operational data assimilation is, however, conducted in conditions which are far from the idealized context in which our experiments were performed. Therefore, additional work is required to fully document the impacts of representing correlations in a more realistic framework.

In the second part of this study, we perform similar experiments using model output as background estimates with observations made available only in precipitating areas. This will allow us to investigate the challenges that arise when we perform data assimilation from estimates whose error statistics are non-Gaussian, heterogeneous, biased, and misrepresented.

*Acknowledgments.* The authors are grateful to Peter Houtekamer and Marc Berenguer for providing many useful suggestions that improved preliminary versions of this article.

# APPENDIX A

## Analytical Expressions for the Analysis Variance–Covariance Matrix

The variance–covariance matrix of optimal analysis errors, **A**_{optim}, is given by Eq. (17), which is reproduced here:

**A**_{optim} = (**B**^{−1} + **R**^{−1})^{−1}.    (A1)

In this appendix, we show that over 1D domains **A**_{optim} also represents an exponential decay. Analytical expressions allowing the specification of **A**_{optim} in terms of *σ*_{b}, *α*_{b}, *σ*_{o}, and *α*_{o} are derived. It is also shown that these expressions do not apply over 2D and 3D domains.

Oliver (1998) showed that multiplying a vector **x** by the inverse of a 1D exponential covariance matrix is equivalent to applying a differential operator that depends on *α*, the rate of decay of the exponential correlation, and on *σ*^{2}, the variance of errors:

**C**^{−1} = [1/(2*ασ*^{2})] (*δ* − *α*^{2}*δ*″),

where *δ* is Dirac’s delta function and *δ*″ represents its second derivative.

Such an operator may be constructed for **B**^{−1} (using *σ*_{b} and *α*_{b}) and for **R**^{−1} (using *σ*_{o} and *α*_{o}). Should the errors of optimal analyses also be exponentially correlated, a similar operator, specified by *σ*_{optim} and *α*_{optim}, would represent **A**_{optim}^{−1}.

We may then substitute the operators representing **A**_{optim}^{−1}, **B**^{−1}, and **R**^{−1} into Eq. (A1) to obtain an equality in which the coefficients of *δ* and of *δ*″ must balance independently. Respecting this equality demands that both

1/(2*α*_{optim}*σ*_{optim}^{2}) = 1/(2*α*_{b}*σ*_{b}^{2}) + 1/(2*α*_{o}*σ*_{o}^{2})

and

*α*_{optim}/(2*σ*_{optim}^{2}) = *α*_{b}/(2*σ*_{b}^{2}) + *α*_{o}/(2*σ*_{o}^{2})

be respected. These two equations will be satisfied for

*α*_{optim}^{2} = (*α*_{b}/*σ*_{b}^{2} + *α*_{o}/*σ*_{o}^{2}) / [1/(*α*_{b}*σ*_{b}^{2}) + 1/(*α*_{o}*σ*_{o}^{2})]

and

*σ*_{optim}^{2} = *α*_{optim} / (*α*_{b}/*σ*_{b}^{2} + *α*_{o}/*σ*_{o}^{2}).

For *α*_{b} = *α*_{o} = *α*_{same}, these expressions reduce to *σ*_{optim}^{2} = *σ*_{b}^{2}*σ*_{o}^{2}/(*σ*_{b}^{2} + *σ*_{o}^{2}) and *α*_{optim} = *α*_{same}, which are the variance and correlation expected for analyses consisting in the weighted average between the background and observation estimates [given by Eqs. (15) and (16), respectively].
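The 1D result can be spot-checked numerically by discretizing Eq. (A1) on a ring. The sketch below uses illustrative parameter values; discretization and wraparound introduce deviations of a few percent, hence the loose tolerances:

```python
import numpy as np

def exp_cov(n, sigma, alpha, dx):
    """Exponential covariance matrix on a 1D periodic domain."""
    i = np.arange(n)
    d = np.abs(i[:, None] - i[None, :]).astype(float)
    d = np.minimum(d, n - d) * dx
    return sigma**2 * np.exp(-d / alpha)

n, dx = 400, 0.1                       # a 40-km ring with 100-m spacing
sb, ab = 2.5, 1.0                      # sigma_b (m/s), alpha_b (km)
so, ao = 2.5, 5.0                      # sigma_o (m/s), alpha_o (km)

B = exp_cov(n, sb, ab, dx)
R = exp_cov(n, so, ao, dx)
A = np.linalg.inv(np.linalg.inv(B) + np.linalg.inv(R))   # Eq. (A1)

# predictions from matching the delta and delta'' coefficients
num = ab / sb**2 + ao / so**2
den = 1.0 / (ab * sb**2) + 1.0 / (ao * so**2)
alpha_pred = np.sqrt(num / den)
sigma_pred = np.sqrt(alpha_pred / num)

# measured values from the discrete analysis covariance
sigma_meas = np.sqrt(A[0, 0])
alpha_meas = -dx / np.log(A[0, 1] / A[0, 0])
```

The measured variance and e-folding distance of **A** track the continuum predictions, and the decay remains close to exponential.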

In 2D, the operator representing the inverse of an exponential variance-covariance matrix has yet to be derived analytically (Oliver 1998). The method used above is, therefore, not applicable. However, close examination of the average autocorrelation for analysis errors displayed in Fig. 6 showed that they do not follow an exponential decay.

In 3D, an operator representing the inverse of an exponential covariance matrix can be constructed (Xu 2005), but substituting it into Eq. (A1) yields more matching conditions than the two unknowns *σ*_{optim} and *α*_{optim}; for arbitrary *σ*_{b}, *α*_{b}, *σ*_{o}, and *α*_{o} this system is overdetermined and inconsistent. In 3D, the correlation of errors for optimal analyses does not generally follow an exponential decay.

However, in the special case where *α*_{o} = *α*_{b} = *α*_{same} then analysis errors *are* exponentially correlated with Eqs. (A12)–(A14) being reduced to Eqs. (A10) and (A9).

# APPENDIX B

## Variance Compensation

In this appendix, we discuss the use of variance compensation to alleviate the negative impacts associated with the misrepresentation of error correlations.

Liu and Rabier (2003) mention the technique and discuss how variance may be adjusted by considering the value of the cost function *J*. For the experiments presented here, we were not so much interested in how to adjust the variance as in documenting the maximum benefits that could be obtained using this technique.

Given that we knew **x**_{t}, we adjusted the variance a posteriori, by computing analyses with different representation of variance and choosing the value that minimized the standard deviation of errors with respect to **x**_{t}. This is best explained using an example.
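The a posteriori adjustment can be sketched as a brute-force scan over the represented value of *σ*_{b} (the numbers below are illustrative; a golden-section search, cf. Kiefer 1953, would be the economical alternative to a scan):

```python
import numpy as np

def exp_cov(n, sigma, alpha, dx=1.0):
    """Exponential covariance matrix on a 1D periodic domain."""
    i = np.arange(n)
    d = np.abs(i[:, None] - i[None, :]).astype(float)
    d = np.minimum(d, n - d) * dx
    return sigma**2 * np.exp(-d / alpha)

def analysis_rmse(sigma_b_rep, x_t, x_b, y, corr_b, sigma_o):
    """RMSE of an analysis built with true background correlations,
    a diagonal R, and a deliberately adjustable represented sigma_b."""
    B = sigma_b_rep**2 * corr_b
    K = B @ np.linalg.inv(B + sigma_o**2 * np.eye(x_t.size))
    x_a = x_b + K @ (y - x_b)
    return np.sqrt(np.mean((x_a - x_t) ** 2))

rng = np.random.default_rng(42)
n, sigma_b, sigma_o, alpha_b, alpha_o = 200, 2.5, 2.5, 1.0, 5.0
x_t = np.zeros(n)
Lb = np.linalg.cholesky(exp_cov(n, sigma_b, alpha_b) + 1e-9 * np.eye(n))
Lo = np.linalg.cholesky(exp_cov(n, sigma_o, alpha_o) + 1e-9 * np.eye(n))
x_b = x_t + Lb @ rng.standard_normal(n)      # correlated background errors
y = x_t + Lo @ rng.standard_normal(n)        # correlated observation errors

corr_b = exp_cov(n, 1.0, alpha_b)
candidates = np.linspace(0.5, 3.0, 26)       # step 0.1, brackets the true 2.5
rmses = [analysis_rmse(s, x_t, x_b, y, corr_b, sigma_o) for s in candidates]
best = candidates[int(np.argmin(rmses))]
```

Because the truth is known in these experiments, the minimizer `best` can be picked directly, which is exactly the a posteriori procedure described above.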

Let *σ*_{o} = *σ*_{b} = 2.5 m s^{−1}, *α*_{b} = 1 km, and *α*_{o} = 5 km. We are interested in analyses obtained with perfect representation of background error correlations (the “true” **B**) but with observation error correlations neglected (**R**_{diag}).

The correct representation of background error correlations in **B** ensures the proper contribution of **x**_{b} to the cost function. On the other hand, the incorrect omission of observation error correlations in **R**_{diag} gives more weight to **y** than it should have.

To a certain extent, the “correct” relative contribution of observation and background estimates can be reestablished by purposefully misrepresenting the variance of background errors in the cost function.

In Fig. B1, we plot the standard deviation of analysis errors as a function of the value of *σ*_{b} represented in the cost function.

In the limit where the represented *σ*_{b} tends to zero, *J* = 0 is obtained for **x** = **x**_{b}. Conversely, in the limit where the represented *σ*_{b} tends to infinity, *J* = 0 when **x** = **y**. In between, analysis errors are smaller than either *σ*_{b} or *σ*_{o}. Analyses with the smallest standard deviation are found for a represented *σ*_{b} significantly smaller than the true *σ*_{b} = 2.5 m s^{−1}.

In Figs. 4 and 5, the dashed red line represents analysis ensembles for which variance compensation was applied. For each of these ensembles, the value of the represented variance was chosen a posteriori so as to minimize the standard deviation of analysis errors.

## REFERENCES

Bayley, G. V., and J. M. Hammersley, 1946: The “effective” number of independent observations in an autocorrelated time series. *Suppl. J. Roy. Stat. Soc.*, **8**, 184–197, doi:10.2307/2983560.

Berenguer, M., and I. Zawadzki, 2008: A study of the error covariance matrix of radar rainfall estimates in stratiform rain. *Wea. Forecasting*, **23**, 1085–1101, doi:10.1175/2008WAF2222134.1.

Berenguer, M., and I. Zawadzki, 2009: A study of the error covariance matrix of radar rainfall estimates in stratiform rain. Part II: Scale dependence. *Wea. Forecasting*, **24**, 800–811, doi:10.1175/2008WAF2222210.1.

Berenguer, M., M. Surcel, I. Zawadzki, M. Xue, and F. Kong, 2012: The diurnal cycle of precipitation from continental radar mosaics and numerical weather prediction models. Part II: Intercomparison among numerical models and with nowcasting. *Mon. Wea. Rev.*, **140**, 2689–2705, doi:10.1175/MWR-D-11-00181.1.

Bretherton, C. S., M. Widmann, V. P. Dymnikov, J. M. Wallace, and I. Bladé, 1999: The effective number of spatial degrees of freedom of a time-varying field. *J. Climate*, **12**, 1990–2009, doi:10.1175/1520-0442(1999)012<1990:TENOSD>2.0.CO;2.

Chung, K.-S., I. Zawadzki, M. K. Yau, and L. Fillion, 2009: Short-term forecasting of a midlatitude convective storm by the assimilation of single-Doppler radar observations. *Mon. Wea. Rev.*, **137**, 4115–4135, doi:10.1175/2009MWR2731.1.

Fabry, F., 1996: On the determination of scale ranges for precipitation fields. *J. Geophys. Res.*, **101**, 12 819–12 826, doi:10.1029/96JD00718.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757, doi:10.1002/qj.49712555417.

Hollingsworth, A., and P. Lönnberg, 1986: The statistical structure of short-range forecast errors as determined from radiosonde data. Part I: The wind field. *Tellus*, **38A**, 111–136, doi:10.1111/j.1600-0870.1986.tb00460.x.

Hosmer, D. W., Jr., S. Lemeshow, and S. May, 2008: *Applied Survival Analysis: Regression Modeling of Time to Event Data.* 2nd ed. Wiley, 416 pp.

Keeler, R. J., and S. M. Ellis, 2000: Observational error covariance matrices for radar data assimilation. *Phys. Chem. Earth, Part B: Hydrol. Oceans Atmos.*, **25**, 1277–1280, doi:10.1016/S1464-1909(00)00193-3.

Kiefer, J. C., 1953: Sequential minimax search for a maximum. *Proc. Amer. Math. Soc.*, **4**, 502–506.

Lilly, D. K., 1990: Numerical prediction of thunderstorms—Has its time come? *Quart. J. Roy. Meteor. Soc.*, **116**, 779–798, doi:10.1002/qj.49711649402.

Liu, Z. Q., and F. Rabier, 2002: The interaction between model resolution, observation resolution and observation density in data assimilation: A one-dimensional study. *Quart. J. Roy. Meteor. Soc.*, **128**, 1367–1386, doi:10.1256/003590002320373337.

Liu, Z. Q., and F. Rabier, 2003: The potential of high-density observations for numerical weather prediction: A study with simulated observations. *Quart. J. Roy. Meteor. Soc.*, **129**, 3013–3035, doi:10.1256/qj.02.170.

Oliver, D., 1995: Moving averages for Gaussian simulation in two and three dimensions. *Math. Geol.*, **27**, 939–960, doi:10.1007/BF02091660.

Oliver, D., 1998: Calculation of the inverse of the covariance. *Math. Geol.*, **30**, 911–933, doi:10.1023/A:1021734811230.

Purser, R. J., W.-S. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances. *Mon. Wea. Rev.*, **131**, 1524–1535, doi:10.1175//1520-0493(2003)131<1524:NAOTAO>2.0.CO;2.

Stratman, D. R., M. C. Coniglio, S. E. Koch, and M. Xue, 2013: Use of multiple verification methods to evaluate forecasts of convection from hot- and cold-start convection-allowing models. *Wea. Forecasting*, **28**, 119–138, doi:10.1175/WAF-D-12-00022.1.

Sun, J., D. W. Flicker, and D. K. Lilly, 1991: Recovery of three-dimensional wind and temperature fields from simulated single-Doppler radar data. *J. Atmos. Sci.*, **48**, 876–890, doi:10.1175/1520-0469(1991)048<0876:ROTDWA>2.0.CO;2.

Tarantola, A., 2005: *Inverse Problem Theory and Methods for Model Parameter Estimation.* SIAM, 358 pp.

Thiébaux, H. J., and F. W. Zwiers, 1984: The interpretation and estimation of effective sample size. *J. Climate Appl. Meteor.*, **23**, 800–811, doi:10.1175/1520-0450(1984)023<0800:TIAEOE>2.0.CO;2.

Trefethen, L. N., and D. Bau, 1997: *Numerical Linear Algebra.* SIAM, 373 pp.

Ward, L. M., 2002: *Dynamical Cognitive Science.* The MIT Press, 371 pp.

Wilson, J. W., Y. Feng, M. Chen, and R. D. Roberts, 2010: Nowcasting challenges during the Beijing Olympics: Successes, failures, and implications for future nowcasting systems. *Wea. Forecasting*, **25**, 1691–1714, doi:10.1175/2010WAF2222417.1.

Xu, Q., 2005: Representations of inverse covariances by differential operators. *Adv. Atmos. Sci.*, **22**, 181–198, doi:10.1007/BF02918508.

Yaremchuk, M., and A. Sentchev, 2012: Multi-scale correlation functions associated with polynomials of the diffusion operator. *Quart. J. Roy. Meteor. Soc.*, **138**, 1948–1953, doi:10.1002/qj.1896.

Zawadzki, I., 1973: The loss of information due to finite sample volume in radar-measured reflectivity. *J. Appl. Meteor.*, **12**, 683–687, doi:10.1175/1520-0450(1973)012<0683:TLOIDT>2.0.CO;2.

Zawadzki, I., 1982: The quantitative interpretation of weather radar measurements. *Atmos.–Ocean*, **20**, 158–180, doi:10.1080/07055900.1982.9649137.

Zieba, A., 2010: Effective number of observations and unbiased estimators of variance for autocorrelated data—An overview. *Metrol. Measure. Syst.*, **17**, 3–16, doi:10.2478/v10178-010-0001-0.

^{1}

Multivariate is used here in reference to the many random variables by which these error fields may be described. The influence of correlations may only be studied in the presence of two or more random variables. These statistical random variables are not to be confused with atmospheric state variables, such as temperature or pressure.

^{2}

Many reasons justify the choice of exponential correlations. First, previous work on this function allowed us to derive expressions for analysis errors as stated in section 3b and appendix A. Second, we could use existing mathematical tools for testing our assimilation system through a procedure, described in section 4a, that requires the generation of exponentially correlated noise and the use of inverse exponential correlation matrices. Third, long tailed correlations, similar in nature to the exponential decay, have frequently been observed in atmospheric contexts. For examples, see Hollingsworth and Lönnberg (1986) for error correlations at the global scales, and Berenguer and Zawadzki (2008, 2009) for the correlation of rain-rate errors inferred from radar measurements.

^{3}

As is often done in statistics, we adopt the convention by which the true standard deviation of errors is represented by the Greek letter *σ* while estimates of the same quantity are represented by the letter *s*.

^{4}

The equivalence between the two forms becomes evident when considering the inverse of Eq. (14) of Bayley and Hammersley (1946) multiplied by *n*^{2}.

^{5}

Note that the concept of effective degrees of freedom has also been used to describe different statistical properties (e.g., in Bretherton et al. 1999). For the purpose of this article, it only refers to the quantity estimated using Eq. (8).

^{6}

The accuracy of this approximation has been verified in numerical experiments not shown here. Note also that the simplification given in Eq. (8) differs from that of Zieba (2010), which presented the result of a similar simplification. Keeping the term “(*n* − *j*)” inside the summation both increased the accuracy of the approximation and allowed us to conveniently express *ν*_{eff} in a matrix form similar to that of *n*_{eff} found in Eq. (6).

^{7}

The method we used for the generation of exponentially correlated noise relies on a convolution kernel provided by Oliver (1995). For *α*_{b} ≳ 10 km the tail of the kernel became sufficiently truncated as to prevent the generation of purely exponentially correlated noise. For this reason, numerical results are only provided up to *α*_{b} = 10 km. The range of most figures presented in this study nevertheless extends to *α*_{b} = 100 km to emphasize the symmetrical nature of *σ*_{optim}.