## 1. Introduction

Ensemble Kalman filters (EnKFs) have shown great promise for large-scale atmospheric data assimilation (Evensen 1994; Keppenne 2000; Houtekamer and Mitchell 2001; Houtekamer and Mitchell 2005; Houtekamer et al. 2005). Because the number of ensemble members typically available for atmospheric data assimilation is in the hundreds while the number of observations is several orders of magnitude greater, the ensemble sample covariance matrix is rank deficient, and spurious correlations are inevitable. Both rank deficiency and spurious correlations can lead to degraded analyses, and therefore to degraded forecasts. One practical solution is *localization*, which both increases the rank of the sample covariance matrix and mitigates spurious correlations, resulting in greatly improved analyses and forecasts (Houtekamer and Mitchell 1998).

Localization can be performed in the horizontal, in the vertical, and in time (and between variables). Localization schemes are typically distance based (an exception is the hierarchical filter; Anderson 2007), which makes sense in physical space and time given our knowledge of the spatial and temporal scales of physical and dynamical processes in the atmosphere. Experimental results (Hollingsworth and Lönnberg 1986) confirm the expectation that forecast error covariances generally diminish with horizontal and vertical distance for conventional observations. Distance-based forecast error covariance localization, typically implemented as a Schur (elementwise) product of the raw ensemble covariance matrix and some positive definite localization matrix, works well both in the horizontal and for conventional observations.

Vertical covariance localization for satellite radiances is becoming more important as the number and variety of satellite observations increase much more rapidly than those of conventional observations. Radiance space localization is already being used in the operational data assimilation system at Environment Canada for their ensemble forecasts, as well as in the ensemble square root filter data assimilation system being considered for operational use at the National Centers for Environmental Prediction (NCEP; J. Whitaker 2009, personal communication); an understanding of the benefits and drawbacks of this type of localization is therefore critical. For several reasons, specifying the vertical localization for satellite radiances is less straightforward than for other data types.

The vertical location of a satellite radiance is not well defined because it is an integrated measure, sampling different layers of the atmosphere. The satellite channels typically used in operational data assimilation have weighting functions that overlap significantly, resulting in correlated radiances in neighboring channels. Localization ought to preserve these correct interchannel correlations. Additionally, individual satellite weighting functions are typically broad, covering a significant fraction of the model atmosphere in the vertical, which makes it difficult to define sufficiently broad localization functions. For all of these reasons, one might expect that distance-based localization in radiance space will have some difficulty in extracting the maximum benefit from satellite radiances. Although there are many aspects of data assimilation (e.g., quality control, radiance bias correction, etc.) that must be carefully designed and implemented in order to see large, positive impact for satellite radiances in both variational and EnKF contexts, we believe that one limiting factor for EnKFs may be the localization of radiances in the vertical. [Operational three-dimensional (3D) and four-dimensional variational data assimilation (4DVar) systems have seen great benefit from the direct assimilation of satellite radiances (Andersson et al. 1994; Kelly 1997; Derber and Wu 1998; English et al. 2000; Eyre et al. 2000; Baker and Campbell 2005), particularly those from microwave temperature sounders such as the Advanced Microwave Sounding Unit-A (AMSU-A).] Understanding some of the limitations of current satellite ensemble DA techniques should aid in the search for techniques that are superior for satellite observations.

The theoretical basis for radiance space localization is explored in section 2, and a conceptual 1D model that exposes its essential limitations is presented in section 3. In section 4, a more realistic 1D model is presented, with levels, forecast error covariances, observation error covariances, and satellite weighting functions taken from an operational global NWP system. Section 5 examines the limit of small observation error variance for a hypothetical radiometer with a sufficient number of channels to specify the analysis at each vertical level, and section 6 presents a summary of 1D model results and conclusions.

## 2. Theoretical basis for radiance space localization

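The Kalman gain 𝗞_{j} calculated by localizing the sample covariance matrix 𝗣_{j}^{f} from ensemble *j* with a correlation matrix ***ρ*** can be written (Houtekamer and Mitchell 2001) as

$$\mathbf{K}_j = (\boldsymbol{\rho} \circ \mathbf{P}^f_j)\,\mathbf{H}^{\mathrm{T}} \left[\mathbf{H}\,(\boldsymbol{\rho} \circ \mathbf{P}^f_j)\,\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right]^{-1}, \tag{1}$$

where ∘ denotes the Schur (elementwise) product, 𝗛 is the forward operator, and 𝗥 is the observation error covariance matrix. If ***ρ*** is symmetric and positive semidefinite, then ***ρ*** ∘ 𝗣_{j}^{f} is also positive semidefinite (Horn and Johnson 1990). We refer to (1) as the Kalman gain from model space localization because ***ρ*** is applied in the space of the model state vector, and only subsequently is radiative transfer applied. Following Houtekamer and Mitchell (2001), the Kalman gain from observation space localization is given by

$$\mathbf{K}_j = \left[\boldsymbol{\rho} \circ (\mathbf{P}^f_j \mathbf{H}^{\mathrm{T}})\right] \left[\boldsymbol{\rho} \circ (\mathbf{H} \mathbf{P}^f_j \mathbf{H}^{\mathrm{T}}) + \mathbf{R}\right]^{-1}. \tag{2}$$

We will refer to (2) as the Kalman gain from radiance space localization in this study, although more generally it is observation space localization.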
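The structural difference between (1) and (2) can be made concrete with a few lines of NumPy. The following is a minimal sketch rather than the implementation used in this study: the function names, the toy dimensions, and the random test matrices are illustrative assumptions, and only the two formulas are taken from (1) and (2). Note that the localization matrix has a different shape in each place it appears.

```python
import numpy as np

def gain_model_space(P, H, R, rho_nn):
    """Eq. (1): localize the n x n sample covariance, then apply H."""
    P_loc = rho_nn * P                      # Schur (elementwise) product
    return P_loc @ H.T @ np.linalg.inv(H @ P_loc @ H.T + R)

def gain_obs_space(P, H, R, rho_np, rho_pp):
    """Eq. (2): apply H first, then localize P H^T and H P H^T."""
    PHt = P @ H.T                           # n x p
    HPHt = H @ PHt                          # p x p
    return (rho_np * PHt) @ np.linalg.inv(rho_pp * HPHt + R)

# Toy problem (hypothetical sizes): n = 5 state variables, p = 2 observations.
rng = np.random.default_rng(0)
n, p, members = 5, 2, 4
X = rng.standard_normal((n, members))
X -= X.mean(axis=1, keepdims=True)          # ensemble perturbations
P = X @ X.T / (members - 1)                 # rank-deficient sample covariance
H = rng.random((p, n))                      # stand-in forward operator
R = 0.1 * np.eye(p)

# With no localization (all-ones rho) both formulas reduce to the raw gain.
K_raw = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
K1 = gain_model_space(P, H, R, np.ones((n, n)))
K2 = gain_obs_space(P, H, R, np.ones((n, p)), np.ones((p, p)))
```

The two gains agree only in this trivial all-ones case; for any other choice the three localization matrices must be specified separately, which is where the two approaches diverge.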

Radiance space localization has one marked advantage over model space localization: a significantly lower operation count. The Schur product in model space requires *O*(*n*^{2}) operations, while the Schur product in observation space requires *O*(*np*). Radiance space localization is more computationally efficient for global NWP models because the size of the state vector is *n* ∼ 10^{8}, two orders of magnitude larger than the typical number of observations assimilated in a 6-h window (i.e., *p* ∼ 10^{6}). In particular, (2) allows computationally efficient assimilation for serial observation processing Kalman filters, which are commonly used (Anderson 2001; Whitaker and Hamill 2002; Houtekamer and Mitchell 1998, 2001, 2005).^{1}

In addition, a desirable property of any approximation to the optimal gain matrix is that in the limit of perfect observations and infinite ensemble size, the analysis is equal to the truth. To see that (1) preserves this property whereas (2) does not, note that if the number of observed variables is equal to the number of model variables, then the forward operator 𝗛 is a square, invertible matrix. Define the model-space localized forecast error covariance matrix 𝗣_{M} = ***ρ*** ∘ 𝗣_{j}^{f}. If **y**_{t} = 𝗛**x**_{t}, then 𝗥 is identically zero, and we can write the analysis vector **x**_{a} as

$$\mathbf{x}_a = \mathbf{x}_f + \mathbf{P}_M \mathbf{H}^{\mathrm{T}} \left(\mathbf{H} \mathbf{P}_M \mathbf{H}^{\mathrm{T}}\right)^{-1} (\mathbf{y}_t - \mathbf{H}\mathbf{x}_f) = \mathbf{x}_f + \mathbf{H}^{-1}\mathbf{H}\,(\mathbf{x}_t - \mathbf{x}_f) = \mathbf{x}_t, \tag{3}$$

where **x**_{f} is the forecast. No matter how imperfect the (positive definite) model-space localization is, the analysis recovers the truth. For radiance space localization, it is not even clear what the analog to 𝗣_{M} ought to be. Defining the representer matrix implied by the denominator of (2) as 𝗛𝗣_{1}𝗛^{T} = ***ρ*** ∘ (𝗛𝗣_{j}^{f}𝗛^{T}), and the matrix implied by the numerator as 𝗣_{2}𝗛^{T} = ***ρ*** ∘ (𝗣_{j}^{f}𝗛^{T}), the analysis becomes **x**_{a} = **x**_{f} + 𝗣_{2}𝗣_{1}^{−1}(**x**_{t} − **x**_{f}), which equals the truth only if 𝗣_{1} = 𝗣_{2}. However, 𝗣_{1} ≠ 𝗣_{2} for any nontrivial ***ρ***, which is clearly an undesirable property. A simple column model that shows the consequences stemming from the radiance space approximation is presented in the next section.
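The perfect-observation argument can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original study: the 3 × 3 forward operator, the sample covariance with spurious off-diagonal terms, the localization matrix, and the truth vector are all hypothetical values chosen so that 𝗛 is square and invertible and the localization is positive definite.

```python
import numpy as np

# Hypothetical 3-variable problem with as many observations as state variables.
H = np.array([[0.75, 0.25, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.25, 0.75]])          # square and invertible
P = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.3],
              [0.2, 0.3, 1.0]])             # sample covariance with spurious terms
rho = np.array([[1.0, 0.5, 0.0],
                [0.5, 1.0, 0.5],
                [0.0, 0.5, 1.0]])           # positive definite localization

x_t = np.array([1.0, -2.0, 0.5])            # truth
x_f = np.zeros(3)                           # forecast
y = H @ x_t                                 # perfect observations, so R = 0

# Model space localization, Eq. (1) with R = 0.
P_M = rho * P
K_model = P_M @ H.T @ np.linalg.inv(H @ P_M @ H.T)
xa_model = x_f + K_model @ (y - H @ x_f)

# Radiance (observation) space localization, Eq. (2) with R = 0.
K_rad = (rho * (P @ H.T)) @ np.linalg.inv(rho * (H @ P @ H.T))
xa_rad = x_f + K_rad @ (y - H @ x_f)

err_model = np.linalg.norm(xa_model - x_t)  # zero up to round-off
err_rad = np.linalg.norm(xa_rad - x_t)      # remains finite
```

In this example the model space gain collapses to the inverse of 𝗛 regardless of the localization chosen, whereas the product of the radiance space gain and 𝗛 is not the identity, so part of the forecast error survives even with perfect observations.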

## 3. Conceptual 1D model

Here we consider a 1D model of atmospheric temperature with three vertical levels, in order to illuminate the difficulties that occur when localization is performed in radiance space. Assume that the three levels are sufficiently far apart so that the true forecast error covariance for temperature is the 3 × 3 identity matrix (Daley and Barker 2001; Ingleby 2001). It follows that the best localization function in model space is the 3 × 3 identity matrix, because a Schur product between it and the forecast error covariance matrix correctly suppresses any spurious *T*–*T* correlation between (far separated) levels (i.e., a Schur product of the identity matrix and any matrix 𝗔 eliminates all off-diagonal elements of 𝗔). Suppose further that we have a two-channel microwave satellite instrument that senses temperature. The weighting function for channel 1 was chosen to peak at the top level, and have no contribution from the bottom level; the weighting function for channel 2 was chosen to peak at the middle level, and have contributions from all three levels. More specifically, the first row of the forward operator 𝗛, corresponding to channel 1, is [0.75, 0.25, 0.0], and the second row of 𝗛, corresponding to channel 2, is [0.25, 0.50, 0.25]. Assume that the observation error in channel 1 is uncorrelated with that of channel 2 (a common assumption for real satellite instruments), and that the observation error variances are equal (for convenience). The observation error matrix 𝗥 is then the 2 × 2 identity matrix, scaled by the observation error variance *r*.

Both the true forecast error covariance matrix 𝗣^{f} and the observation error covariance matrix 𝗥 here are diagonal. Suppose that, by chance, the ensemble sample covariance 𝗣_{j}^{f} was precisely equal to the true forecast error covariance matrix 𝗣^{f}. The gain matrix (evaluated analytically with Mathematica 4.1; Wolfram Research, Inc. 2001) that results from model space localization in (1) is then identical to the true gain matrix, and is given by

$$\mathbf{K} = \frac{1}{256r^{2} + 256r + 35} \begin{pmatrix} 192r + 52 & 64r - 20 \\ 64r - 16 & 128r + 60 \\ -20 & 64r + 40 \end{pmatrix}. \tag{4}$$

The matrices 𝗛𝗣_{j}^{f}𝗛^{T} and 𝗣_{j}^{f}𝗛^{T} have nonzero off-diagonal terms because the weighting functions of channels 1 and 2 overlap.^{2}

To highlight the problems that occur when the localization width is narrower than the observation weighting function, the radiance space localization matrix was chosen to be the projection of the correct model space localization matrix into observation space. The radiance space localization that follows from (2) with ***ρ*** replaced by 3 × 2 and 2 × 2 identity matrices yields the following gain matrix:

$$\mathbf{K} = \begin{pmatrix} \dfrac{6}{8r + 5} & 0 \\ 0 & \dfrac{4}{8r + 3} \\ 0 & 0 \end{pmatrix}. \tag{5}$$

The gain matrix in (5) is diagonal, eliminating the correct correlations between channels 1 and 2, even though 𝗣_{j}^{f} was equal to 𝗣^{f}. Errors in the first guesses for each of the two radiance channels are correlated using (4), and uncorrelated using (5). The physical interpretation of the nonzero third row of (4) is that both radiance channels (correctly) influence the temperature correction for the lowest model level; in contrast, the third row of (5) is identically zero, which means that the lowest model level is completely unconstrained by observations in either channel, regardless of how small the observation error variance *r* is. Both channels affect the top and middle levels in (4), whereas only channel 1 affects the top level and only channel 2 affects the middle level in (5); again, this result is independent of *r*.

It is clear that the correct model space localization is too narrow when simply projected into radiance space. One could use a much broader localization in (2) than in (1), but in that case, spurious correlations would remain. It may be that, in the vertical in radiance space, there is *no* localization that is both sufficiently broad and sufficiently narrow. In that case, we would expect that (5) will not (on average) reduce the analysis error as effectively as (4) for any nontrivial radiance space localization.
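The conceptual model is small enough to evaluate directly. The following NumPy sketch, a minimal illustration rather than the Mathematica calculation used for this section, builds the two gains from the 𝗛, 𝗣^{f}, and 𝗥 defined here and compares the expected analysis error they produce. The helper names and the choice *r* = 0.1 are assumptions made for the example; the error formula used is the standard expected analysis error covariance for an arbitrary gain.

```python
import numpy as np

def conceptual_gains(r):
    """Gains for the three-level, two-channel model with obs error variance r."""
    H = np.array([[0.75, 0.25, 0.00],
                  [0.25, 0.50, 0.25]])
    P = np.eye(3)                        # true (and, by assumption, sample) covariance
    R = r * np.eye(2)
    # Model space localization with rho = I_3 leaves the identity P unchanged.
    P_loc = np.eye(3) * P
    K_model = P_loc @ H.T @ np.linalg.inv(H @ P_loc @ H.T + R)
    # Radiance space localization with 3 x 2 and 2 x 2 identity matrices.
    rho_np = np.eye(3, 2)
    rho_pp = np.eye(2)
    K_rad = (rho_np * (P @ H.T)) @ np.linalg.inv(rho_pp * (H @ P @ H.T) + R)
    return H, P, R, K_model, K_rad

def analysis_error_trace(K, H, P, R):
    """Trace of the expected analysis error covariance for an arbitrary gain K."""
    A = np.eye(P.shape[0]) - K @ H
    return np.trace(A @ P @ A.T + K @ R @ K.T)

r = 0.1                                  # illustrative observation error variance
H, P, R, K_model, K_rad = conceptual_gains(r)
trace_model = analysis_error_trace(K_model, H, P, R)
trace_rad = analysis_error_trace(K_rad, H, P, R)
```

Because the model space gain coincides with the optimal gain in this setting, its analysis error trace is the smallest attainable; the radiance space gain leaves the third level at its forecast error variance for every value of `r`.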

## 4. Realistic 1D model with AMSU-A analog

A more realistic 1D model was constructed from the 30 vertical levels (surface up to 4 hPa) from the Navy Operational Global Atmospheric Prediction System (NOGAPS; Hogan and Rosmond 1991) and the weighting functions for channels 6–11 (Fig. 1) of AMSU-A (Saunders 1993; NOAA 2009, section 7.3). The observation error variances for AMSU-A are taken from the operational version of the Naval Research Laboratory (NRL) Atmospheric Variational Data Assimilation System (NAVDAS) at the Fleet Numerical Meteorology and Oceanography Center (FNMOC). The forecast error covariance is constructed from the forecast error correlation and forecast error variance for temperature used in NAVDAS (Daley and Barker 2001). For convenience, the forecast error variance for temperature is assumed constant with height and equal to 1.0 K^{2} (approximately true according to Daley and Barker 2001).

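The rational function of Gaspari and Cohn (1999, hereafter GC99), with distance measured in log_{10} pressure (Fig. 2), was used as the localization function in model space. GC99 guarantees a positive semidefinite localization, verified by performing a Cholesky decomposition. Once a particular model space localization matrix ***ρ*** = 𝗟 is chosen, (1) becomes

$$\mathbf{K}_j = (\mathbf{L} \circ \mathbf{P}^f_j)\,\mathbf{H}^{\mathrm{T}} \left[\mathbf{H}\,(\mathbf{L} \circ \mathbf{P}^f_j)\,\mathbf{H}^{\mathrm{T}} + \mathbf{R}\right]^{-1}. \tag{6}$$

Next, the localization in radiance space was constructed, with each radiance observation assigned to the model level closest to the peak of the satellite channel’s weighting function (Houtekamer and Mitchell 2001) via a selector matrix 𝗦. [Other level assignment schemes are discussed in Fertig et al. (2007) and Miyoshi and Sato (2007), both in the context of the local ensemble transform Kalman filter (LETKF).] Level assignment defines a distance in radiance space, an answer to the question of “How far is channel 1 from channel 2?” The localization matrix in radiance space is given by 𝗦𝗟𝗦^{T}, and (2) becomes

$$\mathbf{K}_j = \left[(\mathbf{L}\mathbf{S}^{\mathrm{T}}) \circ (\mathbf{P}^f_j \mathbf{H}^{\mathrm{T}})\right] \left[(\mathbf{S}\mathbf{L}\mathbf{S}^{\mathrm{T}}) \circ (\mathbf{H} \mathbf{P}^f_j \mathbf{H}^{\mathrm{T}}) + \mathbf{R}\right]^{-1}. \tag{7}$$

Analysis error reduction resulting from the gain matrices in (6) and (7) can then be compared with the optimal analysis error reduction.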
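The construction of the localization matrices can be sketched as follows. The function implements the fifth-order piecewise rational function of GC99 [their Eq. (4.10)], which is zero beyond twice the length scale `c`; identifying `c` with the half-width quoted in this paper is an assumption. The level spacing and the channel peak pressures are likewise illustrative placeholders, not the NOGAPS levels or the AMSU-A weighting function peaks used in the experiments.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """GC99 Eq. (4.10): compactly supported correlation, zero beyond 2c."""
    z = np.abs(np.atleast_1d(dist).astype(float)) / c
    rho = np.zeros_like(z)
    near = z <= 1.0
    far = (z > 1.0) & (z < 2.0)
    zn, zf = z[near], z[far]
    rho[near] = -0.25*zn**5 + 0.5*zn**4 + 0.625*zn**3 - (5.0/3.0)*zn**2 + 1.0
    rho[far] = (zf**5/12.0 - 0.5*zf**4 + 0.625*zf**3 + (5.0/3.0)*zf**2
                - 5.0*zf + 4.0 - 2.0/(3.0*zf))
    return rho

# Hypothetical grid: 30 levels equally spaced in log10(pressure), 1000 to 4 hPa.
log_p = np.linspace(np.log10(1000.0), np.log10(4.0), 30)
L = gaspari_cohn(log_p[:, None] - log_p[None, :], c=0.4)    # n x n, model space

# Illustrative weighting-function peak pressures (hPa) for six channels.
peak_hpa = np.array([400.0, 250.0, 150.0, 90.0, 50.0, 25.0])
nearest = np.abs(np.log10(peak_hpa)[:, None] - log_p[None, :]).argmin(axis=1)
S = np.zeros((peak_hpa.size, log_p.size))
S[np.arange(peak_hpa.size), nearest] = 1.0                  # p x n selector matrix

rho_pp = S @ L @ S.T    # p x p localization applied to H P H^T
rho_np = L @ S.T        # n x p localization applied to P H^T
min_eig = np.linalg.eigvalsh(L).min()                       # semidefiniteness check
```

The selector matrix simply picks out rows and columns of 𝗟, so the distance between two channels is the distance between their assigned levels, however broad the weighting functions are.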

The optimal analysis error reduction was computed by assuming that the NAVDAS forecast error covariance is the true forecast error covariance. An observed radiance in each of the six simulated AMSU-A channels was constructed by summing the satellite weighting functions multiplied by the true forecast temperature (arbitrarily set to 0°C at each level), and adding random noise with variance equal to the prescribed observation error variance. The forecast temperature perturbations are given by the product of the left square root of the NAVDAS forecast error covariance matrix and a random normal vector **N**(0, *I*). The forecast radiance perturbations can be computed simply by summing the product of the satellite weighting functions and the forecast temperature perturbation at each model level, because microwave radiative transfer is approximately linear, and none of the channels chosen has significant contributions from the surface. [All of the arguments in this study apply equally well to *any* integrated measure, regardless of nonlinearity; our expectation is that the problems with radiance localization would be exacerbated for instruments such as the Special Sensor Microwave Imager (SSM/I), AMSU-B, the Microwave Humidity Sounder (MHS), the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI), the Atmospheric Infrared Sounder (AIRS), the Infrared Atmospheric Sounding Interferometer (IASI), and the Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E). Additionally, AMSU-A remains the instrument with the most global forecast impact in the Var systems at many global NWP centers, so EnKFs need to be able to handle AMSU-A radiances well.] Subtracting the forecast from the observations forms the innovation vector, consisting of a brightness temperature difference in each channel. 

A total of 100 000 trials were performed for four different ensemble sizes (8, 16, 32, and 64 members) and 7 values for error variance in channel 9 (10^{1}, 10^{0}, 10^{−1}, 10^{−2}, 10^{−3}, 10^{−4}, and 10^{−5}), with 10^{−1} observation error variance corresponding most closely to FNMOC operations (the observation error standard deviations in channels other than channel 9 were kept proportional to the channel 9 value). For each trial, a random forecast error vector **ε**^{f} and an observation error vector **ε**^{o} are generated. The innovation vector can then be formed as **ε**^{o} − 𝗛**ε**^{f} = (**y** − **y**^{t}) − 𝗛(**x**^{f} − **x**^{t}) = **y** − 𝗛**x**^{f}. A sample error covariance matrix is constructed from the ensemble, and the localization methods are applied. The mean square analysis error normalized by the mean square forecast error, averaged over all trials, is plotted against the log of the observation error variance for each ensemble size (Fig. 3). The 99% confidence intervals are shown for a raw EnKF, an EnKF localized in model space in (6), an EnKF localized in radiance space in (7), and the optimal Kalman filter. As the optimal localization width might easily be a function of observation error, ensemble size, and localization method, each point in Fig. 3 has had the localization width tuned (Table 1) to produce the lowest analysis error.^{3}

The raw EnKF (green curve) makes the analysis *worse* than the forecast when the number of ensemble members is less than the dimension of the state vector (i.e., 30). These rank-deficient cases are the most relevant to 4D EnKFs for global atmospheric data assimilation, as there are far fewer ensemble members than state vector variables. The EnKF localized in model space (red curve) significantly outperforms the EnKF localized in radiance space (cyan curve) in the mean for all ensemble sizes and all observation error variances less than 10 times the forecast error variance.^{4}

The theoretical best result (dark blue curve) converges quickly to approximately 0.72 as the observations are made more accurate. The reason that it does not converge to 0 is that 6 perfect observations are insufficient to specify the 30-level model state. Results from experiments with sufficient observations to specify the model state are presented in the next section.
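The trial procedure of this section can be sketched in NumPy as follows. This is a scaled-down illustration, not a reproduction of the experiments: the forecast error covariance, the six weighting functions, the triangular taper used in place of GC99, the untuned localization width, the ensemble size, and the number of trials are all hypothetical stand-ins for the NAVDAS, AMSU-A, and tuned values described in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, members, trials = 30, 6, 16, 500
obs_var = 0.1

# Hypothetical stand-ins for the NAVDAS covariance and AMSU-A weighting functions.
lev = np.arange(n)
dist = np.abs(lev[:, None] - lev[None, :])
P_true = 0.7 ** dist                                  # unit variance, AR(1) correlation
peaks = np.array([4, 8, 12, 16, 20, 24])              # assumed peak levels
H = np.exp(-0.5 * ((lev[None, :] - peaks[:, None]) / 3.0) ** 2)
H /= H.sum(axis=1, keepdims=True)                     # each row sums to one
R = obs_var * np.eye(p)
sqrt_P = np.linalg.cholesky(P_true)                   # left square root

# Triangular taper (a valid 1D correlation function), used here instead of GC99.
L = np.clip(1.0 - dist / 8.0, 0.0, None)
S = np.zeros((p, n))
S[np.arange(p), peaks] = 1.0                          # level assignment

def gain(PHt, HPHt):
    return PHt @ np.linalg.inv(HPHt + R)

K_opt = gain(P_true @ H.T, H @ P_true @ H.T)
sq_err = {"forecast": 0.0, "raw": 0.0, "model": 0.0, "radiance": 0.0, "optimal": 0.0}
for _ in range(trials):
    eps_f = sqrt_P @ rng.standard_normal(n)           # forecast error
    eps_o = np.sqrt(obs_var) * rng.standard_normal(p) # observation error
    innov = eps_o - H @ eps_f                         # equals y - H x^f
    X = sqrt_P @ rng.standard_normal((n, members))
    X -= X.mean(axis=1, keepdims=True)
    P_e = X @ X.T / (members - 1)                     # sample covariance
    gains = {
        "raw": gain(P_e @ H.T, H @ P_e @ H.T),
        "model": gain((L * P_e) @ H.T, H @ (L * P_e) @ H.T),
        "radiance": gain((L @ S.T) * (P_e @ H.T), (S @ L @ S.T) * (H @ P_e @ H.T)),
        "optimal": K_opt,
    }
    sq_err["forecast"] += eps_f @ eps_f
    for name, K in gains.items():
        eps_a = eps_f + K @ innov                     # analysis error
        sq_err[name] += eps_a @ eps_a

# Mean square analysis error normalized by mean square forecast error.
normalized = {k: v / sq_err["forecast"] for k, v in sq_err.items() if k != "forecast"}
```

Because the same forecast and observation errors are used for every gain within a trial, differences between the entries of `normalized` reflect the gains themselves rather than sampling noise in the errors. Reproducing the tuned curves would additionally require a search over the localization width for each configuration.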

## 5. Realistic 1D model with idealized microwave instrument

Given as many independent satellite radiances as vertical levels, the analysis error should tend to zero as the observation error variance tends to zero. The experiments in section 4 were repeated with a synthetic 30-channel satellite instrument. The satellite weighting functions were chosen to be approximately Gaussian, peaking at the 30 NOGAPS levels, and decaying to 0 within ±3 model levels (not shown). In total, 10 000 trials were performed for 6 different ensemble sizes (8, 12, 16, 20, 24, and 28) and 8 different observation error variances (10^{1}, 10^{0}, 10^{−1}, 10^{−2}, 10^{−3}, 10^{−4}, 10^{−5}, and 10^{−6}) with each hypothetical channel having equal observation error variance. Normalized analysis error variance is plotted against the log of the observation error variance for the 16-member ensemble, and 99.9% confidence intervals are shown (Fig. 4) for a raw EnKF, an EnKF localized in model space in (6), an EnKF localized in radiance space in (7), and the optimal Kalman filter. As the observation error variance is reduced, the average analysis error variance for both the optimal filter (dark blue) and model space localized EnKF (red) converges to 0, while the radiance space localized EnKF (cyan) plateaus at 0.27, significantly above 0. Model space localization produced a smaller analysis error than radiance space localization 73% of the time for observation error variances equal to 10^{−1}, and 100% of the time for observation error variances equal to 10^{−5} or less (Fig. 5).

To ensure that the results were not due to lack of tuning, a further set of experiments was performed, varying the half-width of the GC99 rational function from 0.1 to 1.2 for radiance space localization. The optimal half-width for the 16-member ensemble for the smallest observation error variance was found to be 0.88 rather than 0.40; however, the resulting analyses improved only slightly, reaching a minimum average analysis error variance of 0.24. At the lowest observation error variance tested, the optimally tuned radiance space localization in (7) was inferior to the untuned model space localization in (6) for all 10 000 trials.
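The plateau can be isolated from sampling error with a deterministic calculation. The sketch below makes several simplifying assumptions that differ from the experiments reported here: the sample covariance is set equal to the true covariance (the infinite-ensemble limit), the forecast error covariance is a hypothetical AR(1) model, the 30 weighting functions are Gaussians of assumed width truncated at ±3 levels, and a triangular taper stands in for GC99. Expected analysis error is computed from the standard formula for an arbitrary gain.

```python
import numpy as np

n = 30
lev = np.arange(n)
dist = np.abs(lev[:, None] - lev[None, :])

# Hypothetical ingredients (not the NAVDAS/NOGAPS values used in the paper).
P = 0.6 ** dist                                    # forecast error covariance
W = np.where(dist <= 3, np.exp(-0.5 * (dist / 0.7) ** 2), 0.0)
H = W / W.sum(axis=1, keepdims=True)               # 30 channels, one peak per level
L = np.clip(1.0 - dist / 6.0, 0.0, None)           # triangular taper, stand-in for GC99

def normalized_error(K, R):
    """Expected analysis error variance / forecast error variance for gain K."""
    A = np.eye(n) - K @ H
    return np.trace(A @ P @ A.T + K @ R @ K.T) / np.trace(P)

obs_var = 10.0 ** np.arange(1, -9, -1)             # 1e1 down to 1e-8
err_model, err_rad = [], []
for r in obs_var:
    R = r * np.eye(n)
    P_M = L * P
    K_model = P_M @ H.T @ np.linalg.inv(H @ P_M @ H.T + R)
    # Each channel is assigned to its own level, so S = I and both
    # radiance space localization matrices equal L.
    K_rad = (L * (P @ H.T)) @ np.linalg.inv(L * (H @ P @ H.T) + R)
    err_model.append(normalized_error(K_model, R))
    err_rad.append(normalized_error(K_rad, R))
```

In this idealized setting the model space error vanishes with the observation error variance, while the radiance space error levels off at a nonzero value even though no sampling error is present, consistent with the argument of section 2.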

## 6. Summary and conclusions

Although studies (e.g., Houtekamer and Mitchell 2005) have shown that useful information can be extracted from satellite radiances using radiance space localization, the simple examples presented here indicate that more improvement is possible. Two problems with distance-based radiance space localization in the vertical have been highlighted: 1) distance and location are not well defined for integrated measures and 2) broad satellite weighting functions force localization functions to either be so broad that they are ineffective, or so narrow that true interchannel error covariances are suppressed or eliminated. In experiments with 1D models based on the NAVDAS forecast error covariance model, radiance space localization produced analyses that were systematically worse than those produced by model space localization for all observation error variances less than 10 times the forecast error variance, including a case with typical values used in the operational data assimilation of AMSU-A at FNMOC. Finally, radiance space localization is incapable of recovering the true state with a sufficient set of radiance channels and vanishingly small observation error, which is not surprising given that (2) was not derived by a formal limit procedure from (1). As there are existing ensemble data assimilation methods that do not require radiance space covariance localization (e.g., Buehner 2005; Bishop and Hodyss 2009), we recommend that users carefully weigh the computational performance gains they expect relative to the drawbacks demonstrated here.

## Acknowledgments

The authors thank Jeff Whitaker, Peter Houtekamer, Herschel Mitchell, and our anonymous reviewer for their valuable comments. We gratefully acknowledge the support of funding from the National Oceanic and Atmospheric Administration (NOAA), the Office of Naval Research (ONR), and the National Research Council. In particular, support from ONR Project Element 0602435N, Project BE-435-003, ONR Grant N0001407WX30012, and from NOAA, THORPEX Grant NA04AANRG0233 is acknowledged.

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Anderson, J. L., 2007: Exploring the need for localization in ensemble data assimilation using a hierarchical ensemble filter. *Physica D*, **230**, 99–111.

Andersson, E., J. Pailleux, J-N. Thepaut, J. R. Eyre, A. P. McNally, A. G. Kelly, and P. Courtier, 1994: Use of cloud-cleared radiances in three/four-dimensional variational data assimilation. *Quart. J. Roy. Meteor. Soc.*, **120**, 627–653.

Baker, N. L., and W. F. Campbell, 2005: AMSU-A radiance assimilation for the U.S. Navy. *Bull. Amer. Meteor. Soc.*, **86**, 22–24.

Bishop, C. H., and D. Hodyss, 2009: Ensemble covariances adaptively localized with ECO-RAP. Part 2: A strategy for the atmosphere. *Tellus*, **61**, 97–111.

Buehner, M., 2005: Ensemble-derived stationary and flow-dependent background error covariances: Evaluation in a quasi-operational NWP setting. *Quart. J. Roy. Meteor. Soc.*, **131**, 1013–1043.

Daley, R., and E. Barker, 2001: NAVDAS: Formulation and diagnostics. *Mon. Wea. Rev.*, **129**, 869–883.

Derber, J. C., and W. S. Wu, 1998: The use of TOVS cloud-cleared radiances in the NCEP SSI analysis. *Mon. Wea. Rev.*, **126**, 2287–2299.

English, S. J., R. J. Renshaw, P. C. Dibben, A. J. Smith, P. J. Rayer, C. Poulsen, F. W. Saunders, and J. R. Eyre, 2000: A comparison of the impact of TOVS and ATOVS satellite sounding data on the accuracy of numerical weather forecasts. *Quart. J. Roy. Meteor. Soc.*, **126**, 2911–2931.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte-Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99** (C5), 10143–10162.

Eyre, J. R., S. J. English, P. Butterworth, R. J. Renshaw, J. K. Ridley, and M. A. Ringer, 2000: Recent progress in the use of satellite data in NWP. Met Office NWP Divisional Rep. 296.

Fertig, E. J., B. R. Hunt, E. Ott, and I. Szunyogh, 2007: Assimilating non-local observations with a local ensemble Kalman filter. *Tellus*, **59A**, 719–730.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757.

Hogan, T., and T. Rosmond, 1991: The description of the Navy Operational Global Atmospheric Prediction System’s spectral forecast model. *Mon. Wea. Rev.*, **119**, 1786–1815.

Hollingsworth, A., and P. Lönnberg, 1986: The statistical structure of short-range forecast errors as determined from radiosonde data. Part I: The wind field. *Tellus*, **38A**, 111–136.

Horn, R. A., and C. R. Johnson, 1990: *Matrix Analysis*. Cambridge University Press, 575 pp.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137.

Houtekamer, P. L., and H. L. Mitchell, 2005: Ensemble Kalman filtering. *Quart. J. Roy. Meteor. Soc.*, **131**, 3269–3289.

Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. *Mon. Wea. Rev.*, **133**, 604–620.

Ingleby, N. B., 2001: The statistical structure of forecast errors and its representation in The Met. Office global 3-D variational data assimilation scheme. *Quart. J. Roy. Meteor. Soc.*, **127**, 209–231.

Kelly, G. A., 1997: Influence of observations on the operational ECMWF system. *Tech. Proc. Ninth Int. TOVS Study Conf.*, Igls, Austria, European Centre for Medium-Range Weather Forecasts, 239–244.

Keppenne, C. L., 2000: Data assimilation into a primitive-equation model with a parallel ensemble Kalman filter. *Mon. Wea. Rev.*, **128**, 1971–1981.

Miyoshi, T., and Y. Sato, 2007: Assimilating satellite radiances with a local ensemble transform Kalman filter (LETKF) applied to the JMA global model (GSM). *SOLA*, **3**, 37–40.

NOAA, 2009: KLM user’s guide with NOAA-N,-N’ supplement. [Available online at http://www2.ncdc.noaa.gov/docs/klm/index.htm.]

Saunders, R. W., 1993: Note on the Advanced Microwave Sounding Unit. *Bull. Amer. Meteor. Soc.*, **74**, 2211–2212.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924.

Wolfram Research, Inc., 2001: Mathematica, version 4.1.

Table 1. Empirical optimal localization widths in log_{10} pressure (hPa) as a function of ensemble size, localization type (model space and radiance space), and observation error standard deviation.

^{1} Equation (2) also has inconsistent notation; ***ρ*** is a *p* × *p* matrix in observation space in the denominator and an *n* × *p* matrix in the numerator, whereas in (1) ***ρ*** is an *n* × *n* matrix (*n* is the number of model state variables, and *p* is the number of observations).

^{2} For point measurements such as radiosonde temperatures, however, 𝗛𝗣_{j}^{f}𝗛^{T} is simply interpolation to the nearest model level, and (2) reduces to (1).

^{3} Note that model space localization is much less sensitive to tuning than radiance space localization, and that the broad radiance space localizations appropriate for radiances will not be optimal for radiosondes.

^{4} For the largest observation error variance shown, even the theoretical best method makes a negligible contribution to analysis error reduction. However, radiance space localization does perform slightly better than model space localization in this regime, and we do not currently have an explanation for this behavior.