## 1. Introduction

About 10 years after Evensen (1994) introduced the ensemble Kalman filter (EnKF), and after Houtekamer and Mitchell (1998) first applied it to an atmospheric system, EnKF is rapidly becoming a viable choice for operational numerical weather prediction (NWP) systems. In fact, the Canadian Meteorological Centre applied an EnKF method with perturbed observations to its operational ensemble prediction system (EPS) in January 2005, following the preoperational investigation by Houtekamer et al. (2005). Because perturbing observations is a source of sampling errors, Whitaker and Hamill (2002) proposed a square root filter (SRF) form of EnKF that does not perturb observations. Tippett et al. (2003) summarized several variants of the ensemble SRF (EnSRF), all of which are efficient when observations are assimilated serially. Alternatively, Ott et al. (2002, 2004) proposed the local ensemble Kalman filter (LEKF), a kind of EnSRF that assimilates observations simultaneously and is efficient when implemented on a parallel architecture. Furthermore, Hunt et al. (2007) applied the ensemble transform Kalman filter (ETKF; Bishop et al. 2001) approach to accelerate LEKF; the resulting method is called the local ensemble transform Kalman filter (LETKF; cf. Harlim 2006). Szunyogh et al. (2005) developed and assessed LEKF with the National Centers for Environmental Prediction’s Global Forecasting System (NCEP GFS) at a T62/L28 resolution. Recently, they succeeded in assimilating real observations (excluding satellite radiances; E. Kalnay 2006, personal communication). Whitaker et al. (2007) also applied their serial EnSRF to NCEP GFS at a T62/L28 resolution and assimilated real observations. Both studies obtained encouraging results.

A main limitation of the EnSRF experiments published thus far is that their resolution has been lower than that of current operational data assimilation systems or EPSs. A higher-resolution model with more precise physical processes increases the computational time and, more importantly, may require a larger ensemble size for stable filtering, which would increase the computational cost substantially. Thus, the present paper investigates EnKF at the relatively high resolution of T159/L48, comparable to current operational systems. Because we use a massively parallel supercomputer known as the Earth Simulator (ES), the parallel-efficient LETKF is chosen for this study. The LETKF scheme is briefly summarized in section 2 (see Hunt et al. 2007 for details). Section 3 describes the implementation of LETKF in this study. Following the description of the model and experimental settings in section 4, results are presented in section 5. Finally, a summary and discussion are given in section 6.

## 2. LETKF

LETKF was proposed by Hunt et al. (2007) as an efficient upgrade of LEKF (Ott et al. 2002, 2004) and has been tested by Harlim (2006) with the Lorenz-96 model (Lorenz 1996). Szunyogh et al. (2005) assessed LEKF with T62/L28 NCEP GFS, and they also tested LETKF with the same system, confirming no significant difference in convergence between LEKF and LETKF (I. Szunyogh et al. 2006, personal communication). An important advantage of the LETKF/LEKF schemes is their efficiency on parallel computers. Because LETKF/LEKF separate the entire global grid into independent local patches, ideally they have 100% parallel efficiency. Because LETKF is several times faster than LEKF (E. Kalnay et al. 2005, personal communication), LETKF is the better choice.

The LETKF scheme first separates an entire grid vector into local patch vectors. Each grid point has its own local patch; the number of local patch vectors equals the number of global grid points. The local patch has a cubelike shape characterized by the horizontal and vertical grid lengths *l* and *l*_{υ}. We define the local patch size parameters *l* and *l*_{υ} as a half side from the local patch center, so that the local patch becomes a rectangular solid with (2*l* + 1) × (2*l* + 1) × (2*l*_{υ} + 1) grid points. Namely, if the coordinate of the local patch center is (*i*, *j*, *k*), the 8 corners to define the rectangular solid are (*i* ± *l*, *j* ± *l*, *k* ± *l*_{υ}), where all combinations of plus and minus signs are applicable. The treatment of boundaries is described later, in section 3.

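Let 𝗘 denote an *N* × *m* matrix composed of *m* ensemble perturbations of an *N*-dimensional local patch. Assuming forecast ensemble perturbations 𝗘^{f} sample the forecast errors, we estimate the forecast error covariance 𝗣^{f} as

$$\mathbf{P}^f \approx \frac{1}{m-1}\,\mathbf{E}^f (\mathbf{E}^f)^T. \tag{1}$$

We regard the *m* × *m* matrix

$$\tilde{\mathbf{P}}^f = \frac{1}{m-1}\,\mathbf{I} \tag{2}$$

as the forecast error covariance matrix in the space spanned by the forecast ensemble perturbations and perform Kalman filter analysis equations in the *m*-dimensional space. Note that *m* is usually smaller than *N* and that 𝗘^{f} gives the mapping from the *m*-dimensional space to the *N*-dimensional physical space. The analysis error covariance is written as

$$\tilde{\mathbf{P}}^a = \left[(m-1)\,\mathbf{I} + (H\mathbf{E}^f)^T \mathbf{R}^{-1} (H\mathbf{E}^f)\right]^{-1}, \tag{3}$$

where 𝗥 is the observation error covariance matrix and the observation operator *H* is treated by

$$H\,\delta\mathbf{x} \approx H(\bar{\mathbf{x}} + \delta\mathbf{x}) - H(\bar{\mathbf{x}}), \tag{4}$$

where **x̄** and *δ***x** denote an ensemble mean and a perturbation. The analysis increment *δ***x̃**^{a} in the *m*-dimensional space is

$$\delta\tilde{\mathbf{x}}^a = \tilde{\mathbf{P}}^a (H\mathbf{E}^f)^T \mathbf{R}^{-1} \mathbf{d}, \tag{5}$$

where **d** = **y**^{o} − *H*(**x̄**^{f}) is the observational increment. The analysis ensemble perturbations in the *m*-dimensional space are

$$\tilde{\mathbf{E}}^a = \left[(m-1)\,\tilde{\mathbf{P}}^a\right]^{1/2}. \tag{6}$$

The analysis ensemble members [**x**^{a(1)} | · · · | **x**^{a(m)}] in the *N*-dimensional physical space are computed by mapping with 𝗘^{f}:

$$\left[\mathbf{x}^{a(1)} \,|\, \cdots \,|\, \mathbf{x}^{a(m)}\right] = \left[\bar{\mathbf{x}}^f \,|\, \cdots \,|\, \bar{\mathbf{x}}^f\right] + \mathbf{E}^f \left(\left[\delta\tilde{\mathbf{x}}^a \,|\, \cdots \,|\, \delta\tilde{\mathbf{x}}^a\right] + \tilde{\mathbf{E}}^a\right), \tag{7}$$

where **x̄**^{f} is the forecast ensemble mean state. If *N* is large, the computation is significantly efficient because it requires multiplication with the *N* × *m* matrix 𝗘^{f} only once. As pointed out by Hunt et al. (2007), this algorithm is more efficient if one shares eigenvalue decomposition in solving both Eq. (3) (matrix inversion) and Eq. (6) (matrix square root). Namely, using the eigenvalue decomposition

$$(m-1)\,\mathbf{I} + (H\mathbf{E}^f)^T \mathbf{R}^{-1} (H\mathbf{E}^f) = \mathbf{U}\mathbf{D}\mathbf{U}^T, \tag{8}$$

we obtain

$$\tilde{\mathbf{P}}^a = \mathbf{U}\mathbf{D}^{-1}\mathbf{U}^T, \tag{9}$$

$$\tilde{\mathbf{E}}^a = \sqrt{m-1}\,\mathbf{U}\mathbf{D}^{-1/2}\mathbf{U}^T. \tag{10}$$

Note that 𝗨𝗨^{T} = 𝗜 because the left-hand side of Eq. (8) is symmetric. After computing analysis ensemble members for all local patches independently, we combine the values at the center point of each local patch to obtain the global analysis ensemble members.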
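The analysis for one local patch can be sketched in a few lines of NumPy. This is a minimal illustration, not the FORTRAN90 implementation used in this study: the function name `letkf_patch_analysis`, its arguments, and the scalar test case are hypothetical, and the formulas in the comments are the standard LETKF forms assumed from Hunt et al. (2007). The point of the sketch is that a single symmetric eigenvalue decomposition serves both the matrix inversion and the matrix square root.

```python
import numpy as np

def letkf_patch_analysis(Xf, h, yo, R):
    """Illustrative LETKF analysis for one local patch (hypothetical helper).

    Xf : (N, m) forecast ensemble, one member per column
    h  : callable mapping an (N,) state to a (p,) observation-space vector
    yo : (p,) observations
    R  : (p, p) observation error covariance
    """
    N, m = Xf.shape
    xf_mean = Xf.mean(axis=1)                      # forecast ensemble mean
    Ef = Xf - xf_mean[:, None]                     # forecast perturbations E^f
    hx_mean = h(xf_mean)
    # H E^f by differences: H(mean + perturbation) - H(mean), column by column
    HEf = np.stack([h(Xf[:, i]) - hx_mean for i in range(m)], axis=1)
    d = yo - hx_mean                               # observational increment

    C = HEf.T @ np.linalg.inv(R)                   # (H E^f)^T R^-1
    A = (m - 1) * np.eye(m) + C @ HEf              # symmetric m x m matrix
    evals, U = np.linalg.eigh(A)                   # A = U D U^T, computed once

    Pa = (U / evals) @ U.T                         # U D^-1 U^T   (inverse)
    Ea = np.sqrt(m - 1) * (U / np.sqrt(evals)) @ U.T   # symmetric square root
    dxa = Pa @ (C @ d)                             # mean increment, ensemble space

    W = Ea + dxa[:, None]                          # add the increment to every column
    return xf_mean[:, None] + Ef @ W               # one multiplication with E^f

# Scalar check: four members, identity observation operator (assumed values).
Xf = np.array([[1.0, 2.0, 4.0, 5.0]])
Xa = letkf_patch_analysis(Xf, lambda x: x, np.array([5.0]), np.array([[2.0]]))
```

In this scalar case the forecast variance is 10/3 and the observation error variance is 2, so the analysis mean and sample variance should coincide with the ordinary Kalman filter values, 4.25 and 1.25.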

[…] **y**^{o} is only 1. The variable *w* denotes the localization weight determined by the distance between the analyzed grid point and the observation location, and Map() indicates a projection mapping from the state vector to return variables at the analyzed grid point. Although Eqs. (11) and (12) are similar, they are not equivalent. Equation (11) has a larger denominator, thus the localization effect is stronger; that is, the observation localization works a little more weakly than the conventional localization. In the case with multiple observations, the situation would be a little different, but the example with a single observation illustrates the essential difference. Miyoshi (2005) confirmed that the observation localization significantly stabilizes LEKF using the Lorenz-96 model and the Simplified Parameterization, Primitive-Equation Dynamics (SPEEDY) global model (Molteni 2003), showing performance as good as the serial EnSRF by Whitaker and Hamill (2002).
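The single-observation comparison can be illustrated numerically. The scalar forms below are assumptions chosen to be consistent with the statement about the denominators: the conventional localization multiplies the gain by *w*, whereas the observation localization divides the observation error variance by *w*. The function and variable names are hypothetical.

```python
def increment_conventional(cov_xy, var_y, r_obs, d, w):
    """Covariance localization: the weight w multiplies the gain directly."""
    return w * cov_xy / (var_y + r_obs) * d

def increment_obs_localization(cov_xy, var_y, r_obs, d, w):
    """Observation localization: the observation error variance becomes r_obs / w.

    Algebraically equal to w * cov_xy / (w * var_y + r_obs) * d, which has a
    smaller denominator than the conventional form whenever 0 < w < 1.
    """
    return cov_xy / (var_y + r_obs / w) * d

# Assumed illustrative values: cross covariance 0.5, forecast variance 1.0 at
# the observation, observation error variance 1.0, innovation 2.0, weight 0.5.
conv = increment_conventional(0.5, 1.0, 1.0, 2.0, 0.5)
obsl = increment_obs_localization(0.5, 1.0, 1.0, 2.0, 0.5)
```

With these values the conventional increment is 0.25 and the observation-localized increment is 1/3, so the observation localization damps the increment less. The two coincide when *w* = 1.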

## 3. Implementation

### a. LETKF implementation

The LETKF core modules are applicable to any dynamical model, from the Lorenz-96 model with 40 dimensions to an operational AGCM with tens of millions of dimensions. The FORTRAN90 codes originally developed by Miyoshi (2005) based on the Lorenz-96 and SPEEDY models are optimized and parallelized in this study. Because Miyoshi (2005) implemented only LEKF, LETKF is newly developed in a consistent manner so that the two schemes can be switched easily. The LETKF codes have been tested with the Lorenz-96 model, showing the same level of accuracy as LEKF. The results are consistent with those of Harlim (2006).

For the weighting function *w*(*r*) of the observation localization, we apply the Gaussian function:

$$w(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right),$$

where *r* and *σ* denote the distance from the local patch center and the localization scale parameter, respectively. We define different scale parameters for horizontal and vertical localizations (*σ* and *σ*_{υ}).

The local patches at the boundaries are defined in a similar way as in Miyoshi (2005) and Szunyogh et al. (2005). In the zonal direction, a cyclic boundary is applied. In the meridional direction, we take points beyond the poles. Namely, letting *N*_{lon} and *N*_{lat} denote the largest zonal and meridional indices, respectively, we define the boundary points as (*i*, *N*_{lat} + 1) = [*i* + (*N*_{lon}/2), *N*_{lat}] and (*i*, 0) = [*i* + (*N*_{lon}/2), 1]. As for the vertical boundaries at the bottom and top levels, we simply eliminate indices beyond the boundaries, so that the local patch is smaller. For example, if the coordinate of the local patch center is (*i*, *j*, 1), the 8 corners to define the rectangular solid are (*i* ± *l*, *j* ± *l*, 1) and (*i* ± *l*, *j* ± *l*, 1 + *l*_{υ}), where all plus and minus combinations are applicable.
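The boundary rules above can be sketched as an index generator. This is an illustration under stated assumptions: indices are 1-based as in the text, the function name `patch_indices` is hypothetical, and rows more than one step beyond a pole are handled by mirroring, which extends the two boundary identities given in the text rather than reproducing a rule stated there.

```python
def patch_indices(i, j, k, l, lv, n_lon, n_lat, n_lev):
    """Return 1-based (lon, lat, lev) indices of the local patch centred at (i, j, k)."""
    points = []
    for dk in range(-lv, lv + 1):
        kk = k + dk
        if kk < 1 or kk > n_lev:
            continue                        # vertical: drop levels outside the model
        for dj in range(-l, l + 1):
            for di in range(-l, l + 1):
                ii, jj = i + di, j + dj
                if jj > n_lat:              # beyond the pole at j = n_lat
                    jj = 2 * n_lat + 1 - jj
                    ii += n_lon // 2        # continue on the opposite meridian
                elif jj < 1:                # beyond the pole at j = 1
                    jj = 1 - jj
                    ii += n_lon // 2
                ii = (ii - 1) % n_lon + 1   # zonal: cyclic boundary
                points.append((ii, jj, kk))
    return points

# T159/L48 grid from section 4: 480 x 240 x 48 points.
interior = patch_indices(100, 100, 24, 5, 2, 480, 240, 48)
bottom = patch_indices(100, 100, 1, 5, 2, 480, 240, 48)
polar = patch_indices(1, 240, 24, 1, 0, 480, 240, 48)
```

An interior patch with *l* = 5 and *l*_{υ} = 2 contains 11 × 11 × 5 = 605 points, while the same patch centred on the bottom level keeps only three levels and therefore 363 points.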

### b. Parallel implementation

Parallel computing has two purposes: one is to separate computations to reduce total computational time, the other is to separate memory to enable memory-demanding computations. Because EnKF requires a large memory space to store all ensemble members, EnKF needs to take advantage of parallel computing for both purposes. In both ways, LETKF is an ideal parallel algorithm because of the independent treatment of each local patch.

The parallel implementation of LETKF in this study is as follows. Let *m* denote the ensemble size. For simplicity, we assume that we use *m* computational nodes (hereafter, nodes), although our implementation allows flexible choices. First, each node reads one ensemble member for efficient parallel data access. At this moment, the *i*th node has a global state vector of the *i*th ensemble member as in Fig. 1a. Then, each node separates the data into *m* physical regions and exchanges them with the other nodes, so that the *i*th node has *m* members of the *i*th separated region (Fig. 1b). This is the first of two internode communications. Each node reads all observational data but stores only the data that fall inside the *i*th region. Then, the *i*th node computes the LETKF for the local patches included in the *i*th region. Each node has *N*_{all}/*m* local patches, where *N*_{all} denotes the number of global grid points. After computing the LETKF analysis for all local patches, the *i*th node has *m* analysis ensemble members in the *i*th region. Finally, the second data transfer is executed to exchange data back, so that the *i*th node has the *i*th analysis ensemble member of the global grid, and each node outputs data at the same time.
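The two internode communications amount to a transpose between a member-major and a region-major layout. The following single-process sketch mimics that data movement with plain Python lists; it does not use MPI, all function names are hypothetical, and the LETKF computation between the two exchanges is omitted.

```python
def split_into_regions(state, m):
    """Split a flat global state into m contiguous regions of nearly equal size."""
    n = len(state)
    bounds = [n * r // m for r in range(m + 1)]
    return [state[bounds[r]:bounds[r + 1]] for r in range(m)]

def exchange_members_to_regions(by_member):
    """by_member[i][r] (region r of member i) -> by_region[r][i]."""
    m = len(by_member)
    return [[by_member[i][r] for i in range(m)] for r in range(m)]

def exchange_regions_to_members(by_region):
    """Inverse exchange: by_region[r][i] -> by_member[i][r]."""
    m = len(by_region)
    return [[by_region[r][i] for r in range(m)] for i in range(m)]

m, n_all = 4, 12
# Node i starts with member i; each value records (member, grid point).
members = [[(i, g) for g in range(n_all)] for i in range(m)]
by_member = [split_into_regions(x, m) for x in members]
by_region = exchange_members_to_regions(by_member)    # first communication
# ... node r would now analyse the local patches of region r ...
restored = exchange_regions_to_members(by_region)     # second communication
analysis = [[v for region in node for v in region] for node in restored]
```

After the first exchange, node *r* holds *N*_{all}/*m* grid points for every member; after the second, node *i* again holds the complete global field of member *i*.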

Recent parallel computers have a multilayer parallel architecture; that is, shared-memory parallel units are combined to form a large separated-memory parallel computer. In the present implementation, we separate regions horizontally for the separated-memory architecture using the Message Passing Interface (MPI) instructions. In addition, we implemented shared-memory parallelization, where we use the OpenMP instructions for the vertical index of the loop to compute the independent local patch. In summary, there are horizontal and vertical indices of a large loop to compute each local patch; all can be done independently. Horizontally we separate the memory and the loop indices by MPI; vertically we separate the loop index by OpenMP.

## 4. Forecast model and experimental settings

### a. AGCM for the ES (AFES) model

In the present experiments, the ES, a massively parallel supercomputer with 640 nodes, is used to enable LETKF with a relatively high-resolution model and a large ensemble size. Each node contains a 16-GB shared memory and 8 vector processors with peak performance of 8 GFlops each, that is, 64 GFlops per node.

The forecast model is an AGCM known as AFES (Ohfuchi et al. 2004), specifically optimized for ES. AFES has been developed by modifying the FORTRAN codes of the Center for Climate System Research/National Institute for Environmental Studies (CCSR/NIES) AGCM, version 5.4.02 (Numaguti et al. 1997), especially optimizing the Legendre transform and adding physical processes such as the cumulus convection scheme by Emanuel and Zivkovic-Rothman (1999). It has a primitive equation dynamical core and all components of physical processes that operational AGCMs contain, so that it produces forecasts as reliable as operational AGCMs, though there are some known systematic bias errors (e.g., Nakamura et al. 2005; Enomoto et al. 2007). The prognostic variables are wind components, temperature, specific humidity, liquid water content, surface pressure, and ground temperature and moisture, all of which are analyzed by LETKF. We chose the T159/L48 resolution, corresponding to a grid of 480 × 240 × 48 that is equivalent to a 0.75° × 0.75° latitudinal and longitudinal horizontal resolution. It has a sigma vertical coordinate from 0.9949 up to 0.0025; if the surface pressure is 1000 hPa, the bottom and top levels are at 994.9 and 2.5 hPa, respectively.

The lower boundary conditions are prescribed by the observed weekly sea surface temperature and sea ice coverage made from National Oceanic and Atmospheric Administration (NOAA) optimum interpolation SST, version 2, data (Reynolds et al. 2002) provided by the NOAA/Cooperative Institute for Research in Environmental Sciences Earth System Research Laboratory/Physical Sciences Division (NOAA/CIRES ESRL/PSD) Climate Diagnostics branch. The ozone concentration in the model is based on the monthly and zonally averaged climatology of Wang et al. (1995). The surface topography and characterization are based on the Global 30 Arc-Second Elevation (GTOPO30) and Global Land Cover Characterization (GLCC) datasets, respectively, both distributed by the U.S. Geological Survey’s Center for Earth Resources Observation and Science.

### b. Experimental settings

#### 1) Observing Systems Simulation Experiments (OSSEs)

Two types of OSSEs under the perfect model assumption are performed in this study: one with a regular observing network (hereafter, OSSE1) and the other with a real observing network (hereafter, OSSE2). The true nature run for OSSEs is generated by running AFES for 9 months, starting with a randomly chosen initial condition. The initial time for the true run is set to 0000 UTC on 1 December 2003. After a 5-month spinup, the data assimilation cycle is started from 0600 UTC 1 May 2004 and 0000 UTC 1 August 2004 for OSSE1 and OSSE2, respectively. The initial ensemble members are chosen from fields of the true run at randomly chosen times in other months close to the assimilation time; in this way, the initial ensemble mean is an analog of the climatological mean in a slightly different season. Observational data are simulated by adding random noise to the true states mapped by the observational operator. The random noise is sampled from the Gaussian distribution with given observational error standard deviations. We sample observations every 6 h from the true run.

For OSSE1, observation locations are constant in time and space. One observation is taken at every 5 × 5 × 4 grid points, physically about every 420 km at the equator and 300 km at 45° latitude. The resulting coverage is 4% horizontally and 1% in the three-dimensional space out of the entire grid. Each station is assumed to have observations of zonal and meridional wind components, temperature, specific humidity, and surface pressure. The observational error standard deviations are defined as 1.0 m s^{−1}, 1.0 K, 0.1 g kg^{−1}, and 1.0 hPa for wind components, temperature, specific humidity, and surface pressure observations, respectively.
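The coverage and spacing figures for the regular network follow from simple arithmetic, sketched below. The mean Earth radius of 6371 km is an assumption; the grid size and observation stride are taken from the text, and the variable names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0      # assumed mean Earth radius
N_LON = 480                   # zonal grid points at T159
STRIDE_H, STRIDE_V = 5, 4     # one observation per 5 x 5 x 4 grid points

def zonal_spacing_km(lat_deg, n_points=1):
    """Physical zonal distance spanned by n_points grid intervals at a latitude."""
    dlon = 2.0 * math.pi / N_LON
    return n_points * EARTH_RADIUS_KM * dlon * math.cos(math.radians(lat_deg))

coverage_horizontal = 1.0 / (STRIDE_H * STRIDE_H)     # fraction of columns observed
coverage_3d = coverage_horizontal / STRIDE_V          # fraction of grid points observed
spacing_equator = zonal_spacing_km(0.0, STRIDE_H)     # roughly 417 km
spacing_45 = zonal_spacing_km(45.0, STRIDE_H)         # roughly 295 km
```

The same function gives about 83 km for a single grid interval at the equator, close to the value of about 80 km per grid used when localization scales are quoted in kilometers.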

For OSSE2, observation locations and errors are adapted from the Japan Meteorological Agency’s (JMA) operational NWP system (not including satellite radiances). Real data consist of asynchronous observations, but in OSSE2, all observations are assumed to be taken simultaneously every 6 h. Thus, there is no need for temporal interpolation. The data assimilated in JMA include data from surface stations over land and ocean, radiosondes, aircraft reports, satellite-based winds [also known as atmospheric motion vectors (AMVs)], sea surface winds by Quick Scatterometer, and bogus data of sea level pressures in the Southern Hemisphere reported by the Australian Bureau of Meteorology. Following the operational data selection by the quality control and data thinning, we extract the location (longitude, latitude, and pressure level) and observational error standard deviations from the operationally used observations. In JMA’s operational system, observational error standard deviations are estimated by the innovation statistics (Hollingsworth and Lönnberg 1986). The assimilated observations are wind components, temperature, relative humidity, and surface pressure.

[…] *E*() denote functions that return the matrix trace and the expected value, respectively. Here, Δ is the inflation parameter, estimated at each analysis step. Assuming slowly varying inflation, we estimate Δ by Kalman filtering with persistence as a forecast model for Δ. To avoid negative inflation due to sampling errors by a limited number of observations, we set the lower limit to 0 or 1% spread inflation (the square root of the covariance inflation). Thus, the control parameters of LETKF are the ensemble size, local patch size, and localization scale parameters (*m*, *l*, *l*_{υ}, *σ*, and *σ*_{υ}). First, in OSSE1, the local patch size of *l* = 5 and *l*_{υ} = 2 (that is, an 11 × 11 × 5 local patch) and the localization scale parameters *σ* = 2.0 and *σ*_{υ} = 1.0 are chosen as default, where the degrees of freedom (d.o.f.) of the local patch are about 3000 (605 grid points with 5 prognostic variables). Even if we limit the d.o.f. with the localization scale parameters, they are as large as 400. Following the results of OSSE1, the ensemble size is fixed to *m* = 40, and the localization parameters are tuned in OSSE2.

Fields are verified against the true nature run. The total energy norm [Eq. (2.4) of Buizza and Palmer 1995] is used to measure the difference between two atmospheric fields. To measure the difference for each variable, we use the root-mean-square error (RMSE), where we consider weights by physical areas for horizontal averaging; that is, less weight per grid point in higher latitudes.
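A minimal sketch of an area-weighted RMSE on a latitude-longitude grid is given below. Weighting each grid point by the cosine of its latitude is an assumption about how the physical-area weights are formed on a regular grid; the function name and the toy field are hypothetical.

```python
import math

def area_weighted_rmse(analysis, truth, lats_deg):
    """RMSE over a lat-lon field with each row weighted by cos(latitude).

    analysis, truth : lists of rows, one row per latitude circle
    lats_deg        : latitude of each row in degrees
    """
    num = 0.0
    den = 0.0
    for row_a, row_t, lat in zip(analysis, truth, lats_deg):
        w = math.cos(math.radians(lat))     # smaller weight at higher latitude
        for a, t in zip(row_a, row_t):
            num += w * (a - t) ** 2
            den += w
    return math.sqrt(num / den)

# Toy field: errors of 1 at the equator and 2 at 60 degrees latitude.
rmse = area_weighted_rmse([[1.0, 1.0], [2.0, 2.0]],
                          [[0.0, 0.0], [0.0, 0.0]],
                          [0.0, 60.0])
```

In this toy case the weighted RMSE is the square root of 2, smaller than the unweighted value (the square root of 2.5) because the larger errors sit at the higher latitude.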

In both OSSEs, no initialization procedure is applied. The LETKF analysis increments are essentially linear combinations of dynamically constrained ensemble perturbations unless the error covariance localization is applied. All ensemble members are ideally on the attractor of the dynamical system, thus the analysis would also be on the attractor in the limit of the linear assumption. Therefore, the analysis is expected to be well balanced, so that we do not need initialization procedures to smooth out high wavenumber noise. In the presence of the covariance localization, analysis increments are not exactly linear combinations of ensemble members, which would introduce imbalance to some extent. However, Miyoshi (2005) and Szunyogh et al. (2005), for example, obtained promising results in their LEKF experiments without initialization procedures. Therefore, we expect that the imbalance by the localization would not cause a fatal problem.

#### 2) Experiments with real observations

In this group of experiments, the same observations as OSSE2 are assimilated, but the data are switched back to real observations. In OSSE2, observation locations and errors are adapted from JMA’s operational NWP system with quality-control flags. Here, we assimilate the same observations as those used in the operational system in August 2004.

Following the results of the two OSSEs, the ensemble size is fixed to *m* = 40. The localization parameters are fixed to 21 × 21 × 13 local patch and 6.0-grid horizontal and 3.0-grid vertical localizations, as tuned in OSSE2. Because we found that the adaptive covariance inflation does not work appropriately, which is partly because of incorrect observational error settings, we fixed the inflation parameter to 10% spread inflation (21% covariance inflation).
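The relation between spread inflation and covariance inflation is a one-line conversion: multiplying every perturbation by (1 + δ) multiplies the covariance by (1 + δ)². A short check, with hypothetical function names:

```python
def covariance_inflation(spread_inflation):
    """Fractional covariance inflation implied by a fractional spread inflation."""
    return (1.0 + spread_inflation) ** 2 - 1.0

def inflate(perturbations, spread_inflation):
    """Scale ensemble perturbations (deviations from the mean) by 1 + spread_inflation."""
    return [(1.0 + spread_inflation) * e for e in perturbations]

def sample_variance(perturbations):
    """Sample variance of zero-mean perturbations."""
    return sum(e * e for e in perturbations) / (len(perturbations) - 1)

perts = [-2.0, -1.0, 1.0, 2.0]
ratio = sample_variance(inflate(perts, 0.10)) / sample_variance(perts)
```

For 10% spread inflation the variance ratio is 1.21, that is, 21% covariance inflation.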

In OSSE2, all observations are assumed to be synchronous, but it is not the case in practice. Since real data include many asynchronous observations, a four-dimensional EnKF technique (4D-EnKF) by Hunt et al. (2004) is applied to assimilate them appropriately. In addition, temporal localization is applied in the same manner as the spatial localization. Namely, observations taken at a time far from the analyzed time have lower weight, because the assumption of 4D-EnKF (i.e., a perfect linear model) is less valid with a larger difference between the observing time and analyzed time. The temporal localization scale is chosen to be 3 h in this study. Initialization is not employed here, either. Furthermore, because the model accuracy plays an essential role in assimilating real observations, we upgrade the AFES model version to AFES 2.2, which has been proven to better simulate the real atmosphere than the previous version AFES 1.22. Because the model upgrades are beyond the scope of this paper, the details are not described. A detailed description will be provided in a separate paper.

In addition to the data assimilation cycle experiments, 48-h forecast experiments, initiated from the most probable ensemble mean states once a day at 1200 UTC, are performed to verify the forecast errors against their own analyses. Although this is a common way to verify an NWP system, it has the disadvantage that a forecast cycle without data assimilation would give a perfect score, that is, zero errors. Therefore, the forecast errors are also verified against the JMA operational analysis to confirm that the verification is fair. The same T159/L48 AFES used in the data assimilation experiments is employed in the forecast experiments. To compare the verification results, 48-h forecast experiments initiated by JMA’s operational analysis and the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) Reanalysis (NNR; Kalnay et al. 1996) are also performed with the same AFES model. The two operational analyses were produced by the operational three-dimensional variational data assimilation (3DVAR) methods at the time of August 2004 with different models (T213/L40 JMA and T62/L28 NCEP global models). To initiate the AFES integration, 24-pressure-level JMA analysis products with a 1.25° × 1.25° resolution and 17-pressure-level NNR products with a 2.5° × 2.5° resolution are interpolated onto the AFES model grid. Because of the model differences and the reduced resolutions of the operational analyses, the comparison is expected to favor LETKF.

## 5. Results

### a. OSSE1

#### 1) Sensitivity to the ensemble size

Figure 2 shows the time series of the analysis errors in total energy for 30 days. LETKF does not diverge even with 10 ensemble members, although larger ensemble sizes significantly stabilize the filter. If we measure the analysis errors by RMSE of surface pressure, the error levels with 10 members and 20 or more members are about 0.5 hPa and less than 0.4 hPa, respectively; both are lower than the observational error (1.0 hPa).

The ensemble size is increased up to 1000 for the first one-step analysis. Figure 3 shows the analysis errors of the first-step analysis, where the analysis errors decrease in inverse proportion to the ensemble size up to about 80 members and then gradually flatten. The decreasing trend saturates at around 320 members. Almost no effect is observed when the ensemble size is increased beyond 320, which is consistent with the substantial d.o.f. of the local patch. If we increase the localization parameters, the d.o.f. of the local patch increases; we might then obtain smaller analysis errors with a large ensemble size, making more use of distant observations. Because the initial ensemble fields have no specific flow-dependent structures other than climatology and are far from the true field, the results may be different from those in the stabilized stage. Thus, we plot the analysis errors after 10-day cycles, also shown in Fig. 3, where the decreasing trend flattens at smaller ensemble sizes. In addition, Fig. 3 also shows the case without covariance inflation, where we see larger errors but no signal of filter divergence. It is confirmed that even with 10 members, the filter does not diverge without covariance inflation for 60-day cycles. Overall, a larger ensemble size gives smaller analysis errors but costs more computation, so a balance between the two must be found.

#### 2) Sensitivity to the localization parameters

So far, the localization parameters have been fixed. However, when the ensemble size is increased, the sampling errors in the error covariance among distant points are expected to be reduced. Thus, we could increase the localization scales to make more use of distant observations, so that the analysis errors could be even smaller. In fact, other research using EnKF with an AGCM (e.g., Houtekamer and Mitchell 2001; Anderson 2001; Bishop et al. 2001; Whitaker and Hamill 2002; Szunyogh et al. 2005; Miyoshi 2005) pointed out the sensitivity to the localization scales. The tuning process is basically done by trial and error, requiring many computations. With our limited resources, we investigate how much impact the localization scales have when ensemble sizes are 10 and 40.

Figure 4 shows the time series of the analysis errors as in Fig. 2, but for different localization scales. In addition to the default setting of an 11 × 11 × 5 local patch with 2.0-grid (about 160 km) horizontal and 1.0-grid vertical localizations, we tested a 13 × 13 × 7 local patch with 3.0-grid (about 240 km) horizontal and 1.5-grid vertical localizations for 10 and 40 ensemble members. The analysis errors separate clearly when the localization scales are changed. With the larger localization scale, the 10-member LETKF shows a longer adjusting period in the early stage, but eventually the analysis errors in the stabilized stage become smaller. The 40-member case shows a similar adjusting period and smaller analysis errors in the stabilized stage with the larger localization scale. Because the 10-member LETKF contains larger sampling errors in the error covariance among distant points, increasing the localization scale may result in larger analysis errors in the adjusting stage. However, in the stabilized stage it seems the default localization scales are too small, so that the distant observations provide significant information even with 10 members. Because the impact is nonnegligible, tuning the localization parameters for a chosen ensemble size is suggested.

To investigate more precisely, the ensemble size is fixed to 40, and the localization parameters are changed systematically. First, horizontal localization sensitivity is investigated with vertical localization parameters fixed to a 7-grid patch and 2.0-grid localization. Horizontal localization parameters are chosen to be 11 × 11 (3.0), 21 × 21 (6.0), 31 × 31 (9.0), and 41 × 41 (12.0), where the parentheses indicate grid lengths (1.0 grid corresponds to about 80 km) of the Gaussian-type localization. Figure 5a shows the results, where we see the smallest error level by the 21 × 21 patch. The case with the largest 41 × 41 patch behaves differently from other cases, showing filter divergence. This suggests that the large local patch introduces sampling errors so large that LETKF is unstable. Here, with the 12-grid (about 1000 km) localization, the localization function drops to zero at about 3650 km when we apply the fifth-order piecewise rational function (Gaspari and Cohn 1999; Hamill et al. 2001), which is widely used by other EnKF studies (e.g., Whitaker and Hamill 2002; Houtekamer et al. 2005). The local patch forces zero covariance beyond 20 grid points, that is, about 1600 km. This localization scale has a similar order to what Houtekamer et al. (2005) applied, where the localization function drops to zero at 2800 km.
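The quoted cutoff distance can be reproduced with the compactly supported fifth-order function. The sketch below writes out the piecewise rational function of Gaspari and Cohn (1999) and assumes the common convention that its half-width is *c* = *σ*√(10/3) for a Gaussian scale *σ*; that convention is an assumption here, chosen because it reproduces the 3650 km figure. Function names are hypothetical.

```python
import math

def gaspari_cohn(r, c):
    """Fifth-order piecewise rational function; exactly zero for |r| >= 2c."""
    z = abs(r) / c
    if z <= 1.0:
        return -0.25 * z**5 + 0.5 * z**4 + 0.625 * z**3 - (5.0 / 3.0) * z**2 + 1.0
    if z < 2.0:
        return (z**5 / 12.0 - 0.5 * z**4 + 0.625 * z**3 + (5.0 / 3.0) * z**2
                - 5.0 * z + 4.0 - 2.0 / (3.0 * z))
    return 0.0

def gaussian(r, sigma):
    """Gaussian weight with scale sigma."""
    return math.exp(-r * r / (2.0 * sigma * sigma))

def cutoff_distance(sigma):
    """Distance at which the compact function reaches zero, assuming c = sigma*sqrt(10/3)."""
    return 2.0 * math.sqrt(10.0 / 3.0) * sigma

sigma_km = 1000.0                       # 12-grid localization, about 1000 km
c_km = sigma_km * math.sqrt(10.0 / 3.0)
zero_km = cutoff_distance(sigma_km)     # about 3650 km
```

The Gaussian itself never reaches zero, so in LETKF the edge of the local patch, 20 grid points or about 1600 km for the 41 × 41 patch, acts as the hard cutoff instead.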

To see how the error covariance estimated by 40 ensemble members contains sampling errors and how the covariance localization works, the estimated error covariance is computed. Here, for simplicity, the conventional localization is applied, multiplying localization weights to the estimated error covariance itself. Thus, it is different from the observation localization used in LETKF. Figure 6 shows how the horizontal localization affects the shape of the error covariance in midlatitudes estimated by 40 ensemble perturbations at a randomly chosen time after a spinup of the cycle experiments. The shades show statistical significance levels to reject the null hypothesis of no correlation by a parametric hypothesis testing (e.g., chapter 5 of Wilks 1995). Precisely, the probabilities for 40 uncorrelated normal samples to show correlations greater than 0.3120, 0.4026, and 0.5007 are 5%, 1%, and 0.1%, respectively; the correlation values constitute the thresholds for the statistical significance levels. Without localization, the error covariance has noisy patterns in the nonsignificant correlation area. Severe localization with the 11 × 11 patch suppresses the noisy patterns but makes the shape closer to a circle and omits a large portion of the significant correlation structures. The case with the 21 × 21 patch better conserves the flow dependence around the center point; still, some of the significant correlation structures are out of bound. The case with the 41 × 41 patch covers most significant correlations, but it also contains the noisy pattern with the nonsignificant correlation. It is the trade-off between sampling errors and signals. The results of data assimilation cycle experiments shown in Fig. 5a suggest that the case with the 41 × 41 patch is severely affected by the spurious covariance. The results are generally true for other locations with similar latitudes at other times.
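The correlation thresholds can be related to the usual parametric test through the Student-t statistic for a sample correlation, *t* = *r*√(*n* − 2)/√(1 − *r*²). The sketch below only performs this conversion for *n* = 40; interpreting the resulting values as two-sided critical values with 38 degrees of freedom is an assumption, not something stated in the text.

```python
import math

def t_statistic(r, n):
    """Student-t statistic for a sample correlation r from n independent pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def correlation_threshold(t, n):
    """Inverse relation: correlation that yields a given t statistic."""
    return t / math.sqrt(t * t + n - 2)

n_members = 40
thresholds = [0.3120, 0.4026, 0.5007]          # quoted for 5%, 1%, and 0.1%
t_values = [t_statistic(r, n_members) for r in thresholds]
```

The three thresholds map to *t* values of about 2.02, 2.71, and 3.57, which can be compared against tabulated Student-t quantiles.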

Similar investigations have been done for vertical localizations, with horizontal localization parameters fixed to the 21 × 21 patch and 6.0-grid (about 480 km) localization. Figure 5b shows sensitivity to vertical localization parameters, where we see different behavior for the case with the largest patch size, similarly to the case for horizontal localization. Although the cases with 7-, 11-, and 15-grid vertical patches show low error levels, the case with the 21-grid patch shows filter divergence. The vertical error covariance estimated by 40 ensemble perturbations is shown in Fig. 7 for various vertical localizations. Similarly to the horizontal covariance localization in Fig. 6, more severe localization damps noisy patterns in nonsignificant correlation areas but destroys the flow-dependent shapes of the significant correlation areas. Large covariance appears in the stratosphere near the top, but it does not show statistical significance. The vertical localization is essential for damping the large spurious covariance near the top layer.

The localization scale has been measured not with the physical length but with the model grids, similar to the Mercator mapping. Therefore, the physical distance between two successive grid points in the longitudinal direction is very much dependent on latitudes. Thus, we expect longer correlation length in the grid space in high latitudes. Figure 8 shows similar figures as Figs. 6 and 7 but in a polar region, north of Alaska. The covariance shows stronger low-wavenumber signals and longer covariance length scales in the longitudinal direction as expected. The localization damps the covariance in a much shorter length than the actual longitudinal spread. As a result, the localized covariance omits most significant areas in longitudinal directions. Because of the Gaussian weighting function, the discontinuity at the edge of the local patch is minimal. However, when the observational departure from the first guess is large, the discontinuity becomes more evident. In the current experiments, there has been no fatal problem experienced so far, even without initialization procedures.

The cross-covariance structure explains dynamical balance among different variables. For example, the cross covariance between temperature and winds is expected to represent a specific dynamical constraint. If the cross covariance is not appropriately considered, the effects of data assimilation may not be sustainable because of dynamical processes such as geostrophic adjustment. Moreover, dynamical imbalance in initial conditions may cause an initial shock in the model integration, which may cause artificial fast oscillations. Severe localization could be problematic since it may destroy the cross-covariance structure. To investigate how localization affects cross-covariance structures, Fig. 9 shows cross covariance between temperature and meridional winds in the polar regions. As shown in Fig. 8, very severe localization is evident in the polar regions because of the short physical distance between two successive grid points in the zonal direction. Figure 9 indicates that the cross covariance is also severely affected by the localization, so that most portions of significant correlation between temperature and meridional winds are omitted. However, importantly, the dipole structure around the center point is basically conserved even in the localized space. Because the dipole structure represents the dynamical balance, it is not destroyed by the localization completely. However, the amplitudes of the dipole are reduced because the peaks are not on the center point. As a result, the dynamical balance is partially destroyed (J. D. Kepert 2006, personal communication). It is important to define the same localization parameters for all variables in the same local patch, because the use of the different localization scales may cause dynamical imbalance in the cross-covariance structures. The discussions are based on the conventional localization that directly multiplies weights on the error covariance itself. 
Thus, it is not clear to what extent the statements are valid in the case of the observation localization. In fact, a precise examination of forecast outputs at every time step initiated by an LETKF analysis indicates no significant signal of artificial fast oscillations. Further discussions are provided in section 6.
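The damping of an off-center dipole by conventional localization can be sketched with a one-dimensional toy example. The dipole shape, the peak offset of 4 grid points, and the function name are hypothetical and chosen only for illustration; the 6-grid Gaussian scale follows the value used in the text.

```python
import numpy as np

def gaussian_localization(distance_grid, sigma_grid):
    """Gaussian weight exp(-d^2 / (2 sigma^2)) for distances in grid units."""
    d = np.asarray(distance_grid, dtype=float)
    return np.exp(-0.5 * (d / sigma_grid) ** 2)

# Hypothetical cross covariance along one direction: an antisymmetric dipole
# whose peaks (+1 and -1) sit 4 grid points away from the center point.
offset = np.arange(-10, 11)
peak_offset = 4.0
raw = (offset / peak_offset) * np.exp(0.5 * (1.0 - (offset / peak_offset) ** 2))

# Conventional localization: element-wise product of weights and covariance.
localized = gaussian_localization(offset, sigma_grid=6.0) * raw

print("raw peak amplitude:      ", raw.max())
print("localized peak amplitude:", localized.max())
```

In this toy case the sign pattern of the dipole survives the localization, while the peak amplitude is reduced because the weight is already below one at the peak location.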

#### 3) Timing and parallel efficiency

Table 1 shows timing results of LETKF for various combinations of the number of nodes and ensemble size. If the number of nodes equals the ensemble size and it is less than 80, the computational time is less than 4 min. With a fixed number of nodes, the increase of computational time is more significant with larger ensemble sizes. According to Table 1, the cost resulting from an increase in the ensemble size from 10 to 20 is less than double, but in the case from 80 to 160, it is almost 5 times.

We normalize and invert timing to obtain the acceleration ratios, showing how much acceleration is gained by the parallelization, where the unit acceleration ratio corresponds to one processor. Figure 10 shows the acceleration ratios. The 100% parallelization ratio defines perfect acceleration, thus the linear relationship with the number of processors. We observe that the larger the ensemble size, the more efficient the parallelization. The 10-member LETKF is just slightly more efficient than the 99.80% parallelization ratio, whereas the 160- and 80-member LETKF are almost as efficient as 99.99%. Overall, LETKF shows excellent parallel efficiency as expected from the algorithmic design by Ott et al. (2002, 2004).
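The reference curves for a given parallelization ratio can be reproduced under the assumption that they follow Amdahl's law, in which a fraction of the work runs in parallel and the remainder is serial. This section does not state the formula, so the sketch below is an interpretation, and the function name is illustrative.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Acceleration ratio relative to one processor under Amdahl's law."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)

for p in (0.9980, 0.9999, 1.0):
    row = ", ".join(f"{amdahl_speedup(p, n):7.1f}" for n in (10, 80, 160))
    print(f"parallelization ratio {p:.4f}: {row}")
```

Under this assumption a 99.80% ratio gives an acceleration of about 69 on 80 processors, whereas a 99.99% ratio stays within about 2% of linear scaling up to 160 processors.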

The localization parameters have significant impacts on analysis accuracy, but the computational time may limit the choice of the local patch size. To investigate the timing dependence on the local patch size, we fixed the ensemble size to 40 and measured timing for various choices of the local patch. Figure 11 shows the results, where we see the quadratic relationship. Each vertical patch size shows different increasing rates. Generally, a larger vertical patch size requires less computational time for a given number of grid points in a local patch. This is possibly because of the treatment of vertical localization around the top and bottom boundaries.

### b. OSSE2

#### 1) Tuning the localization scale parameters

In OSSE1, ensemble sizes larger than 20 give stable filter performance. Since the regular observing network is optimistic, the ensemble size is fixed to 40 in OSSE2. The results of OSSE1 suggest tuning of the localization parameters. Figure 12 shows the time series of analysis errors in OSSE2, where we see the sensitivity of localization scales. The default setting is not as stable in OSSE2. Increasing localization scales from default, we obtain lower error levels. In Fig. 12, the 21 × 21 × 13 local patch with 6-grid horizontal and 3-grid vertical localization, that is, parameters of *l* = 10, *l*_{υ} = 6, *σ* = 6.0, and *σ*_{υ} = 3.0 (denoting H6V3), seem a reasonable choice, thus we fix the parameters.

#### 2) Analysis errors and ensemble spreads

As shown in Fig. 12, the error level after a 10-day cycle is as low as 3.0 × 10^{20}, several times larger than the 0.6 × 10^{20} of OSSE1 (Figs. 2 and 3). Figure 13 shows zonal mean analysis RMSE of wind components, temperature, and surface pressure, temporally averaged for 21 days after the initial spinup. Large errors are located mostly in polar regions, especially in the Southern Hemisphere where few observations are available. Analysis errors are quite low in most regions, as low as 2.0 m s^{−1} for wind components, 1.0 K for temperature, and 0.5 hPa for surface pressure. To see the horizontal distribution of the analysis errors, Fig. 14 shows temporally averaged analysis RMSE and ensemble spread. Both fields show good correspondence, with a high pattern correlation of 0.96. The pattern generally represents the spatial distribution of surface observing stations, where we see small errors over land. The RMSEs and ensemble spreads of analysis and first-guess fields of surface pressure averaged both temporally and spatially are shown in Table 2. While the Northern Hemisphere and the Tropics show small errors, significantly larger errors appear in the Southern Hemisphere, as we have already seen in Figs. 13d and 14. Errors in the Southern Hemisphere are about twice as large as other regions. The ensemble spreads are almost equal to RMSEs in the Northern Hemisphere. In the Southern Hemisphere, the ensemble spreads are larger than the RMSEs, but the opposite is true in the Tropics. The results suggest that each region may have a different optimal covariance inflation. The first-guess ensemble spreads are larger than the analysis ensemble spreads; namely, the analysis ensemble perturbations are growing in 6 h. Overall, the ensemble spreads capture the true error variance well, indicating the LETKF’s ability to capture reasonable error structures.
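The comparison between analysis RMSE and ensemble spread can be sketched as follows. This is a minimal illustration on synthetic data, not the diagnostics code used in the experiments: the array layout (members × grid points), the function names, and the random fields are all assumptions.

```python
import numpy as np

def analysis_rmse(ensemble, truth):
    """RMSE of the ensemble mean against the truth; ensemble is (members, points)."""
    return float(np.sqrt(np.mean((ensemble.mean(axis=0) - truth) ** 2)))

def ensemble_spread(ensemble):
    """Square root of the domain-mean ensemble variance (unbiased estimator)."""
    return float(np.sqrt(np.mean(ensemble.var(axis=0, ddof=1))))

rng = np.random.default_rng(0)
n_members, n_points = 40, 20000
# Synthetic, statistically consistent case: truth and members are drawn from
# the same distribution at every grid point (values are hypothetical).
truth = rng.normal(0.0, 0.5, size=n_points)
ensemble = rng.normal(0.0, 0.5, size=(n_members, n_points))

rmse = analysis_rmse(ensemble, truth)
spread = ensemble_spread(ensemble)
print(f"RMSE = {rmse:.3f}, spread = {spread:.3f}, ratio = {rmse / spread:.3f}")
```

A ratio near one indicates that the spread is a reasonable estimate of the error level. A ratio below one corresponds to the overdispersive situation reported for the Southern Hemisphere, and a ratio above one to the underdispersive situation reported for the Tropics.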

### c. Experiments with real observations

Figure 15 shows the sea level pressure analysis fields at 0000 UTC 16 August 2004, where the LETKF analysis with AFES looks almost identical to the JMA operational analysis (3DVAR at the time). The difference between the two analysis fields is mostly as small as 0.5 hPa, and differences larger than 3.0 hPa appear over Antarctica, where few observations are available. Five-day forecast experiments initiated by three analyses (LETKF analysis, JMA operational analysis, and NNR) are performed. Figure 16 shows the verifications of 48-h forecast accuracy against their own analyses, where we see generally similar errors among the LETKF analysis, JMA operational analysis, and NNR, all forecasted with the same AFES model. NNR shows the largest errors among the three analyses in all regions, possibly because of the lower resolution. In the NH there is no significant difference between the LETKF and JMA analyses, but in the SH LETKF shows larger errors than the JMA analysis. There are many conventional observations in the NH but far fewer in the SH; thus, satellite radiances play a more important role in the SH. Because LETKF does not assimilate satellite radiances, it is expected to be less accurate in the SH than the JMA analyses. To confirm that the above verification against their own analyses is a fair choice, a similar verification of forecast errors is computed against the JMA operational analysis. The results are similar to the above results, but with the largest errors of LETKF in the SH. The JMA forecast errors are the smallest in the NH, but we should note that the verification against the JMA analysis is favorable to the JMA analysis. Based on the forecast verification against the JMA analysis, we would conclude that the verification against their own analyses was fair.

## 6. Summary and discussion

LETKF is developed and assessed with the AFES model at a T159/L48 resolution with sophisticated physical processes. This resolution is similar to currently operational data assimilation and ensemble prediction systems. In this study, we performed three kinds of experiments: 1) perfect model experiments with a regular observing network (OSSE1), 2) perfect model experiments with a real observing network (OSSE2), and 3) experiments with real observations. From OSSE1, we found that even at this resolution with sophisticated physical processes LETKF does not diverge, even with 10 members. The filter is significantly stabilized with more than 20 members, in which case the surface pressure analysis errors are less than half as large as the observational errors. Moreover, it is very efficient in parallel architectures and requires approximately 200 s of computation per analysis cycle when the ensemble size equals the number of computational nodes on the Earth Simulator. Furthermore, with the highly irregular real observing network in OSSE2, LETKF is stable, showing analysis errors as small as 2.0 m s^{−1} for wind components, 1.0 K for temperature, and 0.5 hPa for surface pressure in most regions, although significantly larger errors appear in polar regions, especially over the SH oceans, where few observations are available. The ensemble spreads capture the actual error structures well, representing the observing network in general. However, the spreads are larger than the actual errors in the Southern Hemisphere and the opposite is true in the Tropics, which suggests that the optimal covariance inflation may be different in each region. In terms of the required computational time and ensemble size, this research is a realistic operational implementation of LETKF.

It is noted that Szunyogh et al. (2005) showed good performance with the 40-member LEKF using the NCEP GFS at a T62/L28 resolution. Increasing the resolution means increasing the number of d.o.f. of the local patch with the same physical scale, so we may need more ensemble members to reduce sampling errors and stabilize the filter. However, our results suggest that the required ensemble size is not proportional to the model resolution, although experiments with a higher resolution model are necessary for more precise discussion.

The sensitivities to the localization parameters are investigated in this study. Although Houtekamer et al. (2005) used the localization weighting function, which drops to zero at 2800 km (Gaussian length scale of about 770 km), we found that with 40 ensemble members, 6 grid points (about 480 km) is optimal in OSSE2 with a real observing network. The optimal length scale is dependent on the ensemble size and observing network. The effect of localization is a trade-off between reducing sampling errors and extracting flow-dependent error information. With a more severe localization, we reduce sampling errors but extract less flow-dependent error information. Sampling errors are strongly affected by the ensemble size. When the ensemble size is given, it would be suggested that the localization scale be tuned. We found that the system is stable with just 10 members in OSSE1. However, a more severe localization would reduce the advantage of EnKF. Note that in LETKF, more computations are required with a larger local patch size. Computational capacity limits not only the ensemble size but also the localization parameters. The model resolution, ensemble size, and localization parameters interact nonlinearly to affect the analysis accuracy and computational cost. We need to find an optimal combination among them.
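The correspondence between a cutoff distance of 2800 km and a Gaussian length scale of about 770 km can be checked with a short sketch. It assumes that the weighting function is the compactly supported fifth-order function of Gaspari and Cohn (1999), which vanishes at twice its scale parameter *c*, and that the equivalent Gaussian scale follows the commonly used convention *c* = √(10/3) *σ*. Neither assumption is stated in this section.

```python
import math

def gaspari_cohn(distance, c):
    """Fifth-order compactly supported correlation function; zero beyond 2c."""
    r = abs(distance) / c
    if r <= 1.0:
        return -0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3 - (5.0 / 3.0) * r**2 + 1.0
    if r < 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0

def equivalent_gaussian_scale(cutoff_distance):
    """Gaussian length scale for a function that vanishes at cutoff = 2c,
    using the convention c = sqrt(10/3) * sigma (assumed here)."""
    c = cutoff_distance / 2.0
    return c * math.sqrt(0.3)

cutoff_km = 2800.0
sigma_km = equivalent_gaussian_scale(cutoff_km)
print(f"Gaussian length scale: {sigma_km:.0f} km")
print(f"weight at the cutoff:  {gaspari_cohn(cutoff_km, cutoff_km / 2.0):.1f}")
```

Under these assumptions the equivalent Gaussian scale is about 767 km, consistent with the value of about 770 km quoted above and noticeably longer than the 480 km found optimal here.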

The effects of localization in the polar regions indicate that, because the physical length of a zonal grid interval varies with latitude, the localization scale should preferably vary with latitude as well. Although the Gaussian-type localization damps the tails of the covariance smoothly, the current localization omits most of the significant covariance near the poles. The current implementation of LETKF assumes horizontally uniform localization scales in the model grid space. We could improve the LETKF by considering horizontal uniformity in the physical space.

The dynamical balance is a key issue in atmospheric data assimilation. Initialization procedures to omit artificial fast oscillations caused by imbalance in initial conditions were not applied in this study as in Miyoshi (2005) and Szunyogh et al. (2005). However, as already described, imbalance is introduced by the error covariance localization, which is essential in EnKF with a limited ensemble size. The error covariance was computed and plotted using 40 ensemble members in OSSE1, indicating the effects of the covariance localization. We found that the cross-covariance dipole structure, which corresponds to the dynamical balance, is basically conserved after the localization, although the peak amplitudes are damped. The peak amplitude reduction is a major source of the imbalance in EnKF (J. D. Kepert 2006, personal communication). These discussions are based on the conventional localization that directly multiplies weights to the error covariance itself, not the observation localization used in the LETKF. Because the observation localization is different from the conventional localization, it is not clear to what extent the statements are valid in the case of the observation localization. It is an open question how the observation localization affects the balance. An additional forecast experiment has been performed to investigate if any artificial oscillations are identified at the beginning of the model integration, but no such oscillation tendencies are identified except for gravity waves by natural sources such as mountains and convections.

The results with real observations are encouraging because the LETKF produces a reasonable analysis. In the NH, the LETKF analysis shows a similar level of accuracy to JMA’s operational analysis. In the SH, JMA’s operational analysis outperforms the LETKF analysis, mostly because of the use of satellite radiances. AFES outperforms NNR in the 48-h forecast verification, but we note that NNR has been analyzed with a lower T62/L28-resolution NCEP model. Because of the model differences and the reduced resolutions by the pressure-level data for the operational analyses, it is not straightforward to precisely compare the analyses, but the results obtained in this study suggest that the LETKF analysis is as accurate as the operational 3DVAR analyses.

Although this study is important in the ways described above, there are three main limitations: the simple treatment of model errors, no use of satellite radiances, and insufficient investigations of innovation statistics. Because EnKF is sensitive to model errors (e.g., T. Miyoshi et al. 2006, personal communication), special treatments are required. Dee (1995) pointed out that quantitative information of the model errors is indispensable in Kalman filter assimilating real observations. In fact, when Houtekamer et al. (2005) and Whitaker et al. (2007) assimilated real observations with EnKF, they needed to manually enlarge the error covariance, that is, ensemble spreads, to account for model errors. Houtekamer et al. (2005) added a model error term using flow-independent error covariance used in 3DVAR, the method known as the “additive covariance inflation.” Whitaker et al. (2007) applied not only covariance inflation but also the “relaxation to prior” method, mixing background and analysis ensemble perturbations. In addition, they changed multiplicative inflation parameters in the NH and SH because of the large spatial dependence of the observational densities. T. Miyoshi et al. (2006, personal communication) discuss the importance of covariance inflation in the presence of model errors; in addition, they suggest effectiveness of model bias correction (cf. Dee and da Silva 1998; Danforth et al. 2007) within EnKF. In this study, we applied a large multiplicative covariance inflation (10% spread inflation) in order for LETKF to be stable in assimilating real observations. More sophisticated treatment, such as additive inflation and spatially dependent inflation factors, may improve the analysis accuracy and global structure of ensemble spreads. As for satellite radiances, because of the recent continuous increase of the data, their use became a major field of atmospheric data assimilation. 
Satellite radiances play important roles especially in data-poor regions such as the SH oceans and stratosphere. The last point is on the innovation statistics. We used the observational errors estimated by JMA operational system. However, because we use a different model, the innovation statistics may be different. The observational error standard deviations by JMA may not be optimal for the AFES–LETKF system. In fact, the adaptive inflation did not work in the experiments assimilating real observations, possibly because of poor observational error settings. According to Bormann et al. (2003), observational errors assigned to AMVs at different NWP centers vary by more than a factor of 2. Making the most use of innovation statistics to optimally estimate observational errors and covariance inflation parameters may have nonnegligible impacts on the analysis quality. These are future directions to further approach the operational use of the LETKF.
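The multiplicative spread inflation and the relaxation-to-prior approach mentioned in the discussion of model errors can be sketched as follows. This is a minimal illustration with hypothetical array shapes, function names, and a toy ensemble; it is not the code used in this study or in Whitaker et al. (2007). The mixing coefficient `alpha` is an assumed parameter name.

```python
import numpy as np

def inflate_spread(ensemble, factor=1.10):
    """Multiplicative inflation: scale perturbations about the ensemble mean."""
    mean = ensemble.mean(axis=0, keepdims=True)
    return mean + factor * (ensemble - mean)

def relax_to_prior(analysis, background, alpha):
    """Mix background and analysis perturbations; the analysis mean is kept."""
    a_mean = analysis.mean(axis=0, keepdims=True)
    a_pert = analysis - a_mean
    b_pert = background - background.mean(axis=0, keepdims=True)
    return a_mean + (1.0 - alpha) * a_pert + alpha * b_pert

rng = np.random.default_rng(1)
background = rng.normal(size=(40, 5))   # hypothetical 40-member ensemble
analysis = 0.5 * background             # toy "analysis" with half the spread
inflated = inflate_spread(analysis, factor=1.10)
relaxed = relax_to_prior(analysis, background, alpha=0.5)
print(inflated.std(axis=0, ddof=1) / analysis.std(axis=0, ddof=1))
```

A 10% spread inflation corresponds to multiplying the perturbations by 1.10, which multiplies the covariance by 1.21. Both operations leave the ensemble mean unchanged and modify only the perturbations.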

## Acknowledgments

We thank members of the chaos/weather group of the University of Maryland, especially Profs. Eugenia Kalnay and Brian Hunt for fruitful discussions. We also thank Drs. Wataru Ohfuchi and Takeshi Enomoto of ESC, and Yoshiaki Takeuchi and Ko Koizumi of NPD/JMA for kind understanding and support for this project. Dr. Takeshi Enomoto of ESC provided the codes for 48-h forecast verifications. We are grateful to two anonymous reviewers for their useful comments to significantly improve the manuscript. Prof. Chris Danforth kindly read the manuscript and gave us suggestions for English editing. We used the Earth Simulator under support of JAMSTEC. This work was partly supported by Grant-in-Aid for Young Scientists (B) (19740297) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

## REFERENCES

Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation. *Mon. Wea. Rev.*, **129**, 2884–2903.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects. *Mon. Wea. Rev.*, **129**, 420–436.

Bormann, N., S. Saarinen, G. Kelly, and J-N. Thepaut, 2003: The spatial structure of observation errors in atmospheric motion vectors from geostationary satellite data. *Mon. Wea. Rev.*, **131**, 706–718.

Buizza, R., and T. N. Palmer, 1995: The singular-vector structure of the atmospheric global circulation. *J. Atmos. Sci.*, **52**, 1434–1456.

Danforth, C. M., E. Kalnay, and T. Miyoshi, 2007: Estimating and correcting global weather model error. *Mon. Wea. Rev.*, **135**, 281–299.

Dee, D. P., 1995: On-line estimation of error covariance parameters for atmospheric data assimilation. *Mon. Wea. Rev.*, **123**, 1128–1145.

Dee, D. P., and A. M. da Silva, 1998: Data assimilation in the presence of forecast bias. *Quart. J. Roy. Meteor. Soc.*, **124**, 269–295.

Emanuel, K. A., and M. Zivkovic-Rothman, 1999: Development and evaluation of a convection scheme for use in climate models. *J. Atmos. Sci.*, **56**, 1766–1782.

Enomoto, T., A. Yoshida, N. Komori, and W. Ohfuchi, 2007: Description of AFES 2: Beyond 10-km mesh global atmospheric simulations. *High Resolution Numerical Modelling of the Atmosphere and Ocean,* W. Ohfuchi and K. Hamilton, Eds., Springer, 280 pp.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. *J. Geophys. Res.*, **99** (C5), 10143–10162.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757.

Hamill, T. M., J. S. Whitaker, and C. Snyder, 2001: Distance-dependent filtering of background error covariance estimates in an ensemble Kalman filter. *Mon. Wea. Rev.*, **129**, 2776–2790.

Harlim, J., 2006: Errors in the initial conditions for numerical weather prediction: A study of error growth patterns and error reduction with ensemble filtering. Ph.D. dissertation, University of Maryland, 76 pp.

Hollingsworth, A., and P. Lonnberg, 1986: The statistical structure of short-range forecast errors as determined from radiosonde data. Part I: The wind field. *Tellus*, **38A**, 111–136.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter technique. *Mon. Wea. Rev.*, **126**, 796–811.

Houtekamer, P. L., and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation. *Mon. Wea. Rev.*, **129**, 123–137.

Houtekamer, P. L., H. L. Mitchell, G. Pellerin, M. Buehner, M. Charron, L. Spacek, and B. Hansen, 2005: Atmospheric data assimilation with an ensemble Kalman filter: Results with real observations. *Mon. Wea. Rev.*, **133**, 604–620.

Hunt, B. R., and Coauthors, 2004: Four-dimensional ensemble Kalman filtering. *Tellus*, **56A**, 273–277.

Hunt, B. R., E. J. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. *Physica D*, **230**, 112–126.

Kalnay, E., and Coauthors, 1996: The NCEP/NCAR 40-Year Reanalysis Project. *Bull. Amer. Meteor. Soc.*, **77**, 437–471.

Lorenz, E., 1996: Predictability: A problem partly solved. *Proc. Seminar on Predictability,* Shinfield Park, Reading, United Kingdom, European Centre for Medium-Range Weather Forecasts, 1–18.

Miyoshi, T., 2005: Ensemble Kalman filter experiments with a primitive-equation global model. Ph.D. dissertation, University of Maryland, 197 pp.

Molteni, F., 2003: Atmospheric simulations using a GCM with simplified physical parametrizations. I: Model climatology and variability in multi-decadal experiments. *Climate Dyn.*, **20**, 175–191.

Nakamura, M., T. Enomoto, and S. Yamane, 2005: A simulation study of the 2003 heatwave in Europe. *J. Earth Simul.*, **2**, 55–69.

Numaguti, A., M. Takahashi, T. Nakajima, and A. Sumi, 1997: Description of CCSR/NIES atmospheric general circulation model. CGER Supercomputer Monograph Rep. 3, National Institute of Environmental Sciences, 48 pp.

Ohfuchi, W., and Coauthors, 2004: 10-km mesh meso-scale resolving simulations of the global atmosphere on the Earth Simulator: Preliminary outcomes of AFES (AGCM for the Earth Simulator). *J. Earth Simul.*, **1**, 8–34.

Ott, E., and Coauthors, 2002: Exploiting local low dimensionality of the atmospheric dynamics for efficient ensemble Kalman filtering. [Available online at http://arxiv.org/PS_cache/physics/pdf/0203/0203058v3.pdf.]

Ott, E., and Coauthors, 2004: A local ensemble Kalman filter for atmospheric data assimilation. *Tellus*, **56A**, 415–428.

Reynolds, R. W., N. A. Rayner, T. M. Smith, D. C. Stokes, and W. Wang, 2002: An improved in situ and satellite SST analysis for climate. *J. Climate*, **15**, 1609–1625.

Szunyogh, I., E. J. Kostelich, G. Gyarmati, D. J. Patil, B. R. Hunt, E. Kalnay, E. Ott, and J. A. Yorke, 2005: Assessing a local ensemble Kalman filter: Perfect model experiments with the National Centers for Environmental Prediction global model. *Tellus*, **57A**, 528–545.

Tippett, M. K., J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 2003: Ensemble square root filters. *Mon. Wea. Rev.*, **131**, 1485–1490.

Wang, W-C., X-Z. Liang, M. P. Dudek, D. Pollard, and S. L. Thompson, 1995: Atmospheric ozone as a climate gas. *Atmos. Res.*, **37**, 247–256.

Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations. *Mon. Wea. Rev.*, **130**, 1913–1924.

Whitaker, J. S., T. M. Hamill, X. Wei, Y. Song, and Z. Toth, 2007: Ensemble data assimilation with the NCEP global forecast system. *Mon. Wea. Rev.*, in press.

Wilks, D. S., 1995: *Statistical Methods in the Atmospheric Sciences*. Academic Press, 467 pp.

Table 1. Timing (s) of LETKF on the ES.

Table 2. RMSEs and ensemble spreads of analysis and first-guess fields of surface pressure (hPa), horizontally averaged over each region and temporally averaged for 21 days from 11 to 31 Aug 2004.