## 1. Introduction

Atmospheric, ocean, and land states are estimated from the information in observations and dynamical models, employing the methodology known as data assimilation. The current state-of-the-art data assimilation methods are the three-dimensional variational (3DVAR; Parrish and Derber 1992; Cohn et al. 1998; Courtier et al. 1998; Derber and Wu 1998; Lorenc et al. 2000; Daley and Barker 2001) and the four-dimensional variational (4DVAR; Lewis and Derber 1985; Le Dimet and Talagrand 1986; Navon et al. 1992; Zupanski 1993a; Courtier et al. 1994; Bennett et al. 1996; Rabier et al. 2000; Zupanski et al. 2002b; Zou et al. 2001) data assimilation methods. These methods are applied successfully at operational numerical weather prediction (NWP) centers, mostly in applications to global and synoptic scales.

Growing demand for improving the spatial resolution and accuracy of weather and other environmental data analysis, however, requires that data assimilation methodology be extended into the mesoscales (Gustafsson et al. 1997; Puri and Mills 1997). One important issue of mesoscale data assimilation is the insufficient, or inadequate, coverage of conventional observations. This necessitates the inclusion of remote sensing measurements with high temporal and spatial resolution. Although considerable progress has been made in recent years (Zou et al. 1995; Zou and Kuo 1996; Sun and Crook 1997, 1998; Guo et al. 2000; Zou and Xiao 2000; Zupanski et al. 2002a,b; Vukicevic et al. 2004), the assimilation of clouds, precipitation, and soil measurements still presents a formidable challenge. Because of the highly nonlinear and discontinuous nature of mesoscale phenomena, as well as of the observation operators, mesoscale data assimilation is a highly complex and challenging problem. The problem is further compounded by the need for a computationally efficient system, capable of quickly delivering improved mesoscale analyses and forecasts.

In addition, all of the above data assimilation methods assume that the statistical characteristics of the errors are known, which is in general not true. The forecast error covariance is required, but it is not well understood at the mesoscale. This is especially true for moist processes, with all water phases included. The model error and bias are poorly understood as well, and their statistics are generally unknown.

To address the described challenges of mesoscale data assimilation, a mesoscale data assimilation research algorithm was developed at the Cooperative Institute for Research in the Atmosphere (CIRA) at Colorado State University (CSU). The algorithm was designated the Regional Atmospheric Modeling Data Assimilation System (RAMDAS). Based on the 4DVAR methodology, it was designed after the National Centers for Environmental Prediction (NCEP) regional Eta 4DVAR system (Zupanski et al. 2002a,b). In this paper, the basic development and performance of the system are presented. In section 2 the general features of RAMDAS are explained. Section 3 discusses the minimization, with special emphasis on the Hessian preconditioning and restart procedure. The modeling of error covariances is presented in section 4. Experimental details are given in section 5, the results are presented in section 6, and the conclusions are drawn in section 7.

## 2. 4DVAR algorithm

Some details of the CIRA 4DVAR algorithm are the same as those given in Zupanski et al. (2002b). Important differences, such as the NWP model, observation operator, and control variable, will be presented here.

The *weak-constraint* 4DVAR cost function is defined as

J(**z**) = (1/2)(**z** − **z**_B)^{T} 𝗕^{−1}(**z** − **z**_B) + (1/2)[**z** − *F*(**z**)]^{T} 𝗚^{−1}[**z** − *F*(**z**)] + (1/2) Σ_{n=1}^{N} [**y**_n − *H_n*(*M_n*(**z**))]^{T} 𝗥^{−1}[**y**_n − *H_n*(*M_n*(**z**))],   (1)

where the subscript *B* denotes the background (first guess), and the subscript *n* refers to observation times, with *N* being the total number of observation times during the assimilation period. The superscript T denotes a transpose. The prior (background) error covariance of forecast and model errors is denoted by 𝗕, and the observation error covariance is 𝗥. The nonlinear NWP model and the observation operators are denoted *M* and *H*, respectively. The matrix 𝗚 represents the weighting given to the high-frequency component of the control variable, and the nonlinear operator *F* represents a *diabatic* digital filter operator (Huang and Lynch 1993). The observation vector is **y**, and the vector **z** denotes the augmented control variable. The first term on the right-hand side of (1) represents the prior information, given by the background error covariance penalty term. The second term is the penalty associated with the high-frequency wave component. The last term is the forecast fit to observations, measured over all observation times.
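The three-term structure of the cost function can be sketched with a toy example. Everything below is an illustrative stand-in, not the RAMDAS configuration: the "model" is a linear decay, the observation operator is the identity, and the filter removes no high-frequency component.

```python
import numpy as np

# Toy sketch of the three-term weak-constraint cost function (1).
# All operators, dimensions, and numbers are illustrative, not RAMDAS values.

def cost(z, z_b, B_inv, G_inv, F, R_inv, obs, model, obs_op):
    """J(z) = background term + high-frequency penalty + observation term."""
    dz = z - z_b
    j_b = 0.5 * dz @ B_inv @ dz               # prior (background) term
    hf = z - F(z)                             # high-frequency component removed by F
    j_f = 0.5 * hf @ G_inv @ hf               # digital filter penalty
    j_o = 0.0
    x = z
    for y_n in obs:                           # observation times n = 1..N
        x = model(x)                          # M_n: advance the state
        d = obs_op(x) - y_n                   # innovation at time n
        j_o += 0.5 * d @ R_inv @ d
    return j_b + j_f + j_o

n = 3
z_b = np.zeros(n)
B_inv = np.eye(n) / 1.5**2                    # background error variance 1.5**2
G_inv = np.eye(n) / (1e-3 * 1.5**2)           # G = kappa * B_D, kappa = 1e-3
R_inv = np.eye(n) / 1.0**2
F = lambda z: z                               # trivial filter: no high-frequency part
model = lambda x: 0.9 * x                     # toy linear "NWP model"
obs_op = lambda x: x                          # identity observation operator
obs = [np.full(n, 0.5), np.full(n, 0.4)]      # observations at N = 2 times

print(cost(z_b, z_b, B_inv, G_inv, F, R_inv, obs, model, obs_op))
```

Evaluated at the background itself, only the observation term contributes, which makes the example easy to check by hand.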

The digital filter formulation is the same as in Zupanski et al. (2002b). Because of its diabatic form, the midpoint of the filter is not defined at the initial time of the assimilation period; rather, it is defined at the time of a 1-h forecast (e.g., Gauthier and Thepaut 2001). The half-width of the filter is therefore 1 h. Such a short filter response time is chosen in order to better accommodate mesoscale applications and prevent physically important phenomena from being filtered out. The filter with the chosen specifications will most effectively damp high-frequency oscillations that last less than, or about, 1 h. Most likely, this is not an optimal choice, and the filter response may need to be adjusted in order to accommodate specific mesoscale applications. The diagonal weight matrix 𝗚 is defined as a fraction of the background error variance, that is, 𝗚 = *κ*𝗕_D, similar to the definition of Wee and Kuo (2004). In the current version of RAMDAS, *κ* = 1 × 10^{−3}. The specification of the factor *κ*, or the matrix 𝗚, is a very important and difficult issue that needs to be addressed in the future.

### a. NWP model

The NWP model used in the current 4DVAR system is the Regional Atmospheric Modeling System (RAMS) nonhydrostatic primitive equation model, which was developed at CSU (Tripoli and Cotton 1982; Pielke et al. 1992; Nicholls et al. 1995; Walko et al. 1995). The RAMS is a gridpoint model and employs a *σ–Z* terrain-following vertical coordinate and the Arakawa C grid in the horizontal. Clouds and precipitation are explicitly represented within a microphysics parameterization, employing a one-moment cloud liquid water scheme (Walko et al. 1995) and a two-moment scheme for rain, graupel, hail, aggregates, pristine ice, and snow (Meyers et al. 1997). The RAMS model is coupled with a Land Ecosystem Atmosphere Feedback model, version 2 (LEAF-2), which has multiple soil layers and accounts for the effects of vegetation. Radiation fluxes are parameterized using a two-stream scheme developed by Harrington (1997).

In this study the horizontal resolution is 15 km, and there are 31 vertical levels, with the top at 16 km. The anticipated horizontal resolution of the RAMS model in future data assimilation studies is about 1–2 km, adequate for resolving the mesoscale features characterizing severe storms and hurricanes. This requirement puts an extreme computational burden on the RAMDAS and will require a careful choice of computational platform in such studies.

### b. Adjoint model

The adjoint of the RAMS model corresponds to version 4.2.9 of RAMS, with some improvements. An adjoint of the coupled land surface model (LEAF-2) was also developed, thus allowing the assimilation of land surface measurements and complex microphysical interactions. The adjoints of the RAMS and LEAF-2 models were developed using a combination of manual programming and the Tangent-linear and Adjoint Model Compiler (TAMC) software (Giering and Kaminski 1998).

The RAMS adjoint is an exact adjoint of the tangent-linear model of the discrete RAMS algorithm. The linearization was performed with respect to a full model solution at every time step. This means that the reference state for the adjoint integration is saved at every time step of the forward forecast model integration. This feature requires large amounts of data storage, but improves the accuracy of the adjoint solution (Errico et al. 1993). The adjoint in RAMDAS includes all physical parameterizations as in RAMS, with the exception of atmospheric radiation and convective parameterization. The atmospheric radiation is assumed to be of secondary importance for the short-term forecast in the data assimilation. The convective parameterization was neglected because it is typically not used in high-resolution RAMS simulations. The accuracy of the adjoint model solution was tested in the standard way by comparing it to the tangent-linear model solution. The latter was compared, for a limited number of cases, to the RAMS nonlinear perturbation solution. The resolution of the adjoint model is the same as that used for the forecast model, that is, 15 km in the horizontal, with 31 vertical levels.
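The standard adjoint check mentioned above can be illustrated for a generic linear operator. The matrix `TL` below is a hypothetical stand-in for a tangent-linear model; the test verifies the defining identity ⟨M′x, y⟩ = ⟨x, M′ᵀy⟩ to machine precision.

```python
import numpy as np

# Dot-product (adjoint) test: for a tangent-linear operator TL and its
# adjoint AD, <TL x, y> must equal <x, AD y> for arbitrary x and y.
# TL here is a random matrix standing in for a real tangent-linear model.

rng = np.random.default_rng(0)
n, m = 40, 25
TL = rng.standard_normal((m, n))      # tangent-linear operator (toy stand-in)
AD = TL.T                             # its adjoint (transpose, real inner product)

x = rng.standard_normal(n)            # arbitrary perturbation
y = rng.standard_normal(m)            # arbitrary adjoint-space vector

lhs = np.dot(TL @ x, y)               # <M'x, y>
rhs = np.dot(x, AD @ y)               # <x, M'^T y>
rel_err = abs(lhs - rhs) / abs(lhs)
print(rel_err)                        # near machine precision
```

A hand-coded adjoint of a nonlinear model is tested the same way, operator by operator, with the linearization state held fixed.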

### c. Observation operator

The Weather Research and Forecasting (WRF) model observation operator, developed at the National Center for Atmospheric Research (NCAR), NCEP, the Forecast Systems Laboratory (FSL), and other institutions involved in the WRF project (Barker et al. 2004; Web page at http://www.wrf-model.org), is adopted for use in the CIRA 4DVAR system. In the version used, the WRF observation operator provides an interface with conventional observations, such as aircraft (ACARS, AIREP), surface (SYNOP, SHIP), radiosonde (SOUND), pilot-balloon (PILOT), and METAR measurements. The variables measured are temperature, horizontal winds, pressure, and specific humidity. The interpolation (transformation) operators, and their adjoints, are adopted from the WRF 3DVAR code. The WRF 3DVAR will eventually use all operationally available observations, thus enabling the CIRA 4DVAR system to steadily improve its observation base. The prereleased WRF 3DVAR code, used to create the observation operators in CIRA’s 4DVAR system, works on a single processor and employs linear interpolation operators in all three spatial dimensions.

### d. Control variable

The control variable **z** has three components: (i) initial conditions, (ii) random model error, and (iii) lateral boundary conditions. The initial conditions refer to the initial time of the assimilation period. For convenience, the lateral boundary conditions are formally included as a lateral boundary component of the model error. The systematic model error (*ϕ*) is related to the random model error (**r**) through a first-order relation of the form (Zupanski 1997)

*ϕ*_t = *ν* *ϕ*_{t−1} + *β* **r**_t,

where *t* refers to time, and *ν* and *β* are empirical parameters. Results with the Eta 4DVAR system confirm that the model bias often has the most dominant impact in the forecast after data assimilation (e.g., Zupanski et al. 2002a,b).

The horizontal wind components are replaced in the control variable by the velocity potential *χ* and the streamfunction *ψ*,

*u* = ∂*χ*/∂*x* − ∂*ψ*/∂*y*,  *υ* = ∂*χ*/∂*y* + ∂*ψ*/∂*x*,   (2)

where *u* and *υ* are the horizontal wind components in the east–west and north–south directions, respectively. Note that this change does not impact the formulation of the cost function (1), which is still defined in terms of horizontal velocity components, that is, the observable quantities (Xie et al. 2002). The use of (2) effectively changes the subspace in which the correlations are calculated. Consequently, it allows a simpler modeling of error covariances, as well as simplifying the data assimilation algorithm for use in various geographic regions, with *x–y* axes possibly oriented differently from the east–west direction. This makes a considerable computational savings for error correlation modeling and avoids the rotation of error covariances. In addition, the use of velocity potential and streamfunction implicitly creates a cross correlation between the *u* and *υ* wind components, often neglected in direct modeling of *u* and *υ* correlations.

The control variables are generally chosen to be the predictive variables of an NWP model. In the case of the RAMS model, this includes potential temperature, Exner perturbation function, and the three-dimensional wind vector. The inclusion of microphysical variables (e.g., cloud water, rain, graupel, hail, aggregates, pristine ice, snow) is a real challenge because of unknown statistical properties, unknown correlations with other variables, and the limited observational support that could potentially be used to improve the error statistics.
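The transform (2) can be sketched numerically, assuming the conventional Helmholtz relations *u* = ∂*χ*/∂*x* − ∂*ψ*/∂*y* and *υ* = ∂*χ*/∂*y* + ∂*ψ*/∂*x* and centered differences; grid, spacing, and fields below are illustrative.

```python
import numpy as np

# Sketch of the psi/chi -> (u, v) control-variable transform with centered
# differences. The quadratic test field makes the interior derivatives exact.

def winds_from_psi_chi(psi, chi, dx, dy):
    """u = dchi/dx - dpsi/dy,  v = dchi/dy + dpsi/dx."""
    dchi_dx = np.gradient(chi, dx, axis=1)
    dchi_dy = np.gradient(chi, dy, axis=0)
    dpsi_dx = np.gradient(psi, dx, axis=1)
    dpsi_dy = np.gradient(psi, dy, axis=0)
    return dchi_dx - dpsi_dy, dchi_dy + dpsi_dx

# Pure rotation: psi = (x^2 + y^2)/2 gives u = -y, v = x, exactly at
# interior points for centered differences on a quadratic field.
dx = dy = 1.0
y, x = np.mgrid[0:8, 0:10].astype(float)
psi = 0.5 * (x**2 + y**2)
chi = np.zeros_like(psi)
u, v = winds_from_psi_chi(psi, chi, dx, dy)
print(np.allclose(u[1:-1, 1:-1], -y[1:-1, 1:-1]),
      np.allclose(v[1:-1, 1:-1], x[1:-1, 1:-1]))
```

In a real system the inverse transform (winds to *ψ*, *χ*) requires solving two Poisson equations with appropriate boundary conditions, which is where most of the implementation effort lies.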

## 3. Minimization

The minimization algorithm implemented in RAMDAS is the limited-memory quasi-Newton algorithm of Nocedal (1980). In the experiments presented here, a memoryless version of this algorithm (Luenberger 1984), equivalent to the nonlinear conjugate-gradient algorithm with inexact line search, is used.

### a. Hessian preconditioning (change of variable)

Assuming that the square root 𝗕^{1/2} is symmetric (e.g., Horn and Johnson 1985), the matrix 𝗔 is defined as

𝗔 = Σ_{n=1}^{N} 𝗕^{1/2} 𝗠_n^{T} 𝗛_n^{T} 𝗥^{−1} 𝗛_n 𝗠_n 𝗕^{1/2},

where 𝗠_n and 𝗛_n denote the tangent-linear model and observation operators; note that 𝗔 is symmetric. An optimal preconditioning matrix is the Hessian (Axelsson and Barker 1984). Therefore, the best choice for a change of variable is based on a square root factorization of the Hessian, that is,

**z** − **z**_B = 𝗕^{1/2}(𝗜 + 𝗔)^{−1/2} *ζ*,   (6)

where *ζ* is the control variable in the minimization subspace. In general, because of the high dimensionality of the problem, the matrix 𝗔 is difficult to calculate, or to store in computer memory. In practical applications, 𝗔 is often neglected (Rabier et al. 2000; Parrish and Derber 1992) or approximated by an empirical diagonal matrix (Zupanski 1993b, 1996). The empirical preconditioning method is adopted in the CIRA 4DVAR system, however with a few important improvements related to the approximate calculation of the matrix (𝗜 + 𝗔)^{−1/2}. The new preconditioning methodology is briefly presented here.

In an iterative minimization, the control variable is updated as

**z**_{k+1} = **z**_k + *α*_k **d**_k,

where *k* is the iterative minimization index, 𝗴 is the gradient of the cost function, **d** is the descent direction, and *α* is the step length. Using Newton’s equation 𝗘**d**_k = −𝗴_k, where 𝗘 denotes the Hessian, and after dropping the iterative index *k*, the first few terms of the Taylor expansion are

Δ*J* ≈ −*α* 𝗴^{T}**d** − (*α*^2/2) **d**^{T}𝗘**d**,   (7)

where Δ*J* = *J*(**z**_k) − *J*(**z**_k + *α*_k**d**_k) is the cost function decrease. After inverting the Hessian, and calculating an approximate diagonal matrix, for example, 𝗗 ≈ diag(𝗜 + 𝗔), the last equation reads

Δ*J* ≈ (1/2) Σ_i 𝗴_i^2 / 𝗗_ii.

In earlier work, diag(𝗔) was estimated (e.g., Zupanski 1996). However, preliminary results indicate that the estimate of diag(𝗜 + 𝗔) produces more robust preconditioning.

It is further assumed that 𝗗 will have constant values in the blocks corresponding to a horizontal domain, for each control variable. Thus, the number of unknown matrix elements is *V*_c × *L*_c, where *V*_c is the number of different control variables (e.g., Exner perturbation function, potential temperature, etc.), and *L*_c is the number of vertical levels. With this assumption, the Taylor expansion for the control variable *j* at the vertical level *l* is

Δ*J*_{j,l} ≈ ‖𝗴_{j,l}‖^2 / (2*d*_{j,l}),

where *d*_{j,l} is the unknown diagonal value. The cost function decrease is prescribed as an *expected* relative decrease *ρ*, that is, Δ*J*_{j,l} = *ρJ*_{j,l}, where *J*_{j,l} is the (known) cost function for the control variable *j* at the vertical level *l*. Note that the cost function is not, in general, defined in the model space where the control variables are defined. If that is the case, the cost function *J*_{j,l} should be estimated using the technique described in Zupanski (1996). The practical formula for the unknown diagonal elements of the matrix 𝗗, including a factor used to normalize the gradient norm and the cost function, is

*d*_{j,l} = *η* (‖𝗴_{j,l}‖^2 / ‖𝗴_TOTAL‖^2)(*J*_TOTAL / *J*_{j,l}),

where *η* is an empirical constant, the only one to be determined from experiments (e.g., *ρ* is not explicitly used or estimated). The subscript TOTAL refers to the total values of the gradient norm and the cost function, and, as before, the indexes *j* and *l* refer to the control variable and vertical level, respectively. Therefore, the change of variable (6) is in practice approximated by

**z** − **z**_B = 𝗕^{1/2} 𝗗^{−1/2} *ζ*.

Since 𝗗 is positive definite [𝗗 ≈ diag(𝗜 + 𝗔)], 𝗗^{−1/2} is well defined.
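The block-constant diagonal estimate can be sketched as follows. The normalization is of the form described above (each block's share of the squared gradient norm times the inverse of its share of the cost); the block names, values, and η below are illustrative, and the exact RAMDAS formula may differ in detail.

```python
import numpy as np

# Hedged sketch of the empirical diagonal preconditioner: one constant per
# (control variable j, vertical level l) block, scaled by that block's share
# of the squared gradient norm and by the inverse of its cost share.

def block_preconditioner(grad_blocks, cost_blocks, eta=1.0):
    """Return one diagonal value d_{j,l} per (j, l) block."""
    g_total_sq = sum(np.dot(g, g) for g in grad_blocks.values())
    j_total = sum(cost_blocks.values())
    d = {}
    for key, g in grad_blocks.items():
        d[key] = eta * (np.dot(g, g) / g_total_sq) * (j_total / cost_blocks[key])
    return d

# Two hypothetical blocks: "theta" and "exner" at one level each.
grad_blocks = {("theta", 1): np.array([3.0, 4.0]),   # ||g||^2 = 25
               ("exner", 1): np.array([1.0, 0.0])}   # ||g||^2 = 1
cost_blocks = {("theta", 1): 5.0, ("exner", 1): 5.0}
d = block_preconditioner(grad_blocks, cost_blocks)
print(d)  # larger diagonal where the gradient is large relative to the cost
```

A large diagonal value for a block shrinks that block's update through 𝗗^{−1/2}, which is exactly the behavior wanted where the gradient is steep relative to the achievable cost decrease.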

### b. Restart procedure and monitoring of convergence

One of the necessary requirements for building a robust nonlinear minimization algorithm is to monitor the minimization convergence and to devise an appropriate restart procedure. The convergence monitoring and the restart procedure used in RAMDAS are based on the angle test of Shanno (1985). Although this is the same procedure as used in the Eta 4DVAR system (Zupanski 1996), here we present some important details not reported elsewhere.

The restart procedure prevents the descent direction **d**_k from becoming nearly orthogonal to the gradient: the minimization is restarted from the steepest descent direction whenever the angle test

−𝗴_k^{T}**d**_k ≥ *τ* ‖𝗴_k‖ ‖**d**_k‖   (14)

is not satisfied, where *k* denotes the iteration number, 𝗴 is the gradient, *τ* is approximately the inverse Hessian condition number, and ‖ · ‖ denotes an *l*_2 norm. In our applications *τ* = 3 × 10^{−2}.

This angle test has theoretical as well as practical advantages. It assures that the produced minimization sequence will converge, and the associated computational overhead is negligible. In our applications, the calculation (14) also involves the use of the third-order Butterworth digital low-pass filter (e.g., Roberts and Mullis 1987), which makes the restart less likely after the cutoff iteration number (13 chosen in this study). As a consequence, the presented angle test will have more impact in the first several iterations and will gradually decrease in significance as the minimization proceeds.
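A minimal sketch of an angle test of this type, using the τ value quoted in the text; the exact form used in RAMDAS (including the Butterworth filtering of the test quantity) is more elaborate than this stand-alone check.

```python
import numpy as np

# Shanno-type angle test: keep the descent direction d only if it is
# sufficiently downhill relative to the gradient g; otherwise restart
# from steepest descent. TAU follows the value quoted in the text.

TAU = 3e-2

def accept_direction(g, d, tau=TAU):
    """True if -g.d >= tau * ||g|| * ||d|| (direction kept), else restart."""
    return -np.dot(g, d) >= tau * np.linalg.norm(g) * np.linalg.norm(d)

g = np.array([1.0, 0.0])
print(accept_direction(g, np.array([-1.0, 0.0])))   # steepest descent: kept
print(accept_direction(g, np.array([0.0, 1.0])))    # orthogonal to -g: restart
```

The cost of the test is two dot products and two norms per iteration, which is why its overhead is negligible next to a model and adjoint integration.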

A sufficient condition for the global convergence of a conjugate-gradient-type minimization is that the series Σ_{k=1}^{∞}(cos *γ*_k)^2 or Σ_{k=1}^{∞}(cos *θ*_k)^2 is a divergent series (Shanno 1985), where *θ*_k denotes the angle between the descent direction and the negative gradient, and *γ*_k denotes the corresponding angle for the Fletcher–Reeves algorithm. If test (14) is satisfied, the divergence of the sequence Σ_{k=1}^{∞}(cos *θ*_k)^2 is assured by the divergence of the Fletcher–Reeves series [e.g., Σ_{k=1}^{∞}(cos *γ*_k)^2], implying global convergence of the minimization. Therefore, in agreement with the angle test, one could measure the sequence *ω*_k = (cos *γ*_k)^2/(cos *θ*_k)^2, with values of *ω*_k below one indicating faster divergence of the Σ_{k=1}^{∞}(cos *θ*_k)^2 series.

## 4. Error covariance modeling

The error covariance modeling is similar to the methodology used in Zupanski et al. (2002a,b). A difference comes from the direct modeling of the square root correlation matrices, thus avoiding the need for the eigenvalue decomposition used in Zupanski et al. (2002b). Also, the vertical correlation is added, thus creating fully three-dimensional error covariances at the initial time of the assimilation period. The forecast and model error covariances are modeled using the same procedure. The relative simplicity of the methodology allows an efficient on-the-fly calculation of the square root error covariances, another practical improvement of this 4DVAR algorithm.

In RAMDAS, only simple, univariate correlations are modeled at the initial time of data assimilation. Since the transport of error covariances by the NWP model is an essential part of a 4DVAR algorithm, it is anticipated that the error covariance at the end of the assimilation interval may have the required complex structure. This, however, is not certain, and additional experiments are needed before the validity of the assumption can be reasonably confirmed. Implicitly, it is assumed that the NWP model impact on the error covariance transport cannot be defined satisfactorily using quasigeostrophic, or other forms, of the simplified balance equation. Although this is not necessarily true for synoptic and global scales, it may be true for mesoscales, since there is no known adequate balance constraint available at these scales. The choice of the initial background and model error covariances will impact the covariances at the end time, and there is no assurance that the correlations at the end of the data assimilation interval will be realistic using the approach presented here. Our choice to use the univariate background and model error covariances at the initial time is based on a positive experience with the NCEP Eta 4DVAR system, which employed the same strategy. In addition, the simplified approach adopted here is appealing since the challenging modeling of the initial cross covariances is avoided.

The augmented error covariance 𝗕 is block diagonal,

𝗕 = diag(𝗣_f, 𝗤),

where 𝗣_f is the forecast error covariance, and 𝗤 is the model error covariance. In fact, 𝗤 can be defined as a block-diagonal matrix as well, with components corresponding to the model error covariances at different times during the assimilation period.

### a. Correlations

Each error covariance is modeled as

𝗕 = 𝗕_D^{1/2} 𝗪 𝗖𝗖^{T} 𝗪^{T} 𝗕_D^{1/2},   (18)

where 𝗕_D is the *variance* (e.g., diagonal) matrix, 𝗪 is a linear interpolation from control variable space to model space (e.g., from the Arakawa A grid to the Arakawa C grid in RAMDAS), and 𝗖𝗖^{T} is the *correlation* matrix defined in the control variable (e.g., minimization) space. The square root correlation matrix 𝗖 is a block-diagonal matrix, with each block 𝗖_p representing the autocorrelation of a particular control variable component, where the subscript *p* refers to the perturbation Exner function, potential temperature, streamfunction, and other components of the control variable. To define the three-dimensional square root correlation matrices 𝗖_p, an auxiliary matrix 𝗦 is constructed by convolution of one-dimensional homogeneous and isotropic correlation matrices in the direction of the main spatial axes *x*, *y*, and *z*. This approach is commonly used to produce multidimensional correlations (e.g., Oliver 1995; Purser et al. 2003). Formally, the one-dimensional correlation matrices 𝗫, 𝗬, and 𝗭 are defined as positive semidefinite symmetric band Toeplitz matrices with bandwidths *d*, *f*, and *g*, respectively [Eqs. (20)–(22); e.g., Roberts and Mullis 1987; Golub and van Loan 1989].

The elements of 𝗦, defining the three-dimensional correlation between the central point (*I*, *J*, *K*) and the point (*I*_1, *J*_1, *K*_1), are

*s*_{i,j,k} = *r*^x_i *r*^y_j *r*^z_k,

where *i*, *j*, and *k* are the *relative* indexes in the *x*, *y*, and *z* directions, respectively. Note that *r*^x_0 = *r*^y_0 = *r*^z_0 = 1.

The matrix 𝗦𝗦^{T}, however, has diagonal elements different from one; thus it is not a true correlation matrix. Therefore, a normalization of the 𝗦𝗦^{T} matrix is needed in order to create a true correlation matrix 𝗖_p𝗖_p^{T}. The normalized square root correlation matrix is

𝗖_p = [diag(𝗦𝗦^{T})]^{−1/2} 𝗦,

so that 𝗖_p𝗖_p^{T} represents a true correlation matrix.
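The construction and normalization can be sketched on a small grid. The Gaussian shape of the 1D sequences, the grid sizes, and the bandwidths below are illustrative; the point of the test is that after normalization the diagonal of 𝗖𝗖ᵀ is exactly one.

```python
import numpy as np

# Separable 3D response s(i,j,k) = r_x[|i|] * r_y[|j|] * r_z[|k|] around each
# grid point, followed by the normalization C = diag(S S^T)^(-1/2) S, so that
# C C^T has a unit diagonal (a true correlation matrix).

def corr_sequence(bandwidth, length_scale):
    """1D compact-support correlation values r_0..r_bandwidth, with r_0 = 1."""
    i = np.arange(bandwidth + 1)
    return np.exp(-0.5 * (i / length_scale) ** 2)

nx, ny, nz = 6, 5, 4
rx, ry, rz = corr_sequence(2, 1.5), corr_sequence(2, 1.5), corr_sequence(1, 1.0)

n = nx * ny * nz
S = np.zeros((n, n))
idx = lambda i, j, k: (i * ny + j) * nz + k
for i in range(nx):
    for j in range(ny):
        for k in range(nz):
            for i1 in range(nx):
                for j1 in range(ny):
                    for k1 in range(nz):
                        di, dj, dk = abs(i1 - i), abs(j1 - j), abs(k1 - k)
                        if di < len(rx) and dj < len(ry) and dk < len(rz):
                            S[idx(i, j, k), idx(i1, j1, k1)] = rx[di] * ry[dj] * rz[dk]

# Normalization: C = diag(S S^T)^(-1/2) S.
norm = 1.0 / np.sqrt(np.sum(S * S, axis=1))
C = norm[:, None] * S
print(np.allclose(np.diag(C @ C.T), 1.0))   # a proper correlation matrix
```

In a real system 𝗦 is never formed as a dense matrix; only the compact-support stencil per grid point is stored, which is what the sparsity discussion below exploits.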

_{p}The matrix 𝗖* _{p}* is sparse, and in practice only the nonzero correlations are used or stored. A three-dimensional grid box is defined for each grid point, representing the volume with nonzero correlations. With a judicious choice of indexing, such that it first includes the points inside the grid box, the matrix 𝗖.

𝗖* _{p}* is also a symmetric band Toeplitz matrix. Since the outer product of two symmetric band matrices is also a symmetric band matrix, with the bandwidth equal to the sum of the individual bandwidths, the matrix 𝗖_p𝗖_p^{T} will have its bandwidth doubled. Let ℓ_x, ℓ_y, and ℓ_z denote the decorrelation lengths of the matrix 𝗖_p𝗖_p^{T} in the *x*, *y*, and *z* directions, respectively. Also, let the grid spacing in the *x*, *y*, and *z* directions be denoted Δ*x*, Δ*y*, and Δ*z*. Then, the choice of bandwidths *d*, *f*, and *g* [Eqs. (20)–(22)] should correspond to *d* = ℓ_x/2Δ*x*, *f* = ℓ_y/2Δ*y*, and *g* = ℓ_z/2Δ*z*. Although the matrices 𝗫, 𝗬, and 𝗭 are isotropic, the matrix 𝗖_p is *anisotropic*, since different decorrelation lengths are allowed along the main spatial axes.

The issue of decorrelation length specification still remains. As mentioned earlier (section 2d), there is limited observational support and insufficient knowledge of the statistics of the control variables related to microphysics. This implies the necessity of improving the forecast error statistics for use in mesoscale data assimilation. The situation is even more difficult for the model error statistics. A possible approach to defining the model error variance, and the decorrelation length of the model error covariance, is to use a fraction of the values specified for the forecast error covariance (e.g., Zupanski 1997; Zupanski et al. 2002a,b). Another approach is to estimate the error covariance parameters from observations (e.g., Wahba et al. 1995; Dee and da Silva 1999; Dee et al. 1999). In the experiments presented here, the model error covariance decorrelation length is calculated as a fraction of the forecast error decorrelation length for each variable, that is, one-half to one-third of the decorrelation length used for the forecast error covariance. If a large statistical sample of model errors were available, however, say from operational runs of a weak-constraint 4DVAR, one could use it to define a sample model error covariance, and possibly improve the specification of the model error covariance decorrelation length. This may be difficult to achieve in practice, however, given the ever-changing NWP models.

### b. Variances

As seen from (18), the diagonal elements of the forecast and model error covariances (e.g., variances) are required in order to create a *covariance* matrix from a *correlation* matrix.

The *representative* forecast error (denoted *ErrFcst*) is defined as the square root of the forecast error variance. The vertical profile of representative forecast errors is modeled using exponential functions of the forms (25a) and (25b), where *p* denotes pressure (defined in hectopascals), *p*_ref is a reference pressure, and *σ* and *μ* are parameters defined for each control variable. Note that the expression (25a) reaches its maximum value at the reference pressure, while the properly specified expression (25b) allows a gradual decrease of variance with height, better suited for moisture-related variables. No horizontal variability of forecast errors is assumed. The parameter values used in RAMDAS are given in Table 1. This particular choice results in the following square root variances at the 1000-hPa surface: 1.0 m^2 s^{−2} K^{−1} for the Exner function, 1.5 K for potential temperature, 4.0 m s^{−1} for horizontal winds, and 7.95 × 10^{−4} for the total water mixing ratio.

For the model error variance, however, there is no practical methodology available. This is mostly due to our limited understanding of the errors of NWP models and the unavailability of sample statistics. Not having any good guidance regarding the model error variance, we adopt an ad hoc approach here: the model error variance is defined as a fraction of the forecast error variance. Experimental results with the NCEP Eta Model, also supported by the preliminary results with RAMDAS, suggest that an acceptable definition is 𝗤_D = 1 × 10^{−4}(𝗣_f)_D. A possible way to improve the current procedure is to collect a sample of model errors from a series of data assimilation runs, and then use a sample estimate of the model error covariance, or to use the maximum-likelihood procedure suggested by Dee and da Silva (1999).

_{D}## 5. Experimental design

### a. Synoptic situation

The conducted experiments cover the progression of a developing surface cyclone over the Great Plains between 0600 UTC 8 March 2002 and 0000 UTC 9 March 2002. To present the synoptic situation, the NCEP Eta Model grid analyses are plotted, showing sea level pressure, 1000–500-hPa thickness, and horizontal winds at 250 hPa. The development was noticeable at the surface at 0600 UTC 8 March (Fig. 1a). The 1000–500-hPa thickness indicates a short wave developing at midlevels, which helped maintain the energy source for further cyclone development. At 1200 UTC 8 March (Fig. 1b) the cyclone was well defined, with the center of low pressure over Utah. During the next 12 h (Figs. 1c,d) the surface low pressure moved quickly across Colorado, finally becoming centered over Kansas and Oklahoma. During that time a strong surface convergence developed northeast of the center of low pressure (not shown), further enhancing the system's development into a strong storm over the Midwest.

The RAMDAS integration domain used in this study is 1800 km × 1200 km, approximately centered over eastern Colorado (Fig. 1). The integration domain covers Colorado, Utah, Arizona, New Mexico, Oklahoma, Kansas, Nebraska, Wyoming, and parts of South Dakota and Texas. Given the resolution of 15 km, the RAMDAS domain has 120 × 80 × 31 grid points. This domain size and resolution are chosen in order to capture the cyclone development shown in Fig. 1, as well as to accommodate computational requirements.

### b. Experiments

The first set of experiments will address the issue of correlation modeling, with special emphasis on the choice of vertical decorrelation length. To better illustrate this issue, single-observation 4DVAR experiments will be conducted. In particular, observations of specific humidity and temperature will be assimilated. In one configuration, all decorrelation lengths are as in the control (default) RAMDAS setup. This implies a vertical decorrelation length of 2.0 km for initial conditions, and 0.3 km for model error, for all variables (including temperature and specific humidity). In the other configuration, no vertical correlation is assumed; that is, a Dirac delta-function response is implied (the decorrelation length is 0.0 km for both the initial conditions and the model error).

In the second set of experiments, all available observations are assimilated, using a default RAMDAS setup. In these preliminary experiments, special attention will be paid to the convergence and robustness of the minimization algorithm.

## 6. Results

### a. One-observation 4DVAR experiments

Single-observation 4DVAR experiments are conducted in order to examine the sensitivity of the posterior (analysis) covariance structure to the vertical decorrelation length. The assimilation period is 6 h, from 0600 to 1200 UTC on 8 March 2002. A single observation is placed at the end of the assimilation period, that is, at 1200 UTC 8 March 2002. Radiosonde observations of temperature at 600 hPa, and specific humidity at 900 hPa, are assimilated in separate 4DVAR experiments.

The observation, forecast, and model errors, as well as the geographical location of observations at these points, are given in Table 2. In general, one can note that the observation errors are smaller than the forecast errors, implying that the analysis solution will be closer to observations. This also depends on the model error value, which is in our experiments about two orders of magnitude smaller than the forecast error, similar to the results obtained with the NCEP Eta 4DVAR system.

#### 1) Temperature observation at 600 hPa

Note that although the temperature is observed, the control and predictive variable is the potential temperature. The initial conditions and model error adjustments (optimal solution minus first guess) at the initial time of assimilation (i.e., 0600 UTC 8 March 2002) are shown in Fig. 2. Because of the very small differences between the experiments with different vertical decorrelation lengths, only results from the control experiment (i.e., the experiment with vertical correlations) are shown. One can notice a localized and smooth adjustment, a consequence of using the correlation function described in section 4.

The vertical cross section of potential temperature analysis increments (analysis minus first guess) at the end time of data assimilation, valid at 1200 UTC 8 March 2002, is shown in Fig. 3. One can immediately note that there is no significant sensitivity of the temperature analysis to the choice of temperature vertical decorrelation length. At the same time, there is a deep vertical correlation throughout the troposphere.

A possible explanation for the lack of sensitivity to initial temperature vertical correlations may be due to strong coupling with other variables. For example, temperature is well correlated with winds and pressure in midlatitudes (e.g., Holton 1979). Integration of the forecast and adjoint models in an iterative minimization algorithm, such as the 4DVAR, results in an optimal state with complex contributions from all variables and error covariances. Formally, the impact of vertical correlations in (23) is to alter the three-dimensional correlations. Even without a specified initial vertical correlation, horizontal correlations are still present, impacting the smoothness of the analysis response. It is important to stress that this lack of sensitivity to the choice of initial vertical correlation is not a general result (e.g., Rabier and McNally 1993), and it is very likely that the response at other locations is different. One can only speculate that, at this particular point, atmospheric dynamics allows other variables, as well as temperature in neighboring horizontal points, to have an impact on the vertical column of temperature.

#### 2) Specific humidity observation at 900 hPa

In this experiment the specific humidity is observed, but the control and predictive variable is the total water mixing ratio. The initial condition adjustment (optimal solution minus first guess) at the beginning of the assimilation period, valid at 0600 UTC 8 March 2002, is shown in Fig. 4. There is no significant difference between the experiment with no vertical correlation (Fig. 4a) and the control experiment (Fig. 4b). There is a somewhat larger adjustment in the experiment without vertical correlation, but there is no obvious impact on the vertical scale of the adjustment. For the model error, however, there is a clear signal showing different vertical scales of model error adjustment in the experiment without vertical correlation (Fig. 5a) and the control (Fig. 5b). Although of small magnitude, the model error is added at each model time step during the 6-h assimilation period, eventually producing a strong cumulative impact.

The total water mixing ratio analysis increment, valid at 1200 UTC 8 March 2002, is shown in Fig. 6. In the horizontal cross section (not shown) there is no significant sensitivity to the choice of vertical decorrelation length. In the vertical cross section (Fig. 6), however, there is a notable impact of vertical decorrelation: in the experiment without vertical correlation (Fig. 6a), the analysis increment has clearly shorter scales than in the control experiment (Fig. 6b). The reason may be relatively weaker coupling with other variables. This, however, needs to be taken with caution. In more intense atmospheric development, such as severe storms and tropical cyclones, and also at higher model resolution, microphysics may play a more important role than in the situation shown here, possibly altering the results presented. Future experiments in such extreme weather situations will tell us more about the appropriate choice of decorrelation lengths. At present, one can say that the vertical decorrelation length is an important parameter to examine, with potential consequences for other moisture- and cloud-related control variables (e.g., various mixing ratios).

As in section 6a, one can speculate that, at this particular point, atmospheric dynamics restricts other variables from having a notable impact on the vertical column of specific humidity (i.e., total water mixing ratio). Therefore, the choice of the initial total water mixing ratio correlation does matter in this case.

### b. Convergence analysis with assimilation of all observations

Satisfactory convergence properties of the minimization algorithm are a necessary precursor for a robust and efficient variational data assimilation system. Convergence analysis results are presented here for three 6-h analysis cycles (0600 UTC 8 March 2002–0000 UTC 9 March 2002). Note that the analyses are not cycled; rather, they all start from the NCEP Eta Model–produced analyses, so the background vector is obtained by interpolation from the Eta analysis. As in the single-observation experiments, the resolution of the analysis and the model is 15 km in the horizontal, with 31 vertical levels.

One should be aware that for a highly nonlinear, and possibly discontinuous, cost function, multiple minima may exist. This means that the solution reached using an iterative minimization may not be optimal. Although there is generally no explicit assurance that the global minimum is reached, there are indirect ways to evaluate the minimization performance and gain confidence in the produced analysis. The approach adopted in RAMDAS is to use the restart procedure of Shanno (1985) (section 3b). This allows a test of global convergence, mathematically assured by the Fletcher–Reeves conjugate-gradient algorithm. Another possibility is to evaluate the analyses and forecasts after data assimilation, justifying the benefit of data assimilation without knowing whether the minimum is global or local.
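The restart safeguard can be sketched generically. The snippet below is a textbook Fletcher–Reeves conjugate-gradient loop with a simple periodic restart to the steepest-descent direction, applied to a small quadratic cost. It is a sketch of the underlying idea only, not the RAMDAS implementation; Shanno (1985) employs a more elaborate restart criterion, and the quadratic test problem and restart interval are illustrative assumptions.

```python
import numpy as np

def cg_fletcher_reeves(A, b, x0, restart_every=5, max_iters=20, tol=1e-12):
    """Fletcher-Reeves conjugate gradient with a periodic restart.

    Minimizes J(x) = 0.5 x^T A x - b^T x for symmetric positive definite A.
    Periodically resetting the search direction to the negative gradient is
    the device behind the global-convergence property mentioned in the text.
    """
    x = np.asarray(x0, dtype=float)
    g = A @ x - b              # gradient of the quadratic cost
    d = -g                     # first direction: steepest descent
    for k in range(max_iters):
        if g @ g < tol:
            break              # converged; avoid 0/0 in the step length
        alpha = (g @ g) / (d @ A @ d)   # exact line search for a quadratic
        x = x + alpha * d
        g_new = A @ x - b
        if (k + 1) % restart_every == 0:
            d = -g_new                         # restart the direction
        else:
            beta = (g_new @ g_new) / (g @ g)   # Fletcher-Reeves coefficient
            d = -g_new + beta * d
        g = g_new
    return x

A = np.diag([1.0, 4.0, 9.0])
b = np.array([1.0, 1.0, 1.0])
x = cg_fletcher_reeves(A, b, x0=np.zeros(3))
print(np.allclose(A @ x, b))  # True: the minimizer solves A x = b
```

For this 3 x 3 problem the exact-line-search CG terminates in at most three iterations, so the restart never fires; in a large nonlinear 4DVAR problem the restart is what repairs directions degraded by inexact line searches and nonquadratic cost behavior.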

The relative cost function decrease for the three analysis cycles is shown in Fig. 7. The number of minimization iterations shown varies from 17 to 21, although convergence was reached after only 13 iterations in the 1800 UTC 8 March 2002–0000 UTC 9 March 2002 analysis cycle and after 20 iterations in the 1200–1800 UTC 8 March 2002 cycle. On average, the cost function decreases to about 40%–50% of its initial value. The relative gradient norm decrease, calculated with respect to the background error covariance, is shown in Fig. 8. Except for one case, the gradient norm decreases to 20%–30% of its initial value. There are no dramatic jumps in the gradient norm, indicating a relatively well-controlled and smooth minimization.

One can note a saturation of the gradient norm decrease after about 15 minimization iterations. This could be a consequence of using an adjoint model that does not fully correspond to the forecast model (i.e., an adjoint without convective parameterization and radiation), thus producing a gradient with an error. In this particular application, it is likely that the neglect of convective parameterization in the adjoint model contributes most to the error in the gradient. RAMDAS was designed to run at 1–2-km horizontal resolution with explicit microphysics. At these scales convective parameterization is not dominant, and this was the reason for not developing the adjoint code for the convective parameterization. In this preliminary evaluation of RAMDAS employing a 15-km horizontal grid resolution, however, the convective parameterization has an important impact, indicated by the larger error in the gradient norm noted in Fig. 8. For the subsequent evaluation of minimization performance this may not be critical, since the convergence tests generally examine the tail of the minimization sequence, using iterative increments rather than the actual values of the gradient norm or the cost function.

Here *k* denotes the iteration number and *J* is the cost function. The convergence ratio is also a function of accumulated roundoff errors and the machine accuracy; for the Linux PC cluster used to perform the data assimilation experiments, the chosen accuracy is 2.0 × 10^{−4}. The convergence ratio is shown in Fig. 9. Except for the first analysis cycle (0600–1200 UTC 8 March 2002), where the convergence rate is very close to linear, one can note a sharp decrease at the end of minimization, indicating superlinear convergence. Following Zupanski (1996), the estimated Hessian condition number ranges from 60 to 80.
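As an illustration of how such a diagnostic can be computed, the sketch below evaluates the ratio of successive cost-function decrements over a minimization sequence. This textbook form of the convergence ratio (e.g., Luenberger 1984) is an assumption for illustration; the paper's own defining equation is not reproduced in this excerpt.

```python
def convergence_ratios(costs):
    """Ratio of successive cost-function decrements (assumed textbook form).

    r_k = (J_k - J_{k+1}) / (J_{k-1} - J_k).  A roughly constant ratio
    below one indicates linear convergence; a sharp drop toward zero at
    the tail of the sequence indicates superlinear convergence.
    """
    ratios = []
    for k in range(1, len(costs) - 1):
        prev_drop = costs[k - 1] - costs[k]
        next_drop = costs[k] - costs[k + 1]
        if prev_drop <= 0.0:
            break  # minimization stalled; the ratio is undefined
        ratios.append(next_drop / prev_drop)
    return ratios

# A linearly converging sequence J_k = 0.5**k has a constant ratio of 0.5
linear = [0.5 ** k for k in range(8)]
print(convergence_ratios(linear))  # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
```

Because the diagnostic uses only differences of successive cost values, it is insensitive to the absolute scale of *J*, but, as noted in the text, it is sensitive to roundoff once the decrements approach machine accuracy.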

As suggested in section 3, a variable derived from an angle test, *ω*_{k} [Eq. (16)], can be used to monitor the global convergence of the minimization algorithm. The sequence *ω*_{k} is shown in Fig. 10. Values from the different assimilation experiments are very close, making them difficult to distinguish from each other. More importantly, all values are significantly smaller than the threshold (equal to one), indicating a consistently good range of angles between the descent direction and the negative gradient. This is a strong indication that the minimization is globally convergent in all three situations (e.g., Shanno 1985).
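A minimal sketch of an angle-test diagnostic of this kind is given below. The specific form ω_k = 1 − cos θ_k, with θ_k the angle between the descent direction and the negative gradient, is an assumption for illustration and is not necessarily identical to Eq. (16); it does share the stated property that values below the threshold of one indicate a genuine descent direction.

```python
import numpy as np

def angle_test(direction, gradient):
    """Angle-test diagnostic (assumed form, not necessarily Eq. (16)).

    omega = 1 - cos(theta), with theta the angle between the descent
    direction and the negative gradient.  omega < 1 means the direction
    still points downhill; values well below one indicate a direction
    close to steepest descent.
    """
    d = np.asarray(direction, dtype=float)
    g = np.asarray(gradient, dtype=float)
    cos_theta = (d @ -g) / (np.linalg.norm(d) * np.linalg.norm(g))
    return 1.0 - cos_theta

g = np.array([2.0, 0.0])
print(angle_test([-2.0, 0.0], g))   # steepest descent: omega = 0
print(angle_test([-2.0, 2.0], g))   # 45 degrees off:   omega ~ 0.29
print(angle_test([2.0, 0.0], g))    # uphill direction: omega = 2 > 1
```

Monitoring such a sequence over the iterations, as in Fig. 10, is cheap: it reuses the gradient and direction vectors the minimization already holds.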

## 7. Concluding remarks

A new 4DVAR data assimilation system is developed, named RAMDAS. It includes the nonhydrostatic CSU/RAMS forecast model and its adjoint. The observation operators, both forward and adjoint, are obtained from the WRF 3DVAR algorithm. Although developed after the NCEP Eta 4DVAR system, RAMDAS incorporates several important new features: improved error covariance modeling, upgraded Hessian preconditioning, and new definitions of control variables. Like the NCEP Eta 4DVAR system, RAMDAS is a *weak-constraint* 4DVAR algorithm; that is, both the initial conditions and the model error (bias) are adjusted.

Important features, such as the control variable definition, error covariance modeling, Hessian preconditioning, and restart procedure, are presented and discussed in realistic 4DVAR data assimilation experiments.

The impact of vertical error covariance modeling (e.g., vertical decorrelation length) is found to be strong for the total water mixing ratio. Although more experiments are needed to gain experience with assimilation of cloud-related observations, our preliminary experiments indicate that careful consideration of vertical decorrelation lengths will be needed.

The minimization algorithm performs smoothly, and for the given computer accuracy, satisfactory convergence was reached in 10–15 iterations. The convergence rate is close to linear, with some indication of superlinear convergence. The algorithm is globally convergent, as indicated by an angle test.

The experiments conducted here employ 15-km horizontal resolution forecast and adjoint models, with 31 vertical levels. RAMDAS, however, is ultimately intended for application to mesoscale weather events requiring 1–2-km horizontal resolution. From that perspective, the presented results are preliminary, and further adjustment and development of RAMDAS is anticipated. A few issues noted in this study, namely the choice of vertical decorrelation length and the saturation of the gradient norm decrease, will be further investigated in high-resolution (e.g., 1–2 km) RAMDAS experiments. RAMDAS has already been applied to the assimilation of Geostationary Operational Environmental Satellite (GOES) radiances in a cloudy atmosphere, presented in Vukicevic et al. (2004). Examination of the impact of clouds and precipitation on mesoscale weather is one of the main future directions of RAMDAS applications. This will be especially important for improved prediction of severe storms and hurricanes. Overall, the presented 4DVAR algorithm shows the robust and efficient performance needed for future challenging assimilation applications.

## Acknowledgments

We thank Dr. Robert Walko for his invaluable help during the early stages of this work. We would also like to thank Mr. Mark Fassler for facilitating the use of CIRA’s Linux PC cluster computing system. Thorough reviews by two anonymous reviewers are greatly appreciated. This work was supported by the Department of Defense Center for Geosciences/Atmospheric Research Grants DAAD19-01-2-0018 and DAAD19-02-2-0005.

## REFERENCES

Axelsson, O., and V. A. Barker, 1984: *Finite-Element Solution of Boundary-Value Problems: Theory and Computation.* Academic Press, 432 pp.

Barker, D. M., W. Huang, Y-R. Guo, and Q. N. Xiao, 2004: A three-dimensional (3DVAR) data assimilation system for use with MM5: Implementation and initial results. *Mon. Wea. Rev.*, **132**, 889–914.

Bennett, B. S., S. Chua, and L. M. Leslie, 1996: Generalized inversion of a global numerical weather prediction model. *Meteor. Atmos. Phys.*, **60**, 165–178.

Cohn, S. E., A. da Silva, J. Guo, M. Sienkiewicz, and D. Lamich, 1998: Assessing the effects of data selection with the DAO physical-space statistical analysis system. *Mon. Wea. Rev.*, **126**, 2913–2926.

Courtier, P., J-N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var using an incremental approach. *Quart. J. Roy. Meteor. Soc.*, **120**, 1367–1387.

Courtier, P., and Coauthors, 1998: The ECMWF implementation of three-dimensional variational assimilation (3D-Var). I: Formulation. *Quart. J. Roy. Meteor. Soc.*, **124**, 1783–1808.

Daley, R., and E. Barker, 2001: NAVDAS: Formulation and diagnostics. *Mon. Wea. Rev.*, **129**, 869–883.

Dee, D. P., and A. da Silva, 1999: Maximum-likelihood estimation of forecast and observation error covariance parameters. Part I: Methodology. *Mon. Wea. Rev.*, **127**, 1822–1834.

Dee, D. P., G. Gaspari, C. Redder, L. Rukhovets, and A. da Silva, 1999: Maximum-likelihood estimation of forecast and observation error covariance parameters. Part II: Applications. *Mon. Wea. Rev.*, **127**, 1835–1849.

Derber, J. C., and W-S. Wu, 1998: The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. *Mon. Wea. Rev.*, **126**, 2287–2302.

Errico, R., T. Vukicevic, and K. Raeder, 1993: Comparison of initial and lateral boundary condition sensitivity for a limited-area model. *Tellus*, **45A**, 539–557.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. *Quart. J. Roy. Meteor. Soc.*, **125**, 723–757.

Gauthier, P., and J-N. Thepaut, 2001: Impact of the digital filter as a weak constraint in the preoperational 4DVAR assimilation system of Météo-France. *Mon. Wea. Rev.*, **129**, 2089–2102.

Giering, R., and T. Kaminski, 1998: Recipes for adjoint code construction. *Assoc. Comput. Mach. Trans. Math. Software*, **24**, 437–474.

Golub, G. H., and C. F. van Loan, 1989: *Matrix Computations.* 2d ed. Johns Hopkins University Press, 642 pp.

Guo, Y-R., Y-H. Kuo, J. Dudhia, D. Parsons, and C. Rocken, 2000: Four-dimensional variational data assimilation of heterogeneous mesoscale observations for a strong convective case. *Mon. Wea. Rev.*, **128**, 619–643.

Gustafsson, N., P. Lonnberg, and J. Pailleux, 1997: Data assimilation for high-resolution limited-area models. *J. Meteor. Soc. Japan*, **75**, 367–382.

Harrington, J. Y., 1997: The effects of radiation and microphysical processes on simulated warm and transition season Arctic stratus. Ph.D. dissertation, Colorado State University, 289 pp. [Available from Colorado State University, Dept. of Atmospheric Science, Fort Collins, CO 80523.]

Holton, J. R., 1979: *An Introduction to Dynamic Meteorology.* Academic Press, 391 pp.

Horn, R. A., and C. R. Johnson, 1985: *Matrix Analysis.* Cambridge University Press, 575 pp.

Huang, X-Y., and P. Lynch, 1993: Diabatic digital-filtering initialization: Application to the HIRLAM model. *Mon. Wea. Rev.*, **121**, 589–603.

LeDimet, F. X., and O. Talagrand, 1986: Variational algorithm for analysis and assimilation of meteorological observations: Theoretical aspects. *Tellus*, **38A**, 97–110.

Lewis, J. M., and J. C. Derber, 1985: The use of adjoint equations to solve a variational adjustment problem with advective constraints. *Tellus*, **37A**, 309–322.

Lorenc, A. C., and Coauthors, 2000: The Met. Office global three-dimensional variational data assimilation scheme. *Quart. J. Roy. Meteor. Soc.*, **126**, 2991–3012.

Luenberger, D. L., 1984: *Linear and Non-linear Programming.* 2d ed. Addison-Wesley, 491 pp.

Meyers, M. P., R. L. Walko, J. Y. Harrington, and W. R. Cotton, 1997: New RAMS cloud microphysics parameterization. *Atmos. Res.*, **45**, 3–39.

Nicholls, M. E., R. A. Pielke, J. L. Eastman, C. A. Finley, W. A. Lyons, C. I. Tremback, R. L. Walko, and W. R. Cotton, 1995: Applications of the RAMS numerical model to dispersion over urban areas. *Wind Climate in Cities*, J. E. Cermak et al., Eds., Kluwer Academic, 703–732.

Nocedal, J., 1980: Updating quasi-Newton matrices with limited storage. *Math. Comput.*, **35**, 773–782.

Oliver, D., 1995: Moving averages for Gaussian simulation in two and three dimensions. *Math. Geol.*, **27**, 939–960.

Parrish, D. F., and J. C. Derber, 1992: The National Meteorological Center’s Spectral Statistical Interpolation Analysis System. *Mon. Wea. Rev.*, **120**, 1747–1763.

Pielke, R. A., and Coauthors, 1992: A comprehensive meteorological modeling system—RAMS. *Meteor. Atmos. Phys.*, **49**, 69–91.

Puri, K., and G. A. Mills, 1997: Initial state specification for mesoscale applications. *J. Meteor. Soc. Japan*, **75**, 395–413.

Purser, R. J., W-S. Wu, D. F. Parrish, and N. M. Roberts, 2003: Numerical aspects of the application of recursive filters to variational statistical analysis. Part I: Spatially homogeneous and isotropic Gaussian covariances. *Mon. Wea. Rev.*, **131**, 1524–1535.

Rabier, F., and T. McNally, 1993: Evaluation of forecast error covariance matrix. ECMWF Tech. Memo. 195, 36 pp.

Rabier, F., H. Jarvinen, E. Klinker, J-F. Mahfouf, and A. Simmons, 2000: The ECMWF operational implementation of four-dimensional variational assimilation. I: Experimental results with simplified physics. *Quart. J. Roy. Meteor. Soc.*, **126A**, 1143–1170.

Roberts, R. A., and C. T. Mullis, 1987: *Digital Signal Processing.* Addison-Wesley, 578 pp.

Shanno, D. F., 1985: Globally convergent conjugate gradient algorithms. *Math. Programm.*, **33**, 61–67.

Sun, J., and N. A. Crook, 1997: Dynamical and microphysical retrieval from Doppler radar observations using a cloud model and its adjoint. Part I: Model development and simulated data experiments. *J. Atmos. Sci.*, **54**, 1642–1661.

Sun, J., and N. A. Crook, 1998: Dynamical and microphysical retrieval from Doppler radar observations using a cloud model and its adjoint. Part II: Retrieval experiments of an observed Florida convective storm. *J. Atmos. Sci.*, **55**, 835–852.

Tripoli, G. J., and W. R. Cotton, 1982: The Colorado State University three-dimensional cloud/mesoscale model—Part I: General theoretical framework and sensitivity experiments. *Rech. Atmos.*, **16**, 185–219.

Vukicevic, T., T. Greenwald, M. Zupanski, D. Zupanski, T. Vonder Haar, and A. S. Jones, 2004: Mesoscale cloud state estimation from visible and infrared satellite radiance. *Mon. Wea. Rev.*, **132**, 3066–3077.

Walko, R., W. R. Cotton, M. P. Meyers, and J. Y. Harrington, 1995: New RAMS cloud microphysics parameterization. Part I: The single-moment scheme. *Atmos. Res.*, **38**, 29–62.

Wahba, G., D. R. Johnson, F. Gao, and J. Gong, 1995: Adaptive tuning of numerical weather prediction models: Randomized GCV in three- and four-dimensional data assimilation. *Mon. Wea. Rev.*, **123**, 3358–3369.

Wee, T-K., and Y-H. Kuo, 2004: Impact of digital filter as a weak constraint in MM5 4DVAR: An observing system simulation experiment. *Mon. Wea. Rev.*, **132**, 543–559.

Xie, Y., C. Lu, and G. L. Browning, 2002: Impact of formulation of cost function and constraints on three-dimensional variational data assimilation. *Mon. Wea. Rev.*, **130**, 2433–2447.

Zou, X., and Y-H. Kuo, 1996: Rainfall assimilation through an optimal control of initial and boundary conditions in a limited-area mesoscale model. *Mon. Wea. Rev.*, **124**, 2859–2882.

Zou, X., and Q. Xiao, 2000: Studies on the initialization and simulation of a mature hurricane using a variational bogus data assimilation scheme. *J. Atmos. Sci.*, **57**, 836–860.

Zou, X., Y-H. Kuo, and Y-R. Guo, 1995: Assimilation of atmospheric radio refractivity using a nonhydrostatic adjoint model. *Mon. Wea. Rev.*, **123**, 2229–2249.

Zou, X., H. Liu, J. Derber, J. G. Sela, R. Treadon, I. M. Navon, and B. Wang, 2001: Four-dimensional variational data assimilation with a diabatic version of the NCEP global spectral model: System development and preliminary results. *Quart. J. Roy. Meteor. Soc.*, **127**, 1095–1122.

Zupanski, D., 1997: A general weak constraint applicable to operational 4DVAR data assimilation systems. *Mon. Wea. Rev.*, **125**, 2274–2292.

Zupanski, D., M. Zupanski, E. Rogers, D. Parrish, and G. DiMego, 2002: Fine resolution 4DVAR data assimilation for the Great Plains Tornado Outbreak. *Wea. Forecasting*, **17**, 506–525.

Zupanski, M., 1993a: Regional four-dimensional variational data assimilation in a quasi-operational forecasting environment. *Mon. Wea. Rev.*, **121**, 2396–2408.

Zupanski, M., 1993b: A preconditioning algorithm for large-scale minimization problems. *Tellus*, **45A**, 578–592.

Zupanski, M., 1996: A preconditioning algorithm for four-dimensional variational data assimilation. *Mon. Wea. Rev.*, **124**, 2562–2573.

Zupanski, M., D. Zupanski, D. Parrish, E. Rogers, and G. DiMego, 2002: Four-dimensional variational data assimilation for the Blizzard of 2000. *Mon. Wea. Rev.*, **130**, 1967–1988.

*Table caption:* Geographic location and errors in single-observation experiments.