Four-dimensional variational data assimilation (4D-Var) is used for generating estimates of model variables, taking into account both information about the dynamics and physics from a numerical model, and available information about the true state of the atmosphere contained in the observations and the background. Usually, a cost function is defined measuring the distance between a model trajectory and the observations over an assimilation time window. 4D-Var is implemented mathematically by minimizing the defined cost function with respect to the control variables. The minimization algorithm used in 4D-Var employs the adjoint equations for the computation of the gradient of the cost function with respect to control variables.
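The role of the adjoint in computing the gradient can be illustrated with a minimal sketch. It assumes a hypothetical scalar toy model (`step`), not FSUGSM, and a cost function containing only an observation term with unit weights; all function names are illustrative. The backward (adjoint) sweep yields the gradient of the cost function with respect to the initial state, which is then compared with a finite-difference estimate.

```python
def step(x, dt=0.1):
    """One time step of a hypothetical scalar nonlinear model."""
    return x + dt * (x - x ** 3)

def step_tangent(x, dt=0.1):
    """Derivative of step() with respect to x (tangent linear model)."""
    return 1.0 + dt * (1.0 - 3.0 * x ** 2)

def trajectory(x0, nsteps):
    """Integrate the model forward and return all states."""
    xs = [x0]
    for _ in range(nsteps):
        xs.append(step(xs[-1]))
    return xs

def cost(x0, obs):
    """Half the summed squared misfit to observations over the window."""
    xs = trajectory(x0, len(obs) - 1)
    return 0.5 * sum((x - y) ** 2 for x, y in zip(xs, obs))

def gradient_adjoint(x0, obs):
    """Gradient of cost() with respect to x0 via a backward adjoint sweep."""
    xs = trajectory(x0, len(obs) - 1)
    lam = xs[-1] - obs[-1]                      # forcing at the final time
    for k in range(len(obs) - 2, -1, -1):
        # For a scalar model the adjoint of the tangent linear step is
        # multiplication by the same derivative.
        lam = step_tangent(xs[k]) * lam + (xs[k] - obs[k])
    return lam

obs = trajectory(0.5, 10)       # "observations" from a reference run
x_guess = 0.8                   # perturbed initial guess
g_adj = gradient_adjoint(x_guess, obs)
eps = 1e-6
g_fd = (cost(x_guess + eps, obs) - cost(x_guess - eps, obs)) / (2 * eps)
```

In this toy setting `g_adj` and `g_fd` should agree closely, and the gradient vanishes at the reference initial state; a gradient check of this kind is a common way to test that an adjoint is consistent with its forward model.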
Thus, a successful 4D-Var system hinges upon three main aspects: the use of observations and background information, the accuracy of the assimilation model used, and the efficiency of the minimization algorithm. In principle, improvement in any one of these aspects may improve the 4D-Var system. Sizable progress has been made in each of them during the last decade. However, due to the enormous size of meteorological problems, such improvement is greatly restricted by presently available computer platforms.
In the present study, we focus on how to use a full physics adjoint model in 4D-Var. The inclusion of the totality of the physical processes in the adjoint model is a required step toward obtaining a better defined cost function. Thus this research aims to address improvements in the second aforementioned aspect, along with a discussion of associated computational CPU requirements. Issues related to model errors are not discussed here.
4D-Var was first applied to simple models (Le Dimet and Talagrand 1986; Lewis and Derber 1985; Courtier and Talagrand 1987; Talagrand and Courtier 1987), before being tested in the context of adiabatic primitive equation models (Thepaut and Courtier 1991; Navon et al. 1992b; Chao and Chang 1992; M. Zupanski 1993). Recently, increasingly sophisticated physical parameterizations have been introduced into 4D-Var (e.g., Zou et al. 1993a; Zou and Kuo 1996; Zou 1997; Tsuyuki 1997; Zupanski and Mesinger 1995; D. Zupanski 1993; Mahfouf and Rabier 1998, manuscript submitted to Quart. J. Roy. Meteor. Soc.). These studies have provided encouraging evidence that the inclusion of physics in the adjoint may improve the performance of 4D-Var.
In the context of an operational environment, Rabier et al. (1997, 1998), Rabier et al. (1998, manuscript submitted to Quart. J. Roy. Meteor. Soc.) and Klinker et al. (1998, manuscript submitted to Quart. J. Roy. Meteor. Soc.) used the European Centre for Medium-Range Weather Forecasts (ECMWF) 4D-Var incremental assimilation system to examine the influence of physical processes. The inclusion of physics was found to have the largest impact on the analysis of humidity fields. A positive impact was also found on the performance of analyses in the tropics, with a reduction of the spinup of precipitation in the subsequent forecast, and improved wind scores. Also, an improvement in extratropical scores was noted. It should be pointed out that the physics in the ECMWF adjoint model consists of a simplified version of the physics package used in the nonlinear forecasting model (Mahfouf et al. 1996; Mahfouf 1999; Mahfouf and Rabier 1998, manuscript submitted to Quart. J. Roy. Meteor. Soc.).
While recent research has shown a beneficial impact of adjoint physics on the quality of assimilated data, the effect of nonlinearities in physical processes on the convergence rate of 4D-Var minimization is still not adequately understood. Theoretically, an adjoint model with full physics should be completely consistent with the nonlinear forecasting model, thus providing exact gradients. However, physical parameterizations display much stronger nonlinearities than the model dynamical part, even when they do not contain discontinuities or on-off switches. Due to the presence of strong nonlinearities in physical parameterizations, the accurate gradient and Hessian matrix may still not provide an effective descent direction for a minimization process.
The 4D-Var experiments in the present research will show that, after an intermediate number of iterations at the early stages of the minimization process, relatively large errors in the assimilation analysis are closely related to precipitation when full physics is included in the adjoint model, and that these errors are larger than those in assimilation analyses for which almost no physics is included in the adjoint model. On the other hand, previous studies (e.g., Zou and Kuo 1996; Zou 1997) showed that strong nonlinearities, even on-off switches, need not have a negative impact on the final convergence of the minimization process, and thus the inclusion of physics in the adjoint model has an overall beneficial impact on assimilation results.
Therefore, the inclusion of full physics in the adjoint model requires a 4D-Var algorithm capable of overcoming the negative effect of strong nonlinearities present in physics at the early stages of the minimization process, while being able to take advantage of the positive aspects resulting from consistency between the forecasting nonlinear model and adjoint model.
Several approaches have been proposed for mitigating the negative effect of strong nonlinearities in physical processes included in the adjoint model by either direct modifications or simplifications to physical parameterizations themselves. Zupanski and Mesinger (1995) and Tsuyuki (1997) showed a beneficial effect when smoothing formulas are used to replace those with discontinuities. This technique is applicable for dealing with various nonlinearities. ECMWF uses simplified physics in the adjoint model (Mahfouf et al. 1996; Rabier et al. 1997).
An alternative approach is to deal with the nonlinear problem of physical processes in the adjoint model in the framework of the minimization procedure. In 4D-Var, physical processes are indeed of secondary importance for large-scale problems, compared with the dynamical processes, for short assimilation windows (less than 24 h). This view is supported by the success obtained by 4D-Var at ECMWF, where simple horizontal and vertical diffusion are the sole physical processes included in the adjoint model used (Rabier et al. 1997, 1998). Further, physical processes are controlled by dynamical processes to a large degree for large-scale dynamics. These features allow us to deal with physical processes in a fashion differing from that used with dynamical processes. This basic view constitutes the foundation of this research. As an immediate consequence of this tenet, we naturally resort to the idea of a progressive inclusion of physical processes in 4D-Var as the minimization process proceeds, an idea first proposed by Courtier et al. (1994).
The progressive inclusion of physical processes can be implemented in 4D-Var using the incremental approach proposed by Courtier et al. (1994). In this approach, the minimization is performed with respect to increments. An increment is defined as a deviation from a background (guess) trajectory. The background trajectory is computed with the full nonlinear forecasting model, which has high resolution and comprehensive physics. A simplified linear model with low resolution and/or simplified physics is used for solving the minimization problem in the vicinity of the trajectory. This method is further extended by defining a sequence of cost functions for different stages of the minimization process. The approach then allows a progressive inclusion of physical processes, for instance, using simplified physics adjoint models during the initial stages and full physics adjoint models toward the final stages of the minimization process. Because the effects of nonlinearities are weaker for smaller perturbations, this sequential cost function incremental method offers an alternative approach for alleviating effects of strong nonlinearities present in physics. This assertion will be validated using 4D-Var experiments designed in this research work.
Courtier et al. (1994) mentioned that there is no guarantee for convergence of the incremental approach using a sequence of cost functions. Also, the convergence rate may be negatively affected by using different cost functions in the course of a minimization iteration procedure. Minimization algorithms used in 4D-Var usually belong to the class of limited-memory quasi-Newton methods. The basic motivation behind limited-memory quasi-Newton methods is to obtain the rapid convergence associated with Newton’s method without explicitly evaluating the Hessian of a cost function at every iteration. This is accomplished by constructing approximations to the inverse Hessian based on information gathered during the descent process prior to the current iteration (Liu and Nocedal 1989). In 4D-Var, the minimization is terminated after a number of iterations much smaller than the dimension of the problem. Using a sequence of cost functions may cause additional difficulties for obtaining a good approximation to the inverse Hessian.
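To make concrete how a limited-memory method builds its inverse Hessian approximation from earlier iterations, the following is a minimal sketch of the well-known two-loop recursion used in L-BFGS-type methods. The function name `lbfgs_apply` and the sample vectors are illustrative assumptions and are not taken from the minimization code used in this study. The stored pairs are differences of successive iterates (`s_list`) and of successive gradients (`y_list`); if these pairs come from different cost functions, the curvature information they carry is no longer mutually consistent, which is the difficulty noted above.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def lbfgs_apply(q, s_list, y_list):
    """Return H*q, where H is the limited-memory inverse-Hessian approximation.

    s_list[i] = x_{i+1} - x_i and y_list[i] = g_{i+1} - g_i, oldest first.
    The search direction at the current gradient g is -lbfgs_apply(g, ...).
    """
    q = list(q)
    stored = []
    # First loop: newest pair to oldest pair.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        stored.append((rho, alpha))
    # Initial scaling taken from the most recent pair.
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest pair.
    for (s, y), (rho, alpha) in zip(zip(s_list, y_list), reversed(stored)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r

# Illustrative pairs for a quadratic with Hessian diag(2, 3), so y = A s.
s_pairs = [[1.0, 0.0], [0.5, 1.0]]
y_pairs = [[2.0, 0.0], [1.0, 3.0]]
h_times_y = lbfgs_apply(y_pairs[-1], s_pairs, y_pairs)
```

By construction the approximation satisfies the secant condition for the most recent pair, so `h_times_y` reproduces the last iterate difference `[0.5, 1.0]` up to rounding.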
The truncated Newton minimization method has been extensively investigated and applied to various research areas (e.g., Dembo and Steihaug 1983; Nash 1985; Nash and Nocedal 1991; Schlick and Fogelson 1992). It has also been applied to meteorological problems (Navon et al. 1992a; Zou et al. 1993b; Wang et al. 1992, 1995). The standard truncated Newton method consists of nested iterations: an outer iteration and inner iteration. The theoretical framework of truncated Newton methods presents a suitable tool for examining performance of the incremental method using a sequence of cost functions. In fact, the incremental method using a sequence of cost functions can be viewed as an algorithmic variant of the truncated Newton method. As such, we expect to refine the sequential cost function algorithm and better analyze its convergence properties by using the theoretical framework of truncated Newton methods.
The aforementioned effort requires a specific and detailed understanding of the impact of physical processes in the adjoint model on 4D-Var. Here we use an adjoint model with full physics based on The Florida State University global spectral model (FSUGSM). This model has been successfully applied to carry out both 4D-Var and optimal parameter estimation by Zhu and Navon (1998, 1999) and Tsuyuki (1996, 1997). Having a full physics adjoint model at our disposal, we can use it as a benchmark with which to compare other adjoint models with partial physical processes. Carrying out such comparisons enables us to examine how physical processes in the adjoint model affect the minimization process in 4D-Var.
The outline of this paper is as follows. In section 2, we briefly summarize the essential features of the FSUGSM model and its full physics adjoint model. Section 3 presents a description of a standard 4D-Var approach using the full physics adjoint model and an incremental method that involves very simple physics in the adjoint. Section 4 details comparisons between results obtained via the incremental 4D-Var and those obtained using the standard 4D-Var, followed by discussions on the effect of physical processes on 4D-Var. Section 5 examines the incremental method with a sequence of cost functions, and a new truncated Newton-like incremental method is presented and tested. Finally, section 6 discusses and summarizes the numerical results obtained in this study.
2. A brief description of FSUGSM and its adjoint
FSUGSM has been used in numerical weather forecasts for operational purposes for more than a decade. Forecasts using this model especially emphasize tropical aspects such as monsoon and tropical storms (e.g., Krishnamurti et al. 1991).
The model has a comprehensive advanced physical parameterization package. The main physical parameterizations include a fourth-order horizontal diffusion (Kanamitsu et al. 1983), a modified Kuo-type convective scheme (Krishnamurti et al. 1993), dry convective adjustment, large-scale condensation (Kanamitsu et al. 1983), surface flux via similarity theory (Businger et al. 1971), vertical distribution of fluxes utilizing diffusive formulation where the exchange coefficients are functions of the Richardson number (Louis 1979), longwave and shortwave radiative fluxes based on a band model (Harshvardan and Corsetti 1984; Lacis and Hansen 1974), computation of low, middle, and high clouds based on threshold relative humidity for radiative calculation, and surface energy balance coupled to the similarity theory (Krishnamurti et al. 1991).
The adjoint system of FSUGSM has been developed over several years of effort. The adjoint system includes the dynamic core (Wang 1993) and all of the abovementioned physical parameterizations (Tsuyuki 1996; Zhu and Navon 1997). To improve the performance of the linearized model from which the adjoint model was derived, a number of smoothing techniques were introduced to remove discontinuities in some physical parameterization formulations prior to derivation of the adjoint model (Tsuyuki 1996, 1997; Zhu and Navon 1997).
The dependent variables of the forecasting model as well as the adjoint model include vorticity, divergence, the logarithm of surface pressure, temperature, and dewpoint depression that is defined as the difference between temperature and dewpoint temperature. In the following variational data assimilation experiments, the spectral expansion of the model variables is triangularly truncated at the wavenumber 42 (T42). A sigma (σ) coordinate is used in the vertical and the vertical resolution consists of 12 layers roughly between 100 and 1000 hPa.
3. Formulation descriptions
a. Standard 4D-Var
We carry out experiments with the standard 4D-Var using the full physics adjoint model, that is, the adjoint model is consistent with the forecasting model. The result serves as a benchmark for comparison with results obtained from simplified forms of 4D-Var. One purpose of this research is to examine the impact of physical processes on minimization processes and on errors in resulting assimilation analyses at different stages of minimization processes.
To examine errors in the assimilation analyses, the experiments are designed in an identical twin framework. Thus assimilation errors are just the differences between assimilation analyses and observations. A reference forecast is carried out from t0 = 0 to t1 = 6 h using the full physics forecasting model, and the initial field and the results of the forecast at t1 = 6 h are used as “observations.” The 4D-Var is carried out to recover the state at time t0. The initial condition of the reference forecast is the initialized analysis valid at 0000 UTC 3 September 1996. The guess initial condition (or the background field) is constructed by adding random perturbations with a root-mean-square error of 2.5 × 10⁻⁵ s⁻¹ for vorticity, 1.6 × 10⁻⁵ s⁻¹ for divergence, 4.0 K for dewpoint depression, and 2.1 K for temperature.
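The construction of a perturbed guess field with a prescribed error magnitude can be sketched as follows. This is an illustrative assumption rather than the procedure actually used: the text does not state how the random perturbations were drawn or whether they were rescaled, and the field values, the Gaussian noise, and the helper names `rms` and `perturb_to_rms` are all hypothetical.

```python
import math
import random

def rms(values):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def perturb_to_rms(field, target_rms, seed=0):
    """Add Gaussian noise, rescaled so its rms equals target_rms.

    Exact rescaling is an assumption made for this sketch only.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in field]
    scale = target_rms / rms(noise)
    return [f + scale * n for f, n in zip(field, noise)]

# Made-up "reference" temperature values (K); not model output.
truth_temperature = [280.0 + 0.1 * i for i in range(1000)]
guess_temperature = perturb_to_rms(truth_temperature, target_rms=2.1)
error = [g - t for g, t in zip(guess_temperature, truth_temperature)]
```

In an identical twin setting the same `rms` helper, applied to the difference between an analysis and the reference field, gives the kind of error measure discussed in section 4.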
b. Incremental 4D-Var
The idea central to the incremental method is to use a simplified and linearized version of the forward forecasting model to compute the evolution of the increment δx(t0), while the evolution of the background is predicted with the full forward forecasting model. The impact of the simplification and linearization is thus considerably reduced.
The definition of (3.6) corresponds to the one (referred to as
In (3.6), x̃b(ti) and di are kept constant during the minimization process. Only x̃(ti), associated with x̃(t0), is changed as the minimization iteration proceeds. Correspondingly, only the adjoint model associated with
In incremental 4D-Var experiments, the simplified model does not include physical processes except for horizontal diffusion and a simple surface drag scheme as the one used in Tsuyuki (1996, 1997), while all other aspects remain the same as in the full forecasting model. Accordingly, the adjoint model includes only the dynamic core, the horizontal diffusion, and the simple surface drag scheme.
4. Comparisons between the standard and incremental 4D-Var
The incremental method provides us with a powerful tool to examine the impact of inclusion of physical processes in the adjoint model on 4D-Var. In the incremental 4D-Var setting, we may define the simplified model by using a lower model resolution and/or by using only part of the physical processes or their simplified forms (e.g., Courtier et al. 1994; Rabier et al. 1997; Veerse and Thepaut 1998). In this section, we are only concerned with the incremental 4D-Var with almost no physical processes in the adjoint model. With the standard 4D-Var analysis described in the previous section, we can compare the difference between the standard 4D-Var and the incremental method. This difference is due only to the physics package being absent in the adjoint model. Only when the performance of the standard 4D-Var is superior to that of the incremental 4D-Var does the inclusion of the full physics package in the adjoint model have a beneficial impact, and vice versa.
Previous results of some experiments have shown that 4D-Var displays a different behavior in the tropics than in the extratropics (e.g., Rabier et al. 1997). It is often assumed that this originates in differences in convective activities. We analyze the differences between the tropics and extratropics in several aspects of 4D-Var. Figure 1 presents the 3-h accumulated precipitation of the reference forecast (observations) over the last 3 h within the assimilation window. The major precipitation events occur between 30°S and 30°N. Thus in what follows we represent the tropics by the region between 30°S and 30°N, and the extratropics by all the other regions.
a. Convergence of minimization processes in terms of cost functions
For the standard 4D-Var, Fig. 2 displays the evolution of the cost function along with its tropical and extratropical parts versus the number of minimization iterations. The cost functions have been normalized using their corresponding initial values. The minimization process displays a good rate of convergence, even for the tropics. The value of the cost function decreases by one order of magnitude after the first 30 iterations, and decreases to 0.5% of its initial value after 70 minimization iterations. Thus, the inclusion of physics into the adjoint model is not detrimental to the final convergence of the minimization process. It is interesting to note that the tropical part of the cost function decreases at a slightly slower rate after 20 iterations.
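The bookkeeping behind such curves can be sketched under simplifying assumptions. Here the cost function is taken to be half a sum of squared misfits at points of known latitude, with no area or error-covariance weighting, and the 30° boundary follows the definition of the tropics given above. The function names and the sample numbers are hypothetical and do not correspond to any figure.

```python
def split_cost(lats, sq_misfits, boundary=30.0):
    """Return (total, tropical, extratropical) parts of a summed cost.

    lats       : latitude in degrees of each point
    sq_misfits : squared misfit at each point
    """
    tropical = 0.5 * sum(m for lat, m in zip(lats, sq_misfits)
                         if abs(lat) <= boundary)
    extratropical = 0.5 * sum(m for lat, m in zip(lats, sq_misfits)
                              if abs(lat) > boundary)
    return tropical + extratropical, tropical, extratropical

def normalize(history):
    """Divide every cost value by the initial (first) value."""
    return [value / history[0] for value in history]

# Made-up example: five points and a short cost history.
lats = [-60.0, -15.0, 0.0, 20.0, 45.0]
sq_misfits = [4.0, 2.0, 2.0, 4.0, 8.0]
total, tropical, extratropical = split_cost(lats, sq_misfits)
normalized = normalize([10.0, 1.0, 0.05])
```

With these made-up numbers the final normalized value is 0.005, that is, 0.5% of the initial value, which is how the percentages quoted in this subsection are to be read.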
For the incremental 4D-Var, Fig. 3 shows the evolution of the normalized cost function (3.6) versus the number of minimization iterations. The cost function displays a satisfactory rate of decrease. After 70 minimization iterations, it decreases to 1.3% of its initial value.
We further compute the values of the standard cost function defined by (3.3) in terms of the updated analysis, that is, the sum of the updated increments and background fields at each iteration of the incremental 4D-Var. These values, unlike those of the incremental cost function (3.6), can be compared directly with those obtained via the standard 4D-Var. Figure 4 shows the evolution of the normalized standard cost function values versus the number of minimization iterations. After 70 minimization iterations, the standard cost function value of the incremental 4D-Var decreases to 0.8% of its initial value, which is larger than the final value obtained using the standard 4D-Var. The incremental 4D-Var also exhibits a satisfactory convergence rate measured by the standard cost function.
A descent direction obtained by the incremental method can be found to be a descent direction of the standard 4D-Var at the early stages of the minimization process. Figure 5 presents the cost function change for every iteration, defined as the difference between two subsequent iterations k and k + 1. We note that all the changes are positive apart from the one at iteration 53. It is interesting to note that the largest changes occur between iteration 5 and iteration 25. This result is consistent with that obtained by Navon et al. (1992b) among others, that the minimization process balances large structures during the first 15–20 minimization iterations. A similar evolution of the cost function changes with the number of iterations is observed in the standard 4D-Var (Fig. 6). Qualitatively, the descent process of the incremental 4D-Var bears a strong similarity to that of the standard 4D-Var at the early stages of the minimization process.
By a quantitative examination, we observe that the cost function displays a faster rate of descent at the early stage of the minimization process (the first 25 iterations) in the incremental 4D-Var than in the standard 4D-Var. Figure 7 illustrates the difference between the incremental and standard 4D-Var versus the number of minimization iterations. Prior to iteration 50, the cost function of the standard 4D-Var is larger than that of the incremental 4D-Var. However, after 50 iterations, the cost function of the standard 4D-Var becomes smaller than that of the incremental 4D-Var. Thus, when considered in terms of the decrease in the cost function, the standard 4D-Var with full physics adjoint does not perform better at the early stage than the incremental method with almost no physics in the adjoint. However, after the early stage of the minimization process, the full physics adjoint in the standard 4D-Var does present a positive impact.
b. Assimilation analysis errors
Values of the cost function are spatially summed quantities. We therefore present here detailed analyses of the assimilation analysis error fields.
We first examine the difference between assimilation errors of the standard and incremental 4D-Var after 21 minimization iterations, when the cost function difference is near its maximum (Fig. 7). This is intended to study in depth the stage of the minimization process where the cost function of the incremental 4D-Var attains smaller values than those of the standard 4D-Var for the same number of minimization iterations.
Figures 8 and 9 show the vertical distribution of the root-mean-square (rms) error of the assimilation analysis along with that of the preforecast rms error at the end of the assimilation window. Here preforecast refers to forecasts starting from the assimilation analyses at time t0 within the time window. The rms errors of the assimilation analyses of the standard 4D-Var are consistently larger than those of the incremental 4D-Var for all model variables and model levels. In particular, the rms error of dewpoint depression of the standard 4D-Var is much larger than that of the incremental 4D-Var in the lower and middle troposphere. Correspondingly, the preforecast rms errors of the standard 4D-Var are also larger than those of the incremental 4D-Var.
Figures 10 and 11 show the assimilation analysis error of dewpoint depression for the standard and incremental 4D-Var, respectively. The model σ level 1 is taken to represent the levels near the ground, σ level 3 the lower troposphere, and σ level 9 the upper troposphere. At σ levels 1 and 3, the largest errors for the standard 4D-Var are located in South Asia and the western Pacific. Also, the largest errors are observed to be located in South America. The areas with the largest errors correspond to areas with a large rate of precipitation (Fig. 1). In fact, strong precipitation centers evidently correspond to centers of large assimilation analysis errors. Vukicevic and Bao (1998) found a similar local performance of 4D-Var.
A striking feature of the assimilation analysis errors in the incremental 4D-Var is that they are relatively small in regions where relatively large errors occur in the assimilation analyses of the standard 4D-Var. These regions include the precipitation areas. Further, unlike the error of the standard 4D-Var, the largest errors in the incremental 4D-Var at all model levels are not found in the strong precipitation areas in the tropics, but they rather appear to be related to special orographic features such as the Plateau of Tibet and the Rocky Mountains. Correspondingly, the major large error areas in the incremental 4D-Var are located in the extratropics, and the errors in these areas are substantially larger than those in the standard 4D-Var. The relationship between the large assimilation errors and some special orographic features in the incremental 4D-Var strongly suggests that the large errors result from the absence of the boundary layer physical processes in the adjoint model.
These results show that at the early stages of the minimization the inclusion of the precipitation physics displays a negative impact on the 4D-Var, and the absence of the precipitation physics in the adjoint model does not affect significantly the 4D-Var. However, the absence of the boundary layer physics in the adjoint leads to large errors in the assimilation analyses in the lower troposphere.
We now proceed to analyze the assimilation analysis errors after 70 minimization iterations, when the standard 4D-Var attains a smaller value of the cost function than that of the incremental 4D-Var.
Let us first examine the vertical distribution of the rms errors in the assimilation analysis and 6-h preforecasts. For the standard 4D-Var, the vertical variation of the rms errors is indeed very small (Fig. 12). In contrast to the nearly uniform distribution of the standard 4D-Var, the rms errors of the incremental method display significant vertical variations (Fig. 13). The rms errors are two to three times larger at model levels near the ground than in the middle troposphere for all model variables. Also, at model levels near the ground the rms errors in the incremental method are larger than those in the standard 4D-Var. In the middle and upper troposphere, there is no substantial difference between the standard and incremental 4D-Var insofar as the rms error is concerned. These results indicate that the absence of physics in the adjoint causes substantial errors at model levels in the lower troposphere.
We now turn to the spatial distribution of the error of the assimilation analyses. Figure 14 illustrates the spatial distribution of the error of the assimilation analysis of dewpoint depression on representative model σ levels at iteration 70 in the standard 4D-Var. As expected, the error of the assimilation analysis exhibits a localized character. The maximal error attains a value of up to 1.6 K. The errors in vorticity, divergence, and temperature fields exhibit spatial distributions similar to that of dewpoint depression, with maxima of 0.7 × 10⁻⁵ s⁻¹, 0.7 × 10⁻⁵ s⁻¹, and 1.2 K, respectively. Thus the minimization does display a slower rate of convergence in some limited areas.
Intense physical processes, such as deep convection, usually occur in limited regions. The influence of physical processes on 4D-Var may also be regional. We carry out a comparison between the error of the assimilation analysis in Fig. 14 and the accumulated precipitation in Fig. 1. The strong precipitation region in South Asia and the western Pacific corresponds to large errors in the assimilation analysis. Also, the strong precipitation region in South America corresponds to large errors in the assimilation analysis. Furthermore, it is evident that these isolated strong precipitation centers correspond to large error centers. The errors that correlate with the strong precipitation dominate the error field. From the spatial correlations between precipitation and errors of assimilation analysis, we conclude that the moisture physics may lead to large errors in the assimilation analysis over some limited regions where strong precipitation occurs.
Corresponding to Fig. 14, Fig. 15 illustrates the assimilation analysis error for the incremental 4D-Var. For levels 1 and 3, the assimilation analyses of the incremental 4D-Var tend to have relatively small errors in the regions associated with the areas of the largest error in the assimilation analyses of the standard 4D-Var. The error distribution at level 9 is of importance, since the large errors there evidently correlate with large precipitation rates such as over South Asia. Remarkably, most of the large precipitation areas correspond to large error centers. Similar to the previous discussion for iteration 21, we conclude that the absence of the boundary layer physics in the adjoint model leads to the dominant assimilation error in the lower troposphere, while the absence of the precipitation physics causes a dominant assimilation error in the upper troposphere after an intermediate number of minimization iterations.
The change in the assimilation analysis errors as the minimization iteration proceeds exhibits some important features. For the incremental 4D-Var, the spatial distribution of assimilation analysis errors changes very little from minimization iteration 21 to iteration 70 at model levels 1 and 3. Corresponding to the rms error, the error sizes do not display any reduction. At level 9, the size of the error is significantly reduced. However, relatively large error centers appear over some small areas related to precipitation. These relatively large error centers are not found at iteration 21. They either display a larger magnitude or have no counterpart in the standard 4D-Var. These observations suggest that the assimilation analysis errors due to the absence of the physics in the adjoint model cannot be reduced by performing additional minimization iterations. The errors over some precipitation areas, especially at middle and high latitudes, may become dominant in the middle and upper troposphere, and have larger sizes than those present in the standard 4D-Var.
We have carried out two 24-h forecasts starting from the assimilation analyses after 70 iterations, obtained from the standard and incremental 4D-Var, respectively. The difference between the two forecasts does not seem to be significant in terms of the rms error, but the rms errors of all the model variables are slightly smaller for the standard 4D-Var than for the incremental 4D-Var in the lower troposphere. The major difference is in the precipitation spinup. The accumulated precipitation of the forecast in the first 6 h is markedly too low in the incremental 4D-Var. Figure 16 shows the errors in the 3-h accumulated precipitation during the last 3 h within the assimilation window. Comparing with Fig. 1, we see that the precipitation forecast using the assimilation analyses of the incremental 4D-Var is about 20% less than the reference amount, while the error is very small when using the assimilation analyses of the standard 4D-Var.
5. A 4D-Var strategy using full physics adjoint
As mentioned previously, inclusion of physical processes into the adjoint model sizably increases the computational cost of 4D-Var. In this model, the standard 4D-Var requires twice as much CPU time as the incremental method that includes almost no physics in the adjoint model. This ratio of CPU time between the standard 4D-Var and the incremental method is representative of operational models as estimated by Courtier et al. (1994) and Rabier et al. (1997). The increase in the CPU time when the full physics package is included into adjoint models may preclude the operational implementation of 4D-Var.
We have shown that the inclusion of the full physics into the adjoint model presents a negative impact on the assimilation analyses over precipitation regions at the early stages of the minimization process. Thus, when full physics is included in the adjoint model, an adequate 4D-Var strategy is necessary in order to circumvent the above-mentioned negative impact while taking advantage of the positive impact on the final assimilation analyses.
As mentioned in the introduction, the sequential cost function incremental approach proposed by Courtier et al. (1994) provides a possible method for alleviating the effect of nonlinearities and reducing the computational load due to the inclusion of physics into the adjoint model. We will discuss this algorithm by comparing it with the algorithmic features of the truncated Newton method, and then introduce and test a new algorithm suitable for dealing with physical processes in the adjoint model.
a. The standard truncated Newton method and its variants
Thus, the standard truncated Newton method consists of nested iterations. There is an outer iteration (loop) that corresponds to a general optimization method. At each outer iteration (loop) we compute a search direction and perform a line search. The computation of the search direction uses an inner iteration corresponding to the iterative method used to solve the Newton equations.
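The nested structure described above can be sketched as follows. This is a minimal pure-Python illustration, not the code used in this study: the inner iteration is assumed to be a conjugate gradient solve of the Newton equations that is truncated at a relative residual `eta`, the Hessian-vector product is approximated by finite differences of the gradient, and the outer iteration uses a simple backtracking line search. All function names, tolerances, and the small convex test function are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def axpy(a, u, v):
    """Return a*u + v for two equal-length lists."""
    return [a * ui + vi for ui, vi in zip(u, v)]

def truncated_newton(f, grad, x0, outer_iters=50, inner_iters=10,
                     eta=0.1, fd_eps=1e-6):
    x = list(x0)
    for _ in range(outer_iters):              # outer iteration
        g = grad(x)
        g_norm = math.sqrt(dot(g, g))
        if g_norm < 1e-8:
            break

        def hess_vec(v):
            # Finite-difference Hessian-vector product (an assumption of this
            # sketch; an analytic product could be used instead).
            g_shift = grad(axpy(fd_eps, v, x))
            return [(a - b) / fd_eps for a, b in zip(g_shift, g)]

        # Inner iteration: conjugate gradient on the Newton equations
        # H p = -g, stopped early once the residual is small enough.
        p = [0.0] * len(x)
        r = [-gi for gi in g]                 # residual at p = 0
        d = list(r)
        for _ in range(inner_iters):
            Hd = hess_vec(d)
            curvature = dot(d, Hd)
            if curvature <= 0.0:              # non-positive curvature: stop
                break
            alpha = dot(r, r) / curvature
            p = axpy(alpha, d, p)
            r_new = axpy(-alpha, Hd, r)
            if math.sqrt(dot(r_new, r_new)) <= eta * g_norm:
                break
            beta = dot(r_new, r_new) / dot(r, r)
            d = axpy(beta, d, r_new)
            r = r_new
        if dot(p, p) == 0.0:
            p = [-gi for gi in g]             # fall back to steepest descent

        # Line search of the outer iteration (backtracking, Armijo condition).
        step, f0, slope = 1.0, f(x), dot(g, p)
        while f(axpy(step, p, x)) > f0 + 1e-4 * step * slope and step > 1e-12:
            step *= 0.5
        x = axpy(step, p, x)
    return x

# Hypothetical convex test function (not taken from the experiments).
A = [1.0, 10.0, 100.0]

def f(x):
    return sum(0.5 * a * xi ** 2 - xi + 0.25 * xi ** 4 for a, xi in zip(A, x))

def grad(x):
    return [a * xi - 1.0 + xi ** 3 for a, xi in zip(A, x)]

x_min = truncated_newton(f, grad, [2.0, -1.0, 0.5])
```

The parameter `inner_iters` caps the work spent on each search direction, which is the sense in which the Newton solve is truncated.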
b. A truncated Newton-like incremental method
A key aspect in the implementation of the truncated Newton-like method is the selection of the approximate inner cost function Ĵ(x). We have shown that an important feature of the incremental cost function (3.6) is that its descent direction is the descent direction of the standard cost function (3.3) when the L-BFGS algorithm is used both in the incremental and in the standard 4D-Var. The incremental cost function (3.6) is selected to serve as an inner cost function.
The algorithmic setup of the truncated Newton-like method can be formulated and outlined in the following manner:
(a) Using the adjoint model with almost no physics (only very simple physics is included), carry out a number of minimization iterations on the incremental cost function (3.6) with a limited-memory quasi-Newton method (Liu and Nocedal 1989) to obtain an analysis increment.
(b) Perform a line search in terms of the standard cost function (3.3), using the analysis increment as the search direction, and then update the assimilation analysis.
(c) Carry out a number of minimization iterations of the outer loop, in which the full physics adjoint model is used, again with the limited-memory quasi-Newton method (Liu and Nocedal 1989).
(d) Repeat the cycle consisting of steps a–c.
The standard truncated Newton method does not include step c. Step c is introduced here in order to take into account the effect of inclusion of full physics in the adjoint model. Due to step c, the truncated Newton-like method can be viewed as a variant of the sequential cost function incremental approach proposed by Courtier et al. (1994). Thus we refer to this method as a truncated Newton-like incremental method. We also refer to both steps b and c as being the outer loop. The number of minimization iterations to be conducted in each loop should be determined according to the total number of minimization iterations permitted. Generally, during the first cycles, a relatively small number or no iterations at all are conducted in step c, while a relatively large number of iterations is conducted in step a.
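Steps a to d can be sketched on a toy problem as follows. This is an illustrative sketch only, under several assumptions that are not part of the experiments: the control vector has two components, the "full physics" is a hypothetical `tanh` term in the observation operator `h`, the simplified inner operator is the identity, the line searches use backtracking only, and a single list `memory` of L-BFGS pairs is shared by both loops as one possible way of carrying curvature information from step a into step c. All names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def axpy(a, u, v):
    return [a * ui + vi for ui, vi in zip(u, v)]

def line_search(cost, x, p, g):
    """Backtracking (Armijo) search along p; returns x unchanged on failure."""
    step, c0, slope = 1.0, cost(x), min(dot(g, p), 0.0)
    while step > 1e-10:
        x_new = axpy(step, p, x)
        if cost(x_new) <= c0 + 1e-4 * step * slope:
            return x_new
        step *= 0.5
    return list(x)

def lbfgs_direction(g, memory):
    """Two-loop recursion: -H g built from the stored (s, y) pairs."""
    q, alphas = list(g), []
    for s, y in reversed(memory):
        a = dot(s, q) / dot(y, s)
        alphas.append(a)
        q = axpy(-a, y, q)
    if memory:
        s, y = memory[-1]
        gamma = dot(s, y) / dot(y, y)
        q = [gamma * qi for qi in q]
    for (s, y), a in zip(memory, reversed(alphas)):
        b = dot(y, q) / dot(y, s)
        q = axpy(a - b, s, q)
    return [-qi for qi in q]

def lbfgs(cost, grad, x, n_iter, memory, m=5):
    """Run up to n_iter L-BFGS iterations; `memory` is updated in place."""
    for _ in range(n_iter):
        g = grad(x)
        if dot(g, g) < 1e-20:
            break
        x_new = line_search(cost, x, lbfgs_direction(g, memory), g)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(grad(x_new), g)]
        if dot(s, y) > 1e-10 * dot(s, s):     # keep positive-curvature pairs only
            memory.append((s, y))
            del memory[:-m]                   # retain the m most recent pairs
        x = x_new
    return x

def tn_like_incremental(J, grad_J, make_inner, x, schedule):
    """schedule: one (n_inner, n_outer) pair of iteration counts per cycle."""
    memory = []                               # shared by inner and outer loops
    for n_inner, n_outer in schedule:         # (d) repeat steps a-c
        J_inc, grad_J_inc = make_inner(x)     # (a) incremental cost about x
        delta = lbfgs(J_inc, grad_J_inc, [0.0] * len(x), n_inner, memory)
        x = line_search(J, x, delta, grad_J(x))   # (b) search on standard cost
        x = lbfgs(J, grad_J, x, n_outer, memory)  # (c) full-gradient iterations
    return x

# Hypothetical toy problem: identity "dynamics" plus a nonlinear "physics" term.
XB, X_TRUE = [0.0, 0.0], [1.0, -0.5]

def h(x):
    return [xi + 0.5 * math.tanh(2.0 * xi) for xi in x]

Y = h(X_TRUE)

def J(x):
    return (0.05 * sum((a - b) ** 2 for a, b in zip(x, XB))
            + 0.5 * sum((a - b) ** 2 for a, b in zip(h(x), Y)))

def grad_J(x):
    # Derivative of h is 1 + (1 - tanh^2(2 x)) = 2 - tanh^2(2 x).
    return [0.1 * (xi - bi) + (2.0 - math.tanh(2.0 * xi) ** 2) * (hi - yi)
            for xi, bi, hi, yi in zip(x, XB, h(x), Y)]

def make_inner(x_ref):
    d = [yi - hi for yi, hi in zip(Y, h(x_ref))]   # innovation at x_ref
    def J_inc(dx):                                 # physics term dropped
        return (0.05 * sum((r + e - b) ** 2 for r, e, b in zip(x_ref, dx, XB))
                + 0.5 * sum((e - di) ** 2 for e, di in zip(dx, d)))
    def grad_J_inc(dx):
        return [0.1 * (r + e - b) + (e - di)
                for r, e, b, di in zip(x_ref, dx, XB, d)]
    return J_inc, grad_J_inc

x_a = tn_like_incremental(J, grad_J, make_inner, XB, [(25, 10), (20, 15)])
```

The `schedule` argument expresses the division of the iteration budget between the two loops in each cycle; dropping the call in step c recovers a loop structure closer to the standard truncated Newton method.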
The crucial aspect for speeding up the rates of descent of this algorithm is the Hessian update information. The limited-memory quasi-Newton method is based on the idea of computing the descent direction pk as −
c. 4D-Var experiments using the truncated Newton-like incremental method
To avoid the strong effect of nonlinearities in the physical processes, a relatively large number of iterations in step a should be carried out in the first cycle. Thus the distance between the updated analysis and the solution would have been substantially reduced when the standard 4D-Var starts. This is mandated by the fact that the standard 4D-Var is subject to nonlinearities, and smaller perturbations may weaken the effect of nonlinearities. We perform 40 minimization iterations in step a, and 30 minimization iterations in step c. Then only one cycle of the truncated Newton-like incremental method is carried out. This experiment is mainly intended to examine the expected benefit of its application toward alleviating the effect of nonlinearities in the physics.
Figure 17 shows the normalized cost function versus the number of iterations. The normalized cost function reaches 0.4% of its initial value at the end of the entire cycle. By comparison, after 70 iterations it reaches 0.5% of its initial value in the standard 4D-Var using full physics in the adjoint model, and 0.8% of its initial value in the incremental method with almost no physics in the adjoint model. In terms of the rate of decrease of the cost function, the one cycle truncated Newton-like method is superior to both the standard 4D-Var and the incremental method.
Figure 18 presents the vertical distribution of the rms errors of the assimilation analyses along with the forecast rms errors at the end of the assimilation window. The rms errors of the one cycle truncated Newton-like method are smaller than those of the standard 4D-Var. It is more interesting to compare the results of the one cycle truncated Newton-like method with those of the incremental method with almost no physics in the adjoint model. The assimilation analysis error at model levels near the ground is much smaller than that in the incremental method. The assimilation analysis error in the new method is also somewhat smaller for the divergence field in the upper troposphere. Corresponding to these improvements, the error in the 3-h accumulated precipitation during the last 3 h of the assimilation window also becomes very small, being close to that obtained from the standard 4D-Var. It should be pointed out that the error at model levels near the ground is still considerably larger than that in the standard 4D-Var using full physics in the adjoint model.
To a large degree, the one cycle truncated Newton-like incremental method accomplishes the aim of alleviating the negative effects of the nonlinearities while taking into account the influence of physical processes in a satisfactory manner. The CPU time required increases by only 40% compared with that required by the incremental method that involves almost no physics in the adjoint model.
We have assumed that continuously updating the Hessian information at every iteration in both the inner loop and the outer loop may accelerate the convergence of the minimization procedure. To verify this point, we conducted an experiment in which the minimization is restarted when the outer loop starts, without using the Hessian information accumulated in the inner loop. Figure 17 shows that, over the entire outer loop, the cost function decreases on average 10% faster when the Hessian information is continuously updated.
The major deficiency in the one cycle truncated Newton-like incremental method is that the error at model levels near the ground is still larger than that present in the standard 4D-Var. The main cause of this deficiency is that the effect of the boundary physics is not sufficiently taken into account, because the boundary physics starts impacting the 4D-Var from the early stages of the minimization process. To remedy this deficiency, we devise a two-cycle experiment. In the first cycle, the inner loop consists of 25 minimization iterations and the outer loop of 10 minimization iterations. In the second cycle, the inner loop uses 20 minimization iterations and the outer loop 15 minimization iterations. The computational cost increases by 35% compared with the incremental method with almost no physics in the adjoint model.
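The reported cost increases can be checked against the factor-of-two CPU ratio quoted at the beginning of this section. The following back-of-envelope sketch assumes that each outer-loop (full physics) iteration costs twice as much as an inner-loop iteration, and it neglects the line search in step b and any other overhead. The function name and this simple cost model are illustrative assumptions, not part of the experiments.

```python
def relative_cost(inner_iters, outer_iters, full_to_simple_ratio=2.0):
    """Cost of a schedule relative to the same total number of
    incremental-only iterations (assumed cost 1 per inner iteration)."""
    total = inner_iters + outer_iters
    return (inner_iters + full_to_simple_ratio * outer_iters) / total

one_cycle = relative_cost(40, 30)             # 40 inner + 30 outer iterations
two_cycle = relative_cost(25 + 20, 10 + 15)   # two cycles, 70 iterations in all

print(f"one cycle:  +{100 * (one_cycle - 1):.0f}%")   # +43%
print(f"two cycles: +{100 * (two_cycle - 1):.0f}%")   # +36%
```

Under these assumptions the estimates (100/70 and 95/70 of the incremental cost) are broadly consistent with the increases of 40% and 35% reported for the one cycle and two-cycle experiments.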
The resulting assimilation analyses using the two cycle truncated Newton-like incremental method experience an improvement in all the aspects examined. The value of the cost function at the end of the minimization process decreases to 0.3% of its initial value, which is smaller than the values obtained by either the standard or the incremental 4D-Var. Figure 19 presents the vertical distribution of the rms errors of assimilation analyses along with those for the forecast rms error at the end of the assimilation window. As expected, the large error at model levels near the ground disappears. It is very encouraging to find that the rms error is consistently smaller than that present in either the standard or the incremental 4D-Var for every model level and for all model variables. This advantage is also preserved in ensuing 24-h forecasts as well as in precipitation spinup, when using the two cycle truncated Newton-like incremental method.
6. Summary and discussions
In this paper, we presented results of several 4D-Var experiments. These experiments included both the standard 4D-Var where full physics was used in the adjoint model, and versions of incremental 4D-Var where only selected physical processes were used. The comparison between these experiments provided us with a detailed understanding of how physical processes act on a 4D-Var procedure as the minimization process proceeds.
The results showed that at the early stages of the minimization process, the analysis errors in the standard 4D-Var were overall larger than those in the incremental 4D-Var analyses where almost no physics was included in the adjoint model. The major assimilation analysis errors in the standard 4D-Var were located over intense precipitation regions. In contrast, for the incremental 4D-Var analyses with almost no physics in the adjoint model, the major errors did not correlate with precipitation, and the analysis errors over the large precipitation regions were smaller than those in the standard 4D-Var analyses. After an intermediate number (about 50 for the experiments in this research) of minimization iterations, the errors over some precipitation regions gradually became larger in the incremental 4D-Var analyses than in the standard 4D-Var. Thus, the inclusion of precipitation physics in the adjoint model appeared to be detrimental to the convergence rate at the early stages of the minimization process and then gradually became beneficial. Interestingly, after iteration 70 the 4D-Var with full physics in the adjoint model did indeed speed up the precipitation spinup compared with the incremental 4D-Var with almost no physics in the adjoint model. However, the reduction of the precipitation spinup period cannot be solely attributed to the inclusion of the precipitation physics in the adjoint model, but may be the result of the combined impact of several physical processes.
In the incremental 4D-Var where almost no physics was involved in the adjoint model, the major assimilation analysis errors were primarily located in the lower troposphere and exhibited a correlation with some special orographic features. In particular, the errors in the dewpoint depression and temperature analyses could not be further reduced after 20 minimization iterations. These large errors can be primarily attributed to the absence of the boundary layer physics in the adjoint model. Further, the major assimilation analysis errors were associated with the absence of the boundary layer physics during the entire minimization procedure. Unlike the precipitation physics, the boundary layer physics in the adjoint model positively impacted the 4D-Var during the entire minimization procedure. The importance of including the boundary layer physics in the adjoint model has been recognized in several studies (e.g., Buizza 1994; Rabier et al. 1997).
The influence of precipitation physics warrants a detailed discussion. In the experiments, assimilation errors in the standard 4D-Var using full physics in the adjoint model did not exhibit a reduction over precipitation regions, at least prior to iteration 21. The reduction in assimilation errors over precipitation regions occurred at the later stages of the minimization process. A further explanation can be provided in terms of the eigenvalue spectrum of the Hessian of the cost function. We conjecture that precipitation physics is related to eigenvectors of the Hessian associated with smaller eigenvalues. Since the minimization process generally deals with smaller eigenvalues at later stages in the framework of the L-BFGS minimization algorithm, assimilation errors can be reduced only at the later stages of the minimization. This also means that if this mechanism prevails for the 4D-Var with full physics, then in order to beneficially extract the additional information provided by the full physics 4D-Var, a larger number of minimization iterations must be carried out than in the adiabatic case. To alleviate this situation, an efficient preconditioning approach is mandatory.
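The link between small Hessian eigenvalues and late error reduction can be illustrated on a quadratic cost function. The sketch below is a simplified stand-in and not the L-BFGS algorithm used in the experiments: it applies steepest descent with a fixed step to a hypothetical diagonal Hessian with eigenvalues 100, 10, and 1, and tracks the error along each eigenvector. The eigenvalues, the step size, and the function name are assumptions chosen for illustration.

```python
def error_history(eigenvalues, n_iter):
    """Steepest descent on J(x) = 0.5 * sum(lam_i * x_i**2), written in the
    eigenvector basis of the Hessian, with fixed step 1 / max(eigenvalues)."""
    step = 1.0 / max(eigenvalues)
    x = [1.0] * len(eigenvalues)      # unit initial error along each eigenvector
    history = [list(x)]
    for _ in range(n_iter):
        # Gradient component i is lam_i * x_i, so error component i shrinks
        # by the factor (1 - lam_i / lam_max) at every iteration.
        x = [xi - step * lam * xi for xi, lam in zip(x, eigenvalues)]
        history.append(list(x))
    return history

history = error_history([100.0, 10.0, 1.0], 20)
for k in (1, 5, 20):
    print(k, ["%.3f" % e for e in history[k]])
```

In this simplified setting the error along the eigenvector with the largest eigenvalue is removed first, while the component associated with the smallest eigenvalue decays only slowly. Rescaling the variables so that the eigenvalues become comparable is what a preconditioner is intended to achieve.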
The overall influence of the physical processes in the adjoint model on 4D-Var was found to exhibit different behavior between the early and later stages of the minimization process. This different behavior was related to the presence of strong nonlinearities in the precipitation physics. As the minimization process proceeded, the assimilation analysis became closer to the minimization solution. The effect of nonlinearities in physics on the minimization process tended to weaken. At this stage the advantage of the consistency between the nonlinear forecasting model and the adjoint model thus started playing a dominant role. Interestingly, the boundary layer physics in the adjoint model turned out to be beneficial during the entire minimization process. This may be attributed in part to the relatively weak nonlinearities present in the boundary physics.
The results obtained suggest some issues that should be taken into account in order to maximize the beneficial impact when 4D-Var uses full physics in the adjoint model. At the early stages of the minimization process, the 4D-Var would be better implemented using the incremental method that does not include full physics in the adjoint model. 4D-Var experiments have shown that a minimization process generally acts first on the larger scales (Thepaut and Courtier 1991b; Navon et al. 1992b; Tanguay et al. 1995). Thus the small scales can be dealt with only during the last stage of the minimization process, as in Veerse and Thepaut (1998). We suggest avoiding the use of full physics in the adjoint model at the early stages of the minimization process in order to avoid the detrimental effect of the strong nonlinearities, which does not mean that physics has little impact on the 4D-Var at this stage. To take into account the impact of full physics, we should use full physics in the adjoint model repeatedly after the early stages of the minimization. Here we have proposed to avoid continuously using the full physics package in the adjoint model during the entire minimization process, a suggestion justified theoretically by the recognition that physical processes are controlled to a large degree by dynamical processes. This observation is validated by the numerical experiments presented in this research.
The above-mentioned requirements have been fulfilled using the new truncated Newton-like incremental method. The algorithm is a variant of the incremental method using a sequence of cost functions. It consists of an inner loop and an outer loop. The incremental method including almost no physics in the adjoint model comprises the inner loop, while the outer loop consists of the standard 4D-Var using the full physics adjoint model. In the standard truncated Newton method a line search is performed when the minimization process moves from the inner loop to the outer loop, while preconditioning is used in the inner loop. We used the L-BFGS algorithm for both the outer and inner loops. This allowed updating the Hessian continuously at every minimization iteration in both the outer and inner loops. A two-cycle truncated Newton-like incremental experiment was performed. In the first cycle, 25 minimization iterations were conducted in the inner loop using the adjoint model with almost no physics and 10 minimization iterations in the outer loop using the full physics adjoint model. In the second cycle, 20 minimization iterations were conducted in the inner loop and 15 minimization iterations in the outer loop. The results obtained showed the quality of the assimilation analyses to be better than that obtained from either the standard 4D-Var or the incremental 4D-Var after 70 iterations in all the aspects examined. This two-cycle truncated Newton-like incremental method accomplishes the aim of alleviating the negative effects of the nonlinearities in the physics while satisfactorily taking into account the impact of physical processes. The CPU time required increased by only 35% compared with that required by the incremental method that involves almost no physics in the adjoint model.
We recognize that the way in which various physical mechanisms operate is highly model dependent. Some of the conclusions derived in this study may be restricted to the particular model used here, namely the FSUGSM. There are still many other open problems to consider. We used only a simple scaling method to precondition the minimization. Better preconditioning methods are available, such as those of Courtier et al. (1994), Yang et al. (1996), and Zupanski (1996), to cite but a few. We do not know whether these better preconditioning methods will reduce the detrimental effect of the strong nonlinearities present in the physics at the early stages of the minimization process. In this research, we used only observations of wind, temperature, dewpoint depression, and surface pressure fields in a twin experiment framework. Observations sensitive to physics, such as precipitation amount and total precipitable water, or other observations from satellite platforms, were not used. In particular, the benefits of precipitation data have been demonstrated in 4D-Var (Zupanski and Mesinger 1995; Zou and Kuo 1996; Tsuyuki 1996), but the adjoint precipitation physics is necessary when precipitation data are assimilated. It may be more interesting to investigate whether observational data sensitive to physics can be appropriately assimilated when they are introduced only at some stages of the minimization process. These problems require a more in-depth investigation.
We would like to thank two referees, Drs. Xiaolei Zou and Milija Zupanski, as well as one anonymous referee for their incisive and beneficial reviews, which significantly contributed toward improving the content and presentation of this paper. We acknowledge the support from NSF Grant ATM-9731472 managed by Dr. Pamela Stephens, whom we would like to thank for her support and from the Supercomputer Computations Research Institute, which is partially funded by the Department of Energy through Contract DE-FG05-85ER250000. Acknowledgment is also made to the National Center for Atmospheric Research, which is sponsored by the National Science Foundation, for providing the computing time used in this research.
Buizza, R., 1994: Sensitivity of optimal unstable structures. Quart. J. Roy. Meteor. Soc., 120, 429–451.
Businger, J. A., J. C. Wyngaard, Y. Izumi, and E. F. Bradley, 1971: Flux profile relationships in the atmospheric surface layer. J. Atmos. Sci., 28, 181–189.
Chao, W. C., and L.-P. Chang, 1992: Development of a four-dimensional variational analysis system using the adjoint method at GLA. Part 1: Dynamics. Mon. Wea. Rev., 120, 1661–1673.
Courtier, P., and O. Talagrand, 1987: Variational assimilation of meteorological observations with the adjoint vorticity equation. Part II: Numerical results. Quart. J. Roy. Meteor. Soc., 113, 1329–1347.
——, J.-N. Thepaut, and A. Hollingsworth, 1994: A strategy for operational implementation of 4D-Var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367–1388.
Dembo, R. S., and T. Steihaug, 1983: Truncated-Newton algorithms for large-scale unconstrained optimization. Math. Program., 26, 190–212.
Gilbert, J. C., and C. Lemarechal, 1989: Some numerical experiments with variable storage quasi-Newton algorithms. Math. Program., 45, 407–436.
Harshvardhan, and T. G. Corsetti, 1984: Long-wave parameterization for the UCLA/GLAS GCM. NASA Tech. Memo. 86072, 52 pp. [Available from Goddard Space Flight Center, Greenbelt, MD 20771.]
Ide, K., P. Courtier, M. Ghil, and A. Lorenc, 1997: Unified notation for data assimilation: Operational, sequential and variational. J. Meteor. Soc. Japan, 75 (1B), 71–79.
Kanamitsu, M., K. Tada, K. Kudo, N. Sato, and S. Isa, 1983: Description of the JMA operational spectral model. J. Meteor. Soc. Japan, 61, 812–828.
Krishnamurti, T. N., J. Xue, H. S. Bedi, K. Ingles, and D. Oosterhof, 1991: Physical initialization for numerical weather prediction over the Tropics. Tellus, 43A, 53–81.
——, H. S. Bedi, and K. Ingles, 1993: Physical initialization using the SSM/I rain rates. Tellus, 45A, 247–269.
Lacis, A. A., and J. E. Hansen, 1974: A parameterization of the absorption of solar radiation in the earth’s atmosphere. J. Atmos. Sci., 31, 118–133.
Le Dimet, F.-X., and O. Talagrand, 1986: Variational algorithms for analysis and assimilation of meteorological observations. Tellus, 38A, 97–110.
Lewis, J., and J. Derber, 1985: The use of adjoint equations to solve a variational adjustment problem with advective constraints. Tellus, 37A, 309–327.
Liu, D. C., and J. Nocedal, 1989: On the limited memory BFGS method for large scale optimization. Math. Program., 45, 503–528.
Lorenc, A. C., 1986: Analysis methods for numerical weather prediction. Quart. J. Roy. Meteor. Soc., 112, 1177–1194.
Louis, J. F., 1979: A parametric model of vertical eddy fluxes in the atmosphere. Bound.-Layer Meteor., 17, 187–202.
Mahfouf, J.-F., 1999: Influence of physical processes on the tangent-linear approximation. Tellus, 51A, 147–166.
——, R. Buizza, and R. M. Errico, 1996: Strategy for including physical processes in the ECMWF variational data assimilation system. Proc. ECMWF Workshop on Non-Linear Aspects of Data Assimilation, Reading, United Kingdom, ECMWF, 595–632.
Nash, S. G., 1985: Preconditioning of truncated-Newton methods. SIAM J. Sci. Stat. Comput., 6, 599–616.
——, and J. Nocedal, 1991: A numerical study of the limited memory BFGS method and the truncated-Newton method for large scale optimization. SIAM J. Optim., 1, 358–372.
Nocedal, J., 1980: Updating quasi-Newton matrices with limited storage. Math. Comput., 35, 773–782.
Rabier, F., and Coauthors, 1997: Recent experimentation on 4D-Var and first results from a simplified Kalman filter. ECMWF Research Department Tech. Memo. 240, 42 pp. [Available from ECMWF, Shinfield Park, Reading, Berkshire RG2 9AX, United Kingdom.].
——, J.-N. Thepaut, and P. Courtier, 1998: Extended assimilation and forecast experiments with a four-dimensional variational assimilation system. Quart. J. Roy. Meteor. Soc., 124, 1861–1887.
Schlick, T., and A. Fogelson, 1992: TNPACK—A truncated Newton package for large-scale problems. Part I: Algorithms and usage. ACM Trans. Math. Software, 18, 46–70.
Talagrand, O., and P. Courtier, 1987: Variational assimilation of meteorological observations with the adjoint vorticity equation. Part I: Theory. Quart. J. Roy. Meteor. Soc., 113, 1311–1328.
Tanguay, M., P. Bartello, and P. Gauthier, 1995: Four-dimensional data assimilation with a wide range of scales. Tellus, 47A, 974–997.
Thepaut, J.-N., and P. Courtier, 1991: Four dimensional data assimilation using the adjoint of a multilevel primitive equation model. Quart. J. Roy. Meteor. Soc., 117, 1225–1254.
Tsuyuki, T., 1996: Variational data assimilation in the Tropics using precipitation data. Part II: 3-D model. Mon. Wea. Rev., 124, 2545–2561.
——, 1997: Variational data assimilation in the Tropics using precipitation data. Part III: Assimilation of SSM/I precipitation rates. Mon. Wea. Rev., 125, 1447–1464.
Veerse, F., and J.-N. Thepaut, 1998: Multiple-truncated incremental approach for four-dimensional variational assimilation. Quart. J. Roy. Meteor. Soc., 124, 1889–1908.
Vukicevic, T., and J.-W. Bao, 1998: The effect of linearization errors on 4DVAR data assimilation. Mon. Wea. Rev., 126, 1695–1706.
Wang, Z., 1993: Variational data assimilation with 2-D shallow water equations and 3-D FSU Global Spectral Models. Ph.D. dissertation, Dept. of Mathematics, The Florida State University, 235 pp. [Available from Dirac Science Library, The Florida State University, Tallahassee, FL 32306.].
——, I. M. Navon, F. X. Le Dimet, and X. Zou, 1992: The second order adjoint analysis: Theory and application. Meteor. Atmos. Phys., 50, 3–20.
——, ——, X. Zou, and F. X. Le Dimet, 1995: A truncated-Newton optimization algorithm in meteorology applications with analytic Hessian/vector products. Comput. Optim. Appl., 4, 241–262.
Yang, W., I. M. Navon, and P. Courtier, 1996: A new Hessian preconditioning method applied to variational data assimilation experiments using NASA general circulation models. Mon. Wea. Rev., 124, 1000–1017.
Zhu, Y., and I. M. Navon, 1997: Documentation of the tangent-linear and adjoint models of the radiation and boundary layer parameterization packages of the FSU Global Spectral Model T42L12. Tech. Rep. FSU-SCRI-97-98. [Available from SCRI, The Florida State University, Tallahassee, FL 32306–4052.].
——, and ——, 1998: FSU-GSM forecast error sensitivity to initial conditions: Application to Indian summer monsoon. Meteor. Atmos. Phys., 68, 35–41.
——, and ——, 1999: Impact of parameter estimation on the performance of the FSU Global Spectral Model using its full-physics adjoint. Mon. Wea. Rev., 127, 1497–1517.
Zou, X., 1997: Tangent linear and adjoint of “on–off” processes and their feasibility for use in 4-dimensional variational assimilation. Tellus, 49A, 3–31.
——, and Y.-H. Kuo, 1996: Rainfall assimilation through an optimal control of initial and boundary conditions in a limited-area mesoscale model. Mon. Wea. Rev., 124, 2859–2882.
——, I. M. Navon, and J. Sela, 1993a: Variational data assimilation with moist threshold processes using the NMC Spectral Model. Tellus, 45A, 370–387.
——, ——, M. Berger, K. H. Phua, T. Schlick, and F. X. Le Dimet, 1993b: Numerical experience with limited-memory quasi-Newton methods and truncated Newton methods. SIAM J. Optim., 3, 582–608.
Zupanski, D., 1993: The effects of discontinuities in the Betts–Miller cumulus convection scheme on four-dimensional variational data assimilation. Tellus, 45A, 511–524.
——, and F. Mesinger, 1995: Four-dimensional variational assimilation of precipitation data. Mon. Wea. Rev., 123, 1112–1127.
Zupanski, M., 1993: Regional four-dimensional variational data assimilation in a quasi-operational forecasting environment. Mon. Wea. Rev., 121, 2396–2408.
——, 1996: A preconditioning algorithm for four-dimensional variational data assimilation. Mon. Wea. Rev., 124, 2562–2573.