An ensemble-based approach is proposed to obtain conditional nonlinear optimal perturbation (CNOP), which is a natural extension of linear singular vector to a nonlinear regime. The new approach avoids the use of adjoint technique during maximization and is thus more attractive. Comparisons among CNOPs of a simple theoretical model generated by the ensemble-based, adjoint-based, and simplex-search methods, respectively, not only show potential equivalence of the first two approaches in application according to their very similar spatial structures and time evolutions of the CNOPs, but also reveal the limited performance of the third measure, an existing adjoint-free algorithm, due to its inconsistent spatial distribution and weak net growth ratio of norm square of CNOP comparing with the results of the first two methods. Because of its attractive features, the new approach is likely to make it easier to apply CNOP in predictability or sensitivity studies using operational prediction models.
The determination of the fastest-growing perturbation is one of the key issues for weather and climate predictability. To serve this purpose, Lorenz (1965) introduced the concepts of linear singular vector (LSV) and linear singular value (LSVA). Such a linear approach has been widely used to find the fastest-growing perturbations of atmospheric flows, and its applications have been extended to explore error growth and predictability in the past decade (e.g., Xue et al. 1997a,b; Thompson 1998; Samelson and Tziperman 2001). In addition, LSVs are employed to construct initial perturbations for ensemble forecast (Palmer et al. 1993; Montani et al. 1996) and to determine targeted areas in adaptive observations (Buizza and Montani 1999). The theories of LSV and LSVA are established on the basis that evolution of initial perturbation is assumed to be linear, which can be described approximately by a tangent linear model (TLM). The linear assumption is reasonable when the amplitude of perturbation is small enough. However, a small initial perturbation can rapidly evolve to an amplitude that is much larger due to the nonlinearity of prediction models, as in the case for the atmosphere and ocean circulations that are governed by complex nonlinear dynamics. Clearly, the linear theory cannot meet the need of describing nonlinear evolution of finite-amplitude initial perturbation.
Having realized the limitation of LSV and the need to deal with the predictability of a nonlinear system, many methods have been attempted. Lorenz (1996) used the leading Lyapunov vector and exponent to study predictability. The concept of the finite-size Lyapunov exponent was introduced and applied when the initial uncertainty is not very small (Aurell et al. 1997; Boffetta et al. 1998). The breeding method was put forward as an extension of Lyapunov vector into the nonlinear field with finite-amplitude perturbations (Toth and Kalnay 1997). Oortwijn and Barkmeijer (1995) and Barkmeijer (1996) considered the nonlinear effects by an iterative procedure. Mu (2000) employed nonlinear model directly and proposed a novel concept of nonlinear singular vector (NSV) and nonlinear singular value (NSVA), which is a natural generalization of the classic LSV and LSVA.
To refine NSV and NSVA and to describe the initial perturbation that has the largest nonlinear evolution, Mu et al. (2003) introduced the concept of conditional nonlinear optimal perturbation (CNOP). CNOP is characterized by maximum nonlinear evolution of initial perturbations satisfying constraint conditions (Mu et al. 2003; Mu and Duan 2003). If the evolution of CNOP is measured by the growth rate of initial perturbation, CNOP may not be the fastest-growing perturbation, but its nonlinear evolution describes the maximum evolution of initial perturbations at prediction time. Thus, CNOP may be more important for predictability. In a sense, this kind of initial perturbation can be regarded as the constrained optimal perturbation of a nonlinear system, namely CNOP (Mu and Duan 2005). Mu and Duan (2003) and Duan et al. (2004) used CNOP to study the optimal precursor for El Niño–Southern Oscillation (ENSO) and the effect of nonlinearity on error growth for ENSO. Mu et al. (2004) used CNOP to investigate the nonlinear growth of instability of oceanic thermohaline circulation. Mu et al. (2007) applied CNOP to adaptive observations. All these applications of CNOP suggest that CNOP is a useful tool for predictability and stability or sensitivity studies, and there are many potential uses of CNOP in the studies of weather and climate predictability.
The need of using adjoint integrations during the maximization process to determine CNOP, however, is limiting its widespread use in complicated operational models that do not have an adjoint available. To alleviate this limitation, we propose a new approach to obtain CNOP, using an ensemble technique instead of the adjoint method. At present, the ensemble technique has a wide use in atmospheric sciences. For example, it was applied to a sophisticated weather prediction model (Hakim and Torn 2008), potential vorticity inversion (Gombos and Hansen 2008), and initial-condition sensitivity analysis (Torn and Hakim 2008; Ancell and Hakim 2007). In this paper, it will be used to calculate the gradient of cost function in iterations of the maximization. The methodology of the new approach is presented in section 2, and a preliminary test is given in section 3 using a simple Burgers function model. A summary and discussions are included in section 4.
The basic idea of the ensemble-based method to obtain CNOP is as follows: a lot of linearly independent or orthogonal initial perturbations (x′) are used to create the corresponding prediction increments (y′), a statistical model between x′ and y′ is built through an ensemble technique to easily calculate the gradient of the cost function in the iteration of optimization process, and the nonlinear prediction model is used to obtain the value of the cost function during the iteration to guarantee the nonlinear evolution of the constrained initial perturbation (see Fig. 1). According to the definition of CNOP given by Mu et al. (2003), a constraint condition is given first and then an initial perturbation is founded through the optimal process that satisfies the given constraint condition and makes the prediction increment of interest develop maximally. The initial perturbation determined in this way is then considered to be CNOP.
Assume xb be the background state of atmosphere at the initial time (t0), that is, the initial condition, x′ and y′ be an initial perturbation and the corresponding prediction increment at the prediction time (t), respectively, which satisfy the following nonlinear relationship:
where 𝗠 is a nonlinear prediction model y = 𝗠t0→t(x) is a prediction from t0 to t generated by the model 𝗠 with x as the initial condition. Here xb, x, and x′ are all mx-dimensional column vectors, and y and y′ are my-dimensional column vectors. Choosing n nonzero initial perturbations: x′1, x′2, … , x′n (n ≪ m = mx + my), which are linearly independent: det(𝗣xT𝗣x) ≠ 0, or orthogonal: (x′i)Tx′j = 0 (i ≠ j), where the superscript T means transpose, the corresponding prediction increments, y′1, y′2, … , y′n, can be obtained using the model 𝗠:
and 𝗣x is the initial perturbation matrix:
Correspondingly, the prediction increment matrix is denoted as 𝗣y:
If all the chosen initial perturbations are small enough comparing with xb, that is, ‖x′i‖ ≪ ‖xb‖ (i = 1, 2, … , n), a tangent linear relationship between y′i and x′i is approximately true according to Eq. (2):
where 𝗠′ is the tangent linear model of 𝗠. Using the matrix 𝗣x, an initial perturbation x′ can be projected onto a n-dimensional sample space:
Using Eq. (6) as well as the linear independence or orthogonality of x′1, x′2, … , x′n, α can be expressed as
To filter out the long-distance correlations between the initial perturbations and the corresponding prediction increments on model grids, many of which are false because of the much smaller ensemble size than the grid number, we replace the matrix 𝗣y with ρ · 𝗣y like Houtekamer and Mitchell (2001), where the Schür product of two matrices having the same dimension is denoted 𝗔 = 𝗕 · 𝗖 and consists of the element-wise product such that ai,j = bi,j · ci,j. This gives
The matrix ρ in Eq. (11) is calculated according to
where the filtering function C0 is defined by Gaspari and Cohn [1999, Eq. (4.10)] as
where and are the horizontal and vertical distances between the location of the ith row vector of 𝗣y and the location of the jth row vector of , respectively, and d0h and d0v are horizontal and vertical Schür radii, respectively.
According to the definition of CNOP (Mu et al. 2003), the initial perturbation x′* we want to obtain should make the prediction increment y′ maximum, while satisfying the constraint condition of ‖x′‖2 ≤ δ2 (here δ is a given small positive real number):
where J(x′) = ‖y′(x′)‖2. For the convenience of using existing algorithms of minimization, the reciprocal of J(x′), named J̃(x′), is then used to obtain CNOP:
In this study, the spectral projected gradient (SPG)-software for convex-constrained optimization by Birgin et al. (2001) is used to minimize Eq. (15). For each iteration of the optimization, the value and the gradient of J̃(x′), as well as the constraint condition, are required. To ensure the nonlinearity of the optimal solution, the value of J̃(x′) is calculated using the nonlinear model in Eq. (1). To greatly reduce computing time, the gradient of J̃(x′) is obtained based on the linear relationship between x′ and y′ in Eq. (16). In this way, we avoid the use of the adjoint model and the tangent liner model.
3. Preliminary test
Since the concept of CNOP was put forward, the adjoint-based approach has been the only method to obtain CNOPs (e.g., Mu and Duan 2003; Duan et al. 2004, 2008; Duan and Mu 2006; Mu and Jiang 2008; Terwisscha van Scheltinga and Dijkstra 2008). When a new approach is proposed, one may naturally ask whether the optimal initial perturbation obtained by the new approach is the same as or close to the CNOP derived by the adjoint technique. One may also be interested in a comparison between the new approach and an existing adjoint-free (or derivative free) method. In response to these concerns, three experiments are designed using a simple model of the viscous Burgers equation that describes the time evolution of the state variable u:
where γ = 0.005 m2 s−1, L = 100 m, and T = 30 s. The model solves the above equation using the central finite difference scheme on both the temporal and spatial directions, with Δx = 1.0 m as the spatial grid size (mx = my = 101) and Δt = 1.0 s as the time step. In the three experiments, the adjoint-based, ensemble-based, and simplex-search methods are used to calculate CNOP, respectively, where the simplex-search method is a derivative-free (adjoint free) optimization algorithm (Ahmed 2001). The norm square is employed to measure the perturbation u′(t) and the initial constraint condition is set to be ‖u′(0)‖ ≤ δ = 8 × 10−4 m s−1. It is shown that the adjoint-based and ensemble-based methods generate almost identical horizontal distributions and structures of CNOP (see Figs. 2a,b) except for a slight difference in their amplitudes (see Figs. 2a,b and 3a). The nonlinear time evolutions of both CNOPs are in such good agreement as to appear to be identical in Fig. 4, in terms of their norm squares ‖u′(t)‖2 (black and green curves in Fig. 4a) or their net growth ratios of norm square rg = [‖u′(t)‖2 − ‖u′(0)‖2]/[‖u′(0)‖2] (black and green curves in Fig. 4b). The difference between the net growth ratios of norm of both CNOPs, Δrg, is very small before the perturbations start to grow rapidly at the time t = 21 s and then become larger following the growth of the perturbations (Fig. 5a). However, the difference at the prediction time (t = 30 s) is less than 0.07 (Fig. 5a) and is small enough to be ignored comparing with the values of the growth ratios (16.523 48) at that time (Fig. 4b). In contrast to the ensemble-based approach, the simplex-search method generates a CNOP (Fig. 2c) quite different from that produced by the adjoint-based method (Figs. 2a and 3b) in spatial distributions, although it is a kind of fast-developing initial perturbation (yellow curve in Fig. 4). Its maximum net growth ratio of norm square is 12.682 08 (at the prediction time), which is about a factor of 4 lower than that of the adjoint-based method (16.590 24) and that of the ensemble-based method (Fig. 5b).
In the experiments, both the ensemble-based and adjoint-based approaches adopt the same SPG software for their optimizations. They took 132 and 136 iterations, respectively, before convergences. Like most other derivative-free optimization methods, the simplex-search algorithm converges very slowly, which took 2933 iterations before convergence. During the optimizations, the adjoint-based method need run the nonlinear model for 136 times to calculate the value of the cost function and run the adjoint model for 136 times for obtaining the gradient (or derivation) of the cost function. Meanwhile, the ensemble-based method only need run the nonlinear model for 132 times for optimization plus 58 times for the generation of ensemble members (58 initial perturbation samples are selected from hundreds of randomly produced perturbation samples based on a quality control that the correlation coefficient between any two samples is less than a set threshold that is 0.2 in this study), which is obviously time saving comparing with the adjoint-based method. Because of its slow convergence rate, the simplex-search algorithm need run the nonlinear model for 3366 times to seek the vertex location of cost function. It is the most expensive approach.
The above analysis indicates a good practicability of the ensemble-based approach. It suggests that this approach could be applied to obtain CNOP of an operational prediction model.
4. Summary and discussion
We have introduced a new approach to obtain CNOP based on the ensemble technique. This approach is easier to implement and much more time saving than the classic adjoint-based method. We have also used a simple theoretical model to numerically test the performance of the new approach, and have shown that the nonlinear characteristics of CNOP evolution are well captured by the new approach. The numerical experiments show a good equivalence between the new method and adjoint-based method, because they generate almost identical spatial structures and very similar time developments of CNOP in terms of the norm or the net growth ratio of norm, which suggests that the ensemble-based method is an effective tool to obtain CNOP. The experiments also reveal the advantages of the new method comparing with the simplex-search method, an existing adjoint-free algorithm. We believe that the new approach could be applied to obtain CNOPs of complicated nonlinear models governing the atmospheric and oceanic motions. In particular, we are exploring the application of CNOP generated by the ensemble-based approach for adaptive observations in an operational numerical weather prediction system.
The authors thank M. Mu for his suggestions about the theoretical aspect of this study. We are grateful to W.-S. Duan for his help in calculating CNOP using the adjoint technique. This study is jointly supported by the National Basic Research Program of China (973 Program, Grant 2004CB418304) and the NSF of China innovation research group fund (Grant 40821092).
Corresponding author address: Bin Wang LASG, Institute of Atmospheric Physics, Chinese Academy of Sciences, P.O. Box 9804, Beijing 100029, China. Email: firstname.lastname@example.org