## 1. Introduction

The determination of the fastest-growing perturbation is one of the key issues for weather and climate predictability. To serve this purpose, Lorenz (1965) introduced the concepts of linear singular vector (LSV) and linear singular value (LSVA). Such a linear approach has been widely used to find the fastest-growing perturbations of atmospheric flows, and its applications have been extended to explore error growth and predictability in the past decade (e.g., Xue et al. 1997a,b; Thompson 1998; Samelson and Tziperman 2001). In addition, LSVs are employed to construct initial perturbations for ensemble forecast (Palmer et al. 1993; Montani et al. 1996) and to determine targeted areas in adaptive observations (Buizza and Montani 1999). The theories of LSV and LSVA are established on the basis that evolution of initial perturbation is assumed to be linear, which can be described approximately by a tangent linear model (TLM). The linear assumption is reasonable when the amplitude of perturbation is small enough. However, a small initial perturbation can rapidly evolve to an amplitude that is much larger due to the nonlinearity of prediction models, as in the case for the atmosphere and ocean circulations that are governed by complex nonlinear dynamics. Clearly, the linear theory cannot meet the need of describing nonlinear evolution of finite-amplitude initial perturbation.

Having realized the limitation of LSV and the need to deal with the predictability of a nonlinear system, many methods have been attempted. Lorenz (1996) used the leading Lyapunov vector and exponent to study predictability. The concept of the finite-size Lyapunov exponent was introduced and applied when the initial uncertainty is not very small (Aurell et al. 1997; Boffetta et al. 1998). The breeding method was put forward as an extension of Lyapunov vector into the nonlinear field with finite-amplitude perturbations (Toth and Kalnay 1997). Oortwijn and Barkmeijer (1995) and Barkmeijer (1996) considered the nonlinear effects by an iterative procedure. Mu (2000) employed nonlinear model directly and proposed a novel concept of nonlinear singular vector (NSV) and nonlinear singular value (NSVA), which is a natural generalization of the classic LSV and LSVA.

To refine NSV and NSVA and to describe the initial perturbation that has the largest nonlinear evolution, Mu et al. (2003) introduced the concept of conditional nonlinear optimal perturbation (CNOP). CNOP is characterized by maximum nonlinear evolution of initial perturbations satisfying constraint conditions (Mu et al. 2003; Mu and Duan 2003). If the evolution of CNOP is measured by the growth rate of initial perturbation, CNOP may not be the fastest-growing perturbation, but its nonlinear evolution describes the maximum evolution of initial perturbations at prediction time. Thus, CNOP may be more important for predictability. In a sense, this kind of initial perturbation can be regarded as the constrained optimal perturbation of a nonlinear system, namely CNOP (Mu and Duan 2005). Mu and Duan (2003) and Duan et al. (2004) used CNOP to study the optimal precursor for El Niño–Southern Oscillation (ENSO) and the effect of nonlinearity on error growth for ENSO. Mu et al. (2004) used CNOP to investigate the nonlinear growth of instability of oceanic thermohaline circulation. Mu et al. (2007) applied CNOP to adaptive observations. All these applications of CNOP suggest that CNOP is a useful tool for predictability and stability or sensitivity studies, and there are many potential uses of CNOP in the studies of weather and climate predictability.

The need of using adjoint integrations during the maximization process to determine CNOP, however, is limiting its widespread use in complicated operational models that do not have an adjoint available. To alleviate this limitation, we propose a new approach to obtain CNOP, using an ensemble technique instead of the adjoint method. At present, the ensemble technique has a wide use in atmospheric sciences. For example, it was applied to a sophisticated weather prediction model (Hakim and Torn 2008), potential vorticity inversion (Gombos and Hansen 2008), and initial-condition sensitivity analysis (Torn and Hakim 2008; Ancell and Hakim 2007). In this paper, it will be used to calculate the gradient of cost function in iterations of the maximization. The methodology of the new approach is presented in section 2, and a preliminary test is given in section 3 using a simple Burgers function model. A summary and discussions are included in section 4.

## 2. Methodology

The basic idea of the ensemble-based method to obtain CNOP is as follows: a lot of linearly independent or orthogonal initial perturbations (**x**′) are used to create the corresponding prediction increments (**y**′), a statistical model between **x**′ and **y**′ is built through an ensemble technique to easily calculate the gradient of the cost function in the iteration of optimization process, and the nonlinear prediction model is used to obtain the value of the cost function during the iteration to guarantee the nonlinear evolution of the constrained initial perturbation (see Fig. 1). According to the definition of CNOP given by Mu et al. (2003), a constraint condition is given first and then an initial perturbation is founded through the optimal process that satisfies the given constraint condition and makes the prediction increment of interest develop maximally. The initial perturbation determined in this way is then considered to be CNOP.

**x**

*be the background state of atmosphere at the initial time (*

_{b}*t*

_{0}), that is, the initial condition,

**x**′ and

**y**′ be an initial perturbation and the corresponding prediction increment at the prediction time (

*t*), respectively, which satisfy the following nonlinear relationship: where 𝗠 is a nonlinear prediction model

**y**= 𝗠

_{t0→t}(

**x**) is a prediction from

*t*

_{0}to

*t*generated by the model 𝗠 with

**x**as the initial condition. Here

**x**

*,*

_{b}**x**, and

**x**′ are all

*m*-dimensional column vectors, and

_{x}**y**and

**y**′ are

*m*-dimensional column vectors. Choosing

_{y}*n*nonzero initial perturbations:

**x**′

_{1},

**x**′

_{2}, … ,

**x**′

*(*

_{n}*n*≪

*m*=

*m*+

_{x}*m*), which are linearly independent: det(𝗣

_{y}_{x}

^{T}𝗣

*) ≠ 0, or orthogonal: (*

_{x}**x**′

*)*

_{i}^{T}

**x**′

*= 0 (*

_{j}*i*≠

*j*), where the superscript T means transpose, the corresponding prediction increments,

**y**′

_{1},

**y**′

_{2}, … ,

**y**′

*, can be obtained using the model 𝗠: and 𝗣*

_{n}*is the initial perturbation matrix: Correspondingly, the prediction increment matrix is denoted as 𝗣*

_{x}*: If all the chosen initial perturbations are small enough comparing with*

_{y}**x**

*, that is, ‖*

_{b}**x**′

*‖ ≪ ‖*

_{i}**x**

*‖ (*

_{b}*i*= 1, 2, … ,

*n*), a tangent linear relationship between

**y**′

*and*

_{i}**x**′

*is approximately true according to Eq. (2): where 𝗠′ is the tangent linear model of 𝗠. Using the matrix 𝗣*

_{i}*, an initial perturbation*

_{x}**x**′ can be projected onto a

*n*-dimensional sample space: Based on Eqs. (4), (5), and (6), the corresponding prediction increment

**y**′ can be projected onto the same sample space: Using Eq. (6) as well as the linear independence or orthogonality of

**x**′

_{1},

**x**′

_{2}, … ,

**x**′

*,*

_{n}**can be expressed as Substituting**

*α***in Eq. (7) by that in Eq. (8), the following relationship between**

*α***x**′ and

**y**′ can be established: where

_{y}

**· 𝗣**

*ρ*_{y}

*a*

_{i,j}=

*b*

_{i,j}·

*c*

_{i,j}. This gives The matrix

**in Eq. (11) is calculated according to where the filtering function**

*ρ**C*

_{0}is defined by Gaspari and Cohn [1999, Eq. (4.10)] as where

*i*th row vector of 𝗣

*and the location of the*

_{y}*j*th row vector of

*d*

_{0}

^{h}and

*d*

_{0}

^{v}are horizontal and vertical Schür radii, respectively.

**x**′

_{*}we want to obtain should make the prediction increment

**y**′ maximum, while satisfying the constraint condition of ‖

**x**′‖

^{2}≤

*δ*

^{2}(here

*δ*is a given small positive real number): where

*J*(

**x**′) = ‖

**y**′(

**x**′)‖

^{2}. For the convenience of using existing algorithms of minimization, the reciprocal of

*J*(

**x**′), named

*J̃*(

**x**′), is then used to obtain CNOP: Using Eqs. (11) and (15), the gradient of

*J̃*(

**x**′) can be formulated as In this study, the spectral projected gradient (SPG)-software for convex-constrained optimization by Birgin et al. (2001) is used to minimize Eq. (15). For each iteration of the optimization, the value and the gradient of

*J̃*(

**x**′), as well as the constraint condition, are required. To ensure the nonlinearity of the optimal solution, the value of

*J̃*(

**x**′) is calculated using the nonlinear model in Eq. (1). To greatly reduce computing time, the gradient of

*J̃*(

**x**′) is obtained based on the linear relationship between

**x**′ and

**y**′ in Eq. (16). In this way, we avoid the use of the adjoint model and the tangent liner model.

## 3. Preliminary test

*u*: where

*γ*= 0.005 m

^{2}s

^{−1},

*L*= 100 m, and

*T*= 30 s. The model solves the above equation using the central finite difference scheme on both the temporal and spatial directions, with Δ

*x*= 1.0 m as the spatial grid size (

*m*=

_{x}*m*= 101) and Δ

_{y}*t*= 1.0 s as the time step. In the three experiments, the adjoint-based, ensemble-based, and simplex-search methods are used to calculate CNOP, respectively, where the simplex-search method is a derivative-free (adjoint free) optimization algorithm (Ahmed 2001). The norm square

**u**′(

*t*) and the initial constraint condition is set to be ‖

**u**′(0)‖ ≤

*δ*= 8 × 10

^{−4}m s

^{−1}. It is shown that the adjoint-based and ensemble-based methods generate almost identical horizontal distributions and structures of CNOP (see Figs. 2a,b) except for a slight difference in their amplitudes (see Figs. 2a,b and 3a). The nonlinear time evolutions of both CNOPs are in such good agreement as to appear to be identical in Fig. 4, in terms of their norm squares ‖

**u**′(

*t*)‖

^{2}(black and green curves in Fig. 4a) or their net growth ratios of norm square

*r*= [‖

_{g}**u**′(

*t*)‖

^{2}− ‖

**u**′(0)‖

^{2}]/[‖

**u**′(0)‖

^{2}] (black and green curves in Fig. 4b). The difference between the net growth ratios of norm of both CNOPs, Δ

*r*, is very small before the perturbations start to grow rapidly at the time

_{g}*t*= 21 s and then become larger following the growth of the perturbations (Fig. 5a). However, the difference at the prediction time (

*t*= 30 s) is less than 0.07 (Fig. 5a) and is small enough to be ignored comparing with the values of the growth ratios (16.523 48) at that time (Fig. 4b). In contrast to the ensemble-based approach, the simplex-search method generates a CNOP (Fig. 2c) quite different from that produced by the adjoint-based method (Figs. 2a and 3b) in spatial distributions, although it is a kind of fast-developing initial perturbation (yellow curve in Fig. 4). Its maximum net growth ratio of norm square is 12.682 08 (at the prediction time), which is about a factor of 4 lower than that of the adjoint-based method (16.590 24) and that of the ensemble-based method (Fig. 5b).

In the experiments, both the ensemble-based and adjoint-based approaches adopt the same SPG software for their optimizations. They took 132 and 136 iterations, respectively, before convergences. Like most other derivative-free optimization methods, the simplex-search algorithm converges very slowly, which took 2933 iterations before convergence. During the optimizations, the adjoint-based method need run the nonlinear model for 136 times to calculate the value of the cost function and run the adjoint model for 136 times for obtaining the gradient (or derivation) of the cost function. Meanwhile, the ensemble-based method only need run the nonlinear model for 132 times for optimization plus 58 times for the generation of ensemble members (58 initial perturbation samples are selected from hundreds of randomly produced perturbation samples based on a quality control that the correlation coefficient between any two samples is less than a set threshold that is 0.2 in this study), which is obviously time saving comparing with the adjoint-based method. Because of its slow convergence rate, the simplex-search algorithm need run the nonlinear model for 3366 times to seek the vertex location of cost function. It is the most expensive approach.

The above analysis indicates a good practicability of the ensemble-based approach. It suggests that this approach could be applied to obtain CNOP of an operational prediction model.

## 4. Summary and discussion

We have introduced a new approach to obtain CNOP based on the ensemble technique. This approach is easier to implement and much more time saving than the classic adjoint-based method. We have also used a simple theoretical model to numerically test the performance of the new approach, and have shown that the nonlinear characteristics of CNOP evolution are well captured by the new approach. The numerical experiments show a good equivalence between the new method and adjoint-based method, because they generate almost identical spatial structures and very similar time developments of CNOP in terms of the norm or the net growth ratio of norm, which suggests that the ensemble-based method is an effective tool to obtain CNOP. The experiments also reveal the advantages of the new method comparing with the simplex-search method, an existing adjoint-free algorithm. We believe that the new approach could be applied to obtain CNOPs of complicated nonlinear models governing the atmospheric and oceanic motions. In particular, we are exploring the application of CNOP generated by the ensemble-based approach for adaptive observations in an operational numerical weather prediction system.

## Acknowledgments

The authors thank M. Mu for his suggestions about the theoretical aspect of this study. We are grateful to W.-S. Duan for his help in calculating CNOP using the adjoint technique. This study is jointly supported by the National Basic Research Program of China (973 Program, Grant 2004CB418304) and the NSF of China innovation research group fund (Grant 40821092).

## REFERENCES

Ahmed, S., 2001: Derivative free optimization in higher dimension.

,*Int. Trans. Oper. Res.***8****,**285–303.Ancell, B., , and G. J. Hakim, 2007: Comparing adjoint and ensemble sensitivity analysis.

,*Mon. Wea. Rev.***135****,**4117–4134.Aurell, E., , G. Boffetta, , A. Crisanti, , G. Paladin, , and A. Vulpiani, 1997: Predictability in the large: An extension of the concept of Lypunov exponent.

,*J. Phys. A***30****,**1–26.Barkmeijer, J., 1996: Constructing fast-growing perturbations for the nonlinear regime.

,*J. Atmos. Sci.***53****,**2838–2851.Birgin, E. G., , J. E. Martinez, , and R. Marcos, 2001: Algorithm 813: SPG-software for convex-constrained optimization.

,*ACM Trans. Math. Software***27****,**340–349.Boffetta, G., , P. Giuliani, , G. Paladin, , and A. Vulpiani, 1998: An extension of the Lyapunov analysis for the predictability problem.

,*J. Atmos. Sci.***55****,**3409–3416.Buizza, R., , and A. Montani, 1999: Targeted observations using singular vectors.

,*J. Atmos. Sci.***56****,**2965–2985.Duan, W-S., , and M. Mu, 2006: Investigating decadal variability of El Niño–Southern Oscillation asymmetry by conditional nonlinear optimal perturbation.

,*J. Geophys. Res.***111****,**C07015. doi:10.1029/2005JC003458.Duan, W-S., , M. Mu, , and B. Wang, 2004: Conditional nonlinear optimal perturbation as the optimal precursors for El Niño–Southern oscillation events.

,*J. Geophys. Res.***109****,**D23105. doi:10.1029/2004JD004756.Duan, W-S., , H. Xu, , and M. Mu, 2008: Decisive role of nonlinear temperature advection in El Niño and La Niña amplitude asymmetry.

,*J. Geophys. Res.***113****,**C01014. doi:10.1029/2006JC003974.Gaspari, G., , and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions.

,*Quart. J. Roy. Meteor. Soc.***125**(554) 723–757.Gombos, D., , and J. Hansen, 2008: Potential vorticity regression and its relationship to dynamical piecewise inversion.

,*Mon. Wea. Rev.***136****,**2668–2682.Hakim, G. J., , and R. D. Torn, 2008: Ensemble synoptic analysis.

*Synoptic–Dynamic Meteorology and Weather Analysis and Forecasting: A Tribute to Fred Sanders, Meteor. Monogr.,*No. 55, Amer. Meteor. Soc., 147–161.Houtekamer, P. L., , and H. L. Mitchell, 2001: A sequential ensemble Kalman filter for atmospheric data assimilation.

,*Mon. Wea. Rev.***129****,**123–137.Lorenz, E. N., 1965: A study of the predictability of a 28-variable atmospheric model.

,*Tellus***17****,**321–333.Lorenz, E. N., 1996: Predictability—A problem partly solved.

*Proc. Seminar on Predictability,*Reading, United Kindom, European Centre for Medium-Range Weather Forecasts, 1–18.Montani, A., , R. Buizza, , and A. J. Thorpe, 1996: Singular vector calculations for cases of cyclogenesis in the North Atlantic storm-track. Preprints,

*Seventh Conf. on Mesoscale Processes,*Reading, United Kingdom, Amer. Meteor. Soc., 391–392.Mu, M., 2000: Nonlinear singular vectors and nonlinear singular values.

,*Sci. China***43D****,**375–385.Mu, M., , and W-S. Duan, 2003: A new approach to studying ENSO predictability: Conditional nonlinear optimal perturbation.

,*Chin. Sci. Bull.***48****,**1045–1047.Mu, M., , and W-S. Duan, 2005: Conditional nonlinear optimal perturbation and its applications to the studies of weather and climate predictability.

,*Chin. Sci. Bull.***50****,**2401–2407.Mu, M., , and Z-N. Jiang, 2008: A method to find perturbations that trigger blocking onset: Conditional nonlinear optimal perturbations.

,*J. Atmos. Sci.***65****,**3935–3946.Mu, M., , W-S. Duan, , and B. Wang, 2003: Conditional nonlinear optimal perturbation and its applications.

,*Nonlinear Processes Geophys.***10****,**493–501.Mu, M., , L. Sun, , and D. A. Henk, 2004: The sensitivity and stability of the ocean’s thermocline circulation to finite amplitude freshwater perturbations.

,*J. Phys. Oceanogr.***34****,**2305–2315.Mu, M., , H-L. Wang, , and F-F. Zhou, 2007: A preliminary application of conditional nonlinear optimal perturbation to adaptive observation (in Chinese).

,*Chin. J. Atmos. Sci.***31**(6) 1102–1112.Oortwijn, J., , and J. Barkmeijer, 1995: Perturbations that optimally trigger weather regimes.

,*J. Atmos. Sci.***52****,**3932–3944.Palmer, T. N., , F. Molteni, , R. Mureau, , R. Buizza, , P. Chapelet, , and J. Tribbia, 1993: Ensemble prediction.

*Proc. ECMWF Seminar Proc. on Validation of Models over Europe,*Vol. 1, Shinfield Park, Reading, United Kingdom, ECMWF, 21–66.Samelson, R. M., , and E. Tziperman, 2001: Instability of the chaotic ENSO: The growth-phase predictability barrier.

,*J. Atmos. Sci.***58****,**3613–3625.Terwisscha van Scheltinga, A. D., , and H. A. Dijkstra, 2008: Conditional nonlinear optimal perturbations of the double-gyre ocean circulation.

,*Nonlinear Processes Geophys.***15****,**727–734.Thompson, C. J., 1998: Initial conditions for optimal growth in a coupled ocean–atmosphere model of ENSO.

,*J. Atmos. Sci.***55****,**537–557.Torn, R. D., , and G. J. Hakim, 2008: Ensemble-based sensitivity analysis.

,*Mon. Wea. Rev.***136****,**663–677.Toth, Z., , and E. Kalnay, 1997: Ensemble forecasting at NCEP: The breeding method.

,*Mon. Wea. Rev.***125****,**3297–3318.Xue, Y., , Y. M. A. Cane, , and S. E. Zebiak, 1997a: Predictability of a coupled model of ENSO using singular vector analysis. Part I: Optimal growth in seasonal background and ENSO cycles.

,*Mon. Wea. Rev.***125****,**2043–2056.Xue, Y., , Y. M. A. Cane, , and S. E. Zebiak, 1997b: Predictability of a coupled model of ENSO using singular vector analysis. Part II: Optimal growth and forecast skill.

,*Mon. Wea. Rev.***125****,**2057–2073.