## 1. Introduction

The atmosphere is a chaotic system, and even negligible initial errors will give rise to gradual deviation of the forecast state from the true path, eventually resulting in chaos (Lorenz 1963a,b, 1965; Li and Chou 1997). This means that the weather has a predictability limit beyond which forecasts will lose all skill. Based on this, any single forecast is simply an estimate of the future state of the atmosphere within a stochastic framework but provides no information regarding its reliability. Ensemble prediction offers one approach to generate probabilistic forecasts of the future state of the system based on a statistical sampling approach (Leith 1974).

The ensemble method provides an approximate description of the probability density function (PDF) of the forecast state based on a finite number of samples. In a generally perfect model, if a group of initial states sample the uncertainty related to the initial value reasonably well, the states of their individual integration will have the ability to represent the uncertainty of the forecast states. In contrast with a single forecast, ensemble averaging provides a nonlinear filter that reduces error (Toth and Kalnay 1997). However, the ensemble mean simply treats the ensemble forecast as a single forecast, which does not indicate how reliable the result is. The biggest advantage of ensemble prediction is the product of the second moment of the ensemble, or the ensemble spread (Whitaker and Loughe 1998; Zhu et al. 2002). With this additional information, the quality of forecasts can be significantly enhanced.

Initially, the use of ensemble techniques focused on random samples (Monte Carlo forecasting) as a description of the probability distribution of initial states (Epstein 1969; Leith 1974). However, the atmosphere is an extremely complex system, which has a very high phase-space dimension; that is, the number of random samples must be sufficiently large. Consequently, the cost of performing the requisite number of predictions presents a significant challenge. However, as more and more intensive studies of data assimilation procedures have been completed, analysis errors are no longer considered to be totally unknown. Pure random perturbations may introduce irrelevant directions to a chosen initial state (Houtekamer and Derome 1994). In the early 1990s, ensemble prediction systems were developed and used operationally by weather prediction centers. Different schemes are used to generate the initial perturbations in the control analysis: for example, the bred vector (BV) method (Toth and Kalnay 1993, 1997) at the National Centers for Environmental Prediction (NCEP), the singular vector (SV) method (Molteni and Palmer 1993; Molteni et al. 1996; Buizza 1997) at the European Centre for Medium-Range Weather Forecasts (ECMWF), and the perturbed-observation approach (Houtekamer et al. 1996) at the Canadian Meteorological Centre (CMC). The BV and SV methods have been widely used in predictability and probabilistic prediction of weather and climate (Cai et al. 2003; Yang et al. 2009; Cheng et al. 2010; Buizza 2010). Recently, some methods based on the ensemble transform concept to generate the initial perturbations, such as the ensemble transform Kalman filter (ETKF), ensemble transform (ET), and ensemble transform with rescaling (ETR) methods, have been developed and attempted to improve operational forecasts (Bishop et al. 2001; Wang and Bishop 2003; Wei et al. 2006, 2008; Descamps and Talagrand 2007; Pu and Hacker 2009).

Of several operational ensemble generation schemes, the breeding method is the most time efficient (Wang and Bishop 2003). Toth and Kalnay (1993, 1997) introduced the breeding cycle as a procedure to simulate the behavior of growing modes in the analysis cycle (see Fig. 1). Their rationale was that in a data assimilation cycle, random errors introduced at each analysis stage would evolve into fast-growing directions in the atmospheric flow; that is, the ratio of fast-growing errors to the total error would significantly increase in the control analysis (Toth and Kalnay 1997). Based on these properties of analysis errors, the breeding method was proposed to simulate the analysis cycle and generate fast-growing error modes. In this process, random perturbations are superposed on the initial condition, evolve directly in a nonlinear model of dynamical flow, and are then scaled down to the same size as the initial perturbations at regular intervals. The final perturbations corresponding to each initial state are called BVs, which are added and subtracted from the control analysis to generate the ensemble samples. A more complete description is given in appendix A.

The BVs are a nonlinear extension of the local Lyapunov vectors (LLVs) (Toth and Kalnay 1997; Kalnay 2003). Through the breeding cycles, the BVs all generally turn toward the fastest-growing modes. Toth and Kalnay (1997) state that the BVs have strong similarities in some local regions but are quasi-orthogonal globally. However, the high correlation between the local structures of the BVs may degrade the global orthogonality to some extent. Wang and Bishop (2003) analyzed the orthogonal properties of BVs, concluding that “nearly all of the ensemble forecast variance is contained in a single direction for both the simple breeding and the masked breeding.” This result means that BVs may not span the fast-growing subspace efficiently. However, short-term error growth is a local phenomenon. The high correlation of BVs over local regions may limit forecast skill in those regions (Annan 2004, hereafter A04). Several methods have been recently proposed to prevent BVs from collapsing into one dominant vector. For example, A04 attempted to obtain the fastest-growing modes using orthogonal BVs. Primo et al. (2008) introduced a new logarithmic bred vector (LBV) to increase the diversity of the ensemble. Corazza et al. (2003) and Greybush et al. (2013) attempted to keep BVs less correlated by adding a small random perturbation to the perturbed run. These methods improve to some extent the performance of the diversity of the ensemble perturbations. However, they did not effectively eliminate the dependence among the initial perturbations.

Early investigators developed the theory of Lyapunov exponents [or the Lyapunov exponent spectrum (LEs)] to quantitatively estimate the expanding or contracting nature of different directions in the phase space of nonlinear dynamical systems (Benettin 1980; Wolf et al. 1985; Fraedrich 1987; Trevisan and Legnani 1995). The LEs are global properties of the attractor of the dynamical system. Furthermore, local or finite-time Lyapunov exponents have been proposed (Yoden and Nomura 1993; Ziehmann et al. 2000) to measure the short-term growth rate of small initial perturbations. However, these local or finite-time Lyapunov exponents, which are similar to the global Lyapunov exponents, are established based on the assumption that initial perturbations are infinitesimal and therefore can be controlled by linear error growth dynamics. To understand the nonlinear evolution of initial errors, the nonlinear local Lyapunov exponent (NLLE) was proposed to study the predictability of an

Traditional BVs are obtained through natural breeding cycles. However, natural breeding may not be efficient for two reasons. First, the BVs span a relatively small part of the subspace of fast-growing directions owing to their dependence of perturbations. Second, BVs naturally evolved from random perturbations may not be able to provide a stable indication of the fastest-growing direction after a finite breeding period. We attempted to use the comparison and orthogonalization methods to obtain NLLVs to overcome these two problems. The comparison method, which was used by A04, attempts to capture the fastest-growing direction by comparing the growth rate of orthogonal BVs; it is, therefore, more effective than a single breeding cycle. Here, we will use this comparison method for the acquisition of the LNLLV. The additional NLLVs can be obtained successively by using Gram–Schmidt reorthonormalization (GSR) (Wolf et al. 1985; Li and Wang 2008). A detailed description of the GSR and comparison methods is given in section 2. The first few NLLVs are obtained in this way by optimally selecting the fastest-growing modes from the initial perturbations; they are strictly orthogonal and thereby provide an appropriate orthogonal basis for the fast-growing subspace. The NLLV ensemble would be expected to outperform the BV ensemble. A comparison of the performances of the BV and NLLV methods in a perfect system is our primary aim here. Results using the Monte Carlo method are also included for reference.

The remainder of this paper is arranged as follows. Section 2 provides a brief description of the NLLE, NLLEs, and NLLVs, while section 3 introduces the idealized models and the initialization of the analysis states. A description of the BV and NLLV ensemble generation schemes is given in section 4, and the performance of the different ensemble methods are quantitatively compared in section 5. The discussion and our conclusions are presented in section 6.

## 2. Defining the nonlinear local Lyapunov vector

*n*-dimensional nonlinear dynamical system whose evolution is governed bywhere

**x**, is governed by the nonlinear equations:where

*n*th vector is projected onto the subspace orthogonal to the previous

*n*− 1 vectors. If

*m*perturbations through breeding procedure, the GSR would provide the following orthogonal set

*i*th NLLE

To obtain the fastest-growing LNLLV more effectively than a single BV, we adapted the comparison method of A04 before orthogonalizing the perturbations (as illustrated in Fig. 2). This method includes two steps: a comparison of the growth rates of perturbations in each breeding cycle and a reordering of the perturbations by their growth rate from the largest to the smallest. The NLLVs bear some similarities to the orthogonal BVs as proposed by A04 in the use of the comparison method. However, some differences still exist between them. First, the orthogonal BVs in A04 are mainly used in the local region to increase local diversity, while the NLLVs are globally orthogonal and represent a nonlinear extension of the LLVs based on the theory of Lyapunov stability. Second, the generated ensemble perturbations in A04 are much less correlated than the BVs but are not strictly orthogonal. A04 reported that the mean correlation between any two ensemble perturbations is 0.17 (i.e., not zero) at the end of the integration time in the Lorenz96 model, implying that these perturbations are not strictly orthogonal. However, the NLLVs are strictly orthogonal (i.e., zero correlation) and so form an appropriate orthogonal basis for the fast-growing subspace.

## 3. Experimental setup

To determine the differences among these ensemble methods as clearly as possible, we made our comparison based on two ideal models, namely, the Lorenz63 (Lorenz 1963a) and Lorenz96 models (Lorenz 1996). The former can be regarded as a one-point model, while the latter can been seen as resembling a low-order real NWP model. We have taken special account of the generation of the control analyses to ensure that they have no partiality for any initialization methods. Anderson (1997) simply used random perturbations as analysis errors to compare dynamically constrained (i.e., BV and SV) and unconstrained perturbation methods, but these kinds of random perturbations in dynamical systems were recognized to be the nongrowing type on average, which would reduce initially (Toth and Kalnay 1993). Therefore, growing perturbations (i.e., BV and SV) may have no advantage in simulating such analysis errors. Houtekamer and Derome (1994) even used any of the BVs as an analysis error to investigate the improvement of the breeding method to control the forecast. Our experiments found that the higher correlation between the BVs and the analysis error give some unfair advantage to the breeding method over other ensemble methods (not shown). To perform a thorough comparison of the different initialization methods, following Descamps and Talagrand (2007), the control analyses in the BV, NLLV, and random perturbation methods were obtained using the ensemble Kalman filter (EnKF) data assimilation method (Evensen 2003, 2004). The detailed description of this assimilation method is shown in appendix B.

The BV and NLLV cycles are implemented in relation to the analysis cycle. In each analysis cycle, an analysis state is updated by combining the observations and forecast states from a numerical model, as shown in Fig. 2. The forecast error that evolves in time through the atmospheric flow will naturally project into the growing directions. Therefore, growing components will develop within the analysis cycle and dominate the analysis errors. The breeding and NLLV cycles are implemented to simulate the development of growing errors in the analysis cycle and to obtain fast-growing modes associated with the evolving state of the atmosphere. This allows the BVs and NLLVs to sample efficiently the probability of analysis error.

*N*= 2, where

*N*denotes the number of the ensemble member in a forecast example) for this model.

*m*= 40, where

*m*denotes the dimension of the state vector of the ensemble forecast model) model that has been used by various authors as a low-order proxy for atmospheric prediction and assimilation studies (e.g., Lorenz and Emanuel 1998; Anderson 2001). We denote the state variables

In the case *F* = 8, the model solves Eq. (8) also using a fourth-order Runge–Kutta scheme with a time step of 0.05 time units. This model can be seen as the time evolution of an arbitrary one-dimensional quantity on a constant latitude circle in which a time increment of 0.2 nondimensional units corresponds to about 1 day in terms of the error growth rate (Lorenz 1996).

**x**denotes the true state vector, and

**y**is the observation vector, thenwhere

**x**to the space of observational data

**y**. The added noise vector

To make the experimental setup clearer, two tables listing the experiment setup of EnKF data assimilation and the breeding procedures are included (Tables 1 and 2, respectively). The statistical performances of the forecasts are compared, for each initialization scheme, by averaging over a large number of experiments. The number of experiments was *M* = 10 000 with different initial analysis states for the Lorenz63 model and *M* = 500 for the Lorenz96 model. To ensure that the different experiments were conducted independently, they had different initial states and different observations. For the Lorenz63 model observations were assimilated every 0.1 time units and the performing time was 0.4 time units in each case, while for the Lorenz96 model the assimilation cycle was 0.05 time units (6 h) and was repeated over 10 time units (50 days). Under the parameters as set here, the amplitudes of the initial analysis errors were within the bounds 5%–10% of the natural variability (about 14 for the Lorenz63 model and 2.8 for the Lorenz96 model; in the

Ensemble size, number of observation stations, assimilation interval, and total assimilation length of the EnKF assimilation for the Lorenz63 and Lorenz96 models.

Number of experiments, number of ensemble members, rescaling interval, and total breeding length of the BV and the NLLV method for the Lorenz63 and Lorenz96 models.

Once the analysis states are generated, the ensemble perturbations produced by the random perturbation, BV and NLLV schemes will be added and subtracted from them. The BVs and NLLVs are generated through breeding cycles with regular rescaling. For the Lorenz63 model, the length of a breeding cycle of the NLLV and BV methods is 0.1 time units and the process is repeated over 0.4 time units. For the Lorenz96 model, the length of a breeding cycle of both methods is 0.2 time units (1 day) and the breeding time is 2 time units (10 days). The orthogonalization process for the NLLVs is performed during each breeding cycle. As for the random perturbation method, the random noise is from a Gaussian distribution of

## 4. Nonlinear model results

### a. Perturbation growth rate

Before providing a detailed assessment of the quality of the ensemble predictions, the error growth of the different initial schemes will be investigated. For the Lorenz63 model, the growth rates of the vectors from the three schemes were compared by calculating the ratio of the final to initial error (in the

The process for the Lorenz96 model was similar to that for the Lorenz63 model. The rescaling factor was 0.17 (the average RMSE of analysis error). After a breeding period of 2 time units (10 days), the first EOF explains 52% of the variance of the bred vectors, indicating that these modes have a certain degree of similarity in patterns that represent the unstable modes determined by the background flow (Corazza et al. 2003). A group of 10 vectors were used for the generation of the NLLVs and BVs. The comparison of growth rates from the different schemes is shown in Fig. 3. Here, the Lyapunov exponent was used to measure the perturbation growth.

_{5}) are shown here. Each performance is the average of 500 samples. The LNLLV appears to remain the fastest during the first 4 days. The NLLVs represent different directions ordered by their growth rates. Among the first five NLLVs, the growth rates of the first two both over take those of the BVs during the first 2 days, while the growth rates of the last three NLLVs are slightly smaller than that of the BV. These results indicate that the NLLVs span the fast-growing subspace well. After that period, with the growth of the perturbations, the growth rates of all perturbations slow and become very close. The random perturbations are initially characterized by negative growth rates; the growth becomes enhanced (superexponential) (Trevisan and Legnani 1995) after about 1 day. This may be related to the fast growth of the small-scale errors. After small-scale error saturation had occurred in a short time, the growth rate gradually slowed down and mainly depended on the error growth in the large scales (Lacarra and Talagrand 1988; Molteni and Palmer 1993). We chose three pairs of members (

*N*= 6) to implement ensemble forecasts for the Lorenz96 model. The use of more ensemble members only limitedly improves the prediction skill for both the BV and NLLV methods.

### b. The Lorenz63 model

The RMSE and pattern anomaly correlation (PAC) offer good measures of overall forecast performances (Buizza et al. 2005). Figures 4a and 4b shows the mean error (in the

### c. The Lorenz96 model

The Lorenz96 model is a relatively complex model that resembles a real NWP model and can be used to assess different aspects of forecast performance. As, in the real world, uncertainties in the initial states are inevitable, we assumed that the control analysis is the most real initial state we could obtain. Here, all results were derived from the comparison with true references.

#### 1) Overall measures of ensemble performance

Similarly to the Lorenz63 model, the mean RMSE and PAC were also used here to assess overall performance (Fig. 5). Besides, the ensemble spread is added in Fig. 5a as a comparison with the RMSE of ensemble mean. The spread is defined as the standard deviation of the perturbed ensemble members. The three ensemble schemes evidently outperform the single control forecast in ensemble skill. The gain in predictability from running an ensemble is about 12–24 h on forecast days 7–10. For the first few days (about 4 days), ensemble averages are indistinguishable from the control forecasts. The initial interval can be considered to be a linear stage, during which positive and negative twin perturbations cancel each other out almost completely. However, when the errors gradually increase and enter the nonlinear stage, ensemble averaging plays a much more important role in nonlinear filtering to reduce error growth.

The ensemble skill of the BVs did not show an evident improvement over that of the random vectors. The reason for this may be that during the initial linear stage, the random vectors have already pointed to growing directions through the dynamical flow during the forecast stage, which also spans the subspace of growing errors. However, random vectors have a lower correlation with the analysis errors than bred vectors, and so perform slightly less well than bred vectors, especially in the PAC score. Compared with these two schemes, the ensemble mean of NLLVs is more skillful for 5–10-day forecasts. This can be explained by the fact that the NLLVs sample analysis errors in more mutually orthogonal directions than do the other two schemes. The probability distribution of the analysis errors are more likely to be well simulated by the orthogonal basis the NLLVs form.

The spread is a measure of ensemble reliability. For a statistically reliable ensemble system, ensemble spread should be close to the error of the ensemble mean (Buizza et al. 2005). Figure 5 shows that the growth of the RMSE exceeds that of the spread for all three methods. The use of the comparison method in generating the ensemble perturbations allows the NLLV method to exhibit the largest perturbation growth during the first 2 days; after this, the random method exhibits the largest spread overall, but the spatial distribution of its spread has the lowest correlation with that of the ensemble forecast error (Fig. 6). The ensemble spread of the NLLVs remained larger than that of the BVs from days 1 to 10. This was attributed to the NLLVs sampling the analysis errors in directions much less correlated than used in the breeding schemes.

To distinguish the respective effects of the comparison and the orthogonalization processes on ensemble forecasting, three experiments were implemented and their results were listed in Table 3. By comparing experiments 1 and 2, the only use of the orthogonalization procedure could reduce about 3%–4% of the RMSE of ensemble mean relative to the BV method. The further use of the comparison method as in experiment 3 could have another nearly 1% improvement for the ensemble-mean skill. These results indicate that the global orthgonality is more critical than the comparison method for improving ensemble forecasting in the Lorenz96 model. But the two methods may have different relative performance based on different models and different assimilation systems. This needs a further investigation in future research.

Mean RMSE of ensemble mean for the Lorenz96 model at days 3, 6, and 9 for different experiments. Each experiment was implemented with different initial perturbations, including three bred vectors (expt 1), three orthogonal NLLVs (expt 2), and three orthogonal NLLVs selected from ten perturbations using the comparison method (expt 3).

We also compared the forecast error for the BV and NLLV methods at day 7 with the same number of ensemble members (Fig. 6). It can be seen that the ensemble-mean skill for the BVs and NLLVs are very close when two ensemble members are used. However, with the increase of the number of ensemble members, the forecast error of the NLLV and BV schemes are both decreased, and the performance of NLLVs is more significant than the BVs. It indicates that the addition of the rest NLLVs have larger contributions to the improvement of prediction skill than the BVs. It can be attributed to the orthogonality of the NLLVs that better spanning the fast-growing subspace though the rest of them point to mild-growing modes.

#### 2) The validation scores

Reliability and resolution are considered to be two important attributes of ensemble predictions (Murphy 1973). Here, two scores are used to assess the performances of the ensemble forecasts in statistical reliability and resolution. One is the spread–skill correlation and the other is the classical Brier score.

The first measure of statistical reliability focuses on spread and skill correlations. The average error of the ensemble-mean forecasts from the three schemes is used as a reference field to ensure fair comparisons; that is, the forecast skill is defined as the average distance of the respective ensemble-mean forecasts from the true state. Regions with high forecast skill (small average error) regularly have a relatively small spread and vice versa (Buizza et al. 2005). The spread–error correlation over the whole field (40 grids) was computed as a measure of transient consistency. The day-to-day variations of this measure for the BV, random perturbations, and NLLV methods are shown in Fig. 7. As a comparison, the results of the linear LLVs are also given. Similar to the NLLVs, they are also orthogonal perturbations, but they are obtained using an integration of the tangent linear model (TLM) of Lorenz96.

The forecast error field at time 0 can be considered to be the analysis error amplitude. As the ensemble perturbations are designed to represent analysis errors, the ensemble spread should have a similar pattern to the analysis distance field. As is indicated in Fig. 7, the NLLV scheme has the highest correlation coefficient at the initial time, which means that this method performed best at simulating the initial analysis errors. Furthermore, this method maintains its advantage through the whole forecast period. Because the effects of nonlinear terms were introduced in the NLLVs, the spread of the NLLV ensemble better reflected the nonlinear error growth of the ensemble mean than did the LLVs from 1 to 10 days. It is worth noting that the initial coefficient of the Monte Carlo method approximates to zero, which implies that random perturbations are almost incapable of capturing the mode of the analysis errors. The BV method performs better than the Monte Carlo method. The initial perturbations of these four schemes evolve to become increasingly similar to the forecast errors under the driving effects of background flow. The correlations between the spread and skill of these four schemes spontaneously peaks at about day 5, which may be due to reaching the end of the linear stage and the increasingly strong nonlinear effects. At long forecast ranges, the correlation between spread and skill must decrease to zero (Whitaker and Loughe 1998).

The other score used was the classical Brier score, which measures both the reliability and resolution of the ensemble predictions. This score assesses the performance of the probability forecast of the occurrence of a binary event

As is shown in Fig. 8a, for the basic Brier score, the NLLVs performed better than either the BVs or random vectors, and the advantage of BVs over random vectors was very small. Figure 8b presents the evolution of the two components of BS as a function of time. Overall, the best performance was achieved by the NLLV ensemble, and the figure also indicates that the superiority of the NLLVs over the BVs comes mainly from their reliability. It may be explained that the stronger correlations and smaller ensemble diversity of BVs result in synergic deviation of ensemble forecast states from the observed state. So the BV ensemble may not describe the probability of real events efficiently. On the other hand, the superiority of the NLLVs over random vectors comes mainly from their resolution. This phenomenon may arise because random members sample the true state from a largely unconstrained distribution, which, over longer periods, approximately provides the mean probability of the event occurring. Overall, the relative performance can be ranked as NLLV > BV > Random.

## 5. Discussion and conclusions

The NWP models have extremely high degrees of freedom, which leads to huge difficulties in sampling all of the directions in the phase spaces. Therefore, one can argue that it is appropriate to sample the directions in a subspace that is relatively more important. One proposal is to employ a breeding cycle similar to the analysis cycle to generate perturbations largely projecting onto the growing components of the analysis errors (Toth and Kalnay 1997). This method dramatically improved the level of the ensemble prediction using finite members and has become a useful operational forecast tool. However, independent breeding cycles may cause a certain degree of global similarity of bred vectors, especially in local regions, where the stronger correlations are in fact repeated sampling, which degrades the effectiveness of the ensemble. In the present paper, the NLLV ensemble generation technique was proposed to sample the probability distribution of analysis errors using a set of independent and orthogonal directions. Generally, compared to the BVs, the globally orthogonality of the NLLVs can be helpful to reduce the dependence among perturbations and increase the ensemble spread. The comparison method could also be used to optimally select the fastest-growing perturbations. Therefore, the NLLVs performed better than the BVs in simulate the analysis errors in the fastest-growing subspace.

The search for the fastest-growing subspace is improved by using more initial perturbations in the NLLV scheme than in the BV scheme. This increases the computing time (to about 4 times that required by the BV scheme for the Lorenz96 model). However, the breeding cycles of the NLLVs are independent of each other before orthogonalization and thus can be implemented in parallel to save computational time. Therefore, the increased computation cost of the NLLVs in comparison with the BVs is mainly attributed to the comparison and orthogonalization procedures. Our experiment using the Lorenz96 model has a computational expense for generating the NLLV initial perturbations that is only about 8% greater than that of generating the BV initial perturbations.

For the low-order models discussed here, the NLLV method has shown a relatively better performance than those of the BV and the Monte Carlo methods. The Monte Carlo method generally needs many more samples because of its defects in the recognition of the specification of initial errors, which leads to a large computational cost. On the other hand, random sampling degrades its resolution for actual events (Fig. 8b). The significant advantage of BVs is the ease of implementation. However, in our experiments with the Lorenz96 model, the BVs, which have stronger global similarity and smaller ensemble spread, may not describe the probability of real events efficiently and thus have lower reliability.

Our experiments provide a preliminary demonstration of the GSR method as a quick and effective method for separating independent ensemble modes in the Lorenz 3- and 40-variable models. A high-dimensional numerical model will greatly increase the dimension of the ensemble perturbations and also bring certain difficulties in utilizing the GSR method. However, the dimension of the fast-growing subspace is much smaller than the model dimension. Orthogonal perturbations can capture this subspace to the greatest extent, and orthogonalization is only needed to generate the first few fastest-growing NLLVs, thus greatly saving computation time. The present computing power is sufficient to satisfy the requirements of the orthogonalization in high-order models. Therefore, the NLLV scheme could be expected to be quick and effective for generating ensemble perturbations in a high-dimension numerical model. Nevertheless, some other techniques mentioned in the introduction have been proposed to increase the diversity of traditional BVs. These schemes may be able to improve the prediction skill of the traditional BV method. It is worthwhile investigating the performance of NLLVs and comparing with these methods in more complex models.

## Acknowledgments

We wish to thank Dr. Zhina Jiang for many useful comments and assistance with this work. This research was jointly supported by the NSFC projects (41375110, 41175069) and the 973 projects of China (2012CB955200 and 2010CB950400).

## APPENDIX A

### Ensemble Generation Scheme of Bred Vector

Initially, Toth and Kalnay (1993) used a global rescaling factor to generate each ensemble perturbation when using the BV method. In general, the value of the factor is the same size as the empirically determined global analysis RMSE (Wang and Bishop 2003). However, global rescaling is not optimal for reflecting the characteristics of the observations over local regions because it appears that regions where observations are limited usually have larger analysis errors, and larger regional perturbation amplitudes are therefore expected, and vice versa. Based on this relationship, Toth and Kalnay (1997) introduced the smooth regional rescaling factor.

## APPENDIX B

### The Ensemble Kalman Filter

The

## APPENDIX C

### Brier Score Decomposition

*k*th bin, then

*k*th bin, the probability of forecasts and observations read

*k*th bin;

It is

## REFERENCES

Anderson, J. L., 1997: The impact of dynamical constraints on the selection of initial conditions for ensemble predictions: Low-order perfect model results.

,*Mon. Wea. Rev.***125**, 2969–2983, doi:10.1175/1520-0493(1997)125<2969:TIODCO>2.0.CO;2.Anderson, J. L., 2001: An ensemble adjustment Kalman filter for data assimilation.

,*Mon. Wea. Rev.***129**, 2884–2902, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.Annan, J. D., 2004: On the orthogonality of bred vectors.

,*Mon. Wea. Rev.***132**, 843–849, doi:10.1175/1520-0493(2004)132<0843:OTOOBV>2.0.CO;2.Benettin, G., , L. Galgani, , A. Giorgilli, , and J. M. Strelcyn, 1980: Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Part 1: Theory.

,*Meccanica***15**, 9–20, doi:10.1007/BF02128236.Bishop, C., , B. J. Etherton, , and S. J. Majumdar, 2001: Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects.

,*Mon. Wea. Rev.***129**, 420–436, doi:10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.Buizza, R., 1997: Potential forecast skill of ensemble prediction, and spread and skill distributions of the ECMWF Ensemble Prediction System.

,*Mon. Wea. Rev.***125**, 99–119, doi:10.1175/1520-0493(1997)125<0099:PFSOEP>2.0.CO;2.Buizza, R., 2010: Horizontal resolution impact on short and long range forecast error.

,*Quart. J. Roy. Meteor. Soc.***136**, 1020–1035, doi:10.1002/qj.613.Buizza, R., , P. Houtekamer, , Z. Toth, , G. Pellerin, , M. Wei, , and Y. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems.

,*Mon. Wea. Rev.***133**, 1076–1097, doi:10.1175/MWR2905.1.Cai, M., , E. Kalnay, , and Z. Toth, 2003: Bred vectors of the Zebiak–Cane model and their potential application to ENSO predictions.

,*J. Climate***16**, 40–56, doi:10.1175/1520-0442(2003)016<0040:BVOTZC>2.0.CO;2.Cheng, Y., , Y. Tang, , P. Jackson, , D. Chen, , and Z. Deng, 2010: Ensemble construction and verification of the probabilistic ENSO prediction in the LDEO5 Model.

,*J. Climate***23**, 5476–5479, doi:10.1175/2010JCLI3453.1.Corazza, M., and et al. , 2003: Use of the breeding technique to estimate the structure of the analysis “errors of the day.”

,*Nonlinear Processes Geophys.***10**, 233–243, doi:10.5194/npg-10-233-2003.Descamps, L., , and O. Talagrand, 2007: On some aspects of the definition of initial conditions for ensemble prediction.

,*Mon. Wea. Rev.***135**, 3260–3272, doi:10.1175/MWR3452.1.Ding, R. Q., 2007:

*Nonlinear Error Dynamics and Predictability Study*(in Chinese). Chinese Academy of Sciences, 192 pp.Ding, R. Q., , and J. P. Li, 2007: Nonlinear finite-time Lyapunov exponent and predictability.

,*Phys. Lett.***364A**, 396–400, doi:10.1016/j.physleta.2006.11.094.Ding, R. Q., , J. P. Li, , and K. Ha, 2008: Trends and interdecadal changes of weather predictability during 1950s–1990s.

,*J. Geophys. Res.***113**, D24112, doi:10.1029/2008JD010404.Ding, R. Q., , J. P. Li, , and K. Seo, 2010: Predictability of the Madden–Julian oscillation estimated using observational data.

,*Mon. Wea. Rev.***138**, 1004–1013, doi:10.1175/2009MWR3082.1.Epstein, E. S., 1969: Stochastic dynamic prediction.

,*Tellus***21**, 739–759, doi:10.1111/j.2153-3490.1969.tb00483.x.Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation.

,*Ocean Dyn.***53**, 343–367, doi:10.1007/s10236-003-0036-9.Evensen, G., 2004: Sampling strategies and square root analysis schemes for the EnKF.

,*Ocean Dyn.***54**, 539–560, doi:10.1007/s10236-004-0099-2.Fraedrich, K., 1987: Estimating weather and climate predictability on attractors.

,*J. Atmos. Sci.***44**, 722–728, doi:10.1175/1520-0469(1987)044<0722:EWACPO>2.0.CO;2.Greybush, S. J., , E. Kanlnay, , M. J. Hoffman, , and R. J. Wilson, 2013: Identifying Martian atmospheric instabilities and their physical origins using bred vectors.

,*Quart. J. Roy. Meteor. Soc.***139**, 639–653, doi:10.1002/qj.1990.Houtekamer, P. L., , and J. Derome, 1994: Prediction experiments with two-member ensembles.

,*Mon. Wea. Rev.***122**, 2179–2191, doi:10.1175/1520-0493(1994)122<2179:PEWTME>2.0.CO;2.Houtekamer, P. L., , L. Lefaivre, , J. Derome, , H. Ritchie, , and H. L. Mitchell, 1996: A system simulation approach to ensemble prediction.

,*Mon. Wea. Rev.***124**, 1225–1242, doi:10.1175/1520-0493(1996)124<1225:ASSATE>2.0.CO;2.Kalnay, E., 2003:

Cambridge University Press, 341 pp.*Atmospheric Modeling, Data Assimilation and Predictability.*Lacarra, J.-F., , and O. Talagrand, 1988: Short-range evolution of small perturbations in a barotropic model.

,*Tellus***40A,**81–95, doi:10.1111/j.1600-0870.1988.tb00408.x.Leith, C. E., 1974: Theoretical skill of Monte Carlo forecasts.

,*Mon. Wea. Rev.***102**, 409–418, doi:10.1175/1520-0493(1974)102<0409:TSOMCF>2.0.CO;2.Li, J. P., , and J. F. Chou, 1997: Existence of atmosphere attractor.

*Sci. China,***40D,**215–224.Li, J. P., , and S. H. Wang, 2008: Some mathematical and numerical issues in geophysical fluid dynamics and climate dynamics.

,*Commun. Comput. Phys.***3**, 759–793.Li, J. P., , and R. Q. Ding, 2009: Studies of predictability of single variable from multi-dimensional chaotic dynamical system (in Chinese with English abstract).

,*Chin. J. Atmos. Sci.***33**, 551–556.Li, J. P., , and R. Q. Ding, 2011: Temporal–spatial distribution of atmospheric predictability limit by local dynamical analogues.

,*Mon. Wea. Rev.***139**, 3265–3283, doi:10.1175/MWR-D-10-05020.1.Li, J. P., , and R. Q. Ding, 2013: Temporal–spatial distribution of the predictability limit of monthly sea surface temperature in the global oceans.

,*Int. J. Climatol.***33**, 1936–1947, doi:10.1002/joc.3562.Li, J. P., , R. Q. Ding, , and B. H. Chen, 2006: Review and prospect on the predictability study of the atmosphere (in Chinese).

*Review and Prospects of the Developments of Atmosphere Sciences in Early 21st Century,*China Meteorology Press, 96–104.Lorenz, E. N., 1963a: Deterministic nonperiodic flow.

,*J. Atmos. Sci.***20**, 130–141, doi:10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.Lorenz, E. N., 1963b: The predictability of hydrodynamic flow.

*Trans. New York Acad. Sci. Ser. II,***25,**409–433.Lorenz, E. N., 1965: A study of the predictability of a 28-variable atmospheric model.

,*Tellus***17**, 321–333, doi:10.1111/j.2153-3490.1965.tb01424.x.Lorenz, E. N., 1996: Predictability: A problem partly solved.

*Proc. Seminar on Predictability,*Vol. I, Reading, United Kingdom, ECMWF, 1–18.Lorenz, E. N., , and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model.

,*J. Atmos. Sci.***55**, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.Molteni, F., , and T. N. Palmer, 1993: Predictability and finite time instability of the northern winter circulation.

,*Quart. J. Roy. Meteor. Soc.***119**, 269–298, doi:10.1002/qj.49711951004.Molteni, F., , R. Buizza, , T. N. Palmer, , and T. Petroliagis, 1996: The new ECMWF Ensemble Prediction System: Methodology and validation.

,*Quart. J. Roy. Meteor. Soc.***122**, 73–119, doi:10.1002/qj.49712252905.Murphy, A. H., 1973: A new vector partition of the probability score.

,*J. Appl. Meteor.***12**, 595–600, doi:10.1175/1520-0450(1973)012<0595:ANVPOT>2.0.CO;2.Primo, C., , M. A. Rodríguez, , and J. M. Gutiérrez, 2008: Logarithmic bred vectors. A new ensemble method with adjustable spread and calibration time.

,*J. Geophys. Res.***113**, D05116, doi:10.1029/2007JD008998.Pu, Z. X., , and J. Hacker, 2009: Ensemble-based Kalman filters in strongly nonlinear dynamics.

,*Adv. Atmos. Sci.***26**, 373–380, doi:10.1007/s00376-009-0373-9.Rui-Qiang, D., , L. Jian-Ping, , and H. Kyung-Ja, 2008: Nonlinear local Lyapunov exponent and quantification of local predictability.

,*Chin. Phys. Lett.***25**, 1919–1922, doi:10.1088/0256-307X/25/5/109.Stephenson, D. B., , C. A. S. Coelho, , and I. T. Jolliffe, 2008: Two extra components in the Brier score decomposition.

*Wea. Forecasting,***23,**752–757, doi:10.1175/2007WAF2006116.1.Toth, Z., , and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations.

,*Bull. Amer. Meteor. Soc.***74**, 2317–2330, doi:10.1175/1520-0477(1993)074<2317:EFANTG>2.0.CO;2.Toth, Z., , and E. Kalnay, 1997: Ensemble Forecasting at NCEP: The breeding method.

,*Mon. Wea. Rev.***125**, 3297–3318, doi:10.1175/1520-0493(1997)125<3297:EFANAT>2.0.CO;2.Trevisan, A., , and R. Legnani, 1995: Transient error growth and local predictability: A study in the Lorenz system.

,*Tellus***47A**, 103–117, doi:10.1034/j.1600-0870.1995.00006.x.Wang, X., , and C. Bishop, 2003: A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes.

,*J. Atmos. Sci.***60**, 1140–1158, doi:10.1175/1520-0469(2003)060<1140:ACOBAE>2.0.CO;2.Wei, M., , Z. Toth, , R. Wobus, , Y. Zhu, , C. H. Bishop, , and X. Wang, 2006: Ensemble transform Kalman filter-based ensemble perturbations in an operational global prediction system at NCEP.

,*Tellus***58A**, 28–44, doi:10.1111/j.1600-0870.2006.00159.x.Wei, M., , Z. Toth, , R. Wobus, , and Y. Zhu, 2008: Initial perturbations based on the ensemble transform (ET) technique in the NCEP global operational forecast system.

,*Tellus***60A**, 62–79, doi:10.1111/j.1600-0870.2007.00273.x.Whitaker, J. S., , and A. F. Loughe, 1998: The relationship between ensemble spread and ensemble mean skill.

,*Mon. Wea. Rev.***126**, 3292–3302, doi:10.1175/1520-0493(1998)126<3292:TRBESA>2.0.CO;2.Wolf, A., , J. B. Swift, , H. L. Swinney, , and J. A. Vastano, 1985: Determining Lyapunov exponents from a time series.

,*Physica D***16**, 285–317, doi:10.1016/0167-2789(85)90011-9.Yang, S.-C., , C. Keppenne, , M. Rienecker, , and E. Kalnay, 2009: Applications of coupled bred vectors to seasonal-to-interannual forecasting and ocean data assimilation.

,*J. Climate***22**, 2850–2870, doi:10.1175/2008JCLI2427.1.Yoden, S., , and M. Nomura, 1993: Finite-time Lyapunov stability analysis and its application to atmospheric predictability.

,*J. Atmos. Sci.***50**, 1531–1543, doi:10.1175/1520-0469(1993)050<1531:FTLSAA>2.0.CO;2Zhu, Y. J., , Z. Toth, , R. Wobus, , D. Richardson, , and K. Mylne, 2002: The economic value of ensemble-based weather forecasts.

,*Bull. Amer. Meteor. Soc.***83**, 73–83, doi:10.1175/1520-0477(2002)083<0073:TEVOEB>2.3.CO;2.Ziehmann, C., , L. A. Smith, , and J. Kurths, 2000: Localized Lyapunov exponents and the prediction of predictability.

,*Phys. Lett.***271A**, 237–251, doi:10.1016/S0375-9601(00)00336-4.