## 1. Introduction

Due to the chaotic nature of the atmosphere, initial condition and numerical modeling uncertainties rapidly amplify (Nicolis et al. 2009). Ensemble forecasting is an important way to estimate the prediction uncertainty and to provide probabilistic information on the occurrence of certain events (e.g., Buizza 2019). Traditional ensemble forecasting methods mainly address the impact of uncertainties in the initial conditions. The earliest such method is the Monte Carlo forecasting (MCF) method, which imposes random initial perturbations on the initial analysis field and produces a set of ensemble members to estimate the probability distribution of an event (Epstein 1969; Leith 1974). The MCF was an important step in moving the ensemble forecasting approach from basic research to operational practice. Subsequently, some studies suggested that growing-type initial perturbations offer a good description of the initial analysis errors (Mureau et al. 1993; Toth and Kalnay 1993, 1997), thereby overcoming the decay of initial perturbations experienced by the MCF. In particular, Toth and Kalnay (1993) developed the breeding method to find growing-type initial perturbations, the bred vectors (BVs), and applied it in the ensemble forecasting system at the National Centers for Environmental Prediction (NCEP) in 1992. Similarly, nonlinear local Lyapunov vectors (Feng et al. 2014, 2016, 2018) and backward Lyapunov vectors (e.g., Vannitsem and Duan 2020; Demaeyer et al. 2022) were successfully applied in the development of ensemble forecasts. The European Centre for Medium-Range Weather Forecasts (ECMWF) proposed an alternative method based on singular vectors (SVs; Mureau et al. 1993; Buizza and Palmer 1995; Molteni et al. 1996) and produced ensemble forecasts with great success. The SVs capture the optimal unstable growth of the initial analysis errors in the linearized regime.
However, they cannot cope with the impact of nonlinear physical processes on the amplification of the initial perturbations (Anderson 1997; Hamill et al. 2000).

Considering the limitations of the linear theory of SVs, Mu et al. (2003) proposed the conditional nonlinear optimal perturbation (CNOP), which is an extension of the leading SV in the nonlinear regime. CNOP fully considers the influence of nonlinear physical processes and represents the optimally growing initial perturbation in the nonlinear regime. Mu and Jiang (2008) replaced the leading SV with the CNOP to produce the initial perturbations of the ensemble forecasts, which demonstrated higher forecast skills than the SVs (see also Huo and Duan 2019; Zhou et al. 2021). To take into account multiple nonlinear processes in the development of the initial perturbations for ensemble forecasts, Duan and Huo (2016) formulated the orthogonal CNOPs (O-CNOPs) method to produce mutually independent nonlinear optimal initial perturbations for ensemble forecasting. The O-CNOPs have been shown to display a higher ensemble forecast skill than the SVs and BVs and a more reasonable ensemble spread for estimating the uncertainty in a hierarchy of models (Duan and Huo 2016; Huo et al. 2019; Wang and Duan 2019; Wang 2021).

The ensemble forecasting methods mentioned above focus on the initial uncertainties and are valid under a perfect model assumption. However, no model is perfect in reality. Several studies have indicated that accounting for model errors is important for improving the skill of numerical weather forecasting and climate prediction (Buizza et al. 1999; Palmer 2000; Orrell et al. 2001; Orrell 2005; Palmer et al. 2009; Duan et al. 2013; Vannitsem 2014). In view of this, new ensemble forecasting methods have been designed to address forecast uncertainties caused by model errors. For example, the ECMWF proposed the stochastically perturbed parameterization tendency scheme (SPPT; Buizza et al. 1999) and the stochastic kinetic energy backscatter scheme (SKEB; Shutts 2005), leading to important improvements in ensemble forecast skill (Berner et al. 2009; Palmer et al. 2009; Du et al. 2018; see also the special issue, Buizza 2019). Hou et al. (2006) developed a stochastic total tendency perturbation (STTP) scheme to emulate model uncertainties, which was implemented in the NCEP global ensemble forecasting system in February 2010 (see also Hou et al. 2008; Hou et al. 2010). In the STTP scheme, the stochastic forcing term, similar to the random initial perturbations in the MCF, does not fully capture the rapid unstable growth of model errors. As argued above, an ensemble forecasting system requires growing-type perturbations, which make the ensemble members deviate from the control forecast and reliably encompass the true value. Therefore, in an ensemble forecasting system incorporating the impact of model errors, the key question is how to generate realistic, rapidly growing model perturbations.

To obtain rapidly growing model perturbations, Barkmeijer et al. (2003) proposed the forcing singular vector (FSV), closely related to the SVs, which represents a rapidly growing constant tendency perturbation in a linear framework. This constant tendency perturbation describes the combined effects of the model systematic errors and the parts of the state-dependent model errors that are not explicitly described in the model equations (Feng and Duan 2013). Due to the limitation of the linear approximation in the FSV, Duan and Zhou (2013) proposed approaching this problem using the nonlinear forcing singular vector (NFSV). The NFSV is a tendency perturbation that makes the forecast deviate most significantly from the reference state; if used in an ensemble forecasting framework, it could better encompass the truth and provide more reliable ensembles. However, an ensemble forecasting system requires a set of perturbations. This can be achieved by formulating a new approach based on a set of orthogonal NFSVs, following the idea of the O-CNOPs developed in Duan and Huo (2016). These vectors, referred to as the O-NFSVs in the following, are introduced in section 2. The O-NFSVs provide mutually independent model tendency perturbations that enable the description of the forecast uncertainties caused by the model errors. However, in realistic forecasting systems, the effects of both the initial errors and the model errors, and especially of their interaction, are inevitable (Nicolis et al. 2009). The key question to address in a fully integrated forecasting system is how to combine the initial errors and the model errors correctly to obtain a reliable ensemble. Although ensemble forecasting systems exist that consider both the initial error and the model error effects (e.g., Buizza et al. 1999; Hou et al. 2010), they are built by superimposing independent initial perturbations (such as SVs, BVs, or others) and model tendency perturbations (e.g., SPPT or STTP). To date, no attention has been given to the dynamically coordinated growth of the initial and the model perturbations, which may limit the skill of the ensemble forecasts. To overcome this limitation, we further extend the O-NFSVs by developing the C-NFSVs, which combine the impacts of the initial errors and the model errors, and formulate a novel ensemble forecasting approach that is tested in a simple setting based on the Lorenz-96 model (Lorenz 1996).

The rest of this paper is organized as follows. In section 2, the C-NFSVs are introduced, together with the particular cases referred to as the O-NFSVs and O-CNOPs. In section 3, the Lorenz-96 model adopted in this study is described. In section 4, the experimental design is detailed, and in section 5, the results of the ensemble forecasting experiments are presented. Then, the possible usefulness of deep learning in emulating this type of ensemble forecast is discussed in section 6. Finally, a summary and discussion are provided in section 7.

## 2. The C-NFSVs method and its two particular cases of O-NFSVs and O-CNOPs

Consider the nonlinear dynamical system

∂**U**/∂*t* = *F*(**U**), **U**|_{*t*=0} = **U**_{0}, (2.1)

where **U**_{0} and **U**(**x**, *t*) represent the initial field and its evolution in time, respectively, and *F* is a nonlinear differential operator. Assuming that the dynamical system [Eq. (2.1)] and its initial field are exactly known, the state **U**(**x**, *t*) at a future time *T* is given by

**U**(**x**, *T*) = *M*_{T}(**U**_{0}), (2.2)

where *M*_{T} is the nonlinear propagator of Eq. (2.1).

Using **u**_{0} to represent the initial errors and tendency perturbations **f**(**x**, *t*) to describe the combined effect of different kinds of model errors (see Barkmeijer et al. 2003; Duan and Zhou 2013; Duan and Zhao 2014; Tao and Duan 2019; Tao et al. 2020), the equations of the forecast model can be written as

∂**U**/∂*t* = *F*(**U**) + **f**(**x**, *t*), **U**|_{*t*=0} = **U**_{0} + **u**_{0}, (2.3)

where **f** ∈ *R*^{n}. If one uses *M*_{T}(**f**)(⋅) to denote the propagator of Eq. (2.3) from the initial time *t* = 0 to the prediction time *t* = *T*, the corresponding prediction error (denoted by **u**_{T}) can be expressed as

**u**_{T} = *M*_{T}(**f**)(**U**_{0} + **u**_{0}) − *M*_{T}(**U**_{0}), (2.4)

where *M*_{T}(**U**_{0}) is the reference state (to be predicted), not contaminated by any errors.

Mu et al. (2003) proposed the CNOP under a perfect model assumption (i.e., **f** = **0**), which represents the initial error that causes the largest prediction error at the prediction time. On the other hand, Duan and Zhou (2013) developed the NFSV approach under a perfect initial condition assumption (i.e., **u**_{0} = **0**), which describes the model tendency error that causes the largest prediction error at the prediction time. Based on these two approaches and considering time-constant model errors (i.e., the tendency perturbation **f** is assumed to be constant), we can express the maximization problem leading to the C-NFSVs as follows:

*J*(**f**_{j}) = max_{**f**_{j} ∈ Ω_{j}} ‖*M*_{T}(**f**_{j})(**U**_{0} + *r***f**_{j}) − *M*_{T}(**U**_{0})‖, (2.5)

subject to the constraint conditions

‖*r***f**_{j}‖ ≤ *σ*_{I}, ‖**f**_{j}‖ ≤ *σ*_{f}, (2.6)

where *r***f**_{j}, **f**_{j}, and **u**_{T}(**f**_{j}) denote the initial perturbations, the tendency perturbations, and the departure from the reference state at time *T*, respectively; *σ*_{f} and *σ*_{I} are positive constant numbers that constrain the amplitudes of the tendency perturbations and the initial perturbations, respectively; and Ω_{j} is the subspace orthogonal to the previously obtained perturbations **f**_{1}, …, **f**_{j−1}. Note that *r***f**_{j} is the initial perturbation, where *r* is a scaling coefficient. Here, we assume that the initial errors and the model errors are closely related and amplify along the same direction in phase space. This simplification makes life easier but is also justified by the dynamics of the model errors, which are also amplified by the generic chaotic mechanism at the origin of the amplification of the initial errors and which, after some time, behave as an initial error perturbation (Nicolis 2003; Vannitsem 2006; Nicolis et al. 2009).

The C-NFSVs consist of mutually orthogonal initial perturbations and tendency perturbations, which coherently maximize the cost function at time *T* in Ω_{j}. In other words, the C-NFSVs coherently induce the largest perturbation evolutions [i.e., **u**_{T} in Eq. (2.4)]. Moreover, their corresponding objective function values rank as *J*(**f**_{1}) ≥ *J*(**f**_{2}) ≥ … ≥ *J*(**f**_{j}).

*O-NFSVs*: If the initial fields could be exactly known [i.e., **u**_{0} = **0** in Eq. (2.4) or *r* = 0 in Eq. (2.5)], the prediction errors would only be caused by the model errors **f**. Then, Eq. (2.5) becomes

*J*(**f**_{j}) = max_{**f**_{j} ∈ Ω_{j}, ‖**f**_{j}‖ ≤ *σ*_{f}} ‖*M*_{T}(**f**_{j})(**U**_{0}) − *M*_{T}(**U**_{0})‖, (2.7)

which defines the O-NFSVs and generalizes the NFSV [i.e., *j* = 1 in Eq. (2.7)] to mutually orthogonal subspaces of the model tendencies.

*O-CNOPs*: If the model is considered perfect (i.e., **f** = **0**) and only initial errors **u**_{0} are considered, Eq. (2.5) can be rewritten as

*J*(**u**_{0,j}) = max_{**u**_{0,j} ∈ Ω_{j}, ‖**u**_{0,j}‖ ≤ *σ*_{I}} ‖*M*_{T}(**U**_{0} + **u**_{0,j}) − *M*_{T}(**U**_{0})‖, (2.8)

which yields the O-CNOPs defined by Duan and Huo (2016). For *j* = 1, the resultant initial perturbation is the CNOP proposed by Mu et al. (2003). The O-CNOPs have shown better performance than the singular vectors and bred vectors in ensemble forecasts of typhoon tracks (Huo and Duan 2019; Huo et al. 2019).

To generate the C-NFSVs and their particular cases, one should first solve the optimization problems given in Eqs. (2.5), (2.7), and (2.8). This is done using a nonmonotonic spectral projected gradient solver [SPG2; see Birgin et al. (2000) for details], which requires the gradients of the cost function with respect to the initial perturbation and the tendency perturbation. For the calculation of the C-NFSVs, we refer to Duan and Zhou (2013) for the computation of the gradient of the cost function with respect to both the initial perturbations and the tendency perturbations; this computation is also reproduced in appendix A. With this gradient information, the SPG2 solver iterates along the gradient direction, subject to the constraints, to compute the C-NFSVs.
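For illustration, the following Python sketch solves a scaled-down version of this optimization for the leading tendency perturbation only [the *j* = 1 case of Eq. (2.7)], using the Lorenz-96 dynamics of section 3 as the toy model. It is a minimal stand-in, not the authors' implementation: finite-difference gradients replace the adjoint, plain projected gradient ascent replaces SPG2, and all parameter values are illustrative.

```python
import numpy as np

def l96_rhs(x, F=8.0):
    # Lorenz-96 right-hand side (section 3); cyclic indexing via np.roll
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def propagate(x0, f, steps, dt=0.05):
    # RK4 propagator with a constant tendency perturbation f added to
    # the model tendency; f = 0 recovers the unperturbed forecast.
    x = x0.copy()
    for _ in range(steps):
        rhs = lambda y: l96_rhs(y) + f
        k1 = rhs(x); k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2); k4 = rhs(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

def project(f, sigma):
    # Projection onto the constraint ball ||f||_2 <= sigma
    n = np.linalg.norm(f)
    return f if n <= sigma else f * (sigma / n)

def leading_nfsv(x0, sigma_f, steps=40, iters=20, eps=1e-4, lr=0.05):
    # Maximize J(f) = ||M_T(f)(U0) - M_T(U0)||_2 subject to
    # ||f||_2 <= sigma_f, i.e., the leading tendency-only case.
    ref = propagate(x0, np.zeros_like(x0), steps)
    cost = lambda f: np.linalg.norm(propagate(x0, f, steps) - ref)
    f = project(0.01 * np.random.default_rng(0).standard_normal(x0.size), sigma_f)
    for _ in range(iters):
        grad = np.zeros_like(f)
        for i in range(f.size):              # finite-difference gradient
            e = np.zeros_like(f); e[i] = eps
            grad[i] = (cost(f + e) - cost(f - e)) / (2 * eps)
        f = project(f + lr * grad, sigma_f)  # projected gradient ascent
    return f, cost(f)
```

Extending this sketch to the full C-NFSVs would additionally require the initial perturbation *r***f**, the orthogonality constraint across members, and the adjoint-based gradients of appendix A.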

Note that accurate adjoint-based gradient information is used in the present study to compute the C-NFSVs because of the simplicity of the Lorenz-96 model (see section 3). There also exist adjoint-free algorithms that can be used to compute the C-NFSVs for complex models, such as the particle swarm optimization algorithm and the genetic algorithm. These algorithms have been successfully applied in the calculation of the CNOP and NFSV mentioned above [see the review by Wang et al. (2020)].

## 3. Lorenz-96 model

The Lorenz-96 model (Lorenz 1996) is governed by

d*X*_{j}/d*t* = (*X*_{j+1} − *X*_{j−2})*X*_{j−1} − *X*_{j} + *F*, *j* = 1, 2, …, *m*, (3.1)

where the *X*_{j} satisfy the cyclic boundary conditions, i.e., *X*_{−1} = *X*_{m−1}, *X*_{0} = *X*_{m}, and *X*_{m+1} = *X*_{1}. They can be thought of as representing nondimensional meteorological quantities (e.g., temperature, pressure, vorticity, or gravitational potential) that are equally spaced along a latitudinal circle. The linear terms and the constant term *F* describe the internal dissipation of the atmosphere and the external forcing, respectively.

Throughout the present study, the reference model dimension and forcing term are chosen as *m* = 40 and *F* = 8, displaying sensitivity to initial conditions (Lorenz 1996; Van Kekem 2018). The Lorenz-96 model with this configuration is integrated using a fourth-order Runge–Kutta scheme with a nondimensional time step of 0.05 time units. Considering a dimensional time unit of 5 days, the dissipative decay time of the system is approximately one time unit, and the error-doubling time is approximately 0.4 time units. These properties of the Lorenz-96 model are consistent with realistic numerical weather forecast models (Lorenz and Emanuel 1998). Furthermore, when the system has reached its attractor, the mean and standard deviation of the variable *X _{j}* are approximately equal to 2.3 and 3.6, respectively. This system has been widely used in theoretical studies of error dynamics (Vannitsem and Toth 2002; Orrell 2005), data assimilation (Whitaker and Hamill 2002; Hunt et al. 2004; Bai et al. 2013) and adaptive observation (Lorenz and Emanuel 1998; Khare and Anderson 2006). This model has also been often regarded as a platform to explore the usefulness of new ensemble forecasting approaches (Descamps and Talagrand 2007; Revelli et al. 2010; Basnarkov and Kocarev 2012; Feng et al. 2016; Grudzien et al. 2020). The Lorenz-96 model with the above configuration is therefore also used here for examining the possible impact of using the C-NFSVs for ensemble forecasting.
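As a concreteness check of this configuration, the following Python sketch (our own, with illustrative spin-up and sampling lengths) integrates Eq. (3.1) with *m* = 40 and *F* = 8 using a fourth-order Runge–Kutta scheme and a 0.05 time step; the sampled attractor statistics should land near the quoted mean of 2.3 and standard deviation of 3.6.

```python
import numpy as np

def lorenz96_rhs(x, F=8.0):
    # Eq. (3.1): dX_j/dt = (X_{j+1} - X_{j-2}) X_{j-1} - X_j + F,
    # with cyclic boundary conditions handled by np.roll.
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    # Fourth-order Runge-Kutta with the paper's nondimensional step of 0.05
    k1 = lorenz96_rhs(x, F)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, F)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, F)
    k4 = lorenz96_rhs(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# m = 40, F = 8 configuration; spin up onto the attractor, then sample
m = 40
x = 8.0 * np.ones(m)
x[0] += 0.01                   # small perturbation of the unstable fixed point
for _ in range(2000):          # spin-up (100 time units)
    x = rk4_step(x)
samples = []
for _ in range(20000):         # 1000 time units on the attractor
    x = rk4_step(x)
    samples.append(x.copy())
samples = np.asarray(samples)
```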

## 4. Experimental strategy

The forecast model is given by Eq. (3.1). Random numbers *η*_{j} (*j* = 1, 2, …, 40) selected from a normal distribution *N*(**0**, **I**) are supposed to describe the model errors. Then, the perfect system can be written as follows:

d*X*_{j}/d*t* = (*X*_{j+1} − *X*_{j−2})*X*_{j−1} − *X*_{j} + *F* + *η*_{j}, *j* = 1, 2, …, 40. (4.1)

Equation (4.1) is integrated for a long time to generate a time series of **X**, where **X** = (*X*_{1}, *X*_{2}, …, *X*_{40}). From the time series, we take the state values of **X** every 1460 time steps (i.e., one year) as the initial values and their respective subsequent 12-day (i.e., 48 time steps) evolutions as truth runs. Thus, a total of 200 truth runs are selected in the present study.

The “observations” used are artificial and obtained by adding random Gaussian noise whose standard deviation equals 27% of the standard deviation of *X*_{j} within the time period of the data assimilation cycle. The “observations” are assimilated by applying four-dimensional variational data assimilation (4D-Var) to the imperfect Lorenz-96 model [i.e., Eq. (3.1)], yielding the optimal initial field at the initial time of the forecast. Starting from these initial fields, the Lorenz-96 model is integrated for 12 days, providing the control forecasts associated with the 200 truth runs. Note that these control forecasts are contaminated by both the initial analysis errors and the model tendency errors.
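A minimal sketch of this data-generation step (the 4D-Var analysis itself is omitted; the seed, the spin-up length, and the hard-coded climatological standard deviation of ≈3.6 are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def l96_rhs(x, eta=0.0, F=8.0):
    # Forecast model (eta = 0) vs. the "perfect" system of Eq. (4.1),
    # whose tendency is shifted by the fixed random vector eta.
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F + eta

def rk4(x, eta=0.0, dt=0.05):
    k1 = l96_rhs(x, eta); k2 = l96_rhs(x + 0.5 * dt * k1, eta)
    k3 = l96_rhs(x + 0.5 * dt * k2, eta); k4 = l96_rhs(x + dt * k3, eta)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

m = 40
eta = rng.standard_normal(m)       # model-error tendency drawn from N(0, I)
x = 8.0 * np.ones(m) + 0.01 * rng.standard_normal(m)
for _ in range(2000):              # spin-up of the "true" system
    x = rk4(x, eta)

# One truth run: 48 steps of 6 h, i.e., 12 days, per the paper's setup
truth = [x.copy()]
for _ in range(48):
    x = rk4(x, eta)
    truth.append(x.copy())
truth = np.asarray(truth)

# Artificial observations: truth plus Gaussian noise whose standard
# deviation is 27% of the climatological std of X_j (~3.6 in the paper)
obs_sigma = 0.27 * 3.6
obs = truth + obs_sigma * rng.standard_normal(truth.shape)
```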

The computation of the C-NFSVs requires specifying the constraint radii [i.e., *σ*_{I} and *σ*_{f} in Eq. (2.6)] and the optimization time periods [0, *T*] (see section 2), which are also two important factors affecting the ensemble forecast skills. As such, we take different combinations of them to compute the C-NFSVs (see Table 1). As in Orrell (2002), the maximal amplitude of the tendency perturbations is computed as

*δ*_{f} = max_{j} ‖**x**(*t*_{j} + Δ*t*) − **s**(*t*_{j} + Δ*t*)‖_{2}/Δ*t*, (4.2)

where **x**(*t*_{j} + Δ*t*) denotes the truth state at time *t*_{j} + Δ*t* obtained using Eq. (4.1), **s**(*t*_{j} + Δ*t*) is its forecast at time *t*_{j} + Δ*t* obtained using Eq. (3.1) with the initial value being the true state **x**(*t*_{j}), and ‖⋅‖_{2} is the L2 norm. The maximal amplitude of the initial perturbations is fixed as

*δ*_{I} = ‖**a**(*t*_{0}) − **x**(*t*_{0})‖_{2}, (4.3)

where **a**(*t*_{0}) is the initial analysis field of the control forecast and **x**(*t*_{0}) is the initial field of the truth run. Three kinds of constraint radii *σ*_{I} and *σ*_{f}, i.e., 0.6*δ*_{I} and 0.6*δ*_{f}, 0.8*δ*_{I} and 0.8*δ*_{f}, and *δ*_{I} and *δ*_{f}, are experimentally selected for the initial and tendency perturbations, which, together with the four selected optimization times of 3, 4, 5, and 6 days, lead to a total of 12 combinations (i.e., *E*_{i}, *i* = 1, 2, 3, …, 12 in Table 1). Note that these constraint radii are chosen for the purpose of analyzing their impact, while in a realistic forecast, they should be estimated using Eqs. (4.2) and (4.3) from historical observations and their hindcasts, assuming the former are truth runs and the latter are control forecasts. For each of the truth runs, the control forecast is regarded as the basic state around which 20 orthogonal C-NFSVs are computed for each *E*_{i}; the ensemble size of 20 C-NFSVs is selected experimentally, as it almost provides the highest ensemble forecast skill. These 20 C-NFSVs are superimposed on the initial field and model tendency of the control forecast as positive and negative perturbation pairs and integrated with the model equation (3.1), leading to 40 perturbed forecasts which, together with the control forecast, generate 41 ensemble forecast members for each forecast date.
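Schematically, the two constraint radii can be estimated as follows (a sketch under our reading of Eqs. (4.2) and (4.3); the function names and the Δ*t* normalization of the drift are our own):

```python
import numpy as np

def tendency_error_amplitude(truth_pairs, forecast_model, dt=0.05):
    # Eq. (4.2)-style estimate: for each true state x(t_j), step the
    # forecast model once and measure the drift from the true state at
    # t_j + dt, normalized by dt (after Orrell 2002); take the maximum.
    drifts = [np.linalg.norm(x_next - forecast_model(x_now)) / dt
              for x_now, x_next in truth_pairs]
    return max(drifts)

def initial_error_amplitude(analysis, truth0):
    # Eq. (4.3)-style estimate: L2 distance between the analysis
    # field a(t0) and the true initial field x(t0).
    return np.linalg.norm(analysis - truth0)
```

In a realistic setting, `truth_pairs` would be built from historical observations and `forecast_model` from the hindcast system, as noted above.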

Table 1. Twelve combinations of the optimization time period *T* and the constraint radius. *δ*_{I} denotes the magnitude of the initial analysis error measured by the L2 norm, *δ*_{f} is the magnitude of the tendency error measured by the L2 norm, and *E*_{i} (*i* = 1, 2, 3, …, 12) represents the combination of the four optimization time periods and three constraint radii.

Table 2. Two-by-two contingency table of a binary event.

The evaluation of the quality of an ensemble forecast system is often performed by analyzing the skill of the ensemble mean in a deterministic way and by evaluating the skill of the probabilistic forecasts themselves. The information extracted illustrates different aspects of the performance of the ensemble forecast system. In the present study, the root-mean-square error (RMSE) and the anomaly correlation coefficient (ACC) between the mean of the ensemble members and the true state are used to assess the quality of the deterministic forecast. The Brier score (BS; Brier 1950) and the relative operating characteristic curve area (ROCA; Mason 1982) are adopted to measure the probabilistic forecast skill of binary events. Here, we define the following two categories of events: higher-frequency events (event 1) with *X*_{j} > 2 and lower-frequency events (event 2) with *X*_{j} > 5; the two events occur with frequencies of 0.53 and 0.25, respectively. The details of these four scores are described in appendixes B, C, D, and E. Note that the RMSE and BS are negatively oriented (i.e., the smaller the value, the higher the ensemble forecast skill), while the ACC and ROCA are positively oriented (i.e., the larger the value, the higher the ensemble forecast skill).
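For reference, a minimal Python sketch of the two probabilistic scores for a binary event (our own illustrative implementation; the appendixes give the exact definitions used in the paper):

```python
import numpy as np

def event_probability(ensemble, threshold):
    # Forecast probability of the binary event X > threshold:
    # the fraction of ensemble members in which the event occurs.
    # ensemble has shape (n_members, n_cases).
    return (ensemble > threshold).mean(axis=0)

def brier_score(p, o):
    # BS (Brier 1950): mean squared difference between the forecast
    # probability p and the binary outcome o (0/1); negatively oriented.
    return float(np.mean((p - o) ** 2))

def roc_area(p, o, n_levels=10):
    # ROCA (Mason 1982): area under the hit-rate vs false-alarm-rate
    # curve as the probability decision threshold is varied.
    hr, far = [], []
    for th in np.linspace(1.0, 0.0, n_levels + 1):
        warn = p >= th
        hits = np.sum(warn & (o == 1)); miss = np.sum(~warn & (o == 1))
        fa = np.sum(warn & (o == 0)); cn = np.sum(~warn & (o == 0))
        hr.append(hits / max(hits + miss, 1))
        far.append(fa / max(fa + cn, 1))
    area = 0.0
    for k in range(len(hr) - 1):      # trapezoidal integration
        area += (far[k + 1] - far[k]) * (hr[k + 1] + hr[k]) / 2.0
    return area
```

A perfectly sharp and reliable forecast gives BS = 0 and ROCA = 1; a constant climatological probability gives a BS equal to the outcome variance.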

## 5. Results

In this section, the experimental strategy described in the last section is conducted, and the role of C-NFSVs in improving forecast skills is evaluated. Its performance is then compared with the ensemble forecasting based on O-CNOPs and O-NFSVs, emphasizing the importance of simultaneously considering the initial and model errors in the maximization problem leading to the C-NFSVs.

### a. The ensemble forecast skill

The C-NFSVs experiments are listed in Table 1. The O-CNOPs and O-NFSVs have been designed in such a way that they have the same amplitudes as the C-NFSVs. For all forecasting dates and lead times ranging from 6 h to 12 days, we compute the RMSE and ACC of the ensemble mean (deterministic) forecasts for each *E _{i}*, together with the BS and ROCA of the two binary events for the probabilistic forecasts. Figure 1 plots the RMSE, ACC, BS, and ROCA averaged over all the lead times for all the forecast dates.

In Fig. 1, it can be noticed that, for any kind of perturbation, the associated ensemble forecasts tend to achieve the highest skill when the perturbation amplitudes constrained by *E*_{i} are close to the amplitudes of the initial analysis errors defined in Eq. (4.3) and/or the tendency errors defined in Eq. (4.2). This suggests that if one adopts the O-CNOPs to estimate the initial error impact on the ensemble forecasts, or the O-NFSVs to measure the model error impact, their amplitudes should be close to those of the initial analysis errors or tendency errors, respectively. If one adopts the C-NFSVs to consider the impact of both the initial and model errors, their optimal amplitudes are slightly smaller than those of the pure initial analysis errors or tendency errors, and they provide higher skills as measured by the RMSE and ACC for the deterministic forecasts, or by the BS and ROCA for the probabilistic forecasts.

Figure 1 also indicates that the skill based on the O-NFSVs gradually increases with the optimization time intervals and constraint radius. However, for the ensemble forecasts based on the O-CNOPs, the skill undergoes a fast increase first and then a slow increase when the optimization time intervals are increased for small initial perturbations. For large initial perturbations, there is no substantial change in skill. In fact, the forecast model adopted here includes a model error −*η _{j}*, superimposed at each time step of the model integration, so it accumulates, leading to a more significant impact at long lead times. Therefore, the O-NFSVs with longer optimization time intervals may be better for grasping the impact of model errors for increasing lead times. The corresponding ensemble forecasts with larger optimization time intervals may then exhibit higher skills, as shown in Fig. 1. For the O-CNOPs, since they only optimize the impact of the initial uncertainties, they may not be very sensitive to the impact of the model errors at long lead times, implying a reduced sensitivity of the skill to the optimization time interval.

If we now compare the ensemble forecasting systems based on the O-CNOPs, O-NFSVs, and C-NFSVs for the same optimization time interval, we find that the ensemble forecasts generated by the C-NFSVs often achieve the highest skill. This means that the ensemble forecasts generated by the C-NFSVs have a much larger capacity to improve the forecast skill over the control forecast than those based on the O-NFSVs and O-CNOPs, which is because the C-NFSVs combine the impact of both errors.

To further clarify the dependence of the skill on different lead times, Fig. 2 displays the differences in performance measures at each lead time for the highest-skill experiments (see Fig. 1). It is shown that the differences between the O-NFSVs (O-CNOPs) and the C-NFSVs are positive for the RMSE and BS and negative for the ACC and ROCA at most lead times, further indicating that the ensemble forecasts based on the C-NFSVs possess a higher skill than those made by the O-CNOPs and O-NFSVs. When comparing the O-CNOPs and O-NFSVs, Fig. 2 further indicates that in the early stage of the forecasts, the ensembles based on the O-NFSVs have a lower skill than those based on the C-NFSVs and O-CNOPs, while for increasing lead times, the ensemble forecasts based on the O-NFSVs gradually improve compared with those based on the O-CNOPs and the C-NFSVs. This result indicates that the impact of the initial errors dominates the forecast error in the early stage of the forecasts, while at longer lead times the impact of the model errors gradually increases, and the O-NFSVs start to exert a stronger influence on the forecast uncertainties, with a progressive convergence toward the skill of the C-NFSVs.

Fig. 2. The difference in skills between the ensemble forecasts generated by O-CNOPs and those generated by C-NFSVs (red) and between the ensemble forecasts made by O-NFSVs and those made by C-NFSVs (blue). The horizontal axis denotes the lead time, and the vertical axis represents the difference in the RMSE, ACC, BS, and ROCA values.

Citation: Monthly Weather Review 150, 11; 10.1175/MWR-D-22-0007.1


### b. Reliability

The above results have shown that the C-NFSVs achieve higher skills than the O-NFSVs and O-CNOPs when both the initial and model errors are present. In this section, we explore the reliability of the probabilistic aspects of the ensemble forecasting system. In principle, a reliable ensemble should display a ratio between the RMSE of the ensemble mean and the spread of the ensemble members (i.e., the square root of the average of the variances of the ensemble forecast members around their mean; see appendix B) approaching 1 during the entire forecast time period (Bowler 2006; Leutbecher and Palmer 2008; Hopson 2014; Fortin et al. 2014), indicating that the spread is a good proxy for the ensemble mean forecast error. Figure 3 shows the ratio of the ensemble spread to the RMSE for the C-NFSVs, O-CNOPs, and O-NFSVs when they achieve the highest skill in their respective settings (see Table 1 and Fig. 1): (i) the O-CNOPs are computed by taking the amplitude *δ*_{I} and an optimization time of 5 days; (ii) the O-NFSVs are obtained with the amplitude *δ*_{f} and an optimization time of 6 days; and (iii) the C-NFSVs are obtained by using the initial perturbation amplitude 0.8*δ*_{I} and the tendency perturbation amplitude 0.8*δ*_{f}, together with an optimization time of 6 days. Note that the ratio in Fig. 3a is averaged over all initial states and relevant variables, while in Fig. 3b, the focus is on the individual variables, and the average is taken over the forecasting period. For the latter, it is assumed that the *X*_{j} in the Lorenz-96 model describe the variables at the grid points along a latitudinal circle (see section 3), and thus, the figure shows the spatial distribution of the ratio of the ensemble spread to the RMSE. Both ratios in Figs. 3a,b show coherent results: the ensemble spread made by the C-NFSVs is much closer to the RMSE of the ensemble mean forecasts, indicating that the C-NFSVs provide a better estimate of the uncertainty both in time and in space.
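The spread-skill ratio used here can be computed as follows (a sketch with our own array conventions; appendix B gives the exact definition of the spread):

```python
import numpy as np

def spread_rmse_ratio(ensemble, truth):
    # ensemble: (n_members, n_times, n_vars); truth: (n_times, n_vars).
    # Spread: square root of the mean variance of the members around
    # the ensemble mean; RMSE: error of the ensemble-mean forecast.
    # A ratio near 1 at all lead times indicates a reliable ensemble.
    mean = ensemble.mean(axis=0)
    spread = np.sqrt(((ensemble - mean) ** 2).mean(axis=0).mean(axis=-1))
    rmse = np.sqrt(((mean - truth) ** 2).mean(axis=-1))
    return spread / rmse
```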

Fig. 3. (a) Temporal and (b) spatial variability of the ratio of the ensemble spread to the RMSE for the ensemble forecasts made by C-NFSVs (green), O-NFSVs (blue), and O-CNOPs (red). The O-CNOPs are calculated by using the amplitude *δ*_{I} and an optimization time of 5 days; the O-NFSVs are obtained with the amplitude *δ*_{f} and an optimization time of 6 days, while the C-NFSVs are obtained by using the initial perturbation amplitude 0.8*δ*_{I} and a tendency perturbation amplitude of 0.8*δ*_{f} for an optimization time of 6 days.


The Talagrand diagram (or rank histogram) is also a measure of the reliability for the ensemble forecasting system (Talagrand et al. 1997; Candille and Talagrand 2005). Specifically, a reliable ensemble has a flat histogram, indicating that the observation is indistinguishable from any member of the ensemble forecast. Figure 4 displays the Talagrand diagram of the C-NFSVs, O-CNOPs, and O-NFSVs at varying lead times. It is shown that the histograms for the C-NFSVs are much flatter than those for the O-NFSVs and O-CNOPs. This suggests that the ensemble forecasting made by the C-NFSVs is more reliable than those based on the O-CNOPs and O-NFSVs in the Lorenz-96 model.
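A rank (Talagrand) histogram can be computed as follows (our own minimal sketch; ties between members and the observation are ignored, which is harmless for continuous variables):

```python
import numpy as np

def rank_histogram(ensemble, obs):
    # ensemble: (n_members, n_cases); obs: (n_cases,).
    # Rank of the verifying value among the ensemble members; a
    # reliable ensemble yields a flat histogram over the
    # n_members + 1 bins (the red line in a Talagrand diagram).
    ranks = (ensemble < obs).sum(axis=0)
    counts = np.bincount(ranks, minlength=ensemble.shape[0] + 1)
    return counts / counts.sum()
```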

Fig. 4. Talagrand diagrams for the ensemble forecasts made by (top) C-NFSVs, (middle) O-CNOPs, and (bottom) O-NFSVs with configurations of the perturbation amplitudes and optimization times as in Fig. 3a, at lead times of (from left to right) 2, 4, 6, 8, and 10 days. The red horizontal lines denote the expected value of the probability.



The perturbation versus error correlation analysis (PECA; see appendix F), another popular score, is also used to evaluate the quality of the ensembles (Wei and Toth 2003; Buizza et al. 2005; Wei et al. 2008). The higher the PECA values of the individual (or optimally combined) ensemble members are, the more successful the ensemble is in achieving its goal of capturing forecast errors. Figure 5 plots the PECA and indicates that, for either the optimally combined ensembles or the individual ones, the C-NFSVs and O-CNOPs tend to explain a larger amount of the forecast errors present in the control forecasts than the O-NFSVs in the early stage of the forecasts. For longer lead times, the ability of the O-NFSVs to explain the forecast error variances overtakes that of the O-CNOPs. In any case, the C-NFSVs almost always possess the largest PECA value and explain the largest amount of the forecast error variances of the control forecasts. The interpretation provided in section 5a is still valid: the impact of the initial errors dominates the forecast error in the early stage of the forecasts that the O-CNOPs capture well, while for longer lead times, the O-NFSVs start to exert a stronger influence on the forecast uncertainties, with a progressive convergence to the optimal skill of the C-NFSVs. All these results indicate that the ensemble members generated by the C-NFSVs are much better for capturing the forecast errors and lead to an enhanced ensemble quality.
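A minimal sketch of PECA for a single lead time (our own array conventions; appendix F gives the exact definition, and the "optimal combination" is implemented here as a least-squares projection of the error onto the perturbation subspace):

```python
import numpy as np

def peca(perturbations, error):
    # perturbations: (n_members, n_vars), each row a forecast
    # perturbation (perturbed member minus control forecast);
    # error: (n_vars,), the control forecast error at the same lead.
    def corr(a, b):
        a = a - a.mean(); b = b - b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    individual = np.array([corr(p, error) for p in perturbations])
    # Optimal combination: least-squares fit of the error within the
    # span of the perturbations, then its correlation with the error.
    coef, *_ = np.linalg.lstsq(perturbations.T, error, rcond=None)
    combined = corr(perturbations.T @ coef, error)
    return individual.mean(), combined
```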

The PECA values for C-NFSVs (green), O-NFSVs (blue), and O-CNOPs (red). The solid lines are the PECA values averaged over the 40 individual perturbation members, and the dashed lines represent the PECA values for the optimal combinations of the 40 perturbation members.

### c. Impact of the dynamics of the C-NFSVs on the forecast skill

The dynamically coordinated growth of the initial and tendency perturbations is important in achieving a high skill in the current ensemble forecasting system. To clarify this, we conduct the following two sets of experiments. The first set consists of perturbing the system by combining the O-CNOPs and O-NFSVs in ensemble forecasting mode in the configurations in which they provide the highest skill. Specifically, the O-CNOPs are obtained with the initial perturbation amplitude *δ*_{I} and an optimization time interval of 5 days, while the O-NFSVs are the ones for which the tendency perturbation amplitude is constrained by *δ*_{f} and the optimization time interval is fixed to 6 days (see Fig. 1). These skills are compared with those of the C-NFSVs in their best configuration (i.e., the initial perturbation amplitude with a radius of 0.8*δ*_{I} and the tendency perturbation amplitude with a radius of 0.8*δ*_{f}, together with an optimization time interval of 6 days; see Fig. 1). The second set of experiments combines O-CNOPs and O-NFSVs with the same optimization time interval and amplitude of perturbations as in the C-NFSVs of the first set. Figure 6 shows the comparisons of these two sets of experiments. The ensemble forecasts made by the C-NFSVs consistently provide, on average, a smaller RMSE and BS and a larger ACC and ROCA than those made by the combined modes, which indicates that the C-NFSVs possess a higher ensemble forecast skill than the above kinds of combined modes. This also suggests that the C-NFSVs are not a simple superposition of O-CNOPs and O-NFSVs but possess dynamical features that lead to a higher forecast skill. In other words, a simple combination of initial perturbations and tendency perturbations may cause inconsistent dynamical behaviors between them and degrade the ensemble forecasting skill.

As in Fig. 2, but showing the skill performance differences between the ensemble forecasts made by the combination of O-CNOPs and O-NFSVs when they obtain the highest skill scores and those made by C-NFSVs (red), and between the combination of O-CNOPs and O-NFSVs with the same optimization time period and perturbation amplitudes as in C-NFSVs and those made by C-NFSVs (blue).

## 6. Tests of the use of deep learning to improve the applicability of C-NFSVs for real-time forecasts

The computation cost of ensemble forecasting systems is a challenge in an operational chain. The ensemble forecast method introduced in the present study builds its members from C-NFSVs computed with the SPG2 algorithm, which requires multiple iterations along the steepest descent direction of the gradient (see section 2) and hence a large amount of computation time. In this section, we propose an alternative way to estimate the evolution of the ensemble members generated by the C-NFSVs rather than to compute the C-NFSVs themselves, which would allow for a real-time implementation.

Deep learning (DL) techniques have been applied in weather and climate forecasting (Salman et al. 2015; Scher 2018), statistical postprocessing (Rasp and Lerch 2018; Scheuerer et al. 2020; Veldkamp et al. 2021; Vannitsem et al. 2021, for a recent review), and uncertainty prediction (Scher and Messori 2018), and have shown strong nonlinear fitting and prediction abilities. The DL techniques allow for developing inference functions between an input and an output by learning from a training dataset. For the ensemble forecasts generated by the C-NFSVs of section 5, the ensemble members for different truth runs are constructed by perturbing control forecasts with their respective C-NFSVs and are close to the control forecasts. Therefore, we can take the control forecasts (also "hindcasts") of a training set of truth runs together with their ensemble members generated by the C-NFSVs, and use the DL approach to learn the dependence of the ensemble members on the corresponding control forecasts. These inference functions can then be used to produce ensemble forecasting members in the testing period. Clearly, if the DL approach is successful in achieving an ensemble forecasting skill close to that of section 5, it can be used to bypass the computational burden of the C-NFSVs because the training is done offline.

In the present study, the idea is to input the control forecast to a DL model and output the corresponding ensemble members around the control forecast. As the control forecast is made of the 40 spatial variables *X*_{j} (*j* = 1, 2, 3, …, 40) integrated over a time period of 10 days, the task is a spatiotemporal forecasting problem. In DL algorithms, the convolutional neural network (CNN) is a tool well adapted to learning the features of multidimensional datasets and is especially popular for dealing with spatial information (LeCun et al. 2015; Krizhevsky et al. 2017), while the long short-term memory neural network (LSTM) is a network able to extract temporal information. The convolutional LSTM [ConvLSTM; Shi et al. (2015)], a combination of the LSTM and CNN, can then be used to appropriately learn spatial–temporal data. The difference between the ConvLSTM and the LSTM is that the fully connected structure of the network is replaced by a convolution operation that can identify spatial features. More precisely, the output of the ConvLSTM includes both temporal and spatial dimensions, but the memory unit connections of the ConvLSTM operate mainly on the temporal dimension, while the convolutions capture the spatial information at a specific time. Then, if the output of the ConvLSTM is used as the input of the CNN, the spatial features of adjacent times can be further extracted, with the advantage of reducing the number of parameters thanks to the sparse connectivity and parameter sharing of the CNN. The fully connected layers then merge the information provided by the CNN and produce the output.
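To make the convolution-for-dense substitution concrete, a minimal single-channel 1D ConvLSTM cell can be sketched in NumPy. Everything below is illustrative: the kernel length, the circular padding (matching the ring geometry of the Lorenz-96 variables), and the function names are our assumptions, not the actual layers of Model-1.

```python
import numpy as np

def conv1d_same(x, k):
    """Circular 'same' 1D convolution (the state variables lie on a ring)."""
    pad = len(k) // 2
    xp = np.concatenate([x[-pad:], x, x[:pad]])
    return np.convolve(xp, k, mode="valid")

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, kernels):
    """One step of a minimal single-channel 1D ConvLSTM cell.
    Gate pre-activations are convolutions of the input x and hidden state h,
    instead of the dense matrix products of a standard LSTM."""
    ki_x, ki_h, kf_x, kf_h, ko_x, ko_h, kg_x, kg_h = kernels
    i = sigmoid(conv1d_same(x, ki_x) + conv1d_same(h, ki_h))  # input gate
    f = sigmoid(conv1d_same(x, kf_x) + conv1d_same(h, kf_h))  # forget gate
    o = sigmoid(conv1d_same(x, ko_x) + conv1d_same(h, ko_h))  # output gate
    g = np.tanh(conv1d_same(x, kg_x) + conv1d_same(h, kg_h))  # candidate state
    c_new = f * c + i * g        # memory update acts along the time dimension
    h_new = o * np.tanh(c_new)   # hidden state keeps the spatial layout of x
    return h_new, c_new
```

Because every gate is a convolution, the cell's parameter count is independent of the 40-variable state size, which is exactly the sparsity/parameter-sharing advantage mentioned above.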

We take the first consecutive 10 years of data (i.e., a total of 14 600 time steps, with the time step being equal to 0.05) from the 200-yr integration of the model Eq. (4.1) as truth runs. We divide this time series into two parts: the first 12 000 time steps form the training period, and the remaining 2600 time steps form the test period. The training data for all the lead times are normalized to have an amplitude in the interval [−1, 1] by calculating *z*_{N} = 2(*z* − *z*_{min})/(*z*_{max} − *z*_{min}) − 1, where *z* is the original data, and *z*_{min}, *z*_{max}, and *z*_{N} are the minimum, maximum, and normalized values of the training data for all the lead times, respectively. These normalized data are the inputs of the LSTM cells. Thanks to the ConvLSTM, the temporal features of the control forecasts are extracted, while the local spatial information is preserved. The output of the ConvLSTM is then used as the input of the CNNs, allowing for the identification of the deeper spatial features of the control forecast. Then, the fully connected layers are used to merge the features that have been extracted and generate the output: the normalized values of the 40 variables of an ensemble member at a given lead time. Such a strategy is summarized in Fig. 7, describing Model-1. Note that for each member at each lead time there is a separate Model-1, so that a total of 400 DL models with the architecture of Model-1 are used to generate the 40 ensemble members at lead times of 1, 2, 3, …, 10 days. Note also that the system under investigation has only 40 dimensions. In this context, the fully connected layers can easily be used to merge the extracted features and generate the final output. In an operational environment, the spatial dimensions are considerably higher, and other types of layers, such as deconvolution layers, should be used after the above CNN layers in Model-1 instead of the fully connected layers. This should make the DL models more adaptable to the high-dimensional problems associated with operational forecasts.
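The min-max scaling to [−1, 1] and its inverse can be written compactly; a small sketch (the function names are ours):

```python
import numpy as np

def normalize(z, z_min, z_max):
    """Min-max scaling of data to the interval [-1, 1]; z_min and z_max are
    computed once over the whole training period for all lead times."""
    return 2.0 * (z - z_min) / (z_max - z_min) - 1.0

def denormalize(z_n, z_min, z_max):
    """Invert the scaling to recover physical values from the DL outputs."""
    return 0.5 * (z_n + 1.0) * (z_max - z_min) + z_min
```

Using the training-period extremes for the test period as well avoids information leaking from the test data into the preprocessing.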

Sketch of Model-1. Two ConvLSTM layers with filter sizes of 3 × 3 and 40 filters; four 3 × 3 convolutional layers with 80, 160, 320, and 320 filters; three fully connected layers with 800, 200, and 40 nodes; batch normalization (BN) layers; 2 × 2 max pooling layers; and dropout layers with a dropout rate of 0.2 are selected. The input is the control forecast, and the output is the *m*th ensemble member at a given lead time *t*. For each member at each lead time, a separate Model-1 is used. A total of 400 Model-1s are used to generate the ensemble members.

In Model-1, we experimentally select two ConvLSTM layers of 40 filters and four convolutional layers of 80, 160, 320, and 320 filters, with each filter having the size 3 × 3 (where padding is applied so that the input and output have the same size), and three fully connected layers of 800, 200, and 40 nodes. In addition, batch normalization layers and max pooling layers with a filter size of 2 × 2 and a stride of 2 are empirically selected to enhance the efficiency of the training process. Dropout layers with a dropout rate of 0.2 are also used for regularization. Furthermore, the leaky ReLU activation function is introduced for learning nonlinear features (Maas et al. 2013), and the Adam optimizer with a learning rate of 10^{−5} (Kingma and Ba 2014) is used to minimize the loss function measuring the distance between the DL output and the corresponding truth. In Model-1, the size of the mini batch for each epoch of the training process is chosen as 100, and the number of epochs is fixed to 200. All these hyperparameters have been defined by trial and error.

The loss function combines the mean square error (MSE) between the DL outputs during the training period [denoted by *Y*_{pred} in Eq. (6.2)] and the real values [denoted by *Y*_{truth} in Eq. (6.2)] with the correlation coefficient (CC) between them; *n* is the size of the mini batch. The MSE reflects the magnitude of the error and is sensitive to outliers but does not provide the location of the error. Therefore, to make the ensemble members generated by Model-1 more efficient, the parameters *α* and *β* are introduced to adjust the weights of the MSE and CC and reach a good balance between the correction of the amplitude and of the location. The outputs of the DL model are normalized to have an amplitude in the interval [−1, 1] using a hyperbolic tangent function (tanh; see Fig. 7), and the values of *α* and *β* are fixed to 0.2 and 0.8 to achieve an optimal quality of the ensemble members.
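A NumPy sketch of such a weighted loss follows. The exact form of Eq. (6.2) is not reproduced here; the combination `alpha * MSE + beta * (1 - CC)` is one plausible reading, stated as an assumption.

```python
import numpy as np

def combined_loss(y_pred, y_truth, alpha=0.2, beta=0.8):
    """Weighted loss balancing error amplitude (MSE) and error location (CC).
    Assumed form: loss = alpha * MSE + beta * (1 - CC); the article's
    Eq. (6.2) may combine the two terms differently."""
    mse = np.mean((y_pred - y_truth) ** 2)
    yp = y_pred - y_pred.mean()
    yt = y_truth - y_truth.mean()
    cc = np.sum(yp * yt) / np.sqrt(np.sum(yp ** 2) * np.sum(yt ** 2))
    return alpha * mse + beta * (1.0 - cc)
```

With *α* = 0.2 and *β* = 0.8 the penalty is dominated by decorrelation, i.e., by misplaced rather than mis-sized features.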

Figure 8 shows the corresponding RMSE, ACC, BS and ROCA compared to those of the ensemble forecasts of section 5 (for simplicity, hereafter referred to as “original forecasts”). It is found that the ensemble forecasts generated by Model-1 are close to the original forecasts when analyzing the deterministic forecasting skills (RMSE and ACC). For probabilistic forecasting, Model-1 presents a skill approximately 3.0% lower than the original forecast but still acceptable.

The skills of the control forecast (gray) and the ensemble forecasts made by the C-NFSVs calculated by the SPG2 (blue) and deep learning (DL) model (i.e., Model-1; red), as measured by RMSE, ACC, BS, and ROCA. These values are obtained by an averaging over 2600 truth runs in the test period.

It was also found that the ensemble spread inferred by Model-1 is much smaller than the corresponding RMSE of the ensemble mean. This indicates that the ensemble members here, compared to those provided by the C-NFSVs, lack variability and therefore do not provide an appropriate estimate of the uncertainty around the ensemble mean. To solve this issue, we keep the architecture of Model-1 but change the model output and the associated loss function to directly estimate the RMSE of the individual ensembles (i.e., the RMSE with the realization being *N* = 1 in appendix B), rather than using the corresponding ensemble spread (i.e., the spread with the realization being *N* = 1 in appendix B). For convenience, we refer to this modified DL model as "Model-2." In this situation, the inputs are still the control forecasts, but the outputs are now the estimates of the RMSE of the individual ensembles at a given lead time (i.e., the 40 nodes in the last dense layer are replaced with 1 node). For lead times from 1 to 10 days, a total of 10 DL models (i.e., Model-2) are trained. Furthermore, in the loss function MSE, as in Eq. (6.2), *Y*_{truth} represents the RMSE of the individual ensembles made by Model-1 in the training period, and *Y*_{pred} is its estimate based on Model-2, which is then applied in the testing period. Figure 9 shows the estimate of the RMSE averaged over all the truth runs and all the lead times in the testing period. The RMSE forecast by Model-2 is close to the RMSE of Model-1. As a result, Model-1 combined with Model-2 provides an ensemble forecasting system mirroring that of the original forecasts, with results close to those predicted by the original forecasting system.

RMSE of the ensemble forecasts based on Model-1 (blue bars) and the corresponding ensemble spread (gray bars) and the estimation of the RMSE of the ensemble forecasts (red bars), as made by the deep learning model (i.e., the Model-2).

The DL approach to ensemble forecasting based on the C-NFSVs, as proposed in the present work, saves computing time as it no longer needs an optimization process at each forecasting date. However, many uncertainties are still present concerning the construction of the DL model, especially the choice of the hyperparameter values, and other inputs and outputs may also need to be considered. This needs further investigation.

## 7. Summary and discussion

In this study, we extend the NFSV approach to address the model error impact on ensemble forecasting and propose the C-NFSV ensemble forecasting method, which considers the impact of both the initial and model errors. The C-NFSVs provide a group of optimally growing combined perturbations of both the initial conditions and the model tendencies. Two particular cases were also considered: the O-CNOPs and the O-NFSVs. The former is an ensemble forecasting method previously proposed by Duan and Huo (2016) that deals with the initial error impact on the forecasts, while the latter estimates the model error impact. The usefulness of the C-NFSVs is demonstrated in the context of the Lorenz (1996) system.

The results show that the ensemble forecasting based on O-CNOPs has a higher skill than the one based on the O-NFSVs in the early stage of the forecasts, while in the later stage of the forecast, the impact of the model errors becomes more prominent and the ensemble forecasting based on the O-NFSVs excels. In any case, the forecasts based on the C-NFSVs, thanks to their optimization on both the initial and model errors, possess higher skill than the ones based on the O-CNOPs and O-NFSVs. These results justify the interest in using C-NFSVs in building ensemble forecasts.

Considering that developing an ensemble based on the C-NFSVs is challenging in an operational environment, we also discuss the possible use of deep learning algorithms to provide ensemble forecast information similar to that generated by the C-NFSVs. We develop two different DL models: one that learns the link between the control forecasts and the ensemble members, and a second that learns the link between the control forecasts and the RMSE of the ensemble. These DL models allow bypassing the optimization problem of obtaining the C-NFSVs at each forecast date. In the context of the Lorenz-96 model, we show that the ensemble forecasts made by the DL models have forecasting skills close to those made by computing the C-NFSVs. This suggests that the introduction of DL algorithms could not only greatly reduce the computing time of the ensemble forecasting system based on the C-NFSVs but also achieve a comparable forecast skill. It also illustrates the potential of DL algorithms in the context of operational ensemble forecasts, but the DL models must then be adapted to the high-dimensional problems of operational forecasts. Furthermore, a large variety of DL architectures exist and need to be optimized by trial and error. It should also be understood that the DL model requires a very large training dataset, and a large amount of computational time is therefore needed for calculating the C-NFSVs before the operational implementation. Hence, a highly efficient and effective optimization algorithm is still necessary for calculating the C-NFSVs during the hindcast period.

The current results also emphasize the importance of the dynamically coordinated growth of the initial perturbations and model perturbations in improving the ensemble forecast skill. In the same spirit, comparisons between C-NFSVs and the combined modes of other types of initial perturbations (e.g., BVs or SVs) and other types of model perturbations (such as STTP or SPPT) are also worth performing in the future. Another interesting avenue in the development of C-NFSVs is to consider the effect of time-varying stochastic errors; a combined mode of C-NFSVs and random forcing tendency perturbations may cover a broader range of model errors and have potential for further improving the ensemble forecast skill. In addition, given the simplicity of the Lorenz-96 model, more realistic models must be considered to examine the usefulness of the C-NFSVs for ensemble forecasting systems. This will be the subject of a follow-up work. In any case, it is expected that the C-NFSVs, possibly combined with DL algorithms, will play an important role in realistic ensemble forecasts in the future.

## Acknowledgments.

The authors thank the three anonymous reviewers and the editor, Michael Scheuerer, for their valuable comments and suggestions. This work is supported by the National Natural Science Foundation of China (Grant 41930971). SV is also partly supported by the project ROADMAP, a coordinated JPI-Climate/JPI-Oceans project, financed by the Belgian Science Policy under Contract B2/20E/P1/ROADMAP.

## Data availability statement.

The datasets generated and/or analyzed during the study are stored on computers at the State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG; https://www.lasg.ac.cn) and will be available to researchers upon request.

## APPENDIX A

### Derivation of the Gradients

In the cost function of Eq. (A1), **u**(**u**_{0}, **f**; *T*) = *M*_{T}(**f**)(**U**_{0} + **u**_{0}) − *M*_{T}(**U**_{0}) [i.e., **u**_{T} in Eq. (2.4) in section 2], and ⟨⋅⟩ is the inner product. By minimizing Eq. (A1) using an existing optimization solver, the C-NFSVs can be obtained, provided the gradients of the cost function with respect to the initial perturbations and the model tendency perturbations are estimated. Following Duan and Zhou (2013), the gradients can be computed as follows.

The first-order variations of the cost function *J*_{1}(**u**_{0}, **f**) involve the perturbations *δ***u**(*t*) and *δ***f**, which are governed by the tangent linear model of the forecast system. Introducing the Lagrange multipliers *λ*_{1} and *λ*_{2}, we obtain *δJ*_{1}, where *λ*_{1}(*t*) and *λ*_{2}(*t*) satisfy the corresponding adjoint equations.

Note that the gradients are then obtained from the adjoint variables; for the constrained case **u**_{0} = *r***f**, the gradient required is the one with respect to **f** alone.
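As a hedged sketch (not the article's exact equations), the generic adjoint construction for a constant-in-time tendency perturbation **f** takes the following form, where **J** denotes the Jacobian of the model tendency and the notation is assumed:

```latex
% Generic adjoint sketch for a cost function J_1(u_0, f) evaluated at time T,
% assuming a constant-in-time tendency perturbation f (notation assumed).
\begin{aligned}
&\text{tangent linear model:} &
\frac{d\,\delta\mathbf{u}}{dt} &= \mathbf{J}(\mathbf{U}+\mathbf{u})\,\delta\mathbf{u} + \delta\mathbf{f},\\
&\text{adjoint equation:} &
-\frac{d\boldsymbol{\lambda}_1}{dt} &= \mathbf{J}^{\top}(\mathbf{U}+\mathbf{u})\,\boldsymbol{\lambda}_1,
\qquad \boldsymbol{\lambda}_1(T) = \nabla_{\mathbf{u}_T} J_1,\\
&\text{gradients:} &
\nabla_{\mathbf{u}_0} J_1 &= \boldsymbol{\lambda}_1(0),
\qquad \nabla_{\mathbf{f}} J_1 = \int_0^T \boldsymbol{\lambda}_1(t)\,dt.
\end{aligned}
```

The integral form of the tendency gradient reflects that a constant forcing perturbation acts at every instant of the optimization window.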

## APPENDIX B

### Root-Mean-Square Error of the Ensemble Mean and Ensemble Spread

The ensemble mean is defined as the average of the *j*th component of the *M* members, where *X*_{j,m} denotes the *j*th component of the *m*th ensemble member, with *j* = 1, 2, …, *J* and *m* = 1, 2, …, *M*. The root-mean-square error (RMSE) of the ensemble mean measures the difference between the ensemble mean and the observation *O*_{j,n} (i.e., the truth runs in the present study), where *N* represents the number of realizations. The ensemble spread is the standard deviation of the ensemble members around the ensemble mean.
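For a single realization (*N* = 1), these two diagnostics can be sketched in NumPy as follows (the array shapes and function names are our conventions):

```python
import numpy as np

def ensemble_mean_rmse(members, truth):
    """RMSE of the ensemble mean for one realization.
    members has shape (M, J): M ensemble members, J state components;
    truth has shape (J,)."""
    mean = members.mean(axis=0)
    return np.sqrt(np.mean((mean - truth) ** 2))

def ensemble_spread(members):
    """Ensemble spread: root of the mean member variance about the
    ensemble mean, a common convention (a reliable ensemble has a
    spread matching the RMSE of the ensemble mean)."""
    return np.sqrt(np.mean(members.var(axis=0)))
```

Averaging these quantities over *N* realizations gives the scores plotted in Fig. 9.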

## APPENDIX C

### Anomaly Correlation Coefficient

The anomaly correlation coefficient (ACC) is computed from the forecast and observed anomalies, where *X*_{j} is the forecast value and *O*_{j} is the observation.

## APPENDIX D

### Brier Score

The Brier score is defined as BS = (1/*N*) Σ_{i=1}^{N} (*f*_{i} − *o*_{i})², where *N* is the number of realizations of the prediction process, and *f*_{i} and *o*_{i} are the forecast probability and the observed occurrence of the event for the *i*th prediction process, respectively. A smaller BS value indicates a better probability forecast skill.
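The Brier score is a one-liner in NumPy (a sketch; the function name is ours):

```python
import numpy as np

def brier_score(f, o):
    """Brier score: mean squared difference between the forecast
    probability f and the binary observation o over N realizations."""
    f = np.asarray(f, dtype=float)
    o = np.asarray(o, dtype=float)
    return np.mean((f - o) ** 2)
```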

## APPENDIX E

### Relative Operating Characteristic Curve Area

The relative operating characteristic curve area (ROCA; Mason 1982) is a measure of the resolution of a prediction system. By considering whether an event occurs at every grid and checking the forecasts against the observations, we can construct a two-category contingency table (see Table E1), where *a* and *b* represent the number of hits and false alarms, respectively; and *c* and *d* represent the number of misses and correct rejections, respectively.

From the contingency table, the hit rate and false alarm rate are computed as *H* = *a*/(*a* + *c*) and *F* = *b*/(*b* + *d*). The ROC curve is obtained by plotting *H* against *F* for a set of probability thresholds, and the area under the ROC curve is called the ROCA, which decreases from 1 to 0 as more false alarms occur. The ROCA is calculated as in Eq. (E3), where *M* is the number of categories relative to the probability thresholds. A larger ROCA value indicates a better probability forecast. When the ROCA is greater than 0.5, the forecast can be regarded as skillful.
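The construction above can be sketched in NumPy: for each threshold, the contingency counts give one (*F*, *H*) point, and the area is integrated with the trapezoidal rule (a sketch; the endpoint conventions and function name are our assumptions):

```python
import numpy as np

def roc_area(probs, events, thresholds):
    """ROCA sketch: hit rate H = a/(a+c) and false alarm rate F = b/(b+d)
    per probability threshold; the area under the (F, H) curve is then
    integrated with the trapezoidal rule."""
    probs = np.asarray(probs, dtype=float)
    events = np.asarray(events, dtype=bool)
    H, F = [1.0], [1.0]                      # lowest threshold: always "yes"
    for t in sorted(thresholds):
        yes = probs >= t
        a = np.sum(yes & events)             # hits
        b = np.sum(yes & ~events)            # false alarms
        c = np.sum(~yes & events)            # misses
        d = np.sum(~yes & ~events)           # correct rejections
        H.append(a / (a + c) if a + c else 0.0)
        F.append(b / (b + d) if b + d else 0.0)
    H.append(0.0)
    F.append(0.0)                            # highest threshold: never "yes"
    area = 0.0
    for k in range(1, len(F)):               # F decreases from 1 to 0
        area += 0.5 * (H[k] + H[k - 1]) * (F[k - 1] - F[k])
    return area
```

A perfect forecast reaches an area of 1, while a forecast with no discrimination sits on the diagonal with an area of 0.5.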

## APPENDIX F

### Perturbation versus Error Correlation Analysis

The ensemble perturbations are defined as the differences between the perturbed forecasts and the control forecast, **P**_{i}(*t*) = **F**_{i}(*t*) − **F**_{ctrl}(*t*), where **P**_{i}(*t*), **F**_{i}(*t*), and **F**_{ctrl}(*t*) are the *i*th ensemble perturbation, the *i*th perturbed forecast, and the control forecast, respectively. The forecast errors **E**(*t*) are defined as the difference between the control forecast **F**_{ctrl}(*t*) and the verifying analysis **F**(*t*) (i.e., the truth runs in the present study). The optimal combination of the *M* perturbations is obtained by solving a least squares problem, and the optimal perturbation **P**_{opt} can be written as a linear combination of the individual perturbations. The PECA value is the correlation *A* between **X** and **Y**, with **X** = **P**_{i} (or **X** = **P**_{opt}) and **Y** = **E**. The square of the correlation, *A*², is the explained error variance.
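These diagnostics can be sketched in NumPy (function names and array shapes are our conventions; the correlation is the standard Pearson form, and the least squares fit uses `np.linalg.lstsq`):

```python
import numpy as np

def peca(perturbation, error):
    """Correlation A between an ensemble perturbation and the forecast
    error (both flattened state vectors); A**2 is the explained error
    variance. Assumes non-constant inputs."""
    p = perturbation - perturbation.mean()
    e = error - error.mean()
    return np.sum(p * e) / np.sqrt(np.sum(p ** 2) * np.sum(e ** 2))

def optimal_combination(perturbations, error):
    """Least squares weights so that the linear combination of the M
    perturbations best fits the forecast error.
    perturbations has shape (M, J); error has shape (J,)."""
    w, *_ = np.linalg.lstsq(perturbations.T, error, rcond=None)
    return perturbations.T @ w
```

The combined perturbation, by construction, can only explain as much or more error variance than the best individual member.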

## REFERENCES

Anderson, J. L., 1997: The impact of dynamical constraints on the selection of initial conditions for ensemble predictions: Low-order perfect model results. *Mon. Wea. Rev.*, **125**, 2969–2983, https://doi.org/10.1175/1520-0493(1997)125<2969:TIODCO>2.0.CO;2.

Bai, Y. L., X. Li, and C. L. Huang, 2013: Handling error propagation in sequential data assimilation using an evolutionary strategy. *Adv. Atmos. Sci.*, **30**, 1096–1105, https://doi.org/10.1007/s00376-012-2115-7.

Barkmeijer, J., T. Iversen, and T. N. Palmer, 2003: Forcing singular vectors and other sensitive model structures. *Quart. J. Roy. Meteor. Soc.*, **129**, 2401–2423, https://doi.org/10.1256/qj.02.126.

Basnarkov, L., and L. Kocarev, 2012: Forecast improvement in Lorenz 96 system. *Nonlinear Processes Geophys.*, **19**, 569–575, https://doi.org/10.5194/npg-19-569-2012.

Berner, J., G. J. Shutts, M. Leutbecher, and T. N. Palmer, 2009: A spectral stochastic kinetic energy backscatter scheme and its impact on flow-dependent predictability in the ECMWF ensemble prediction system. *J. Atmos. Sci.*, **66**, 603–626, https://doi.org/10.1175/2008JAS2677.1.

Birgin, E. G., J. M. Martinez, and M. Raydan, 2000: Nonmonotone spectral projected gradient methods on convex sets. *SIAM J. Optim.*, **10**, 1196–1211, https://doi.org/10.1137/S1052623497330963.

Bowler, N. E., 2006: Comparison of error breeding, singular vectors, random perturbations and ensemble Kalman filter perturbation strategies on a simple model. *Tellus*, **58A**, 538–548, https://doi.org/10.1111/j.1600-0870.2006.00197.x.

Brier, G. W., 1950: Verification of forecasts expressed in terms of probability. *Mon. Wea. Rev.*, **78**, 1–3, https://doi.org/10.1175/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2.

Buizza, R., 2019: Introduction to the special issue on "25 years of ensemble forecasting." *Quart. J. Roy. Meteor. Soc.*, **145** (Suppl. 1), 1–11, https://doi.org/10.1002/qj.3370.

Buizza, R., and T. N. Palmer, 1995: The singular-vector structure of the atmospheric global circulation. *J. Atmos. Sci.*, **52**, 1434–1456, https://doi.org/10.1175/1520-0469(1995)052<1434:TSVSOT>2.0.CO;2.

Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. *Quart. J. Roy. Meteor. Soc.*, **125**, 2887–2908, https://doi.org/10.1002/qj.49712556006.

Buizza, R., P. L. Houtekamer, Z. Toth, G. Pellerin, M. Z. Wei, and Y. J. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. *Mon. Wea. Rev.*, **133**, 1076–1097, https://doi.org/10.1175/MWR2905.1.

Candille, G., and O. Talagrand, 2005: Evaluation of probabilistic prediction systems for a scalar variable. *Quart. J. Roy. Meteor. Soc.*, **131**, 2131–2150, https://doi.org/10.1256/qj.04.71.

Demaeyer, J., S. G. Penny, and S. Vannitsem, 2022: Identifying efficient ensemble perturbations for initializing subseasonal-to-seasonal prediction. *J. Adv. Model. Earth Syst.*, **14**, e2021MS002828, https://doi.org/10.1029/2021MS002828.

Descamps, L., and O. Talagrand, 2007: On some aspects of the definition of initial conditions for ensemble prediction. *Mon. Wea. Rev.*, **135**, 3260–3272, https://doi.org/10.1175/MWR3452.1.

Du, J., and Coauthors, 2018: Ensemble methods for meteorological predictions. *Handbook of Hydrometeorological Ensemble Forecasting*, Q. Duan et al., Eds., Springer, 1–52.

Duan, W. S., and F. F. Zhou, 2013: Non-linear forcing singular vector of a two-dimensional quasi-geostrophic model. *Tellus*, **65A**, 18452, https://doi.org/10.3402/tellusa.v65i0.18452.

Duan, W. S., and P. Zhao, 2014: Revealing the most disturbing tendency error of Zebiak–Cane model associated with El Niño predictions by nonlinear forcing singular vector approach. *Climate Dyn.*, **44**, 2351–2367, https://doi.org/10.1007/s00382-014-2369-0.

Duan, W. S., and Z. H. Huo, 2016: An approach to generating mutually independent initial perturbations for ensemble forecasts: Orthogonal conditional nonlinear optimal perturbations. *J. Atmos. Sci.*, **73**, 997–1014, https://doi.org/10.1175/JAS-D-15-0138.1.

Duan, W. S., B. Tian, and H. Xu, 2013: Simulations of two types of El Niño events by an optimal forcing vector approach. *Climate Dyn.*, **43**, 1677–1692, https://doi.org/10.1007/s00382-013-1993-4.

Epstein, E. S., 1969: Stochastic dynamic prediction. *Tellus*, **21**, 739–759, https://doi.org/10.3402/tellusa.v21i6.10143.

Feng, F., and W. S. Duan, 2013: The role of constant optimal forcing in correcting forecast models. *Sci. China Earth Sci.*, **56**, 434–443, https://doi.org/10.1007/s11430-012-4568-z.

Feng, J., R. Q. Ding, D. Q. Liu, and J. P. Li, 2014: The application of nonlinear local Lyapunov vectors to ensemble predictions in Lorenz systems. *J. Atmos. Sci.*, **71**, 3554–3567, https://doi.org/10.1175/JAS-D-13-0270.1.

Feng, J., R. Q. Ding, J. P. Li, and D. Q. Liu, 2016: Comparison of nonlinear local Lyapunov vectors with bred vectors, random perturbations and ensemble transform Kalman filter strategies in a barotropic model. *Adv. Atmos. Sci.*, **33**, 1036–1046, https://doi.org/10.1007/s00376-016-6003-4.

Feng, J., J. Li, R. Ding, and Z. Toth, 2018: Comparison of nonlinear local Lyapunov vectors and bred vectors in estimating the spatial distribution of error growth. *J. Atmos. Sci.*, **75**, 1073–1087, https://doi.org/10.1175/JAS-D-17-0266.1.

Fortin, V., M. Abaza, F. Anctil, and R. Turcotte, 2014: Why should ensemble spread match the RMSE of the ensemble mean? *J. Hydrometeor.*, **15**, 1708–1713, https://doi.org/10.1175/JHM-D-14-0008.1.

Grudzien, C., M. Bocquet, and A. Carrassi, 2020: On the numerical integration of the Lorenz-96 model, with scalar additive noise, for benchmark twin experiments. *Geosci. Model Dev.*, **13**, 1903–1924, https://doi.org/10.5194/gmd-13-1903-2020.

Hamill, T. M., C. Snyder, and R. E. Morss, 2000: A comparison of probabilistic forecasts from bred, singular-vector, and perturbed observation ensembles. *Mon. Wea. Rev.*, **128**, 1835–1851, https://doi.org/10.1175/1520-0493(2000)128<1835:ACOPFF>2.0.CO;2.

Hopson, T. M., 2014: Assessing the ensemble spread–error relationship. *Mon. Wea. Rev.*, **142**, 1125–1142, https://doi.org/10.1175/MWR-D-12-00111.1.

Hou, D., Z. Toth, and Y. Zhu, 2006: A stochastic parameterization scheme within NCEP global ensemble forecast system. *18th Conf. on Probability and Statistics in the Atmospheric Sciences*, Atlanta, GA, Amer. Meteor. Soc., 4.5, https://ams.confex.com/ams/Annual2006/techprogram/paper_101401.htm.

Hou, D., Z. Toth, and Y. Zhu, 2008: The impact of a stochastic perturbation scheme on hurricane prognosis in NCEP global ensemble forecast system. *23rd Conf. on Weather Analysis and Forecasting/19th Conf. on Numerical Weather Prediction*, New Orleans, LA, Amer. Meteor. Soc., 4A.6, https://ams.confex.com/ams/23WAF19NWP/techprogram/paper_154340.htm.

Hou, D., Z. Toth, Y. Zhu, W. Yang, and R. Wobus, 2010: A stochastic total tendency perturbation scheme representing model-related uncertainties in the NCEP global ensemble forecast system. NOAA/NCEP/EMC, Camp Springs, MD, 50 pp., http://www.emc.ncep.noaa.gov/gmb/yzhu/gif/pub/Manuscript_STTP_Tellus_A_HOU-1.pdf.

Hunt, B. R., and Coauthors, 2004: Four-dimensional ensemble Kalman filtering. *Tellus*, **56A**, 273–277, https://doi.org/10.3402/tellusa.v56i4.14424.

Huo, Z. H., and W. S. Duan, 2019: The application of the orthogonal conditional nonlinear optimal perturbations method to typhoon track ensemble forecasts. *Sci. China Earth Sci.*, **62**, 376–388, https://doi.org/10.1007/s11430-018-9248-9.

Huo, Z. H., W. S. Duan, and F. F. Zhou, 2019: Ensemble forecasts of tropical cyclone track with orthogonal conditional nonlinear optimal perturbations. *Adv. Atmos. Sci.*, **36**, 231–247, https://doi.org/10.1007/s00376-018-8001-1.

Khare, S. P., and J. L. Anderson, 2006: An examination of ensemble filter based adaptive observation methodologies.

,*Tellus***58A**, 179–195, https://doi.org/10.1111/j.1600-0870.2006.00163.x.Kingma, D. P., and J. Ba, 2014: Adam: A method for stochastic optimization.

*Third Int. Conf. for Learning Representations*, San Diego, CA, ICLR, 1–15, https://arxiv.org/abs/1412.6980.Krizhevsky, A., I. Sutskever, and G. E. Hinton, 2017: ImageNet classification with deep convolutional neural networks.

,*Commun. ACM***60**, 84–90, https://doi.org/10.1145/3065386.LeCun, Y., Y. Bengio, and G. Hinton, 2015: Deep learning.

,*Nature***521**, 436–444, https://doi.org/10.1038/nature14539.Leith, C. E., 1974: Theoretical skill of Monte Carlo forecasts.

,*Mon. Wea. Rev.***102**, 409–418, https://doi.org/10.1175/1520-0493(1974)102<0409:TSOMCF>2.0.CO;2.Leutbecher, M., and T. N. Palmer, 2008: Ensemble forecasting.

,*J. Comput. Phys.***227**, 3515–3539, https://doi.org/10.1016/j.jcp.2007.02.014.Lorenz, E. N., 1996: Predicability: A problem partly Lorenz96.

*Proc. Workshop on Predictability*, ECMWF, Reading, United Kingdom, 1–18, https://www.ecmwf.int/node/10829.Lorenz, E. N., and K. A. Emanuel, 1998: Optimal sites for supplementary weather observations: Simulation with a small model.

,*J. Atmos. Sci.***55**, 399–414, https://doi.org/10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.Maas, A. L., A. Y. Hannun, and A. Y. Ng, 2013: Rectifier nonlinearities improve neural network acoustic models.

*Proc. ICML*, Atlanta, GA, PMLR, 1–3, http://ai.stanford.edu/∼amaas/papers/relu_hybrid_icml2013_final.pdf.Mason, I. B., 1982: A model for assessment of weather forecasts.

,*Aust. Meteor. Mag.***30**, 291–303.Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Methodology and validation.

,*Quart. J. Roy. Meteor. Soc.***122**, 73–119, https://doi.org/10.1002/qj.49712252905.Mu, M., and Z. N. Jiang, 2008: A new approach to the generation of initial perturbations for ensemble prediction: Conditional nonlinear optimal perturbation.

,*Chin. Sci. Bull.***53**, 2062–2068, https://doi.org/10.1007/s11434-008-0272-y.Mu, M., W. S. Duan, and B. Wang, 2003: Conditional nonlinear optimal perturbation and its applications.

,*Nonlinear Processes Geophys.***10**, 493–501, https://doi.org/10.5194/npg-10-493-2003.Mureau, R., F. Molteni, and T. N. Palmer, 1993: Ensemble prediction using dynamically conditioned perturbations.

,*Quart. J. Roy. Meteor. Soc.***119**, 299–323, https://doi.org/10.1002/qj.49711951005.Nicolis, C., 2003: Dynamics of model error: Some generic features.

,*J. Atmos. Sci.***60**, 2208–2218, https://doi.org/10.1175/1520-0469(2003)060<2208:DOMESG>2.0.CO;2.Nicolis, C., R. A. P. Perdigao, and S. Vannitsem, 2009: Dynamics of prediction errors under the combined effect of initial condition and model errors.

,*J. Atmos. Sci.***66**, 766–778, https://doi.org/10.1175/2008JAS2781.1.Orrell, D., 2002: Role of the metric in forecast error growth: How chaotic is the weather?

,*Tellus***54**, 350–362, https://doi.org/10.3402/tellusa.v54i4.12159.Orrell, D., 2005: Ensemble forecasting in a system with model error.

,*J. Atmos. Sci.***62**, 1652–1659, https://doi.org/10.1175/JAS3406.1.Orrell, D., L. Smith, J. Barkmeijer, and T. N. Palmer, 2001: Model error in weather forecasting.

,*Nonlinear Processes Geophys.***8**, 357–371, https://doi.org/10.5194/npg-8-357-2001.Palmer, T. N., 2000: Predicting uncertainty in forecasts of weather and climate.

,*Rep. Prog. Phys.***63**, 71–116, https://doi.org/10.1088/0034-4885/63/2/201.Palmer, T. N., R. Buizza, F. Doblas-Reyes, T. Jung, M. Leutbecher, G. J. Shutts, M. Steinheimer, and A. Weisheimer, 2009: Stochastic parametrization and model uncertainty. ECMWF Research Department Tech. Memo. 598, ECMWF, Reading, United Kingdom, 42 pp., https://doi.org/10.21957/ps8gbwbdv.

Rasp, S., and S. Lerch, 2018: Neural networks for postprocessing ensemble weather forecasts.

,*Mon. Wea. Rev.***146**, 3885–3900, https://doi.org/10.1175/MWR-D-18-0187.1.Revelli, J. A., M. A. Rodriguez, and H. S. Wio, 2010: The use of rank histograms and MVL diagrams to characterize ensemble evolution in weather forecasting.

,*Adv. Atmos. Sci.***27**, 1425–1437, https://doi.org/10.1007/s00376-009-9153-6.Salman, A. G., B. Kanigoro, and Y. Heryadi, 2015: Weather forecasting using deep learning techniques.

*2015 Int. Conf. on Advanced Computer Science and Information Systems*(ICACSIS), Depok, Indonesia, IEEE, https://doi.org/10.1109/ICACSIS.2015.7415154.Scher, S., 2018: Toward data‐driven weather and climate forecasting: Approximating a simple general circulation model with deep learning.

*Geophys. Res. Lett.*,**45**, 12 616–12 622, https://doi.org/10.1029/2018GL080704.Scher, S., and G. Messori, 2018: Predicting weather forecast uncertainty with machine learning.

,*Quart. J. Roy. Meteor. Soc.***144**, 2830–2841, https://doi.org/10.1002/qj.3410.Scheuerer, M., M. B. Switanek, R. P. Worsnop, and T. M. Hamill, 2020: Using artificial neural networks for generating probabilistic subseasonal precipitation forecasts over California.

,*Mon. Wea. Rev.***148**, 3489–3506, https://doi.org/10.1175/MWR-D-20-0096.1.Shi, X. J., Z. R. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, 2015: Convolutional LSTM network: A machine learning approach for precipitation nowcasting.

*Proc. 28th Int. Conf. on Neural Information Processing Systems*, Cambridge, MA, MIT Press, https://dl.acm.org/doi/10.5555/2969239.2969329.Shutts, G., 2005: A kinetic energy backscatter algorithm for use in ensemble prediction systems.

,*Quart. J. Roy. Meteor. Soc.***131**, 3079–3102, https://doi.org/10.1256/qj.04.106.Talagrand, O., R. Vautard, and B. Strauss, 1997: Evaluation of probabilistic prediction systems.

*ECMWF Workshop on Predictability*, Reading, United Kingdom, ECMWF, 1–25.Tao, L. J., and W. S. Duan, 2019: Using a nonlinear forcing singular vector approach to reduce model error effects in ENSO forecasting.

,*Wea. Forecasting***34**, 1321–1342, https://doi.org/10.1175/WAF-D-19-0050.1.Tao, L. J., W. S. Duan, and S. Vannitsem, 2020: Improving forecasts of El Nino diversity: A nonlinear forcing singular vector approach.

,*Climate Dyn.***55**, 739–754, https://doi.org/10.1007/s00382-020-05292-5.Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations.

,*Bull. Amer. Meteor. Soc.***74**, 2317–2330, https://doi.org/10.1175/1520-0477(1993)074<2317:EFANTG>2.0.CO;2.Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP and the breeding method.

,*Mon. Wea. Rev.***125**, 3297–3319, https://doi.org/10.1175/1520-0493(1997)125<3297:EFANAT>2.0.CO;2.Van Kekem, D. L., 2018:

*Dynamics of the Lorenz-96 Model: Bifurcations, Symmetries and Waves*. University of Groningen, 193 pp.Vannitsem, S., 2006: The role of scales in the dynamics of parameterization uncertainties.

,*J. Atmos. Sci.***63**, 1659–1671, https://doi.org/10.1175/JAS3708.1.Vannitsem, S., 2014: Stochastic modelling and predictability: Analysis of a low-order coupled ocean-atmosphere model.

, A372, 20130282, https://doi.org/10.1098/rsta.2013.0282.*Philos. Trans. Roy. Soc.*Vannitsem, S., and Z. Toth, 2002: Short-term dynamics of model errors.

,*J. Atmos. Sci.***59**, 2594–2604, https://doi.org/10.1175/1520-0469(2002)059<2594:STDOME>2.0.CO;2.Vannitsem, S., and W. Duan, 2020: On the use of near-neutral backward Lyapunov vectors to get reliable ensemble forecasts in coupled ocean–atmosphere systems.

,*Climate Dyn.***55**, 1125–1139, https://doi.org/10.1007/s00382-020-05313-3.Vannitsem, S., and Coauthors, 2021: Statistical postprocessing for weather forecasts: Review, challenges, and avenues in a big data world.

,*Bull. Amer. Meteor. Soc.***102**, E681–E699, https://doi.org/10.1175/BAMS-D-19-0308.1.Veldkamp, S., K. Whan, S. Dirksen, and M. Schmeits, 2021: Statistical postprocessing of wind speed forecasts using convolutional neural networks.

,*Mon. Wea. Rev.***149**, 1141–1152, https://doi.org/10.1175/MWR-D-20-0219.1.Wang, J., 2021:

*Study on the Moist Singular Vectors and Nonlinear Initial Perturbation in GRAPES-GEPS*. University of Chinese Academy of Sciences, 129 pp.Wang, Q., M. Mu, and G. Sun, 2020: A useful approach to sensitivity and predictability studies in geophysical fluid dynamics: Conditional non-linear optimal perturbation.

,*Natl. Sci. Rev.***7**, 214–223, https://doi.org/10.1093/nsr/nwz039.Wang, Y., and W. Duan, 2019: Influences of initial perturbation amplitudes and ensemble sizes on the ensemble forecasts made by CNOPs method.

,*Chin. J. Atmos. Sci.***43**, 919–933, http://duanws.lasg.ac.cn/ueditor/php/upload/file/20191009/1570608319337097.pdf.Wei, M. Z., and Z. Toth, 2003: A new measure of ensemble performance: Perturbation versus error correlation analysis (PECA).

,*Mon. Wea. Rev.***131**, 1549–1565, https://doi.org/10.1175//1520-0493(2003)131<1549:ANMOEP>2.0.CO;2.Wei, M. Z., Z. Toth, R. Wobus, and Y. J. Zhu, 2008: Initial perturbations based on the ensemble transform (ET) technique in the NCEP global operational forecast system.

,*Tellus***60A**, 62–79, https://doi.org/10.1111/j.1600-0870.2007.00273.x.Whitaker, J. S., and T. M. Hamill, 2002: Ensemble data assimilation without perturbed observations.

,*Mon. Wea. Rev.***130**, 1913–1924, https://doi.org/10.1175/1520-0493(2002)130<1913:EDAWPO>2.0.CO;2.Zhou, Q., L. Chen, W. Duan, X. Wang, Z. Zu, X. Li, S. Zhang, and Y. Zhang, 2021: Using conditional nonlinear optimal perturbation to generate initial perturbations in ENSO ensemble forecasts.

,*Wea. Forecasting***36**, 2101–2111, https://doi.org/10.1175/WAF-D-21-0063.1.