Using a Nonlinear Forcing Singular Vector Approach to Reduce Model Error Effects in ENSO Forecasting

Nonlinear forcing singular vector (NFSV)-based assimilation is adopted to determine the model tendency errors that represent the combined effect of different kinds of model errors; then, an NFSV-tendency error forecast model is formulated. This error forecast model is coupled with an intermediate coupled model (ICM) and brings the ICM output closer to the observations; finally, an NFSV-ICM forecast model for ENSO is constructed. The competitive advantage of the NFSV-ICM is that, owing to the mathematical nature of the NFSV-tendency errors, it considers not only model errors but also the interaction between model errors and initial errors. Prediction experiments for tropical SSTAs during both the training period (1960–96; i.e., when the NFSV-ICM is formulated) and the cross-validation period (1997–2016) show that the NFSV-ICM has a much higher skill in predicting ENSO; specifically, it extends skillful predictions of ENSO from a lead time of 6 months in the original ICM to a lead time of 12 months. The higher skill of the NFSV-ICM is especially reflected in the predictions of SSTAs in the central and western Pacific. For the well-known spring predictability barrier (SPB) phenomenon, which greatly limits ENSO forecasting skill, the NFSV-ICM also substantially suppresses its negative effect on ENSO predictions. Although the NFSV-ICM presently involves only the NFSV-related assimilation of SSTs, it has shown its usefulness in predicting ENSO. These results suggest that the NFSV-based assimilation approach is effective in dealing with the effect of model errors on ENSO forecasts.


Introduction
El Niño-Southern Oscillation (ENSO), which is known as the dominant interannual mode in the tropical Pacific, has been the focus of scientists and the public over the past several decades because of its global impact on climate/weather (McPhaden et al. 2006). Through efforts over the decades, there has been significant progress toward observing, understanding, and simulating ENSO (Zebiak and Cane 1987; Jin 1997a,b; McPhaden et al. 1998; Timmermann et al. 2018). Consequently, skillful predictions of Niño-3.4 SST anomalies related to ENSO have been achieved 6-12 months in advance (even up to 2 years) in hindcast experiments (Chen and Cane 2008; Luo et al. 2008).
To date, more than 20 climate models have been used to routinely predict ENSO in real time (see the website at https://iri.columbia.edu/our-expertise/climate/enso/). However, the skillful and realistic forecasting of ENSO can only be made with, at most, a 6-month lead (Barnston et al. 2012).
Initial errors have been recognized as one of the main factors that limit ENSO forecast skill. To reduce the impact of initial errors, great efforts have been made to optimize initial conditions under model constraints and observations by data assimilation (e.g., Chen et al. 1997; Behringer et al. 1998; Sugiura et al. 2008; Gao et al. 2016). In addition, intensified observations in some key areas (i.e., targeted observations) were suggested to provide the most useful observations for data assimilation to suppress initial error growth and provide a skillful prediction of ENSO (Mu et al. 2015; Duan and Hu 2016; Hu and Duan 2016; Duan and Feng 2017; Tao et al. 2017, 2018). For example, Morss and Battisti (2004a,b), based on an observing system simulation experiment (OSSE), suggested that for ENSO forecasts longer than a few months, the most important area for observations is the eastern equatorial Pacific and the region to its south; the secondary region of importance is the western equatorial Pacific. By using a sequential importance sampling assimilation method, Kramer and Dijkstra (2013) also showed that the optimal observation locations for SST are in the eastern tropical Pacific for minimizing the uncertainty in the Niño-3 index.
In addition to initial errors, an increasing number of studies have also emphasized the importance of model errors in yielding ENSO prediction uncertainties (Latif et al. 2001; Stainforth et al. 2005; Zheng et al. 2009a; Wu et al. 2016; Tao et al. 2019). For instance, Jin et al. (2008) noted that the accuracy of simulated ENSO variability is related to the simulated climatology, so that models with climatological biases tend to have poor prediction skill.
As one source of model errors, the model parametric errors (MPEs) were also found to have a role in yielding model systematic errors and influencing ENSO variability (Bejarano and Jin 2008; Macmynowski and Tziperman 2008; Zhu and Zhang 2018). Tao et al. (2019) investigated the impact of MPEs on ENSO predictions from the perspective of optimal error growth and indicated that MPEs have the potential to influence the strength of the Bjerknes feedback, which is crucial to the development of SST anomalies, thus disturbing ENSO predictions. They also demonstrated that the spring predictability barrier (SPB) phenomenon of ENSO can be caused by MPEs as well as by initial errors. To reduce the effect of the model errors caused by the MPEs, a skillful hindcast of the strong El Niño event in 2015 was made by optimizing two key parameters associated with the Bjerknes feedback. In particular, Wu et al. (2016) reduced the SPB phenomenon of ENSO and extended the valid lead time by considering multiple physical parameters determined by the ensemble adjustment Kalman filter.
Missing some processes in models also tends to induce model errors and affect ENSO simulations and predictions. For example, westerly wind bursts (WWBs) have been shown to have the ability to excite the onset of El Niño events. Lopez and Kirtman (2014) demonstrated that the inclusion of state-dependent WWBs in an ENSO model can greatly improve the ENSO prediction skill. Therefore, when the model lacks the WWB effect and yields model errors, the ENSO forecast skill can be strongly affected. Yu et al. (2003) suggested that the characteristics of WWBs depend on the large-scale SST field and therefore indicated that the model errors induced by the lack of WWBs may have a certain structure and significantly disturb the predictions of ENSO. Of course, model errors are caused not only by physical processes that are absent from models but also by those that are highly simplified. For an intermediate coupled model, model errors are significant and come from different sources (Qi et al. 2017); moreover, it is hard to distinguish their effects in predictions. Therefore, improving some processes in models may fail to enhance the ENSO prediction skill because of the interaction with other model errors. Finding a way to obtain or filter the combined effect of different kinds of model errors is a promising route to improving ENSO prediction.
To this end, Zheng and Zhu (2016) considered the model uncertainties synthetically and developed a first-order Markov stochastic model that is added to the tendency equation for SST in the intermediate coupled model (ICM) in an attempt to depict the approximate combined effect of model uncertainties. Such an approach has the ability to filter unpredictable stochastic processes and capture more realistic ENSO evolutions (Zheng et al. 2009a). However, large systematic biases remain because the Markov stochastic model perturbation cannot fully approximate the effect of all model errors (Qi et al. 2017). Therefore, an optimal method was suggested to represent the combined effect of different kinds of model errors. Duan and Zhou (2013) extended the (linear) forcing singular vector (Barkmeijer et al. 2003) to the nonlinear regime and proposed the nonlinear forcing singular vector (NFSV) approach, which depicts the model tendency errors that have the largest effect on prediction uncertainty (Qi et al. 2017). Then, with the idea of data assimilation, Duan et al. (2014) improved the NFSV approach so that it can be used to extract the comprehensive tendency errors (see section 2). Using the NFSV-related assimilation, they obtained the tendency errors of the Zebiak-Cane model (Zebiak and Cane 1987), which were in turn added as a forcing to the SST equation of the Zebiak-Cane model in an attempt to offset the effects of the model errors. As a result, they successfully simulated realistic ENSO evolutions using the Zebiak-Cane model equipped with the NFSV-related assimilation.
The impressive effect of the NFSV on correcting the model motivates us to apply the NFSV to improve ENSO prediction. It should be pointed out that the model errors are time dependent and the NFSV-tendency errors during predictions are case dependent. Thus, to improve the prediction, we have to obtain the model tendency errors in advance. Observational data are available during the assimilation period, so we can obtain the NFSV-tendency errors for that period. However, the NFSV-tendency errors beyond the assimilation period cannot be obtained in the same way because the corresponding observations do not yet exist. Therefore, can we use the NFSV-related assimilation for predictions? If yes, how do we do it? In doing so, can the prediction skill be greatly improved by using the NFSV-related assimilation?
To address these questions, we develop an approach to apply the NFSV-related assimilation in ENSO predictions and establish a new ENSO forecast system by using an intermediate coupled model. The study is divided into two parts: the first part introduces the new ENSO forecast system, with the NFSV-related assimilation, and its performance in ENSO predictions; the second part examines the prediction skill in distinguishing El Niño types (i.e., the EP and CP El Niño) to explain the physical reasons for the improved prediction skill associated with the various types of ENSO. In the present article, we focus on the first part (i.e., the new ENSO forecast system and its performance in ENSO predictions).
The remainder of this paper is organized as follows. Section 2 provides a description of the NFSV approach and its related assimilation. Section 3 describes the intermediate coupled model for ENSO (ICM) used in the present study. In section 4, we develop the new ENSO forecast system based on the NFSV-related assimilation and the ICM. In section 5, the performance of the NFSV-ICM is examined. Finally, we present a summary and a discussion in section 6.

Methods
In the present study, we use the NFSV-related assimilation approach to correct the ENSO model and establish the ENSO forecast system. For convenience, we briefly describe the idea behind the NFSV and NFSV-related assimilation.

a. The NFSV approach
The NFSV is a nonlinear extension of the (linear) forcing singular vector proposed by Barkmeijer et al. (2003). It considers the combined effect of different kinds of model errors and represents the tendency error that causes the largest prediction error at the prediction time (Duan and Zhou 2013; Duan and Zhao 2015). The NFSV $f^*$ can be derived from the maximization problem shown in Eq. (1):

$$J(f^*) = \max_{f \in \Omega} \left\| M_t(f)(u_0) - M_t(0)(u_0) \right\|, \tag{1}$$

where $M_t(0)$ represents the propagator of a nonlinear model [Eq. (2)] and $M_t(f)$ represents that of the nonlinear model but with a tendency perturbation $f$ [see Eq. (3)]; $u$ represents the state variable and $u_0$ denotes its initial state. The term $J$ represents the cost function that measures the deviation in the state from the reference state $M_t(0)(u_0)$ in terms of the norm $\|\cdot\|$ due to the effect of the tendency perturbation $f$. Here, $f \in \Omega$ expresses the constraint on the tendency perturbation. Usually, we define $\|f\| \le \delta$, which means the tendency error is bounded by $\delta$ [$\delta$ is a positive number; the details can be seen in Duan and Zhou (2013) and Duan et al. (2016)]:

$$\begin{cases} \dfrac{\partial u}{\partial t} = F(u, t), \\ u|_{t=0} = u_0, \end{cases} \tag{2}$$

$$\begin{cases} \dfrac{\partial u}{\partial t} = F(u, t) + f, \\ u|_{t=0} = u_0, \end{cases} \tag{3}$$

where $F$ denotes the nonlinear tendency operator of the model. From Eq. (1), it is easily seen that if $M_t(0)(u_0)$ is a control forecast, the NFSV represents one kind of optimal tendency perturbation of the control forecast and has the potential to cause the largest departure of the perturbed forecast from the control forecast at the prediction time. That is, the control forecast $M_t(0)(u_0)$ is most sensitive to the NFSV-tendency perturbation, which may provide information for correcting the control forecast. Of course, we can also modify the cost function of the NFSV according to a particular physical problem. For example, to obtain a tendency perturbation that can induce the largest uncertainties within the prediction period from $t_0$ to $t$, the cost function can be modified as

$$J(f^*) = \max_{f \in \Omega} \sum_{t_i = t_0}^{t} \left\| M_{t_i}(f)(u_0) - M_{t_i}(0)(u_0) \right\|. \tag{4}$$

In this case, the cost function $J$ gives the largest accumulated departure from the control forecast that is induced by the tendency error alone.
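The maximization in Eq. (1) can be illustrated on a toy problem. The following Python sketch is not the implementation used in this study: the two-variable `tendency` function, the forward-Euler integrator, the values of `delta` and `u0`, and the brute-force scan of the constraint boundary are all assumptions chosen for brevity. It also assumes that the maximum lies on the boundary $\|f\| = \delta$, which is checked only implicitly here.

```python
import numpy as np

def tendency(u):
    """Toy nonlinear tendency F(u); purely illustrative, not the ICM."""
    x, y = u
    return np.array([-0.5 * x + y - x**3, -x - 0.5 * y])

def propagate(u0, f, n_steps=200, dt=0.01):
    """Integrate du/dt = F(u) + f with forward Euler; f = 0 gives M_t(0)(u0)."""
    u = np.array(u0, dtype=float)
    for _ in range(n_steps):
        u = u + dt * (tendency(u) + f)
    return u

def nfsv_scan(u0, delta, n_angles=720):
    """Scan the circle ||f|| = delta for the constant forcing that
    maximizes the final-time departure from the control forecast."""
    control = propagate(u0, np.zeros(2))
    best_f, best_j = None, -np.inf
    for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        f = delta * np.array([np.cos(theta), np.sin(theta)])
        j = np.linalg.norm(propagate(u0, f) - control)
        if j > best_j:
            best_f, best_j = f, j
    return best_f, best_j

f_star, j_star = nfsv_scan(u0=[0.5, 0.0], delta=0.1)
print("NFSV-like forcing:", f_star, "departure:", j_star)
```

A direct scan is only feasible because the toy forcing has two components; a forcing defined on a model grid would require a constrained optimization algorithm instead.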

b. The NFSV-related assimilation
Based on the NFSV approach, Duan et al. (2014) proposed an optimal forcing vector (OFV) approach to offset the model errors and improve the model simulation ability, in which they simply modified the cost function $J$ of the NFSV approach [see Eq. (5)]. The OFV represents the total tendency perturbation, which is superimposed on the tendency equation of the model and makes the model simulation closest to the observations. The OFV, which is denoted by $f_o$, is related to a minimization (or assimilation) problem:

$$J(f_o) = \min_{f} \sum_{t} \left\| M_t(f)(u_0) - u_{\mathrm{obs}}(x, t) \right\|, \tag{5}$$

where $u_{\mathrm{obs}}(x, t)$ denotes the time series of the observation data and $M_t(f)(u_0)$ represents the model simulation result at month $t$, which is obtained by integrating Eq. (3) with an initial state $u_0$. The assimilation problem Eq. (5) is identical to the following optimization problem:

$$J(f_o) = \max_{f \in \Omega} \sum_{t} \left\| M_t(f)(u_0) - M_t(0)(u_0) \right\|. \tag{6}$$

That is, the OFV derived by Eq. (5) satisfies Eq. (6), while Eq. (6) bears great resemblance to Eq. (4) for the NFSV approach. In this situation, the OFV can be understood as a tendency perturbation that is constrained by $f \in \Omega$, which makes the perturbed forecast depart from the control forecast (i.e., the unperturbed forecast) to the greatest extent but remain closest to the observations because of the constraint condition. In this sense, the OFV is mathematically consistent with the idea of the NFSV. For convenience, here, we rename the "OFV" as the "NFSV", and the related calculation is called the "NFSV-related assimilation." The NFSV-related assimilation (which is used to treat model errors) is thus distinct from the initial value assimilation (which is used to deal with initial errors). In the present study, we use the NFSV-related assimilation to correct the ICM for ENSO and establish a new ENSO forecast system.
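The assimilation problem in Eq. (5) can also be illustrated with a toy example. In the Python sketch below, which is not the implementation used in this study, synthetic "observations" are generated by a two-variable model with a hidden constant forcing `f_true`, and the forcing is then recovered by minimizing the misfit. The toy `tendency`, the squared-norm misfit, and the Gauss-Newton iteration with a finite-difference Jacobian are all assumptions made for illustration.

```python
import numpy as np

def tendency(u):
    """Toy nonlinear tendency F(u); purely illustrative, not the ICM."""
    x, y = u
    return np.array([-0.5 * x + y - x**3, -x - 0.5 * y])

def trajectory(u0, f, n_obs=12, steps_per_obs=20, dt=0.01):
    """Integrate du/dt = F(u) + f and return the states at n_obs output times."""
    u = np.array(u0, dtype=float)
    out = []
    for _ in range(n_obs):
        for _ in range(steps_per_obs):
            u = u + dt * (tendency(u) + f)
        out.append(u.copy())
    return np.concatenate(out)

def fit_forcing(u0, u_obs, n_iter=20, eps=1e-6):
    """Minimize sum_t ||M_t(f)(u0) - u_obs(t)||^2 over a constant forcing f
    by Gauss-Newton iteration with a finite-difference Jacobian."""
    f = np.zeros(2)
    for _ in range(n_iter):
        base = trajectory(u0, f)
        residual = base - u_obs
        jac = np.column_stack(
            [(trajectory(u0, f + eps * e) - base) / eps for e in np.eye(2)]
        )
        step, *_ = np.linalg.lstsq(jac, -residual, rcond=None)
        f = f + step
    return f

u0 = [0.5, 0.0]
f_true = np.array([0.08, -0.05])   # hidden "model error" (hypothetical value)
u_obs = trajectory(u0, f_true)     # synthetic observations
f_hat = fit_forcing(u0, u_obs)
print("recovered forcing:", f_hat)
```

Because the synthetic observations are noise free and generated by the same toy model, the recovered forcing matches `f_true` closely; with real observations the minimizer would only bring the simulation as close to the data as the chosen forcing structure allows.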

ENSO model and data
The ENSO model adopted here is an ICM developed by Zhang et al. (2003), and its forecast results have been presented on the International Research Institute for Climate and Society (IRI) web page, which shows real-time ENSO forecast results generated by more than 20 models across the globe (see https://iri.columbia.edu/our-expertise/climate/forecasts/enso/current/?enso_tab=enso-sst_table). The ICM is an air-sea coupled model that consists of a statistical wind stress model, an intermediate dynamic ocean model (Keenlyside and Kleeman 2002), and an SST anomaly model that represents surface ocean thermodynamics. The wind stress model is constructed with the singular value decomposition (SVD) approach, which determines the relationship between the SST and the wind field from 1963 to 1996. To represent the effect of thermocline fluctuations on SST variability, an empirical model is developed to parameterize the temperature of the subsurface water entrained into the mixed layer ($T_e$) from the sea surface height. As a result, the SST anomaly model with the $T_e$ model can capture the realistic features of ENSO evolution, including the period and amplitude (Zhang et al. 2005). However, since some processes are missing in this model, such as the effects of freshwater flux and salinity, which were found to have a role in the amplitude of ENSO (e.g., Zhang et al. 2012), the ICM fails to capture the strength of ENSO well in realistic predictions (Zheng and Zhu 2016). In addition, the ICM only represents the air-sea interaction in the tropical Pacific and ignores the effect of the extratropical Pacific; therefore, considerable model errors still exist in the model itself for realistic prediction. Zheng et al. (2009a) found that ensemble predictions based on initial perturbations have a small effect on improving ENSO predictions using the ICM, while ensemble predictions based on model perturbations show a large improvement. In particular, Qi et al. (2017) showed that model errors in the ICM are more important than initial errors in realistic prediction. All these studies encourage us to use the NFSV approach to optimally capture the model errors so as to improve the ENSO prediction skill.
Observational and/or reanalysis data are required to initialize and evaluate the ICM in ENSO forecasting. In the present study, we follow Zhang et al. (2005) and adopt a simple nudging procedure to initialize the ICM by using SST observations (Barnett et al. 1993). The wind stress anomaly is reconstructed from the SST field during the period from 1854 to the start time of the forecast via the SVD-based historical SST-wind relation. Then, the reconstructed wind field is used to force the ocean model to initialize the ocean dynamic states. In addition, the observed SST anomalies are nudged into the SST anomaly model to generate the initial SST field. Here, the observed monthly SST field is from the National Oceanic and Atmospheric Administration (NOAA) Extended Reconstructed SST, version 3b dataset (ERSSTv3b; Smith et al. 2008), and the monthly wind stress field is from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis (Kalnay et al. 1996). The SST data span from 1854 to the present, and the wind stress field spans from 1949 to the present.
For other configurations of the ICM, readers can refer to Zhang et al. (2005).

ENSO forecast system of the NFSV-related assimilation
Based on the ICM and its initial assimilation described in section 3, we develop the ENSO forecast system using the NFSV-related assimilation associated with the model error correction. The main idea is as follows. An NFSV-tendency error forecast model is built and then coupled with the ICM, with the initial assimilation in section 3, to correct the ICM and achieve a useful skill that is substantially higher than that of the ICM with the initial assimilation only. The ENSO forecast system with the NFSV-tendency error model can be achieved through the following three steps: (i) reveal the NFSV-type tendency error that makes the model simulation of SST anomalies closest to the observed anomalies by using the NFSV-related assimilation approach, (ii) derive a tendency error forecast model by expressing the predetermined NFSV-tendency errors as a function of the observed initial SST, and (iii) couple the NFSV-type tendency error forecast model with the ICM and finally establish the ENSO forecast system, which not only predicts the SSTA but also estimates the future tendency error.

a. The NFSV-type tendency error
Focusing on the SST predictions, we attribute the combined effect of the model errors to the SST tendency errors. The NFSV-related assimilation problem is constructed as follows:

$$J(f_t^*) = \min_{f_t} \sum_{t = t_0}^{t_0 + n\Delta t} \left\| X(t) - X^{\mathrm{obs}}(t) \right\|, \tag{7}$$

where $X^{\mathrm{obs}}$ and $X$ denote the observed and simulated monthly SST anomalies within a 1-yr assimilation window $[t_0, t_0 + n\Delta t]$ ($\Delta t = 1$ month, $n = 12$) and the simulation starts from $u_0$, the initial analysis obtained by the initial value assimilation. Here $f_t^*$ represents the NFSV-type tendency error that can lead the model to output accurate SST anomalies. The tendency errors are monthly dependent in the 1-yr assimilation window. That is, the tendency error is constant within one month to guarantee the mutual adjustment of ocean and atmospheric variables (Duan et al. 2014). Thus, according to Eq. (7), we can obtain a set of NFSV-type tendency errors with twelve components in the 1-yr assimilation window; namely, each month has one tendency error. Obviously, the NFSV-type tendency errors differ somewhat among assimilation windows and initial times. For the period from 1854 to 2017, we take each month as the initial month of the 1-yr assimilation window and calculate the NFSV-type tendency error based on the NFSV-related assimilation. Figure 1 shows a sketch diagram of the NFSV-related assimilation for different initial months. Then, we obtain a set of monthly dependent NFSV-type tendency errors with respect to the 12 components. For the NFSV-type tendency errors of different assimilation windows, we perform a composite analysis on the components whose months overlap in different assimilation windows. Then, we can obtain 164 monthly tendency errors during the period from 1854 to 2017.
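The composite over overlapping assimilation windows can be sketched as follows. This Python example is only an illustration under stated assumptions: the array layout `window_errors[start, k, ...]` (component `k` of the window that starts at month `start` applies to month `start + k`), the function name, and the synthetic input are all hypothetical and are not taken from the forecast system itself.

```python
import numpy as np

def composite_tendency_errors(window_errors):
    """Average the window components that fall on the same calendar month.

    window_errors has shape (n_windows, window_len, ...); window s is assumed
    to start at month s, so its component k refers to month s + k.
    Returns the composite series and the number of members per month.
    """
    window_errors = np.asarray(window_errors, dtype=float)
    n_windows, window_len = window_errors.shape[:2]
    n_months = n_windows + window_len - 1
    total = np.zeros((n_months,) + window_errors.shape[2:])
    count = np.zeros(n_months)
    for start in range(n_windows):
        total[start:start + window_len] += window_errors[start]
        count[start:start + window_len] += 1
    shape = (-1,) + (1,) * (window_errors.ndim - 2)
    return total / count.reshape(shape), count

# Synthetic check: 5 windows of 12 months on a 3-point "grid"
truth = np.arange(16 * 3, dtype=float).reshape(16, 3)
windows = np.stack([truth[s:s + 12] for s in range(5)])
composite, count = composite_tendency_errors(windows)
print(count)
```

In this synthetic case every window reports the same value for a given month, so the composite reproduces `truth`; with real assimilation output the members differ and the composite acts as an ensemble mean.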
A snapshot of the winter components of the NFSV-type tendency errors during El Niño episodes is displayed in Fig. 2. Large NFSV-type tendency errors are found near the north and south boundaries of the model. This is probably because the ICM is a regional model that has low skill in simulating the climate state near the model boundaries. Large tendency errors also arise in the eastern Pacific cold tongue, which indicates that the intensity of the El Niño event simulated by the ICM has large errors. Further examination of the relationship between the evolution of the observed SST anomalies and the corresponding NFSV-type tendency errors along the equator (see Fig. 3) shows that the NFSV-tendency errors exhibit an ENSO-like oscillation and are mainly located east of 160°W; furthermore, these errors are positive (negative) when the observed SSTs are warming (cooling). From the definition of the NFSV-type tendency errors in Eq. (7), it is inferred that the ICM tends to underestimate both El Niño and La Niña events in terms of their amplitudes due to the effects of model errors, while the NFSV-tendency errors can offset such effects (see Duan et al. 2014).

b. The NFSV-type tendency error forecast model
In step i, we obtain the NFSV-type tendency errors during different assimilation windows, which can correct the ICM to simulate the observed ENSO cycle. Note that the time-dependent observations adopted to determine the NFSV-tendency errors are available during the assimilation windows. However, for predictions, we do not have access to observations during the prediction time period and, thus, cannot obtain the corresponding NFSV-tendency errors by the approach in step i. Given the usefulness of the NFSV-related assimilation in correcting the model errors, we hope this assimilation can be used for predictions. Therefore, how do we make predictions via this type of assimilation?
FIG. 1. Schematic diagram illustrating the NFSV-related assimilation windows and the strategy of the composite NFSV-tendency errors. The twelve blue shaded boxes in each row cover one assimilation window, with the initial month marked on the vertical axis, and denote the 12 members of the NFSV-tendency errors during the 1-yr assimilation window. The red shaded boxes on the last row represent the 164 monthly tendency errors during the period from 1854 to 2017, which are obtained by taking the ensemble mean of the members whose months overlap in different assimilation windows.

According to Eq. (7), the NFSV-tendency errors depend on the known observed SST anomalies. Furthermore, we find that the observation series during different assimilation windows correspond to different NFSV-tendency errors. That is, the NFSV-tendency errors are flow dependent. Figure 3 shows that a certain flow-dependent relation exists between the NFSV-type tendency error and the observed SST anomaly (see section 4a). That is, the NFSV-type tendency errors are positive (negative) in the eastern tropical Pacific when the observed SSTs are warming (cooling). This relation encourages us to construct an equation that addresses the dependence of the NFSV-tendency errors on the observed SST anomalies. Then, we can use this equation to forecast future NFSV-tendency errors according to known observations. The SVD approach is used to clarify the flow-dependent relation between the NFSV-type tendency errors and the observed SSTs. We can build a lead-lag relationship between the observed SST anomalies and the NFSV-tendency errors by the SVD. That is, we develop an equation to describe the sensitivity of the lagged NFSV-tendency errors to the lead observations, which allows for the estimation of the NFSV-tendency errors during the forecast period using the current observations.
To achieve this, the covariance $C$ in the SVD analysis is calculated from the matrix including the observed SST anomalies and the lagged NFSV-tendency errors [see Eq. (8)]:

$$C = \frac{1}{N} \sum_{t=1}^{N} \mathrm{SSTA}_{(i,j)}(t)\, \mathrm{NFSV}_{(i,j)}^{\mathsf{T}}(t + l), \tag{8}$$

where $(i, j)$ represents the model grid, $l$ represents the lagged months of the NFSV-type tendency errors relative to the SST anomalies, and $N$ represents the time length for computing the covariance $C$. Then, the lead-lag relation is obtained by performing the SVD technique on the covariance matrix $C$ (Bretherton et al. 1992), which can be written as

$$\mathrm{TE}(t + l) = F_l\left[\mathrm{SSTA}(t)\right], \tag{9}$$

where $F_l$ describes the relationship between the lagged NFSV and the $l$-month lead SST anomaly and TE represents the NFSV-tendency error estimated by Eq. (9) from the observed SST anomaly field at an $l$-month lead. Such a lead-lag relationship provides the possibility of estimating NFSV-tendency errors in advance. That is, when the prediction is initialized at one month, the future NFSV-tendency errors can be forecasted through this lead-lag relation by inputting the known initial observations. Thus, an NFSV-tendency error forecast model can be constructed.
The NFSV-tendency error forecast model, as shown in the last paragraph, can be constructed using the known SST observations and predetermined NFSVs. To examine its validity, we take the period from 1960 to 1996 as the training period to determine the NFSV-tendency error forecast model and the period 1997-2017 as the cross-validation period. In the training period, 10 leading SVD modes are used to construct $F_l$ in Eq. (9), while the remaining SVD modes are discarded due to their nearly stochastic and unrelated properties. Since the truncation of the SVD modes may reduce the variance in the NFSV-tendency errors, a scalar coefficient $\alpha$ is introduced to scale the strength of the tendency errors. In the present study, the coefficient $\alpha$ is taken as 0.6, which was verified to be more suitable than other values in predicting ENSO. The experiments for determining $\alpha$ and the number of SVD modes are described in the appendix. Now, we use the NFSV-tendency error forecast model to predict the NFSV-type tendency errors during the cross-validation period. As mentioned above, tendency errors are significant along the equator and northern boundary of the model. Thus, to demonstrate the effectiveness of the NFSV-tendency error forecast model, we present the predicted NFSV-tendency errors along the equator (Fig. 4a) and northern boundary of the model (Fig. 4b). In particular, Fig. 4 displays the predicted NFSV-tendency errors at the 6-month lead time during the period 1990-2001, which includes 1990-96 as part of the training period and 1997-2001 as part of the cross-validation period. The predicted NFSV-type tendency errors during the training period 1990-96 are almost identical to the NFSV-type tendency errors predetermined by the NFSV assimilation along both the equator and northern boundary of the model, suggesting that the lead-lag relationship between the observed SST and NFSV-type tendency errors is well captured by the function $F_l$ in Eq. (9).
For the cross-validation period 1997-2001, the predicted NFSV-tendency errors also approximate the predetermined NFSV-tendency errors well, including their amplitudes and locations. In particular, the phase change in the predicted NFSV-tendency errors fits that of the predetermined errors very well. All of these results suggest that the constructed NFSV-SST relation [i.e., Eq. (9)] has the ability to predict future NFSV-tendency errors from the current SST field. Therefore, the NFSV-tendency error can be expected to reduce the model uncertainties and yield a better prediction when its related tendency error forecast model is coupled with the ICM.
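A lead-lag model of this kind can be sketched with NumPy. The construction below (truncated SVD of the cross covariance, followed by a per-mode regression between expansion coefficients) is one common way to build an SVD-based statistical model and is assumed here for illustration; it is not claimed to be the exact procedure behind Eq. (9). The function names, array layout, and synthetic data are hypothetical, and `alpha` plays the role of the scaling coefficient described above (0.6 in this study, 1.0 in the demonstration).

```python
import numpy as np

def fit_lead_lag_svd(ssta, te, lag, n_modes):
    """Fit TE(t + lag) ~ F_l[SSTA(t)] from the leading SVD modes.

    ssta, te: arrays of shape (n_time, n_space); lag is in months.
    Returns (u, v, beta): SSTA patterns, TE patterns, per-mode slopes.
    """
    x = ssta[: len(ssta) - lag]            # lead field
    y = te[lag:]                           # lagged field
    cov = x.T @ y / len(x)                 # cross-covariance matrix C
    u, _, vt = np.linalg.svd(cov, full_matrices=False)
    u, v = u[:, :n_modes], vt[:n_modes].T  # keep the leading modes only
    a, b = x @ u, y @ v                    # expansion coefficients
    beta = (a * b).sum(axis=0) / (a * a).sum(axis=0)
    return u, v, beta

def predict_te(ssta_now, model, alpha=1.0):
    """Estimate the tendency error `lag` months ahead from the current SSTA."""
    u, v, beta = model
    return alpha * ((ssta_now @ u) * beta) @ v.T

# Synthetic demonstration with two coupled modes (hypothetical data)
lag, n = 6, 240
t = np.arange(n)
coef = np.column_stack([2.0 * np.sin(2 * np.pi * t / 24),
                        np.cos(2 * np.pi * t / 12)])
p = np.linalg.qr(np.random.default_rng(0).normal(size=(6, 2)))[0]  # SSTA patterns
q = np.linalg.qr(np.random.default_rng(1).normal(size=(5, 2)))[0]  # TE patterns
gain = np.array([0.5, -1.5])
ssta = np.vstack([coef @ p.T, np.zeros((lag, 6))])
te = np.vstack([np.zeros((lag, 5)), (coef * gain) @ q.T])

model = fit_lead_lag_svd(ssta, te, lag=lag, n_modes=2)
te_pred = predict_te(ssta[:n], model)
print(np.abs(te_pred - te[lag:]).max())
```

The synthetic fields are built so that two modes explain the relation exactly; real fields are noisy, which is why a truncation (10 modes in this study) and a scaling coefficient are needed.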

c. The ENSO forecast system of the NFSV assimilation
We combine the original ICM with its initial assimilation and the NFSV-tendency error forecast model determined by the NFSV-assimilation approach and finally formulate a new ENSO forecast system by superimposing the predicted NFSV-tendency error on the SST tendency of the ICM. Figure 5 shows a schematic diagram of the new ENSO forecast system (hereafter NFSV-ICM). The initialization scheme in the NFSV-ICM is the same as that in the ICM (section 3). Specifically, we run the oceanic component of the ICM forced by the reconstructed wind before the prediction begins, nudge the observed SST to the model, and then obtain the initial states of the predictions. Using these initial states, we integrate the NFSV-ICM and make the ENSO predictions. The predicted NFSV-tendency errors are superimposed on the total tendency of the SST equation in the ICM with the initial value assimilation, which then perturbs the predicted SST anomalies at each time step of the model integration. As such, the predicted NFSV-tendency errors also suppress the effects of the initial errors on the prediction uncertainties. Theoretically, the NFSV-tendency errors account for the interaction between the model errors and initial errors.
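The superposition of a predicted tendency error on the model tendency can be written schematically as a time-stepping loop. The Python sketch below is only a cartoon of that idea: the function name, the forward-Euler step, the piecewise-constant monthly forcing, and the trivial `model_tendency` used in the example are assumptions and do not represent the ICM code.

```python
import numpy as np

def integrate_with_tendency_correction(sst0, model_tendency, te_by_month,
                                        steps_per_month, dt):
    """Integrate dT/dt = model_tendency(T) + TE, where TE is held constant
    within each forecast month, and return the state at each month's end."""
    sst = np.array(sst0, dtype=float)
    monthly = []
    for te in te_by_month:                 # one predicted tendency error per month
        for _ in range(steps_per_month):   # TE perturbs every model time step
            sst = sst + dt * (model_tendency(sst) + te)
        monthly.append(sst.copy())
    return np.array(monthly)

# Example with a hypothetical damped "model" and a constant correction
forecast = integrate_with_tendency_correction(
    sst0=[1.0, -0.5],
    model_tendency=lambda s: -0.1 * s,
    te_by_month=[np.array([0.02, 0.0])] * 12,
    steps_per_month=30,
    dt=0.1,
)
print(forecast.shape)
```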

The performance of the NFSV-ICM
The predictions of SST anomalies (SSTAs) associated with ENSO are made with 1-, 2-, 3-, . . . , 12-month lead times for the period 1960-2016. For comparison, the prediction results from both the original ICM and NFSV-ICM are output, and their skill scores are evaluated against the observed monthly mean SST anomalies. In addition, we note that the NFSV-tendency error forecast model embedded in the NFSV-ICM is obtained by the NFSV-related assimilation for the observed SST during the training period of 1960-96. Hence, the NFSV-ICM should first be validated by predicting ENSO during the training period of 1960-96 to show whether the feedback among state variables is reasonable. Then, the model should be tested by forecasting ENSO during the cross-validation period of 1997-2016 to examine its reliability in more realistic predictions.

FIG. 4. Time-dependent NFSV-tendency errors determined by the NFSV-related assimilation (shaded) and those estimated from the NFSV-tendency error forecast model with a 6-month lead time (contours) along (a) the equator and (b) the northern boundary of the model. The purple dashed lines distinguish the training period (1991-96) from the cross-validation period (1997-2001). The shaded regions during both the training period and the cross-validation period coincide well with those marked by contours, which indicates that the NFSV-tendency error forecast model is valid in estimating future NFSV-tendency errors. The contour interval is $2 \times 10^{-6}$ °C s$^{-1}$.

To examine the prediction skill of SSTAs, two frequently used measures are selected. One is the root-mean-square error (RMSE), which represents the deviation of the predictions from the observations, and the other is the anomaly correlation coefficient (ACC), which measures how closely the predicted and observed anomalies covary.
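For reference, the two skill measures can be computed as in the following Python sketch. It assumes that predictions and observations are stored as arrays of shape (n_time, n_space) and that the ACC is the centered (Pearson) correlation over time at each grid point; other conventions for the ACC exist, so this is an illustration rather than the exact formula used for the figures.

```python
import numpy as np

def rmse(pred, obs):
    """Root-mean-square error over time at each grid point."""
    return np.sqrt(np.mean((pred - obs) ** 2, axis=0))

def acc(pred, obs):
    """Anomaly correlation coefficient over time at each grid point
    (centered correlation; an assumed convention)."""
    pa = pred - pred.mean(axis=0)
    oa = obs - obs.mean(axis=0)
    return (pa * oa).sum(axis=0) / np.sqrt((pa**2).sum(axis=0) * (oa**2).sum(axis=0))

# Small synthetic example: 4 times, 2 grid points
obs = np.array([[0.0, 1.0], [1.0, -1.0], [2.0, 0.5], [3.0, 2.0]])
print(rmse(obs + 0.5, obs), acc(2 * obs + 1, obs))
```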
Figures 6 and 7 illustrate the spatial distributions of the ACC and RMSE, respectively, for the predicted SST anomalies against the observed SSTAs in the tropical Pacific. For all lead times, the NFSV-ICM shows a much higher prediction skill (in both the ACC and RMSE) in the central tropical Pacific than the ICM. The NFSV-ICM also demonstrates skillful predictions of SSTAs in the far western tropical Pacific, while the ICM fails to do so. In the meridional direction, the skillful predictions generated by the NFSV-ICM cover almost the whole tropical Pacific, especially for short lead times. However, the ICM only shows skill in the region between 10°N and 10°S; furthermore, with increasing lead times, the skill drops quickly in the eastern tropical Pacific and near the equator. When the lead time is up to 9 months, the ICM predictions lose useful skill in almost the whole tropical Pacific and even show a negative ACC off the equator (Fig. 6c1). In comparison, the ACC of the NFSV-ICM predictions is still greater than 0.6 in the central tropical Pacific at this lead time (Fig. 6c2).
From the perspective of the RMSE, it can be seen that the prediction errors are much larger along the equator when the predictions are made using the ICM (left panels of Fig. 7). At short lead times, the prediction errors are mainly concentrated near the coast of Peru and, with increasing lead times, such prediction errors in the eastern Pacific become large and extend toward the west. In addition, large prediction errors are also found near the model boundaries; these errors propagate to the central equatorial Pacific with time and increase to 1.2°C at the 12-month lead time. However, when using the NFSV-ICM, all of these prediction errors are significantly decreased, especially over regions near the equator. It is obvious that the NFSV-ICM possesses smaller prediction errors than the ICM in predicting SSTAs in the tropical Pacific.
The anomaly correlations between the observed and predicted SSTAs in the Niño-3.4 area are plotted in Fig. 8a1 as a function of the lead time. The prediction skill of the ICM declines faster than that of the NFSV-ICM, and the differences between the ICM and NFSV-ICM in the ACCs gradually increase from 0.1 at the 3-month lead time to 0.3 at the 12-month lead time, with the NFSV-ICM showing a much larger ACC. Furthermore, the skill of the NFSV-ICM exceeds that of both the ICM and the persistence prediction at all lead times. If skillful predictions are defined as those with an ACC larger than 0.6, the predictable time length for the Niño-3.4 index is increased from 6 months using the ICM to 12 months using the NFSV-ICM. A similar improvement is also shown in the RMSE (see Fig. 8b1), especially for predictions with a 12-month lead time. These results indicate that the NFSV-ICM is more skillful than the ICM in predicting the Niño-3.4 index.
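
The 0.6 criterion above can be turned into a simple rule for the predictable lead time. The sketch below is illustrative only: the function name `skillful_lead` is hypothetical, and the ACC values in the usage note are made-up numbers, not results from the ICM or the NFSV-ICM.

```python
import numpy as np


def skillful_lead(acc_by_lead, threshold=0.6):
    """Longest lead (in months) up to which the ACC stays above threshold.

    acc_by_lead[i] is assumed to hold the ACC at a lead time of i + 1 months.
    Returns 0 if the ACC is not above the threshold even at a 1-month lead.
    """
    acc_by_lead = np.asarray(acc_by_lead, dtype=float)
    below = np.nonzero(acc_by_lead <= threshold)[0]
    # If the ACC never drops to the threshold, every lead counts as skillful.
    return len(acc_by_lead) if below.size == 0 else int(below[0])
```

For a made-up ACC curve `[0.9, 0.8, 0.7, 0.65, 0.62, 0.61, 0.5, 0.4]`, the function returns 6, because the 7-month lead is the first one at which the ACC is no longer above 0.6.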
To reveal the season-dependent prediction skill, the anomaly correlations for the Niño-3.4 index are also calculated as a function of both start month and lead time. The results for the ICM and NFSV-ICM are shown in Figs. 9a1 and 9b1, respectively. The prediction skill clearly depends on the season. Specifically, both the ICM and NFSV-ICM predictions show high skill in boreal winter and low skill in spring. However, in either winter or spring, the ICM predictions always show lower skill than the NFSV-ICM predictions of the Niño-3.4 index. The low prediction skill in spring is generally referred to as the well-known SPB phenomenon, which is defined as a rapid decline in the anomaly correlation coefficient when the prediction is made across boreal spring. From the above results, it can be deduced that although both the ICM and NFSV-ICM suffer from the SPB phenomenon, the NFSV-ICM notably reduces the effect of the SPB in predicting tropical Pacific SST anomalies and gives a more accurate prediction than the ICM.
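
A season-dependent skill diagram of this kind can be assembled by grouping hindcasts by calendar start month before correlating them with the verifying observations. The sketch below assumes a hypothetical array layout (`pred[s, l - 1]` holds the forecast started in month `s` at a lead of `l` months, verifying at month `s + l`, with `s = 0` being a January); the actual data layout and verification convention of the study are not given in the text.

```python
import numpy as np


def acc_by_start_and_lead(pred, obs, n_leads=12):
    """ACC of an index as a function of calendar start month and lead time.

    pred[s, l - 1]: forecast started in month s for a lead of l months.
    obs[t]:         observed monthly index, aligned so that t = s + l verifies.
    Returns an array of shape (12, n_leads); entries stay NaN when fewer
    than three forecast/observation pairs are available.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n_start = pred.shape[0]
    out = np.full((12, n_leads), np.nan)
    for month in range(12):
        # All forecasts that start in the same calendar month.
        starts = np.arange(month, n_start, 12)
        for lead in range(1, n_leads + 1):
            valid = starts[starts + lead < obs.size]
            if valid.size < 3:
                continue
            out[month, lead - 1] = np.corrcoef(
                pred[valid, lead - 1], obs[valid + lead]
            )[0, 1]
    return out
```

In such a diagram, a band of rapidly decreasing ACC that follows the verification months of boreal spring across the start months would correspond to the SPB described above.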
In summary, we demonstrate here that the NFSV-ICM has greater ability than the ICM in predicting SST anomalies associated with ENSO during the training period. In particular, the SPB phenomenon is weakened in the NFSV-ICM, and the related prediction skills are improved.
In the last section, we validated the NFSV-ICM in improving ENSO prediction during the training period from 1960 to 1996. The NFSV-tendency error forecast model is obtained by assimilating the observation information during this period. It is therefore understandable that the NFSV-ICM provides a significant improvement in ENSO predictions during 1960-96. A cross-validation experiment is a much more realistic test of the validity of the NFSV-ICM in predicting ENSO. The cross-validation experiment here means that the NFSV-tendency error forecast model obtained during 1960-96 is inserted into the ICM to predict the SSTAs during the period of 1997-2016. That is, we use the same NFSV-ICM as in the last section to examine the improvement in prediction skill of the NFSV-ICM over the ICM in predicting ENSO.
FIG. 7. As in Fig. 6, but for the RMSE. The contour interval is 0.2°C.
The ACC and RMSE for the prediction experiments are presented in Figs. 10 and 11, respectively. The anomaly correlation obtained from the NFSV-ICM is clearly larger than that from the ICM at lead times of 1, 2, 3, . . . , 12 months, and the RMSE is much smaller than that of the ICM. This indicates that the NFSV-ICM outperforms the ICM in predicting SST anomalies. In particular, the NFSV-ICM still provides useful skill at a 12-month lead time, with an ACC larger than 0.6 (Fig. 8a2). Although the SPB phenomenon also occurred in the cross-validation period of 1997-2016, it was weaker in the NFSV-ICM than in the ICM (Figs. 9a2 and 9b2). Similar to the training period, the prediction skills for SSTAs in the central and western tropical Pacific are also significantly improved in the NFSV-ICM compared with the original ICM (Fig. 10). These results show that the NFSV-ICM achieves high prediction skill for SSTAs during the cross-validation period. Therefore, the NFSV-ICM can be a useful forecast system for realistic ENSO events.

c. Prediction skills for two types of ENSO
The fact that the NFSV-ICM shows high performance in the central Pacific implies that the new model can predict well the central Pacific warming events known as CP El Niño. Such warming events, which have occurred frequently in recent decades, are found to have different climate effects compared with the traditional El Niño (denoted as EP El Niño) (Ashok et al. 2007). Therefore, predicting the spatial structures of El Niño events, and especially distinguishing the two kinds of events, is also important.
Predictions for the two kinds of El Niño events are shown in Figs. 12 and 13, respectively. Although the ICM has the ability to predict the amplitude of EP El Niño, it tends to predict a cooler-than-normal SST anomaly in the subtropical Pacific. By contrast, the NFSV-ICM can not only predict the amplitude but also capture the spatial structures of the EP El Niño events. Furthermore, an evident improvement is found in CP El Niño prediction using the NFSV-ICM. The ICM tends to lose skill in predicting CP El Niño at a 6-month lead time and usually predicts a cooling event, whereas the NFSV-ICM still has skill in predicting the spatial distributions of CP El Niño events. These results indicate that the ICM equipped with the NFSV assimilation is likely to have an advantage in discerning and predicting different types of El Niño events.

Conclusions and discussion
The predictions of ENSO events are generally influenced by both initial errors and model uncertainties. In particular, an ICM usually neglects or simplifies some physical processes, which induces large model errors and influences the accuracy of the predictions of ENSO (Qi et al. 2017). In the present study, we focus on the model errors and develop a new ENSO forecast system (NFSV-ICM) consisting of an ICM and an NFSV-tendency error forecast model that is used to estimate the combined effect of the model errors. The prediction experiments are performed for tropical SSTAs during both the training period (1960-96) and the cross-validation period (1997-2016). The results for both periods show that the NFSV-ICM tends to possess a much higher forecast skill compared with the original ICM. In particular, a considerable improvement in the forecast skill is reflected in the central and western tropical Pacific. Furthermore, the well-known SPB phenomenon is also clearly weakened in the NFSV-ICM. The NFSV-ICM shows useful skill in ENSO forecasting and can be a promising ENSO forecast system.
The high skill of the NFSV-ICM in predicting ENSO is mainly due to the embedded NFSV-tendency error forecast model. Since the model errors arise from different sources and their effects are mixed in the prediction uncertainties (Zheng and Zhu 2016), it is difficult to distinguish them and study them separately. The NFSV-tendency error forecast model considers the effect of model errors from a macro perspective and describes the combined effect of different model errors by the NFSV-tendency perturbation. The NFSV-tendency perturbation is superimposed at each time step of the model integration and therefore also suppresses the effect of the initial errors. The NFSV-tendency perturbation approximates the optimal tendency error through the NFSV-related assimilation and is sufficient to account for the interaction between model errors and initial errors, which therefore corrects the ICM forecasts to the greatest extent and enables the NFSV-ICM to achieve high skill in predicting ENSO. Thus, the NFSV-ICM can consider not only the effect of model errors but also the effect of the initial errors. This is the competitive advantage of the NFSV-ICM compared with traditional initial-value assimilation.
FIG. 9. ACC of the predicted Niño-3.4 SST anomalies as a function of start month and lead time. The predictions are made by the (a) ICM and (b) NFSV-ICM during the (left) training period (1960-96) and (right) validation period (1997-2016). The contour interval is 0.1.
FIG. 10. As in Fig. 6, but for the cross-validation period (1997-2016).
Because the NFSV-tendency error forecast model is constructed based on the SVD relationship between SST and NFSV, the trained SST-NFSV relation depends on the historical data and on the uncertainties of the SVD analyses. Therefore, sensitivity experiments on the NFSV-tendency error model are implemented to explore how the prediction skill is influenced by the uncertainties of the SST-NFSV relation (see the appendix). Because the SST-NFSV relation is essentially nonlinear, a relation constructed by the linear SVD approach is inherently uncertain. More advanced methods (e.g., machine learning) are expected to be adopted to make the SST-NFSV relation more robust (Reichstein et al. 2019).
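
To make the idea of a linear, SVD-based SST-NFSV relation concrete, the following sketch fits a truncated relation from paired samples of SST anomalies and tendency errors. It is an illustration under stated assumptions, not the authors' implementation: the function names, the per-mode regression used to link the two sets of expansion coefficients, and the `strength` scaling factor are all hypothetical choices made here.

```python
import numpy as np


def fit_svd_relation(sst, nfsv, n_modes):
    """Fit a truncated linear SST -> tendency-error relation.

    sst:  (n_time, n_sst) SST anomalies (assumed to have zero time mean)
    nfsv: (n_time, n_nfsv) tendency errors at the same times
    n_modes should not exceed the rank of the cross-covariance matrix.
    """
    sst = np.asarray(sst, dtype=float)
    nfsv = np.asarray(nfsv, dtype=float)
    cov = sst.T @ nfsv / (sst.shape[0] - 1)       # cross-covariance matrix
    u, _, vt = np.linalg.svd(cov, full_matrices=False)
    u, v = u[:, :n_modes], vt[:n_modes].T         # leading singular-vector pairs
    a = sst @ u                                   # SST expansion coefficients
    b = nfsv @ v                                  # tendency-error coefficients
    # Assumed closure: one least-squares regression coefficient per mode.
    coef = np.sum(a * b, axis=0) / np.sum(a * a, axis=0)
    return u, v, coef


def predict_nfsv(sst, model, strength=1.0):
    """Estimate tendency errors from SST anomalies with a fitted relation."""
    u, v, coef = model
    return strength * ((np.asarray(sst, dtype=float) @ u) * coef) @ v.T
```

Because the relation is linear and truncated to `n_modes` pairs, anything in the tendency errors that is not captured by the retained modes, or that depends nonlinearly on SST, is lost, which is one way to read the uncertainty discussed above.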
It should be pointed out that the ICM-based ensemble prediction system (EPS-ICM; see Zheng et al. 2009b) shows skill almost equivalent to that of the NFSV-ICM for SST predictions. Note that the EPS-ICM is constructed using not only ensemble Kalman filter (EnKF) data assimilation with SST fields to generate the initial ensemble conditions but also a model-error model to characterize the model uncertainties. The model-error model is a zero-mean first-order Markov stochastic model that is developed by analyzing historical model errors. That is, the EPS-ICM is an ensemble forecast system that involves both initial and model errors and uses a relatively advanced data assimilation. By contrast, the NFSV-ICM is only a deterministic prediction system, in which the SST observations are used only to initialize the model with simple data assimilation (i.e., the nudging method). Obviously, it is the NFSV-tendency error forecast model that plays the important role in improving the predictions generated by the NFSV-ICM. The NFSV-tendency error forecast model, owing to its optimal NFSV-tendency perturbation, has greater potential for correcting the model and greatly improving the prediction skill. On the other hand, it is worth mentioning that Zheng and Zhu (2016) considered the coupling of atmosphere and ocean in the initialization of the EPS-ICM and used an advanced EnKF assimilation approach to initialize the model, finally achieving much higher forecast skill for ENSO. This encourages us to equip the NFSV-ICM with the EnKF and to consider the air-sea coupling in initialization, and thereby to greatly enhance the forecast skill of the NFSV-ICM with respect to the types of ENSO. This idea is currently under investigation.
FIG. 11. As in Fig. 7, but for the cross-validation period (1997-2016).
In addition, after the 1990s, a new flavor of El Niño with its warm center in the central tropical Pacific (known as CP El Niño), in contrast to the traditional El Niño (EP El Niño) with its warm center in the eastern tropical Pacific, has occurred frequently, giving rise to additional model errors and posing new challenges to the simulation and prediction of ENSO (Kim et al. 2012;Tian and Duan 2016;Duan et al. 2018). Although some models, including the ICM used in the present study, are equipped with advanced assimilation techniques to optimize the initial fields, they still show low forecast skill for CP ENSO because of model error (Hendon et al. 2009;Duan et al. 2014). In the present study, we have shown that the NFSV-ICM greatly improves the prediction skill for SSTs in the central and western tropical Pacific. We also find that the ICM equipped with the NFSV assimilation greatly improves the capacity to discern and predict the two kinds of El Niño events. Thus, the NFSV-ICM may provide a promising way to study predictability in terms of the EP and CP El Niño events.
In addition, we have not addressed why and how the NFSV-tendency errors work in the SST predictions, especially with regard to the high prediction skill in the central Pacific. These issues will be addressed in the next paper.
the SST-NFSV relation will be included. As a result, the constructed error model has low skill in estimating the tendency errors during periods that do not overlap the training period. Thus, this section further examines the sensitivity of the NFSV-ICM to the strength of the error forcing and the number of retained SVD modes. The structure of the tendency errors estimated from the SST anomaly is highly dependent on how many SVD modes are retained. To evaluate the error model in terms of the spatial structure during the training period (i.e., 1960-96), a mean spatial similarity (MSS) between the predicted tendency errors ($\mathrm{NFSV}_p$) and the NFSVs is used, defined as $\mathrm{MSS} = \frac{1}{N}\sum_{t=t_1}^{t_N} \frac{\mathrm{NFSV}(t)\cdot\mathrm{NFSV}_p(t)}{|\mathrm{NFSV}(t)|\,|\mathrm{NFSV}_p(t)|}$, in which $N = 37 \times 12$ (months). The result is presented in Fig. A1. It is clear that an NFSV-error model constructed with more retained SVD modes is better able to capture the patterns of the tendency errors. In particular, the skill improves only slightly when more than 6 leading modes are retained. That is, the high-order modes play a small role in the SST-NFSV relation. Besides, retaining high-order modes introduces noise that harms the estimation of the tendency error from the SST information. In this sense, retaining a reasonable number of SVD modes can filter out noise and extract a robust SST-NFSV relation.
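
The mean spatial similarity described above is the time average of a cosine similarity between two spatial fields. A minimal sketch follows, assuming each field is stored as an array of shape `(n_time, n_space)`; the function name and the array layout are assumptions made for illustration.

```python
import numpy as np


def mean_spatial_similarity(nfsv, nfsv_pred):
    """Time mean of the cosine similarity between two (n_time, n_space) fields.

    Assumes that neither field is identically zero at any time step.
    """
    nfsv = np.asarray(nfsv, dtype=float)
    nfsv_pred = np.asarray(nfsv_pred, dtype=float)
    # Inner product and norms are taken over the spatial axis at each time.
    dot = np.sum(nfsv * nfsv_pred, axis=1)
    norms = np.linalg.norm(nfsv, axis=1) * np.linalg.norm(nfsv_pred, axis=1)
    return float(np.mean(dot / norms))
```

The measure equals 1 when the predicted pattern is proportional to the reference pattern at every time step, regardless of amplitude, so it isolates errors in spatial structure from errors in strength.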
The truncation of the SVD modes necessarily changes the strength of the estimated tendency errors. The sensitivity to the number of SVD modes and to the corresponding strength of the error forcing applied to the ICM is analyzed from the perspective of the SST anomaly predictions. The anomaly correlations between the predicted and observed SST anomalies as a function of lead time during the period 1960-1996 are shown in Fig. A2. In short-time (e.g., 1-month lead time) predictions, the results are insensitive to the number of SVD modes and the error strength, since the model errors play a small role in short-time predictions. As the prediction length increases, the performance of the NFSV-ICM becomes dependent on the SVD modes and on the parameter a. As shown in Fig. A2, the NFSV-ICM possesses the largest skill when a = 0.6 or a = 0.8, for which the correlation is larger than 0.65 even at a 12-month lead time. In addition, the dependence of the model skill on the SVD modes becomes more prominent as the lead time increases. Consistent with Fig. A1, the NFSV-ICM with more modes retained tends to show higher skill in SST predictions. However, the skill using 10 SVD modes is identical to that using 12 SVD modes.
FIG. A1. The mean spatial similarity between NFSVs and statistically determined NFSVs as a function of retained SVD modes.
FIG. A2. Correlations of the Niño-3.4 SST anomaly during the period 1960-96, as a function of retained SVD modes and intensity of the error forcing (i.e., a). Each panel denotes the result at a certain lead time. The x coordinate is the value of a (e.g., 6a denotes a = 0.6). The y coordinate is the number of leading SVD modes retained in the NFSV-tendency error model (e.g., 10m denotes 10 modes).
As discussed above, a successful NFSV-ICM should not only improve the prediction during the training period but also be useful in periods that do not overlap the training period. Therefore, the sensitivity of the model skill to the SVD modes and the parameter a during the period 1997-2016 is shown in Fig. A3. Similar to Fig. A2, the ICM equipped with the NFSV-tendency error model is improved when a limited number of SVD modes and a suitable value of a are used. The skill of the NFSV-ICM reaches its peak when 10 SVD modes are retained and a = 0.6 or a = 0.4. A significant decrease in the prediction error is also achieved (not shown).
From the above, the new ENSO forecast system has the largest skill in ENSO prediction when the NFSV-tendency error model is determined by 10 leading SVD modes and a = 0.6.