## Abstract

Combining dynamical models with statistical algorithms is an important way to improve weather and climate prediction. In this study, a concept of a perfect model, whose solutions are from observations, is introduced, and a dynamical-statistical-analog ensemble forecast (DSAEF) model is developed as an initial-value problem of the perfect model. This new analog-based forecast model consists of the following three steps: (i) construct generalized initial value (GIV), (ii) identify analogs from historical observations, and (iii) produce an ensemble of predictands. The first step includes all appropriate variables, not only at an instant state but also during their temporal evolution, that play an important role in determining the accuracy of each predictand. An application of the DSAEF model is illustrated through the prediction of accumulated rainfall associated with 21 landfalling typhoons occurring over South China during the years of 2012–16. Assuming a reliable forecast of landfalling typhoon track, two different experiments are conducted, in which the GIV is constructed by including (i) typhoon track only; and (ii) both typhoon track and landfall season. Results show overall better performance of the second experiment than the first one in predicting heavy accumulated rainfall in the training sample tests. In addition, the forecast performance of both experiments is comparable to the operational numerical weather prediction models currently used in China, the United States, and Europe. Some limitations and future improvements as well as comparisons with some existing analog ensemble models are also discussed.

## 1. Introduction

Considerable progress has been made in numerical weather prediction (NWP) during the past decades due partly to a steady accumulation of scientific knowledge and partly to technological advances in utilizing a variety of observations and gaining computing power (Bauer et al. 2015). Despite the steady progress, significant forecast errors in today’s NWP models are still present, especially in processing atmospheric statistical properties that are not directly available from the NWP outputs, such as tropical cyclone (TC) intensity, and other mesoscale and smaller-scale features. Thus, it is highly desirable to develop alternative approaches to reducing NWP errors. To our knowledge, Koo (1958a) was among the first to make such an attempt by treating weather forecasts as both an “initial-value” and “evolution” problem. He hypothesized that the three-dimensional flow structures of a large-scale baroclinic system at a given moment could be reconstructed from the evolution of flow contours and temperature fields before and after. In a subsequent study, Koo (1958b) stressed the importance of using historical data in NWP models to improve weather forecasts. Later, Charney et al. (1969) pointed out that historical data, even incomplete sometimes, are useful for inferring the present atmospheric state in NWP models. By transforming a solution of differential equations problem into an extreme-value of functional-variational problem, Chou (1974) expanded the notion of the solution of differential equations by introducing the “generalized solution” concept through incorporating past observations into the temporal evolution of meteorological fields in NWP. Then, Chou (1986) explained why a dynamical model and statistical methods can be combined and how the combined dynamical-statistical model could improve NWP. Briefly, such a dynamical-statistical model can retain the physical significance of the atmospheric processes described by the primitive equations, while it adds the statistical properties of certain quantities that cannot be accurately determined dynamically.

In general, NWP errors come from the following two types of sources: one from the model’s initial conditions and the other from the model’s dynamical and physical deficiencies. The former can be significantly reduced through the application of sophisticated data assimilation and ensemble forecasts, while the latter still remains a challenging research subject (Chou 1974; Lorenc 1986; Kalnay 2002). To reduce prediction errors from model deficiencies, an important alternative is to develop empirical strategies, in addition to increasing model resolution and improving physical parameterizations.

Empirical strategies to reduce forecast errors may be classified into (i) tendency error corrections, which can be made online by nudging the model tendencies toward observations during model’s integrations, and (ii) direct forecast-error corrections, which can be made offline after obtaining model forecasts (Danforth and Kalnay 2008a). The first empirical strategy is NWP model dependent and has been widely used with various data assimilation techniques in both the operational and research communities (e.g., Kalnay 2002). Leith (1978) developed a statistical method in which a state-dependent empirical correction is applied to the dynamical model through minimizing the model tendency errors at every time step. Subsequently, numerous studies have shown improvements of NWP with different models by following Leith’s methodology (e.g., DelSole and Hou 1999; Danforth et al. 2007; DelSole et al. 2008; Danforth and Kalnay 2008a,b; Yang et al. 2008).

The second strategy involves the diagnostic or prognostic use of analogs with the assumption that two similar flow patterns will likely remain similar for some time to follow (Van Den Dool 1994). For the former, Roebber and Bosart (1998) found a strong dependence of observed rainfall structures on mesoscale details after exploring their sensitivities to small variations in the synoptic circulation. The latter has been applied to predicting short-term climate fluctuations (Barnett and Preisendorfer 1978; Livezey et al. 1994), and estimating atmospheric predictability (Lorenz 1969). Further, Chou (1979) suggested that if the forecast state could be regarded as a small perturbation superimposed on a historical analog field, an analog-statistical technique can be applied to the dynamical forecast field. By estimating the model’s tendency error with respect to the historical analog state, Qiu and Chou (1989) obtained a deviation equation in a quasigeostrophic model and then illustrated the improvements of model forecast performance. With the same philosophy, improvements have been reported in medium-range to seasonal forecasts by some studies (Huang et al. 1993; Bao et al. 2004; Yu et al. 2014). A similar methodology, applied to four-dimensional variational assimilation to estimate the initial tendency errors of analog samples, was proposed independently by D’Andrea and Vautard (2000). In addition, some other studies have developed various strategies to reduce model tendency errors (Johansson and Saha 1989; Kaas et al. 1999; Klinker and Sardeshmukh 1992; Saha 1992; Yang and Anderson 2000).

The direct forecast-error corrections have been made in several different ways. The first well-known method is called the model output statistics (MOS) that can generate the best possible local weather forecasts from NWP model outputs using regression techniques (Glahn and Lowry 1972; Klein and Glahn 1974; Lorenz 1977). The other statistical methods, including singular value decomposition (SVD) analysis (Feddersen et al. 1999) and the postprocessor approach (Arpe and Klinker 1986; Chen and Lin 2006), also show skillful improvements in NWP and numerical model simulations. Meanwhile, Ren and Chou (2006, 2007), and Chou and Ren (2006) proposed an approach of the analog correction of errors (ACE) by combining a statistical method with a dynamical NWP model. This approach has the key idea that the current-state prediction errors of a dynamic model can be statistically predicted (or estimated) using the past model’s prediction errors with respect to the historical (NWP forecast) analog states. The ACE is defined by

where ** ψ** is the model state vector with

*ψ*_{0}being the initial value and

*m*is the number of the best analogs identified from the historical NWP model forecast data,

*P*

_{o}(

*ψ*_{j}),

*P*(

*ψ*_{j}), and

*E*′(

*ψ*_{j}) are the observation, the NWP model forecast, and the error of the

*j*th best analog, respectively. Using the ACE approach, Sun et al. (2006), Ren et al. (2014a), and Liu and Ren (2017) showed improved El Niño–Southern Oscillation (ENSO) prediction with different general circulation models, while many other studies focused on improving monthly to seasonal forecasts (e.g., Zheng et al. 2009; Ren et al. 2009; Duan et al. 2011; Tan et al. 2012; Ren et al. 2014b; Cheng et al. 2016). In addition, Gao et al. (2006) demonstrated that the ACE approach could improve daily to monthly forecasts of atmospheric flows. Despite the wide applications of the ACE approach in different studies, its design is far from perfect and has considerable room for improvement.

It is worthwhile to mention some recent studies on the hybrid analog-ensemble-based forecasts. Diomede et al. (2014) investigated the calibration of limited-area ensemble rainfall forecasts for driving discharge prediction, and results revealed that the analog-based method has the best performance among three different methods in capably correcting position errors and spreading deficiencies. Some other studies (e.g., Delle Monache et al. 2013; Nagarajan et al. 2015; Junk et al. 2015; Eckel and Delle Monache 2016) focused on developing analog-based ensemble schemes for ensemble model output statistics (EMOS), and their results suggested that the analog-based ensemble forecasts are better than traditional EMOS or MOS forecasts. Zhou and Zhai (2016) developed an analog prediction system for persistent extreme precipitation (PEP), and showed that it is capable of identifying approaching PEP events earlier and yielding a more accurate forecast for the location and intensity of PEP than direct model outputs at 3-day and longer lead times in the Yangtze–Huai River valley. Frediani et al. (2017) used an object-analog technique with 15 ensemble members to forecast 89 storm events in the northeastern United States during the period of 2004–13, and found that the object-analog forecasts are competitively skillful compared to simpler analog techniques. Elsberry and Tsai (2014) developed a situation-dependent intensity prediction (SDIP) technique for western North Pacific TCs, based on the mean intensity changes from the 10 best historical track analogs. This SDIP technique was then successfully applied by Tsai and Elsberry (2015, 2016, 2017) to the other TC cases.

In the present study, we will present a new analog-ensemble-based approach with more attention given to the full use of historical observational data. Thus, the purposes of this study are to (i) improve the ACE approach by replacing historical model forecasts with historical observations; (ii) develop a dynamical-statistical-analog ensemble forecast (DSAEF) model by introducing a concept of the perfect model; and (iii) demonstrate the forecast credibility of the DSAEF model through its application to the heavy rainfall forecasts associated with 21 landfalling TCs.

The next section provides a review of the theory and the current status of the ACE approach. Section 3 shows the forecast improvements of the ACE approach by taking an arithmetic mean of analogs from observations, and how it is used as a backbone for the development of the DSAEF model including the functionals of finding the maximum or minimum values of ensemble members from observations. Some fundamental differences between the DSAEF model and the previous analog-ensemble-based studies will be discussed. Section 4 presents an example of applying the DSAEF model to heavy rainfall forecasts associated with 21 landfalling TCs. A summary and conclusions are given in the final section.

## 2. Current status of the analog correction of errors

As presented by Ren and Chou (2006, 2007) and Chou and Ren (2006), the ACE approach consists of two steps: an inverse problem of estimating model errors, and the ACE approach to correcting the errors. They are described below separately.

### a. An inverse problem of estimating model errors

In general, NWP is considered as an initial-value problem for a set of partial differential equations (PDEs). Mathematically, a current NWP model can be expressed by

where *L* is the model operator that characterizes the NWP model, ** ψ** the model state vector, and

*ψ*_{0}the initial value of

**(see Chou and Ren 2006). Assuming that a perfect model operator A exists, which is the ultimate goal of current NWP model development, this perfect model also involving a set of PDEs can be expressed by**

*ψ*A predictand *P*(** ψ**) should be exactly the same as the observation

*P*

_{0}(

**) as in the perfect model (3). However, in reality**

*ψ**P*(

**) from the model (2) would always deviate from**

*ψ**P*

_{0}(

**) in the perfect model (3). The difference between**

*ψ**P*(

**) and**

*ψ**P*

_{0}(

**) is defined as the model forecast error, and the differenced model (error) operator may be expressed as**

*ψ**A*(

**) −**

*ψ**L*(

**) =**

*ψ**E*(

**). Thus, (3) can be rewritten as**

*ψ*where *E*(** ψ**) is an unknown model error operator that cannot be accurately obtained.

Since available historical data are essentially the special solutions to (3) or (4), methodologically, the initial-value problem set of PDEs can be transformed into an inverse problem that solves the PDEs by rewriting the formulation of the forecast problem. In doing so, the PDE solutions are known (i.e., from the historical data), whereas the unknowns in (4) can be determined by estimating the model error operator *E*(** ψ**). This can be achieved by solving the inverse problem through the ACE as described in the next subsection.

### b. The analog correction of errors (ACE)

A simple approach can be used to estimate *E*(** ψ**) in a numerical model (Ren and Chou 2006). That is, given an initial value

*ψ*_{0}, the predictand

*P*(

*ψ*_{0}) and “(unknown) observation”

*P*

_{0}(

*ψ*_{0}) can be obtained from the model (2) and perfect model (3), respectively. Then, the forecast error is simply

*E*′(

*ψ*_{0}) =

*P*

_{0}(

*ψ*_{0}) −

*P*(

*ψ*_{0}). Turning this equation around gives

*P*

_{0}(

*ψ*_{0}) =

*P*(

*ψ*_{0}) +

*E*′(

*ψ*_{0}), implying that the model prediction problem boils down to finding

*E*′(

*ψ*_{0}).

In reality, historical observations can be used to construct the model’s initial values *ψ*_{i}(*i* = 1, 2, …, *n*) where *n* represents the number of historical observations. Following the same procedures as described above, the predictand *P*(*ψ*_{i}) can be obtained by solving the numerical model (2), and the (known) observation *P*_{0}(*ψ*_{i}) are obtained through the perfect model (3). Then,

can be defined as the forecast error vector with respect to the historical observational data used for model validation, and its arithmetic mean, defined by (1), is considered as the systematic error (i.e., model bias).

Mathematically, the more observational data used for a linear model system, the better are the forecast results to be obtained, just like a pure statistical model. However, this is not the case for a nonlinear model system (such as the atmosphere and oceans) (Chou and Ren 2006). Instead, the use of some optimally selected data with certain relations (e.g., weather patterns or storm tracks) is better than just the use of more observational data. Thus, for a specific forecast, an analog forecast approach could be adopted, in which analogs of the initial value *ψ*_{0} from historical observations are first found and then used to make future forecasts (i.e., the ACE approach). Note that in this approach, NWP models play an important role in producing realistic model initial conditions, but are not needed to make forecasts because of adding *E*′(*ψ*_{0}), valid at *T* = 0, to the forecast variables *P*(*ψ*_{i}). In this sense, the ACE approach should provide more realistic forecasts than those classical dynamical-statistical forecasts, in which all possible historical data are typically used.

Assuming that the analogs of the initial value (*ψ*_{0}) from the historical observational data can be found, defined as an analog (*ψ*_{1}), then we may write the forecast error vector as

where ** ψ**′ =

*ψ*_{1}−

*ψ*_{0}is the error between the analog

*ψ*_{1}and the initial value

*ψ*_{0}.

If many analogs *ψ*_{j} = (*j* = 1, 2, …, *m*) from the historical data can be found, and they are all close to *ψ*_{0}, then finding the forecast error problem becomes

Equation (7) implies that the forecast error vector can be estimated from the mean forecast error with respect to the top *m* analogs.

However, in estimating the forecast error vector *E*′(*ψ*_{0}) associated with the model initial conditions, it is often not possible to obtain a long history of model forecasts for the predictands corresponding to the top *m* analogs (e.g., due to the continued updates of various model physics and initialization procedures). This makes it difficult to implement the ACE approach in these cases, which represents one of the limitations of the ACE approach.

## 3. An improved analog prediction approach

### a. Improving the ACE-based prediction by taking an ensemble mean

Given the abovementioned limitation with the ACE approach, it is necessary to search for an alternative way to improve this approach. One way is to combine (7) with an existing forecast, which gives a possible solution as

With the definition of (5) for the forecast error correction vector from the historical data, (8) can be rewritten as

where $P\xaf\u2061(\psi j)$, the estimated mean predictand based on the top *m* analogs, can be taken as an ensemble mean forecast if the initial errors $\psi j\u2032=\psi 0\u2212\psi j\u2061(j=1,2,\u2026,m)$ are treated as initial perturbations. Then, the difference $P\u2061(\psi 0)\u2212P\xaf\u2061(\psi j)$ should be small when $\psi j\u2032=\psi 0\u2212\psi j\u2061(j=1,2,\u2026,m)$ are small enough. Thus, (9) can be rewritten as

That is, when the analog forecast error is taken as an arithmetic mean solution [i.e., $P\xaf\u2061(\psi j)$], the to-be-improved forecast should equal to the arithmetic mean forecast $P\xafO\u2061(\psi j)$ based on the *m* observations *P*_{O}(*ψ*_{j}) corresponding to the top *m* analogs *ψ*_{j} (*j* = 1, 2, …, *m*) from the historical data.

It follows from the above analysis that the improved ACE-based prediction is equivalent to the arithmetic mean of an ensemble of *m* observations (i.e., the exact solutions of a perfect model) that correspond closely to the top *m* analogs in the historical data. In the next subsection, we will show another form of the ensemble function that can be used to improve the ACE-based prediction with historical observational data by introducing a concept of the perfect model.

### b. A dynamical-statistical-analog ensemble forecast (DSAEF) model

As mentioned before, the atmospheric motion fully satisfies the perfect model operator *A*(** ψ**) in the dynamical model (3) or (4). Then, a forecast problem is to determine how to obtain the predicted field vector

*P*

_{O}(

*ψ*_{O}) using the perfect model

*A*from the initial value

*ψ*_{O}.

Figure 1 shows a schematic of the DSAEF model. Before making a forecast with the DSAEF model, we may assume the existence of *m* analogs *ψ*_{j} (*j* = 1, 2, …, *m*) to the initial state vector *ψ*_{O}, which is referred to as the generalized initial value (GIV) in this study, and their corresponding observations of the predictands *P*_{0}(*ψ*_{j}) (*j* = 1, 2, …, *m*) (i.e., the exact solutions of the perfect model *A*). If the initial errors $\psi j\u2032=\psi 0\u2212\psi j\u2061(j=1,2,\u2026,m)$ in the perfect model operator*A* are treated as initial perturbations, the predictands *P*_{0}(*ψ*_{0}) become a problem of ensemble forecasts. That is, *P*_{0} (*ψ*_{0}) is an ensemble of the *m* forecasts or the *m* observations *P*_{0}(*ψ*_{j}) (*j* = 1, 2, …, *m*) corresponding to the *m* initial perturbations added to the perfect model operator *A*. This can be expressed by

where *F*(*X*_{1}, *X*_{2}, …, *X*_{m}) is the ensemble function of functional *X*_{1}, *X*_{2}, …, *X*_{m}. If the above ensemble function is an arithmetic mean, we obtain $P0\u2061(\psi 0)=P\xaf0\u2061(\psi j)$, which is exactly (10).

For some forecast problems, the ensemble function, *F*(*X*_{1}, *X*_{2}, …, *X*_{m}), may also take the functionals of finding the maximum or minimum values of the ensemble members. As an example, we will illustrate in section 4, through an application of the DSAEF model to rainfall forecasts associated with a landfalling TC over South China, that the performance of the ensemble function of “finding the maximum values” is better than that of taking an arithmetic mean in the improved ACE approach.

Figure 2 shows the flowchart of applying the DSAEF model, which consists of three steps: (i) construct the generalized initial value (GIV); (ii) identify analogs; and (iii) produce an ensemble of predictands. In the first step, the GIV includes not only initial conditions at *T* = 0 but also, if necessary, the subsequent temporal evolution of main variables affecting the predictands in the context of dynamical models, including the perfect model (i.e., observations) and typical NWP models, and statistical methods. Thus, the GIV in four dimensions (4D; i.e., three dimensions at the initial state plus the subsequent temporal evolution) may be composed of all possible variables associated with the predictands, which can be a prognostic meteorological variable such as pressure and horizontal winds, an index such as the East Asian monsoon intensity index or a weather phenomenon such as TC tracks. Both the second and third steps involve the application of statistical methods. Identifying analogs consist of three substeps: (i) Identify analogs for each variable, which can be done after applying a similarity index, following Ren et al. (2018). (ii) Identify analogs for the GIV. Obviously, when there are more than one variable involved in the GIV, a certain order of variables for identifying analogs needs to be considered. (iii) Determine the *m* best analogs. Generally, the order of variables considered in step (ii) may influence the final solution of the DSAEF. Thus, it is desirable to obtain the best possible order after comparing different ones, and then the *m* (*m* ≥ 1) best analogs can be obtained. The third step consists of two parts: identify the observations of predictands for the *m* top analogs; and select a functional form for an ensemble of the *m* observations of predictands. After completing the above three steps, a forecast using the DSAEF model can be obtained.

One can see from the above description that in the DSAEF model, “dynamics” implies the theoretical basis of the perfect model, while “statistics” includes the generalized initial value, identifying analogs, and optimizing an ensemble of predictands.

### c. Comparison with some existing analog-based statistical methods

Given many analog-ensemble-based statistical studies in the literature, as reviewed in section 1, it is of interest to discuss what unique features of the DSAEF model are. First, its theoretical framework originates from the GIV problem of a perfect model, which is the ultimate goal of NWP model development, whereas almost all the existing analog-ensemble approaches do not have their own theoretical frameworks. Moreover, the DSAEF model can be used to not only make analog ensemble forecasts but also show what physical implications of each variable involved are.

In addition, the DSAEF model differs in methodology from some existing analog-ensemble approaches such as the hybrid analog ensemble (AnEn) method (Delle Monache et al. 2013; Nagarajan et al. 2015; Junk et al. 2015; Eckel and Delle Monache 2016). That is, (i) unlike the AnEn method, the DSAEF model requires the construction of a suite of appropriate initial conditions; (ii) the AnEn method needs a long history of deterministic NWP model forecast data from a frozen modeling system, which may not be easy to obtain whereas the DSAEF model searches optimal analogs directly from historical observations (i.e., the so-called generalized initial value); (iii) in the AnEn method analogs are sought independently at each location and for each lead time, whereas in the DSAEF model analogs are sought solely depending on the generalized initial value and for all lead times; and (iv) although both the AnEn method and the DSAEF model obtain ensemble forecasts from historical observations (of predictands) with different schemes, detailed procedures to find those optimal analogs are quite different, as can be seen from an example shown in the next section.

## 4. An application to landfalling TC precipitation forecasts

### a. How to apply the DSAEF model to landfalling TC precipitation forecasts?

In this section, we illustrate the applicability of the DSAEF model to predicting accumulated precipitation associated with landfalling TCs. Figure 3 shows a flowchart on how to apply the DSAEF model to the prediction of landfalling TC precipitation (LTP), hereafter referred to as DSAEF_LTP model, following the procedures listed in Fig. 2. The DSAEF_LTP model involves four steps: obtaining TC track forecast, constructing generalized initial value, identifying analogs, and finding ensemble LTP of the analogs.

Obtaining accurate TC track forecasts is the first important step toward reasonable forecasts of LTP, since it is a zero-order variable to determine the LTP distribution. After reviewing the previous studies of heavy LTP and LTP forecasts, Ren and Xiang (2017) found that a TC track with respect to different geographical locations is also the key variable in determining the intensity of LTP. Given considerable progress made during the past decades in TC track forecasts (Rappaport et al. 2009; Langmack et al. 2012; Cangialosi and Franklin 2016), it is reliable to use the predicted TC tracks by today’s NWP models as the first step in the DSAEF_LTP model.

In the second step of constructing generalized initial value, it is highly desirable to include all the physical variables that may influence LTP, including both TC’s internal variables (e.g., intensity, size) and environmental variables (e.g., vertical wind shear, subtropical high, summer monsoon), many of which remain to be examined. Their GIV includes not only their initial state but also during the subsequent evolution period. For example, both the observed and predicted tracks for a target TC are treated as the generalized initial value. At this stage of the model development, only two physical variables, TC track and landfalling date, are incorporated into the GIV.

The third step involves identifying *m* top analogs for the GIV. It consists of three substeps: identifying analogs for each variable, identifying GIV’s analogs, and determining the *m* best analogs, which are similar to those in the second step of the DSAEF model (Fig. 2). At present, the GIV includes both TC tracks and landfalling dates. The former can be done by applying the TC track similarity area index (TSAI) (Ren et al. 2018), which is an objective index representing an enclosed area bounded by two TC tracks with two lines connecting their initiating and ending points, while the latter is performed by taking seasonal similarities of the TCs landfalling dates. The smaller the TSAI value is, the greater similarities are the two TC tracks, where a null value indicates the complete overlap of the two tracks. Based on the TSAI obtained within a region of concern that is referred to as the similarity region, it is straightforward to obtain the historical landfalling TCs in a similarity order to the target TC. Then, seasonal similarities can be divided into three landfalling types according to the TC landfalling date as detailed in Table 1.

In the fourth step, we identify accumulated rainfall amounts for each of the top *m* analogs, and then select a functional form for an ensemble of the *m* analogs. The former can be obtained by performing the objective synoptic analysis technique (OSAT) (Ren et al. 2006, 2007), whose function is to partition TC precipitation objectively. The latter can be obtained by taking the two functional forms: ensemble mean and maximum values. Inputting the above information into the DSAEF model would yield the targeted TC’s accumulated rainfall field.

### b. An application to LTP forecasts

With the above qualitative description, we may apply the DSAEF_LTP model to the prediction of accumulated rainfall associated with TCs that produced more than 100 mm of daily precipitation at the minimum one rain gauge station during 2012–16 in South China. There are a total of 27 TCs, 17 of which occurring during 2012–14 are used as a training sample, and while the remaining 10 occurring during 2015–16 are used as an independent set of forecast samples for validation. In doing so, historical data (1960–2016) used herein include the best track data from the TC database (http://tcdata.typhoon.org.cn/en/zjljsjj_sm.html) of the China Meteorological Administration (CMA) and the daily rainfall data of 191 rain gauge stations in South China (Fig. 4). Despite the above long historical archival, real-time TC track forecast data of CMA/National Meteorological Center (NMC) for the 27 TCs are also used. In addition, the rainfall forecast data from the European Centre for Medium-Range Weather Forecasts (ECMWF) model, the Global Forecast System (GFS) of the National Centers for Environmental Prediction, and the global spectral model (T639) of CMA/NMC are used herein for the purpose of comparison with the predicted rainfall by the DSAEF_LTP model.

To see the sensitivity of the LTP forecasts to different physical variables in the DSAEF_LTP model, the following two forecast experiments are conducted: experiment 1 with TC track only, experiment 2 with both TC track and landfalling date. The LTP forecast skill is measured by the threat score (TS), which is the primary operational-forecast verification metric in China, that is,

TSs for two accumulated rainfall thresholds (i.e., 100 and 250 mm) are evaluated over all the surface rain gauge network as given in Fig. 4. In this study, the DSAEF_LTP model for experiment 2 has the following seven parameters (see Table 1): initial date (P1), similarity region (P2), threshold of the segmentation ratio of a latitudinal extreme point (P3), threshold of the overlap percentage of two similar TC tracks (P4), seasonal similarity (P5), number of the most similar TCs (P6) and ensemble scheme (P7). The above parameters, except for seasonal similarity (P5), are also included in the model for experiment 1. Considering different options for each parameter, as listed in Table 1, there are 34 560 (= 4 × 15 × 3 × 6 × 16 × 2) and 103 680 (= 4 × 15 × 3 × 6 × 3 × 16 × 2) different forecast schemes for experiments 1 and 2, respectively. However, because some TCs generate precipitation over land soon after genesis, fewer values for parameters P1 and P2 can be used, leading to a sharp decrease in the total number of forecast schemes for the two experiments. To find the best forecast scheme more meaningfully, we exclude six short-tracked TCs with P1 = 1. So, the remaining 21 TCs, whose tracks are shown in Fig. 4, are adopted for the tests. Then, 12 TCs during 2012–13 are selected as the training sample whereas the rest 9 TCs during 2014–16 are treated as the independent sample. Experiments 1 and 2, which have a total of 5184 and 15 552 forecast schemes, respectively, are carried out to obtain the best forecast scheme from the training sample, respectively. After comparing the TS of the two experiments, the final best one from the two best forecast schemes will be applied to the independent forecast tests. During the above tests, a target TC may have several TCs that occurred prior as its analogs.

Figure 5 presents scatterplots of TSs for the training sample of the accumulated rainfall amounts of ≥100 and ≥250 mm from the 5184 forecast schemes associated with experiment 1. Obviously, there are two groups of TSs: one is the ensemble mean (gray dots) and the other is the maximum value ensemble (blue dots) according to the ensemble scheme selected at each station. It is evident from Fig. 5 that the maximum value ensemble is much better than the ensemble mean for intense precipitation forecasts. By comparison, the two highest TS values from the three NWP models, denoted by hollow circles (○), have the values of TS_{250}= 0.0237 by the ECMWF model and TS_{100}= 0.1312 by the GFS model. Clearly, a total of 1048 out of 5814 schemes are better than the NWP forecasts, as they appear in the first quadrant encompassed by the two inner dashed axes. Moreover, the red dot indicates the highest value of TS_{250} + TS_{100} that gives the best forecast scheme. See Table 1 for the values of the six parameters associated with the best scheme in this experiment.

Following the same procedures as described above, we can find the best forecast scheme for experiment 2. It is evident from Fig. 6 that a total of 3087 schemes out of 15 552 ones perform better than the NWP forecasts in predicting the accumulated precipitation of ≥100 and ≥250 mm. Similarly, the best scheme should have the highest TS value of TS_{250} + TS_{100} (i.e., as indicated by a red dot). See Table 1 for the values of the seven parameters associated with the best forecast scheme for experiment 2.

A comparison of the parameter values used for the two best schemes between experiments 1 and 2 reveals that (i) five of the six parameters in experiment 1 (i.e., P1, P2, P3, P4, and P7) are the same as those in experiment 2; and (ii) P6, the number of analogs, is 6 and 7 in experiment 1 and 2 (see Table 1), respectively. Furthermore, a comparison of the TSs by the two best schemes of the DSAEF_LTP model between experiments 1 and 2 shows that adding P5 (i.e., seasonal similarity) tends to improve the overall performance of the best forecast scheme. That is, experiment 2 has the summed TS of TS_{250} +TS_{100} = 0.2941, where TS_{250} = 0.0833, and TS_{250} = 0.2108, while experiment 1 has the summed TS of TS_{250} +TS_{100} = 0.26, where TS_{250} = 0.0662, and TS_{100} = 0.1938.

Using the best scheme of experiment 2 to represent the DSAEF_LTP model, Fig. 7 compares its mean forecast skills to those of the three NWP models for the independent sample of years 2014–16. Results show that the summed TS (i.e., TS_{250} + TS_{100}) of the DSAEF_LTP model is comparable to that of the NWP models, with the GFS being the best one (0.3094), followed by the DSAEF_LTP (0.2406), the ECMWF (0.2182), and the T639 (0.2065). In terms of the individual TS_{250} and TS_{100}, the DSAEF_LTP model is ranked the second and third, respectively. Similar forecast skills are true for the best scheme of experiment 1 (not shown).

Figure 8 compares the individual TS_{250} associated with the 9 targeted TCs during 2014–16 from the four different models. Note that the three NWP models have TS_{250} ≥ 0 only for the most intense rain-producing TC(1410), indicating that they tend to underpredict intense rainfall due partly to the use of coarse grid resolution. Meanwhile, the DSAEF_LTP model has TS_{250} ≥ 0 for three cases: TC(1410), TC(1418), and TC(1522). In this case, why is the DSAEF_LTP model not ranked the first? A further analysis of the results reveals that our DSAEF_LTP model tends to overpredict the coverage of more intense precipitation (i.e., ≥250 mm), namely giving higher false alarms (i.e., 9) than those of the NWP models (i.e., 5), as shown by thin color bars on the horizontal axis in Fig. 8. As a result, the mean TS for the DSAEF_LTP is smaller than those of the GFS and T639 shown in Fig. 7. The false alarms with TS = 0, causing the lower ranking of the DSAEF_LTP model as calculated by (12), can be attributed partly to the ensemble prediction scheme P7 being “the maximum”.

The TS_{100} from the four models are compared in Fig. 9, revealing that all the four models have good average TS_{100} values (i.e., 0.23–0.40) for intense rain-producing TCs (i.e., with precipitation ≥250 mm), whereas much smaller values (i.e., 0.056–0.094) are obtained for TCs with precipitation between 100 and 250 mm.

It is apparent from the above analyses that the DSAEF_LTP model can provide numerous possible schemes for a given set of parameters and available historical data, and that the best scheme could be obtained through a training sample. This differs from previous statistical or analog studies, in which only one forecast scheme could be obtained. In particular, in most climate prediction studies and the model output statistics (MOS) for objective weather forecasts, a training test needs to be conducted in order to determine a statistical relationship in the form of a regression equation between a predictand and certain variables. In this sense, the best scheme in our model is equivalent to “the regression equation.”

## 5. Summary and conclusions

In this study, we have (i) presented an innovative way to improve the prediction of certain meteorological variables with the ACE approach by taking an arithmetic mean of the observations corresponding to the *m* best historical analogs; (ii) developed the DSAEF model based on the concepts of a perfect model and its initial perturbations, after considering different ensemble functional forms, and (iii) demonstrated the applicability of the DSAEF model to LTP forecasts of 21 TCs occurring in South China.

Our work suggests that although a perfect forecast with today’s dynamical models is impossible to obtain, combining dynamical forecasts with initial condition analogs from historical observations to a maximum possible extent constitutes an innovative way to reduce forecast errors. The DSAEF model developed herein is such a promising approach along this direction, in which analogs from historical observations play an important role in determining forecast accuracy.

Despite the successful application of the DSAEF model, there are several issues on its general applications that are worth discussing. First, it is necessary for the initial perturbations to be as small as possible. This implies that having sufficient analogs from historical observations is critical, and that the more analogs, the better forecasts are. Second, appropriate construction of the GIVs is the key to forecast success with the DSAEF model. A useful strategy is to consider the physical implications of the generalized initial value associated with predictands and the easiness to find their corresponding analogs. Our experience suggests that (i) as many key physical factors associated with the predictands as possible be included as the main body of the GIV; (ii) the GIV be mainly composed of historical observations, and well-predicted key physical variables such as TC tracks; and (iii) the GIV be limited to a (small) key region rather than too large an area. In this regard, Van Den Dool (1994) also pointed out that finding an analog over a small area is much easier than that over a large one. Third, since the quality of identified analogs depend critically on the degree of analogs, a loose analog criterion for one variable may yield a few more analogs than those with a strict one. Thus, an effective strategy is to design different analog criteria for different variables. Fourth, it is always a good idea to test different functional forms for an ensemble of predictands and then choose the best one. Some common functional forms may include an ensemble mean, an extreme (maximum or minimum) value, or a certain percentile.

As demonstrated in section 4, a successful strategy for applying the DSAEF model to LTP forecasts is to include as many relevant variables as possible to the GIV in order to improve forecast skill. For example, the DSAEF_ LTP model with the two variables of “TC track” and “landfalling date” shows better performance than the model with only one variable of “TC track.” In addition, an important step to obtain a successful application of the DSAEF model is to find the best forecast scheme. Thus, it is necessary to figure out how many basic schemes should be and how they could be optimized to yield the best forecast scheme. Consider a case with *k* “similarity” parameters plus the other ones (e.g., the initial time and the number of analogs). If the DSAEF model has a total of *K* parameters (*K* > *k*) and if the number of values for the *j*th parameter *p*_{j}(*j* = 1, 2, …, *K*) is *n*_{j}, then the number of schemes is *n*_{1} × *n*_{2} × … × *n*_{K}. To find the best scheme, a reasonable way is to use one of the most important indices to evaluate the forecast performance for each scheme, and then determine the best one based on the evaluations.

It should be mentioned that the present study is just a good start to the application of the DSAEF model to heavy LTP forecasts. There is considerable room for improvements with the DSAEF_LTP model. This could be achieved mainly by two ways: one is incorporating more variables associated with TC internal characteristics (e.g., intensity and size) and environmental conditions (e.g., vertical wind shear, relative humidity, and summer monsoon), another one is improving values of the parameters in the model such as similarity region (P2) and ensemble prediction scheme (P7). The DSAEF model could also be applied to the other meteorological problems (e.g., intense horizontal winds in TCs, regional weather conditions, some of which will be explored in our future studies). In particular, we wish to mention that with its encouraging performance shown herein and our other recent studies (i.e., Ren et al. 2018; Ding et al. 2019; Jia et al. 2020), the DSAEF_LTP model has now been put under operational heavy precipitation foresting tests for landfalling TCs at the CMA/NMC and the Hainan Meteorological Observatory of China. The related results will be reported in future studies.

## Acknowledgments

We would like to express sincere thanks to Profs. Russ Elsberry, Jifan Chou, Jianping Huang, Weijing Li, and Guolin Feng for their helpful discussion and comments. This work was supported by the National Natural Science Foundation of China (Grant 41675042), the Hainan Provincial Key R&D Program of China (SQ2019KJHZ0028), the project “dynamical–statistical ensemble technique for predicting landfalling tropical cyclones precipitation,” the National Basic Research Program of China (973 Program: 2014CB441402), and the Jiangsu Collaborative Innovation Center for Climate Change. DLZ was also funded by the U.S. Office of Naval Research Grants N000141410143 and N000141712210.

## REFERENCES

*Quart. J. Roy. Meteor. Soc.*

*Chin. Sci. Bull.*

*J. Atmos. Sci.*

*Nature*

*J. Atmos. Sci.*

*Adv. Atmos. Sci.*

*J. Appl. Meteor. Sci.*

*Sci. China Ser.*

*Some Problems of Long-Term Numerical Weather Prediction*. Water Resources and Electric Power Press, 277 pp

*Plateau Meteor.*

*J. Appl. Meteor.*

*Geophys. Res. Lett.*

*J. Atmos. Sci.*

*Mon. Wea. Rev.*

*Mon. Wea. Rev.*

*Mon. Wea. Rev.*

*Mon. Wea. Rev.*

*Meteor. Mon.*

*Mon. Wea. Rev.*

*Int. Conf. on Information Science and Technology*, Nanjing, China, IEEE, 1367–1370

*Tellus*

*Mon. Wea. Rev.*

*Asia-Pac. J. Atmos. Sci.*

*J. Climate*

*Mon. Wea. Rev.*

*Chin. Phys.*

*J. Appl. Meteor.*

*Quart. J. Roy. Meteor. Soc.*

*Sci. China Earth Sci.*

*Mon. Wea. Rev.*

*Mon. Wea. Rev.*

*Tellus*

*Atmospheric Modeling, Data Assimilation and Predictability*. Cambridge University Press, 364 pp

*Bull. Amer. Meteor. Soc.*

*J. Atmos. Sci.*

*Acta Meteor. Sin.*

*Acta Meteor. Sin.*

*Quart. J. Roy. Meteor. Soc.*

*Annu. Rev. Fluid. Mech.*

*Int. J. Climatol.*

*J. Climate*

*Quart. J. Roy. Meteor. Soc.*

*J. Atmos. Sci.*

*Mon. Wea. Rev.*

*Wea. Forecasting*

*Chin. J. Atmos. Sci.*

*Wea. Forecasting*

*J. Mar. Meteor.*

*Geophys. Res. Lett.*

*Adv. Atmos. Sci.*

*Wea. Forecasting*

*Acta Meteor.*

*Sci. China Ser. D Earth Sci.*

*Acta Phys. Sin.*

*Adv. Atmos. Sci.*

*Atmos. Oceanic Sci. Lett.*

*Acta Meteor. Sin.*

*Mon. Wea. Rev.*

*Mon. Wea. Rev.*

*Chinese J. Atmos. Sci.*

*J. Appl. Meteor. Sci.*

*Asia-Pac. J. Atmos. Sci.*

*Asia-Pac. J. Atmos. Sci.*

*Wea. Forecasting*

*Tellus*

*Mon. Wea. Rev.*

*J. Climate*

*Mon. Wea. Rev.*

*Acta Phys. Sin.*

*Wea. Forecasting*

## Footnotes

Denotes content that is immediately available upon publication as open access.