1. Introduction
A numerical weather prediction (NWP) model is only an approximation of the real atmosphere and contains unavoidable errors in various components such as initial conditions (ICs) and model physics. These small errors grow nonlinearly with time during model integration and contaminate a forecast because of the chaotic nature of models (Lorenz 1963; Epstein 1969; Leith 1974). Therefore, it is necessary to quantify the predictability or uncertainty associated with a model forecast. Since ensemble forecasting is a model-based dynamical approach to quantifying forecast uncertainty, an ensemble prediction system (EPS) has now become a standard modeling system at major numerical weather prediction centers in the world (e.g., Buizza et al. 2018).
A key technical component for an EPS is to perturb ICs. There are many existing IC perturbation methods (see the review of those methods in Du et al. 2018). Some are random perturbations (e.g., the Monte Carlo method), some are simply a collection of multiple existing analyses (e.g., the multi-analysis approach), some focus on the error structure in an analysis (e.g., the breeding method), some mathematically target spread growth at a future time (e.g., singular vectors), and some simulate observational errors (e.g., ETKF). Du et al. (2018) described each of the methods in detail and discussed their advantages and shortcomings. Based on our own and others’ experiences in developing EPSs, three basic scientific principles are behind all those IC-perturbation generation schemes (e.g., Lacarra and Talagrand 1988; Du 2002; Du et al. 2018). The first is representativeness: a perturbation should represent the true uncertainty of an IC by keeping the perturbation size similar to the analysis uncertainty (the intrinsic error in an analysis). The second is growing structure: a perturbation’s spatial structure should contain atmospheric growing or unstable modes (e.g., baroclinic and convective instabilities) so that the perturbation will grow as the model integrates with time. The third is independence: orthogonality of perturbations among ensemble members needs to be maximized so that perturbations can grow independently in different directions during model integration to fully span the forecast uncertainty space. Given a finite ensemble size, the number of ensemble members is normally much smaller than the model’s degrees of freedom. Therefore, the second and third principles are necessary for a limited-size ensemble to effectively encompass all possible scenarios of a future atmospheric state. Horizontal perturbation structure is taken care of by well-designed perturbation schemes. For example, a blend of the ensemble transform Kalman filter (ETKF; Wang and Bishop 2003) and bred vector (BV; Toth and Kalnay 1997) is used as the base IC perturbation scheme to provide the raw IC perturbation to be rescaled in this study (see section 2b). This study focuses on improving the first principle by bringing the vertical distribution of the IC perturbation closer to that of the analysis uncertainty (intrinsic analysis error).
To satisfy the first principle, a rescaling mask is normally used to adjust (either amplify or reduce) the perturbation magnitude after an initial “raw perturbation” is generated by an IC perturbation scheme. Currently, the rescaling mask is two-dimensional (2D): a rescaling factor is calculated at a representative model level (reference level) and then applied uniformly to all levels, as in the NCEP global (Toth and Kalnay 1997; Zhou et al. 2016, 2017) and regional EPSs (Du and Tracton 2001), the CMA EPS (Deng et al. 2010; Liu et al. 2013), and other EPSs (e.g., Anderson and Anderson 1999). As a result, the vertical distribution of IC perturbations with a 2D rescaling mask often does not closely match the vertical distribution of analysis uncertainty (e.g., Zhou et al. 2017). To mitigate the problem, this study extends the 2D rescaling mask to a three-dimensional (3D) one in a 3-km storm-scale EPS, systematically evaluates whether it improves ensemble forecast performance, and investigates how the improvement is achieved.
Extending the rescaling mask from 2D to 3D has also attracted attention from others. For example, the Met Office uses different rescaling factors for the planetary boundary layer, stratosphere, and troposphere in a global EPS (Flowerdew and Bowler 2013). A study has also been done using the NCEP coarse-resolution global EPS (GEFS; Ma et al. 2014), which showed an improvement in ensemble performance for basic atmospheric state variables in a fall transition season (September–November) with a 3D rescaling mask (derived from a hybrid ensemble data assimilation system). Is this conclusion also valid for a high-resolution regional EPS and for sensible weather elements like precipitation? To address this question, Wang et al. (2021) applied a 3D rescaling mask to ETKF perturbations in CMA’s Global/Regional Assimilation and Prediction Enhanced System-based regional EPS (GRAPES-REPS) and showed an improvement in ensemble performance, including precipitation forecasts. Following the bred vector’s masking strategy, their mask is derived from the time-averaged difference between two independent analyses (ECMWF and GRAPES) and varies both horizontally and vertically (the original BV mask varies only horizontally). However, there are three possible deficiencies in their work. The first concerns the robustness of their result, since their work is based on only a very short period of 9 days in the spring transition season (7–15 May 2019). The second is that their rescaling factor is not directly related to the analysis uncertainty of the model’s own data assimilation system but to static differences between two independent analyses. The third is the horizontal variation of their rescaling factor, which alters the spatial structure of the raw perturbation produced by the perturbation scheme and may destroy fast-growing modes. Other 3D rescaling-related works did not explicitly and systematically compare forecast performance between 2D and 3D masking strategies. For example, Feng et al. (2019) used 3D rescaling in a storm-scale ensemble but focused on testing an ensemble-sensitivity-analysis-based perturbation method for a squall-line case rather than on a 2D versus 3D comparison. This review of past work shows that a systematic comparison between 2D and improved 3D rescaling masks in a high-resolution storm-scale EPS over a longer time period, including both warm and cold seasons, is still needed, which motivates this study. To reach a robust and general conclusion, this study systematically compares the ensemble performance for both atmospheric state variables and precipitation between 2D and 3D rescaling in a 3-km storm-scale regional EPS (GRAPES-EPS) for one summer month and one winter month. The summer and winter months are verified separately for comparison. The possible improvement mechanism is also explored to understand how and why the improvement is achieved. Compared with Wang et al. (2021), the 3D rescaling mask employed in this study differs in the following two aspects: 1) the rescaling factor is derived from a comparison with the estimated analysis uncertainty of the model’s own data assimilation system (rather than a third independent data assimilation system such as ECMWF’s), and 2) the rescaling factor varies vertically but not horizontally so that the horizontal structure (fast-growing modes) of the raw perturbation is preserved after rescaling (see section 2b for details).
In the remaining part of this paper, we will describe the ensemble model configuration, the 3D mask design, and data in section 2. Section 3 compares the performances of ensemble forecasts between 3D and 2D masks in terms of ensemble mean, spread and probabilistic forecasts, as well as an exploration of possible underlying mechanism. A summary and discussion are given in section 4.
2. Experiment design
a. Model and EPS configuration
The base model used in the experiment is a regional version of the GRAPES model called GRAPES_Meso. GRAPES was developed at the Earth System Modeling and Prediction Center (formerly the Numerical Weather Prediction Center) of the China Meteorological Administration (CMA; Chen et al. 2008; Chen and Shen 2006). The main features of GRAPES include a fully compressible dynamical core with nonhydrostatic approximation, a semi-implicit and semi-Lagrangian scheme for time integration, and a height-based terrain-following coordinate. The model physics includes the Rapid Radiative Transfer Model (RRTM) longwave radiation (Mlawer et al. 1997), Dudhia shortwave radiation (Dudhia 1989), WSM-6 microphysics (Hong and Lim 2006), the Noah land surface model (Mahrt and Ek 1984), the MRF PBL scheme (Hong and Pan 1996), and the Monin–Obukhov surface layer scheme (Noilhan and Planton 1989). The model analysis (IC) is produced by a three-dimensional variational data assimilation scheme (Zhuang et al. 2014). In this study, the GRAPES_Meso model runs on a regular latitude–longitude grid with a horizontal resolution of 0.03° (about 3 km) and 51 vertical levels. The model domain covers 32°–46.01°N and 108°–123.99°E (468 × 534 = 249 912 grid points; Fig. 1), with complex terrain ranging from mountains to sea.
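The grid-point count quoted above can be checked from the domain bounds and grid spacing. The following is a minimal sketch, assuming that both domain edges fall on grid points; the helper name `n_points` is hypothetical and is not part of GRAPES_Meso.

```python
def n_points(start_deg, end_deg, spacing_deg):
    """Number of grid points on a regular axis, assuming both end points
    are grid points (an assumption consistent with the quoted 468 x 534)."""
    return round((end_deg - start_deg) / spacing_deg) + 1

n_lat = n_points(32.0, 46.01, 0.03)     # south-north direction
n_lon = n_points(108.0, 123.99, 0.03)   # west-east direction
total_points = n_lat * n_lon

print(n_lat, n_lon, total_points)
```

The same total is the number of analysis points available for verification at each forecast hour.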
The GRAPES-Meso-based regional EPS (GRAPES-REPS) consists of 15 members: a control run and 14 perturbed members. The lateral boundary conditions (LBCs) are provided by a T639 (about 28 km) global EPS, “T639-GEFS,” which also runs operationally at CMA (Ma et al. 2008). In this study, the GRAPES-REPS runs two cycles per day, initiated at 0000 and 1200 UTC. The forecast length is 36 h. A blended method is used to create IC perturbations by combining the smaller-scale ETKF perturbations (Wang and Bishop 2003) from GRAPES-REPS and the larger-scale breeding perturbations (Toth and Kalnay 1997) from T639-GEFS. The details of this blended method are described in Zhang et al. (2015). The benefit of a blended perturbation, which takes advantage of both smaller-scale and larger-scale information, is well recognized in the scientific community (e.g., Caron 2013; Du et al. 2015; Wang et al. 2014; Zhang et al. 2015). It has also proven effective in ameliorating the problem of mismatch in the LBCs (Caron 2013; Wang et al. 2014; Zhang et al. 2015). There are five state variables (zonal wind u, meridional wind υ, potential temperature θ, dimensionless pressure π, and specific humidity q) in the GRAPES model. The ICs of all five are perturbed in the hope of sampling the IC uncertainty space as fully as possible. A 2D rescaling mask (Fig. 2b) is currently used to adjust the IC perturbation magnitude in the operational GRAPES-REPS, which serves as the control experiment against which the new 3D scheme is compared in this study.
Besides the IC perturbations, physics perturbations are also included in the GRAPES-REPS in the following two ways. One is a multi-physics approach that varies the PBL schemes and convective parameterization schemes (see Table 1 of Xia et al. 2019 or Chen et al. 2020). The other is the stochastically perturbed parameterization tendency approach (SPPT; Buizza et al. 1999; Du et al. 2018), which is also described in detail in Xia et al. (2019). The control EPS and the experiment EPS use exactly the same raw IC perturbation-generating scheme and physics perturbation method; they differ only in the rescaling mask.
b. Three-dimensional (3D) rescaling mask
From the above discussion, we can see that the rescaling factor in this study is calculated differently from that of Wang et al. (2021) in the following two ways. 1) We use the model’s own data assimilation system to estimate the analysis error as the “analysis uncertainty” used in Eq. (1), in the hope of being more relevant to the IC of the model, while Wang et al. (2021) use the long-term average difference between two independent analyses (GRAPES and ECMWF) as the “analysis uncertainty,” which reflects more of the general statistical properties of analysis error. 2) Our rescaling factor varies only vertically and not horizontally, in the hope of preserving the horizontal structure of the raw fast-growing perturbations at each level, while the rescaling factor of Wang et al. (2021) varies both vertically and horizontally and adjusts the perturbation magnitude differently at each grid point, which alters the horizontal spatial structure of the raw IC perturbations.
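The contrast between a vertically varying factor and a single reference-level factor can be sketched as follows. This is an illustration only: it assumes that the factor at each level is the ratio of the analysis uncertainty to the domain root-mean-square magnitude of the raw perturbation, which may differ from the exact form of Eq. (1). All function names and the toy numbers are hypothetical.

```python
import math

def level_rms(field_2d):
    """Domain root-mean-square magnitude of a perturbation on one level."""
    values = [v for row in field_2d for v in row]
    return math.sqrt(sum(v * v for v in values) / len(values))

def rescale_3d(perturbation, analysis_uncertainty):
    """One factor per level (assumed form: target magnitude / raw magnitude).
    The factor is constant within a level, so the horizontal pattern is kept."""
    rescaled, factors = [], []
    for level, target in zip(perturbation, analysis_uncertainty):
        raw = level_rms(level)
        factor = target / raw if raw > 0.0 else 1.0
        factors.append(factor)
        rescaled.append([[factor * v for v in row] for row in level])
    return rescaled, factors

def rescale_2d(perturbation, analysis_uncertainty, ref_level):
    """A single factor computed at the reference level and applied to all levels."""
    factor = analysis_uncertainty[ref_level] / level_rms(perturbation[ref_level])
    rescaled = [[[factor * v for v in row] for row in level] for level in perturbation]
    return rescaled, factor

# Toy raw perturbation: 2 levels, each a 2 x 2 horizontal grid (made-up values).
raw = [
    [[1.0, -1.0], [1.0, -1.0]],   # level 0, RMS = 1
    [[2.0, -2.0], [-2.0, 2.0]],   # level 1, RMS = 2
]
sigma = [0.5, 3.0]                # assumed analysis uncertainty per level

pert_3d, factors_3d = rescale_3d(raw, sigma)
pert_2d, factor_2d = rescale_2d(raw, sigma, ref_level=0)

print(factors_3d, [level_rms(lev) for lev in pert_3d])
print(factor_2d, [level_rms(lev) for lev in pert_2d])
```

In this toy case the level-by-level factors reproduce the assumed uncertainty at both levels, whereas the single reference-level factor matches it only at the reference level.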
c. Cases and data
A summer month (28 days from 1 to 28 July 2018) is examined first for the warm season; it was a meteorologically active period with precipitation about 11% above normal (Zhang and Sun 2018). Since the GRAPES-REPS runs twice per day (0000 and 1200 UTC cycles) to 36 h, there are a total of 28 × 2 = 56 36-h forecasts during this period. Therefore, the warm-season verification results presented in section 3 are averages of these 56 forecast cases over the model domain. The experiment is then repeated for the cold season using a weather-active winter month (1–27 February 2019, with a total of 54 36-h forecasts). The warm-season results are compared with the cold-season results for a deeper understanding.
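The forecast counts for the two periods follow directly from the two daily cycles. A minimal sketch that enumerates the initialization times is given below; the helper name `init_times` is hypothetical.

```python
from datetime import datetime, timedelta

def init_times(first_day, last_day, cycles_utc=(0, 12)):
    """List every initialization time between two dates (inclusive),
    with one forecast per listed UTC cycle hour on each day."""
    times = []
    day = first_day
    while day <= last_day:
        for hour in cycles_utc:
            times.append(day + timedelta(hours=hour))
        day += timedelta(days=1)
    return times

warm_cases = init_times(datetime(2018, 7, 1), datetime(2018, 7, 28))
cold_cases = init_times(datetime(2019, 2, 1), datetime(2019, 2, 27))

print(len(warm_cases), len(cold_cases))
```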
The 3-km GRAPES-Meso analysis is used as truth for verification of all variables except precipitation. A total of 468 × 534 = 249 912 data points are available for a robust verification at each forecast hour. The analysis is produced by a 3DVAR data assimilation system (the method is described by Ma et al. 2009). Station observations, rawinsonde data, aircraft data (ACARS), and satellite data are assimilated. No radar data are assimilated. For precipitation verification, automatic rain gauge observations are used. There are a total of 2413 automatic rain gauges within the model domain (Fig. 1).
3. Results
a. Vertical structure of IC perturbation
Figure 3 shows the vertical profiles of the wind perturbation from the 2D (red dashed line) and 3D (blue dashed line) masks, compared with the analysis uncertainty (black solid line). The perturbations are averaged over the model domain and the experiment period (1–28 July 2018). The perturbation with the 2D rescaling mask is too small throughout the atmosphere and has an incorrect vertical distribution below the 700-hPa level, while the perturbation with the 3D rescaling mask is much closer to the analysis uncertainty, with a correct vertical distribution throughout the atmosphere. In other words, the IC perturbation better represents the real analysis uncertainty with the 3D rescaling mask, which is a desired feature (the first principle of “representativeness”) of a good IC perturbation design. A similar result is found for the cold season (not shown).
b. Forecast improvements
In this section, all aspects of ensemble forecasts (ensemble mean, spread, and probability) are examined to see whether they are improved by a more realistic vertical distribution of the IC perturbation through the 3D rescaling mask. Besides showing two cases, seven scores are used to measure different aspects of ensemble performance: one (RMSE) for the ensemble mean, three (spread, outlier, and consistency) for the ensemble spread, and three (CRPS, BS, and AROC) for probabilistic forecasts. These scores are briefly described where they are first used. A review of ensemble verification scores can be found in Jolliffe and Stephenson (2003) as well as Du and Zhou (2017). Verification is carried out at three representative levels, upper (250 hPa), middle (500 hPa), and lower (850 hPa), for wind (U and V), temperature (T), and geopotential height (H). Four surface weather elements are also verified: 2-m temperature T, 10-m wind (U and V), precipitation, and mean sea level pressure (P_MSL). Individual results are presented in sections 3b(1)–3b(4), while a summary scorecard of all results is presented in section 3b(5). Because so many scores are used to verify each of those variables for two seasons (warm and cold), a large number of figures are produced, and it is not feasible to show all of them in the article. Therefore, only representative variables for the warm season are shown as examples in the individual verification results, but all variables are shown in the summary figures. To show as many variables as possible, different variables are used to illustrate different aspects (mean, spread, and probability) of ensemble performance. In section 3b(5), the cold-season results are compared with the warm-season results. The scores are calculated over the entire model domain using 249 912 (468 × 534) data points at each forecast hour. The verification results shown are averages of either all summer cases or all winter cases unless specified otherwise.
1) Case show
Figure 4 shows the ensemble mean forecasts of 24-h accumulated precipitation for two cases: a warm-season case (the top panel) and a cold-season case (the bottom panel). For the warm-season case (initiated from 0000 UTC 10 July 2018), the forecast of 2D rescaling mask (Fig. 4b) obviously missed the observed northeast–southwest-oriented heavier precipitation band (≥10 mm with some areas of exceeding 25 mm) along the coast of Shandong peninsula (the highlighted area of Fig. 4a), while the forecast of 3D rescaling mask provided the information of this event (Fig. 4c). Similarly, for the cold-season case (initiated from 0000 UTC 10 February 2019), the forecast of 2D rescaling mask (Fig. 4e) completely missed a large area of observed snow event (with liquid water equivalent of 0.1–2.5 mm which is a high-impact event given lower-latitude and larger areal coverage) spreading from the southern part of North China to Jiangsu Province (the highlighted south portion of the domain) (Fig. 4d), while the forecast of 3D rescaling mask predicted it (Fig. 4f). The enhanced information in the ensemble mean implies that the observed events were correctly captured by more members in the 3D ensemble than in the 2D ensemble if it is not completely missed by the 2D EPS. To demonstrate this, the corresponding probabilistic forecasts are shown in Fig. 5. Over the highlighted observed 25-mm area of the summer case (Fig. 5a), the probability of exceeding 25 mm in the 3D ensemble is about 10%–15% (Fig. 5c), while it is 0% in the 2D ensemble (Fig. 5b). For the highlighted winter snow case (Fig. 5d), the probability of exceeding 0.25-mm liquid water equivalent in the 3D ensemble is generally 30%–35% with some over 40% (Fig. 5f), while it is largely 0% in the 2D ensemble (Fig. 5e). This useful probability information contained in the 3D ensemble will certainly give forecasters a heads up of these incoming events.
Admittedly, this success does not come for free. Because of imperfect model physics and ICs, a larger diversity in an ensemble also leads to more false alarms, although some ensemble members might contain more accurate forecast information. This adverse side effect is typical for an EPS. For example, some precipitation areas in the 3D ensemble forecasts were also falsely expanded in both cases, which is especially obvious in the winter case (e.g., Figs. 4e,f). There is always a trade-off between increasing ensemble diversity and the false alarm rate in building an EPS. In fact, the primary mission of an EPS is not to provide an accurate deterministic forecast but to estimate reliable confidence or uncertainty information associated with a forecast (Du et al. 2018). Individual cases give only a snapshot and not the overall performance of an ensemble, which is why many cases rather than a single case are needed to evaluate an EPS. Many cases are statistically evaluated for ensemble mean, spread, and probabilistic forecasts hereafter [sections 3b(2)–3b(5)]. For an underdispersive EPS (which is the case in this study), increasing diversity among ensemble members is generally beneficial [see section 3b(3) on ensemble spread evaluation]. In addition, for a non-Gaussian distributed variable like precipitation, arithmetic averaging means that an ensemble mean forecast will theoretically overestimate the areal coverage of lighter precipitation and underestimate that of heavier precipitation, as discussed by Du et al. (1997). Therefore, probabilistic information is preferred over the ensemble mean in predicting precipitation.
2) Ensemble mean forecasts
Below we use the root-mean-square error (RMSE) to quantitatively measure the ensemble mean performance for more variables. Figure 6 shows the RMSE of zonal wind U at four levels as a function of forecast hour. The 3D mask run (blue curve) has less error than the 2D mask run (red curve) at all levels and all forecast hours (except at 250 hPa at 30 and 36 h). The improvement is larger at the surface and lower level than at the upper level. For example, the averaged improvement of 6–36-h U forecasts is about 2.8%, 2.1%, 0.5%, and 0.2% at 10 m above ground level and 850, 500, and 250 hPa, respectively. The improvement also decreases with increasing forecast length (see the percentage values of the improvement listed in the figure caption). Results for temperature and geopotential height/pressure are similar to those for wind: the 3D mask runs generally have less error than the 2D mask runs at all levels and all forecast hours. A decrease of the improvement with increasing altitude and forecast hour is also observed. For example, the averaged improvement of 6–36-h temperature forecasts is about 2.2%, 2.5%, 0.8%, and 1.4% at 2 m and 850, 500, and 250 hPa, respectively. The averaged improvement of 6–36-h geopotential height or pressure forecasts is about 2.5%, 3.4%, 2.0%, and 0.3% at sea level, 850, 500, and 250 hPa, respectively. All the improvements of ensemble mean forecasts are summarized in Fig. 7, where the relative improvement among levels and the decreasing trend of the improvement with forecast hour can be clearly seen.
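The RMSE of an ensemble mean and a percentage improvement of one run over another can be sketched as follows. The definition of the improvement as the relative reduction of RMSE with respect to the 2D run is an assumption made for illustration, and the toy values are made up rather than taken from the experiments.

```python
import math

def ensemble_mean(members):
    """Point-by-point arithmetic mean over ensemble members."""
    n = len(members)
    return [sum(values) / n for values in zip(*members)]

def rmse(forecast, truth):
    """Root-mean-square error over all verification points."""
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(forecast, truth)) / len(truth))

def improvement_percent(score_2d, score_3d):
    """Relative error reduction of the 3D run with respect to the 2D run
    (assumed definition); positive means the 3D run has less error."""
    return 100.0 * (score_2d - score_3d) / score_2d

# Toy data: 2 members, 4 verification points (made-up numbers).
truth = [0.0, 0.0, 0.0, 0.0]
members_2d = [[3.0, 3.0, 3.0, 3.0], [5.0, 5.0, 5.0, 5.0]]
members_3d = [[1.0, 1.0, 1.0, 1.0], [5.0, 5.0, 5.0, 5.0]]

rmse_2d = rmse(ensemble_mean(members_2d), truth)
rmse_3d = rmse(ensemble_mean(members_3d), truth)
print(rmse_2d, rmse_3d, improvement_percent(rmse_2d, rmse_3d))
```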
Although the improvement in ensemble mean forecasts is noticeable and statistically significant, it is the smallest compared with the improvements in ensemble spread and probabilistic forecasts (discussed in the next two subsections).
3) Ensemble spread
Three scores are used to measure the quality of ensemble spread. The first is the ensemble spread itself, which is defined as the standard deviation of the ensemble members’ forecasts with respect to the ensemble mean. For an underdispersive EPS such as the GRAPES EPS used in this study, the ensemble spread is not large enough to match the ensemble mean forecast error (not shown). Therefore, increasing the ensemble spread is an improvement. Figure 8 shows the ensemble spread of temperature at the four levels as a function of forecast hour. The ensemble spread is greatly enhanced in the 3D mask run at all levels and forecast hours. These improvements are statistically significant (except for 30 and 36 h at 250 hPa). The two features noted for the ensemble mean are even more obvious for the ensemble spread: the improvement decreases with increasing forecast length and with increasing altitude. The averaged improvement of 6–36-h ensemble spread is about 24.2%, 22.2%, 10.1%, and 4.9% for 2-m and 850-, 500-, and 250-hPa temperature, respectively. Similar results are observed for the zonal wind U and geopotential height/pressure forecasts. The averaged improvement of 6–36-h ensemble spread is about 18.6%, 18.4%, 12.2%, and 5.7% for 10-m and 850-, 500-, and 250-hPa wind U, respectively. The averaged improvement of 6–36-h ensemble spread is about 11.9%, 8.7%, 5.0%, and 3.2% for mean sea level pressure and 850-, 500-, and 250-hPa geopotential height, respectively. Among the three variables, the smallest improvement in ensemble spread is in the mass field (3.2%–11.9%) compared with the wind (5.7%–18.6%) and temperature fields (4.9%–24.2%).
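The spread definition above can be sketched as follows. The normalization by the number of members (rather than members minus one) and the averaging of the variance over the domain before taking the square root are assumptions of this sketch; the toy values are made up.

```python
import math

def ensemble_spread(members):
    """Domain-averaged ensemble spread: square root of the mean, over all
    points, of the member variance about the ensemble mean.
    Dividing by the number of members is an assumed normalization."""
    n_members = len(members)
    n_points = len(members[0])
    variance_sum = 0.0
    for values in zip(*members):          # loop over verification points
        mean = sum(values) / n_members
        variance_sum += sum((v - mean) ** 2 for v in values) / n_members
    return math.sqrt(variance_sum / n_points)

# Toy ensembles: 2 members, 2 points (made-up numbers).
members_2d = [[0.0, 0.0], [2.0, 2.0]]
members_3d = [[-1.0, -1.0], [3.0, 3.0]]

spread_2d = ensemble_spread(members_2d)
spread_3d = ensemble_spread(members_3d)
increase_percent = 100.0 * (spread_3d - spread_2d) / spread_2d
print(spread_2d, spread_3d, increase_percent)
```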
The second score is the outlier, which counts how often (in %) an observation falls outside the ensemble envelope (i.e., the ensemble forecast range from the minimum member to the maximum member). It measures the capability of an ensemble to encompass an observation. The outlier is derived from the rank histogram (or Talagrand distribution) by adding the two far-end bins together. Forecasters and users at the CMA and its field offices normally prefer the outlier score to the histogram because the outlier directly shows the missing rate of an EPS forecast. Since the outlier is a missing rate, the smaller the better. As the spread increases (with no simultaneous worsening of the ensemble mean position), the outlier is also significantly reduced. For example, the average reduction of the 6–36-h outlier is about 16.1%, 23.2%, 20.1%, and 14.5% for 2-m and 850-, 500-, and 250-hPa temperature, respectively (Fig. 9). The average reduction of the 6–36-h outlier is about 18.7%, 24.3%, 20.7%, and 12.1% for 10-m and 850-, 500-, and 250-hPa wind U, respectively. The average reduction of the 6–36-h outlier is about 14.4%, 15.0%, 12.3%, and 9.0% for mean sea level pressure and 850-, 500-, and 250-hPa geopotential height, respectively. Again, the improvement in the outlier decreases with increasing forecast length and with increasing altitude. The improvement is larger in temperature (14.5%–23.2%) and wind (12.1%–24.3%) than in geopotential height and pressure (9.0%–15.0%).
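The outlier score described above amounts to counting how often the verifying value lies below the lowest member or above the highest member. A minimal sketch follows, assuming that a value exactly equal to the minimum or maximum member counts as inside the envelope; the toy values are made up.

```python
def outlier_percent(ensembles, observations):
    """Percentage of cases in which the observation falls outside the
    ensemble envelope (the two far-end bins of the rank histogram).
    Ties with the extreme members are counted as inside (an assumption)."""
    misses = 0
    for members, obs in zip(ensembles, observations):
        if obs < min(members) or obs > max(members):
            misses += 1
    return 100.0 * misses / len(observations)

# Toy data: 4 cases, 3 members each (made-up numbers).
ensembles = [[1.0, 2.0, 3.0]] * 4
observations = [0.5, 2.0, 3.5, 1.0]

print(outlier_percent(ensembles, observations))
```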
4) Probabilistic forecasts
The continuous ranked probability score (CRPS) is often used to evaluate the performance of probabilistic forecasts. Analogous to the mean squared error for deterministic forecasts, the CRPS is the mean squared difference between the predicted cumulative distribution function (CDF) and the observed CDF (either 0 or 1) over continuous, mutually exclusive, and collectively exhaustive categories (see the appendix of Du et al. 1997). It is a negatively oriented score, i.e., the smaller the better, with zero as a perfect score. The CRPS is affected by both the reliability and the resolution of a forecast (Jolliffe and Stephenson 2003). Figure 11 shows the CRPS of geopotential height at various levels as well as sea level pressure as a function of forecast hour. The 3D mask run has significantly less error (lower CRPS) than the 2D mask run at all levels and all forecast hours. The improvement is larger at lower levels (850 hPa and sea level) than at higher levels (250 and 500 hPa), which is more obvious in the wind and temperature fields (Fig. 12). The average improvement of the 6–36-h CRPS is about 6.0%, 7.7%, 5.0%, and 1.7% for P_MSL and 850-, 500-, and 250-hPa geopotential height, respectively. This improvement decreases quickly from 6 to 12 or 18 h and remains at roughly the same level for the rest of the forecast hours (seen more clearly in Fig. 12). Similar results are observed for wind and temperature. The average improvement of the 6–36-h CRPS is about 6.1%, 5.5%, 1.6%, and 1.9% for 10-m and 850-, 500-, and 250-hPa wind U, respectively. The average improvement of the 6–36-h CRPS is about 6.6%, 6.2%, 3.0%, and 2.2% for 2-m and 850-, 500-, and 250-hPa temperature, respectively. Figure 12 summarizes all the improvements in probabilistic forecasts, where the relative improvement among levels and the decreasing trend of the improvement with forecast hour can be seen more clearly.
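For a single case, the integrated squared difference between an ensemble's empirical CDF and the step-function CDF of the observation can be evaluated with the standard kernel identity CRPS = E|X − y| − ½E|X − X′|. The sketch below uses that continuous form for illustration; it is not necessarily the category-based computation used in this study, and the toy values are made up.

```python
def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast against one observation, using the
    kernel identity for the empirical CDF of a finite ensemble:
    mean |x_i - obs| minus half the mean |x_i - x_j| over all member pairs."""
    n = len(members)
    mean_abs_error = sum(abs(x - obs) for x in members) / n
    mean_abs_pair_diff = sum(abs(a - b) for a in members for b in members) / (n * n)
    return mean_abs_error - 0.5 * mean_abs_pair_diff

# Toy checks (made-up numbers): a perfect sharp forecast scores zero, and a
# one-member "ensemble" reduces to the absolute error.
print(crps_ensemble([1.0, 1.0, 1.0], 1.0))
print(crps_ensemble([0.0, 2.0], 1.0))
print(crps_ensemble([3.0], 1.0))
```

Averaging this quantity over all grid points and cases gives a domain- and period-mean score that can be compared between two runs.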
When the number of forecast categories is reduced to two (e.g., rain or no rain), the CRPS becomes the Brier score (BS). Figure 13 shows the BS of probabilistic precipitation forecasts for light [≥0.1 mm (6 h)⁻¹], moderate [≥4 mm (6 h)⁻¹], and heavy rain [≥13 mm (6 h)⁻¹]. The error is significantly reduced (lower BS), especially prior to 18 h. The improvement decreases quickly with forecast hour prior to 24 h and then remains at a similar level for the rest of the forecast hours. Averaged over the 6–36-h forecast range, the improvement is similar (about 4%–5%) for all three rain categories, which indicates that the 3D rescaling mask can improve not only light rain but also heavy rain events. However, although light, moderate, and heavy rain events occurred almost every day during the verification periods (Table 1), the areal coverage of heavy rain is smaller than that of the light and moderate categories, so caution should be used in interpreting the heavy rain results.
Table 1. Days of observed precipitation categories during the verification period.
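The Brier score for a threshold-exceedance event can be sketched as follows, assuming that the forecast probability is taken as the fraction of ensemble members at or above the threshold (an assumption; other probability estimators exist). The rainfall values are made up for illustration.

```python
def exceedance_probability(members, threshold):
    """Forecast probability as the fraction of members at or above the
    threshold (assumed estimator)."""
    return sum(1 for m in members if m >= threshold) / len(members)

def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and the
    observed outcome (1 if the event occurred, 0 otherwise)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

# Toy data: 6-h rainfall (mm) at 3 gauges from a 4-member ensemble.
threshold = 0.1                                   # light-rain threshold
ensembles = [
    [0.0, 0.2, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 0.3, 0.4],
]
observed = [0.3, 0.0, 0.0]

probabilities = [exceedance_probability(m, threshold) for m in ensembles]
outcomes = [1 if o >= threshold else 0 for o in observed]
print(probabilities, outcomes, brier_score(probabilities, outcomes))
```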
Another measure for verifying probabilistic precipitation forecasts is the relative operating characteristic (ROC). An ROC curve is plotted on a plane with the probability of detection (hit rate) on the vertical axis (y axis) against the probability of false detection (false alarm rate) on the horizontal axis (x axis). It therefore measures the ability of a forecast to discriminate between an event (occurrence) and a nonevent (nonoccurrence), i.e., the resolution aspect of a forecast. For a good forecast, the hit rate should be as high as possible while the false alarm rate is as low as possible. The area under the ROC curve (AROC) is often used as a summary score of forecast resolution. The AROC value ranges from 0 (the worst score, where the hit rate is 0.0 and the false alarm rate is 1.0, or the ROC curve lies on the x axis) to 1 (the perfect score, where the hit rate is 1.0 and the false alarm rate is 0.0, or the ROC curve runs along the y axis, forming a 1 × 1 square). The diagonal line (AROC is 0.5) indicates that the hit rate is equal to the false alarm rate, which is the boundary that distinguishes a random forecast. In other words, a forecast is better (worse) than a random forecast when the AROC is greater (less) than 0.5, indicating that the hit rate exceeds (is lower than) the false alarm rate. Figure 14 shows the AROC scores for the same three rain categories (light, moderate, and heavy). Both the 2D and 3D mask runs are skillful (AROC > 0.5). An improvement in terms of AROC is also observed for all three rain categories. Averaged over the 6–36-h forecast range, the improvement is similar (around 4%) for all three rain categories. The improvement is, however, particularly striking prior to 18 h, with a statistical confidence level of at least 75%. This is because the 3D mask run has almost eliminated the model spinup issue of the 2D mask run in precipitation forecasts. For example, the forecast skill in the 2D mask run unusually increases with forecast hour prior to 18 h and then naturally decreases with increasing forecast length. This “spinup” phenomenon has been overcome in the 3D mask run. After that, the improvement decreases slightly or remains at a similar level with forecast time. Figure 15 summarizes the improvements in probabilistic quantitative precipitation forecasts in terms of both BS and AROC, where the decreasing trend of the improvement with forecast hour can be clearly seen for all three categories.
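The construction of ROC points and the area under the curve can be sketched as follows. The probability thresholds, the trapezoidal integration, and the closing of the curve at (0, 0) and (1, 1) are assumptions of this sketch, and the sample is assumed to contain at least one event and one nonevent; the toy values are made up.

```python
def roc_points(probabilities, outcomes, thresholds):
    """(false alarm rate, hit rate) pairs, one per probability threshold.
    A warning is issued when the forecast probability reaches the threshold.
    Assumes the sample contains both events and nonevents."""
    points = []
    for thr in thresholds:
        hits = misses = false_alarms = correct_negatives = 0
        for p, o in zip(probabilities, outcomes):
            warned = p >= thr
            if warned and o:
                hits += 1
            elif warned:
                false_alarms += 1
            elif o:
                misses += 1
            else:
                correct_negatives += 1
        hit_rate = hits / (hits + misses)
        false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
        points.append((false_alarm_rate, hit_rate))
    return points

def area_under_roc(points):
    """Trapezoidal area under the ROC curve, closed at (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area

thresholds = [0.25, 0.5, 0.75]
outcomes = [1, 1, 0, 0]
sharp = [0.9, 0.8, 0.3, 0.1]          # discriminates events from nonevents
uninformative = [0.5, 0.5, 0.5, 0.5]  # same probability everywhere

aroc_sharp = area_under_roc(roc_points(sharp, outcomes, thresholds))
aroc_flat = area_under_roc(roc_points(uninformative, outcomes, thresholds))
print(aroc_sharp, aroc_flat)
```

In this toy example the discriminating forecast reaches the perfect area of 1, while the uninformative forecast falls on the diagonal with an area of 0.5.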
5) A summary of all scores
All the verification scores of ensemble mean, spread, and probabilistic forecasts are summarized in a scorecard for all variables (Fig. 16). From the scorecard, we can see the following. (i) The benefit of the 3D rescaling mask to ensemble performance is obvious: out of 420 verification measurements, 49% (205/420) are significantly improved (at the 75%, 95%, and 99.7% levels), 50% (210/420) are comparable or neutral (not statistically significant), and only 1% (5/420) are significantly degraded (mainly upper-level geopotential height spread). (ii) Significant improvement (at the 75%, 95%, and 99.7% levels) occurs mainly in the early forecast hours, primarily prior to 24 h, and at lower levels (850 hPa and surface). The decreasing impact of IC perturbation methods with forecast length is similar to what Li et al. (2017) found in their study. (iii) Overall, the biggest improvement occurs in the ensemble spread in terms of outlier and consistency. Because of the increased spread, the ensemble envelope encompasses the observation more often (i.e., reduced outlier), and the ensemble spread is more representative of the forecast error of the ensemble mean (i.e., improved consistency or spread–skill relationship). The second biggest improvement is in the probabilistic forecasts in terms of CRPS, BS, and AROC, which become more reliable and sharper. The least improved are the ensemble mean forecasts in terms of RMSE. This is expected given that the base model GRAPES_Meso has a quite large forecast bias and the bias is a large part of the ensemble mean forecast error (see Wang et al. 2018). Unless a model-based bias correction scheme (such as Chen et al. 2020) is implemented together with an ensemble perturbation scheme (such as Xia et al. 2019), pure ensemble perturbation techniques address only random error and not model bias (Du et al. 2018).
To better demonstrate this layered improvement structure among ensemble mean, spread, and probabilistic information, Fig. 17 shows an example for surface variables (10-m U, 2-m T, P_MSL, and light, moderate, and heavy precipitation), where improvements in ensemble mean, spread, and probabilistic forecasts are compared side by side. For the average of the 6–36-h forecasts (Fig. 17a), the improvement is about 20% for ensemble spread, 5% for probabilistic forecasts, and 2.5% for ensemble mean forecasts. For the average of the first 18-h forecasts (Fig. 17b), the improvement is about 30% for ensemble spread, 9% for probabilistic forecasts, and 4% for ensemble mean forecasts.
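The spread-related measures discussed above can be illustrated with a short sketch. It assumes common textbook definitions, which may differ in detail from those used for the scorecard: spread is the square root of the mean ensemble variance, consistency is the ratio of spread to the RMSE of the ensemble mean (ideally near 1), and an outlier is an observation falling outside the ensemble envelope. The function name `ensemble_stats`, the data layout, and the toy numbers are hypothetical.

```python
import math

def ensemble_stats(members, obs):
    """Spread, ensemble-mean RMSE, consistency, and outlier rate.

    members: list of member forecasts, each a list of values at the
             verification points (assumed layout).
    obs:     observed values at the same points.
    """
    n_mem, n_pts = len(members), len(obs)
    sq_err = var_sum = 0.0
    outliers = 0
    for j in range(n_pts):
        vals = [m[j] for m in members]
        mean = sum(vals) / n_mem
        sq_err += (mean - obs[j]) ** 2
        # Unbiased ensemble variance at this point.
        var_sum += sum((v - mean) ** 2 for v in vals) / (n_mem - 1)
        # An observation outside the ensemble envelope counts as an outlier.
        if obs[j] < min(vals) or obs[j] > max(vals):
            outliers += 1
    rmse = math.sqrt(sq_err / n_pts)
    spread = math.sqrt(var_sum / n_pts)
    return {
        "rmse": rmse,
        "spread": spread,
        "consistency": spread / rmse if rmse > 0 else float("nan"),
        "outlier_rate": outliers / n_pts,
    }

# Toy case (hypothetical numbers): two members, two verification points.
stats = ensemble_stats([[0.0, 1.0], [2.0, 3.0]], [1.0, 5.0])
print(stats)
```

In this toy case the spread is smaller than the RMSE (consistency below 1) and half of the observations fall outside the envelope, which is the underdispersive situation that a larger IC perturbation is meant to relieve.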
We have repeated the same set of experiments and verification for a cold-season period: 1–27 February 2019. Similar results are seen, although the improvement is smaller than in the warm-season period. All verification scores of the cold-season cases are summarized in the scorecard in Fig. 18. We can see that 27% (114/420) of all scores show statistically significant improvement, 72% (303/420) are comparable or neutral (not statistically significant), and less than 1% (3/420) are significantly degraded (overdispersion). Also note that because the sample for the heavy rain category is too small in the cold season, some statistical significance levels cannot be calculated and are left blank in the scorecard for that category (Fig. 18). In the following section, we make a preliminary attempt to explore how this 3D rescaling method might work, why it works better in the warm season than in the cold season, and why the improvement decreases with increasing forecast length.
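As a quick arithmetic check, the scorecard shares quoted for the two seasons can be recomputed from the counts given in the text. The helper name `scorecard_shares` is hypothetical; only the counts come from the text.

```python
def scorecard_shares(improved, neutral, degraded):
    """Return the total count and the percentage share of each class."""
    total = improved + neutral + degraded
    shares = tuple(
        round(100.0 * n / total, 1) for n in (improved, neutral, degraded)
    )
    return total, shares

warm = scorecard_shares(205, 210, 5)   # July 2018 counts quoted in the text
cold = scorecard_shares(114, 303, 3)   # February 2019 counts quoted in the text
print(warm)
print(cold)
```

Both sets of counts sum to 420, and the rounded shares agree with the quoted 49%, 50%, and 1% for the warm season and 27%, 72%, and less than 1% for the cold season.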
c. Improvement mechanism
From the above analysis (e.g., Figs. 16 and 17), we see that the most obvious improvement is the boost of ensemble spread, which subsequently results in the improvements in probability distribution and ensemble mean forecasts for an underdispersive EPS. In this section, we explore possible mechanisms leading to the differences in ensemble spread growth as well as the different performances between warm and cold seasons. Since ensemble IC perturbation growth is related to atmospheric instabilities (Toth and Kalnay 1997), the relationship between instability and ensemble spread change (e.g., the spread increase from the 2D rescaling-based EPS to the 3D rescaling-based EPS) is investigated.
We have, therefore, examined whether the spread change from the 2D to the 3D rescaling-factor based EPS is related to baroclinic or convective instability. The instabilities are calculated from the ensemble mean forecast of the 3D rescaling-factor EPS. The horizontal temperature gradient magnitude is used for the baroclinic instability, and Eq. (6) is used for the convective instability. Figure 19 compares the spread increment (color shaded) of 850-hPa specific humidity with the baroclinic instability (contour) at forecast hours of 18, 24, 30, and 36, initialized at 0000 UTC 10 February 2019 [the same cold-season snow event demonstrated in section 3b(1)]. The baroclinic instability area is not well organized but scattered at smaller scales and does not match the spread increment area. Figure 20 compares the spread increment with the convective instability for the same cold-season case.
In contrast to the baroclinic instability, the convective instability is more organized at the larger scale and matches the spread increment area better. Figures 21 and 22 are the same as Figs. 19 and 20 but initialized at 0000 UTC 10 July 2018 [the same warm-season heavy rain case of section 3b(1)], where results similar to those of the snow event are observed.
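The baroclinic measure used above, the magnitude of the horizontal temperature gradient, can be sketched as follows. The sketch assumes a regular grid with uniform spacing and uses centered differences in the interior and one-sided differences at the edges; the discretization actually used is not stated in this section, and the function name is hypothetical. The convective measure of Eq. (6) is not reproduced here.

```python
import math

def horizontal_gradient_magnitude(temp, dx, dy):
    """|grad_h T| on a regular grid.

    temp:   2D list temp[j][i] in K (j along y, i along x); needs at
            least two points in each direction.
    dx, dy: grid spacing in m (assumed uniform).
    Returns a 2D list in K per m.
    """
    ny, nx = len(temp), len(temp[0])
    out = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        # Neighbor rows, clipped to the grid so edges use one-sided differences.
        j0, j1 = max(j - 1, 0), min(j + 1, ny - 1)
        for i in range(nx):
            i0, i1 = max(i - 1, 0), min(i + 1, nx - 1)
            dtdx = (temp[j][i1] - temp[j][i0]) / ((i1 - i0) * dx)
            dtdy = (temp[j1][i] - temp[j0][i]) / ((j1 - j0) * dy)
            out[j][i] = math.hypot(dtdx, dtdy)
    return out

# Linear test field on a 3-km grid: T rises 2 K per point in x and 3 K in y.
field = [[280.0 + 2.0 * i + 3.0 * j for i in range(4)] for j in range(3)]
grad = horizontal_gradient_magnitude(field, dx=3000.0, dy=3000.0)
```

For a linear field the result is uniform, which makes the sketch easy to check before applying it to an ensemble mean temperature field.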
To examine this relationship more quantitatively, spatial correlations between the spread change of 850-hPa specific humidity and the instabilities have been calculated for the summer and winter experiment periods. The result is shown in Fig. 23. We can see that the spread change is more correlated with convective instability than with baroclinic instability in both seasons. The correlation seems to be stronger in summer than in winter. This result suggests that the increase of ensemble spread in the experiment EPS likely arises through convective instability. Given that the EPS is at storm scale (3 km), it is not surprising that convective instability plays a more important role than baroclinic instability in the spread growth. Therefore, to have a more effective storm-scale EPS for short-range forecasts over a region, its IC perturbation should be designed to target the structure of convective instability. This might be different for a synoptic-scale or global EPS, where the IC perturbation targets baroclinic instability more (Toth and Kalnay 1997).
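The spatial correlation summarized in Fig. 23 can be illustrated with a minimal sketch. It assumes the correlation is the Pearson coefficient computed over all grid points of two fields valid at the same time; the formula is not spelled out in this section, so this form, the function name, and the toy fields are assumptions.

```python
import math

def spatial_correlation(field_a, field_b):
    """Pearson correlation over all grid points of two 2D fields."""
    a = [v for row in field_a for v in row]
    b = [v for row in field_b for v in row]
    if len(a) != len(b):
        raise ValueError("fields must have the same number of grid points")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return float("nan")  # correlation is undefined for a constant field
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy fields, not data from the experiments.
spread_change = [[0.1, 0.4], [0.2, 0.8]]
convective = [[1.0, 4.0], [2.0, 8.0]]   # proportional to spread_change
baroclinic = [[3.0, 1.0], [2.0, 2.0]]   # a differently shaped pattern
r_conv = spatial_correlation(spread_change, convective)
r_baro = spatial_correlation(spread_change, baroclinic)
```

Applied to each forecast case and then averaged over a season, a measure of this kind would allow the two instabilities to be compared on an equal footing.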
Comparing the summer case (Fig. 22) with the winter case (Fig. 20), we can see that the convective instability is much stronger in the warm season (∼2.0 × 10−5 K Pa−1) than in the cold season (∼0.5 × 10−5 K Pa−1). This explains why the spread increment and forecast improvement are larger in the summer month than in the winter month. Since convective instability is a fast-growing mode, the ensemble spread associated with it becomes saturated quickly. This could also explain why the improvement decreases with increasing forecast length. This result implies that retaining or further increasing ensemble spread beyond a certain forecast length might be more difficult for a storm-scale EPS than for a synoptic-scale global EPS, a point to which EPS developers should pay attention. Note that since this investigation into the relationship between ensemble spread growth and atmospheric instabilities in a storm-scale EPS is very preliminary (based on only one variable, 850-hPa specific humidity), the result here only sheds some light on this topic and is not definitive. A more rigorous, in-depth study is needed to thoroughly understand the behavior of storm-scale EPSs.
4. Summary and discussion
A 3D mask to rescale IC perturbations has been proposed and tested using a 3-km storm-scale EPS. A systematic evaluation has been conducted to examine whether it improves ensemble forecasts. The forecasts of wind, temperature, and geopotential height at various levels, sea level pressure, and precipitation have been examined in terms of ensemble mean forecast, ensemble spread, and probabilistic forecast. The study was carried out for a summer month (1–28 July 2018) and a winter month (1–27 February 2019) over a region in North China. The experiment was run twice per day, initialized at 0000 and 1200 UTC, out to a forecast length of 36 h, providing a total of 56 warm-season and 54 cold-season 36-h forecast cases for verification. To compare the differences in performance, warm and cold seasons were verified separately. Below is a summary of the findings with some discussion.
- As intended, the 3D mask makes the IC perturbation more representative of analysis uncertainty than the 2D mask. The vertical profile of the IC perturbation size is much closer to the estimated intrinsic analysis error.
- The performance of ensemble forecasts has been significantly improved in all aspects, including ensemble mean forecast, ensemble spread, and probabilistic forecasts. The largest improvement occurred in ensemble spread, followed by probabilistic forecasts, while the smallest improvement is associated with the ensemble mean forecast. This could be explained by the fact that an improvement to the IC perturbation method mainly improves ensemble diversity (spread) but does not reduce model bias, while bias constitutes a major part of the ensemble mean forecast error for the base model. The improvements are larger in the warm season than in the cold season.
- The improvement decreases with increasing forecast length. This result is consistent with the conclusion of Li et al. (2017), who found that the ensemble spread resulting from vastly different perturbation-generating methods became similar as forecast length increased.
- The improvement is found to be greater at lower levels and the surface than at upper levels. This is probably because near-surface variables are normally more underdispersive than upper-air variables in an ensemble (Du et al. 2018), leaving more room for improvement at lower levels than at upper levels.
- The increment of ensemble spread from the 2D to the 3D rescaling-factor based IC perturbation is likely realized through convective instability rather than baroclinic instability. This explains why the forecast improvement is larger in the summer month (stronger convective instability) than in the winter month. This could also explain why the improvement decreases with increasing forecast length, because convective instability is a fast-growing mode and becomes saturated quickly. This result is preliminary and needs further study because it is based on only one variable, 850-hPa specific humidity. If ensemble spread growth is indeed more sensitive to convective instability than to baroclinic instability in a storm-scale EPS for short-range forecasts, the IC perturbation should be designed to target the structure of convective instability to obtain a more effective ensemble of forecasts. This is different from designing a synoptic-scale or global EPS, whose IC perturbation targets baroclinic instability more. A negative implication of this result is that retaining or further increasing ensemble spread beyond a certain forecast length could be more challenging for a storm-scale EPS than for a synoptic-scale global EPS.
This study recommends that a 3D rescaling mask be used to replace the commonly used 2D one in current operational EPSs. This is important for shorter-range and near-surface weather element forecasts, which are the particular focus of storm-scale ensembles. Storm-scale EPSs have been proven to be useful for high-impact weather events (Roberts et al. 2020). There is a side benefit of this 3D rescaling mask for data assimilation. Being more representative of true analysis uncertainty, 3D rescaled IC perturbations could also be more useful in ensemble-conventional hybrid data assimilation such as the ensemble Kalman filter (EnKF; Zhou et al. 2017). Computationally, the calculation of the 3D rescaling factor costs almost nothing and can be done almost instantaneously on a supercomputer. The rescaling factor [Eq. (1)] and new perturbation [Eq. (2)] are calculated only once at the model’s initial time for each forecast cycle. The only difference between the 2D and 3D rescaling factors is that the former is calculated at one level, while the latter is calculated at all model levels. In other words, the extra computing time for the 3D rescaling factor is practically negligible. Finally, a possible pitfall of this new 3D rescaling mask is that it could lead to overdispersion if the original 2D rescaling-based EPS already has an adequate spread–skill relation. However, this situation should occur rarely because almost all current operational EPSs are underdispersive, including the GRAPES EPS of this study. Theoretically, an EPS should not be overdispersive if a model’s uncertainty sources are not fully sampled in a perturbation method. Spurious ensemble spread occurs only when a perturbation method is not well designed or tuned, as in the ad hoc multimodel method, where a right answer could stem from a wrong reason. Therefore, increasing diversity among ensemble members by improving perturbation methods remains a main task of EPS design (Du et al. 2018). The 3D rescaling mask demonstrated in this study is low-hanging fruit for improving ensemble diversity.
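The computational difference between the two masks can be sketched as below. Equations (1) and (2) are not reproduced in this section, so the sketch assumes a simplified, hypothetical form: the factor at a level is the ratio of a target analysis-error size to the root-mean-square size of the perturbation at that level, and the horizontal variation of the mask is ignored. The function names, the `ref_level` argument, and the numbers are illustrative only.

```python
import math

def rms(values):
    """Root-mean-square size of a list of perturbation values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def level_factor(pert_level, target):
    """Assumed factor: target error size / RMS perturbation size."""
    size = rms(pert_level)
    return target / size if size > 0 else 1.0

def rescale_2d(pert, targets, ref_level):
    """One factor, computed at ref_level, applied to every level."""
    f = level_factor(pert[ref_level], targets[ref_level])
    return [[f * v for v in level] for level in pert]

def rescale_3d(pert, targets):
    """A separate factor computed at, and applied to, each level."""
    return [[level_factor(level, t) * v for v in level]
            for level, t in zip(pert, targets)]

# Two levels whose raw perturbation sizes differ by a factor of 2,
# with the same target size (1.0) at both levels (hypothetical values).
pert = [[3.0, 4.0], [6.0, 8.0]]
targets = [1.0, 1.0]
sizes_2d = [rms(level) for level in rescale_2d(pert, targets, ref_level=0)]
sizes_3d = [rms(level) for level in rescale_3d(pert, targets)]
```

In this toy case the single-level factor leaves the second level twice as large as its target, whereas the level-by-level factor matches the target at both levels, at the cost of one extra factor evaluation per level.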
Acknowledgments.
The work was jointly supported by National Key Research and Development Program (2018YFF0300103), National Natural Science Foundation of China (41975137, 42175012, and 41475097), and Key Scientific and Technology Research and Development Program of Jilin Province (20180201035SF).
Data availability statement.
The experiments and verification data are accessible at https://pan.baidu.com/s/1m1jmWuFR9dh-Fucikcpymw. The password is 1111, and the folder name is REPS_DATA.
REFERENCES
Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127, 2741–2758, https://doi.org/10.1175/1520-0493(1999)127<2741:AMCIOT>2.0.CO;2.
Buizza, R., M. Miller, and T. N. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF ensemble prediction system. Quart. J. Roy. Meteor. Soc., 125, 2887–2908, https://doi.org/10.1002/qj.49712556006.
Buizza, R., J. Du, Z. Toth, and D. Hou, 2018: Major operational ensemble prediction systems (EPS) and the future of EPS. Handbook of Hydrometeorological Ensemble Forecasting, Q. Duan et al., Eds., Springer, 1–43, https://doi.org/10.1007/978-3-642-40457-3_14-1.
Caron, J. F., 2013: Mismatching perturbations at the lateral boundaries in limited-area ensemble forecasting: A case study. Mon. Wea. Rev., 141, 356–374, https://doi.org/10.1175/MWR-D-12-00051.1.
Chen, D. H., and X. S. Shen, 2006: Recent progress on GRAPES research and application. J. Appl. Meteor. Sci., 17, 773–777, https://doi.org/10.3969/j.issn.1001-7313.2006.06.014.
Chen, D. H., J. S. Xue, and X. S. Yang, 2008: New generation of multi-scale NWP system (GRAPES): General scientific design. Chin. Sci. Bull., 53, 3433–3445, https://doi.org/10.1007/s11434-008-0494-z.
Chen, J., J. Wang, J. Du, Y. Xia, F. Chen, and H. Li, 2020: Forecast bias correction through model integration: A dynamical wholesale approach. Quart. J. Roy. Meteor. Soc., 146, 1149–1168, https://doi.org/10.1002/qj.3730.
Deng, G., and Coauthors, 2010: Development of mesoscale ensemble prediction system at National Meteorological Center. Chin. J. Appl. Meteor. Sci., 21, 513–523, https://doi.org/10.3969/j.issn.1001-7313.2010.05.001.
Du, J., 2002: Present situation and prospects of ensemble numerical prediction. J. Appl. Meteor. Sci., 13, 16–28.
Du, J., and M. S. Tracton, 2001: Implementation of a real-time short-range ensemble forecasting system at NCEP: An update. Preprints, Ninth Conf. on Mesoscale Processes, Ft. Lauderdale, FL, Amer. Meteor. Soc., P4.9, 355–356, https://ams.confex.com/ams/WAF-NWP-MESO/techprogram/paper_23074.htm.
Du, J., and B. Zhou, 2017: Ensemble fog prediction. Marine Fog: Challenges and Advancements in Observations, Modeling, and Forecasting, D. Koracin and C. E. Dorman, Eds., Springer, 477–509, https://link.springer.com/chapter/10.1007/978-3-319-45229-6_10.
Du, J., S. L. Mullen, and F. Sanders, 1997: Short-range ensemble forecasting of quantitative precipitation. Mon. Wea. Rev., 125, 2427–2459, https://doi.org/10.1175/1520-0493(1997)125<2427:SREFOQ>2.0.CO;2.
Du, J., G. DiMego, B. Zhou, D. Jovic, B. Ferrier, and B. Yang, 2015: Short Range Ensemble Forecast (SREF) system at NCEP: Recent development and future transition. 23rd Conf. on Numerical Weather Prediction/27th Conf. on Weather Analysis and Forecasting, Chicago, IL, Amer. Meteor. Soc., 2A.5, https://ams.confex.com/ams/27WAF23NWP/webprogram/Paper273421.html.
Du, J., and Coauthors, 2018: Ensemble methods for meteorological predictions. Handbook of Hydrometeorological Ensemble Forecasting, Q. Duan et al., Eds., Springer, 1–52, https://doi.org/10.1007/978-3-642-40457-3_13-1.
Dudhia, J., 1989: Numerical study of convection observed during the Winter Monsoon Experiment using a mesoscale two-dimensional model. J. Atmos. Sci., 46, 3077–3107, https://doi.org/10.1175/1520-0469(1989)046<3077:NSOCOD>2.0.CO;2.
Epstein, E. S., 1969: Stochastic dynamic prediction. Tellus, 21, 739–759, https://doi.org/10.3402/tellusa.v21i6.10143.
Feng, Y., J. Min, X. Zhuang, and S. Wang, 2019: Ensemble sensitivity analysis-based ensemble transform with 3D rescaling initialization method for storm-scale ensemble forecast. Atmosphere, 10, 24, https://doi.org/10.3390/atmos10010024.
Flowerdew, J., and N. Bowler, 2013: On-line calibration of the vertical distribution of ensemble spread. Quart. J. Roy. Meteor. Soc., 139, 1863–1874, https://doi.org/10.1002/qj.2072.
Hong, S. Y., and H. L. Pan, 1996: Nonlocal boundary layer vertical diffusion in a medium-range forecast model. Mon. Wea. Rev., 124, 2322–2339, https://doi.org/10.1175/1520-0493(1996)124<2322:NBLVDI>2.0.CO;2.
Hong, S. Y., and J. O. Lim, 2006: The WRF Single Moment 6-class microphysics scheme (WSM6). J. Korean Meteor. Soc., 42, 129–151.
Jolliffe, I. T., and D. B. Stephenson, Eds., 2003: Forecast Verification. Wiley Press, 247 pp.
Lacarra, J. F., and O. Talagrand, 1988: Short range evolution of small perturbation in a barotropic model. Tellus, 40A, 81–95, https://doi.org/10.1111/j.1600-0870.1988.tb00408.x.
Leith, C. E., 1974: Theoretical skill of Monte Carlo forecasts. Mon. Wea. Rev., 102, 409–418, https://doi.org/10.1175/1520-0493(1974)102<0409:TSOMCF>2.0.CO;2.
Li, J., J. Du, M. Wang, and C. Cui, 2009: Experiments of perturbing initial conditions in the development of mesoscale ensemble prediction system for heavy rainstorm forecasting. Plateau Meteor., 28, 1365–1375.
Li, J., J. Du, Y. Liu, and J. Xu, 2017: Similarities and differences in the evolution of ensemble spread using various ensemble perturbation methods including topography perturbation. Acta Meteor. Sin., 75, 123–146, https://doi.org/10.11676/qxxb2017.011.
Liu, Y. Z., X. S. Shen, and X. L. Li, 2013: Research on the singular vector perturbation of the GRAPES global model based on the total energy norm. Acta Meteor. Sin., 71, 517–526, https://doi.org/10.3969/j.issn.1004-4965.2013.03.020.
Lorenz, E. N., 1963: Deterministic nonperiodic flow. J. Atmos. Sci., 20, 130–141, https://doi.org/10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
Ma, J., Y. Zhu, D. Hou, X. Zhou, and M. Peña, 2014: Ensemble transform with 3D rescaling initialization method. Mon. Wea. Rev., 142, 4053–4073, https://doi.org/10.1175/MWR-D-13-00367.1.
Ma, X., J. Xue, and W. Lu, 2008: Preliminary study on ensemble transform Kalman filter based initial perturbation scheme in GRAPES global ensemble prediction. Acta Meteor. Sin., 66, 526–536, https://doi.org/10.11676/qxxb2008.050.
Ma, X., Z. Zhuang, J. Xue, and W. Lu, 2009: Development of the three-dimensional variational data assimilation system for the nonhydrostatic GRAPES. Acta Meteor. Sin., 23, 725–737.
Mahrt, L., and M. Ek, 1984: The influence of atmospheric stability on potential evaporation. J. Climate Appl. Meteor., 23, 222–234, https://doi.org/10.1175/1520-0450(1984)023<0222:TIOASO>2.0.CO;2.
Mlawer, E. J., S. J. Taubman, P. D. Brown, M. J. Iacono, and S. A. Clough, 1997: Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res., 102, 16 663–16 682, https://doi.org/10.1029/97JD00237.
Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Methodology and validation. Quart. J. Roy. Meteor. Soc., 122, 73–119, https://doi.org/10.1002/qj.49712252905.
Noilhan, J., and S. Planton, 1989: A simple parametrization of land surface processes for meteorological models. Mon. Wea. Rev., 117, 536–549, https://doi.org/10.1175/1520-0493(1989)117<0536:ASPOLS>2.0.CO;2.
Parrish, D., and J. Derber, 1992: The National Meteorological Center spectral statistical interpolation analysis. Mon. Wea. Rev., 120, 1747–1763, https://doi.org/10.1175/1520-0493(1992)120<1747:TNMCSS>2.0.CO;2.
Roberts, B., T. G. Burkely, I. L. Jirak, A. J. Clark, D. C. Dowell, X. Wang, and Y. Wang, 2020: What does a convection-allowing ensemble of opportunity buy us in forecasting thunderstorms. Wea. Forecasting, 35, 2293–2316, https://doi.org/10.1175/WAF-D-20-0069.1.
Toth, Z., and E. Kalnay, 1997: Ensemble forecasting at NCEP: The breeding method. Mon. Wea. Rev., 125, 3297–3319, https://doi.org/10.1175/1520-0493(1997)125<3297:EFANAT>2.0.CO;2.
Wang, J., J. Chen, J. Du, Y. Zhang, Y. Xia, and D. Guo, 2018: Sensitivity of ensemble forecast verification to model bias. Mon. Wea. Rev., 146, 781–796, https://doi.org/10.1175/MWR-D-17-0223.1.
Wang, J., J. Chen, H. Zhang, H. Tian, and Y. Shi, 2021: Initial perturbations based on ensemble transform Kalman filter with rescaling method for ensemble forecasting. Wea. Forecasting, 36, 823–842, https://doi.org/10.1175/WAF-D-20-0176.1.
Wang, X., and C. H. Bishop, 2003: A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes. J. Atmos. Sci., 60, 1140–1158, https://doi.org/10.1175/1520-0469(2003)060<1140:ACOBAE>2.0.CO;2.
Wang, Y., M. Bellus, J. Geleyn, X. Ma, W. Tian, and F. Weidle, 2014: A new method for generating initial condition perturbations in regional ensemble prediction system: Blending. Mon. Wea. Rev., 142, 2043–2059, https://doi.org/10.1175/MWR-D-12-00354.1.
Wu, W., and R. J. Purser, 2002: Three-dimensional variational analysis with spatially inhomogeneous covariances. Mon. Wea. Rev., 130, 2905–2916, https://doi.org/10.1175/1520-0493(2002)130<2905:TDVAWS>2.0.CO;2.
Xia, Y., J. Chen, J. Du, X. Zhi, J. Wang, and X. Li, 2019: A unified scheme of stochastic physics and bias correction in an ensemble model to reduce both random and systematic errors. Wea. Forecasting, 34, 1675–1691, https://doi.org/10.1175/WAF-D-19-0032.1.
Zhang, H. B., J. Chen, X. F. Zhi, Y. Wang, and Y. N. Wang, 2015: Study on multi-scale blending initial condition perturbations for a regional ensemble prediction system. Adv. Atmos. Sci., 32, 1143–1155, https://doi.org/10.1007/s00376-015-4232-6.
Zhang, X., and J. Sun, 2018: Analysis of the July 2018 atmospheric circulation and weather. Meteor. Mon., 44, 1370–1376.
Zhou, X., Y. Zhu, D. Hou, and D. Kleist, 2016: Comparison of the ensemble transform and the ensemble Kalman filter in the NCEP global ensemble forecast system. Wea. Forecasting, 31, 2057–2074, https://doi.org/10.1175/WAF-D-16-0109.1.
Zhou, X., Y. Zhu, D. Hou, Y. Lou, J. Peng, and R. Wobus, 2017: Performances of the new NCEP global ensemble forecast system in a parallel experiment. Wea. Forecasting, 32, 1989–2004, https://doi.org/10.1175/WAF-D-17-0023.1.
Zhuang, Z., J. Xue, and H. Lu, 2014: Experiments of global GRAPES-3DVar analysis based on pressure level and prediction system. Plateau Meteor., 33, 666–674.
Zhuang, Z., R. Wang, J. Wang, and J. Gong, 2019: Characteristics and application of background errors in GRAPES_Meso. J. Appl. Meteor. Sci., 30, 316–331, https://doi.org/10.11898/1001-7313.20190306.