Convective-Scale Sampling Error and Its Impact on the Ensemble Radar Data Assimilation System: A Case Study of a Heavy Rainfall Event on 16 June 2008 in Taiwan

Sampling error stems from the use of ensemble-based data assimilation (EDA) with a limited ensemble size andcanresultinspuriousbackgrounderrorcovariances,leadingtofalseanalysiscorrections.TheWRF-LETKFradarassimilationsystem(WLRAS)isperformedseparatelywith256and40memberstoinvestigatethe characteristics of convective-scale sampling errors in the EDA and its impact on precipitation prediction based ona heavy rainfall eventon16 June2008.The resultssuggest that the sampling errors for this eventaresensitive to the relationships between the simulated observations and model variables, the intensity of reﬂectivity, and how the prevailing wind projects to the radial wind in the areas that the radar cannot resolve U or V wind. The sampling errors lead to an underprediction of heavy rainfall when the horizontal localization radius is inade- quately large, but this can be mitigated when a more accurate moisture condition is provided. In addition, being able to use a larger vertical localization plays an important role in providing necessary adjustments for repre- senting the vertical thermodynamical structure of convection, which further improves precipitation prediction. Astrategymitigatingtheimpactofsamplingerrorsassociatedwiththelimitationofradialwindmeasurementby inﬂating the observation error over sensitive areas can bring beneﬁts to precipitation prediction.


Introduction
Data assimilation (DA) methods, which estimate the optimal initial condition for numerical modeling, synergize the information provided by numerical model simulation and observations and have become an important component for numerical weather prediction (NWP). For convective-scale NWP models, radar DA is widely utilized to represent realistic convective-scale structures because of the high spatial and temporal resolution of radar data. With the advantage of using flow-dependent background error statistics, ensemble-based radar data assimilation (EnRDA) has been extensively applied in operational centers and Denotes content that is immediately available upon publication as open access. research works (e.g., Gao and Xue 2008;Dowell et al. 2011;Chang et al. 2014;Wheatley et al. 2015;Wang et al. 2016;Miyoshi et al. 2016). EnRDA has shown benefits of providing probabilistic predictions for convective storms (Snook et al. 2015;Yussouf et al. 2016;Putnam et al. 2017). There also has been an increasing interest in applying EnRDA to improving the accuracy of precipitation predictions (Bick et al. 2016;Yokota et al. 2018;Gastaldo et al. 2018). With the purpose of improving very short-term (,6 h) rainfall forecasting termed as quantitative precipitation nowcasting (QPN), an EnRDA framework, which coupled the local ensemble transform Kalman filter (LETKF) with the Weather Research and Forecasting (WRF) Model, was established to assimilate data from the radar network in Taiwan (Tsai et al. 2014, hereafter TYL14). This WRF-LETKF radar assimilation system (WLRAS) has been demonstrated its skills for heavy rainfall predictions for cases of typhoons and heavy precipitation episodes (Tsai et al. 2016;Yang et al. 2020;Cheng et al. 2019Cheng et al. , 2020. In EnRDA, model state variables can be updated directly by radar data through the flow-dependent background error covariance (BECV) estimated by the ensemble forecast. Many EnRDA studies have demonstrated the importance of the cross-variable BECV between observed variables and model state variables for updating the model states that are not directly observed by radar (Snyder and Zhang 2003;Tong and Xue 2005;Snook et al. 2015). However, the BECV estimated by ensemble forecast could be suboptimal due to the sampling error introduced by using an insufficient ensemble size (Houtekamer and Mitchell 1998;Miyoshi et al. 2014). Sampling error can contaminate the BECV estimation and result in false analysis corrections. Usually, strategies of covariance localization (Houtekamer and Mitchell 1998;Hamill et al. 2001) are applied to deal with the issues associated with sampling error in the ensemble-based data assimilation (EDA) algorithms. The general idea of covariance localization is to taper the covariance values and limit the analysis corrections at long distances. Another form of localization strategy referred to as ''variable localization'' (Kang et al. 2011) nullifies the cross-variable BECV whenever the relationship between the variables is sufficiently weak, to avoid false corrections on the model state variables. Such a strategy is also applied in Tsai (2014) and Shao (2015) with WLRAS, in which the radial velocity (V r ) is used to update the wind and moisture fields while the radar reflectivity (Z h ) is used to only update the hydrometeor fields.
However, the use of localization is empirical and it can be computationally expensive to find out the optimal setting for different situations. An improper or naive application of these strategies may damage the advantage of EDA and even bring detrimental impacts to the analysis. For example, an inappropriate radius of localization may distort meaningful flow-dependent error covariance (Snyder and Zhang 2003;Miyoshi et al. 2014;Kondo and Miyoshi 2016) or introduce nonnegligible imbalances (Kepert 2009;Greybush et al. 2011). To mitigate the negative impact of sampling error and achieve the full benefit of flow-dependent error covariance, advanced approaches of localization have been proposed for EDA algorithms (e.g., Bishop and Hodyss 2007;Buehner and Shlyaeva 2015;Ménétrier et al. 2015;Anderson 2016;Miyoshi and Kondo 2013). However, limited literature focuses on the advanced approaches tackling sampling error in EnRDA for convective-scale precipitation predictions. In EnRDA, simple isotropic localizations are still most widely adopted (Sobash and Stensrud 2013) while the error covariance at convective scale is more heterogeneous and anisotropic than the one with larger scales (Ménétrier et al. 2014). Strong nonlinear dynamics and uncertainties in the convective-scale physical processes make it difficult to determine an optimal localization for EnRDA universally. Furthermore, different types of precipitating processes during the development of convective storms add the complication of the sampling error issue (Aonashi and Eito 2011;Aonashi et al. 2016).
The advancement of computing resources and technology allows studies to use ensemble forecasts with an order close to or larger than 100 members to investigate the characteristic of ensemble-estimated error covariance and sampling error (e.g., Kunii 2014;Poterjoy et al. 2014;Miyoshi et al. 2014;Kondo and Miyoshi 2016;Kawabata and Ueno 2020). Studies with a relatively larger ensemble size provide new insights into how sampling error or localization distorts the real flowdependent error covariance. Ménétrier et al. (2014) showed that sampling error leads to a spuriously larger horizontal scale of the estimated error variances by comparing the convective-scale ensemble forecasts with 6 and 84 members. Necker et al. (2020a) indicated that a 40-member ensemble is able to provide qualitative information of large-scale error correlations, while at least 200 members are required for providing a reliable estimation of convective-scale error correlations. Necker et al. (2020b) further show that, for convective-scale EDA, it is possible to provide sampling error corrections estimated by a 1000-member ensemble. Among the previous works investigating sampling error with a relatively larger ensemble size, the focuses are mainly on the error covariance between model states or on the benefit for forecast accuracy from increasing the ensemble size. Few studies have investigated the impact of sampling error on the cross-variable BECV between observed and model variables in the context of EnRDA with an ensemble size larger than an order of 100. Even though it is expected that sampling error can have a greater impact when the error correlation is weak (Houtekamer and Mitchell 1998), it is unclear how such impact affects the updating process for the model states, especially the ones that are not directly related to the observed variables in EnRDA. With a purpose to improve the performance of EnRDA, more efforts need to be made to understand the characteristics and attributions of the convective-scale sampling errors that appear in the BECV between nonobservable model states and observed variables (e.g., moisture versus radar observations). By performing WLRAS separately with 40 and 256 ensemble members, this study attempts to identify the attributions of sampling errors in EnRDA. Whether sampling errors have different sensitivities to variables related to the development of convections, such as wind, moisture, and hydrometeors, in regions characterized by different conditions? Is the detrimental impact universal, or is it conditionally dependent on precipitation types or other variables of interest? Also, through what kind of mechanism, sampling error affects the precipitation prediction? Does the measurement limitation contribute to sampling error? Answering these questions can help us design strategies to cope with the sampling errors in WLRAS, such as the optimization of the localization scale and inflation parameter.
The investigations in this study are based on a heavy rainfall event associated with a mesoscale convective system (MCS) developed during the prefrontal condition. This case is characterized by long duration and heavy precipitation near the coast and is a typical heavy rainfall episode during mei-yu seasons. The case is also important in the way that it involves multiscale interactions and WLRAS is applied over complex terrain. In the following, the paper is organized as follows. Section 2 introduces the configuration of WLRAS and the experimental settings. Section 3 presents the comparison of the background error correlations derived from different ensemble sizes, while section 4 discusses the precipitation prediction results of the experiments with different localization scales and ensemble sizes. The summary and discussion are provided in section 5.
2. Configuration of the ensemble radar data assimilation system a. Model and data assimilation system The WLRAS used in this study is performed with the Advanced Research WRF model version 3.3.1 (Skamarock et al. 2008). This version has been applied to the same rainfall event with thoroughly exploration of the assimilation parameters and results have shown skillful rainfall forecast (Shao 2015). All experiments use the twoway nested model domains whose horizontal dimensions are 180 3 180 and 330 3 330 grid points with grid spacings equal to 15 and 3 km, respectively (Fig. 1). The model has 44 vertical levels with the top level at 30 hPa and the gap between model levels is about 90 m near the surface (below 500 m). The same physics parameterization schemes are used in all experiments, including the Goddard Cumulus Ensemble (GCE) models microphysics scheme (Tao et al. 2003), Rapid Radiative Transfer Model for GCMs (RRTMG) radiation scheme (Iacono et al. 2008), Unified Noah land surface model (Tewari et al. 2004), Yonsei University scheme (YSU) for the planetary boundary layer (Hong et al. 2006), and Kain-Fritsch scheme (Kain 2004) for cumulus parameterization. The cumulus parameterization is only applied in the outer domain.
Both V r and Z h are assimilated in all experiments. The observation errors (3 m s 21 for V r and 5 dBZ for Z h ) and multiplicative covariance inflation used in WLRAS are the same as TYL14. The observation operator for simulating Z h is improved based on the scheme described in Dowell et al. (2011) to consider the cold-rain process; Z h (dBZ) is calculated by Z h 5 10 log 10 (Z r 1 Z g 1 Z s ), where Z r , Z g , and Z s are the reflectivity factors associated with rainwater, graupel, and snow, respectively, calculated by Z r 5 3:63 3 10 9 (r a q r ) 1:75 , Z g 5 1:12 3 10 9 (r a q g ) 1:75 , FIG. 1. The range of model domain and radar data (gray contour) used in this study. The color shading indicates the topography, and the red and blue points are the location of RCCG and RCKT radar sites, respectively.
In Eqs.
(1)-(3), r a is the air density (kg m 23 ), and q r , q s , and q g are the mixing ratios (kg kg 21 ) of rainwater, graupel, and snow from the model, respectively. It is expected that the ice-phase hydrometeors above the bright band can be better simulated than the one used in TYL14. The operator of V r is the same as the one described in TYL14 and the relationship between V r (m s 21 ) and the model three-dimensional velocity components U, V, and W (m s 21 ) is where r 5 (x 2 1 y 2 1 z 2 ) 21/2 ; x, y, and z are the Cartesian coordinates with the origin at the radar site; and y t (m s 21 ) is the terminal velocity calculated by y t 5 5:4( p 0 /p) 0:4 (r a q r ) 0:125 , where p 0 is the surface pressure (Pa) and p is the base-state pressure (Pa). The observation localization (R-localization, Hunt et al. 2007) is adopted by multiplying the observation error variance with a distance-dependent weighting function w [Eq. (5)]: Here, d and s are the distance (m) between the observation and the model analysis grid point, and the localization scale (m), respectively. Subscripts h and y denote the horizontal and vertical direction, respectively. Observations located farther than the cutoff scale (triple the value of s h or s y ) from each model grid point will not be used to update the model state. Discussion about the impact of using different cutoff localization scales will be described in section 4.

b. Experimental setup
We focus on the heavy rainfall event that occurred on the 16 June 2008, during the Southwest  Monsoon Experiment/Terrain-Influenced Monsoon Rainfall Experiment (SoWMEX/TiMREX) field campaign. The MCS developed offshore of southwestern Taiwan and brought heavy rainfall over the coastal area. Warm, moist southwesterly flow from the northern South China Sea and a low-level jet (LLJ) set up an unstable condition favorable for convective development (Xu et al. 2012;Chang et al. 2015). The low-level convergence was also enhanced by the cold pool formed by previous rainfall episodes. This plays an important role in maintaining the development of the MCS offshore (Xu et al. 2012;Tu et al. 2014).
The WLRAS was performed separately with ensemble sizes of 256 and 40 members, which hereafter will be referred to as ''large ensemble'' and ''small ensemble,'' respectively. We note that 40-member is a common ensemble size for EnRDA which was also used in TYL14. For both large and small ensembles, we applied three different localization settings (Table 1) as main experiments to investigate the impact of sampling error on precipitation prediction with different localization scales. All experiments were initialized at 1800 UTC 15 June 2008, with initial and boundary conditions generated from the NCEP 18 3 18FNL (Final) Operational Global Analysis data and perturbed according to the background error covariance constructed for the WRF-3DVAR system. After the ensemble was spun-up for 6 h, the data from two S-band radars at Chigu (RCCG) and Kentin (RCKT) were assimilated for the inner domain every 15-min from 0000 to 0200 UTC 16 June 2008. The observing range of RCCG and RCKT ( Fig. 1) covers the southwestern Taiwan, where the main rainfall area was in this event. The quality control of the raw radar data, including ground clutter removal and velocity unfolding, is provided by the Radar Meteorology Laboratory in National Central University, Taiwan. The superobbing procedure (e.g., Alpert and Kumar 2007;Zhang et al. 2009) was carried out after the data were quality controlled. The radial and azimuth spacings of superobs are 5 km and 58, respectively. Figure 2a shows an example of the spatial distribution of the superobs at the first radar plan position indicator (PPI) elevation angle (0.58). In all experiments, both V r and Z h are used to update the three-dimensional velocity components (U, V, W ), perturbation state of potential temperature (u), perturbation state of geopotential, mixing ratio of water vapor (q y ), and mixing ratios of hydrometeors including cloud water (q c ), rain (q r ), cloud ice (q i ), snow (q s ), and graupel (q g ). At the end of the assimilation period, deterministic forecasts were initialized from the analysis ensemble means at 0200 UTC.

Characteristics of sampling error on the ensemble-based background error correlation
EDA follows the Kalman filter update, which the analysis state (x a ) is obtained by updating the background state of model variables (x b ) with observations (y o ): In Eq. (6), P b is BECV, H is the linearized observation operator, and R is the observation error covariance (Houtekamer and Zhang 2016). The term P b H T is the BECV between simulated observations and model variables, such as V r and q y , and it determines how the innovation (y o 2 Hx b ) is spread out to update the background state x b . In EDA, P b and P b H T are estimated by ensemble forecast and can be contaminated by sampling error. Given that BECV can be decomposed to background error correlation (BECR) and error variance, BECR can be regarded as the standardized BECV. In the following, the impact of sampling error is highlighted by comparing the BECRs between simulated observations and model variables derived from the large and small ensemble sets at 0000 UTC 16 June, right before the radar assimilation.
The difference in the BECRs between the two ensembles is regarded as the effect of sampling error. Previous work by Chung et al. (2013) examined how the convective-scale flow-dependency in the BECR is established and found that the structures of the BECR can have different characteristics between precipitation and nonprecipitation regions. This motivates a question whether the sampling errors are sensitive to the flow characteristics, and the BECRs in regions of intense and weak reflectivity are compared separately in this section. Figure 2b is the probabilistic-matched (PM) ensemble mean (Ebert 2001;Fang and Kuo 2013;Schwartz et al. 2014) of the composite reflectivity computed from the large ensemble at 0000 UTC 16 June. The PM ensemble mean is applied here to provide the pattern of the ensemble mean and emphasize the probability density function of the large ensemble. Points A and B in Fig. 2b are the selected observation points locate in weak and intense reflectivity regions, respectively. We calculate point BECRs between the simulated observation variable (V r or Z h ) at the selected points and the model state variable (U, V, q y , or q r ) at a model level near the altitude of the selected observation point, with either the small or large ensemble. The point BECRs will be denoted as CORR(V r , U), for example, which is defined as the BECR between V r at the selected point and the U wind at the model level.  ensemble sets for CORR(V r , V) associated with point B is also apparent ( Fig. 5b versus Fig. 5f). In contrast, CORR(V r , U) associated with point B shows a distinct difference between the two ensemble sets (Fig. 5a versus Fig. 5e). This difference implies that CORR(V r , U) associated with point B is more vulnerable to sampling error. With such error correlation, if there is a positive innovation of V r at point B, the small ensemble will lead to a westerly wind correction north of point B, while almost no correction of this type would be derived with the large ensemble. As will be discussed in more detail later, CORR(V r , U) at point B is sensitive to sampling error because it is in a region where the RCCG radar cannot resolve the U component.
For the hydrometeor variable q r , CORR(Z h , q r ) with both points A and B is characterized by large positive correlation around the observation points with either the large or the small ensemble. This is expected since the simulation of Z h in the observation operator is related to q r directly [Eq. (1)], and both q r and Z h directly reflect the development of moist convection. In comparison, CORR(V r , q r ) of both ensemble sets exhibits a dipole pattern at both point A and B. Such a pattern can be attributed to the effect of the advection of q r caused by V r . Given a condition with abundant q r in the upstream of radial direction to the radar site, a negative V r innovation at point A and B will lead to a positive correction in the downstream with the CORR(V r , q r ) shown in Figs. 3d and 5d. These results explain that CORR(V r , q r ) is dominated by the dynamical condition while the CORR(Z h , q r ) shows a more localized pattern in response to convective-scale thermodynamics and microphysics. In general, both CORR(V r , q r ) and CORR(Z h , q r ) with two points show similar patterns between the large and small ensembles (i.e., low sensitivity to sampling error). Regarding the moisture field, CORR(Z h , q y ) estimated from the two ensemble sets also exhibits similarity at point A ( Fig. 4c versus Fig. 4g), but the estimates from the ensemble sets become very different at point B (Fig. 6c versus Fig. 6g). Such sensitivity of CORR(Z h , q y ) to sampling error associated with points A and B is reversed with CORR(V r , q y ). These results imply that sampling error greatly affects the estimation of CORR(Z h , q y ) in the intense reflectivity area (Fig. 6c versus Fig. 6g) while it has a stronger effect on CORR(V r , q y ) in the weak reflectivity area (Fig. 3c versus Fig. 3g).
The sensitivity of BECR to sampling error is further quantified by the resemblance between the point BECRs derived from the two ensemble sets based on the standardized mean absolute difference (SMAD) defined as The SMAD is computed in a 72 3 72 km 2 area centered at the observation point used to calculate the point BECR. This area is chosen to evaluate the impact of sampling error within the horizontal localization radius equal to 36 km. In Eq. (7), superscript L and S denote the large and small ensembles, respectively, and N is the total model grid numbers in the calculation area. CORR i is the point BECR (e.g., results shown in Figs. 3-6) between observation variables at the observation location and model states at a model grid point i. CORR M is the maximum value of absolute BECRs with either ensemble set, that is, Max[jCORR L i (i 5 1, . . . , N)j, jCORR s i (i 5 1, . . . , N)j] in the area. Because jCORR L i 2 CORR s i j tends to be larger when the error correlation itself is larger, the use of CORR M standardizes the differences when making comparisons between different variables. This allows us to focus on the difference of the pattern between the BECRs derived from the two ensemble sets. A larger SMAD indicates that the BECR derived from the small ensemble shows less similarity to that derived from the large ensemble (i.e., more sensitive to sampling error).
In addition to points A and B, the SMADs are computed at other observation points. The results are shown in Fig. 7. With a direct relationship, CORR(Z h , q r ) exhibit small SMADs at all observation points while the SMADs of CORR(V r , q r ) are generally larger (Fig. 7d versus Fig. 7h). The relationship between V r and the amount of q r can be influenced by multiple processes, including horizontal/vertical advection, convergence, and condensation, etc. Therefore, sampling errors in these processes can lead to large SMADs in CORR(V r , q r ). SMADs of CORR (V r , U) and CORR(V r , V) are generally small given the direct relationship between V r and horizontal winds; however, they are the only two in Fig. 7 that have much larger values at some specific azimuths of radar beams (Figs. 7a,b). Comparing with Fig. 2a, these sensitive areas of CORR(V r , U) and CORR(V r , V) shown in Figs. 7a and 7b do not match the area where simulated V r is near to zero. This suggests that the impact of sampling error on CORR(V r , U) and CORR(V r , V) is not related to how much the simulated V r resolves the total wind. Instead, it is related to how much U or V component separately contributes to the simulated radial wind. When the prevailing southwesterly wind comes toward south of RCCG (offshore of the southwestern Taiwan), only the southerly component contributes to the radial wind of RCCG radar. Thus the radial wind has little correlation with the westerly component. Therefore, CORR(V r , U) over this region can be contaminated easily by sampling error. Similarly, the southerly component does not contribute to the radial velocity west of RCKT, leading to a large SMAD of CORR(V r , V) in that region. In contrast, the SMADs of CORR(V r , V) south of RCKT are smaller since the prevailing wind in this region is dominated by southerly (see Fig. 2a) and projects strongly to the radial wind.
To further prove this point, we examine the ratios (r u and r y ) of the projection of the U and V wind on the radial direction to the radial wind at observation points for all members of the large ensemble: In Eqs. (8a) and (8b), k is the index of the ensemble member, a is the elevation angle (8), b is the azimuth angle (8) of the observation point, and u k o , y k o , and V k r are the U wind, V wind, and radial wind at observation point, respectively. Figure 8 is the map of the differences between the ensemble mean of these two ratios (r u 2 r y ) south and west of the radar. The orange color indicates that the U component contributes larger to V r , while the green color indicates more V component. For example, the contributions of V to radial wind over south of RCKT radar are much larger than those of U due to the stronger southerly wind south of RCKT (Fig. 2a), while the contributions from U and V in the south of RCCG are relatively comparable (Fig. 8a). Similarly, given that westerly wind dominates west of RCCG, the contributions of U are relatively larger than V in the region west of RCCG while those in the west of RCKT are more comparable (Fig. 8b). Figure 8a shows that locations south of the radar with small ratio difference (dark orange and dark green) correspond to the area with large SMADs in Fig. 7a. The same characteristic applies to Figs. 8b and 7b. Therefore, the ''hot spots'' for sampling error in Figs. 7a and 7b are results of combining factors, including the prevailing wind and the blind area, where the radar radial wind cannot resolve the U or V component.
Regarding the moisture field, the impact of sampling error on the BECRs between observations and moisture is sensitive to the intensity of reflectivity. The SMADs of CORR(Z h , q y ) are generally larger over the intense FIG. 9. The boxplot of the SMADs calculated by comparing the BECRs estimated from the large ensemble and those from the 50 subensemble sets with 40 members at 0000 UTC 16 Jun 2008. Each column in this figure has 50 values of SMAD computed based on the 50 subensembles. The shaded box indicates the range of the first quartile (Q1) and the third quartile (Q3). The horizontal black line in the box is the median. The upper whisker are the last data less than Q3 1 1.5 3 IQR, and the lower whisker are the first data larger than Q1 2 1.5 3 IQR, where IQR is the interquartile range (Q3 2 Q1). The cross marks are outliers which are defined as the data those outside the upper and lower whisker. reflectivity area (enclosed by the red contour in Fig. 7g) while the SMADs of CORR(V r , q y ) are larger over the area with weaker reflectivity (Fig. 7c). These features may be related to different dynamic characteristics over the areas of intense and weak reflectivity. When there is dominant dynamic forcing, such as moisture transport and convergence in the intense reflectivity area, the wind direction is more consistent among ensemble members, and CORR(V r , q y ) can be robust and is easier to be estimated. In comparison, the wind field can be impacted by multiple factors outside the convective cells, such as prevailing wind, offshore flow, and topography, thus the uncertainty of the wind direction is large there. This makes the estimation of CORR(V r , q y ) vulnerable to sampling error and leads to the large SMADs of CORR(V r , q y ) in the area outside of intense reflectivity. On the other hand, q y and Z h are implicitly related through the nonlinear phase transition in the microphysics during convective development. The phase transition from q y to hydrometeors (i.e., Z h ) is related to the dynamical (e.g., vertical motion) and thermodynamical (e.g., temperature) conditions, which have large uncertainties in the region of intense reflectivity. These results imply that if only Z h is assimilated, sampling error tends to introduce erroneous corrections on the moisture field in the intense reflectivity area, while such detrimental effect will take place in the weak reflectivity area when only V r is assimilated.
To further justify our argument for Fig. 7, which only uses one small ensemble set, a bootstrapping method is applied to test the robustness of SMADs. The method is used to avoid running the experiments with small ensemble size multiple times and the large ensemble is taken as the pool to withdraw the samples to form small ensemble sets. We randomly withdraw 40 members 50 times from the large ensemble at 0000 UTC and compute the point BECRs between the observation and model variables at points A and B based on these subensemble sets. Then the SMADs are calculated for each sub ensemble by comparing their BECRs with the large ensemble. Figure 9 shows the boxplot of the 50 SMADs. Generally, the results of the SMADs show features consistent with Figs. 3-6 and 7. The SMADs of CORR(Z h , q r ) are smaller than those of CORR(V r , q r ) while both of them show little difference between points A and B. These findings again support the notion that the direct relationship between Z h and q r leads to a robust BECR which is less sensitive to sampling error in either weak or intense reflectivity area. The SMADs of CORR(V r , U) and CORR(V r , V) add further support to our statement that sampling error can contaminate the BECR between radial wind and horizontal wind fields at a certain azimuthal direction of the radar site (point B is located south of RCCG). Also, the SMADs of CORR(Z h , q y ) with point B are larger than those with point A, while those of CORR(V r , q y ) are larger with point A. This also echoes the idea that the BECR between reflectivity and moisture is more robust in the weak reflectivity area, but such relationship between wind and moisture is more robust in the intense reflectivity area. The moisture can be affected by dynamics (e.g., wind advection, convergence) and thermodynamics (e.g., condensation and evaporation) which are factors characterizing the intensity of the reflectivity. These dependencies reveal the limitation of correcting the moisture when assimilating radar data with limited ensemble size. In summary, if the BECR is small due to nonlinear dynamics or thermodynamics, it will be more vulnerable to sampling error. This agrees with the statistical theory described in Stuart and Ord (1987) and Houtekamer and Mitchell (1998) which indicated that the error in the correlation estimation is larger when the correlation is smaller.

Impact of strategies applied to error covariance in WLRAS on precipitation prediction
This section presents the results of WLRAS using different horizontal and vertical localization scales with the large and small ensembles to demonstrate the impact of sampling errors on the EnRDA and convective-scale precipitation prediction. With WLRAS, the simulated FIG. 10. Forecast skill diagram for accumulated rainfall prediction over southwestern Taiwan area (indicated by the dashed box in Fig. 11g) using gradual changing localization scales from 12 to 36 km with the small ensemble set. The probability of detection, success ratio, threat score, and bias are indicated with the abscissa, ordinate, green contours, and brown dashed lines, respectively. Different colors indicate forecast skills evaluated at different precipitation thresholds.

SEPTEMBER 2020
W U E T A L .
radar reflectivity in all experiments establishes a rainband pattern offshore of southwestern Taiwan similar to the observed reflectivity pattern (figure not shown). However, this simulated rainband pattern is established 2 hours earlier than what was observed. For this study, we will neglect this phase shift and focus on the comparison of the rainfall prediction accumulated from 0200 to 0700 UTC and observation from 0200 to 0900 UTC, in order to emphasize the behaviors that lead to the differences among the experiments.

a. Impact of horizontal localization
To identify the impact of localization on precipitation prediction, Fig. 10 shows the performance diagram of experiments that use gradual changing horizontal localization scales from 12 to 36 km with the small ensemble. The number in the experiment name indicates the horizontal and vertical localization scales. For instance, S2404 is the experiment that uses horizontal and vertical localization scale equal to 24 and 4 km, respectively. Based on the results using a vertical localization of 4 km, a scale of 20 km is the upper limit of the localization scale and the experiments that use a horizontal localization scale longer than 20 km clearly have poorer performances for thresholds of intense rainfall (.50 mm). In more details, S1204 has the highest probability of detection while S1604 and S2004 have a higher success ratio and smaller bias. In the following, we focus on the experiments using horizontal localization of 12 and 36 km to discuss the impact of sampling error with different localization scales in more details. Figure 11 shows the accumulated rainfall predictions of all experiments listed in Table 1. We note that the choice of a 12-km horizontal and 4-km vertical scale is a conventional localization setting the same as TYL14. As shown in the observation (Fig. 11g), heavy rainfall over the coast of southwestern Taiwan is an important feature of this event. Both S1204 (Fig. 11a) and L1204 (Fig. 11d) exhibit heavy rainfall over the coastal area except that the rainfall maximum of L1204 is more intense and extends farther south. The precipitation prediction with the two ensemble sets becomes very different with a longer horizontal localization (Figs. 11b,e). L3604 exhibits a rainfall pattern similar to that of L1204 while S3604 shows a substantial reduction in rainfall intensity compared to S1204. This suggests that sampling error can degrade the performance of rainfall prediction when using WLRAS with a small ensemble size as well as an inadequately long horizontal localization. Such precipitation discrepancies can be attributed to the differences of the moisture field and hydrometeors between the analysis means of L3604 and S3604 after 2-h assimilation (Figs. 12a,c); the differences are much larger than those between the analysis means of L1204 and S1204 (Figs. 12b,d). This again demonstrates that sampling error can contaminate the distant BECV and adversely affects the analysis.
To examine how the differences caused by sampling errors in S3604 lead to the precipitation differences from L3604, three sensitivity experiments are conducted by modifying the initial condition at 0200 UTC. The experiment SQall2L takes the L3604 analysis mean, but replaces the mixing ratios of hydrometeors (i.e., q r , q c , q i etc.) and q y with those from the S3604 analysis mean at 0200 UTC. The experiment SHyd2L is the same as SQall2L, except that the substitution is done without replacing q y . Reversely, LQv2S is the experiment which takes the S3604 analysis mean but replaces its q y with that from the L3604 analysis mean. The precipitation predictions from these experiments are shown in Fig. 13. Compared with L3604, SQall2L exhibits much weaker rainfall intensity over the land while SHyd2L shows a result similar to L3604. In comparison, LQv2S shows a stronger rainfall intensity than S3604 does and even FIG. 12. The differences of (a),(b) precipitable water and (c),(d) total hydrometeors between analysis mean of (a),(c) L3604 and S3604 and (b),(d) L1204 and S1204 at 0200 UTC 16 Jun 2008.
SEPTEMBER 2020 W U E T A L .
agrees better with the observation. The orientation of the heavy rainfall offshore of southwestern Taiwan also changes from a northeast-southwest (Fig. 11b) to an east-west configuration (Fig. 13c). The comparison between SQall2L and SHyd2L implies that even as both q y and hydrometeors may be less accurate in S3604, the moisture field plays a much more dominant role in precipitation prediction for this case. The result of LQv2S further confirms the importance of having a more accurate initial moisture field. Figure 14 shows the convergence and horizontal wind of the experiments and the observation wind analysis product from the Wind Synthesis System Using Doppler Measurements (WISSDOM; Liou et al. 2016) at 0300 UTC. Similar to L3604, SHyd2L exhibits a stronger convergence near the southwest coast of Taiwan than SQall2L does (indicated by the boxes). In addition, SHyd2L and L3604 show westerly winds near 22.58N; the winds are comparable to the observations (Fig. 14f) and stronger than S3604 and SQall2L. With the moisture field taken from L3604, LQv2S also exhibits stronger westerly winds and convergence than S3604 does over the southwest of Taiwan. These responses to the wind field and convergence are a consequence of modifications of the moisture content and buoyancy. The difference of initial moisture amount between L3604 and S3604 can greatly affect the corresponding buoyancy due to the release of latent heat. This results in a different condition for low-level convergence. Figure 14 also suggests that the impact of the moisture field on the convergence can be established within 1-h forecast. Therefore, the detrimental impact from sampling errors on the moisture field can considerably affect the precipitation prediction and is not negligible when using a small ensemble. Similar experiments are performed to replace the wind field in L3604 with the one from S3604 and vice versa (figure not shown). Results confirm that the impact of high moisture content with the large ensemble dominates and it can overcome the deficiency in the initial wind field and the associated buoyancy helps establish the convergence, resulting in heavy rainfall over southwestern Taiwan. The performance of WLRAS is also evaluated based on the innovation statistics (differences between observation and background). Figure 15 is the time series of root-mean-square innovation (RMSI) of the ensemble mean of V r and Z h during analysis-forecast cycles. In Fig. 15, dashed lines indicate the RMSI of the experiments with the small ensemble sets, while solid lines show those with the large ensemble sets. Compared with L3604 and L1204, S3604 exhibits much larger RMSI with both V r and Z h . Among experiments with a small ensemble size, only the RMSI of S1204 is comparable to the experiments with the large ensemble; this again justifies the choice of horizontal localization in TYL14. It is also evident that L3612 exhibits the smallest RMSI in Z h , indicating that a longer localization has the advantage when there is a large enough ensemble size to better represent the flow-dependent error covariance. FIG. 13. Accumulated rainfall from 0200 to 0700 UTC of the sensitivity experiments (a) SQall2L, which took the analysis mean from L3604 but replaced the hydrometeors and the q y by S3604 at 0200 UTC 16 Jun 2008, and (b) SHyd2L, which is the same as SQall2L except the substitution is done without replacing q y . (c) LQv2S is the experiment which took the S3604 analysis mean but replaces its q y with that from the L3604 analysis mean at 0200 UTC 16 Jun 2008. The difference between L3612 and L3604 will be further discussed in the next subsection.
When a much longer horizontal localization radius (108 km) is adopted, the V r RMSI becomes considerably worse and the rainfall intensity is much weaker with either the large or small ensemble set (figure not shown). This suggests that under the WLRAS framework, even though an ensemble size of 256 members seems to be large, sampling error still limits radar observations to provide a reliable correction beyond a horizontal range broader than 100 km even for the wind field. This limitation reflects the uncertainty nature of this precipitation event, which involves multiscale characteristics including prevailing wind, MCS, and the interaction with topography.

b. Impact of vertical localization
In this subsection, we investigate the impact of increasing the vertical localization scale from 4 to 12 km.
Comparing the results of L3612 with L3604 (Figs. 11e,f), it appears that adopting a larger vertical localization scale reduces the overestimation of rainfall intensity over the coastal area of southern Taiwan. Figure 16 shows the cross section of the BECVs between the simulated Z h and model states (u or q y ) derived from the large ensemble at 0000 UTC. It is evident that there is a strong covariance between lower and higher levels for both moisture and temperature fields. Such characteristics reflect the vertical scale of convection cells. With a shorter vertical localization (4 km), the correction becomes more locally limited than that for 12 km (as indicated by dashed boxes in Fig. 16), unless observations are sufficiently dense to capture the vertical structure. Figure 16 also implies that different vertical localizations result in different vertical adjustments and thus lead to different thermodynamic conditions. This can be seen from the vertical cross section of difference in u and q y between the analysis mean of L3604 and L3612 at 0200 UTC after the radar data are assimilated (Fig. 17). With more correction of u and q y in the vertical, L3612 is cooler and drier than L3604 below 5 km, but is warmer and more moist above 5 km, which is a feature of reducing the buoyancy. Thus, the atmospheric condition in L3612 is thermodynamically less unstable than that of L3604, alleviating the excessive rainfall shown in L3604. These results also imply the use of a larger vertical localization in convective-scale data assimilation can better address the issue of vertical adjustment for convective-scale thermodynamic structures. This proper adjustment in vertical is essential for correctly representing the strength of the convections and thus precipitation intensity. At last, Fig. 18 shows the bias scores in the 5-h accumulated precipitation over land of southwestern Taiwan (the dashed box in Fig. 11g) for all experiments. Although L3604 has a good forecast skill in terms of bias, whose perfect value is 1, for the thresholds smaller than 75 mm (5 h) 21 , it has a wet bias at large thresholds. By increasing the vertical localization scale, the bias of L3612 becomes closer to 1, much better than those with large ensembles using a short vertical localization scale (L1204 and L3604). With small ensemble, S3612 still underestimates the precipitation amount over land, similar to that in S3604. In other words, the importance of having the proper vertical adjustment for thermodynamic structure is concealed with the presence of sampling error. For S3604 and S3612, the bias scores are both much less than 1, which again confirms that sampling error results in underestimation of precipitation prediction for this case.
c. Mitigating the impact of sampling error associated with nonobservable tangential wind Given the unique pattern of sampling errors associated with the projection of U and V wind on the radial direction (Figs. 7a,b), it is possible to apply strategies to alleviating the detrimental impact of such errors under the presence of strong prevailing wind. Therefore, we propose a strategy to lessen the corrections at these specific locations by increasing the observation error of V r . When updating the V wind with a limited ensemble size, the weighting of V r observation error [Eq. (5)] is tripled if the superobs is located between azimuth 608 and 1208 (east of radar) or between 2408 and 3008 (west of radar). Similarly, the V r observation error is tripled in south and north of radar when updating the U wind. An experiment (named LimVr), using the same setting as S3604 but applying the strategy mentioned above, is conducted to evaluate its feasibility. Figure 19a shows that the accumulated precipitation from the LimVr is more intense than that of S3604 (Fig. 11b) over the coastal area. Results also show that there is a substantial difference in the wind fields between the analysis means of the two experiments at 0200 UTC, especially for the zonal wind component (Fig. 19b). The westerly wind difference appears near the coast of southwestern Taiwan where the intense rainfall was observed (Fig. 11g). By enhancing the westerly wind, LimVr enhances the moisture flux into Taiwan relative to that of S3604 (Fig. 19c). Consequently, the precipitation amount over southwestern Taiwan becomes larger, despite that the LimVr moisture is still underestimated as compared to L3604. The result of LimVr shows that the proposed strategy has the potential to deal with sampling errors associated with the limitations of radial wind measurements and bring benefit to precipitation prediction.

a. Summary
This study aims to investigate the characteristics of sampling errors accompanied by the EnRDA and FIG. 15. Root-mean-square innovations of (a) V r and (b) Z h ensemble mean in the observation space during the analysisforecast cycles from 0000 to 0200 UTC 16 Jun 2008. how they affect the convective-scale background error correlation (BECR) and the performance of very short-term precipitation prediction. Data assimilation experiments using WLRAS were conducted with large (256 member) and small (40 member) ensemble sizes, with different localization scales, for a heavy rainfall event in southwestern Taiwan brought by the MCS on 16 June 2008. This heavy rainfall event is a typical and important case of ''long-lived rainy mesoscale system'' with characteristics of the prefrontal unstable environment, prevailing southwesterly wind and interactions of topography. Therefore, we anticipate that the findings of this study can be applied to the cases with similar dominant dynamical features.
Results from comparing experiments using small and large ensembles show consistent results with previous studies. The BECRs have characteristics for the areas with different precipitation types (e.g., Aonashi et al. 2016) and sampling error contributes spurious smallscale structure (e.g., Ménétrier et al. 2014;Necker et al. 2020a). New findings are derived with respect to the characteristics and attributions of sampling error based on the cross-variable BECR between observed and model variables, and explain how they lead to the degradation of precipitation prediction. Focusing on the assimilation of radar observations, our new findings are: 1) The impact of sampling errors on cross-variable BECR for this heavy rainfall event can be characterized by (i) the relationship between the simulated observations and model states (observable versus nonobservable) and (ii) how the prevailing wind projects to the radial wind in the areas that the radar cannot resolve the U or V component. 2) If an improperly large horizontal localization is used with a limited ensemble size, deficiencies in the moisture field becomes a key for precipitation prediction. 3) With a large ensemble, the sensitivity of precipitation prediction to the horizontal localization scale is less but the optimization of the vertical localization scale becomes important to better represent the thermodynamic structure for convection development. 4) A strategy to tune the observation errors of V r can mitigate the detrimental impact on wind corrections in the hot-spot areas sensitive to sampling error.
More specifically, the ensemble-based BECR estimation is less sensitive to the ensemble size when the relationship between variables is direct and robust, such as Z h and q r . This agrees with the statistic nature that the weak signal is easily contaminated by sampling error and thus a limited ensemble size can lead to large uncertainties in estimating this kind of error correlation. For sampling errors arising from less direct relationships between variables, moisture is a particularly difficult variable to be updated. The estimated BECRs between q y and Z h show strong sensitivity to sampling error in the intense reflectivity area, while such sensitivity of the BECRs between q y and V r is shown in the weak reflectivity area. This implies that sampling error can contaminate the moisture corrections by assimilating either Z h or V r , and thus it can greatly affect the development of convective storms. The variations of moisture can be contributed by different dynamical and physical processes, such as moisture advection, convergence and evaporation. This makes estimating error correlations between the simulated radar observations and moisture much more uncertain and thus, sampling error has a considerable impact on them. Last, another noticeable impact of the sampling errors shown in WLRAS is associated with how much the prevailing wind projects to the unobservable wind component tangential to the radar beams. In the areas that the radar beams cannot resolve the U or V component (e.g., U wind in south/north, and V wind in east/west of radar), the BECR between the simulated V r and model horizontal wind is sensitive to sampling error when the projections of the U and V component to the simulated V r are comparable. In summary, the sampling errors in EnRDA have considerable complexity, which is not only related to the dynamic/thermodynamic properties of the weather systems but also the limitation of the radial wind measurement.
The use of an excessively horizontal localization scale with small ensemble encourages the detrimental effect of sampling errors in sensitive areas. This effect results in underestimation in moisture and hydrometer fields and affects the convergence field within 1-h forecast. As a consequence, the precipitation is underpredicted. In addition, sensitivity experiments show that the precipitation prediction is more sensitive to the moisture condition while it seems to be less sensitive to the accuracy of hydrometers for this case. On the other hand, adopting a large vertical localization scale with a large ensemble size can be important in terms of representing the buoyancy effect. The proper vertical adjustment for the thermodynamic structure of the convection  Fig. 11g). The horizontal black dash-dotted line indicates the optimal bias value, which is equal to 1. modifies the condition of stability and thus improves the intensity and location of heavy rainfall prediction. With sparse observations in the vertical, corrections by using a short vertical localization are limited to either low or middle levels, and result in a more convectively unstable condition and overpredicted precipitation.
To tackle sampling errors associated with unobservable tangential wind component, a strategy is proposed to modify the R-localization used in WLRAS. The strategy aims to mitigate the impact of contaminated correction for updating the horizontal wind by increasing the observation error of V r in specific beams that are sensitive to sampling error. As a proof of concept, the preliminary test with the small ensemble and a long horizontal localization (36 km) shows that the strategy can reduce the detrimental impact of the sampling errors on the horizontal wind. With this new strategy, the westerly wind at the coast of southwestern Taiwan (south of RCCG) is enhanced and thus, the moisture flux becomes greater, leading to an improvement in rainfall intensity prediction.
In this study, the single-suite physics configuration was chosen for the experiments in order to understand how sampling errors in the initial condition impacts the forecast performance. However, we should note that model errors related to physical processes can also contribute to sampling errors and may require further investigation in a future study. b. Implication for convective-scale data assimilation in Taiwan The detrimental impact from sampling error on precipitation prediction shown in this study suggests that the performance of the convective-scale data assimilation framework in Taiwan based on WLRAS could be limited with a small ensemble size. The new strategy proposed in this study demonstrates an example of strategies to cope with sampling errors in WLRAS, and it can be applied for other conditions with a prevailing wind such as synoptic-scale fronts approaching Taiwan. The success of the proposed strategy also implies the potential to design variabledependent localization strategies based on the comparison shown in section 3 for the state variables that severely suffer from sampling error. For example, a short localization or large observation error can be adopted when assimilating Z h to update moisture in the intense reflectivity area. Potential improvement may also be pursued by assimilating specific types of variables or additional observations at specific locations, in addition to the solution of increasing the ensemble size. As a remedy for the issues from updating the moisture field with radar data, observations that carry moisture information with a high temporal frequency, such as the ground-based Global Navigation Satellite System (GNSS) observations (Yang et al. 2020; de Haan 2013), water vapor differential absorption lidar (Harnish et al. 2011), or water vapor channels of the new generation satellite data (Honda et al. 2018), can be complementary to radar data. Given that southwesterly flow prevails during the mei-yu seasons, wind profilers can be installed at the coast of Kaohsiung city (south of RCCG), in addition to applying the inflation strategies to deal with sampling errors associated with the unobservable wind tangential to radar beams. It is also expected that assimilating dense surface observations, such as the Automatic Rainfall and Meteorological Telemetry System in Taiwan, partially alleviates the issues discussed above and has the potential to further improve very short-term severe weather prediction.