Short-Range (0–12 h) PQPFs from Time-Lagged Multimodel Ensembles Using LAPS

Hui-Ling Chang Central Weather Bureau, Taipei, and Department of Atmospheric Sciences, National Central University, Jhong-Li, Taiwan

Huiling Yuan School of Atmospheric Sciences, and Key Laboratory of Mesoscale Severe Weather, Ministry of Education, Nanjing University, Nanjing, Jiangsu, China, and NOAA/Earth System Research Laboratory, Boulder, Colorado

Pay-Liam Lin Department of Atmospheric Sciences, National Central University, Jhong-Li, Taiwan


Abstract

This study pioneers the development of short-range (0–12 h) probabilistic quantitative precipitation forecasts (PQPFs) in Taiwan, producing the PQPFs from time-lagged multimodel ensembles using the Local Analysis and Prediction System (LAPS). In this way, the critical uncertainties in the prediction process can be captured and conveyed to users. Because LAPS adopts diabatic data assimilation, it mitigates the “spinup” problem and produces more accurate precipitation forecasts during the early prediction stage (0–6 h).

The LAPS ensemble prediction system (EPS) has a good spread–skill relationship and good discriminating ability. Therefore, although it is clearly wet biased, its forecast biases can be corrected to improve the skill of the PQPFs through a linear regression (LR) calibration procedure. Sensitivity experiments on two important factors affecting the calibration results are also conducted: the choice of training samples and the accuracy of the observation data. The first factor reveals that the calibration results vary with the training samples. From a statistical viewpoint, enough samples are needed for an effective calibration; nevertheless, adopting more training samples does not necessarily produce better calibration results. To achieve better calibration, it is essential to adopt training samples whose forecast biases are similar to those of the validation samples. The second factor indicates that, because the accuracy of the observation data differs between the sea and land areas, only calibrating these two areas separately ensures better calibration of the PQPFs.

Corresponding author address: Huiling Yuan, School of Atmospheric Sciences, Nanjing University, 22 Hankou Rd., Nanjing, Jiangsu 210093, China. E-mail: yuanhl@nju.edu.cn


1. Introduction

Taiwan is noted for its distinctive geographic environment. Weather systems such as spring rainfall, mei-yu fronts, typhoons, and afternoon thunderstorms occur throughout the year. Among them, mei-yu fronts and typhoons accompanied by heavy rainfall often cause disasters and economic losses in Taiwan. Therefore, academic researchers and governmental organizations emphasize the importance of quantitative precipitation forecasts (QPFs), especially short-range (0–12 h) QPFs of severe meso- and convective-scale weather systems, which have the most direct impact on people’s property and safety.

One of the primary difficulties in short-range QPFs is the spinup problem (Heckley 1985; Donner 1988), because the processes of condensation and latent heat release are not easily predicted by traditional models. The fundamental causes are the following: 1) the humidity distribution and convergence fields of the atmosphere cannot be fully resolved by the observations, and 2) most models adopt adiabatic initializations, so hydrometeor information is absent from the initial fields and must be spun up gradually by the microphysical processes of mesoscale models (Mohanty et al. 1986). Therefore, accurate precipitation forecasts are not easily obtained during the early stage of model integration, and the short-range QPF ability of mesoscale models is seriously degraded.

Early research indicates that if diabatic information could be provided by initial fields, the performance of numerical models during the early stage of model integration would be largely improved, and the ability of short-range QPFs for mesoscale models would be enhanced. Krishnamurti et al. (1991) and Harms et al. (1993) retrieved the vertical distribution of moisture and latent heat by using observed rainfall rates and then introduced them into the initial fields to enrich the diabatic information of numerical models. This showed that the spinup time was reduced and the precipitation forecasts during the early stage of model integration were improved.

The Local Analysis and Prediction System (LAPS) used in this research can mitigate the spinup problem because the diabatic effect has been included during the atmospheric analysis and initialization processes. Therefore, more accurate precipitation forecasts can be obtained during the early stage of a forecast period (Jian et al. 2003). This forecast system was developed by the Central Weather Bureau (CWB) in partnership with the National Oceanic and Atmospheric Administration/Earth System Research Laboratory/Global Systems Division (NOAA/ESRL/GSD), for the purpose of improving the capability of short-range QPFs for severe weather systems.

At present, major forecast centers pay increasing attention to advanced data assimilation schemes. The three-dimensional variational data assimilation (3DVAR) technique has been widely used in operational centers, and several centers [e.g., the European Centre for Medium-Range Weather Forecasts (ECMWF), France, the United Kingdom, Japan, and Canada] have switched to 4DVAR. The ensemble Kalman filter (EnKF) is a relatively new data assimilation method that has been tested in some operational developments. Although LAPS does not adopt such advanced schemes, it has shown stable, good performance in short-range precipitation forecasts in long-term verifications over the past 6 years at CWB. Refer to section 2 for a more detailed description of LAPS.

In addition to improving data assimilation in numerical weather prediction (NWP) systems, increasing attention has been paid in recent years to ensemble prediction systems (EPSs) to reduce forecast biases, especially in short-range QPFs. In contrast to the deterministic viewpoint of traditional NWP models, there are various uncertainties in all steps of the NWP system, including the observation, first-guess, data assimilation, and prediction processes. Chaos theory states that a small change in the initial conditions can drastically change the long-term behavior of a system through interactions. An EPS uses perturbed initial states or treats the physics as stochastic processes, reflecting the chaotic nature of the atmosphere. Averaging the ensemble forecasts from slightly perturbed initial conditions can filter out some unpredictable components of the forecast, and the spread among the forecasts can provide guidance on their reliability (Toth and Kalnay 1993). This is a fundamental and revolutionary change in NWP development.

The major question in EPS design is how to generate ensemble perturbations that reflect the real initial uncertainty (Toth and Kalnay 1993). Two operational ensemble perturbation methods are the breeding of growing modes (BGM) method at the National Centers for Environmental Prediction (NCEP), which contains fast-growing modes corresponding to the evolving atmosphere, and the singular vector method at ECMWF, which involves the tangent linear model.

The early development of EPSs focused on capturing the critical uncertainties in an ensemble system with the final goal of achieving a single best forecast (e.g., by adopting ensemble weighting). For example, several studies investigated ensemble precipitation forecasts in the Taiwan area (Chien et al. 2003; Chien and Jou 2004; Yang et al. 2004) using the fifth-generation Pennsylvania State University–National Center for Atmospheric Research (NCAR) Mesoscale Model (MM5; Grell et al. 1995). They discussed different weighting methods to improve the skill of the ensemble QPFs, including the weighted averaging method, the multiple linear regression technique, and the probability matching approach. In some ensemble designs, the ensemble mean was not the best forecast (e.g., Chien et al. 2003), while in other studies the precipitation forecast from the ensemble mean was superior to that of any single member (e.g., Chien et al. 2005). Chien et al. (2005) showed that the best skill scores were obtained when the ensemble configuration included three uncertainty sources: initial fields, cumulus parameterizations, and microphysical schemes. Of the three, varying the initial fields benefited ensemble precipitation forecasts the most, followed by varying the cumulus parameterizations and the microphysical schemes.

As with other recent EPSs, the development of the LAPS EPS emphasized not only capturing the critical uncertainties, but also conveying those uncertainties to forecasters and end users, helping users further understand the possibility and reliability of the forecasts. Sensitivity experiments were conducted in this study to identify the critical uncertainties in the EPS, and a time-lagged multimodel ensemble was created. To convey the uncertainties in the prediction process to users, PQPFs were developed (see section 3).

Early research indicates that calibration is a critical procedure to correct forecast biases and enhance forecast skill in a biased forecast system (Mass 2003). Calibration methods for PQPFs include model output statistics (MOS; Glahn and Lowry 1972; Vislocky and Fritsch 1997), the artificial neural network technique (ANN; Mullen and Buizza 2004; Yuan et al. 2007a), and the linear regression method (LR; Lu et al. 2007; Yuan et al. 2008), to name a few. Calibration results are strongly affected by insufficient sample size (Atger 2003) and by interdependence among samples (Eckel and Walters 1998). In addition, using long-term training samples with similar climatological characteristics can effectively improve bias correction (e.g., calibration conducted with training samples classified by topography or climatology; Yuan et al. 2008). In this study, the verification results showed that the LAPS ensemble was markedly wet biased. Therefore, the LR method (Yuan et al. 2008) was used to correct the forecast biases (see section 4).

This study focuses on short-range PQPFs of typhoons, or tropical cyclones (TCs), using time-lagged multimodel ensembles. Since predictability usually decreases for subsynoptic and mesoscale systems, it is more difficult to develop an effective short-range EPS, in particular for TC ensembles (e.g., Cheung 2001). An additional difficulty is the different error characteristics in the tropics, which mainly result from the strong convection in the area and the air–sea interaction. In general, skill improvement in forecasting TC motion is quite promising when the uncertainties in the large-scale steering flow can be simulated by the perturbations, while ensembles for TC intensity are still being developed, in part because intensification is not yet fully understood.

The ensemble forecasts of TC precipitation using the LAPS EPS are examined in this study. This paper is organized as follows: LAPS and the observation data used for verification are introduced in section 2. The design of the LAPS EPS and the PQPF products are presented in section 3. Sections 4 and 5 describe the calibration methodology for the PQPFs and the verification results before and after the LR calibration, respectively. The sensitivity experiments on calibration, involving the training samples and the accuracy of the observation data, are presented in section 6. A summary and future work are given in the last section.

2. Model and data

a. Short-range forecast system LAPS

LAPS has three main components: observation data ingestion, diabatic data assimilation, and mesoscale model forecasting (Fig. 1). The ingested data include model forecasts (used as the background field), surface observations, soundings, Aircraft Communications Addressing and Reporting System (ACARS) reports, Doppler radar data (including reflectivity and radial velocity fields from the Wu–Fen–Shan, Ken–Ting, Hua–Lian, and Chi–Gu radars; Fig. 2), satellite IR and visible (VIS) data [from the geostationary Multifunctional Transport Satellite (MTSAT)], and satellite-derived wind fields provided by the University of Wisconsin Cooperative Institute for Meteorological Satellite Studies (UW-CIMSS).

Fig. 1. Schematic diagram of the short-range forecast system LAPS.

Citation: Monthly Weather Review 140, 5; 10.1175/MWR-D-11-00085.1

Fig. 2. QPESUMS domain. The radar coverage is indicated by the shaded area, and the four radar sites are indicated by closed triangles, including the Wu–Fen–Shan (RCWF), Ken–Ting (RCKT), Hua–Lian (RCHL), and Chi–Gu (RCCG) radars.


After data ingestion, LAPS performs diabatic data assimilation, including wind analysis (Albers 1995), surface analysis, temperature analysis, cloud analysis (Albers et al. 1996), moisture analysis (Birkenheuer 1999), and a dynamical balance module. In the wind, surface pressure, and temperature analyses, LAPS adopts a two-pass successive correction method that can retrieve resolvable information from conventional observations.

Cloud analysis is the key procedure for hot-starting mesoscale models: its products supply the initial fields with diabatic information, such as cloud liquid water, cloud ice, and vertical motions in cloud-covered areas. The spinup problem can thus be mitigated, more accurate precipitation forecasts can be obtained during the early stage of model integration, and the ability of short-range precipitation forecasts can be greatly improved. The moisture analysis assimilates satellite data via a variational scheme.

After the atmospheric analysis procedures, the dynamical balance module (Jian and McGinley 2005) is a crucial component for initializing a mesoscale model diabatically, since it ensures that the momentum and mass fields are consistent with the cloud-derived vertical motions. This module is based on a variational formulation, and the adjustment is done by minimizing a functional containing two constraints. The first is the mass continuity equation (a strong constraint), which enforces mass continuity everywhere in the domain; the horizontal winds are thus adjusted to balance the cloud-derived vertical motions, reducing the model shock in the first few time steps. The second (weak) constraint reduces the Eulerian time tendencies of the horizontal motion components (u and υ), which couple the mass and momentum fields. The net effect is an instant spinup of precipitation. In the final step of the LAPS forecast system, the improved initial fields from the LAPS data assimilation system are used to conduct numerical forecasts and produce nowcasting products.

A TC bogussing scheme was used in this study for short-range PQPFs of TCs. Since the TC structure contained in the background is relatively broad and weak, the LAPS-analyzed vortex is also too weak, as there are not enough observations over the sea areas to support the analysis of the typhoon structure. For this reason, a vortex at the location confirmed by observations (satellite, radar, etc.) is inserted into the background field before data ingestion in LAPS, using the NCAR–Air Force Weather Agency (AFWA) typhoon bogussing scheme (Davis and Low-Nam 2001). As a result, the track errors of typhoons are usually small for 0–6-h short-range forecasts.

The sea surface temperature data are provided by NCEP. The initial atmospheric fields come from the LAPS analyses. In LAPS, the background fields for the analysis and the lateral boundary conditions come from the same sources: the model forecasts of 1) the nonhydrostatic forecast system (NFS) at CWB, with 15-km horizontal resolution, and 2) the Global Forecast System (GFS) at NCEP, with 0.5° horizontal resolution. Two mesoscale numerical models are associated with LAPS: the MM5 model and the Weather Research and Forecasting (WRF) model with the Advanced Research WRF (WRF-ARW) dynamic core. Therefore, four different forecast configurations (Fig. 3) are used by LAPS: 1) LAPS-MM5: NFS (the LAPS-MM5 model with the background field from the CWB NFS, and similarly for the other notations), 2) LAPS-MM5: GFS, 3) LAPS-WRF-ARW: NFS, and 4) LAPS-WRF-ARW: GFS.

Fig. 3. Schematic diagram of the LAPS ensemble system.


Both the MM5 and WRF-ARW models have been widely implemented at international operational and research centers. They are nonhydrostatic mesoscale models that use a terrain-following vertical coordinate and possess flexible, multiple nesting capability. The LAPS domain has 141 × 151 grid points (left column of Fig. 5) with 9-km horizontal resolution and 30 sigma levels in the vertical, with the model top at 100 hPa. For microphysical parameterization, the Schultz scheme (Schultz 1995) is used in the MM5 model and the WRF single-moment 5-class microphysics scheme (WSM5) in the WRF-ARW model. For planetary boundary layer parameterization, the MRF scheme is used in the MM5 model and the Yonsei University (YSU) scheme in the WRF-ARW model. Cumulus parameterization schemes are deactivated, which is acceptable for orographically forced precipitation in the typhoon cases at 9-km horizontal resolution.

b. Observation data for forecast verification

Regarding the observation data needed for precipitation verification: since the conventional automatic rainfall stations are located over the land areas of Taiwan, and the land area is only a small part of the LAPS domain, verification against them alone cannot represent the performance of the LAPS precipitation forecasts as a whole. Therefore, the radar-estimated rainfall data from the Quantitative Precipitation Estimation and Segregation Using Multiple Sensors system (QPESUMS; Gourley et al. 2001), developed by CWB and the Water Resources Agency (WRA) in Taiwan in cooperation with the National Severe Storms Laboratory (NSSL) in the United States, were used as the observation data (i.e., ground truth). QPESUMS uses the radar data (Fig. 2), covering the island of Taiwan and its nearby sea areas with 1.25-km horizontal resolution. Note that the precipitation estimation over land in Taiwan is calibrated with rain gauge observations, but it is not calibrated over the sea areas.

3. LAPS ensemble configuration and PQPF products

a. Time-lagged multimodel ensemble configuration

Expanding from the single model LAPS-MM5: NFS, we developed the LAPS EPS in order to capture more uncertainties. However, limited by computing resources and real-time operation, trade-offs must be made among critical uncertainty factors such as microphysical parameterizations, background fields, and mesoscale models. In the sensitivity experiments, we chose two typhoon and four mei-yu front cases and performed simulations using five microphysical parameterizations from WRF-ARW (the Lin et al. scheme, the WSM 3-class simple ice scheme, the WSM 5-class scheme, the Ferrier microphysics scheme, and the WSM 6-class graupel scheme), two different background fields (GFS from NCEP and NFS from CWB), and two mesoscale models (MM5 and WRF-ARW). For the 0–6- or 0–12-h QPFs of LAPS, there are only marginal differences among the microphysical schemes, while different background fields or mesoscale models cause significant differences and are therefore more important uncertainty factors than the microphysical parameterizations. As a result, four members with different backgrounds or mesoscale models were chosen as the basis of the LAPS EPS (see section 2; Fig. 3).

In addition, time-lagged configurations were adopted to increase the number of ensemble members by using previous forecasts without additional computational cost. The advantage of time-lagged multimodel ensemble forecasts over simple multimodel ensemble forecasts is discussed in section 5e. The LAPS time-lagged multimodel ensemble (Fig. 4) uses multimodel forecasts initialized at different times to construct ensemble members for the same verification period. The LAPS EPS has four models, each initialized every 3 h with a forecast length of 12 h. Thus, for the 0–6-h ensemble precipitation forecasts, three time-lagged members (the 0–6-, 3–9-, and 6–12-h QPFs) are available for each model, and the four models together build a 12-member EPS.

Fig. 4. Schematic diagram of the LAPS time-lagged multimodel ensemble.

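The lagged-member bookkeeping described above can be sketched in a few lines (an illustrative Python sketch, not the operational code; the helper name and time convention are ours): for the 6-h verification window ending at hour T, the runs initialized at T−6, T−9, and T−12 contribute their 0–6-, 3–9-, and 6–12-h QPFs, respectively.

```python
def lagged_members(valid_end, init_interval=3, max_lead=12, window=6):
    """(init_time, lead_start, lead_end) tuples, in hours, for every run
    whose 12-h forecast covers the 6-h window ending at valid_end.
    With 3-hourly cycling this yields three lagged members per model;
    four models then give a 12-member ensemble."""
    members = []
    init = valid_end - window
    while valid_end - init <= max_lead:
        members.append((init, valid_end - window - init, valid_end - init))
        init -= init_interval
    return members

# For the window ending at hour 12: the 0-6-, 3-9-, and 6-12-h QPFs
# from the runs initialized at hours 6, 3, and 0, respectively.
```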

In brief, a time-lagged multimodel ensemble system was designed using different background fields, mesoscale models, and initialization times, so as to capture more important uncertainties in the LAPS EPS.

b. PQPF products

The advantage of ensemble PQPFs lies in the fact that the probabilities are determined by the actual data distribution of the ensemble members and display the possibility of precipitation exceeding a certain threshold. Therefore, PQPFs play an important part in the development of an EPS. In other words, the ultimate goal of ensemble forecasting is to provide possibility and probability information to end users instead of a single best forecast. The limitation of PQPFs is that not all uncertainties can be taken into account because of the limited number of ensemble members.

The PQPFs were created from the precipitation forecasts of the 12 members of the LAPS time-lagged multimodel ensemble system at different thresholds. For example, at the threshold of 10 mm (6 h)−1, if 9 of the 12 members predict 6-h accumulated precipitation over 10 mm, then the precipitation probability is 75%. Figure 5 shows the PQPF products and corresponding observed probabilities for Typhoon Fanapi, the most powerful typhoon to hit Taiwan in 2010, which caused a flash flood over areas of southern Taiwan on 19 September 2010. If the estimated rainfall from QPESUMS is less than the selected threshold, the observed precipitation probability is zero; otherwise, it is one. At the threshold of 100 mm (6 h)−1, the LAPS PQPFs show precipitation probabilities in southern Taiwan above 90%, and 9 of the 12 members still predict 6-h accumulated precipitation over 200 mm. These large probabilities imply a high possibility of heavy precipitation.
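The member-counting rule above amounts to taking, at each grid point, the fraction of members at or above the threshold (a minimal Python sketch; array shapes and names are hypothetical):

```python
import numpy as np

def pqpf(member_precip, threshold):
    """Precipitation probability at each grid point: the fraction of
    ensemble members whose 6-h accumulation reaches the threshold.
    member_precip: (n_members, ny, nx) in mm per 6 h."""
    return (member_precip >= threshold).mean(axis=0)

# 12 members at one grid point: 9 exceed 10 mm -> probability 0.75
members = np.array([[[12.0]]] * 9 + [[[3.0]]] * 3)   # shape (12, 1, 1)
prob = pqpf(members, 10.0)
```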

Fig. 5. Distribution of (left) LAPS 0–6-h PQPFs and (right) QPESUMS precipitation (used as truth) probabilities at thresholds (a) 50, (b) 100, and (c) 200 mm (6 h)−1 ending at 1200 UTC 19 Sep 2010. (right) The orange shaded area denotes pixels where QPESUMS precipitation estimates exceed the indicated threshold, and the pink shaded area indicates QPESUMS radar coverage.


4. Calibration methodology

In short, calibration is bias correction. In this study, we adopted the LR method (Yuan et al. 2008) to calibrate the PQPFs. Before calibration, a series of thresholds [0.25, 0.5, 1.0, 1.5, 2.5, 3.5, 5.0, 7.5, 10.0, 12.5, 15.0, 20.0, 30.0, 40.0, 50.0, and 60.0 mm (6 h)−1] was selected based on the distribution of 6-h accumulated precipitation from all of the typhoon cases in 2008. Note that calibration was conducted separately for each selected threshold, including the training and validation processes. During the training process, each record of the training samples consisted of one observed precipitation probability P(x, t) and the closest seven ensemble precipitation probabilities fi(x, t) centered at the selected calibration threshold, which were used to represent the critical part of the probability distribution function (PDF) around the calibration threshold. The LR relationship was obtained by minimizing the errors between the forecast precipitation probabilities and the observed ones. During the validation process, another set of data (i.e., the validation samples) was applied to the derived LR relationship to obtain the calibrated precipitation probabilities for later verification.
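Selecting the closest seven probabilities can be sketched as picking a 7-wide index window centered on the calibration threshold within the 16-threshold list, shifted inward at the ends of the list (our interpretation of the windowing; the boundary handling is an assumption):

```python
THRESHOLDS = [0.25, 0.5, 1.0, 1.5, 2.5, 3.5, 5.0, 7.5, 10.0,
              12.5, 15.0, 20.0, 30.0, 40.0, 50.0, 60.0]   # mm per 6 h

def ordinal_window(k, width=7, n=len(THRESHOLDS)):
    """Indices of the `width` thresholds centered on calibration
    threshold index k, shifted inward near the ends of the list so
    that exactly `width` ordinal probabilities are always available."""
    lo = max(0, min(k - width // 2, n - width))
    return list(range(lo, lo + width))
```

For the 10 mm (6 h)−1 threshold (index 8), the window spans indices 5–11; at either end of the list, the window slides inward rather than shrinking.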

Because of the limited number of statistical samples (eight typhoon cases in 2008 and 2009; Table 1), the calibration was performed via a cross-validation procedure to increase the number of validation samples and consolidate the representativeness of the statistical verification results. For example, in the experiment SMP-T (LR) (Table 2), the five typhoon cases in 2008 were used as the training samples to calibrate the three typhoon cases in 2009, and in turn the 2009 cases were used as the training samples to calibrate the 2008 cases.

Table 1. Typhoon cases in 2008 and 2009.

Table 2. Summary of the differences in statistical samples among the sensitivity experiments.

After the calibration for all selected thresholds, a checking step was needed to ensure that the distribution of probabilities at each grid point was monotonic (i.e., the probabilities at lower thresholds were not smaller than those at higher thresholds, because a precipitation event reaching a higher threshold implies an event reaching the lower thresholds). Precipitation probabilities with a correct monotonic distribution were the final calibrated precipitation probabilities.
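Assuming the calibrated probabilities are stored with thresholds in ascending order, the monotonicity check reduces to a running minimum along the threshold axis (a sketch):

```python
import numpy as np

def enforce_monotonic(probs):
    """Make calibrated probabilities non-increasing with threshold.

    probs: array of shape (n_thresholds, ny, nx), thresholds ascending.
    A probability at a higher threshold is capped by the probability
    at every lower threshold."""
    return np.minimum.accumulate(probs, axis=0)

p = np.array([[[0.6]], [[0.8]], [[0.3]]])   # violates monotonicity
q = enforce_monotonic(p)                    # capped to 0.6, 0.6, 0.3
```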

The LR method is expressed as the LR equation
\[ P(x,t) = a + \sum_{i=1}^{M} b_i\, f_i(x,t), \qquad (1) \]
where M = 7; fi(x, t), i = 1, 2, … , 7, are the seven ordinal input probabilities (i.e., the seven ordinal ensemble precipitation probabilities centered at the calibration threshold described earlier); P(x, t) is the corresponding observed precipitation probability; and a is a constant interpreted as the error residual. The coefficients a and bi are estimated by minimizing the squared errors between the observed probabilities and those derived from the LR equation over the N training samples, using the least squares method:
\[ \min_{a,\,b_1,\ldots,b_M} \sum_{n=1}^{N} \Big[ P_n(x,t) - a - \sum_{i=1}^{M} b_i\, f_{i,n}(x,t) \Big]^2. \qquad (2) \]

After the regression coefficients a and bi are obtained, applying the ensemble precipitation probabilities fi(x, t) from the validation samples to the LR in Eq. (1) yields new forecast precipitation probabilities P(x, t) (i.e., the calibrated probabilities). Negative values (or values greater than 1) are reset to 0 (or 1).
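Eqs. (1) and (2) correspond to an ordinary least squares fit with an intercept; a minimal sketch with synthetic data (variable names are ours) is:

```python
import numpy as np

def fit_lr(F_train, p_train):
    """Least squares solution of Eq. (2): intercept a plus weights b_i.
    F_train: (N, 7) ordinal ensemble probabilities; p_train: (N,)
    observed probabilities (0 or 1).  Returns [a, b_1, ..., b_7]."""
    X = np.column_stack([np.ones(len(F_train)), F_train])
    coef, *_ = np.linalg.lstsq(X, p_train, rcond=None)
    return coef

def apply_lr(coef, F_valid):
    """Eq. (1) applied to validation predictors, clipped to [0, 1]."""
    X = np.column_stack([np.ones(len(F_valid)), F_valid])
    return np.clip(X @ coef, 0.0, 1.0)

rng = np.random.default_rng(0)
F = rng.random((200, 7))                   # synthetic predictors
p = (F.mean(axis=1) > 0.5).astype(float)   # synthetic observed events
coef = fit_lr(F, p)
cal = apply_lr(coef, F)                    # calibrated probabilities
```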

5. Verification and results

Several verification methods were used to evaluate the forecast bias, the discriminating ability, the skill of the PQPFs, the spread–skill relationship, and the advantage of a time-lagged multimodel EPS over simple multimodel EPSs. For a detailed description of the experiments in this study, refer to Tables 2 and 3.

Table 3. Summary of models and ensemble prediction systems.

a. Forecast bias

The rank histogram (RH; Hamill 2001), also known as a Talagrand diagram, can be used to evaluate whether the ensemble spread of the forecast adequately represents the true variability of the observations. The reference experiment SMP-T (Fig. 6a), based on the eight typhoon cases in 2008 and 2009, shows an “L”-shaped RH distribution, which indicates that the LAPS EPS has significant wet biases.

Fig. 6. Rank histograms for LAPS 0–6-h QPFs from the experiments (a) SMP-T and (b) SMP-L. The horizontal dashed line denotes the frequency for a uniform rank distribution.

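A rank histogram can be computed by counting, at each point, how many ensemble members fall below the observation (a sketch; ties are handled naively):

```python
import numpy as np

def rank_histogram(members, obs):
    """members: (n_members, n_points); obs: (n_points,).
    Counts of the observation's rank among the sorted member values
    (0 = below every member).  An 'L' shape, with counts piled in the
    lowest rank, indicates a wet-biased ensemble."""
    ranks = (members < obs).sum(axis=0)
    return np.bincount(ranks, minlength=members.shape[0] + 1)

m = np.array([[1.0, 5.0], [2.0, 6.0], [3.0, 7.0]])   # 3 members, 2 points
o = np.array([0.5, 6.5])
h = rank_histogram(m, o)   # ranks 0 and 2 -> counts [1, 0, 1, 0]
```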

The reliability diagram (Hsu and Murphy 1986; Hamill 1997) can be used to determine how well the forecast probabilities of an event correspond to their observed frequencies. Reliability is reflected by the proximity of the reliability curve (Fig. 7) to the diagonal line, which depicts a perfect forecast: the closer the reliability curve to the diagonal, the smaller the probabilistic forecast bias and the higher the reliability. Except for a slight dry bias at lower forecast probabilities for smaller thresholds [below 5 mm (6 h)−1] in the experiment SMP-T (before calibration; Fig. 7), all reliability curves indicate wet biases (i.e., the reliability curve lies below the diagonal line), and the higher the threshold, the more significant the wet bias. The wet bias is clearly corrected after the LR calibration [SMP-T(LR)].

Fig. 7. (left) Reliability diagrams for LAPS 0–6-h PQPFs at thresholds (a) 1, (b) 5, and (c) 20 mm (6 h)−1. Reliability curves from the experiments SMP-T (before LR calibration, dashed line with solid dots) and SMP-T(LR) (after LR calibration, solid line with hollow circles) are shown. The horizontal dashed line indicates the sample climatology frequency. (right) Histograms indicate the corresponding sample ratio (%) of each forecast probability subrange for the experiments SMP-T (before LR, gray) and SMP-T(LR) (after LR, blank).


The corresponding histogram (Fig. 7) shows the sample ratio in each forecast probability bin. For the higher thresholds [>20 mm (6 h)−1], the sample ratios in the high forecast probability bins were small, indicating that the 12 ensemble members agreed less well when predicting heavy rainfall; the wet biases were corrected by adjusting the higher (less reliable) probabilities downward (the highest calibrated probability was 83.37%). In general, the dry and wet biases were corrected by adjusting the lowest and highest probabilities toward the midrange through the calibration procedure.
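The reliability curve and the accompanying sample-ratio histogram come from binning the forecast probabilities and comparing each bin's mean forecast with its observed event frequency (a sketch; 10 equal-width bins assumed):

```python
import numpy as np

def reliability(fprob, obs_event, n_bins=10):
    """fprob: forecast probabilities in [0, 1]; obs_event: 0/1 outcomes.
    Returns (mean forecast, observed frequency, sample count) per bin.
    Bins where the observed frequency falls below the mean forecast lie
    under the diagonal of the reliability diagram (wet bias)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(fprob, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    mean_f = np.full(n_bins, np.nan)
    obs_freq = np.full(n_bins, np.nan)
    for b in range(n_bins):
        if counts[b] > 0:
            mean_f[b] = fprob[idx == b].mean()
            obs_freq[b] = obs_event[idx == b].mean()
    return mean_f, obs_freq, counts

fp = np.array([0.05, 0.05, 0.95, 0.95])
ev = np.array([0, 1, 1, 1])
mf, of, cnt = reliability(fp, ev)   # bins 0 and 9 each hold two samples
```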

b. Discriminating ability

The relative operating characteristic (ROC; Mason and Graham 1999; Jolliffe and Stephenson 2003; Hamill and Juras 2006; Wilks 2006) curve plots the hit rates versus the false alarm rates using a set of increasing probabilities as warning thresholds (i.e., precipitation event is regarded as occurring when the forecast probability exceeds this warning threshold). The area under the ROC curve (i.e., ROC area) measures the ability of the forecast to discriminate between events and nonevents and it ranges from 0 to 1 (perfect score). A forecast with skillful discriminating ability has the ROC area greater than 0.5. The ROC curves (Fig. 8a) and the ROC areas (>0.825; Fig. 8b) from the experiment SMP-T indicate good discriminating ability, which slightly decrease with increasing threshold. Unlike reliability diagrams, the ROC is conditioned on the observations and is not sensitive to the forecast bias. Therefore, a biased forecast may still have good discriminating ability and possibly be improved through calibration, such as the LAPS PQPFs for typhoons (Fig. 8).

Fig. 8.

(a) ROC curves from the experiment SMP-T and (b) the area under the ROC from the experiments SMP-T (line with circles) and SMP-L (line with squares) for LAPS 0–6-h PQPFs at different thresholds [1, 5, 10, 15, 20, and 30 mm (6 h)−1].


c. Forecast skill

The Brier skill score (BSS; Wilks 2006) measures the relative improvement of the probabilistic forecast over a reference forecast at selected thresholds. Sample climatology was used as the reference forecast in this study. The BSS ranges from minus infinity to 1 (perfect); a positive BSS indicates a skillful forecast. The PQPFs (Fig. 9) from the experiment SMP-T (before calibration) are skillful compared to climatology at thresholds below 30 mm (6 h)−1. The BSS at each threshold increases significantly after calibration [i.e., SMP-T vs SMP-T(LR) or SMP-T8S(LR)], especially at higher thresholds.
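With sample climatology as the reference, the BSS reduces to a few lines. This is a minimal sketch; the function name `brier_skill_score` is ours, and the climatological frequency is estimated from the verifying sample itself, matching the paper's choice of reference.

```python
import numpy as np

def brier_skill_score(prob_fcst, event_obs):
    """BSS relative to the sample climatology (1 is perfect; positive
    values indicate skill over climatology)."""
    p = np.asarray(prob_fcst, dtype=float)
    o = np.asarray(event_obs, dtype=float)
    bs = np.mean((p - o) ** 2)                 # Brier score of the forecast
    bs_clim = np.mean((o.mean() - o) ** 2)     # Brier score of climatology
    return float(1.0 - bs / bs_clim)
```

A forecast that always issues the climatological frequency scores exactly 0, and any BSS below 0 means the PQPF is worse than simply forecasting climatology.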

Fig. 9.

Brier skill scores at different thresholds for LAPS 0–6-h PQPFs from the experiments SMP-T (dashed line with solid dots), SMP-T (LR) (solid line with hollow circles), and SMP-T8S (LR) (solid line with hollow squares).


Similar to the BSS, the ranked probability skill score (RPSS; Wilks 2006) measures the relative improvement of a multicategory probabilistic forecast over climatology. The RPSS ranges from minus infinity to 1 (perfect), and a positive RPSS indicates a skillful forecast. The spatial distribution of the RPSS from the experiment SMP-T (Fig. 10a) shows that the PQPFs are more skillful over the northeastern part of the island of Taiwan, the high mountain areas, and the northeastern sea areas, but less skillful over the eastern sea areas along the coastline and the western sea areas. The black sector with a 15° angle to the northwest of Taiwan (Figs. 10a,b) results from a lack of observation data: the radar beams from the Wu–Fen–Shan [Weather Surveillance Radar-1988 Doppler (WSR-88D)] radar site are blocked by the Chi–Shin Mountains. The black linear area (at an angle of about 30° to the tangent of the eastern coastline of Taiwan), which also lacks observation data, arises from the blocking of radar beams from the Hua–Lian radar site by buildings.
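For a multicategory forecast such as the five categories defined by four thresholds, the RPSS at one grid point can be sketched as follows. This is a hedged illustration: the function name `rpss`, the cumulative-probability input layout, and the use of the verifying sample to estimate the climatological probabilities are all assumptions, not the paper's exact procedure.

```python
import numpy as np

def rpss(cum_prob_fcst, obs_category, n_cat):
    """RPSS against sample climatology. cum_prob_fcst has shape
    (n_samples, n_cat): cumulative forecast probabilities per category;
    obs_category holds 0-based observed category indices."""
    F = np.asarray(cum_prob_fcst, dtype=float)
    obs = np.asarray(obs_category)
    # cumulative observation indicator: 1 once the observed category is reached
    O = (obs[:, None] <= np.arange(n_cat)[None, :]).astype(float)
    rps = np.mean(np.sum((F - O) ** 2, axis=1))
    clim = O.mean(axis=0)                      # climatological cumulative probs
    rps_clim = np.mean(np.sum((clim - O) ** 2, axis=1))
    return float(1.0 - rps / rps_clim)
```

Working with cumulative probabilities is what makes the score "ranked": placing probability in a category adjacent to the observed one is penalized less than placing it far away.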

Fig. 10.

The spatial distribution of the ranked probability skill score for LAPS 0–6-h PQPFs from the experiments (a) SMP-T (before LR calibration) and (b) SMP-T (LR) (after LR calibration) using four thresholds [1, 5, 10, and 20 mm (6 h)−1] to define five categories.


The Brier score (BS; Wilks 2006) measures the magnitude of the probabilistic forecast errors and can be partitioned into three terms: reliability, resolution, and uncertainty (Murphy 1973):

$$\mathrm{BS} = \underbrace{\frac{1}{N}\sum_{i=1}^{K-1} n_i \left(P_i - \overline{O}_i\right)^2}_{\text{reliability}} - \underbrace{\frac{1}{N}\sum_{i=1}^{K-1} n_i \left(\overline{O}_i - O_{\mathrm{avg}}\right)^2}_{\text{resolution}} + \underbrace{O_{\mathrm{avg}}\left(1 - O_{\mathrm{avg}}\right)}_{\text{uncertainty}} \tag{3}$$

where N is the number of verifying samples; K − 1 is the number of forecast probability subranges (K equals 13 in this study); n_i, P_i, and Ō_i are the number of verifying subsamples, the median value of the forecast probabilities, and the conditional observed frequency in forecast probability subrange i, respectively; and O_avg is the sample climatology frequency. The reliability term, which has a negative orientation (smaller is better), stands for the conditional forecast bias, and the resolution term (positive orientation) represents the ability of the forecast to discriminate the occurrence and nonoccurrence of events from climatology. The uncertainty term represents the variability of the observations and is not altered by the calibration process. With sample climatology as the reference forecast, the BSS can be expressed as

$$\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{O_{\mathrm{avg}}\left(1 - O_{\mathrm{avg}}\right)} = \frac{\text{resolution} - \text{reliability}}{\text{uncertainty}} \tag{4}$$
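The Murphy (1973) partition can be checked numerically with a minimal sketch. Here each bin holds a single discrete forecast probability value (e.g., ensemble relative frequencies), so the identity BS = reliability − resolution + uncertainty holds exactly; the function name is hypothetical.

```python
import numpy as np

def brier_decomposition(prob_fcst, event_obs):
    """Murphy (1973) partition of the Brier score into
    (reliability, resolution, uncertainty); exact when every bin
    contains a single discrete forecast probability value."""
    p = np.asarray(prob_fcst, dtype=float)
    o = np.asarray(event_obs, dtype=float)
    n, o_avg = p.size, o.mean()
    rel = res = 0.0
    for value in np.unique(p):                 # one bin per discrete probability
        mask = p == value
        n_i, o_i = mask.sum(), o[mask].mean()  # subsample size, observed freq.
        rel += n_i * (value - o_i) ** 2
        res += n_i * (o_i - o_avg) ** 2
    return rel / n, res / n, o_avg * (1.0 - o_avg)
```

Because the uncertainty term depends only on the observations, calibration can improve the BS only by lowering reliability or raising resolution, which is exactly the trade-off examined in Fig. 11.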

Figure 11 shows that the increases in the BSS (Fig. 9) after calibration [SMP-T(LR) or SMP-T8S(LR)] at various thresholds are achieved by decreasing the reliability term, while the resolution term remains almost unchanged [decreasing only slightly in the experiment SMP-T8S(LR) at the 30 mm (6 h)−1 threshold]. This calibration result is encouraging, because a reduction of the reliability term can sometimes come at the cost of a decrease in the resolution term.

Fig. 11.

Decomposition of the Brier score for LAPS 0–6-h PQPFs from the experiments SMP-T (dashed line with circle symbols), SMP-T(LR) (solid line with square symbols), and SMP-T8S(LR) (solid line with triangle symbols). Solid symbols: reliability terms. Hollow symbols: resolution terms.


d. Spread–skill relationship in the LAPS EPS

In general, an ideal EPS is expected to have an ensemble spread (SPRD) equal to its forecast error at the same lead time, so that it represents the full forecast uncertainty (Kalnay and Dalcher 1987; Whitaker and Loughe 1998; Zhu 2005; Buizza et al. 2005). The SPRD and the root-mean-square error (RMSE) of the ensemble mean forecasts (Fig. 12) are highly correlated in the LAPS EPS. Except for TY 3, the correlations are all higher than 0.92 with a good regression fit (high coefficient of determination). In addition, the LAPS EPS is slightly overdispersive for each typhoon case [i.e., the SPRD is slightly larger than the RMSE (almost all data points lie below the diagonal)]. Similarly, the scatterplot for all typhoon cases (Fig. 13) indicates a high correlation between SPRD and RMSE and slight overdispersion, which becomes more obvious with increasing SPRD.
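The paired values plotted in such scatter diagrams can be computed as in the sketch below. Pooling the pointwise ensemble variances into one domain-averaged SPRD is an assumption about the averaging, and the function name is ours.

```python
import numpy as np

def spread_and_rmse(ens_fcst, obs):
    """One (SPRD, RMSE) pair per forecast, averaged over the domain:
    SPRD pools the pointwise ensemble variances, RMSE verifies the
    ensemble mean. ens_fcst has shape (n_members, n_points)."""
    f = np.asarray(ens_fcst, dtype=float)
    sprd = float(np.sqrt(np.mean(f.var(axis=0))))
    err = f.mean(axis=0) - np.asarray(obs, dtype=float)
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return sprd, rmse
```

Points below the one-to-one diagonal (SPRD > RMSE) indicate overdispersion, points above it underdispersion.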

Fig. 12.

Scatterplots of the RMSE against the ensemble spread (SPRD) for eight typhoon cases (TY 1–TY 8) from the experiment SMP-T. Each point in the scatterplot comes from one 0–6-h QPF (i.e., RMSE and SPRD are averaged over the QPESUMS domain). The linear regression line, correlation coefficient (C), and the coefficient of determination (R2) are shown.


Fig. 13.

As in Fig. 12, but for all typhoon cases (TY 1–TY 8) in one plot.


Table 4 is a contingency table of SPRD and RMSE for 0–6-h ensemble QPFs in the experiment SMP-T. The entries in the table are the joint probabilities of obtaining the SPRD and RMSE values in the indicated quartiles. If there were no relationship between SPRD and RMSE (i.e., zero correlation), all entries in the table would be 0.0625 (=1/16). If there were a perfect linear relationship (i.e., unit correlation), all the diagonal entries would be 0.25 and the off-diagonal ones would be zero. The diagonal entries in the table are much higher than 0.0625, reaching as high as 0.2225. In addition, the contingency table approximates a tridiagonal matrix, which indicates that a good spread–skill relationship exists in the LAPS EPS and that the SPRD can be used as a predictor of skill, especially when it is extreme (very large or very small).
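The joint-quartile table can be built as follows. This is a minimal sketch in which the quartile edges come from `np.quantile` and ties at an edge fall into the lower class, both assumptions rather than the paper's exact procedure.

```python
import numpy as np

def quartile_contingency(sprd, rmse):
    """4x4 joint-probability table of SPRD and RMSE quartile classes;
    independence gives 1/16 per cell, a perfect relationship 1/4 on
    the diagonal."""
    def quartile_class(x):
        x = np.asarray(x, dtype=float)
        edges = np.quantile(x, [0.25, 0.5, 0.75])
        return np.searchsorted(edges, x, side="left")   # classes 0..3
    qs, qr = quartile_class(sprd), quartile_class(rmse)
    table = np.zeros((4, 4))
    for i, j in zip(qs, qr):
        table[i, j] += 1.0
    return table / len(qs)
```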

Table 4.

Contingency table of ensemble spread (SPRD) and RMSE for LAPS 0–6-h ensemble QPFs in experiment SMP-T using the configuration of EPS-mmtl. The entries in the table are the joint probability of obtaining the SPRD and RMSE values in the indicated quartiles.


The scatterplots of EPS-mmtl (Fig. 14a; see Tables 2 and 3 for the different samples and ensemble configurations) show that the correlation coefficient in the experiment SMP-T is higher than that in SMP-L. In the experiment SMP-T, TY 4 and TY 7 are the more predictable cases, followed by TY 1 and TY 6; TY 3, TY 5, TY 2, and TY 8 are the less predictable ones. The BSS of the eight typhoon cases for the experiment SMP-T (Fig. 15) shows that TY 4 and TY 7 (the more predictable cases) have the highest BSS, followed by TY 1 and TY 6, and then TY 3, TY 5, TY 2, and TY 8 (the less predictable cases). Again, this indicates that small-spread ensemble forecasts have higher skill than large-spread forecasts in the LAPS EPS.

Fig. 14.

As in Fig. 13, but for the four different ensemble configurations (a) EPS-mmtl, (b) EPS-m06h, (c) EPS-m09h, and (d) EPS-m12h, respectively: (left) results of the experiment SMP-T and (right) the experiment SMP-L. Each point in the scatterplot comes from one typhoon case. The ratio of SPRD to RMSE is also indicated on the plot of EPS-mmtl.


Fig. 15.

Brier skill scores of eight typhoon cases (TY 1–TY 8) for experiment SMP-T at different thresholds [1, 5, 10, and 20 mm (6 h)−1].


e. Advantage of a time-lagged configuration

One reason to apply the time-lagged ensemble technique to short-range forecasting is that a short-range forecast generally depends strongly on the initial conditions. The forecast errors in the very short range may be strongly correlated with the uncertainties in the initial analysis (Lu et al. 2007). The time-lagged ensembles can be interpreted as forecasts obtained from a set of perturbed initial conditions (Van den Dool and Rukhovets 1994). In this study, time-lagged multimodel ensemble forecasts are compared with simple multimodel ones to evaluate this advantage.
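The pooling of lagged runs into one ensemble can be sketched as below. The run metadata keys (`init`, `lead`, `qpf`) and the function name are hypothetical, and taking the fraction of pooled members exceeding the threshold (democratic voting) is one common choice for forming the PQPF, assumed here for illustration.

```python
import numpy as np

def time_lagged_pqpf(runs, valid_time, threshold):
    """Pool members from runs initialized at different times but valid
    at the same time, then take the fraction exceeding the threshold
    as the forecast probability (%)."""
    members = [r["qpf"] for r in runs if r["init"] + r["lead"] == valid_time]
    stack = np.stack([np.asarray(m, dtype=float) for m in members])
    return (stack > threshold).mean(axis=0) * 100.0
```

The lagged runs cost nothing extra to produce, since they are simply earlier forecasts that would otherwise be discarded.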

Figure 16 indicates that EPS-mmtl has the highest BSS values among the four different configurations (EPS-mmtl, EPS-m06h, EPS-m09h, and EPS-m12h in Table 3) at all thresholds in the experiment SMP-L (Fig. 16b) and at the thresholds below 15 mm (6 h)−1 in the experiment SMP-T (Fig. 16a). Figure 17 shows the relative RMSE (RRMSE) and spatial correlation coefficient of 6-h QPFs of 12 members and 4 ensembles, and EPS-mmtl has the lowest RRMSE and the highest correlation coefficient in both SMP-T and SMP-L.

Fig. 16.

Brier skill scores at different thresholds for four different ensemble configurations (EPS-mmtl, EPS-m06h, EPS-m09h, and EPS-m12h) for the experiments (a) SMP-T and (b) SMP-L.


Fig. 17.

(a) The RRMSE and (b) the correlation coefficient of 6-h accumulated precipitation forecasts of 12 members and four ensemble configurations (EPS-mmtl, EPS-m06h, EPS-m09h, and EPS-m12h) in Table 3 from the experiments SMP-T (line with circles) and SMP-L (line with squares).


Regarding the spread–skill relationship (Fig. 14), all three EPSs without a time-lagged configuration (EPS-m06h, EPS-m09h, and EPS-m12h) are underdispersive (i.e., the SPRD is smaller than the RMSE) in both SMP-T and SMP-L. The underdispersion becomes more significant with increasing SPRD, and is more severe for the shorter-range ensembles (EPS-m06h) and the land samples (SMP-L vs SMP-T). When the time-lagged configuration (EPS-mmtl) is adopted, slight overdispersion appears in the experiment SMP-T, and the underdispersion is mitigated in the experiment SMP-L but still exists for larger SPRD (>6 mm, shown by the linear regression line), consistent with the smaller spread in the rank histogram (Fig. 6b). A time-lagged configuration can thus mitigate the underdispersion and better represent the forecast uncertainty (SPRD comparable to the RMSE); it matters most for the less predictable cases (i.e., larger SPRD) and for forecasts over land.

6. Sensitivity experiments

In this section, two sets of sensitivity experiments (Table 2) were carried out: 1) experiments on different training samples, to understand their influence on the calibration results; and 2) experiments on the accuracy of the observation data, to understand how the inconsistency in observation accuracy between the sea and land areas influences the calibration results. In fact, the QPESUMS QPEs, used as the ground truth, have been calibrated against the observed precipitation from rain gauges in the land areas, whereas no calibration is applied in the sea areas. In a few cases, there was a clear discontinuity in the QPESUMS QPEs along the coastline of Taiwan.

a. Experiments on different training samples

The calibration experiments SMP-T(LR) and SMP-T8S(LR) differ only in the training samples adopted during the LR calibration. In the experiment SMP-T(LR), all statistical samples were divided into two groups for the cross-validation procedure, the typhoon cases in 2008 (TY 1–TY 5) and those in 2009 (TY 6–TY 8), and each group in turn was used as the training samples to calibrate the other (the validation samples). The experiment SMP-T8S(LR) divides all statistical samples into eight groups, one for each of the eight typhoon cases (TY 1–TY 8) in 2008 and 2009; seven groups are used as the training samples to calibrate the remaining one, so that each typhoon case serves as the validation samples in turn. All the calibration results shown in this study combine all validation samples in one set of figures.
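The leave-one-group-out procedure can be sketched as follows. The paper's LR predictors are not spelled out here, so a single-predictor regression from raw probability to event occurrence is assumed purely for illustration, with the function name being ours.

```python
import numpy as np

def loo_calibrate(cases):
    """Leave-one-case-out LR calibration: each case, a pair of arrays
    (raw probabilities, binary observations), is calibrated with a
    linear fit trained on all other cases; calibrated probabilities
    are clipped to [0, 1]."""
    calibrated = []
    for k in range(len(cases)):
        p_tr = np.concatenate([np.asarray(c[0], float)
                               for i, c in enumerate(cases) if i != k])
        o_tr = np.concatenate([np.asarray(c[1], float)
                               for i, c in enumerate(cases) if i != k])
        slope, intercept = np.polyfit(p_tr, o_tr, 1)
        raw = np.asarray(cases[k][0], float)
        calibrated.append(np.clip(slope * raw + intercept, 0.0, 1.0))
    return calibrated
```

The design ensures that no case is ever calibrated with its own data, so the verification of the calibrated PQPFs remains independent.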

At various thresholds, the LR calibration [SMP-T(LR) or SMP-T8S(LR)] increases the BSS over the experiment SMP-T (Fig. 9), showing the improved skill of the PQPFs, and the improvement grows with increasing threshold. Only at the 30 mm (6 h)−1 threshold is the BSS of SMP-T(LR) much higher than that of SMP-T8S(LR). At thresholds lower than 30 mm (6 h)−1, the resolution values (Fig. 11) vary only slightly among the three experiments [SMP-T, SMP-T(LR), and SMP-T8S(LR)], while the reliability values of SMP-T8S(LR) decrease slightly more than those of SMP-T(LR). At the 30 mm (6 h)−1 threshold, however, the reliability value of SMP-T8S(LR) decreases far less than that of SMP-T(LR). In addition, the resolution value of SMP-T8S(LR) decreases after the LR calibration, which means that part of the forecast ability to discriminate extreme precipitation events from climatology is lost during the calibration process.

Figure 18 indicates that the reliability curves of the eight typhoon cases are not very similar at the 30 mm (6 h)−1 threshold; in other words, some inconsistency exists among the forecast biases of these eight cases. Therefore, although the experiment SMP-T8S(LR) used far more training samples than the experiment SMP-T(LR), it does not necessarily produce a better calibration result. Indeed, the calibration principle rests on a consistent (or very similar) distribution of the forecast biases between the training and validation samples.

Fig. 18.

Reliability diagram for LAPS 0–6-h PQPFs at the threshold of 30 mm (6 h)−1. Reliability curves from TY 1 to TY 8 are shown.


This sensitivity experiment shows that the calibration results vary with the training samples, and that adopting more training samples does not necessarily produce better calibration results. It is essential to use data with forecast biases similar to those of the validation samples as the training samples. In the future, similar typhoon types should be grouped so that separate LR relationships can be established for different typhoon classifications. Individual calibration for areas with different precipitation characteristics (Yuan et al. 2007b) is also a direction for future research.

b. Experiments on the accuracy of observation data

The forecast skill depends strongly on the verification/observation data, especially precipitation analyses over mountainous areas (Yuan et al. 2005). In this study, two experiments were designed using statistical samples from different radar coverage: the samples of the experiments SMP-T and SMP-T(LR) are drawn from the full radar coverage (including the sea and land areas) within the QPESUMS domain, while those of the experiments SMP-L and SMP-L(LR) come only from the land areas of Taiwan, accounting for 9.4% of all samples. Thus, the performance of the experiment SMP-T is dominated largely by the ocean samples.

The rank histogram (RH) from the experiment SMP-T (Fig. 6a) shows that the LAPS forecasts for typhoon cases have significant wet biases. The experiment SMP-L (Fig. 6b) shows that the LAPS ensemble spread is too small in the land areas: the observations fall outside the extremes of the ensemble forecasts for about 35% of the samples, ranking lowest for about 23% and highest for about 12%. The RH from the ocean samples alone is very similar to that from the experiment SMP-T, since more than 90% of the samples in the experiment SMP-T come from the sea areas.

Similar to the experiments SMP-T and SMP-T(LR) (Fig. 7), the reliability diagrams (Fig. 19) of SMP-L and SMP-L(LR) show mixed biases at each threshold. Dry (wet) biases come mainly from the samples with lower (higher) forecast probabilities. The higher the threshold, the smaller the sample percentage with higher forecast probabilities. In addition, the experiments SMP-T and SMP-L (Figs. 7 and 19) show that the wet biases become more evident with increasing threshold, and the severe wet biases in SMP-T result mainly from the ocean samples. This tendency to overestimate heavy precipitation may be caused by the deactivation of the cumulus parameterization in LAPS at 9-km horizontal resolution during the warm-rain processes. Generally speaking, both experiments achieve the bias correction by adjusting the extreme forecast probabilities toward the medium ones. Note that the samples with higher forecast probabilities are viewed as unreliable at the higher thresholds [above 10 mm (6 h)−1, not shown] in the experiment SMP-T (Fig. 7), and thus they are removed after calibration [SMP-T(LR)]. Since the experiment SMP-L (Fig. 19) does not have obvious biases before calibration, the LR calibration makes only a minimal bias correction. This indicates that the LAPS EPS has better discrimination capability in the land areas.

Fig. 19.

As in Fig. 7, but from the experiments SMP-L (before LR calibration) and SMP-L (LR) (after LR calibration).


The ROC area (Fig. 8b) indicates that at each threshold the experiment SMP-L has better resolution than the experiment SMP-T (i.e., a better ability to discriminate between precipitation occurrence and nonoccurrence). In addition, the potential usefulness of both experiments SMP-T and SMP-L decreases slightly with increasing threshold.

Figure 17 reveals that almost every ensemble member has a lower RRMSE and a higher spatial correlation coefficient in the experiment SMP-L than in SMP-T (i.e., the QPFs perform better in the land areas). In addition, the performance of the ensemble mean in EPS-mmtl exceeds that of the individual ensemble members.

Compared with the experiments SMP-T and SMP-T(LR), the experiments SMP-L and SMP-L(LR) have higher BSS values (Fig. 20). The BSS values decrease with increasing threshold, and the experiment SMP-T is unskillful compared to climatology at the 30 mm (6 h)−1 threshold. The forecast skill is improved after calibration at all thresholds [SMP-T(LR) vs SMP-T; SMP-L(LR) vs SMP-L]. Except at the 1 mm (6 h)−1 threshold, the BSS values of the experiment SMP-L are even higher than those of the experiment SMP-T(LR), and the gap widens with increasing threshold.

Fig. 20.

Brier skill scores at different thresholds for LAPS 0–6-h PQPFs from the experiments SMP-T (dashed line with solid dots), SMP-T(LR) (solid line with hollow circles), SMP-L (dashed line with solid squares), and SMP-L(LR) (solid line with hollow squares).


The spatial distributions of the RPSS (Fig. 10) from the experiments SMP-T and SMP-T(LR) show that the skill of the PQPFs was improved in most areas (the southwestern, northwestern, eastern, and southeastern sea areas of Taiwan) after calibration. However, the RPSS values in the northeastern oceanic and mountainous areas of Taiwan clearly decrease after calibration. This could be associated with the poor quality of the QPESUMS QPEs used as the observation data in the sea areas. Because of the inconsistency in observation accuracy between the sea and land areas and the far greater number of ocean samples (about nine times the land samples), the RPSS values in some land areas clearly fall after calibration (i.e., the probabilistic forecast skill in most sea areas is improved at the cost of that in a few land areas during the calibration process). Compared with the experiment SMP-L (not shown), the RPSS values in the experiment SMP-L(LR) become slightly higher in most coastal areas, with only a minimal reduction in very small parts of the mountainous areas.

In summary, the LAPS EPS has severe wet biases in the sea areas; therefore, the precipitation forecasts in the experiment SMP-L (land samples) outperform those in the experiment SMP-T (dominated mainly by ocean samples). For typhoon cases, the precipitation in the land areas, driven by orographic lifting, may be easier to predict accurately than the typhoon rainbands in the sea areas. In addition, the relatively inferior forecast performance in the sea areas probably results from the underestimation of the QPESUMS QPEs. Currently, QPESUMS applies the same Z–R relationship to all weather systems and geographic areas, and previous verifications revealed that the QPEs derived from this Z–R relationship tend to underestimate precipitation in the land areas. In the sea areas far from the radars, the likelihood of underestimated QPEs is rather high because of the greater height of the radar beams. These two inferences are difficult to verify because of the lack of precipitation observations in the sea areas. To conclude, separate calibration for the sea and land areas is needed to obtain better calibrated PQPFs.

7. Summary and future work

This study pioneers the development of short-range PQPFs in Taiwan. It aims mainly to provide more valuable 0–12-h forecast products for severe weather systems that seriously affect daily life. Because the data assimilation procedure of LAPS provides the initial fields of mesoscale models with diabatic information, it can effectively mitigate the major problem of short-range QPFs, the spinup problem during the early stage of model integration. Therefore, this study applies LAPS as the basic tool to develop PQPFs from time-lagged multimodel ensembles, in order to capture the critical uncertainties in the prediction process and to convey them to forecasters and end users.

The sensitivity experiments show that, for the 0–6- and 6–12-h QPFs of LAPS, the background field and the mesoscale model are more critical uncertainty factors than the microphysical parameterizations. Therefore, we built a multimodel EPS using different background fields and mesoscale models in order to enlarge the ensemble spread. We also adopted the time-lagged configuration to increase the number of ensemble members, using previous forecasts at no additional computational cost. The time-lagged configuration gives the LAPS EPS better forecast performance and a better spread–skill relationship, especially for the less predictable typhoon cases and over the land areas. The PQPF products provide users with the reliability of the precipitation forecasts and the probability of precipitation exceeding a given threshold, and can serve as a reference for policy makers dealing with disaster prevention and relief.

Because the verification results show significant wet biases in the LAPS EPS for typhoon cases, this study applies the LR method to calibrate the PQPFs. The cross-validation results based on eight typhoon cases in 2008 and 2009 indicate that forecast biases can be corrected via the calibration procedure to improve the skill of PQPFs.

In addition, this study carries out sensitivity experiments on two factors affecting the calibration results: the training samples and the accuracy of the observation data. The LR calibration results vary with the training samples, and adopting more training samples does not guarantee better calibration results. In fact, the eight typhoon cases did not show very similar reliability and forecast biases (Fig. 18), so more training samples can actually degrade the calibration performance [SMP-T8S(LR) vs SMP-T(LR)]. Most importantly, the calibration principle rests on a consistent (or very similar) distribution of the forecast biases between the training and validation samples.

In the future, with more collected typhoon cases, the distributions of precipitation forecast biases can be analyzed for different typhoon paths, moving speeds, or precipitation intensities. Then various LR relationships can be established and applied to different distributions of forecast biases in the typhoon cases, thus to produce better calibration results.

The inconsistency in observation accuracy between the sea and land areas influences the calibration results. The QPEs of QPESUMS (used as the ground truth) have been calibrated against the observed precipitation in the land areas but not in the sea areas, which makes them slightly less accurate over the sea. The verifications of the PQPFs show that the forecast skill of the experiment SMP-T (including ocean and land samples) improves (e.g., higher BSS values) after calibration. However, because of the inconsistent observation accuracy and the far greater number of ocean samples (about nine times the land samples), the bias correction of most sea samples is achieved at the expense of the bias correction of some land samples in the experiment SMP-T(LR). Hence, separate calibration for the sea and land areas may ensure better calibration of the PQPFs.

However, the results of using the LR method to calibrate the PQPFs for mei-yu front cases were not very positive. The reason is that pattern shifts often occur in the LAPS forecasts for mei-yu front cases, which produces incorrect correspondence between the observed and forecast precipitation systems and seriously degrades the calibration results. In the future, the pattern shift needs to be corrected for mei-yu front cases before the PQPFs are calibrated. In addition, the forecast biases in the LAPS EPS vary with the distinctive geographic environments of Taiwan, so individual calibration of the PQPFs for different regions should be considered.

Acknowledgments

The NOAA/ESRL/GSD supported the technology transfer of the PQPF calibration technique (Yuan et al. 2008) and the LAPS system. Thanks to Drs. Fanthune Moeng, John A. McGinley, and Yuanfu Xie, and Steve Albers at NOAA/ESRL/GSD for facilitating the collaboration, and to the CWB of Taiwan for supporting this project. Thanks to Ms. Annie Reiser at NOAA/ESRL for editing this manuscript and Dr. Chien-Ben Chou at CWB for helpful discussions. We also thank two anonymous reviewers for their valuable comments. The second author (H. Yuan) received support from the Natural Science Foundation of China (41175087), the R&D Special Fund for Public Welfare Industry (Meteorology) (GYHY201206005), the Scientific Research Foundation for Introduced Talent, Nanjing University (020722631003), and the Priority Academic Program Development of Jiangsu Higher Education Institutions.

REFERENCES

  • Albers, S., 1995: The LAPS wind analysis. Wea. Forecasting, 10, 342–352.

  • Albers, S., J. McGinley, D. Birkenheuer, and J. Smart, 1996: The Local Analysis and Prediction System (LAPS): Analyses of clouds, precipitation, and temperature. Wea. Forecasting, 11, 273–287.

  • Atger, F., 2003: Spatial and interannual variability of the reliability of ensemble-based probabilistic forecasts: Consequences for calibration. Mon. Wea. Rev., 131, 1509–1523.

  • Birkenheuer, D., 1999: The effect of using digital satellite imagery in the LAPS moisture analysis. Wea. Forecasting, 14, 782–788.

  • Buizza, R., P. L. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097.

  • Cheung, K. K. W., 2001: A review of ensemble forecasting techniques with a focus on tropical cyclone forecasting. Meteor. Appl., 8, 315–332.

  • Chien, F.-C., and B. J.-D. Jou, 2004: MM5 ensemble mean precipitation forecasts in the Taiwan area for three early summer convective (mei-yu) seasons. Wea. Forecasting, 19, 735–750.

  • Chien, F.-C., Y.-C. Shao, B. J.-D. Jou, P.-L. Lin, M.-J. Yang, J.-S. Hong, J.-H. Teng, and H.-C. Lin, 2003: Precipitation verification of the MM5 ensemble forecast (in Chinese with an English abstract). Atmos. Sci., 31, 77–94.

  • Chien, F.-C., Y.-C. Liu, B. J.-D. Jou, P.-L. Lin, J.-S. Hong, and L.-F. Hsiao, 2005: MM5 ensemble rainfall forecasts during the 2003 Mei-yu season (in Chinese with an English abstract). Atmos. Sci., 33, 255–275.

  • Davis, C. A., and S. Low-Nam, 2001: The NCAR-AFWA tropical cyclone bogussing scheme. Air Force Weather Agency (AFWA) Rep., 13 pp. [Available online at http://www.mmm.ucar.edu/mm5/mm5v3/tc-report.pdf.]

  • Donner, L. J., 1988: An initialization for cumulus convection in numerical weather prediction models. Mon. Wea. Rev., 116, 377–385.

  • Eckel, F. A., and M. K. Walters, 1998: Calibrated probabilistic quantitative precipitation forecasts based on the MRF ensemble. Wea. Forecasting, 13, 1132–1147.

  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211.

  • Gourley, J. J., J. Zhang, R. A. Maddox, C. M. Calvert, and K. W. Howard, 2001: A real-time precipitation monitoring algorithm—Quantitative Precipitation Estimation Using Multiple Sensors (QPE-SUMS). Preprints, Symp. on Precipitation Extremes: Prediction, Impacts, and Responses, Albuquerque, NM, Amer. Meteor. Soc., 57–60.

  • Grell, G., J. Dudhia, and D. Stauffer, 1995: A description of the fifth-generation Penn State/NCAR Mesoscale Model (MM5). NCAR Tech. Note NCAR/TN-398+STR, 138 pp.

  • Hamill, T. M., 1997: Reliability diagrams for multicategory probabilistic forecasts. Wea. Forecasting, 12, 736–741.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Hamill, T. M., and J. Juras, 2006: Measuring forecast skill: Is it real skill or is it the varying climatology? Quart. J. Roy. Meteor. Soc., 132, 2905–2923.

  • Harms, D. E., R. V. Madala, S. Raman, and K. D. Sashegyi, 1993: Diabatic initialization tests using the Naval Research Laboratory limited-area numerical weather prediction model. Mon. Wea. Rev., 121, 3184–3190.

  • Heckley, W. A., 1985: Systematic errors of the ECMWF operational forecasting model in tropical regions. Quart. J. Roy. Meteor. Soc., 111, 709–738.

  • Hsu, W.-R., and A. H. Murphy, 1986: The attributes diagram: A geometrical framework for assessing the quality of probability forecasts. Int. J. Forecasting, 2, 285–293.

  • Jian, G.-J., and J. McGinley, 2005: Evaluation of a short-range forecast system on quantitative precipitation forecasts associated with tropical cyclones of 2003 near Taiwan. J. Meteor. Soc. Japan, 83, 657–681.

  • Jian, G.-J., S.-L. Shieh, and J. McGinley, 2003: Precipitation associated with Typhoon Sinlaku (2002) in Taiwan area using the LAPS diabatic initialization for MM5. TAO, 14, 128.

  • Jolliffe, I. T., and D. B. Stephenson, 2003: Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley, 254 pp.

  • Kalnay, E., and A. Dalcher, 1987: Forecasting forecast skill. Mon. Wea. Rev., 115, 349–356.

  • Krishnamurti, T. N., J. Xue, H. S. Bedi, K. Ingles, and D. Oosterhof, 1991: Physical initialization for numerical weather prediction over the tropics. Tellus, 43AB, 53–81.

  • Lu, C., H. Yuan, B. Schwartz, and S. Benjamin, 2007: Short-range forecast using time-lagged ensembles. Wea. Forecasting, 22, 580–595.

  • Mason, S. J., and N. E. Graham, 1999: Conditional probabilities, relative operating characteristics, and relative operating levels. Wea. Forecasting, 14, 713–725.

  • Mass, C. F., 2003: IFPS and the future of the National Weather Service. Wea. Forecasting, 18, 75–79.

  • Mohanty, U. C., A. Kasahara, and R. Errico, 1986: The impact of diabatic heating on the initialization of a global forecast model. J. Meteor. Soc. Japan, 64, 805–817.

  • Mullen, S. L., and R. Buizza, 2004: Calibration of probabilistic precipitation forecasts from the ECMWF EPS by an artificial neural network. Preprints, 17th Conf. on Probability and Statistics in the Atmospheric Sciences, Seattle, WA, Amer. Meteor. Soc., J5.6.

  • Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600.

  • Schultz, P., 1995: An explicit cloud physics parameterization for operational numerical weather prediction. Mon. Wea. Rev., 123, 3331–3343.

  • Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. Bull. Amer. Meteor. Soc., 74, 2317–2330.

  • Van den Dool, H. M., and L. Rukhovets, 1994: On the weights for an ensemble-averaged 6–10-day forecast. Wea. Forecasting, 9, 457465.

    • Search Google Scholar
    • Export Citation
  • Vislocky, R. L., and J. M. Fritsch, 1997: Performance of an advanced MOS system in the 1996–97 national collegiate weather forecasting conference. Bull. Amer. Meteor. Soc., 78, 28512857.

    • Search Google Scholar
    • Export Citation
  • Whitaker, J. S., and A. F. Loughe, 1998: The relationship between ensemble spread and ensemble mean skill. Mon. Wea. Rev., 126, 32923302.

    • Search Google Scholar
    • Export Citation
  • Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 627 pp.

  • Yang, M.-J., B. J.-D. Jou, S.-C. Wang, J.-S. Hong, P.-L. Lin, J.-H. Teng, and H.-C. Lin, 2004: Ensemble prediction of rainfall during the 2000–2002 Mei-Yu seasons: Evaluation over the Taiwan area. J. Geophys. Res., 109, D18203, doi:10.1029/2003JD004368.

    • Search Google Scholar
    • Export Citation
  • Yuan, H., S. L. Mullen, X. Gao, S. Sorooshian, J. Du, and H. H. Juang, 2005: Verification of probabilistic quantitative precipitation forecasts over the southwest United States during winter 2002/03 by the RSM ensemble system. Mon. Wea. Rev., 133, 279294.

    • Search Google Scholar
    • Export Citation
  • Yuan, H., X. Gao, S. L. Mullen, S. Sorooshian, J. Du, and H. H. Juang, 2007a: Calibration of probabilistic quantitative precipitation forecasts with an artificial neural network. Wea. Forecasting, 22, 12871303.

    • Search Google Scholar
    • Export Citation
  • Yuan, H., S. L. Mullen, X. Gao, S. Sorooshian, J. Du, and H. H. Juang, 2007b: Short-range probabilistic Quantitative Precipitation Forecasts over the southwest United States by the RSM ensemble system. Mon. Wea. Rev., 135, 16851698.

    • Search Google Scholar
    • Export Citation
  • Yuan, H., J. A. McGinley, P. J. Schultz, C. J. Anderson, and C. Lu, 2008: Short-range precipitation forecasts from time-lagged multimodel ensembles during the HMT-West-2006 campaign. J. Hydrometeor., 9, 477491.

    • Search Google Scholar
    • Export Citation
  • Zhu, Y., 2005: Ensemble forecast: A new approach to uncertainty and predictability. Adv. Atmos. Sci., 22 (6), 781788.

Save
  • Albers, S., 1995: The LAPS wind analysis. Wea. Forecasting, 10, 342–352.

  • Albers, S., J. McGinley, D. Birkenheuer, and J. Smart, 1996: The Local Analysis and Prediction System (LAPS): Analyses of clouds, precipitation, and temperature. Wea. Forecasting, 11, 273–287.

  • Atger, F., 2003: Spatial and interannual variability of the reliability of ensemble-based probabilistic forecasts: Consequences for calibration. Mon. Wea. Rev., 131, 1509–1523.

  • Birkenheuer, D., 1999: The effect of using digital satellite imagery in the LAPS moisture analysis. Wea. Forecasting, 14, 782–788.

  • Buizza, R., P. L. Houtekamer, Z. Toth, G. Pellerin, M. Wei, and Y. Zhu, 2005: A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems. Mon. Wea. Rev., 133, 1076–1097.

  • Cheung, K. K. W., 2001: A review of ensemble forecasting techniques with a focus on tropical cyclone forecasting. Meteor. Appl., 8, 315–332.

  • Chien, F.-C., and B. J.-D. Jou, 2004: MM5 ensemble mean precipitation forecasts in the Taiwan area for three early summer convective (mei-yu) seasons. Wea. Forecasting, 19, 735–750.

  • Chien, F.-C., Y.-C. Shao, B. J.-D. Jou, P.-L. Lin, M.-J. Yang, J.-S. Hong, J.-H. Teng, and H.-C. Lin, 2003: Precipitation verification of the MM5 ensemble forecast (in Chinese with an English abstract). Atmos. Sci., 31, 77–94.

  • Chien, F.-C., Y.-C. Liu, B. J.-D. Jou, P.-L. Lin, J.-S. Hong, and L.-F. Hsiao, 2005: MM5 ensemble rainfall forecasts during the 2003 Mei-yu season (in Chinese with an English abstract). Atmos. Sci., 33, 255–275.

  • Davis, C. A., and S. Low-Nam, 2001: The NCAR-AFWA tropical cyclone bogussing scheme. Air Force Weather Agency (AFWA) Rep., 13 pp. [Available online at http://www.mmm.ucar.edu/mm5/mm5v3/tc-report.pdf.]

  • Donner, L. J., 1988: An initialization for cumulus convection in numerical weather prediction models. Mon. Wea. Rev., 116, 377–385.

  • Eckel, F. A., and M. K. Walters, 1998: Calibrated probabilistic quantitative precipitation forecasts based on the MRF ensemble. Wea. Forecasting, 13, 1132–1147.

  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211.

  • Gourley, J. J., J. Zhang, R. A. Maddox, C. M. Calvert, and K. W. Howard, 2001: A real-time precipitation monitoring algorithm—Quantitative Precipitation Estimation Using Multiple Sensors (QPE-SUMS). Preprints, Symp. on Precipitation Extremes: Prediction, Impacts, and Responses, Albuquerque, NM, Amer. Meteor. Soc., 57–60.

  • Grell, G., J. Dudhia, and D. Stauffer, 1995: A description of the fifth-generation Penn State/NCAR Mesoscale Model (MM5). NCAR Tech. Note NCAR/TN-398+STR, 138 pp.

  • Hamill, T. M., 1997: Reliability diagrams for multicategory probabilistic forecasts. Wea. Forecasting, 12, 736–741.

  • Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

  • Hamill, T. M., and J. Juras, 2006: Measuring forecast skill: Is it real skill or is it the varying climatology? Quart. J. Roy. Meteor. Soc., 132, 2905–2923.

  • Harms, D. E., R. V. Madala, S. Raman, and K. D. Sashegyi, 1993: Diabatic initialization tests using the Naval Research Laboratory limited-area numerical weather prediction model. Mon. Wea. Rev., 121, 3184–3190.

  • Heckley, W. A., 1985: Systematic errors of the ECMWF operational forecasting model in tropical regions. Quart. J. Roy. Meteor. Soc., 111, 709–738.

  • Hsu, W.-R., and A. H. Murphy, 1986: The attributes diagram: A geometrical framework for assessing the quality of probability forecasts. Int. J. Forecasting, 2, 285–293.

  • Jian, G.-J., and J. McGinley, 2005: Evaluation of a short-range forecast system on quantitative precipitation forecasts associated with tropical cyclones of 2003 near Taiwan. J. Meteor. Soc. Japan, 83, 657–681.

  • Jian, G.-J., S.-L. Shieh, and J. McGinley, 2003: Precipitation associated with Typhoon Sinlaku (2002) in Taiwan area using the LAPS diabatic initialization for MM5. TAO, 14, 1–28.

  • Jolliffe, I. T., and D. B. Stephenson, 2003: Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley, 254 pp.

  • Kalnay, E., and A. Dalcher, 1987: Forecasting forecast skill. Mon. Wea. Rev., 115, 349–356.

  • Krishnamurti, T. N., J. Xue, H. S. Bedi, K. Ingles, and D. Oosterhof, 1991: Physical initialization for numerical weather prediction over the tropics. Tellus, 43AB, 53–81.

  • Lu, C., H. Yuan, B. Schwartz, and S. Benjamin, 2007: Short-range forecast using time-lagged ensembles. Wea. Forecasting, 22, 580–595.

  • Mason, S. J., and N. E. Graham, 1999: Conditional probabilities, relative operating characteristics, and relative operating levels. Wea. Forecasting, 14, 713–725.

  • Mass, C. F., 2003: IFPS and the future of the National Weather Service. Wea. Forecasting, 18, 75–79.

  • Mohanty, U. C., A. Kasahara, and R. Errico, 1986: The impact of diabatic heating on the initialization of a global forecast model. J. Meteor. Soc. Japan, 64, 805–817.

  • Mullen, S. L., and R. Buizza, 2004: Calibration of probabilistic precipitation forecasts from the ECMWF EPS by an artificial neural network. Preprints, 17th Conf. on Probability and Statistics in the Atmospheric Sciences, Seattle, WA, Amer. Meteor. Soc., J5.6.

  • Murphy, A. H., 1973: A new vector partition of the probability score. J. Appl. Meteor., 12, 595–600.

  • Schultz, P., 1995: An explicit cloud physics parameterization for operational numerical weather prediction. Mon. Wea. Rev., 123, 3331–3343.

  • Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. Bull. Amer. Meteor. Soc., 74, 2317–2330.

  • Van den Dool, H. M., and L. Rukhovets, 1994: On the weights for an ensemble-averaged 6–10-day forecast. Wea. Forecasting, 9, 457–465.

  • Vislocky, R. L., and J. M. Fritsch, 1997: Performance of an advanced MOS system in the 1996–97 national collegiate weather forecasting conference. Bull. Amer. Meteor. Soc., 78, 2851–2857.

  • Whitaker, J. S., and A. F. Loughe, 1998: The relationship between ensemble spread and ensemble mean skill. Mon. Wea. Rev., 126, 3292–3302.

  • Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 627 pp.

  • Yang, M.-J., B. J.-D. Jou, S.-C. Wang, J.-S. Hong, P.-L. Lin, J.-H. Teng, and H.-C. Lin, 2004: Ensemble prediction of rainfall during the 2000–2002 Mei-Yu seasons: Evaluation over the Taiwan area. J. Geophys. Res., 109, D18203, doi:10.1029/2003JD004368.

  • Yuan, H., S. L. Mullen, X. Gao, S. Sorooshian, J. Du, and H. H. Juang, 2005: Verification of probabilistic quantitative precipitation forecasts over the southwest United States during winter 2002/03 by the RSM ensemble system. Mon. Wea. Rev., 133, 279–294.

  • Yuan, H., X. Gao, S. L. Mullen, S. Sorooshian, J. Du, and H. H. Juang, 2007a: Calibration of probabilistic quantitative precipitation forecasts with an artificial neural network. Wea. Forecasting, 22, 1287–1303.

  • Yuan, H., S. L. Mullen, X. Gao, S. Sorooshian, J. Du, and H. H. Juang, 2007b: Short-range probabilistic quantitative precipitation forecasts over the southwest United States by the RSM ensemble system. Mon. Wea. Rev., 135, 1685–1698.

  • Yuan, H., J. A. McGinley, P. J. Schultz, C. J. Anderson, and C. Lu, 2008: Short-range precipitation forecasts from time-lagged multimodel ensembles during the HMT-West-2006 campaign. J. Hydrometeor., 9, 477–491.

  • Zhu, Y., 2005: Ensemble forecast: A new approach to uncertainty and predictability. Adv. Atmos. Sci., 22 (6), 781–788.

  • Fig. 1.

    Schematic diagram of the LAPS short-range forecast system.

  • Fig. 2.

    QPESUMS domain. The radar coverage is indicated by the shaded area, and the four radar sites are indicated by closed triangles, including Wu–Fen–Shan (RCWF), Ken–Ting (RCKT), Hua–Lian (RCHL), and Chi–Gu (RCCG) radars.

  • Fig. 3.

    Schematic diagram of the LAPS ensemble system.

  • Fig. 4.

    Schematic diagram of the LAPS time-lagged multimodel ensemble.

  • Fig. 5.

    Distribution of (left) LAPS 0–6-h PQPFs and (right) QPESUMS precipitation (used as truth) probabilities at thresholds (a) 50, (b) 100, and (c) 200 mm (6 h)−1 ending at 1200 UTC 19 Sep 2010. (right) The orange shaded area denotes pixels where QPESUMS precipitation estimates exceed the indicated threshold, and the pink shaded area indicates QPESUMS radar coverage.

  • Fig. 6.

    Rank histograms for LAPS 0–6-h QPFs from the experiments (a) SMP-T and (b) SMP-L. The horizontal dashed line denotes the frequency for a uniform rank distribution.
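
    A rank histogram of the kind shown in Fig. 6 can be built from any set of ensemble forecasts and matching observations: each observation is ranked among the sorted member values, and a flat histogram suggests a well-dispersed EPS. A minimal sketch in Python on synthetic data (not the LAPS output; ties are resolved by counting only strictly smaller members):

    ```python
    import numpy as np

    def rank_histogram(ensemble, obs):
        """Count, for each case, the observation's rank among the ensemble
        members (0 .. n_members); a flat histogram indicates good dispersion."""
        n_members = ensemble.shape[1]
        # rank = number of members strictly below the observation
        ranks = (ensemble < obs[:, None]).sum(axis=1)
        return np.bincount(ranks, minlength=n_members + 1)

    # Synthetic demo: a 12-member ensemble whose members and observations
    # are drawn from the same distribution, so the histogram should be flat.
    rng = np.random.default_rng(0)
    ens = rng.normal(size=(5000, 12))
    ob = rng.normal(size=5000)
    counts = rank_histogram(ens, ob)  # 13 bins, roughly uniform
    ```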

  • Fig. 7.

    (left) Reliability diagrams for LAPS 0–6-h PQPFs at thresholds (a) 1, (b) 5, and (c) 20 mm (6 h)−1. Reliability curves from the experiments SMP-T (before LR calibration, dashed line with solid dots) and SMP-T(LR) (after LR calibration, solid line with hollow circles) are shown. The horizontal dashed line indicates the sample climatology frequency. (right) Histograms indicate the corresponding sample ratio (%) of each forecast probability subrange for the experiments SMP-T (before LR, gray) and SMP-T(LR) (after LR, blank).
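
    A reliability curve such as those in Fig. 7 is the observed relative frequency of the event within each forecast-probability bin, and the inset histogram is each bin's sample ratio. A minimal sketch, assuming 10 equally spaced bins and synthetic inputs (variable names are illustrative):

    ```python
    import numpy as np

    def reliability_curve(prob_fcst, obs_event, n_bins=10):
        """Return (observed frequency per bin, sample ratio per bin).
        Bins are equally spaced on [0, 1]; empty bins yield NaN."""
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        idx = np.clip(np.digitize(prob_fcst, edges) - 1, 0, n_bins - 1)
        bin_freq = np.full(n_bins, np.nan)
        bin_ratio = np.zeros(n_bins)
        for b in range(n_bins):
            mask = idx == b
            bin_ratio[b] = mask.mean()
            if mask.any():
                bin_freq[b] = obs_event[mask].mean()
        return bin_freq, bin_ratio
    ```

    For a perfectly reliable forecast, the curve lies on the diagonal; departures above or below it correspond to the under- and overforecasting visible before calibration.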

  • Fig. 8.

    (a) ROC curves from the experiment SMP-T and (b) the area under the ROC from the experiments SMP-T (line with circles) and SMP-L (line with squares) for LAPS 0–6-h PQPFs at different thresholds [1, 5, 10, 15, 20, and 30 mm (6 h)−1].
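
    The area under the ROC curve in Fig. 8 can be estimated by sweeping the forecast-probability thresholds and integrating hit rate against false-alarm rate with the trapezoid rule. A sketch on synthetic data (illustrative, not the paper's code):

    ```python
    import numpy as np

    def roc_area(prob_fcst, obs_event):
        """Trapezoid-rule area under the ROC curve: hit rate vs
        false-alarm rate at descending probability thresholds."""
        pos = obs_event.astype(bool)
        thresholds = np.unique(prob_fcst)[::-1]
        hit = np.array([(prob_fcst >= t)[pos].mean() for t in thresholds])
        far = np.array([(prob_fcst >= t)[~pos].mean() for t in thresholds])
        h = np.concatenate(([0.0], hit, [1.0]))
        f = np.concatenate(([0.0], far, [1.0]))
        return 0.5 * np.sum((f[1:] - f[:-1]) * (h[1:] + h[:-1]))
    ```

    An area of 0.5 indicates no discrimination (the diagonal), and 1.0 a perfect discriminator.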

  • Fig. 9.

    Brier skill scores at different thresholds for LAPS 0–6-h PQPFs from the experiments SMP-T (dashed line with solid dots), SMP-T (LR) (solid line with hollow circles), and SMP-T8S (LR) (solid line with hollow squares).
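
    The Brier skill score plotted here measures improvement over a reference forecast, taken as the sample climatology: 1 is perfect, 0 matches climatology, and negative values are worse than climatology. A minimal sketch:

    ```python
    import numpy as np

    def brier_skill_score(prob_fcst, obs_event):
        """BSS = 1 - BS / BS_climatology, with the sample base rate
        used as the constant climatological forecast."""
        bs = np.mean((prob_fcst - obs_event) ** 2)
        clim = obs_event.mean()
        bs_ref = np.mean((clim - obs_event) ** 2)
        return 1.0 - bs / bs_ref
    ```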

  • Fig. 10.

    The spatial distribution of the ranked probability skill score for LAPS 0–6-h PQPFs from the experiments (a) SMP-T (before LR calibration) and (b) SMP-T (LR) (after LR calibration) using four thresholds [1, 5, 10, and 20 mm (6 h)−1] to define five categories.
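
    The skill score mapped in Fig. 10 is built on the ranked probability score over the five precipitation categories, with RPSS = 1 − RPS/RPS_clim. A sketch of the RPS itself, assuming the per-category forecast probabilities are already in hand (synthetic inputs, illustrative names):

    ```python
    import numpy as np

    def ranked_probability_score(cat_probs, obs_cat):
        """RPS: mean squared difference between the cumulative forecast
        and cumulative observed (one-hot) category distributions.
        cat_probs: (n_cases, n_categories); obs_cat: observed category index."""
        n_cat = cat_probs.shape[1]
        cum_f = np.cumsum(cat_probs, axis=1)
        cum_o = np.cumsum(np.eye(n_cat)[obs_cat], axis=1)
        return ((cum_f - cum_o) ** 2).sum(axis=1).mean()
    ```

    Because it compares cumulative distributions, the RPS penalizes a forecast more the further its probability mass sits from the observed category.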

  • Fig. 11.

    Decomposition of the Brier score for LAPS 0–6-h PQPFs from the experiments SMP-T (dashed line with circle symbols), SMP-T(LR) (solid line with square symbols), and SMP-T8S(LR) (solid line with triangle symbols). Solid symbols: reliability terms. Hollow symbols: resolution terms.
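
    The reliability and resolution terms in Fig. 11 come from the Murphy (1973) partition BS = reliability − resolution + uncertainty, computed over forecast-probability bins. A sketch (the identity is exact only when forecasts within a bin are identical):

    ```python
    import numpy as np

    def brier_decomposition(prob_fcst, obs_event, n_bins=10):
        """Murphy (1973) partition of the Brier score into
        (reliability, resolution, uncertainty) over probability bins."""
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        idx = np.clip(np.digitize(prob_fcst, edges) - 1, 0, n_bins - 1)
        obar = obs_event.mean()
        n = prob_fcst.size
        rel = 0.0
        res = 0.0
        for b in range(n_bins):
            mask = idx == b
            nk = mask.sum()
            if nk == 0:
                continue
            pk = prob_fcst[mask].mean()   # mean forecast in bin
            ok = obs_event[mask].mean()   # observed frequency in bin
            rel += nk * (pk - ok) ** 2
            res += nk * (ok - obar) ** 2
        return rel / n, res / n, obar * (1.0 - obar)
    ```

    Lower reliability terms and higher resolution terms both reduce the Brier score, which is the pattern the calibration in Fig. 11 is meant to produce.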

  • Fig. 12.

    Scatterplots of the RMSE against the ensemble spread (SPRD) for eight typhoon cases (TY 1–TY 8) from the experiment SMP-T. Each point in the scatterplot comes from one 0–6-h QPF (i.e., RMSE and SPRD are averaged over the QPESUMS domain). The linear regression line, correlation coefficient (C), and the coefficient of determination (R²) are shown.
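
    The domain-averaged SPRD and RMSE pairs scattered in Fig. 12 can be computed per forecast case as below, with the correlation coefficient C taken across cases. A sketch on synthetic data whose spread genuinely varies from case to case (array shapes and names are illustrative):

    ```python
    import numpy as np

    def spread_skill(ensemble, obs):
        """Per-case domain-mean ensemble spread and RMSE of the ensemble
        mean, plus their correlation across cases.
        ensemble: (n_cases, n_members, n_points); obs: (n_cases, n_points)."""
        ens_mean = ensemble.mean(axis=1)
        rmse = np.sqrt(((ens_mean - obs) ** 2).mean(axis=1))
        sprd = ensemble.std(axis=1).mean(axis=1)
        corr = np.corrcoef(sprd, rmse)[0, 1]
        return sprd, rmse, corr

    # Demo: cases with case-dependent noise level, so spread predicts error.
    rng = np.random.default_rng(5)
    s = rng.uniform(0.5, 2.0, size=200)
    truth = rng.normal(size=(200, 400))
    ens = truth[:, None, :] + s[:, None, None] * rng.normal(size=(200, 12, 400))
    sprd, rmse, c = spread_skill(ens, truth)
    ```

    A high positive correlation, as in this demo, is what a good spread–skill relationship looks like.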

  • Fig. 13.

    As in Fig. 12, but for all typhoon cases (TY 1–TY 8) in one plot.

  • Fig. 14.

    As in Fig. 13, but for the four different ensemble configurations (a) EPS-mmtl, (b) EPS-m06h, (c) EPS-m09h, and (d) EPS-m12h, respectively. (left) Results from the experiment SMP-T and (right) from the experiment SMP-L. Each point in the scatterplot comes from one typhoon case. The ratio of SPRD over RMSE is also indicated on the plot of EPS-mmtl.

  • Fig. 15.

    Brier skill scores of eight typhoon cases (TY 1–TY 8) for experiment SMP-T at different thresholds [1, 5, 10, and 20 mm (6 h)−1].

  • Fig. 16.

    Brier skill scores at different thresholds for four different ensemble configurations (EPS-mmtl, EPS-m06h, EPS-m09h, and EPS-m12h) for the experiments (a) SMP-T and (b) SMP-L.

  • Fig. 17.

    (a) The RRMSE and (b) the correlation coefficient of 6-h accumulated precipitation forecasts of 12 members and four ensemble configurations (EPS-mmtl, EPS-m06h, EPS-m09h, and EPS-m12h) in Table 3 from the experiments SMP-T (line with circles) and SMP-L (line with squares).

  • Fig. 18.

    Reliability diagram for LAPS 0–6-h PQPFs at the threshold of 30 mm (6 h)−1. Reliability curves from TY 1 to TY 8 are shown.

  • Fig. 19.

    As in Fig. 7, but from the experiments SMP-L (before LR calibration) and SMP-L (LR) (after LR calibration).
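
    The LR calibration applied in the SMP-L (LR) and SMP-T (LR) experiments is a linear-regression bias correction of the raw probabilities. One common form, shown here purely as an illustrative sketch and not necessarily the authors' exact formulation, regresses observed event occurrence on the raw ensemble probability over a training sample and clips the fitted probabilities to [0, 1]:

    ```python
    import numpy as np

    def lr_calibrate(train_prob, train_obs, raw_prob):
        """Fit observed occurrence (0/1) on raw forecast probability by
        ordinary least squares, then map new raw probabilities through
        the fitted line, clipped to the valid probability range."""
        a, b = np.polyfit(train_prob, train_obs.astype(float), 1)
        return np.clip(a * raw_prob + b, 0.0, 1.0)
    ```

    For a wet-biased EPS the fitted slope is below one, pulling overconfident high probabilities down toward the observed frequencies; this is also why the calibration is sensitive to whether the training sample shares the validation sample's biases.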

  • Fig. 20.

    Brier skill scores at different thresholds for LAPS 0–6-h PQPFs from the experiments SMP-T (dashed line with solid dots), SMP-T(LR) (solid line with hollow circles), SMP-L (dashed line with solid squares), and SMP-L(LR) (solid line with hollow squares).
