Many existing models that predict landslide hazards utilize ground-based sources of precipitation data. In locations where ground-based precipitation observations are limited (i.e., a vast majority of the globe), or for landslide hazard models that assess regional or global domains, satellite multisensor precipitation products offer a promising near-real-time alternative to ground-based data. NASA’s global Landslide Hazard Assessment for Situational Awareness (LHASA) model uses the Integrated Multisatellite Retrievals for Global Precipitation Measurement (IMERG) product to issue hazard “nowcasts” in near–real time for areas that are currently at risk for landsliding. Satellite-based precipitation estimates, however, can contain considerable systematic bias and random error, especially over mountainous terrain and during extreme rainfall events. This study combines a precipitation error modeling framework with a probabilistic adaptation of LHASA. Compared with the routine version of LHASA, this probabilistic version correctly predicts more of the observed landslides in the study region with fewer false alarms by high hazard nowcasts. This study demonstrates that improvements in landslide hazard prediction can be achieved regardless of whether the IMERG error model is trained using abundant ground-based precipitation observations or using far fewer and more scattered observations, suggesting that the approach is viable in data-limited regions. Results emphasize the importance of accounting for both random error and systematic satellite precipitation bias. The approach provides an example of how environmental prediction models can incorporate satellite precipitation uncertainty. Other applications such as flood and drought monitoring and forecasting could likely benefit from consideration of precipitation uncertainty.
Landslides result in thousands of fatalities, property loss, and infrastructure damage around the world every year (Dilley et al. 2005; Froude and Petley 2018; Petley 2012). They occur across a broad range of geographic, climatic, and land use settings and can range from minor slope failures to kilometers-long debris flows. Factors that determine landslide hazard can be sorted into two categories: 1) static factors that determine an area’s preexisting susceptibility to landsliding, such as slope, aspect, forest loss, road cut activity, lithology, and distance to fault zones, and 2) dynamic factors that trigger landslides (Dai et al. 2002; Sassa et al. 2014). Static factors can be conceptualized as determining where landslides are most likely to occur and dynamic factors as determining when they occur within susceptible areas. Though landslides can be initiated by seismic and human activity, rainfall is recognized as the most widespread and frequent trigger (Dai et al. 2002; Guzzetti et al. 2007; Petley et al. 2005).
Most existing landslide hazard monitoring systems use ground-based precipitation measurements. Japan and Norway, for example, operate nationwide early-warning systems that use radar rainfall measurements (Krøgli et al. 2018; Devoli et al. 2015; Osanai et al. 2010), while Italy’s early-warning system for rainfall-induced landslides, SANF, and Rio de Janeiro’s “Alerta Rio” system rely on rain gauge networks (Calvello et al. 2015; Piciullo et al. 2017; Rossi et al. 2012). In many parts of the globe, however, including in steep terrain and developing countries, such measurements are often lacking (Gebregiorgis and Hossain 2014; Kidd et al. 2017), hampering real-time monitoring and warning of potential landslide hazards.
Satellite multisensor precipitation products (SMPPs) provide near-real-time estimates of precipitation with near-global coverage, potentially enabling prediction of landslides and other environmental phenomena in locations and at scales not previously possible. SMPPs use algorithms that merge passive microwave and infrared sensing data from multiple satellites (e.g., Kidd and Levizzani 2011; Kidd and Huffman 2011; Tapiador et al. 2012; Wright 2018). Commonly used SMPPs include the TRMM Multisatellite Precipitation Analysis (TMPA; Huffman et al. 2007), the Climate Prediction Center (CPC) morphing technique (CMORPH; Joyce et al. 2004), and the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN; Sorooshian et al. 2000). In this study, we use the recent Integrated Multisatellite Retrievals for GPM (IMERG; Huffman et al. 2017), which is available at 0.1°, 30-min resolution for 2000–present. IMERG retrieves passive microwave (PMW) precipitation estimates using the Goddard profiling algorithm (GPROF) and blends PMW estimates with the IR-based PERSIANN–Cloud Classification System algorithm using the CMORPH–Kalman filter Lagrangian time interpolation scheme (Huffman et al. 2017; Joyce et al. 2004; Sorooshian et al. 2000).
Hong et al. (2006) first demonstrated the potential for near-real-time global landslide hazard assessment by combining TMPA with a global susceptibility map (Hong et al. 2007). Farahmand and AghaKouchak (2013) developed a global landslide model based on Support Vector Machines that used PERSIANN precipitation data to globally classify landslide events. Brunetti et al. (2018) and Nikolopoulos et al. (2017) evaluated the suitability of SMPPs for use in landslide prediction in Italy, finding that rain gauge products and SMPPs may differ in their estimation of landslide-triggering precipitation thresholds. Kirschbaum et al. (2015b) developed a model to assess landslide hazard in Central America using precipitation data from TMPA. This effort evolved into the global Landslide Hazard Assessment for Situational Awareness (LHASA) model framework, which uses precipitation data from IMERG to provide publicly available near-real-time “nowcasts” of landslide hazard around the world (Kirschbaum and Stanley 2018). Nowcasts identify areas with currently elevated landslide hazard by indicating either moderate hazard (yellow shading) or high hazard (red shading) in LHASA model output.
None of these existing global landslide hazard models explicitly addresses the systematic biases and random errors that are prevalent in SMPPs. These errors pose a key obstacle to the usage of satellite precipitation in landslide hazard prediction and environmental prediction more generally (e.g., AghaKouchak et al. 2011; Tian et al. 2009; Wright et al. 2017). SMPP performance declines at high latitudes and over ice-covered land surfaces (Ferraro et al. 2013; Tian and Peters-Lidard 2010). SMPPs have difficulty accurately depicting the extreme rainfall rates and orographic enhancement that typify landslide triggering conditions (Shige et al. 2013; AghaKouchak et al. 2011). Furthermore, retrospective studies that characterize bias and other error statistics of satellite precipitation based on comparison to ground reference data, which constitute the bulk of existing SMPP error studies (see Maggioni et al. 2016), are not directly applicable to models that ingest precipitation estimates. Instead, it is necessary to use an error model, which can generate a corrected estimate, range, or distribution of errors or values as soon as new satellite estimates are made available. Development of such models is nontrivial, since SMPP errors tend to be non-Gaussian and heteroscedastic and can be both discrete and continuous (Maggioni et al. 2014; Tian et al. 2013).
To date, SMPP error modeling has relied on ground-based precipitation data (e.g., rain gauges; bias-corrected radar) as “ground truth” for model fitting (henceforth referred to as “training”). Unfortunately, many parts of the world lack spatially or temporally complete records of ground-based precipitation (Kidd et al. 2017; Sun et al. 2018). This paucity of ground-truth information (e.g., limited numbers of rain gauges) has led previous error modeling studies to note that some form of “regionalization” of error estimates or error model parameters would be necessary (Gebregiorgis and Hossain 2013, 2014; Tang and Hossain 2009, 2012). Modeling SMPP errors regionally may reduce finescale variability in error structure, but the alternative is to have no error models at all in data-limited regions.
In addition to producing error estimates in the first place, another challenge is enabling environmental models to ingest such estimates. One approach is to generate ensembles consisting of multiple realizations of precipitation time series or space–time fields and then use each ensemble member to drive a prediction model. This approach allows prediction models to be used without any particular modification, as demonstrated in a number of studies using stochastic rainfall input for soil moisture and landslide hazard modeling (Maggioni et al. 2011; Nikolopoulos et al. 2010; White and Singham 2012). Though ensemble prediction models have been developed for some applications, flood forecasting in particular (Cloke and Pappenberger 2009), assembling such ensembles can be nontrivial, while for complex physics-based models, the requisite multiple simulations may not be computationally feasible for real-time applications. An alternative approach is to directly ingest precipitation distributions generated by SMPP error models into environmental prediction models. This latter approach is challenging to implement and has received little attention.
In this study, we combine IMERG with a recent, relatively simple, SMPP error model (Wright et al. 2017) and a probabilistic adaptation of the existing LHASA model framework. The approach is evaluated in the mountainous Appalachian region of the southeastern United States. This region features high-quality ground-based precipitation observations, which are used to train the error model. We examine the sensitivity of both SMPP error estimates and landslide predictions to the quantity of ground reference data used to train the error model. The results highlight the potential value of incorporating both systematic and random precipitation errors into environmental models, as well as the benefits of modifying those models’ structures to explicitly accommodate such error estimates. We consider the latter point to be critical, since probabilistic representations of precipitation are anticipated to become more common in the near future (Kirstetter et al. 2018; Wright 2018), and existing prediction models are not generally configured to directly ingest probabilistic estimates of precipitation.
Study region, precipitation data, and landslide susceptibility and inventory data are presented in section 2. The existing LHASA model, the SMPP error model, and a new probabilistic formulation of LHASA are introduced in section 3. Results are presented in section 4, and a discussion follows in section 5. A closing summary and conclusions are provided in section 6.
2. Study region and data
a. Study region
The study region encompasses the Appalachian Mountains in western North Carolina and eastern Tennessee and extends north into Virginia and Kentucky and south into Georgia and South Carolina (Figs. 1a,b). Extreme precipitation is the primary natural hazard in this region, causing floods and landslides that result in deaths and economic losses (Moore et al. 2015). These can result from tropical cyclones, mesoscale convective systems, orographic uplift, and atmospheric rivers that interact with the region’s complex terrain (Barros et al. 2014; Mahoney et al. 2016; Moore et al. 2015). Hurricane Frances was followed by Hurricane Ivan within a 2-week period in September 2004, for example, and caused approximately 400 landslides, 11 deaths, and widespread property damage (Boyle 2014). Landslides continue to pose a threat to the region, with three fatalities occurring in May 2018 (Carter 2018; Doom 2018).
b. Rainfall data
The study period is 2002–18. IMERG Version 6B Early and Late, both of which are used in the operational version of LHASA and are produced with latencies of 4 and 14 h, respectively (Huffman et al. 2015), are used. The Stage IV radar–gauge merged precipitation product (Lin 2011), available over the continental United States at hourly, roughly 4-km resolution, serves as the ground reference. Stage IV has been used previously to validate SMPPs (e.g., AghaKouchak et al. 2011). Though ground-reference data such as Stage IV can contain errors, we compared Stage IV against rain gauge observations (results not shown) and assume that such errors are negligible in comparison to SMPP errors for this study. Stage IV, IMERG Early, and IMERG Late are aggregated to a daily scale. Within the study area, IMERG observations of daily rainfall often differ from Stage IV and IMERG observations of high precipitation are especially error-prone (Figs. 1c,d).
Rather than use precipitation directly, LHASA uses an antecedent rainfall index (ARI) to define the dynamic component of landslide hazard. ARIt is a 7-day weighted accumulation of precipitation pt for day t and the prior 6 days:
$$\mathrm{ARI}_t = \frac{\sum_{i=0}^{6} p_{t-i}\, w_i}{\sum_{i=0}^{6} w_i}, \qquad (1)$$
where $w_i = (i + 1)^{-2}$. ARI is formulated to account for the combined impacts of current and recent precipitation on slope stability since soil moisture, pore water pressure, and other physical processes are not explicitly modeled in LHASA. More details on ARI and assigned weights can be found in Kirschbaum et al. (2015b) and Kirschbaum and Stanley (2018).
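As an illustration of how the index responds to the timing of rainfall, the following minimal Python sketch computes ARI for a single location. It assumes that the weighted sum is normalized by the sum of the weights (so that ARI keeps the units of daily rainfall); the function name and the sample rainfall values are hypothetical and are not taken from the LHASA code.

```python
def antecedent_rainfall_index(daily_precip, t, n_days=7):
    """Sketch of ARI on day index t from a sequence of daily precipitation (mm).

    Assumption: the weighted sum is divided by the sum of the weights.
    """
    if t < n_days - 1 or t >= len(daily_precip):
        raise ValueError("need n_days of data ending at day t")
    # w_i = (i + 1)^-2, where i counts days back from day t (i = 0 is day t).
    weights = [(i + 1) ** -2 for i in range(n_days)]
    weighted_sum = sum(daily_precip[t - i] * w for i, w in enumerate(weights))
    return weighted_sum / sum(weights)


# Hypothetical example: the same 10 mm of rain, falling today vs. six days ago.
rain_today = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0]
rain_six_days_ago = [10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
ari_recent = antecedent_rainfall_index(rain_today, t=6)
ari_old = antecedent_rainfall_index(rain_six_days_ago, t=6)
```

Because the weights decay as the inverse square of the lag, rainfall on the current day contributes 49 times as much to ARI as the same amount of rainfall six days earlier.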
c. Global landslide susceptibility map and landslide inventory
The LHASA model (section 3a) uses the 1-km global landslide susceptibility map developed by Stanley and Kirschbaum (2017). The map depicts the static susceptibility index (SI) determined using a fuzzy overlay model to combine gridded datasets of five static factors: slope, geology, distance to fault zones, presence of roads, and forest loss. SI consists of integer values from 1 to 5, corresponding to very low, low, moderate, high, and very high susceptibility (Fig. 2). The methodology used to create the susceptibility map can be found in Stanley and Kirschbaum (2017).
A landslide inventory was obtained from the North Carolina Geological Survey (Wooten et al. 2016) and the Global Landslide Catalog (GLC; Kirschbaum et al. 2010, 2015a), which provide databases of rainfall-triggered landslides based on media reports and other disaster databases. Landslide reports in the GLC include information on the date, location, and impacts of the events and provide an estimate of the landslide’s location accuracy.
Only landslides occurring between 2002 and 2018 were used due to the limited length of IMERG and Stage IV records. Additionally, landslide reports were only considered if the location accuracy in the inventory was within 1 km, if the range of the date of occurrence spanned less than 5 days, and if the trigger was listed as rainfall. The inventory was inspected to remove any duplicate landslide reports. This resulted in an inventory of 214 landslides, approximately half of which are associated with Hurricane Frances and Hurricane Ivan in September 2004 (Fig. 1b). Over 80% of landslides recorded during the 2002–18 study period occurred in high or very high susceptibility areas (SI ≥ 4; Table 1).
3. Methods
a. CSGD-based error modeling framework
SMPPs exhibit three types of error: false alarms, in which the SMPP reports precipitation but none actually occurs; missed cases, in which precipitation occurs but the SMPP does not “see” it; and hits, in which the SMPP correctly reports nonzero precipitation, but of the wrong magnitude. The error modeling framework based on censored shifted gamma distributions (CSGDs), developed by Scheuerer and Hamill (2015) for postprocessing ensemble numerical precipitation forecasts and adapted by Wright et al. (2017) to model errors in SMPPs, is capable of quantifying all three error types. The CSGD is an adaptation of the two-parameter gamma distribution (here written in terms of its mean and standard deviation, but which can be reparametrized in terms of shape and scale parameters) with an additional “shift” parameter δ that shifts the probability density function (PDF) leftward. The distribution is left-censored at zero, replacing all negative values with zero. The probability density left of zero thus represents the probability of zero precipitation, while the density at any value greater than zero represents the likelihood of that amount of precipitation (Figs. 3a,b). The CSGD is thus able to describe both precipitation occurrence and magnitude. The cumulative distribution function (CDF; Fig. 3a) is defined by
A “climatological CSGD” with parameters μ, σ, and δ is fitted to the record of ground-truth precipitation (Fig. 3a). A nonlinear regression system is then trained based on contemporaneous collocated SMPP and ground-truth observations to produce regression parameters α1, α2, α3, α4 and, at any time t, unique “conditional” CSGD parameters μ(t), σ(t), and δ(t):
where is the mean of the satellite observations. The terms μt, σt, and δt define the distribution of possible “true” precipitation at time t based on the SMPP observation (Fig. 3b). The CSGD framework implicitly downscales estimates if the spatial resolution of the satellite and ground-reference precipitation records differ, as shown in Scheuerer and Hamill (2015). In this study, we use the error model to downscale from IMERG’s 0.1° resolution to the 1/24° (roughly 4 km) resolution of Stage IV. The CSGD error model thus characterizes the relationship between 0.1° IMERG estimates of precipitation and 1/24° Stage IV estimates. This relationship will differ from what would be obtained if Stage IV were used at 0.1° resolution, since some discrepancies between the two datasets would be smoothed out at that coarser scale.
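To make the censored shifted gamma construction concrete, the sketch below evaluates a CSGD in pure Python. It is only an illustration under stated assumptions: the shift δ is taken to be nonpositive, so that the CDF is the gamma CDF evaluated at (y − δ)/θ (other references use the opposite sign); the gamma CDF is computed with a simple power series rather than a statistics library; and all function names and parameter values are hypothetical. The regression that produces the conditional parameters is not reproduced here.

```python
import math


def gamma_cdf(x, shape):
    """Regularized lower incomplete gamma function P(shape, x), by power series.

    Adequate for the moderate arguments used in this sketch.
    """
    if x <= 0.0:
        return 0.0
    log_prefactor = shape * math.log(x) - x - math.lgamma(shape + 1.0)
    term, total, n = 1.0, 1.0, 0
    while term > 1e-16 * total and n < 10000:
        n += 1
        term *= x / (shape + n)
        total += term
    return min(1.0, math.exp(log_prefactor) * total)


def csgd_cdf(y, mu, sigma, delta):
    """CDF of a gamma (mean mu, std sigma) shifted by delta and censored at 0.

    Assumed sign convention: delta <= 0 moves the distribution leftward.
    """
    if y < 0.0:
        return 0.0
    shape = (mu / sigma) ** 2   # k = mu^2 / sigma^2
    scale = sigma ** 2 / mu     # theta = sigma^2 / mu
    return gamma_cdf((y - delta) / scale, shape)


def prob_zero(mu, sigma, delta):
    """Probability mass at zero: the density shifted left of zero."""
    return csgd_cdf(0.0, mu, sigma, delta)


def prob_exceed(threshold, mu, sigma, delta):
    """Probability that the 'true' value exceeds a threshold."""
    return 1.0 - csgd_cdf(threshold, mu, sigma, delta)


# Hypothetical parameters (mu = sigma = 1 reduces the gamma to an exponential).
p0 = prob_zero(mu=1.0, sigma=1.0, delta=-0.5)
p_gt_1 = prob_exceed(1.0, mu=1.0, sigma=1.0, delta=-0.5)
```

With these illustrative parameters the underlying gamma is an exponential distribution, so the probability of zero is 1 − e^(−0.5) ≈ 0.39 and the probability of exceeding 1 is e^(−1.5) ≈ 0.22, which gives a quick analytical check on the sketch.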
The CSGD error modeling framework could be applied to daily precipitation to generate an ensemble of daily precipitation time series for input to LHASA. Such an approach would be complicated, however, since it would be necessary to preserve temporal autocorrelation in daily precipitation. Instead, we train the CSGD model on ARI time series since ARI is a time-integrated precipitation value [Eq. (1)]. Climatological CSGDs fit to ARI time series show good agreement with empirical ARI CDFs (Fig. 3c), while the probabilities of zero precipitation in both the climatological CSGD of ARI and the conditional CSGDs of ARI (Fig. 3d) are unsurprisingly much lower than those for daily precipitation. It has been previously shown that IMERG error depends on the amount and source of passive microwave and infrared data used (Tan et al. 2016). Since this data availability varies over relatively short time scales (generally subhourly), it is not feasible to consider it when modeling multiday ARI.
Since both Stage IV and IMERG observations of ARI are available over the entire study domain, we can train a CSGD error model for any collocated set of IMERG and Stage IV ARI time series. This training method produces unique error model parameters for each of the 8640 four-kilometer pixels in the study area. We henceforth refer to this as the “localized model.” Since ground-based precipitation data are more limited in many parts of the world, we also adopt a regional parameter estimation scheme that may better reflect data-limited settings. The ARI time series of all pixels in the region are concatenated into a single satellite and ground-truth time series to use as input into the CSGD error model framework, generating one set of error model parameters. This regional training scheme neglects the heterogeneity in error characteristics across the region and provides a more generalized model of IMERG error in the study area. The result of this regional approach is henceforth referred to as the “regional model.” We also explore the possibility of training a regional model on data from fewer pixels by using 10 randomly selected locations (instead of all study area pixels) in the regional parameter estimation scheme. This random regional training, further detailed in section 4a, better reflects operational conditions in data-limited regions where extensive precipitation records are likely unavailable.
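The three training schemes differ only in which collocated time series are passed to the error model fit. The following sketch shows one way the training sets could be assembled; the data layout (a dictionary mapping a pixel identifier to a pair of satellite and ground-reference ARI lists), the function names, and the toy data are all hypothetical, and the CSGD fitting step itself is omitted.

```python
import random


def localized_training_sets(series_by_pixel):
    """One (satellite, reference) training pair per pixel."""
    return dict(series_by_pixel)


def regional_training_set(series_by_pixel, pixels=None):
    """Concatenate the series of the chosen pixels (all pixels by default)."""
    chosen = sorted(series_by_pixel) if pixels is None else pixels
    satellite, reference = [], []
    for pixel in chosen:
        sat_series, ref_series = series_by_pixel[pixel]
        satellite.extend(sat_series)
        reference.extend(ref_series)
    return satellite, reference


def random_regional_training_set(series_by_pixel, n_pixels=10, seed=0):
    """Regional set built from a random subset of pixels (data-limited case)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(series_by_pixel), n_pixels)
    return regional_training_set(series_by_pixel, chosen)


# Toy data: 20 pixels with 3 days of ARI each (values are placeholders).
toy = {p: ([float(p)] * 3, [float(p) + 1.0] * 3) for p in range(20)}
sat_all, ref_all = regional_training_set(toy)
sat_10, ref_10 = random_regional_training_set(toy, n_pixels=10, seed=1)
```

Repeating the random selection with different seeds yields a collection of training sets from which the spread of fitted models can be examined.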
Regional and localized CSGD error models for daily ARI from IMERG are trained on the 5-yr period from 2002 to 2006. Both models are incorporated into the probabilistic LHASA model [section 3b(3)]. We assess the consequences of using these different training methods by comparing the probabilistic LHASA model performance using each model (section 4b).
b. LHASA model
1) Existing “deterministic LHASA”
The LHASA model characterizes rainfall-induced landslide potential worldwide in near–real time using a heuristic decision tree model (Fig. 4; Kirschbaum and Stanley 2018). Its outputs are 1-km resolution nowcasts, i.e., predictions that indicate either moderate hazard or high hazard if the combination of static and dynamic factors meets specific criteria.
The LHASA model decision tree assesses landslide hazard in two steps. In step 1, the satellite-estimated ARI on day t is calculated and compared against the 95th percentile of historical (e.g., climatological) satellite-estimated ARI (ARIthresh) for that location. Thresholds other than the 95th percentile could be used, but this is outside the scope of our study. IMERG-Early observations are used for the current day due to their lower latency, while IMERG-Late observations are used for the prior 6 days due to their improved accuracy. If the satellite-estimated ARI is below ARIthresh, no nowcasts are issued to any of the 1-km pixels covered by the 0.1° IMERG grid cell. If the satellite-estimated ARI exceeds ARIthresh, the 1-km static susceptibility map (section 2c) is consulted in step 2. One-kilometer pixels with moderate or high static susceptibility (SI = 3 or 4) are issued a moderate hazard nowcast and very high susceptibility pixels (SI = 5) are issued a high hazard nowcast. All nowcasts are issued at the 1-km resolution of the static susceptibility map.
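The two-step decision tree described above can be sketched as a single function. This is an illustrative reading of the text, not the operational LHASA code: the function name and return labels are hypothetical, pixels with SI ≤ 2 are assumed to receive no nowcast, and an ARI exactly equal to the threshold is treated as not exceeding it.

```python
def deterministic_nowcast(ari_satellite, ari_thresh, susceptibility_index):
    """Sketch of the deterministic decision tree for one 1-km pixel.

    ari_satellite: satellite-estimated ARI of the overlying IMERG grid cell
    ari_thresh: 95th percentile of historical satellite ARI at that location
    susceptibility_index: static SI, an integer from 1 (very low) to 5 (very high)
    """
    # Step 1: dynamic trigger. No nowcast unless ARI exceeds the threshold.
    if ari_satellite <= ari_thresh:
        return "none"
    # Step 2: static susceptibility decides the nowcast level.
    if susceptibility_index == 5:
        return "high"
    if susceptibility_index in (3, 4):
        return "moderate"
    return "none"  # assumption: SI of 1 or 2 never receives a nowcast


# Hypothetical values: the same rainfall over pixels of differing susceptibility.
levels = [deterministic_nowcast(30.0, 20.0, si) for si in range(1, 6)]
```

Note that the outcome depends on the satellite estimate only through a yes-or-no comparison, so an estimate just below the threshold and one far below it are treated identically.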
This existing version of LHASA is referred to as “deterministic LHASA” in recognition that only the IMERG observation is used, without any estimate of the potential uncertainty associated with it. In this study, deterministic LHASA is run on a daily scale using IMERG for the period 2002–18.
2) Deterministic LHASA using Stage IV and CSGD-median
Deterministic LHASA is additionally run using Stage IV precipitation as input for the period 2002–18. Accordingly, the ARIthresh values used in this run of deterministic LHASA are based on Stage IV data. This simulation is conducted to compare satellite-based LHASA predictions to those produced by ground-reference rainfall. This helps to contextualize how much of LHASA’s predictive power is associated with precipitation uncertainty, as opposed to other possible sources of prediction error.
We also evaluate a version of deterministic LHASA in which the median of the conditional CSGD, rather than the original IMERG observation, is used as the daily ARI value. Using the median or mean of the conditional CSGD corrects for systematic bias in SMPPs, but does not address random error. As will be shown in section 4, using this bias-corrected value actually degrades LHASA’s predictive capability, while considering the full conditional CSGD improves prediction.
3) Probabilistic LHASA with IMERG error modeling
We adapted the LHASA framework so that it is able to incorporate probabilistic precipitation estimates produced by the CSGD error model (section 3a). This modified version is referred to as “probabilistic LHASA” and is shown schematically in Fig. 4b. It should be emphasized that our purpose was not to develop a new model, but rather to make as few modifications to the existing framework as possible to allow it to ingest probabilistic precipitation estimates. Key elements, therefore, including the fixed 95th percentile threshold of ARI, remain in the probabilistic version. Concepts such as probabilistic ARI thresholds or a continuous scale of static susceptibility may be useful, but are beyond the scope of this study.
In step 1, a conditional CSGD for day t is generated for each pixel based on the satellite-observed ARI and the CSGD error model. The conditional CSGD is compared to ARIthresh of the same pixel; P(ARIt > ARIthresh) is the area of the conditional CSGD PDF that lies above ARIthresh (highlighted as the light blue area in Fig. 4b), where ARIt is the true ARI at time t, which is unknown. In the CSGD PDF in Fig. 4b, the IMERG-observed ARI is depicted in red to illustrate how deterministic LHASA evaluates rainfall hazard: by whether the IMERG-observed ARI is above or below ARIthresh. Probabilistic LHASA, in contrast, uses the probability that the actual ARI value, ARIt, is above this threshold given the SMPP observation.
After calculating P(ARIt > ARIthresh), probabilistic LHASA factors in static susceptibility using the global susceptibility map (section 2c) in step 2. No nowcasts are issued to pixels with very low or low susceptibility (SI ≤ 2). P(ARIt > ARIthresh) for pixels with moderate, high, and very high susceptibility (SI ≥ 3) is multiplied by 0.5, 0.75, and 1.0, respectively, to produce a landslide hazard index (LHI). The 0.5 and 0.75 multipliers in probabilistic LHASA were chosen to allow a category 4 (3) hazard nowcast to be issued in high (moderate) susceptibility pixels provided that there is a near-certain probability that ARIt > ARIthresh. Pixels with LHI less than 0.1 are assigned no nowcast. The resulting 0.1–1.0 LHI scale is continuous and reflects a range of moderate to very high landslide hazard that considers precipitation uncertainty. We divide this continuous scale into five equally spaced categories: category 1 LHI = (0.1, 0.28], category 2 LHI = (0.28, 0.46], and so on up to category 5 (Fig. 4b). While these LHI category ranges and the aforementioned 0.5 and 0.75 multipliers are assigned somewhat arbitrarily, landsliding events are too sparse in the study area to justify calibration of these values. We show in section 4 that this uncalibrated formulation nonetheless improves prediction compared with deterministic LHASA.
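The mapping from exceedance probability to nowcast category is simple arithmetic and can be sketched as follows. The multipliers and category edges are those given in the text; the function names are hypothetical, category 0 is used here to denote "no nowcast," and an LHI of exactly 0.1 is assumed to fall in category 1.

```python
from bisect import bisect_left

# Multipliers for moderate (3), high (4), and very high (5) susceptibility.
SI_MULTIPLIER = {3: 0.5, 4: 0.75, 5: 1.0}
# Upper (closed) edges of categories 1-5 on the 0.1-1.0 LHI scale.
CATEGORY_UPPER_EDGES = [0.28, 0.46, 0.64, 0.82, 1.0]


def landslide_hazard_index(p_exceed, susceptibility_index):
    """LHI = multiplier(SI) * P(ARI_t > ARI_thresh); zero for SI <= 2."""
    return SI_MULTIPLIER.get(susceptibility_index, 0.0) * p_exceed


def nowcast_category(lhi):
    """Return 0 for no nowcast, otherwise the category from 1 to 5."""
    if lhi < 0.1:
        return 0
    # bisect_left makes each interval closed on the right, e.g. (0.1, 0.28].
    return min(bisect_left(CATEGORY_UPPER_EDGES, lhi) + 1, 5)


# Near-certain exceedance in pixels of moderate, high, and very high SI.
cats_certain = [nowcast_category(landslide_hazard_index(1.0, si)) for si in (3, 4, 5)]
# Exceedance probabilities of 0.47 and 0.99 with a hypothetical SI of 5.
cats_example = [nowcast_category(landslide_hazard_index(p, 5)) for p in (0.47, 0.99)]
```

With a near-certain exceedance probability, the sketch returns categories 3, 4, and 5 for moderate, high, and very high susceptibility, consistent with the stated rationale for the 0.5 and 0.75 multipliers.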
It should be noted that these nowcast categories 1–5 in probabilistic LHASA are not the same as in deterministic LHASA, which instead issues moderate and high hazard nowcasts. The continuous nature of the LHI scale in probabilistic LHASA may lend itself to a variety of nowcast communication approaches that are not available in deterministic LHASA.
c. LHASA evaluation metrics
To assess the performance of deterministic and probabilistic LHASA, the true positive rate (TPR), false positive rate (FPR), and area-wide false alarm ratio (FAR) are calculated. A nowcast is considered correct if it is issued within 1 km of a recorded landslide the day of or the day before the date of occurrence. A correct nowcast is said to “capture” a landslide. A “pixel-day” refers to a 1-km pixel on any given day in the study period, to which a landslide may be reported or a nowcast may be issued. TPR is the percentage of reported landslides that are correctly detected by a landslide hazard nowcast. FPR is the percentage of pixel-days when a nowcast was issued but should not have been (i.e., no landslide was reported within 1 km). Area-wide FAR is the percentage of pixel-days in which a nowcast is issued but no landslide occurs anywhere in the study area on the day of or the day after the nowcast. Area-wide FAR allows for nowcasts to not be considered false alarms as long as a landslide is reported within the study area. This is relevant because the static and dynamic conditions at a particular location can indeed be hazardous even if no landslide actually occurs, and thus a nowcast in such conditions should not necessarily be considered erroneous. These three evaluation metrics are provided in the form of equations below:
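As a rough computational reading of the verbal definitions above, the sketch below expresses each metric as a percentage of counts. The denominators are assumptions rather than definitions taken from the source: FPR is taken relative to pixel-days with no landslide reported within 1 km, and area-wide FAR relative to all pixel-days on which a nowcast was issued. Function names and the FPR and FAR counts are hypothetical; only the landslide counts (150 and 157 of 214) come from the results reported in this study.

```python
def percentage(numerator, denominator):
    """Ratio expressed as a percentage, guarding against an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def true_positive_rate(landslides_captured, landslides_reported):
    """Share of reported landslides captured by a nowcast."""
    return percentage(landslides_captured, landslides_reported)


def false_positive_rate(false_nowcast_pixel_days, no_landslide_pixel_days):
    """Assumed denominator: pixel-days with no landslide within 1 km."""
    return percentage(false_nowcast_pixel_days, no_landslide_pixel_days)


def area_wide_false_alarm_ratio(nowcast_pixel_days_without_event, nowcast_pixel_days):
    """Assumed denominator: all pixel-days on which a nowcast was issued."""
    return percentage(nowcast_pixel_days_without_event, nowcast_pixel_days)


# Landslide counts reported for deterministic and probabilistic LHASA.
tpr_deterministic = true_positive_rate(150, 214)
tpr_probabilistic = true_positive_rate(157, 214)
# Hypothetical pixel-day counts, for illustration only.
far_example = area_wide_false_alarm_ratio(60, 100)
```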
4. Results
a. CSGD error model
To explore how SMPP error models may be applied to areas without extensive ground-truth precipitation estimates, it is critical to evaluate the robustness of error model parameter estimation. Localized and regional CSGD error models for IMERG observations of ARI were trained over the 2002–06 period using data from all collocated IMERG and Stage IV grid cells in the study region.
To assess the importance of record length in error model parameter estimation, a regional CSGD model trained using the entire 2002–18 period was compared against one trained using only the 2002–06 period (lines in Fig. 5a). These two models are nearly indistinguishable. Additional models were fit using 5-yr subsets from 2003 to 2018 (i.e., 2003–07, 2004–08, etc.; shaded areas of Fig. 5a). Only modest differences from the 2002–06 and the 2002–18 models are visible. These results suggest that relatively short data records can produce satisfactory parameter estimates.
One hundred additional CSGD error models were fitted to assess the model variability that would result if only spatially incomplete rainfall data were available. Each error model was trained on ARI data (for 2002–06) from 10 pixels randomly selected from the study region. The spread of the uncertainty estimates from these 100 models (shaded areas in Fig. 5b) is very similar to that produced by the 2002–06 regional model (lines in Fig. 5b), and nearly identical for IMERG-observed ARI below 30 mm. Thus, although the regional CSGD model utilizes all 8640 collocated Stage IV and IMERG pixels in the study region, such a model can be effectively approximated using data from far fewer pixels. The 2002–06 regional model with all pixels is used in the evaluation of probabilistic LHASA (section 4b).
Localized CSGD models fitted for every single pixel in the domain yielded uncertainty estimates (shaded areas in Fig. 5c) that can vary widely from the regional model (lines in Fig. 5c). This highlights that substantial variation in IMERG-observed ARI error properties can exist throughout the study region (Fig. 6), which is ignored in the regional model. While radar beam patterns and umbrellas are visible in the localized model outputs in Fig. 6, the regional model smooths out such artifacts of radar estimation, demonstrating one benefit of a regional approach. The regional approach also reduces sampling error because it uses a larger sample of precipitation estimates during model calibration. The implicit downscaling of IMERG uncertainty in the CSGD model training process results in localized models that reflect the very specific relationships between precipitation in individual Stage IV grid cells and the IMERG estimates covering those grid cells. For this reason, radar artifacts indicating beam blockage are visible in the localized model results in Fig. 6. Regardless, we show in section 4b that once uncertainty estimates are integrated into the LHASA framework, the lack of spatial variability in the regional error model approach is of virtually no consequence.
b. LHASA model performance
Fifteen landslides were reported in the study region on 5 May 2003. For this day, deterministic LHASA issues nowcasts encompassing all landslides (Fig. 7a). The same is true for probabilistic LHASA using the regional CSGD error model (Fig. 7b). The spatial distributions of nowcasts differ, however. Deterministic LHASA issues high hazard nowcasts throughout the study area, including a large part of the north where no landslides were reported. Probabilistic LHASA, on the other hand, issues category 3–5 nowcasts in the central part of the study area where most landslides were reported; nowcast categories were generally low (1–3) farther away from reported landslides.
Conditional CSGDs for two 0.1° IMERG pixels demonstrate how deterministic and probabilistic LHASA differ in their evaluation of IMERG-observed ARI (Figs. 7c,d; locations shown in Figs. 7a,b). IMERG-observed ARI for both locations is above ARIthresh. Deterministic LHASA therefore issues moderate hazard (high hazard) nowcasts for all areas within the two pixels that have moderate to high (very high) static susceptibility, in accordance with Fig. 4a. Probabilistic LHASA, on the other hand, estimates the probability that the true ARIt is above ARIthresh by calculating P(ARIt > ARIthresh). For location 1, P(ARIt > ARIthresh) = 0.47, while for location 2, P(ARIt > ARIthresh) = 0.99 (Figs. 7c,d). For this reason, probabilistic LHASA issues higher category nowcasts for location 2, in accordance with Fig. 4b.
Over the 2002–18 study period, deterministic (probabilistic) LHASA using IMERG captures 150 (157) of 214 reported landslides (Figs. 8a,b). TPR is fairly evenly dispersed among all nowcast categories for probabilistic LHASA (Fig. 8b). FPR for category 5 nowcasts from probabilistic LHASA is an order of magnitude lower than the FPR for high hazard nowcasts in deterministic LHASA (Figs. 8c,d). With the exception of category 1, FAR for all probabilistic LHASA nowcast categories is less than 55%. This is lower than the FAR of both moderate and high hazard nowcasts from deterministic LHASA (both have a FAR of 60%), meaning that probabilistic LHASA category ≥ 2 nowcasts are less likely to be issued than moderate or high hazard nowcasts by deterministic LHASA when conditions do not actually produce landsliding in the region (Figs. 8e,f). The relatively high FPR and FAR for category 1 nowcasts from probabilistic LHASA are expected since these nowcasts are issued when there exists a nonzero but low probability of ARI exceeding ARIthresh.
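For readers less familiar with these scores, the sketch below computes TPR, FPR, and FAR from contingency-table counts using their standard definitions; the exact bookkeeping used in this study (for example, how nowcast pixels and days are counted) may differ. The hit and miss counts follow from the landslide totals quoted above, whereas the false alarm and correct negative counts are invented placeholders.

```python
def skill_scores(hits, misses, false_alarms, correct_negatives):
    """Return (TPR, FPR, FAR) from contingency-table counts."""
    tpr = hits / (hits + misses)                           # true positive rate
    fpr = false_alarms / (false_alarms + correct_negatives)  # false positive rate
    far = false_alarms / (hits + false_alarms)             # false alarm ratio
    return tpr, fpr, far


# Hits and misses from the text: 150 (157) of 214 landslides captured.
# false_alarms and correct_negatives are hypothetical values for illustration.
tpr_det, fpr_det, far_det = skill_scores(150, 64, 300, 9700)
tpr_prob, fpr_prob, far_prob = skill_scores(157, 57, 300, 9700)
print(f"TPR deterministic: {tpr_det:.2f}, TPR probabilistic: {tpr_prob:.2f}")
```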
Deterministic LHASA will never issue a nowcast when the IMERG-observed ARI is less than ARIthresh, which is problematic given the prevalence of large random errors in SMPPs, particularly during extreme rainfall events. In contrast, probabilistic LHASA evaluates P(ARIt > ARIthresh) to determine the probability of hazardous ARI. Though P(ARIt > ARIthresh) may not be high when the IMERG-observed ARI is below ARIthresh, probabilistic LHASA can still generate a category 1 or 2 nowcast.
Differences between probabilistic LHASA using the regional and localized CSGD models are negligible (Figs. 8b,d,f). Probabilistic LHASA captures the same number of landslides regardless of which CSGD error model is used, while the localized CSGD error model within probabilistic LHASA produces a slightly lower total FPR (Fig. 8d). Even though the two models generate different P(ARIt > ARIthresh) estimates, resulting in slight differences in which nowcast category is issued, each model is able to broadly identify cases when the IMERG-observed ARI is associated with greater uncertainty and when ARIt has a nonnegligible probability of exceeding ARIthresh.
Deterministic LHASA using the CSGD median performs poorly, capturing only 110 out of 214 reported landslides (Fig. 8a). Recall from section 3b(2) that the median or mean from the CSGD error model removes systematic bias, but does not reflect random error. This effectively eliminates any high values of ARI, resulting in fewer nowcasts and fewer landslides captured. Although FPR and FAR are low in this case relative to other models (Figs. 8c,e), this is outweighed by an unsatisfactory TPR.
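The effect described above can be sketched with a small Monte Carlo experiment: for a skewed, censored gamma-type distribution, the median can sit well below a threshold even though a nonnegligible share of the distribution lies above it. The parameter values and threshold below are invented for illustration and do not come from the fitted error model.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Invented CSGD-like parameters and threshold (not the study's values).
shape, scale, shift = 2.0, 20.0, -5.0
ari_thresh = 60.0

# Draw from a gamma distribution, shift it, and censor at zero.
samples = [max(0.0, random.gammavariate(shape, scale) + shift)
           for _ in range(20000)]

median_ari = statistics.median(samples)
p_exceed = sum(s > ari_thresh for s in samples) / len(samples)

# A threshold test on the median alone would issue no nowcast here,
# although the sampled distribution still exceeds the threshold at times.
median_triggers_nowcast = median_ari > ari_thresh
print(median_triggers_nowcast, round(p_exceed, 2))
```

Feeding only the median into a threshold test discards the upper tail, which is consistent with the drop in captured landslides reported for this configuration.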
Deterministic LHASA using Stage IV captures 193 out of 214 landslides, with lower FAR and FPR than deterministic LHASA using IMERG. This represents the maximum performance achievable when LHASA is forced with the best available precipitation dataset. In terms of TPR, probabilistic LHASA is closer to this “optimal” performance than either of the IMERG-based deterministic models. Because of the conceptual differences in nowcast categories between the deterministic and probabilistic versions of LHASA, it is difficult to directly compare FAR and FPR between the two versions.
c. Spatial variation in nowcasts
Nowcast rates over the study region, calculated as the percentage of days on which nowcasts are issued for 2002–18, are shown in Fig. 9. Nowcast rates for all models are zero in areas of low or very low static susceptibility (see Fig. 2). The nowcast rate for deterministic LHASA using IMERG (Stage IV) is 6%–8% (4%–6%) for most of the study region in pixels with moderate or higher static susceptibility, which is logical since the observed ARI exceeds ARIthresh on 5% of days. Probabilistic LHASA using the regional CSGD model has higher nowcast rates in the mountainous terrain in the center and northeast of the study region and lower rates in the less-steep northwest (Fig. 9c). Probabilistic LHASA using the localized CSGD model increases nowcast rates along the edges of the mountainous terrain and decreases nowcast rates in the center, reflecting differing levels of uncertainty in IMERG estimates in these two regions (Fig. 9d).
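The nowcast rate defined above is simple to compute for a single pixel. The sketch below uses a synthetic 100-day ARI series and an assumed threshold, chosen so that the series exceeds the threshold on 5% of days; none of the values are study data.

```python
def nowcast_rate(daily_flags):
    """Percentage of days on which a nowcast was issued for one pixel."""
    if not daily_flags:
        return 0.0
    return 100.0 * sum(daily_flags) / len(daily_flags)


# Synthetic ARI series (1, 2, ..., 100) and an assumed threshold of 95.
ari_series = [float(day) for day in range(1, 101)]
ari_thresh = 95.0

# Deterministic rule for a susceptible pixel: nowcast when ARI > threshold.
flags = [ari > ari_thresh for ari in ari_series]
rate = nowcast_rate(flags)
print(rate)  # 5.0
```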
In both cases, nowcast rates from probabilistic LHASA exhibit more spatial variability than deterministic LHASA. While all moderate to very high susceptibility areas within a 0.1° IMERG grid cell will have the same nowcast rate in deterministic LHASA, the nowcast rate in probabilistic LHASA will vary according to static susceptibility (Fig. 9). For instance, if the IMERG-observed ARI in a 0.1° pixel barely exceeds ARIthresh, deterministic LHASA will issue nowcasts for all areas in that pixel with moderate to very high susceptibility, as per Fig. 4a. Probabilistic LHASA, on the other hand, calculates P(ARIt > ARIthresh) to determine which static susceptibility level merits a nowcast. A probability P(ARIt > ARIthresh) = 0.2, for example, would result in nowcasts being issued only for high and very high static susceptibility pixels, as per Fig. 4b.
When category 1 nowcasts, which imply relatively modest landslide hazard, are excluded, the nowcast rate of probabilistic LHASA drops by approximately 50% (Figs. 9e,f). Results show that when probabilistic LHASA issues nowcasts to an area but deterministic LHASA does not, these nowcasts are moderate hazard (category 1) and are issued because probabilistic LHASA predicts a low but nonzero probability of landslide hazard (see nowcasts issued in the northeast of the study region in Figs. 7a,b).
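The deterministic behavior described above can be written as a short decision function. This is a sketch based only on the description in the text (no nowcast for low or very low susceptibility, moderate hazard for SI = 3 and SI = 4, high hazard for SI = 5); the function name, the integer coding of SI from 1 to 5, and the string labels are assumptions. The probabilistic tree of Fig. 4b is not reproduced here.

```python
def deterministic_nowcast(ari_observed, ari_thresh, si):
    """Nowcast level for one area under the deterministic decision tree.

    si is the static susceptibility index, assumed to run from
    1 (very low) to 5 (very high).
    """
    if ari_observed <= ari_thresh or si < 3:
        return "none"
    return "high" if si == 5 else "moderate"


# An ARI that barely exceeds the threshold triggers the same nowcast
# for every moderate or high susceptibility area in the pixel.
ari_thresh = 60.0  # invented value
for si in (2, 3, 4, 5):
    print(si, deterministic_nowcast(60.1, ari_thresh, si))
```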
5. Discussion
a. CSGD error models with limited ground-truth data
CSGD model training results (section 4a) demonstrate that relatively few ground-reference data are needed to approximate the regional SMPP error model trained on abundant ground data. Specifically, 10 randomly selected pixels in the study area are found to be sufficient to approximate a regional CSGD error model of ARI trained over all 8640 pixels (Fig. 5b). This extends the applicability of the regional CSGD error model incorporated in probabilistic LHASA to data-limited regions with few ground reference records, and indicates that SMPP uncertainty can still be accounted for in the LHASA model in such regions, provided that at least some ground truth precipitation data, such as multiyear rain gauge records, are available.
Localized CSGD error models trained on individual IMERG and Stage IV pixels reveal that there can be a wide range of variation in the uncertainty around IMERG-based ARI across the study region (Fig. 5c). Such variation could imply that the regional approach may not adequately characterize the uncertainties associated with IMERG-based ARI at any particular location, but could also stem at least in part from sampling error. True regional variability would argue in favor of localized error modeling; regional variability resulting from sampling error (e.g., insufficient record length to sample the extreme tail of the local precipitation distribution) would argue in favor of a regional approach. The consequence of differences between regional and localized CSGD error models is not readily apparent in comparing the CSGD model results (i.e., Fig. 5c), but is best evaluated through the resulting performance of probabilistic LHASA, which is discussed below. More work would be needed to verify whether or not the conclusions regarding localized versus regional error models of 7-day ARI would pertain to error models of precipitation at shorter time steps (e.g., daily, hourly, etc.).
Even though regional and localized error models generate varying estimates of ARI uncertainty (Fig. 5c), their incorporation into the probabilistic LHASA model produces nearly identical results: improved TPR relative to deterministic LHASA and lower FAR in all but the lowest nowcast category (Fig. 8). This indicates that the mere inclusion of IMERG uncertainty in LHASA is more critical than whether this uncertainty is estimated based on broader regional error characteristics or localized error characteristics.
b. Deterministic and probabilistic LHASA model performance
Probabilistic LHASA outperforms deterministic LHASA and provides greater gradation in landslide hazard using a continuous nowcast scale. Since probabilistic LHASA models using localized and regional CSGD error models perform similarly (section 5a), discussion of probabilistic LHASA in this section refers to the results using the regional error model.
The superior performance of deterministic LHASA using Stage IV confirms that accurate precipitation data can significantly improve LHASA’s performance (Figs. 8a,c,e). The Stage IV–based model misses only 10% (21 of 214) of the landslides in the study period. These missed landslides are attributable to some combination of deficiencies in Stage IV, the LHASA model structure, the susceptibility map, or the landslide inventory. Deterministic LHASA using IMERG, in contrast, misses 64 of 214 landslides (30%). This is improved somewhat by incorporating the IMERG uncertainty information via probabilistic LHASA (57 of 214 landslides missed; 27%). This modest improvement in TPR is accompanied by reductions in FAR and FPR associated with higher hazard (category 2–5) nowcasts and more realistic spatial distributions of nowcasts.
While probabilistic LHASA improves total TPR and landslides captured relative to its deterministic counterpart (Figs. 8a,b), comparing the distribution of TPR across nowcasts is less straightforward since the two model frameworks use different nowcast categories. For deterministic LHASA, TPR for high hazard nowcasts is higher than for moderate hazard nowcasts, and high hazard nowcasts can thus be said to have higher skill. On the other hand, TPR for probabilistic LHASA is more evenly dispersed across nowcast categories. Though this suggests equal skill in all nowcast categories, the FAR and FPR indicate far fewer instances of category 3–5 nowcasts being issued in error. While moderate hazard and high hazard nowcasts from deterministic LHASA are equally likely to be a false alarm (Fig. 8e), the FAR for probabilistic LHASA nowcasts decreases above category 1 such that higher category nowcasts are less likely to be false alarms (Fig. 8f). When category 1 nowcasts are excluded, the total FPR of probabilistic LHASA is half that of deterministic LHASA (Fig. 8d). Probabilistic LHASA category 4 and 5 nowcasts have much lower FPRs and FARs than high hazard nowcasts from deterministic LHASA, which is important for stakeholders seeking to establish how confident they can be in a nowcast. Probabilistic LHASA’s ability to issue low category nowcasts even when the IMERG-observed ARI is below ARIthresh could be particularly beneficial in regions with high random error (such as mountainous terrain with finescale heterogeneities) or where satellite observations frequently underestimate heavy precipitation events (Derin et al. 2019).
Eighty-six percent of reported landslides in the study area occur in moderate and high static susceptibility pixels (Table 1). This suggests that it would be reasonable for moderate and high static susceptibility areas (from SI = 3 to SI = 4) to merit high hazard nowcasts under certain conditions, which is not possible in deterministic LHASA because its decision tree only permits high hazard nowcasts to be issued in areas with SI = 5 (Fig. 4a). The deterministic LHASA framework could be modified to allow issuance of high hazard nowcasts in areas with SI = 4, either at the cost of greatly increasing the associated FPR and FAR or by introducing some moderating factor to decide whether a high susceptibility pixel is issued a moderate or high hazard nowcast. Probabilistic LHASA allows the nowcast category to vary based on the probability of hazardous ARI (Fig. 4b). For example, a category 4 nowcast can be issued to areas with SI = 4 when P(ARIt > ARIthresh) is greater than 0.85. Thus, probabilistic LHASA is able to differentiate between two locations that have the same static susceptibility and for which the IMERG-observed ARI exceeds ARIthresh, but which have different probabilities of experiencing hazardous precipitation (Figs. 7c,d).
The poor performance of deterministic LHASA using the CSGD-median of IMERG reveals that removing systematic bias in SMPPs can actually significantly worsen model performance, if the potential for random error is neglected. These results emphasize the importance of considering the range of random error present in SMPP observations in addition to systematic bias when accounting for SMPP uncertainty, especially for models of environmental processes driven by extreme precipitation (such as floods and landslides).
c. Spatial nowcast variations
The largest differences in nowcast rates between deterministic and probabilistic LHASA occur in the steepest terrain (Fig. 9), where uncertainty-prone extreme precipitation events occur most frequently. Probabilistic LHASA’s incorporation of the relatively wide range of error surrounding extreme events results in more frequent nowcasts. It is also noteworthy that probabilistic LHASA does not uniformly increase the nowcast rate across the entire study region, but actually decreases the nowcast rate in the lower-elevation northwest, which experiences positive systematic bias from IMERG (Fig. 6) and receives less extreme rainfall. Even though the evaluation statistics for regional and localized CSGD models are very similar (Fig. 8), the spatial patterns of nowcast rate for these two models across the study area are distinct (Figs. 9c,d). Although probabilistic LHASA is issuing more nowcasts than deterministic LHASA (Figs. 9a,c,d), the two models’ nearly identical overall FAR implies that probabilistic nowcasts are more likely to be issued on days when conditions actually produce landsliding in the region. Further, if a user chose to only consider category 2–5 nowcasts from probabilistic LHASA, the nowcast rate and associated FAR and FPR drop significantly (Figs. 8d,f, 9e,f).
6. Summary and conclusions
Satellite multisensor precipitation products (SMPPs) offer promise for environmental modeling in regions that lack ground-based sources of precipitation information. This promise has gone largely unfulfilled, however, mainly due to the high uncertainty of satellite-based precipitation estimates. This study demonstrates how accounting for SMPP uncertainty can improve predictions from a landslide hazard model over the mountainous southeastern United States. Historical records of Integrated Multisatellite Retrievals for GPM (IMERG) and ground-based precipitation are used to train an error model that characterizes uncertainty in current and antecedent rainfall. A probabilistic version of NASA’s Landslide Hazard Assessment for Situational Awareness (LHASA) model is developed that can translate this rainfall uncertainty into near-real-time landslide nowcasts. The hazard nowcast scale is continuous, though we transform it into a simple five-category scale. Run retrospectively using IMERG, the probabilistic version of LHASA performs well in the study region, capturing more landslides than the existing deterministic version with lower false positive rates and false alarm ratios in high hazard nowcast categories. Using the error model to generate only the “best guess” (e.g., median) antecedent rainfall as input for the deterministic LHASA model leads to worse model performance, highlighting that removal of systematic bias is not sufficient for addressing SMPP uncertainty in modeling. Instead, results demonstrate the need to account for the range of SMPP random error.
Data-limited regions have few available ground records on which to train SMPP error models. We show that while ground truth precipitation data are indeed necessary for error model training, such data need not be ubiquitous and a few ground truth records can go a long way. Nearly identical improvements in landslide hazard model performance can be obtained by using a regionalized parameter training scheme instead of a data-intensive localized training scheme, and this regionalized scheme can be robust with relatively few training data. These findings suggest accounting for SMPP uncertainty within a landslide hazard model is viable even in data-limited regions.
Though the probabilistic LHASA model framework was developed over a limited region in the United States, its performance is promising for application to other regions. The probabilistic nowcast scale, which can be subdivided into discrete categories indicating varied levels of landslide hazard based on factors such as number of desired categories or a desired trade-off between true positive rate and false positive rate, provides a flexible way to adapt landslide hazard nowcast categories to stakeholder requirements. Though the nowcast categories in probabilistic LHASA are presented here as equally subdivided portions of a continuous scale (Fig. 4b), this scale could be divided in any number of ways. This flexibility could be used to optimize specific performance metrics such as TPR or FPR. A stakeholder could, for example, use four categories instead of five, and specify that the highest category have a minimum TPR of 0.25 or a maximum FPR of 0.001. This is consistent with the original goal of the LHASA framework—to provide a flexible model for providing situational awareness of landslide hazard that can be fine-tuned as required (Kirschbaum et al. 2015b; Kirschbaum and Stanley 2018).
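The flexibility described above can be illustrated with a small binning function. The sketch assumes that the continuous nowcast scale is normalized to the interval from 0 to 1, with 0 meaning that no nowcast is issued; that normalization, the function name, and the example bin edges are all assumptions for illustration rather than details taken from Fig. 4b.

```python
import bisect


def categorize(score, upper_edges):
    """Return the 1-based nowcast category for a continuous hazard score.

    upper_edges lists the ascending upper bounds of each category
    (inclusive); a score of zero or less returns 0, meaning no nowcast.
    """
    if score <= 0.0:
        return 0
    index = bisect.bisect_left(upper_edges, score)
    return min(len(upper_edges), index + 1)


# Five equal categories, as in the scale presented in this study.
five_equal = [0.2, 0.4, 0.6, 0.8, 1.0]
# A hypothetical four-category scale with a narrow lowest class.
four_custom = [0.1, 0.3, 0.7, 1.0]

for score in (0.0, 0.1, 0.5, 0.95):
    print(score, categorize(score, five_equal), categorize(score, four_custom))
```

A stakeholder-specific scale would then amount to choosing the bin edges, for example by adjusting them until the highest category meets a target TPR or FPR on a historical record.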
Accounting for SMPP uncertainty would allow the LHASA model to better utilize SMPPs globally and in data-limited regions. Since precipitation forecasts such as those from the Global Forecasting System (Saha et al. 2006) are generally produced as ensembles, this methodology could be used to incorporate probabilistic precipitation forecasts as inputs into the probabilistic LHASA model to generate landslide hazard forecasts. We anticipate that other environmental prediction models such as the Global Flood Monitoring System (Wu et al. 2012) and the Global Land Data Assimilation System (GLDAS; Fang et al. 2009) could likely benefit from deeper consideration of SMPP uncertainty and error—both systematic and random.
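One simple way an ensemble forecast could supply the exceedance probability is the fraction of members above the threshold. This is only one possibility and is not a method described in this study; the member values and threshold below are invented.

```python
def ensemble_exceedance(members, threshold):
    """Fraction of ensemble members whose ARI exceeds the threshold."""
    if not members:
        raise ValueError("ensemble must contain at least one member")
    return sum(m > threshold for m in members) / len(members)


# Hypothetical 10-member ensemble of forecast ARI values for one pixel.
forecast_ari = [42.0, 55.5, 61.0, 48.3, 70.2, 58.9, 63.4, 39.7, 66.1, 52.0]
ari_thresh = 60.0  # invented threshold

p_forecast = ensemble_exceedance(forecast_ari, ari_thresh)
print(p_forecast)  # 0.4
```

The resulting probability could play the same role as P(ARIt > ARIthresh) in the probabilistic decision tree, although small ensembles give only coarse probability estimates.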
In this study, we found it necessary to modify (albeit modestly) the operational LHASA framework to accommodate IMERG uncertainty estimates, as opposed to inserting those estimates directly into the existing version. Probabilistic precipitation estimates are likely to become more common in the near future. To realize the full benefits of probabilistic representations of precipitation, we believe that it will be necessary to seriously consider how existing environmental models can be adapted to accommodate such estimates, or how new models could be developed that are explicitly designed around such estimates. We acknowledge that this may be very challenging for more complex physically based models. Given the centrality of precipitation (and thus, precipitation errors) to so many phenomena, however, relatively simple models that make effective use of precipitation uncertainty could prove useful. In addition, as our experience with LHASA has shown, it may be necessary to rethink how outputs from such models would be produced and interpreted. This study shows that quantifying satellite precipitation uncertainty and converting SMPPs from an unknown to a known source of uncertainty can significantly improve model performance, even if it remains impossible to derive the true precipitation from SMPP observations.
Acknowledgments. S.H. Hartke was supported by the NASA Earth and Space Science Fellowship Program (Grant 80NSSC18K1321) and the Arthur H. Frazier Fellowship at the University of Wisconsin–Madison. D.B. Wright, D.B. Kirschbaum, T.A. Stanley, and Z. Li were supported by the NASA Precipitation Measurement Mission Program (Grant NNX16AH72G). We thank Ana Barros for sharing rain gauge data from the region, which we used to confirm the accuracy of the Stage IV dataset. We also thank the organizers of the 12th International Precipitation Conference for their contributions toward publication fees. The generous support by NSF (Conference Grant EAR-1928724) and NASA (Conference Grant 80NSSC19K0726) to organize IPC12 and produce the IPC12 Special Collection of papers is gratefully acknowledged.