This paper uses statistical and statistical–dynamical methodologies to select, from the full observational record, a minimal subset of dates that would provide a representative sampling of local precipitation distributions across the contiguous United States (CONUS). The CONUS region is characterized by a great diversity of precipitation-producing systems, mechanisms, and large-scale meteorological patterns (LSMPs), which can provide a favorable environment for local precipitation extremes. This diversity is unlikely to be adequately captured by methodologies that rely on grossly reducing the dimensionality of the data (by representing it in terms of a few patterns evolving in time) and thus requires data thinning techniques based on high-dimensional dynamical or statistical data modeling. We have built a novel high-dimensional empirical model of temperature and precipitation capable of producing statistically accurate surrogate realizations of the observed 1979–99 (training period) evolution of these fields. This model also provides skillful hindcasts of precipitation over the 2000–20 (validation) period. We devised a subsampling strategy based on the relative entropy of the empirical model’s precipitation (ensemble) forecasts over CONUS and demonstrated that it generates a set of dates that captures a majority of high-impact precipitation events, while substantially reducing a heavy-precipitation bias inherent in an alternative methodology based on the direct identification of large precipitation events in the Global Ensemble Forecast System (GEFS), version 12, reforecasts. The impacts of data thinning on the accuracy of precipitation statistical postprocessing, as well as on the calibration and validation of the Hydrologic Ensemble Forecast Service (HEFS) reforecasts, are yet to be established.
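As a rough illustration of relative-entropy-based subsampling, the sketch below scores each date by the Kullback–Leibler divergence between a binned ensemble forecast and a binned climatological sample at a single location, then keeps the highest-scoring dates. The abstract does not specify how the relative entropy is defined or how it is aggregated over CONUS, so the function names (`relative_entropy`, `select_dates`), the binning, the smoothing constant `eps`, and the top-`n_dates` selection rule are all assumptions made for illustration and should not be read as the paper's method.

```python
import numpy as np

def relative_entropy(forecast_members, climatology, bin_edges, eps=1e-6):
    """KL divergence D(p || q) of a binned forecast ensemble (p)
    relative to a binned climatological sample (q). Hypothetical helper."""
    lo, hi = bin_edges[0], bin_edges[-1]
    # Clip so that every value falls inside the histogram range.
    p, _ = np.histogram(np.clip(forecast_members, lo, hi), bins=bin_edges)
    q, _ = np.histogram(np.clip(climatology, lo, hi), bins=bin_edges)
    p = p / p.sum()
    q = q / q.sum()
    # Additive smoothing (an assumption) avoids log(0) in empty bins.
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))

def select_dates(ensembles, climatology, bin_edges, n_dates):
    """Return indices of the n_dates forecasts that depart most from
    climatology, together with all scores. Hypothetical selection rule."""
    scores = np.array(
        [relative_entropy(e, climatology, bin_edges) for e in ensembles]
    )
    order = np.argsort(scores)[::-1]  # highest relative entropy first
    return order[:n_dates], scores
```

Under these assumptions, a forecast whose distribution matches climatology scores near zero, while a forecast shifted toward heavier precipitation scores higher and is retained first.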
High-impact weather events are usually associated with extreme precipitation, which is notoriously difficult to predict, even with high-resolution, state-of-the-art numerical weather prediction models based on physical first principles. The same is true for statistical models that use past data to anticipate the behavior likely to follow from an observed initial state. Here we use both types of models to identify states in the historical climate record that are likely to lead to extreme precipitation events. We show that the overall statistics of precipitation over the contiguous United States are encapsulated in a greatly reduced set of such states, which could substantially alleviate the computational burden of testing the hydrological forecast models used for decision support.
© 2022 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).