1. Introduction
Stochastic weather generators (SWGs) were primarily introduced to simulate daily meteorological variables, namely precipitation and temperature, that replicate statistical properties of the observed data at the location in question. SWGs are especially useful tools for hydrologists, climate scientists, agriculturalists, ecologists, planners, engineers, and practitioners in related fields given missing meteorological data or an interest in ensemble statistics (e.g., for uncertainty analysis). The development of SWGs often begins with the precipitation process since most other meteorological variables depend on whether or not precipitation occurred, and the addition of air temperature is a natural next step. SWGs are constructed to work on a point scale, but to further capture variations between sites or examine hydrologic or climate change impacts on a broader scale the methods need to be generalized to multiple sites. Generalization to multiple sites has its own set of challenges, especially as the number of sites increases.
Wilks (1998) introduced the widely known multisite generalization model of precipitation occurrence and amount based on chain-dependent processes (a two-state, first-order Markov chain for occurrence and a mixed exponential distribution for amount) that were described in Todorovic and Woolhiser (1975) and later applied in Richardson (1981). The generalization drives the model of each individual site within the domain with spatially correlated yet serially independent random vectors (Wilks 1998). With this method, each site retains its own statistical properties while maintaining realistic correlations with the neighboring sites.
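The occurrence part of this idea can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of Wilks (1998): it assumes a two-state, first-order chain at each site, and the function name `simulate_occurrence`, the transition probabilities, and the forcing correlation matrix `corr` are hypothetical values chosen for the example. In practice the forcing correlations would need to be tuned so that the simulated occurrence correlations match the observed ones.

```python
import numpy as np
from math import erf, sqrt


def simulate_occurrence(p01, p11, corr, n_days, seed=0):
    """Simulate wet (1) / dry (0) days at several sites.

    p01[k]: P(wet today | dry yesterday) at site k
    p11[k]: P(wet today | wet yesterday) at site k
    corr:   correlation matrix of the Gaussian forcing (assumed given)
    """
    rng = np.random.default_rng(seed)
    p01 = np.asarray(p01, dtype=float)
    p11 = np.asarray(p11, dtype=float)
    n_sites = p01.size
    chol = np.linalg.cholesky(corr)
    # Standard normal CDF, applied elementwise.
    norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))

    wet = np.zeros((n_days, n_sites), dtype=int)
    state = np.zeros(n_sites, dtype=int)  # start every site dry
    for t in range(n_days):
        # Spatially correlated, serially independent Gaussian vector.
        w = chol @ rng.standard_normal(n_sites)
        u = norm_cdf(w)  # correlated uniform variates, one per site
        # Each site keeps its own transition probabilities.
        p_wet = np.where(state == 1, p11, p01)
        state = (u < p_wet).astype(int)
        wet[t] = state
    return wet


# Two hypothetical sites with different climatologies but shared forcing.
corr = np.array([[1.0, 0.8], [0.8, 1.0]])
wet = simulate_occurrence([0.2, 0.3], [0.6, 0.7], corr, n_days=5000)
site_corr = np.corrcoef(wet[:, 0], wet[:, 1])[0, 1]
```

Each site follows its own Markov chain, so its wet-day frequency and persistence are set by its own parameters, while the shared forcing vector is what ties wet and dry days together across sites.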
Wilks (1999b) expanded on the multisite generalization method presented in Wilks (1998) by applying the method over an area with complex terrain in the western United States. In addition, the method was expanded to include daily maximum and minimum temperature and solar radiation following Richardson (1981). The Richardson (1981) method involves simulating the residuals of air temperature and solar radiation. Fitted correlation functions were used to capture the seasonal variations in the study area, and this multisite generation was able to model the precipitation over complex terrain while preserving the spatial correlations found in nature (Wilks 1999b). Later, Wilks (2009) showed the practicality of a spatially coherent SWG that interpolated parameters for single sites as described in Wilks (2008). In addition, the study was able to synchronize the gridded synthetic data to true weather data at reference stations within the domain and provide more realistic simulations for hydrologic purposes.
Caraway et al. (2014) developed a nonparametric multisite SWG using the k-nearest neighbor resampling approach. This model uses clustering of homogeneous sites in addition to Markov chain states to simulate precipitation at multiple sites within a heterogeneous watershed. Most present-day weather generators are parametric and are based on the work of Richardson (1981) and Wilks (1998), including the stochastic harmonic autoregressive parametric weather generator (SHArP; Smith et al. 2017) and Multi-site Weather Generator of École de Technologie Supérieure (MulGETS; Chen et al. 2014). An advantage of a nonparametric weather generator is its ability to capture nonlinear variability that linear parametric SWGs miss. Kleiber et al. (2012) introduced a generalized linear model (GLM) that uses spatial Gaussian processes to model the statistical parameters of precipitation over a domain. A similar GLM for maximum and minimum temperature was also developed and is described in Kleiber et al. (2013). Verdin et al. (2015) combined the methods in Kleiber et al. (2012) and Kleiber et al. (2013) and developed a GLM-based weather generator that can be applied to any area regardless of data density or availability.
Existing multisite SWG research has focused primarily on the precipitation component of the generator and the complexities associated with generating it realistically at multiple sites. There is comparatively minimal focus on the temperature component, but temperature fidelity is important for determining the phase (snow vs rain) and fate of winter precipitation over complex terrain. Many previous studies involving the temperature component have opted to use the method applied in Richardson (1981), which was generalized to multiple sites over complex terrain in Wilks (1999b). This method involves prescribing and then removing the means and standard deviations of the temperature values (separating the dry and wet days) in advance and generating the temperature residuals. The resulting mean temperatures switch abruptly between the dry- and wet-state values due to the prescribed means. To overcome this limitation and provide a more realistic temporal evolution of daily weather, Smith et al. (2017) introduced the SHArP weather generator, which simulates autocorrelated temperatures directly without prescribing mean values in advance. The extension of existing multisite SWGs usually involves a conditioning of temperature on precipitation occurrence, but this is largely sitewise, meaning variations in the larger-scale spatial pattern of precipitation do not affect the temporal evolution or spatial covariance of temperature.
In this study, we present a multisite generalization of SHArP. The mathematical formulation follows that of the single-site, single-temperature case in Smith et al. (2017), but major differences stem from how objectively identified precipitation-occurrence patterns (POPs) impact temperature stochasticity and autocorrelation. In the single-site, single-temperature case, we used a temporally varying noise coefficient that depended on whether the given day was wet or dry at the site. Having multiple sites introduces between-site covariances that are found to depend on POPs whose number increases as $2^M$ for M sites. To circumvent this situation, we use empirical orthogonal function analysis to objectively categorize the POPs, yielding a compact set of matrices for driving between-site temperature covariance. An example application in the western United States is used to illustrate the fidelity of the framework in complex terrain where precipitation patterns can change markedly over the study domain, and the simulation period is 1950–2100 to illustrate how trends are handled.
2. Data and study area
SHArP was initially developed, tested, and presented in an entirely observational context (Smith et al. 2017). We use observations again here and incorporate statistically downscaled historical and future climate model output in part because the principal anticipated application of SHArP is simulation of future weather under climate change. From the Global Historical Climatology Network (Menne et al. 2012), we use daily precipitation, minimum temperature, and maximum temperature from 19 sites over northern Utah and southwestern Wyoming (Fig. 1) during 1950–2005. Additional available stations that had 10% or more missing values over this period were eliminated.
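The station screening described above can be sketched as follows. This is a hedged illustration only: the function name `screen_stations`, the array layout (days by stations, with NaN marking missing values), and the toy data are assumptions, and the text does not say whether the 10% threshold was applied per variable or jointly.

```python
import numpy as np


def screen_stations(data, max_missing_frac=0.10):
    """Return column indices of stations with less than max_missing_frac NaN.

    data: array of shape (n_days, n_stations), NaN marking missing days.
    """
    missing_frac = np.isnan(data).mean(axis=0)
    # Stations at or above the threshold are eliminated.
    return np.flatnonzero(missing_frac < max_missing_frac)


# Toy record: 20 days at 3 stations. Station 1 is missing 2 of 20 days
# (exactly 10%) and station 2 is missing 5 of 20 days (25%).
obs = np.ones((20, 3))
obs[:2, 1] = np.nan
obs[:5, 2] = np.nan
kept = screen_stations(obs)  # only station 0 survives
```

The strict inequality mirrors the wording "10% or more", so a station missing exactly 10% of its values is dropped.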
We also use 0.125° bias-corrected constructed analogs (BCCA) of daily CCSM4 output from phase 5 of the Coupled Model Intercomparison Project (CMIP5; Maurer et al. 2007; Bureau of Reclamation 2013). We used the historical BCCA data, which span from 1 January 1950 to 31 December 2005, in this analysis, as well as the future RCP 8.5 (high emissions scenario) data, which span from 1 January 2006 to 31 December 2100. We select from the statistically downscaled data a transect of 30 sites from the western desert of Utah (40.8125°N, 113.6875°W) to the Uinta Mountains (40.8125°N, 110.0625°W), which includes a point near the Salt Lake International Airport (KSLC; 40.8125°N, 111.9375°W; site 15 adjacent to the red circle in Fig. 1). Half of the sites are located in the “valley,” and the other half of the sites are located in the mountains. The study region is located within the larger Great Basin, which is known for its semiarid climate and basin-and-range topography (e.g., Thompson and Burke 1974).
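As a quick consistency check on the transect geometry, the quoted endpoints and the 0.125° grid spacing can be combined to recover the number of sites and the position of the KSLC grid point. The sketch assumes that sites are numbered from 1 starting at the western end, which is an assumption and is not stated in the text.

```python
import numpy as np

spacing = 0.125  # grid spacing in degrees
lon_west, lon_east = -113.6875, -110.0625

# Number of grid points between the two endpoints, inclusive.
n_sites = round((lon_east - lon_west) / spacing) + 1
lons = lon_west + spacing * np.arange(n_sites)

# 1-based index of the grid point nearest KSLC along the transect.
kslc_site = int(np.argmin(np.abs(lons - (-111.9375)))) + 1
```

Under that numbering, `n_sites` evaluates to 30 and `kslc_site` to 15, matching the site count and the site number quoted for KSLC.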
To pair with the statistically downscaled climate data (1950–2100), we use indices of El Niño–Southern Oscillation (ENSO; e.g., Diaz et al. 2001) and the Pacific decadal oscillation (PDO; e.g., McCabe and Dettinger 2002). These data are bandpass-filtered, spatially averaged historical CCSM4 sea surface temperature output processed by following the method of Smith et al. (2015).
3. Multisite simulation of daily maximum and minimum air temperature
a. Model formulation
b. Specification of parameters
In the single-site, single-temperature case, the noise coefficient ck was a time-dependent vector that depended on whether the day was wet or dry. For multisite SHArP,
For the study area here, the first quantized eigenvector captured all sites being wet in its positive polarity (Fig. 2a) and all sites being dry in its negative polarity. The second quantized eigenvector captured the mountain sites being wet and the valley sites being dry in its positive polarity (Fig. 2b) and the reverse in its negative polarity. Similar EOFs were found analyzing precipitation occurrence along the statistical downscaling transect (Figs. 2c,d), and these first two EOFs accounted for 62% and 11% of the variance for the transect.
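One way to picture the POP categorization is the sketch below. It is not the authors' procedure: it assumes that EOFs are computed from site-centered occurrence data, that "quantized" means keeping only the sign of each eigenvector element, and that each day is assigned to the quantized pattern and polarity it agrees with at the most sites. The function name `classify_pops` and the four-site toy data are hypothetical.

```python
import numpy as np


def classify_pops(wet, n_eofs=2):
    """Assign each day to one of 2 * n_eofs precipitation-occurrence patterns.

    wet: (n_days, n_sites) array of 0/1 occurrence.
    Returns (labels, patterns, explained); patterns has shape
    (2 * n_eofs, n_sites) with entries +1 (wet) or -1 (dry).
    """
    anomalies = wet - wet.mean(axis=0)
    # Rows of vt are the EOFs (eigenvectors of the between-site covariance).
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = s[:n_eofs] ** 2 / np.sum(s ** 2)

    quantized = np.where(vt[:n_eofs] >= 0, 1, -1)
    # EOF sign is arbitrary; orient each so its positive polarity is the
    # wetter one. Ties keep the orientation returned by the SVD.
    flip = quantized.sum(axis=1) < 0
    quantized[flip] *= -1
    # Interleave polarities: +EOF1, -EOF1, +EOF2, -EOF2, ...
    patterns = np.repeat(quantized, 2, axis=0)
    patterns[1::2] *= -1

    signed = 2 * wet - 1             # code wet as +1 and dry as -1
    agreement = signed @ patterns.T  # (n_days, 2 * n_eofs)
    labels = np.argmax(agreement, axis=1)  # ties go to the first pattern
    return labels, patterns, explained


# Toy data: sites 0-1 stand in for "valley", sites 2-3 for "mountain".
day_types = np.array([
    [1, 1, 1, 1],  # all wet
    [0, 0, 0, 0],  # all dry
    [0, 0, 1, 1],  # mountain wet, valley dry
    [1, 1, 0, 0],  # valley wet, mountain dry
])
wet = np.repeat(day_types, [40, 40, 10, 10], axis=0)
labels, patterns, explained = classify_pops(wet)
```

With two EOFs this yields four categories, so the number of covariance matrices to estimate stays fixed no matter how many sites are added.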
4. Model significance, validation, and illustration
a. Model significance
The POP-specific covariance patterns are one of the principal innovations in the multisite generalization of SHArP, so we provide some statistical analysis to emphasize their importance. As noted at the end of section 3b, for each month we have four
To depict some of the patterns underlying these hypothesis testing results, we average over elements of the
b. Model validation and illustrative patterns
The model skill and validation at individual sites were presented in Smith et al. (2017), so we focus on the multisite aspects of the performance here based on results from the historical station observations. Performance for the historical and future statistically downscaled data was also assessed and found to be comparable or better (not shown), consistent with the downscaling producing smoother variations in space and time compared to station observations.
We begin with Fig. 4 to illustrate the model’s fidelity in capturing spatial extrema of daily minimum and maximum temperature. Shown are quantile–quantile plots of the spatial maximum of Tmax, minimum of Tmin, maximum of Tmin, and minimum of Tmax. SHArP tends to slightly overpredict the maximum extrema (Figs. 4b,c) and slightly underpredict the minimum extrema (Figs. 4a,d), but the agreement is very good overall.
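A quantile–quantile comparison of spatial extrema of this kind can be assembled as in the sketch below. The function name `spatial_extreme_quantiles`, the choice of probability levels, and the synthetic arrays are assumptions for illustration; the figure itself may have been constructed differently.

```python
import numpy as np


def spatial_extreme_quantiles(obs, sim, reducer=np.max, probs=None):
    """Paired quantiles of a daily spatial extreme for a Q-Q comparison.

    obs, sim: (n_days, n_sites) temperature arrays (day counts may differ).
    reducer:  np.max or np.min, applied across sites for each day.
    """
    if probs is None:
        probs = np.linspace(0.01, 0.99, 99)
    obs_ext = reducer(obs, axis=1)  # one spatial extreme per day
    sim_ext = reducer(sim, axis=1)
    return np.quantile(obs_ext, probs), np.quantile(sim_ext, probs)


# Synthetic stand-ins: the "simulation" is the "observations" shifted
# warm by 0.5 degrees, so every simulated quantile sits 0.5 above the
# observed one.
rng = np.random.default_rng(1)
tmax_obs = rng.normal(15.0, 8.0, size=(2000, 5))
tmax_sim = tmax_obs + 0.5
q_obs, q_sim = spatial_extreme_quantiles(tmax_obs, tmax_sim, reducer=np.max)
```

Plotting `q_sim` against `q_obs` and comparing with the one-to-one line shows overprediction as points above the line and underprediction as points below it.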
Segregating by POP, SHArP captures seasonal contrasts in mean Tmax and Tmin well (Figs. 5a–d). For pairwise, between-station covariation, SHArP captures aspects of the seasonal changes such as the smaller covariances in summer and larger covariances in the transition seasons (Figs. 5e–h). Large changes in the position of a given seasonal cluster between panels in Fig. 5 illustrate the importance of the POPs in modulating intersite covariation. A portion of the covariance in Figs. 5e–h is associated with changes in mean temperature, so we also show the same analysis calculated for the stochastic residuals (Tk+1 −
To further illustrate the utility of the POP-based matrices
We close this section with some remarks and results for simulation of future temperature variations, including trends. As noted above, SHArP trained on statistically downscaled climate model output along the transect in Fig. 1 produces similar POP EOFs to those found for station data (Fig. 2) and comparable or better validation due to data smoothness. Training SHArP on the statistically downscaled data for 1950–2100 enables us to simulate future variations. As an example, annual cycles of maximum temperature are shown for four sites corresponding to a late-century year in Figs. 7a and 7b. SHArP, in addition to simulating realistic annual cycles and variance at each site, provides realistic intersite covariation that is temporally synchronized by POPs. SHArP is also able to capture long-term trends with realistic intersite covariation as illustrated by annual mean minimum temperatures at the four sites (cf. Figs. 7c,d).
5. Multisite simulation of daily precipitation
a. Formulation and parameter estimation for precipitation occurrence
The precipitation model we use with SHArP largely follows formulations presented in Woolhiser (2008) and Wilks (2009), except we introduce a trend term in the perturbation of the Markov chain precipitation-occurrence probabilities so that the framework can simulate climate change. We provide details leading up to the introduction of the trend here for completeness.
b. Formulation and parameter estimation for precipitation amount
c. Formulation and parameter estimation for climate perturbation
Because of the likely effects of climate change in the future, especially relating to the snowpack in the western United States (e.g., Mote 2006), we modified the precipitation components of SHArP to include these effects. We use the formulation for simulating precipitation occurrence introduced in Smith et al. (2017), where we define perturbed versions of the pijk values that can incorporate trends and sensitivity to teleconnection indices, extending ideas presented in Woolhiser (2008). In addition, the larger mean in the mixed exponential precipitation amount distribution (β2) is here allowed to have a trend and dependence on teleconnection indices, analogous to the perturbed formulation of pij1.
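To make the role of a trending larger mean concrete, the sketch below draws wet-day amounts from a mixed exponential distribution whose larger mean changes with time. The linear form of the trend, the omission of the teleconnection terms, the function name `sample_amounts`, and all parameter values are assumptions for illustration and are not the fitted formulation.

```python
import numpy as np


def sample_amounts(t_years, alpha, beta1, beta2_base, beta2_trend, seed=0):
    """Draw one wet-day amount per entry of t_years from a mixed exponential.

    With probability alpha the amount comes from an exponential with mean
    beta1; otherwise it comes from an exponential with mean
    beta2(t) = beta2_base + beta2_trend * t (assumed linear trend).
    """
    rng = np.random.default_rng(seed)
    t_years = np.asarray(t_years, dtype=float)
    beta2 = beta2_base + beta2_trend * t_years
    use_small = rng.random(t_years.size) < alpha
    scale = np.where(use_small, beta1, beta2)
    return rng.exponential(scale)


# Hypothetical parameters: compare amounts at year 0 and year 100.
n = 200_000
early = sample_amounts(np.zeros(n), 0.6, 2.0, 10.0, 0.02, seed=1)
late = sample_amounts(np.full(n, 100.0), 0.6, 2.0, 10.0, 0.02, seed=2)
```

The mean of the mixture is alpha * beta1 + (1 - alpha) * beta2(t), so with these values it rises from 5.2 to 6.0 over the century while the smaller-mean component is left unchanged.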
As an illustrative example, the perturbed
We simulated multisite daily precipitation 500 times from 1950 to 2100 to illustrate variability in total precipitation from year to year. Figure 10 shows this variability at KSLC in comparison with the training data. Note the overall increasing trend and low-frequency variability due to ENSO and the PDO. The tendency for correlation between the training data and the ensemble mean arises because the oceanic modes (E and P) driving the precipitation occurrence and amount were diagnosed from the coupled global climate model simulation that produced the training data.
6. Discussion and conclusions
We extended the stochastic temperature simulation framework introduced by Smith et al. (2017) by generalizing the single-temperature, single-site formulation to encompass maximum and minimum temperatures correlated between multiple sites. This study focused substantially on the temperature component because temperature has received comparatively less attention in the literature despite its importance to snowpack variability, and we were able to markedly improve its realism by leveraging techniques described in Smith et al. (2017). The formulation of SHArP shares conceptual similarity with some recent GLM frameworks referenced in the introduction, but it is unique in its incorporation of precipitation spatial pattern effects (e.g., mountain-wet/valley-dry patterns) on temperature temporal evolution and spatial covariance. Hypothesis testing shows that the POP-based decomposition captures statistically significant differences in observed temperature covariance. In addition, we presented a compatible multisite daily precipitation simulation framework based on Markov chain ideas introduced by Woolhiser (2008) and Wilks (1998, 2009). The precipitation framework can capture lagged dependence on climate modes such as ENSO and PDO and was generalized here to capture trends associated with climate change. This study used data from weather station locations and grid points on the downscaled model output grid. Data can be generated for “unobserved” or off-grid locations by interpolating the model parameters with, for example, kriging or the distance and elevation dependencies of Wilks (1999b).
A key advance from the mathematical formulation of single-site SHArP introduced in Smith et al. (2017) is the change from temporally varying noise coefficient vectors (one for dry days and one for wet days) to noise coefficient matrices that depend on the multisite spatial POP and simulate observed intersite correlations in temperature. A total of M sites yields an unmanageably large $2^M$ possible POPs, so we employ EOF analysis to reduce the number of possible POPs to some number much smaller than $2^M$. Here, we used the leading two EOFs for illustration, with the first capturing the contrast between all sites wet and all sites dry, and the second capturing mountain-wet/valley-dry versus mountain-dry/valley-wet patterns. The number of EOFs used might be increased depending on the patterns of variability in a particular study region, but it needs to be balanced against the accompanying decrease in sample size for estimation of the covariance matrices. After the residual error for each day is assigned to one of four noise coefficient matrices
Another key change associated with the multisite generalization is related to the
Testing the encoding of SHArP, we verified that the model accurately estimates the parameters of a broad range of synthetic multisite input data that we generated using Eq. (1) (i.e., the estimation procedures recover
Acknowledgments
This material is based upon work supported by the National Science Foundation under grants EPS-1135482, EPS-1135483, EPS-1208732, and DMS-1407574. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. Provision of computer infrastructure by the Center for High Performance Computing at the University of Utah is gratefully acknowledged. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model output. For CMIP, the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. CPC U.S. Unified Precipitation data were provided by the NOAA/OAR/ESRL Physical Sciences Division (https://www.esrl.noaa.gov/psd/).
REFERENCES
Brissette, F., M. Khalili, and R. Leconte, 2007: Efficient stochastic generation of multi-site synthetic precipitation data. J. Hydrol., 345, 121–133, https://doi.org/10.1016/j.jhydrol.2007.06.035.
Bureau of Reclamation, 2013: Downscaled CMIP3 and CMIP5 climate and hydrology projections: Release of downscaled CMIP5 climate projections, comparison with preceding information, and summary of user needs. Bureau of Reclamation Tech. Rep., 47 pp. + appendixes, https://gdo-dcp.ucllnl.org/downscaled_cmip_projections/techmemo/downscaled_climate.pdf.
Caraway, N. M., J. L. McCreight, and B. Rajagopalan, 2014: Multisite stochastic weather generation using cluster analysis and k-nearest neighbor time series resampling. J. Hydrol., 508, 197–213, https://doi.org/10.1016/j.jhydrol.2013.10.054.
Chen, J., F. Brissette, and X. Zhang, 2014: A multi-site stochastic weather generator for daily precipitation and temperature. Trans. ASABE, 57, 1375–1391, https://doi.org/10.13031/trans.57.10685.
Diaz, H. F., M. P. Hoerling, and J. K. Eischeid, 2001: ENSO variability, teleconnections and climate change. Int. J. Climatol., 21, 1845–1862, https://doi.org/10.1002/joc.631.
Hannachi, A., I. T. Jolliffe, and D. B. Stephenson, 2007: Empirical orthogonal functions and related techniques in atmospheric science: A review. Int. J. Climatol., 27, 1119–1152, https://doi.org/10.1002/joc.1499.
Kleiber, W., R. W. Katz, and B. Rajagopalan, 2012: Daily spatiotemporal precipitation simulation using latent and transformed Gaussian processes. Water Resour. Res., 48, W01523, https://doi.org/10.1029/2011WR011105.
Kleiber, W., R. W. Katz, and B. Rajagopalan, 2013: Daily minimum and maximum temperature simulation over complex terrain. Ann. Appl. Stat., 7, 588–612, https://doi.org/10.1214/12-AOAS602.
Lareau, N. P., E. Crosman, C. D. Whiteman, J. D. Horel, S. W. Hoch, W. O. J. Brown, and T. W. Horst, 2013: The Persistent Cold-Air Pool Study. Bull. Amer. Meteor. Soc., 94, 51–63, https://doi.org/10.1175/BAMS-D-11-00255.1.
Maurer, E. P., L. Brekke, T. Pruitt, and P. B. Duffy, 2007: Fine-resolution climate projections enhance regional climate change impact studies. Eos, Trans. Amer. Geophys. Union, 88, 504–504, https://doi.org/10.1029/2007EO470006.
McCabe, G. J., and M. D. Dettinger, 2002: Primary modes and predictability of year-to-year snowpack variations in the western United States from teleconnections with Pacific Ocean climate. J. Hydrometeor., 3, 13–25, https://doi.org/10.1175/1525-7541(2002)003<0013:PMAPOY>2.0.CO;2.
Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily database. J. Atmos. Oceanic Technol., 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1.
Mote, P. W., 2006: Climate-driven variability and trends in mountain snowpack in western North America. J. Climate, 19, 6209–6220, https://doi.org/10.1175/JCLI3971.1.
Rencher, A. C., and W. F. Christensen, 2012: Methods of Multivariate Analysis. 3d ed. John Wiley and Sons, 800 pp.
Richardson, C. W., 1981: Stochastic simulation of daily precipitation, temperature, and solar radiation. Water Resour. Res., 17, 182–190, https://doi.org/10.1029/WR017i001p00182.
Smith, K., C. Strong, and S.-Y. Wang, 2015: Connectivity between historical Great Basin precipitation and Pacific Ocean variability: A CMIP5 model evaluation. J. Climate, 28, 6096–6112, https://doi.org/10.1175/JCLI-D-14-00488.1.
Smith, K., C. Strong, and F. Rassoul-Agha, 2017: A new method for generating stochastic simulations of daily air temperature for use in weather generators. J. Appl. Meteor. Climatol., 56, 953–963, https://doi.org/10.1175/JAMC-D-16-0122.1.
Stern, R. D., and R. Coe, 1984: A model fitting analysis of daily rainfall data. J. Roy. Stat. Soc., 147A, 1–34, https://doi.org/10.2307/2981736.
Thompson, G. A., and D. B. Burke, 1974: Regional geophysics of the basin and range province. Annu. Rev. Earth Planet. Sci., 2, 213–238, https://doi.org/10.1146/annurev.ea.02.050174.001241.
Todorovic, P., and D. A. Woolhiser, 1975: A stochastic model of ω-day precipitation. J. Appl. Meteor., 14, 17–24, https://doi.org/10.1175/1520-0450(1975)014<0017:ASMODP>2.0.CO;2.
Verdin, A., B. Rajagopalan, W. Kleiber, and R. W. Katz, 2015: Coupled stochastic weather generation using spatial and generalized linear models. Stochastic Environ. Res. Risk Assess., 29, 347–356, https://doi.org/10.1007/s00477-014-0911-6.
Wilks, D. S., 1998: Multisite generalization of a daily stochastic precipitation generation model. J. Hydrol., 210, 178–191, https://doi.org/10.1016/S0022-1694(98)00186-3.
Wilks, D. S., 1999a: Interannual variability and extreme-value characteristics of several stochastic daily precipitation models. Agric. For. Meteor., 93, 153–169, https://doi.org/10.1016/S0168-1923(98)00125-7.
Wilks, D. S., 1999b: Simultaneous stochastic simulation of daily precipitation, temperature and solar radiation at multiple sites in complex terrain. Agric. For. Meteor., 96, 85–101, https://doi.org/10.1016/S0168-1923(99)00037-4.
Wilks, D. S., 2008: High-resolution spatial interpolation of weather generator parameters using local weighted regressions. Agric. For. Meteor., 148, 111–120, https://doi.org/10.1016/j.agrformet.2007.09.005.
Wilks, D. S., 2009: A gridded multisite weather generator and synchronization to observed weather data. Water Resour. Res., 45, W10419, https://doi.org/10.1029/2009WR007902.
Wise, E. K., 2010: Spatiotemporal variability of the precipitation dipole transition zone in the western United States. Geophys. Res. Lett., 37, L07706, https://doi.org/10.1029/2009GL042193.
Woolhiser, D. A., 2008: Combined effects of the Southern Oscillation index and the Pacific decadal oscillation on a stochastic daily precipitation model. J. Climate, 21, 1139–1152, https://doi.org/10.1175/2007JCLI1862.1.