1. Introduction
The drought-stricken western United States, including the Great Basin region of Utah, Wyoming, Idaho, Oregon, Nevada, and California, is facing an uncertain water future because of climate change. The northern half of the Great Basin, which includes northern Utah, is located in the center of the El Niño–Southern Oscillation (ENSO) dipole. ENSO is a well-known climatic teleconnection between sea surface temperatures and the atmosphere in the equatorial Pacific Ocean that affects global weather patterns (Troup 1965; Horel and Wallace 1981). The occurrence of precipitation in the Great Basin in any given year is dependent on both the phase of ENSO and the phase of the Pacific decadal oscillation (PDO), as the phase of the PDO shifts the ENSO dipole either north or south (Wise 2010; Brown 2011). Because of its complex terrain, the majority of the water used by those who live in the region is dependent on the snowpack that is stored in the mountains and released throughout the year via the reservoir system. This semiarid region is already experiencing inconsistent water availability throughout any given year because of the drastically different number of winter precipitation events from year to year. The ability to statistically model the occurrence of precipitation and air temperature is imperative to better forecast potential changes in future water availability as the climate changes. In this study, we introduce a stochastic harmonic autoregressive parametric (SHArP) weather generator, which statistically models meteorological variables (in this case, the occurrence and amount of precipitation and maximum air temperature). The model can be used to investigate how the future of the Great Basin may be impacted by climate change and to understand the meteorological extremes that are likely to play a part in that impact.
While the outputs of both statistically based stochastic weather generators (SWGs) and dynamically based global climate models (GCMs) are used in climate impact studies, there are major differences between them. SWGs work on a point scale, or on a point scale expanded via multisite generalization to a basin scale, whereas GCMs work on a broad regional scale and can be downscaled to the basin or smaller scale. GCMs have difficulty capturing detail in areas of complex terrain, including the Great Basin, which is characterized by its basin-and-range topography (e.g., Thompson and Burke 1974). GCMs are also far more computationally expensive than SWGs: a single GCM run can take months to complete, and thus there are not many GCM runs available for analysis. GCMs also have difficulty capturing the very low-frequency (century scale) connections between the Pacific Ocean and the Great Basin. The ability of state-of-the-art GCMs to capture extremes in precipitation and temperature has been evaluated as well; they capture the extremes poorly, though they perform better for temperature extremes than for precipitation extremes (Kiktev et al. 2007).
SWGs alleviate some limitations of GCMs and were introduced as a way to overcome a lack of observational meteorological data and problems associated with data that are missing in time or space (Wilks and Wilby 1999; Wilks 2008). In addition, they have been used to better understand the uncertainties associated with future climate (e.g., Wilks 1992; Forsythe et al. 2014). These statistical models generate synthetic time series of precipitation, and in some cases also air temperature and solar radiation, which statistically resemble the data used to force the model, usually daily observational weather data (Wilks and Wilby 1999). Many early SWG studies generated only precipitation occurrence and amount, because air temperature and other meteorological variables are affected by whether precipitation occurred.
The first studies using stochastic simulators of weather data employed two-state, first-order Markov chain frameworks regarding precipitation (Bailey 1964; Richardson 1981; Roldàn and Woolhiser 1982), meaning that the probability of precipitation occurrence on a given day is only dependent on whether precipitation occurred on the previous day. Precipitation amount was modeled separately, and maximum/minimum temperatures and solar radiation were modeled as a function of precipitation occurrence. Other studies involving SWGs considered a two-state, second-order Markov chain process (Stern and Coe 1984; Wilks 1999a). Markov chains of higher order have been found to better capture dry spells than first-order Markov chains, thus providing more accurate results for most areas of the western United States where dry spells are common, such as the semiarid Great Basin.
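As an illustration of the two-state, first-order Markov chain described above, the following minimal sketch simulates a wet/dry sequence and re-estimates the transition probabilities from it. The function names, the parameter names `p01` and `p11`, and the numerical values in the usage note are hypothetical and are not taken from any of the cited studies.

```python
import random


def simulate_occurrence(n_days, p01, p11, seed=None, first_state=0):
    """Simulate a wet (1) / dry (0) sequence from a two-state,
    first-order Markov chain.

    p01: probability of a wet day following a dry day (assumed name)
    p11: probability of a wet day following a wet day (assumed name)
    """
    rng = random.Random(seed)
    states = [first_state]
    for _ in range(n_days - 1):
        # Today's wet probability depends only on yesterday's state.
        p_wet = p11 if states[-1] == 1 else p01
        states.append(1 if rng.random() < p_wet else 0)
    return states


def estimate_transitions(states):
    """Estimate (p01, p11) from the observed transition counts."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, cur in zip(states, states[1:]):
        counts[(prev, cur)] += 1
    n_after_dry = counts[(0, 0)] + counts[(0, 1)]
    n_after_wet = counts[(1, 0)] + counts[(1, 1)]
    p01 = counts[(0, 1)] / n_after_dry if n_after_dry else float("nan")
    p11 = counts[(1, 1)] / n_after_wet if n_after_wet else float("nan")
    return p01, p11
```

For example, `estimate_transitions(simulate_occurrence(20000, 0.2, 0.6, seed=1))` should return values close to the prescribed 0.2 and 0.6. A second-order chain would condition on the two previous days instead of one.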
One limitation of the common SWGs is their limited ability to capture nonstationary variability. Previous studies have found that over the western United States, El Niño results in a wetter Southwest and a drier Northwest, while La Niña results in the opposite (Ropelewski and Halpert 1986; Dettinger et al. 1998; Woolhiser 2008). The PDO also has significant impacts on precipitation in the western United States: it is linked to ENSO and modulates how the different phases of ENSO affect the region (Gershunov and Barnett 1998; Gershunov et al. 1999; Mauget 2003). Woolhiser (2008) introduced the idea of adding nonstationarity to the stochastic framework in order to capture the effects these major oceanic oscillations have on western U.S. precipitation. Essentially, perturbations given as time series of the oscillations were linearly added to the probability of precipitation, and the coefficient associated with each perturbation gives information on the sensitivity of precipitation occurrence to that oscillation (Woolhiser 2008). We employ this method in this study and also include a trend to account for the changing climate.
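The idea of linearly adding oscillation perturbations to the probability of precipitation can be sketched as follows. This is only an illustration of the general form: the function name, the argument names, and the clipping of the result to the interval [0, 1] are assumptions made here and are not taken from Woolhiser (2008).

```python
def perturbed_wet_probability(p_base, coefficients, index_values):
    """Baseline wet-day probability plus linear perturbations from
    climate indices (e.g., one ENSO index and one PDO index).

    coefficients[i] is the sensitivity to index i; index_values[i] is the
    value of index i on the day considered. All names are hypothetical.
    """
    if len(coefficients) != len(index_values):
        raise ValueError("need one coefficient per index value")
    p = p_base + sum(c * x for c, x in zip(coefficients, index_values))
    # Assumption: keep the result a valid probability by clipping.
    return min(1.0, max(0.0, p))
```

A larger coefficient magnitude means that the corresponding oscillation shifts the wet-day probability more strongly for a given index value.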
In the SWG literature, simulation of daily maximum and minimum air temperature is usually conditioned on whether the day is wet or dry. The most widely used method for simulating temperature is the method used by Richardson (1981). This method involves generating the standardized residual time series of temperature (maximum and minimum temperature; the study also included solar radiation) and using the multivariate generation model as described by Matalas (1967). These standardized residuals are assumed normally distributed, and the coefficients in the generating model are matrices containing the cross correlations and autocorrelations between the residuals (Matalas 1967). After generating the synthetic residuals, the wet- or dry-state means and standard deviations that were initially removed are reintroduced to yield daily values of the variables. The means and standard deviations depend on whether the day was wet or dry; they are assumed to be cyclostationary and are determined by fitting harmonics of the annual cycle to observations (Richardson 1981).
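The residual-based procedure described above can be sketched for a single variable as follows. This is a simplified, univariate illustration and not the multivariate model of Richardson (1981) and Matalas (1967): the cross correlations between variables are omitted, a single annual harmonic is assumed for each wet- or dry-state mean and standard deviation, and all function and parameter names are hypothetical.

```python
import math
import random


def harmonic(day_of_year, mean, amplitude, phase_day, period=365.0):
    """One annual harmonic that peaks on phase_day."""
    angle = 2.0 * math.pi * (day_of_year - phase_day) / period
    return mean + amplitude * math.cos(angle)


def richardson_style_tmax(chi, r1, params, seed=None):
    """Generate temperatures from AR(1) standardized residuals.

    chi    : list of 0 (dry) / 1 (wet) occurrence values
    r1     : lag-1 autocorrelation of the residuals, with |r1| < 1
    params : params[state] holds 'mean', 'amp', 'phase' for the state
             mean and 'sd_mean', 'sd_amp', 'sd_phase' for its std dev
    """
    rng = random.Random(seed)
    resid = rng.gauss(0.0, 1.0)
    out = []
    for t, state in enumerate(chi):
        doy = t % 365 + 1
        if t > 0:
            # AR(1) residual scaled to keep unit variance.
            resid = r1 * resid + math.sqrt(1.0 - r1 * r1) * rng.gauss(0.0, 1.0)
        p = params[state]
        mu = harmonic(doy, p["mean"], p["amp"], p["phase"])
        sd = harmonic(doy, p["sd_mean"], p["sd_amp"], p["sd_phase"])
        # Reintroduce the state-dependent mean and standard deviation.
        out.append(mu + sd * resid)
    return out
```

Because the mean and standard deviation are looked up from the state of the current day only, the expected temperature in this sketch changes in a single step whenever the occurrence sequence changes state.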
In addition to the common parametric SWGs described thus far, including the SWGs introduced by Matalas (1967) and Richardson (1981), recent studies have employed nonparametric SWGs and generalized linear models (GLMs). These SWGs are data driven and involve either kernel density estimation (e.g., Rajagopalan et al. 1997; Harrold et al. 2003) or resampling via k-nearest-neighbor (k-NN) bootstrapping (e.g., Rajagopalan and Lall 1999; Caraway et al. 2014). These models do not rely on the statistical relationships applied in the parametric SWGs. They offer an alternative to the standard linear models presented in the parametric SWGs, which are unable to capture the nonlinear relationships between meteorological variables. The use of GLMs in SWGs, first introduced by Stern and Coe (1984), has also been increasing in popularity because they can easily model discrete variables and variables with nonnormal distributions (Furrer and Katz 2007). In addition, GLMs are especially useful tools because of their ability to treat ENSO and other major oceanic modes of variability as continuous variables (e.g., Chandler 2005). More details behind GLMs can be found in McCullagh and Nelder (1989).
A limitation of the widely used Richardson model is that its mean and standard deviation switch abruptly between wet- and dry-state values prescribed in advance of the simulation, and temperature is not simulated directly but rather through its residuals. This does not reflect reality, in which transitions between wet- and dry-state values are smooth and autocorrelated. In this study, we introduce the mathematics and present illustrative results for a SHArP weather generator that is based on the Richardson model but that simulates temperature values directly, with a mean that makes autocorrelated transitions between wet- and dry-state temperature values. Because of this innovation, the method described here better captures the temperature transitions between days with different precipitation states, including those following frontal passages.
2. Data and study area
We chose to illustrate the SHArP weather generator using observations from the Salt Lake City International Airport (KSLC), which is located in the Great Basin. Its precipitation depends largely on a combination of the state of ENSO and the state of the PDO (Wise 2010; Brown 2011). The precipitation and temperature data used to force SHArP are daily observational data recorded at KSLC (40.78°N, 111.97°W) from 1 January 1948 to 31 December 2010 via the Global Historical Climatology Network (GHCN-Daily) provided by the National Centers for Environmental Information (obtained from http://www.ncdc.noaa.gov; Menne et al. 2012a,b). In addition, we obtained GHCN-Daily precipitation and temperature data for four climatologically similar surrounding sites to illustrate the autocorrelated transitions during frontal passages. The domain map (see Fig. 1) shows the location of KSLC in addition to the four surrounding sites: Boise Air Terminal (KBOI) and Pocatello Regional Airport (KPIH) in Idaho, Elko Regional Airport (KEKO) in Nevada, and Grand Junction Regional Airport (KGJT) in Colorado.
Fig. 1. The study area: the eastern half of the Great Basin (which includes northern and western Utah, extreme southwestern Wyoming, extreme southern Idaho, and Nevada) and the surrounding area. The stars indicate the location of KSLC and surrounding sites: KBOI, KPIH, KEKO, and KGJT. The color bar indicates elevation in meters above sea level.
Citation: Journal of Applied Meteorology and Climatology 56, 4; 10.1175/JAMC-D-16-0122.1
Future precipitation and temperature output used to force SHArP are daily 0.125° gridded bias-corrected constructed analog (BCCA) projections from the CCSM4 model, which was part of phase 5 of the Coupled Model Intercomparison Project (CMIP5) multimodel ensemble (Maurer et al. 2007; Brekke et al. 2013). We use data from the high-emissions scenario (RCP8.5), which span from 1 January 2006 to 31 December 2100. Of these, we use only the data starting from 1 January 2011, following the end of the observational data.
A day was considered “wet” and given value χ = 1 if the total precipitation on that day reached at least 0.25 mm (approximately 0.01 in.), corresponding to the minimum depth recorded by rain gauges. Otherwise, the day was considered dry and given value χ = 0. The χ vector was determined from the precipitation time series, and this provided the precipitation occurrence needed to model temperature with SHArP. In this study, we use and generate only maximum surface air temperature at a single site. Generalization to multiple variables at multiple sites has been completed, and the formulation will be presented in a future paper.
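The wet-day rule above amounts to a simple threshold on daily precipitation totals. A minimal sketch follows, in which the function name and the constant name are hypothetical, and the input is assumed to be a list of daily totals in millimeters with no missing values.

```python
WET_THRESHOLD_MM = 0.25  # minimum daily total for a "wet" day, as in the text


def occurrence_indicator(precip_mm, threshold=WET_THRESHOLD_MM):
    """Return chi: 1 for days with total precipitation >= threshold, else 0."""
    return [1 if p >= threshold else 0 for p in precip_mm]
```

For instance, `occurrence_indicator([0.0, 0.24, 0.25, 3.1])` returns `[0, 0, 1, 1]`. Handling of missing observations is left out of this sketch.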
3. Simulation of maximum air temperature and precipitation





a. Maximum likelihood estimation


Fig. 2. Annual composite of the observation (thin lines) and model (thick lines) means for dry (red) and wet days (blue). Results are based on KSLC observations for years 1948–2010.




Fig. 3. Illustration of the SHArP weather generator with (a) input observational data for comparison. The blue curve (raw observations) shows 2008 as an example year, and shading in each panel corresponds to percentiles of the historical data for 1948–2010. Two simulations (red) of the temperature model with (b),(c) a constant c and (d),(e) a seasonally varying ck.
b. Least squares estimation with varying ck









The parameters are inserted back into (8) or (9) to generate the synthetic temperature series using the linear model (1). Example simulations with the seasonally varying ck are shown in the right column of Fig. 3. Note how seasonally varying ck better captures the low variability in the summer and high variability in the winter. The dry and wet ck curves are shown with composite annual cycles of the standard deviation of the noise in Fig. 4. These curves highlight the larger variability associated with wet days as well as the larger variability associated with the transition seasons (spring and fall) featuring strong frontal temperature contrasts.
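To make the generation step concrete, the sketch below assumes, for illustration only, that the linear model has a first-order autoregressive form in which today's temperature equals a coefficient `a` times yesterday's temperature, plus a state-dependent term `b` and state-dependent noise of amplitude `c`. This form, the function name, and all parameter values in the usage note are assumptions made here rather than the fitted SHArP model.

```python
import random


def sharp_like_tmax(chi, a, b, c, t_start, seed=None):
    """Simulate T[t] = a * T[t-1] + b[state](doy) + c[state](doy) * eps[t].

    chi     : list of 0 (dry) / 1 (wet) occurrence values
    a       : autoregressive coefficient (assumed scalar, |a| < 1)
    b, c    : dicts mapping state to a function of day of year
    t_start : temperature on the day before the simulation begins
    """
    rng = random.Random(seed)
    temps = []
    prev = t_start
    for t, state in enumerate(chi):
        doy = t % 365 + 1
        eps = rng.gauss(0.0, 1.0)  # standard normal noise
        prev = a * prev + b[state](doy) + c[state](doy) * eps
        temps.append(prev)
    return temps
```

With the noise switched off, `a = 0.5`, a constant dry-state term of 10, and a constant wet-state term of 7.5, the sequence dry-dry-wet-wet-dry-dry started at 20 gives 20, 20, 17.5, 16.25, 18.125, 19.0625. The mean relaxes gradually toward the wet-state level (15) and back toward the dry-state level (20) rather than switching in one step.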
Fig. 4. Seasonally varying ck curves for (left) dry and (right) wet days (black lines) and standard deviations of the noise (colored lines). Note the relatively higher variability in the transitional seasons and overall higher variability associated with the wet days.
c. Simulation of precipitation
4. Comparison with the Richardson method
Because the Richardson method of simulating stochastic temperature (referred to as the multivariate generation model) is the most widely used parametric method in the field and the one upon which SHArP builds, it is a useful point of comparison. The Richardson method is essentially an autoregressive process that simulates standardized residuals; the details of this method can be found in Richardson (1981) and Matalas (1967). The Richardson method prescribes the means and standard deviations of the data (for wet and dry days) prior to simulation via a harmonic fit and then reintroduces them after simulating standardized residuals. This causes the model mean and standard deviation to abruptly switch between wet- and dry-state values. The model we introduce here (1) also has wet- and dry-state harmonics (bk) and noise amplitudes (ck) prescribed in advance, but the mean of the model (
We highlight the difference between the methods in Fig. 5, which compares the composite synthetic temperature simulated by the two models with the observational temperature for the precipitation occurrence sequence dry-dry-wet-wet-dry-dry in each season. The observational temperature reflects a typical cold frontal passage in each season (e.g., Shafer and Steenburgh 2008). In general, the observed maximum temperature increases shortly before the frontal passage because of southerly flow and warm air advection; on the first day of precipitation, the maximum temperature decreases modestly. On the second day of precipitation, the temperature continues to decrease, and it slowly rebounds following the precipitation event. SHArP is able to capture this overall pattern. In contrast, the abrupt switching between wet- and dry-state means in the Richardson model results in an unrealistically large decrease in temperature on the first day of precipitation, followed by minimal change on the second day (actually zero change with a large enough sample). While there is little to no seasonal bias in the Richardson model, there is a bias in the temperature around frontal passages. The temperature bias in the Richardson model is up to 4°C following frontal passages, and SHArP is able to reduce that bias by 2°C in three seasons.
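The compositing over a fixed occurrence sequence can be sketched as follows. The function name is hypothetical, and the sketch composites a single series without separating seasons or pooling sites, both of which would be needed to reproduce the comparison described above.

```python
def composite_for_pattern(chi, temps, pattern=(0, 0, 1, 1, 0, 0)):
    """Average temperature at each position of every window of chi that
    matches the wet/dry pattern exactly.

    Returns (composite, count); composite is None if no window matches.
    """
    n = len(pattern)
    target = tuple(pattern)
    sums = [0.0] * n
    count = 0
    for start in range(len(chi) - n + 1):
        if tuple(chi[start:start + n]) == target:
            for k in range(n):
                sums[k] += temps[start + k]
            count += 1
    if count == 0:
        return None, 0
    return [s / count for s in sums], count
```

Applying the same function to observed and simulated series, and differencing the two composites position by position, gives a bias profile across the sequence.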
Fig. 5. Composite observed temperature (black lines) and composite synthetic temperature for sets of days that follow the precipitation occurrence sequence dry-dry-wet-wet-dry-dry, in each season. The bias for each season is shown immediately below the corresponding composite panel. Composites include each occurrence of this sequence at five climatologically similar sites (see Fig. 1). The red lines indicate SHArP, the model presented here, and the blue lines indicate the Richardson model. The number of samples in each set is approximately 500.
Although the Richardson framework as originally formulated does not contain a trend term, one could be added in principle. One approach would be to fit the trend by LSE and remove it prior to estimating the annual cycles of the mean and residual standard deviations, and then add the trend back in after generating the simulated temperatures. In contrast to this multistep approach involving removing components, fitting, simulating, and reintroducing components, the model presented here involves only fitting and simulation because all variations are captured in the fit formulation, including a trend term that is incorporated into bk. Trended output from observations (1948–2010) and future BCCA CCSM4 high-emissions scenario output (2011–2100) is shown in Fig. 6a with an example corresponding realization from the temperature model presented here shown in Fig. 6b.
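The multistep alternative described above for the Richardson framework (fit the trend, remove it, simulate, and add it back) can be sketched as follows. A straight-line trend fitted by ordinary least squares is assumed here purely for illustration; the function names are hypothetical, and the trend used in this study need not be linear.

```python
def fit_linear_trend(values):
    """Ordinary least squares fit of values[t] ~ intercept + slope * t,
    with t = 0, 1, ..., n - 1. Requires at least two values."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def remove_trend(values, trend):
    """Subtract the fitted line before estimating the annual cycles."""
    intercept, slope = trend
    return [y - (intercept + slope * t) for t, y in enumerate(values)]


def add_trend(values, trend):
    """Add the fitted line back after generating the simulated values."""
    intercept, slope = trend
    return [y + (intercept + slope * t) for t, y in enumerate(values)]
```

In this workflow the stochastic simulation would operate on the output of `remove_trend`, and `add_trend` would be applied to the simulated series as the final step.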
Fig. 6. (a) KSLC observation GHCN-Daily maximum temperature (1948–2010) and BCCA CCSM4 high-emissions (RCP8.5) maximum temperature output (2011–2100). (b) An example of a trended stochastic maximum temperature simulated from 1948 to 2100 for KSLC. The simulation was trained on the data shown in (a). The red dots indicate the average annual maximum temperature for each year of the simulation.
5. Discussion and conclusions
This study presents a new linear model for simulating stochastic temperature realizations called SHArP, and the method was illustrated for maximum temperature at a single site within the Great Basin. We first considered a simplified version of the model with a constant noise coefficient c and applied MLE to obtain its parameters. However, this constant c compromised between the variance in the summer and the variance in the winter, which resulted in a simulation that did not adequately capture the seasonal variance found in the observations. A seasonally varying noise coefficient ck rendered the MLE nonlinear, and we presented analytical solutions via LSE. The resulting temperature realization more closely matched that of observations, with increased wintertime variance and decreased summertime variance.
Further realism may also be possible by relaxing assumptions used here. For example, we assume the amplitude of noise ck to be annually cyclostationary but without trend. It is possible for the noise to have similar nonstationarity because of ENSO and PDO. Curvilinearity (a trend) and variables related to ENSO and PDO could be added to the
Even though this study is focused on only maximum temperature at a single site, we have generalized the method described to include minimum temperature in addition to maximum temperature at multiple sites. The linear model remains the same, but the scalar computations become matrix computations. We extended ideas described in Wilks (1998) and Wilks (1999b), where the sites themselves have spatial correlation but are generated independently of each other, by introducing spatial correlations in the ck matrices but not in the a matrix. However, this method introduced an increased number of parameters in the variance–covariance matrix that required a nontrivial technique to mitigate the issue, and this will be described in a future paper.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Grants EPS-1135482, EPS-1135483, EPS-1208732, and DMS-1407574. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. Provision of computer infrastructure by the Center for High Performance Computing at the University of Utah is gratefully acknowledged. We acknowledge the World Climate Research Programme’s Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model output. For CMIP the U.S. Department of Energy’s Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals.
REFERENCES
Bailey, N. T. J., 1964: The Elements of Stochastic Processes with Applications to the Natural Sciences. John Wiley and Sons, 249 pp.
Brekke, L., B. L. Thrasher, E. P. Maurer, and T. Pruitt, 2013: Downscaled CMIP3 and CMIP5 climate projections: Release of downscaled CMIP5 climate projections, comparison with preceding information, and summary of user needs. U.S. Department of the Interior Bureau of Reclamation Tech. Rep., 116 pp. [Available online at http://gdo-dcp.ucllnl.org/downscaled_cmip_projections/techmemo/downscaled_climate.pdf.]
Brown, D. P., 2011: Winter circulation anomalies in the western United States associated with antecedent and decadal ENSO variability. Earth Interact., 15, doi:10.1175/2010EI334.1.
Caraway, N. M., J. L. McCreight, and B. Rajagopalan, 2014: Multisite stochastic weather generation using cluster analysis and k-nearest neighbor time series resampling. J. Hydrol., 508, 197–213, doi:10.1016/j.jhydrol.2013.10.054.
Chandler, R. E., 2005: On the use of generalized linear models for interpreting climate variability. Environmetrics, 16, 699–715, doi:10.1002/env.731.
Dettinger, M. D., D. R. Cayan, H. F. Diaz, and D. M. Meko, 1998: North–south precipitation patterns in western North America on interannual-to-decadal timescales. J. Climate, 11, 3095–3111, doi:10.1175/1520-0442(1998)011<3095:NSPPIW>2.0.CO;2.
Forsythe, N., H. Fowler, S. Blenkinsop, A. Burton, C. Kilsby, D. Archer, C. Harpham, and M. Hashmi, 2014: Application of a stochastic weather generator to assess climate change impacts in a semi-arid climate: The upper Indus basin. J. Hydrol., 517, 1019–1034, doi:10.1016/j.jhydrol.2014.06.031.
Furrer, E. M., and R. W. Katz, 2007: Generalized linear modeling approach to stochastic weather generators. Climate Res., 34, 129–144, doi:10.3354/cr034129.
Gershunov, A., and T. P. Barnett, 1998: Interdecadal modulation of ENSO teleconnections. Bull. Amer. Meteor. Soc., 79, 2715–2725, doi:10.1175/1520-0477(1998)079<2715:IMOET>2.0.CO;2.
Gershunov, A., T. P. Barnett, and D. R. Cayan, 1999: North Pacific interdecadal oscillation seen as factor in ENSO-related North American climate anomalies. Eos, Trans. Amer. Geophys. Union, 80, 25–30, doi:10.1029/99EO00019.
Harrold, T. I., A. Sharma, and S. J. Sheather, 2003: A nonparametric model for stochastic generation of daily rainfall amounts. Water Resour. Res., 39, 1343, doi:10.1029/2003WR002570.
Horel, J. D., and J. M. Wallace, 1981: Planetary-scale atmospheric phenomena associated with the Southern Oscillation. Mon. Wea. Rev., 109, 813–829, doi:10.1175/1520-0493(1981)109<0813:PSAPAW>2.0.CO;2.
Kiktev, D., J. Caesar, L. V. Alexander, H. Shiogama, and M. Collier, 2007: Comparison of observed and multimodeled trends in annual extremes of temperature and precipitation. Geophys. Res. Lett., 34, L10702, doi:10.1029/2007GL029539.
Matalas, N. C., 1967: Mathematical assessment of synthetic hydrology. Water Resour. Res., 3, 937–945, doi:10.1029/WR003i004p00937.
Mauget, S. A., 2003: Intra- to multidecadal climate variability over the continental United States: 1932–99. J. Climate, 16, 2215–2231, doi:10.1175/2751.1.
Maurer, E. P., L. Brekke, T. Pruitt, and P. B. Duffy, 2007: Fine-resolution climate projections enhance regional climate change impact studies. Eos, Trans. Amer. Geophys. Union, 88, 504–504, doi:10.1029/2007EO470006.
McCullagh, P., and J. Nelder, 1989: Generalized Linear Models. 2nd ed. Chapman & Hall, 532 pp.
Menne, M. J., and Coauthors, 2012a: Global Historical Climatology Network—Daily (GHCN-Daily), version 3. National Climatic Data Center, accessed 7 December 2015. [Available online at ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/.]
Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston, 2012b: An overview of the Global Historical Climatology Network-Daily database. J. Atmos. Oceanic Technol., 29, 897–910, doi:10.1175/JTECH-D-11-00103.1.
Rajagopalan, B., and U. Lall, 1999: A k-nearest-neighbor simulator for daily precipitation and other weather variables. Water Resour. Res., 35, 3089–3101, doi:10.1029/1999WR900028.
Rajagopalan, B., U. Lall, and D. G. Tarboton, 1997: Evaluation of kernel density estimation methods for daily precipitation resampling. Stochastic Hydrol. Hydraul., 11, 523–547, doi:10.1007/BF02428432.
Richardson, C. W., 1981: Stochastic simulation of daily precipitation, temperature, and solar radiation. Water Resour. Res., 17, 182–190, doi:10.1029/WR017i001p00182.
Roldàn, J., and D. A. Woolhiser, 1982: Stochastic daily precipitation models: 1. A comparison of occurrence processes. Water Resour. Res., 18, 1451–1459, doi:10.1029/WR018i005p01451.
Ropelewski, C. F., and M. S. Halpert, 1986: North American precipitation and temperature patterns associated with the El Niño/Southern Oscillation (ENSO). Mon. Wea. Rev., 114, 2352–2362, doi:10.1175/1520-0493(1986)114<2352:NAPATP>2.0.CO;2.
Shafer, J. C., and W. J. Steenburgh, 2008: Climatology of strong intermountain cold fronts. Mon. Wea. Rev., 136, 784–807, doi:10.1175/2007MWR2136.1.
Stern, R. D., and R. Coe, 1984: A model fitting analysis of daily rainfall data. J. Roy. Stat. Soc., 147A, 1–34, doi:10.2307/2981736.
Thompson, G. A., and D. B. Burke, 1974: Regional geophysics of the basin and range province. Annu. Rev. Earth Planet. Sci., 2, 213–238, doi:10.1146/annurev.ea.02.050174.001241.
Troup, A. J., 1965: The ‘southern oscillation.’ Quart. J. Roy. Meteor. Soc., 91, 490–506, doi:10.1002/qj.49709139009.
Wilks, D. S., 1992: Adapting stochastic weather generation algorithms for climate change studies. Climatic Change, 22, 67–84, doi:10.1007/BF00143344.
Wilks, D. S., 1998: Multisite generalization of a daily stochastic precipitation generation model. J. Hydrol., 210, 178–191, doi:10.1016/S0022-1694(98)00186-3.
Wilks, D. S., 1999a: Interannual variability and extreme-value characteristics of several stochastic daily precipitation models. Agric. For. Meteor., 93, 153–169, doi:10.1016/S0168-1923(98)00125-7.
Wilks, D. S., 1999b: Simultaneous stochastic simulation of daily precipitation, temperature and solar radiation at multiple sites in complex terrain. Agric. For. Meteor., 96, 85–101, doi:10.1016/S0168-1923(99)00037-4.
Wilks, D. S., 2008: High-resolution spatial interpolation of weather generator parameters using local weighted regressions. Agric. For. Meteor., 148, 111–120, doi:10.1016/j.agrformet.2007.09.005.
Wilks, D. S., and R. L. Wilby, 1999: The weather generation game: A review of stochastic weather models. Prog. Phys. Geogr., 23, 329–357, doi:10.1191/030913399666525256.
Wise, E. K., 2010: Spatiotemporal variability of the precipitation dipole transition zone in the western United States. Geophys. Res. Lett., 37, doi:10.1029/2009GL042193.
Woolhiser, D. A., 2008: Combined effects of the Southern Oscillation index and the Pacific decadal oscillation on a stochastic daily precipitation model. J. Climate, 21, 1139–1152, doi:10.1175/2007JCLI1862.1.