Statistical Modeling of Monthly and Seasonal Michigan Snowfall Based on Machine Learning: A Multiscale Approach

Lei Meng, School of Environment, Geography, and Sustainability, Western Michigan University, Kalamazoo, Michigan

and
Laiyin Zhu, School of Environment, Geography, and Sustainability, Western Michigan University, Kalamazoo, Michigan

https://orcid.org/0000-0003-3612-7583
Open access

Abstract

Snow is an important component of Earth’s climate system, and snowfall intensity and variation often significantly impact society, the environment, and ecosystems. Understanding monthly and seasonal snowfall intensity and variations is challenging because of multiple controlling mechanisms at different spatial and temporal scales. Using 65 years of in situ snowfall observations, we evaluated seven machine learning algorithms for modeling monthly and seasonal snowfall in the Lower Peninsula of Michigan (LPM) based on selected environmental and climatic variables. Our results show that the Bayesian additive regression tree (BART) has the best fitting (R² = 0.88) and out-of-sample estimation skills (R² = 0.58) for the monthly mean snowfall, followed by the random forest model. The BART also demonstrates strong estimation skills for large monthly snowfall amounts. Both the BART and random forest models suggest that topography, local/regional environmental factors, and teleconnection indices can significantly improve the estimation of monthly and seasonal snowfall amounts in the LPM. These statistical models based on machine learning algorithms can incorporate variables at multiple scales and address nonlinear responses of snowfall variations to environmental/climatic changes. Our results demonstrate that multiscale machine learning techniques provide a reliable and computationally efficient approach to modeling snowfall intensity and variability.

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Laiyin Zhu, laiyin.zhu@wmich.edu

1. Introduction

Snowfall is an important indicator of winter season severity along with low temperatures, freezing rain, winds, and visibility in cold climates (Ford et al. 2021). Snowfall intensity, duration, and amount have either beneficial or adverse impacts on society and the environment (Kunkel et al. 2002), forest ecosystems (Zhou et al. 2021), plant phenology (Bjorkman et al. 2015), and hydrological processes (Kolka et al. 2010). The Lower Peninsula of Michigan (LPM) in the U.S. Midwest region (Fig. 1) experiences substantial amounts of snowfall during winter seasons and frequent extreme snowstorms due to the lake effect. Snowfall variability in the LPM has increased, although long-term averaged snowfall has remained relatively stable since the 1970s (Meng and Ma 2021). Model outputs from phase 6 of the Coupled Model Intercomparison Project (CMIP6) suggest that snowfall intensity will increase while the amount of snowfall might decrease in midlatitudes of North America under future warming (Notaro et al. 2015; Quante et al. 2021). Understanding the variability of snowfall can improve its predictability at monthly to seasonal time scales and benefit winter road maintenance, the ski industry, insurance companies (Zeng 2000), and public health (Bernard et al. 2019; Chiu et al. 2021).

Fig. 1. Locations of all COOP stations with snowfall measurements in the LPM.

Citation: Artificial Intelligence for the Earth Systems 2, 4; 10.1175/AIES-D-23-0016.1

Previous studies have discussed different mechanisms influencing the winter snowfall variability in the Great Lakes region of North America, including local/regional environmental factors and teleconnections. Local/regional environmental factors influence snowfall development through their impacts on atmospheric instability, lift, and moisture exchanges between the surface and atmosphere. Kunkel et al. (2009) indicate a positive relationship between lake water temperatures and snowfall; both have upward trends in the Laurentian Great Lakes. Braham and Dungey (1984) found a strong negative correlation between average winter temperatures and total winter snowfall around Lake Michigan. Our recent studies (Meng and Ma 2021; Meng et al. 2021) and other previous research (Clark et al. 2016) suggest that both the lake-effect snowfall and the seasonal total snowfall variability have significantly negative correlations with regional average winter temperatures in the LPM. Lake surface water temperatures and ice cover also significantly impact lake-effect snowfall due to lake–atmosphere interactions. For instance, a warm lake surface introduces boundary layer instability and facilitates the exchange of moisture and energy, and the wind aloft controls the movement and location of snowbands (Niziol et al. 1995; Peace and Sykes 1966). Several regional modeling studies (Notaro et al. 2013; Shi and Xue 2019) support the same mechanism and describe the roles of lake surface temperatures, ice coverage, and wind directions in the development of lake-effect snowfall.

Teleconnections can influence snowfall in the Great Lakes region through mechanisms at global or regional scales. For example, the upper-level trough patterns favorable to lake-effect snow are often associated with a negative phase of the Arctic Oscillation (AO) and North Atlantic Oscillation (NAO) and/or a positive phase of the Pacific–North American pattern (PNA) (Suriano and Leathers 2017). A statistical model also shows that the inclusion of the PNA, Pacific decadal oscillation (PDO), Northern Hemisphere temperature, and the NAO/AO improved the prediction skill for snowfall at most stations in the United States (Kluver and Leathers 2015). El Niño–Southern Oscillation (ENSO) controls the amount of snowfall in the Midwest by modulating the location of the jet stream (Smith and O’Brien 2001). In the Lake Michigan region, significantly less snowfall has been observed during El Niño (the warm phase of ENSO) years (Clark et al. 2016). Sea surface temperatures in the Niño-3.4 region (SST3.4) are also negatively correlated with both seasonal total snowfall and lake-effect snowfall in the LPM (Meng and Ma 2021; Meng et al. 2021).

As the discussion above shows, most previous studies treat the mechanisms influencing regional snowfall separately, whether regional mechanisms such as lake ice coverage and air temperature or large-scale teleconnections such as ENSO and AO. These atmospheric and oceanic processes operate at different spatial and temporal scales in the development of snowfall events. They may influence monthly and seasonal snowfall in the LPM independently or collectively. One possible approach to understanding the long-term variability of regional snowfall is to dynamically downscale global circulation models (GCMs) by using their outputs as boundary conditions for regional climate model (RCM) simulations. However, several challenges still exist. For example, dynamically downscaling long-term regional climatology requires enormous computational resources (Gutowski et al. 2020). Uncertainties also remain in parameterizing microphysics in precipitation processes and modeling the land–atmosphere feedback. Machine learning models can provide an alternative approach to studying long-term regional and global climate variability based on historical observations. For example, they can be directly applied to regional and global climate predictions or improvement of predictions (Ham et al. 2019). Previous studies have also reported that machine learning models can improve prediction skills for general climate variables like temperature and precipitation (Robertson et al. 2015; Gibson et al. 2021). Machine learning models also demonstrate improved abilities to predict hydroclimate extremes, such as droughts and extreme rainfall (Wei et al. 2022). In addition, machine learning has been used to parameterize cloud microphysics in climate models (Schneider et al. 2017; O’Gorman and Dwyer 2018) to reduce the computation required for global high-resolution simulations. Machine learning models have also been applied to estimating snow and related hazards. For example, a nonlinear autoregressive network model was developed to enhance the spatial resolution of snowfall estimates for the Black Forest using additional topographic information (Sauter et al. 2010). Similarly, the support vector machine (SVM) and multivariate discriminant analysis (MDA) models both perform very well in snow avalanche predictions in the Karaj watershed, northern Iran (Choubin et al. 2019).

The Great Lakes region experiences a significant amount of winter snowfall produced by synoptic systems such as midlatitude cyclones and lake-induced convective systems. So far, very few studies have used machine learning techniques in modeling monthly and seasonal snowfall in this region. This study will evaluate multiple machine learning techniques for modeling snowfall variations in the LPM. We used simultaneous variables in model development because we are particularly interested in the environmental and atmospheric conditions associated with monthly snowfall intensity and variations. The machine learning models will allow us to 1) compare different machine learning approaches for their fitting and out-of-sample estimation skills for monthly and seasonal snowfall in the LPM and 2) identify several important regional environmental factors and teleconnection indices that could influence the snowfall amount in the LPM. Several major new contributions beyond the previous studies (Meng and Ma 2021; Meng et al. 2021) are as follows: 1) this study incorporates more variables into each machine learning statistical model than previous studies; 2) previous studies focus on the snowfall trend and its interannual variability, while this study aims to model monthly and seasonal snowfall variations in the LPM; 3) previous studies only investigate linear correlations between each environmental/atmospheric variable and the snowfall amount, while this study can evaluate both linear and nonlinear responses using machine learning approaches.

2. Data and methods

a. Snowfall and independent variables

This study uses monthly snowfall datasets from eight Cooperative Observer Network (COOP) stations as the response variable. The National Weather Service (NWS) COOP is a network of daily weather observations in the United States. These eight COOP stations are temporally homogeneous and suitable for climate research (Kunkel et al. 2009). Our previous studies used these datasets (Meng and Ma 2021; Meng et al. 2021). Only the dataset from 1951 to 2015 is chosen to match the availability in the corresponding teleconnection indices. Each station’s total monthly snowfall during the peak snowfall season (December, January, and February) is used to develop the machine learning models. Therefore, this study focuses on the total monthly snowfall at each station, a sum of snowfall amounts from individual snow events within each month.

All variables for the development of the statistical models are listed in Table 1. Monthly temperature variables (maxT, minT, savgT, rangeT) and maximum and minimum vapor pressure deficit (vpdmax and vpdmin) at each COOP station are obtained from the nearest grid cell in the Precipitation-Elevation Regressions on Independent Slopes Model (PRISM) dataset with a 4-km spatial resolution. The avgT is extracted from the same data source by averaging monthly temperatures for the entire region. Temperatures are an important indicator of winter weather that has been suggested to be highly correlated with snowfall in multiple previous studies (Braham and Dungey 1984; Kunkel et al. 2009; Meng and Ma 2021; Meng et al. 2021). Therefore, we include different temperature statistics in our machine learning model as independent variables. The vapor pressure deficit (vpd) describes the difference between the actual amount of moisture in the air and the amount of moisture in saturated air at the same temperature. Temperatures and the u (uwind) and v (vwind) components of the surface wind are obtained from the nearest grid cell in the ERA5-Land monthly dataset with a 0.25° spatial resolution (Bell et al. 2021). The wind direction is calculated from uwind and vwind values. The surface wind characteristics describe patterns of winter synoptic systems (e.g., cold fronts) that are important to the processes that lead to snowfall. Temperatures, vpd, and surface wind variables are local or regional dynamic variables (i.e., they change over time).
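The text does not state which convention is used to derive wind direction from uwind and vwind. As a minimal sketch, assuming the standard meteorological convention (degrees clockwise from north, giving the direction the wind blows from), the calculation could look like the following. The function names are illustrative and not taken from the study.

```python
import math


def wind_direction_deg(u: float, v: float) -> float:
    """Direction the wind blows FROM, in degrees clockwise from north.

    Assumes u is the eastward and v the northward wind component.
    For calm wind (u = v = 0) the direction is undefined; this sketch
    returns 180.0 in that case, so callers should check speed first.
    """
    # atan2(u, v) is the bearing the wind blows TOWARD; adding 180 flips it.
    return (180.0 + math.degrees(math.atan2(u, v))) % 360.0


def wind_speed(u: float, v: float) -> float:
    """Scalar wind speed from its two horizontal components."""
    return math.hypot(u, v)


# A wind with u > 0 and v < 0 blows toward the southeast,
# so it is reported as a northwesterly wind.
print(wind_direction_deg(3.0, -3.0), wind_speed(3.0, -4.0))
```

Under this convention a negative vwind corresponds to a wind with a northerly component, which matches how the sign of vwind is read later in the discussion of the partial dependence plots.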

Table 1. List of variables used in the statistical modeling process. All dynamic variables are monthly averages at the ground level, extracted from the PRISM and ERA5-Land reanalysis datasets.

Previous studies suggest that both the PRISM dataset and the ERA5 reanalysis dataset can reproduce observed temperature and wind values with high accuracy. For example, a comparison of minimum and maximum temperatures between the PRISM and Daymet/WorldClim datasets suggests that the difference between these datasets is less than 1°C in Michigan (Daly et al. 2008). All these datasets are created using weather stations in the United States, including the NWS COOP, the USDA Natural Resources Conservation Service (NRCS) SNOTEL network, and the USHCN network. Sheridan et al. (2022) compare wind observations from meteorological towers and sodars across the United States with ERA5 wind and find that ERA5 has an average bias of 0.5 m s⁻¹ and a mean absolute error (MAE) of 1.7 m s⁻¹. Their study also shows a high correlation (around 0.9) between ERA5 and wind observations over the Great Lakes region.

Teleconnection indices used in the models are based on previous literature showing their impacts on the processes leading to snowfall in the United States (Hartnett et al. 2014; Kluver and Leathers 2015; Clark et al. 2016; Suriano and Leathers 2017; Meng and Ma 2021; Meng et al. 2021). All teleconnection indices, including tsi, tni, np, sst34, pna, nhavgT, nao, pdo, and ao are obtained from the NOAA Physical Sciences Laboratory (PSL; https://psl.noaa.gov/data/climateindices/list/). Latitude, longitude, elevation, and the shortest distance (in km) to Lake Michigan shorelines for each COOP station are also included in our models. They do not change with time and are defined as static variables.

b. Model overview

We test seven major categories of machine learning algorithms to obtain the optimal model for monthly snowfall in the LPM. The generalized linear model (GLM) is a family of linear regression models first formulated by Nelder and Wedderburn (1972). GLMs are useful when the response variable reacts nonlinearly to the predictors: a link function (such as a logarithm) relates the linear predictor to the response and allows the variance of each measurement to be a function of its predicted value (Hastie and Tibshirani 1990). The generalized additive model (GAM) is an extension of the GLM in which the response variable depends linearly on smooth functions of some predictor variables (Hastie and Tibshirani 1986). An exponential family distribution is specified for the response variable, and the predictor variables enter through smooth functions such as polynomial, spline, or nonparametric functions. We chose GLM and GAM because they are closely related to multiple linear regression, which has a simple model structure and straightforward model interpretations.

The Bayesian regularization for feed-forward neural networks (BRNN) is a two-layer framework of neural networks that uses the Nguyen and Widrow algorithm (Nguyen and Widrow 1990) to assign initial weight and the Gauss–Newton algorithm to perform the optimization. It has been applied to predicting complex quantitative genetic traits (Gianola et al. 2011). Since deep learning based on a convolutional neural network (CNN) framework is widely applied in climate/weather predictions and model parameterization, we tested a neural network model (BRNN) for snowfall modeling.

The SVM is a machine learning algorithm that discovers an optimal hyperplane that classifies the data points from a multidimensional space (Boser et al. 1992). It has already been used in climate science, such as predicting extreme rainfall events (Nayak and Ghosh 2013) and downscaling precipitation from GCMs (Tripathi et al. 2006). We include the SVM in our model comparison to provide a more easily interpretable machine learning model than complex models like the BRNN.

Multivariate adaptive regression splines (MARS) is a nonparametric statistical modeling approach that can model nonlinearities and interactions without a priori knowledge of the data (Friedman 1991). The algorithm builds an expansion in product spline basis functions, where the data automatically determine the number of functions and their parameters. MARS has been successfully applied to predict monthly runoff in a tropical climate (Reddy et al. 2021) and burned area from wildfire in western boreal North America (Balshi et al. 2009). Therefore, MARS is chosen as an alternative to the machine learning models and provides a sophisticated non-machine-learning comparative model.

The random forest (RF) and Bayesian additive regression trees (BART) are based on ensembles of decision trees. The RF constructs multiple decision trees and averages their predictions (Breiman 2001). It can provide more robust predictions and suffers less from overfitting to the training set than a single decision tree. The BART algorithm is another “sum-of-trees” model in which each tree is constrained to be a weak learner. Fitting and inference are carried out with an iterative Bayesian backfitting Markov chain Monte Carlo (MCMC) algorithm that generates samples from a posterior (Chipman et al. 2010). The BART thus adds a Bayesian prior–posterior framework to ensemble tree modeling. We use the “bartMachine” R package to train our model; the algorithm has three prior components: the tree structure, the leaf parameters, and the error variance (Kapelner and Bleich 2016). Nodes at depth $d$ are nonterminal with probability $\alpha(1 + d)^{-\beta}$, where $\alpha \in (0, 1)$ and $\beta \in [0, \infty)$. Default values of $\alpha = 0.95$ and $\beta = 2$ are recommended by Chipman et al. (2010). The prior on each leaf parameter is $\mu_\ell \sim N(\mu_\mu/m, \sigma_\mu^2)$, where $m$ is the number of trees. The expectation $\mu_\mu$ is picked to be the range center, $(y_{\min} + y_{\max})/2$. The prior on the error variance follows the inverse gamma distribution, $\sigma^2 \sim \mathrm{InvGamma}(\nu/2, \nu\lambda/2)$. The $\lambda$ is determined from the data so that there is a $q = 90\%$ a priori chance (by default) that the BART model will improve upon the root-mean-square error (RMSE) from an ordinary least squares regression. In a previous model comparison (Chipman et al. 2010), the BART’s cross-validation performance was reported to be better than that of boosting, the lasso, MARS, neural nets, and the RF, with less computational demand. We test both RF and BART for our snowfall estimation models. In addition, neither BART nor RF makes explicit assumptions about collinearity among predictors. Both are nonparametric regression models that use an ensemble of decision trees to model the relationship between predictors and the response variable, and they are designed to handle complex, nonlinear relationships without relying on specific assumptions about the data.
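To make the tree-structure prior concrete, the following minimal sketch evaluates the probability that a node at a given depth is nonterminal, using the default hyperparameters quoted above, together with the prior mean of a leaf parameter. It is an illustration of the formulas only, not the bartMachine implementation; the function names and the snowfall range in the example are hypothetical.

```python
def p_nonterminal(depth: int, alpha: float = 0.95, beta: float = 2.0) -> float:
    """Prior probability that a node at `depth` splits: alpha * (1 + depth)^(-beta)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if beta < 0.0:
        raise ValueError("beta must be non-negative")
    return alpha * (1.0 + depth) ** (-beta)


def leaf_prior_mean(y_min: float, y_max: float, n_trees: int) -> float:
    """Prior mean of one leaf parameter: the range center divided by the number of trees."""
    range_center = (y_min + y_max) / 2.0
    return range_center / n_trees


# The split probability falls quickly with depth, which keeps each tree shallow.
for d in range(4):
    print(d, round(p_nonterminal(d), 4))

# Hypothetical response range of 0 to 200 cm and 50 trees (illustrative values only).
print(leaf_prior_mean(0.0, 200.0, 50))
```

With the defaults, a root node splits with prior probability 0.95, but a node at depth 2 splits with probability of only about 0.11, which is how the prior regularizes each tree toward being a weak learner.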

All seven machine learning algorithms are selected based on different rationales that cover a variety of statistical theories. Most of them have never been used to model snowfall intensity and variations. Therefore, we will apply all seven algorithms to model snowfall intensity and variations and evaluate their performances here.

c. Model evaluation and selection

We start our statistical modeling by splitting our data randomly into 80% training and 20% testing data. The classification and regression training (caret) R package (Kuhn 2008) is used to select the optimal combination of variables for each model in the model training stage based on the recursive feature elimination (RFE) function. For example, the RFE uses a stepwise feature selection for the GLM model and a cross-validated recursive variable selection for GAM, SVM, MARS, RF, and BART. There is no variable selection for BRNN. The RFE tests all possible variable combinations in the model and evaluates their cross-validation results (RMSE, R², and MAE from 10-fold cross validation) before selecting the model with the best results. We train these statistical models using the selected variables and compare the model-fitting results for all seven algorithms.
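The splitting scheme described above can be sketched in a few lines. This is a language-neutral illustration in Python rather than the caret workflow used in the study; the record count of 1560 (8 stations × 3 months × 65 years) assumes no missing months, and the seed and function names are arbitrary choices made for the example.

```python
import random


def train_test_split_indices(n_records, test_fraction=0.2, seed=0):
    """Randomly partition record indices into (train, test) lists."""
    rng = random.Random(seed)
    indices = list(range(n_records))
    rng.shuffle(indices)
    n_test = round(n_records * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield training, validation


# Assumed record count: 8 stations x 3 months x 65 years, with no gaps.
train_idx, test_idx = train_test_split_indices(8 * 3 * 65)
print(len(train_idx), len(test_idx))
for fold_train, fold_val in k_fold_indices(train_idx, k=10):
    pass  # fit on fold_train, score RMSE / R2 / MAE on fold_val
```

The 10-fold loop runs only inside the 80% training portion, so the 20% testing data stay untouched until the out-of-sample evaluation.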

The next step is to use the 20% randomly selected testing data to execute the out-of-sample cross validation to test the models’ sensitivity and robustness to new data. We use the identical models trained in the previous step to estimate snowfall observations in the testing data and make comparisons. The RMSE, MAE, and R² are used to gauge the different models’ out-of-sample estimation skills. The 20%/80% cross validation is designed to test all models’ general performance in out-of-sample estimation. Next, we choose the two machine learning algorithms with the best estimation skills for further evaluation: the hold-1-yr-out cross validation. The purpose is to test the models’ sensitivity and stability in estimating the annual variation of snowfall in the LPM. It starts by iteratively holding out each year’s data and training a model only with the other 64 years. Then, the fitted model estimates monthly snowfall at all stations for each hold-out year. We then compare the annually aggregated (three months) estimations and observations and evaluate the performance of the selected models.
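The three skill scores and the hold-1-yr-out loop can be sketched as follows. Two points are assumptions rather than statements from the text: R² is computed here as the coefficient of determination (1 minus the ratio of residual to total sum of squares), whereas the study may use the squared correlation, and each record is assumed to carry a single season-year label so that December is grouped with the following January and February.

```python
import math


def rmse(obs, est):
    """Root-mean-square error between observations and estimates."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))


def mae(obs, est):
    """Mean absolute error between observations and estimates."""
    return sum(abs(e - o) for o, e in zip(obs, est)) / len(obs)


def r_squared(obs, est):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    if ss_tot == 0.0:
        raise ValueError("observations have zero variance")
    return 1.0 - ss_res / ss_tot


def hold_one_year_out(season_years):
    """Yield (held_out_year, train_indices, test_indices) for every distinct year."""
    for held_out in sorted(set(season_years)):
        train = [i for i, y in enumerate(season_years) if y != held_out]
        test = [i for i, y in enumerate(season_years) if y == held_out]
        yield held_out, train, test


# Toy example with three season-years; real data would span 65 years.
labels = [1951, 1951, 1952, 1952, 1953, 1953]
for year, train, test in hold_one_year_out(labels):
    print(year, train, test)
```

In each iteration a model would be fitted on the training indices and used to estimate the held-out year, after which the monthly estimates are summed per station before the scores are computed.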

One of the major challenges for all research using machine learning techniques is model interpretation (Molnar et al. 2020). Model interpretation helps identify important variables/physical processes involved in the machine learning model and understand how the dependent variables interact with independent variables. We can calculate the variable importance of the different machine learning approaches used in this study (Grömping 2009). For example, the t statistic for each model parameter is used for GLM. The reduction (addition) to the model performance (such as residual sums of squares) when a predictor is added to (removed from) the model is calculated as the variable importance (VI) for MARS, RF, and BART. For better comparison across models, we calculate relative VIs on a 0–100 scale. Finally, the partial dependence plot (PDP) is a useful tool to demonstrate the marginal effect of each predictor on the response variable (Friedman 2001). The model’s linear and nonlinear relationships between snowfall and any specific predictor can be interpreted by analyzing the predictor’s PDP. Combining the VI and PDP, we expect to reveal important physical mechanisms that control snowfall in the LPM.
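Both interpretation tools are simple to express once a fitted model is available. The sketch below rescales raw importance scores so that the top variable is 100, which is one common convention and may differ from the exact scaling used in the study, and computes a partial dependence curve by averaging predictions over the data while one predictor is fixed at each grid value. The toy model and its coefficients are hypothetical stand-ins for a fitted RF or BART.

```python
def relative_importance(raw_scores):
    """Rescale raw importance scores so the largest becomes 100 (assumed convention)."""
    top = max(raw_scores.values())
    if top <= 0:
        raise ValueError("at least one importance score must be positive")
    return {name: 100.0 * score / top for name, score in raw_scores.items()}


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        predictions = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(predictions) / len(predictions))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted model: snowfall falls as maxT rises."""
    return 50.0 - 2.0 * row["maxT"] + 0.1 * row["elevation"]


# Hypothetical records and raw scores, for illustration only.
rows = [{"maxT": -3.0, "elevation": 200.0}, {"maxT": 1.0, "elevation": 300.0}]
print(partial_dependence(toy_model, rows, "maxT", [-5.0, 0.0]))
print(relative_importance({"maxT": 4.0, "elevation": 2.0, "nao": 1.0}))
```

Because the curve averages over the observed values of all other predictors, a flat segment indicates that the model's estimate is insensitive to the predictor over that range, while bends reveal the nonlinear responses discussed in the results.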

3. Results and discussions

a. Linear correlation

Table 2 shows linear correlations between all monthly snowfall and environmental/climatological factors. Maximum temperature (maxT) has the highest correlation with the snowfall in the LPM, followed by the regional averaged temperatures (avgT). Both maxT and avgT have a strong negative correlation with snowfall. The station’s geographical location also impacts the amount of snowfall. More snowfall is associated with stations closer to Lake Michigan and at higher latitudes. Most teleconnections have weak or no statistically significant correlations with snowfall, and the two strongest signals are SST3.4 and North Atlantic Oscillation.
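The linear correlations reported here are presumably Pearson coefficients, although the text does not name the estimator. Under that assumption, a minimal sketch of the calculation for one predictor is given below; the sample values are made up purely to show a negative relationship of the kind found for maxT.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up monthly values: warmer months paired with less snowfall.
max_temp_c = [-4.0, -2.0, 0.0, 1.0, 3.0]
snowfall_cm = [80.0, 65.0, 40.0, 35.0, 10.0]
print(round(pearson_r(max_temp_c, snowfall_cm), 3))
```

A coefficient of this kind captures only the linear part of the response, which is why the machine learning models in the following sections are used to examine nonlinear and combined effects.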

Table 2. Linear correlations between Michigan snowfall and independent variables. The asterisk indicates a statistically significant correlation at the 99% level.

b. Statistical model fitting

The correlation analysis provides information about snowfall’s linear response to individual predictors. In this section, we will evaluate seven different machine learning algorithms to explore their combined effect. We use the RFE algorithm to test all possible combinations of predictors and select the optimal combination [statistics shown as supplemental information (SI) 1–5 in the online supplemental material]. The number of required variables for the best fitting models varies from 4 to all independent (24) variables. Significant differences exist in the model fitting accuracy among the seven algorithms (Table 3). For example, the R² varies from 25% (SVM) to 88% (BART). The BART model also has the lowest MAE and RMSE, followed by the RF model.

Table 3. The model fitting result for the 80% training data.

The VI rankings also show differences among seven machine learning algorithms (Table 4). Similar to the correlation analysis, maxT is the most important controlling variable in GLM, MARS, SVM, and RF statistical models and the third most important variable in the BART model; avgT is another important variable in many machine learning models (ranked second in GLM, SVM, and RF). Finally, the elevation is an important static variable along with the latitude, which appears in the top 10 VIs of most models except the SVM. For the BART with the best fitting performance, the top 5 most important variables are vpdmax, dist2shore, maxT, rangeT, and elevation.

Table 4. Relative VI for different ML algorithms [only up to the 10 most important variables (var) are shown].

c. Out of sample validation

Next, we use those trained models to make out-of-sample validations using the 20% testing data (Fig. 2). Comparison of model estimations against observations demonstrates that most models are robust and have stable estimation skills. BART has the best estimation skill, followed by RF, BRNN, GAM, GLM, MARS, and SVM, based on the MAE and RMSE statistics. BART has R² = 0.58 with RMSE = 18.4 cm and MAE = 13.83 cm. The RF model’s estimation skill is slightly lower than that of BART, with R² = 0.55 and MAE = 14.43 cm. The BART model has the best estimation skill for monthly snowfall > 100 cm (Table 5) compared with GLM, GAM, SVM, MARS, and RF, which all have systematic underestimates. The BRNN’s estimates (Fig. 2c) for >100 cm snowfalls spread much more widely than those of the BART, although there are both underestimates and overestimates. For observations of snowfall less than 100 cm, the BART also performs much better than the other models.

Fig. 2. The model estimation skills for different ML algorithms, calculated from the out-of-sample cross validation using the 20% testing data. Estimations and observations (same 20% testing data) are compared with the y = x line (red).

Table 5. The error statistics for >100 and <100 cm monthly snowfall for the 20% testing data among different models. ME stands for mean error, calculated as model estimation minus observation. Units are cm.
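The mean error in Table 5 is defined as model estimation minus observation, computed separately for months above and below 100 cm. A minimal sketch of that stratified calculation follows; the sample numbers are invented for illustration, and months with exactly 100 cm fall into neither group because the caption uses strict inequalities.

```python
def mean_error(obs, est):
    """Mean of (estimate - observation); negative values indicate underestimation."""
    return sum(e - o for o, e in zip(obs, est)) / len(obs)


def mean_error_by_threshold(obs, est, threshold_cm=100.0):
    """Mean error for observations above and below the threshold (None if a group is empty)."""
    pairs = list(zip(obs, est))
    groups = {
        "above": [(o, e) for o, e in pairs if o > threshold_cm],
        "below": [(o, e) for o, e in pairs if o < threshold_cm],
    }
    result = {}
    for name, members in groups.items():
        if members:
            group_obs, group_est = zip(*members)
            result[name] = mean_error(group_obs, group_est)
        else:
            result[name] = None
    return result


# Invented monthly values (cm), not taken from the study.
observed = [120.0, 150.0, 40.0, 60.0]
estimated = [100.0, 140.0, 45.0, 65.0]
print(mean_error_by_threshold(observed, estimated))
```

A negative mean error in the "above" group is the signature of the systematic underestimation of large monthly totals described for most of the models.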

We further evaluate model performances through the hold-1-yr-out cross validation for BART and RF because they have already shown the best performance in model fitting and the 20%/80% cross validation. Figure 3 indicates that the BART model can explain 62%–92.4% (the range for 8 stations) of seasonal snowfall variance (summed over December, January, and February), while RF can explain 15%–38.6% of seasonal snowfall variance in the LPM, as indicated by their R². When the snowfall estimations are averaged over the 8 stations, there is a significant improvement in model skill (Fig. 3i). The BART model demonstrates an exceptionally high R² value of 99.8% and low values of MAE (1.32 cm) and RMSE (1.96 cm). The random forest model also shows improvement when we evaluate the regional mean snowfall estimation (R² = 0.5, MAE = 19.67 cm) as compared with the single-station estimations. We believe this improvement arises mainly because the seasonal aggregation and the regional averaging smooth out some local variability that the machine learning models do not capture very well. Besides interannual and interdecadal variability, both RF and BART also capture temporal trends in the time series at some locations (such as East Jordan, Battle Creek, and the regional mean).

Fig. 3. Hold-1-yr-out cross validation results for RF and BART models. For each year, RF and BART are trained using only monthly snowfall observations from the other 64 years. Each hold-out model is used to estimate the monthly snowfall of all stations for the hold-out year, and the estimations are then aggregated annually for each station.

d. Discussion of important variables and model interpretations

The PDPs can be used to estimate each independent variable’s marginal effect on the predicted outcome of a machine learning model (Friedman 2001). The PDPs for BART (Fig. 4) show that snowfall generally increases as maxT, minT, and avgT decrease. This agrees with our correlation analysis and previous studies (Meng and Ma 2021; Meng et al. 2021). Air temperatures can influence the rate of ice accretion, the sublimation/deposition of snow, and the mean size of snow aggregates (Hong et al. 2004). The BART PDP demonstrates a nonlinear response of snowfall to avgT (Fig. 4s), which may indicate complex feedback between snowfall and air temperature. The RF PDP reveals a more monotonic positive relationship between snowfall and savgT (Fig. 5i). At a much larger scale, Northern Hemisphere average temperatures (nhavgT) are positively correlated with the amount of snowfall in the LPM in both statistical models (Figs. 4o and 5o). This large-scale signal could be related to the breakdown of the polar vortex due to the melting of Arctic sea ice (Francis and Vavrus 2012) or to the general warming of water temperatures, which may intensify lake-effect snow or midlatitude cyclones.

Fig. 4.

The PDPs for the BART model developed with the 80% training data.


Fig. 5.

The PDPs for the RF model developed with the 80% training data; each PDP shows how the dependent variable changes with one of the predictors used in the model.


Static variables such as latitude, dist2shore, and elevation also play essential roles in both the RF and BART statistical models. Snowfall has a nonlinear relationship with elevation in the BART model: snowfall decreases with elevation from 0 to 250 m (Fig. 4h), increases as elevation goes from 250 to 300 m, and then decreases again when elevation is above 300 m. In the RF model, snowfall decreases with elevation until 250 m and levels off when elevation is above 250 m (Fig. 5e). This suggests that, with the possible influence of topography, the lake effect probably produces significant snowfall in the peak season. Our previous study on regional lake-effect snowfall indicates that approximately 49.6% of snowfall in the peak season (December, January, and February) can be classified as lake-effect snowfall in the LPM (Meng and Ma 2021). Snowfall increases as the station’s latitude increases in both BART (Fig. 4e) and RF (Fig. 5f). The relationship is possibly controlled by the impact of latitude on temperature and snow frequency: temperature decreases, and snow frequency generally increases, with latitude (Shi and Liu 2021).

BART’s PDPs also demonstrate that the maximum vapor pressure deficit (vpdmax) (Fig. 4a) has a nonlinear relationship with snowfall. More snowfall generally corresponds to vpdmax < 14 kPa. In our case, low vapor pressure deficit values (<14 kPa) correspond to a high amount of atmospheric water vapor, which is favorable to the nucleation process in all kinds of precipitation, including snowfall (Maeda 2021). Snowfall fluctuates when vpdmax > 14 kPa and shows two small peaks when the vpdmax is at ∼14.5 and ∼21 kPa (Fig. 4a). In the RF model, the vpdmax relationship has the same direction but with less nonlinearity (Fig. 5g). Interestingly, the BART model shows that higher snowfall is associated with weak vwind from the north (negative anomaly) and wind directions from 250° to 310° (northwesterly). During the winter, the North American high transports cold air from the north, which interacts with the warm air from the south to form synoptic winter storms. The cold air also interacts with the warm surface water, creating the lake-effect snow in Michigan. ENSO intensity also controls this process. For example, La Niña winters are associated with a polar jet displaced to the Great Plains (Smith and O’Brien 2001) and cooler/wetter winters in the upper Midwest (Budikova et al. 2022).

Both the BART (Fig. 4q) and RF (Fig. 5l) statistical models suggest that larger snowfall amounts in the LPM are associated with the cold phase of ENSO (La Niña) (Meng et al. 2021). Negative correlations between ENSO and winter snowfall in Michigan have also been identified by recent studies (Meng and Ma 2021; Meng et al. 2021). Previous studies have discussed how ENSO regulates the Pacific jet stream and influences the storm tracks over the continental United States (Trenberth and Guillemot 1996; Chen and Kumar 2002). Studies also suggest that general circulation patterns during El Niño years tend to block cold Arctic air masses from moving into the Great Lakes region, leading to warm temperatures and less lake-effect snowfall in the LPM (Smith and O’Brien 2001; Bai et al. 2012).

Besides ENSO, several other teleconnection indices also show different influences on LPM snowfall in the BART and RF models. The North Pacific index (NP) is calculated as the area-weighted sea level pressure over the North Pacific (Trenberth and Hurrell 1994). The NP is closely related to the tropical and subtropical SST through ocean–atmosphere interactions and interacts with the ENSO cycle. Snowfall amounts substantially increase when the NP exceeds 1006 mb (1 mb = 1 hPa) in the PDPs (Figs. 4f and 5j). This is consistent with Chen and Song (2018), who show significant negative relationships between NP and temperatures in central Canada and the U.S. Great Lakes region, because higher NP corresponds to lower temperatures and a higher likelihood of heavy snowfall events. The PDO reflects changes in SST in the North Pacific and sea level pressures over the Aleutian Islands (Mantua et al. 1997; Newman et al. 2016) and has teleconnections with winter temperatures and precipitation patterns in a large portion of the Midwest. Both BART and RF PDPs demonstrate that snowfall generally increases when the PDO anomaly is negative. Previous studies also indicate that negative phases of PDO are typically associated with above-normal winter precipitation in a large portion of the interior United States (Mantua et al. 1997; Newman et al. 2016). The PNA describes a quadrupole pattern of 500-mb height anomalies and is a dominant mode of low-frequency variability in the Northern Hemisphere extratropics (Li et al. 2019). It strongly influences precipitation in North America by modifying polar jet flows and associated storm tracks. Negative PNA phases are usually more favorable to the northern displacement of the jet stream over the eastern United States. This displacement frequently causes intrusions of maritime tropical air from the Gulf of Mexico (Leathers et al. 1991; Budikova et al. 2022) and enhanced local precipitation in the eastern United States. The PDPs (Figs. 4m and 5m) show similar patterns, where negative PNA anomalies are generally associated with more snowfall in the LPM. This PDP pattern agrees with the finding of Budikova et al. (2022) that negative phases of PNA are often associated with more severe winters in the Great Lakes region through the existence of a well-developed trough over the region (a negative PNA-like pattern) (Lin et al. 2022). The BART PNA PDP has more nonlinearity than that of the RF, with a spike of snowfall increase when the PNA value is between −0.8 and 0.5 (Fig. 4m).
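As an illustration of what "area-weighted" means for an index such as the NP, the sketch below averages a gridded pressure field using cosine-of-latitude weights, which approximate relative grid-cell area on a regular latitude–longitude grid. The grid, the pressure values, and the function name `area_weighted_mean` are hypothetical, and the actual averaging region of the index is not reproduced here.

```python
import math


def area_weighted_mean(field, lats):
    """Mean of field[i][j] on a regular lat-lon grid, weighting each
    latitude row i by cos(latitude) as a proxy for grid-cell area."""
    numerator = 0.0
    denominator = 0.0
    for row, lat in zip(field, lats):
        weight = math.cos(math.radians(lat))
        numerator += weight * sum(row)
        denominator += weight * len(row)
    return numerator / denominator


# Hypothetical sea level pressure values (mb) on a 3 x 2 grid.
lats = [30.0, 45.0, 60.0]
slp = [
    [1015.0, 1016.0],
    [1008.0, 1010.0],
    [1000.0, 1002.0],
]
weighted = area_weighted_mean(slp, lats)
simple = sum(sum(row) for row in slp) / sum(len(row) for row in slp)
print(f"area-weighted mean = {weighted:.2f} mb, unweighted mean = {simple:.2f} mb")
```

Rows at lower latitudes cover more area and therefore receive larger weights, so the weighted and unweighted means differ whenever pressure varies with latitude.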

The tropical Southern Atlantic index (TSI; Fig. 4b) has a generally positive relationship with snowfall but also demonstrates high nonlinearity. The TSI is a climate teleconnection index remote from the study area, but it appears among the top 10 independent variables with the highest VI in the MARS and BART models (Table 4). As this physical linkage appears tenuous, we can only speculate that this correlation may reflect the close relationship between the TSI and the NAO on interannual to decadal time scales (Marshall et al. 2001). The NAO and AO control the upper-level winds and the polar vortex in the Northern Hemisphere (Budikova et al. 2022). Positive NAO/AO phases are associated with a stronger polar vortex that locks the cold air in the higher latitudes, while negative NAO/AO phases are usually associated with an enhanced meandering of the polar jet and outbreaks of colder air into the lower latitudes (Budikova 2012). This cold air usually brings extremely low temperatures and snowfall (Ghatak et al. 2010). The NAO and AO appear only in the BART model, with minor variable importance. Figure 4r shows a negative relationship between snowfall and NAO. A more complex relationship exists between snowfall and AO (Fig. 4x).

e. Estimation of large monthly snowfall

Our results show that BART and RF are the best of the seven models we tested for estimating monthly snowfall in the LPM. To evaluate their performance in estimating large snowfall amounts and to examine contributing factors, we select the upper 30% (>70th percentile) of all monthly snowfall observations at the 8 stations (a total of 1344 samples) to develop two new BART and RF statistical models (M70s hereafter). Results (Table 6) show that the fitting statistics of both M70s have degraded compared to those of the general models (Table 4). The R2 for RF has changed from 0.58 to 0.30 (−48%), while the R2 for BART has changed from 0.88 to 0.63 (−28%). The RF’s RMSE increased from 20.24 to 21.06 (+4%) and its MAE increased from 15.29 to 15.93 (+4%), while the BART’s RMSE increased from 11.07 to 15.29 (+38%) and its MAE increased from 8.48 to 11.53 (+36%). Therefore, the RF has larger relative changes in R2 and the BART has larger relative changes in RMSE and MAE. Meanwhile, the BART model still performs better than the RF model, with higher R2 and lower MAE and RMSE.
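The relative changes quoted above follow from the reported statistics as 100 × (new − old)/old. The short check below reproduces them; the metric values are those given in the text, while the dictionary layout and the function name `percent_change` are only illustrative.

```python
# Fitting statistics quoted in the text: general models vs. M70 models.
general = {
    "RF": {"R2": 0.58, "RMSE": 20.24, "MAE": 15.29},
    "BART": {"R2": 0.88, "RMSE": 11.07, "MAE": 8.48},
}
m70 = {
    "RF": {"R2": 0.30, "RMSE": 21.06, "MAE": 15.93},
    "BART": {"R2": 0.63, "RMSE": 15.29, "MAE": 11.53},
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


changes = {
    model: {
        metric: round(percent_change(general[model][metric], m70[model][metric]))
        for metric in general[model]
    }
    for model in general
}
for model, metrics in changes.items():
    print(model, metrics)
```

The rounded values match the percentages in the paragraph: the RF loses proportionally more R2, whereas the BART error metrics grow proportionally more.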

Table 6.

Fitting statistics for the RF and BART models based on the upper 30% (>70th percentile) of snowfall data.


Regarding VIs, both RF and BART M70s (Table 7) show slight differences from the general models. It is interesting to note that maxT and vpdmax are the two most important predictors for both M70s. More snowfall corresponds to lower maxT and lower vpdmax (Fig. 6), similar to their relationships with snowfall in general models. Fluctuations in estimated snowfall in the BART general model (Fig. 4a) in the higher range of vpdmax (>12 kPa) are not present in the BART M70 (Fig. 6b), indicating that vpdmax might be the dominant control of larger monthly snowfall in the BART M70. Other temperature variables (rangeT, avgT, minT) are also important in both RF and BART M70s, showing negative relationships with the snowfall. The NP is the only teleconnection variable in the top 10 VI list for both RF and BART (Table 7). In Fig. 6 and Fig. SI 6, we also find that the NP and other teleconnection variables (ENSO, PDO, NAO, AO, and PNA) follow their relationship with the LPM snowfall in general models (Figs. 4 and 5). The tsi has a positive relationship with snowfall in PDPs for both BART (Fig. 6d) and RF (Fig. SI 6i).

Table 7.

Relative VI for the RF and BART 70th percentile models (M70s).

Fig. 6.

The PDPs for the BART model developed with the upper 30% (>70th percentile) of snowfall data.


Many previous studies have mentioned that modeling climate extremes is challenging for current Earth system models and machine learning models (Zwiers et al. 2013; Sillmann et al. 2017; Zhu and Aguilera 2021). Our results show that the RF and BART M70s have some degradation in their fitting skills. Encouragingly, despite the large variability and uncertainty in the upper 30% of snowfall observations, both models still explain a substantial component of the variance. This result suggests that the BART model reasonably estimates monthly snowfall extremes (the upper 30% of total monthly snowfalls) in the LPM.
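Selecting the upper 30% of observations amounts to thresholding at the 70th percentile. The sketch below shows one way to do this; the snowfall values are hypothetical, and the linear-interpolation percentile rule is an assumption, since the text does not state which percentile definition was used.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, using linear
    interpolation between the two nearest order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical monthly snowfall totals (cm).
snow = [2.0, 5.5, 8.0, 12.5, 15.0, 21.0, 30.5, 41.0, 55.0, 73.5]
threshold = percentile(snow, 70)
upper_30 = [value for value in snow if value > threshold]
print(f"70th percentile = {threshold:.2f} cm; {len(upper_30)} of {len(snow)} samples retained")
```

A model trained only on the retained samples sees a narrower range of the response, which is one reason why R2 can drop even when absolute errors change little.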

4. Conclusions

Our goal in this study is to evaluate the performance of seven machine learning techniques in the statistical modeling of monthly snowfall in the LPM and to identify important variables that could be used for monthly and seasonal snowfall estimations. Our analysis suggests that temperature is one of the most important variables in modeling snowfall in the LPM using machine learning techniques. At each station, maximum and minimum temperatures have a substantially greater impact on snowfall than average temperatures. This indicates that the snow formation process in this region is more sensitive to temperature variation. Similar results are found in the Canadian domain of the Great Lakes basin (Baijnath-Rodino et al. 2018). At the global level, the Northern Hemisphere averaged temperatures are positively correlated with the LPM snow. Physical mechanisms for this relationship, such as polar vortex breakdown or increased temperature difference between the lake surface and air, need further investigation (Agee and Hart 1990; Meng and Ma 2021).

Our statistical models also demonstrate that latitude, elevation, and distance to the shoreline are important predictors of monthly snowfall in the LPM. The importance of these variables is possibly associated with regional physical processes that lead to the development of lake-effect snowfall events. The impact of elevation on snowfall amounts in the Great Lakes region has been mentioned in the previous literature (Hill 1971; Niziol 1987). RCM simulations also confirm that annual snowfall and frequency (days per year) decrease as the downwind distance from the Great Lakes increases (Notaro et al. 2013). Including these local static variables has substantially improved the estimation skill of our machine learning statistical models.

Seven teleconnection indices, including NP, SST3.4, PNA, NAO, PDO, TNI, and AO, are included in our statistical models. Machine learning techniques have no assumption of noncollinearity among independent variables. These teleconnections can work together to improve the model estimation skill. Our results demonstrate several important teleconnection indices in the statistical models, including SST3.4, PDO, and NP. These indices have nonlinear or linear relationships with snowfall in the LPM. Further investigations are needed to validate the physical processes reflected by those relationships in the machine learning models. In addition, we need to further investigate the partition of snowfall into lake-effect and non-lake-effect snowfall in the Great Lakes region because these two types of snowfall are generated by different physical mechanisms (Pettersen et al. 2020).

Our comparison of machine learning models suggests that the BART and the random forest models can accurately model mean and larger monthly LPM snowfall. These machine learning approaches can incorporate both dynamic atmospheric/oceanic signals from multiple scales and static environmental variables in the statistical models and provide a reliable and computationally efficient alternative to dynamic models (Chantry et al. 2021) and a new way to identify possible physical mechanisms.

The findings from this study could be used to investigate snowfall variations in other lake-effect regions. The important variables identified in this study could change for non-lake-effect regions due to different physical mechanisms in the production of non-lake-effect and lake-effect snowfalls. Current snowfall measurements in the lake-effect regions do not separate non-lake-effect snowfall from lake-effect snowfall. Therefore, our analysis in this study includes both lake-effect and non-lake-effect snowfall. Very few studies have examined the trend and variations of only lake-effect snowfall since snowfall measurements at weather stations include both lake-effect and non-lake-effect snowfall. Meng and Ma (2021) explicitly calculated the average of lake-effect snowfall from total seasonal snowfall at the regional scale. Whether the method used by Meng and Ma (2021) can be applied at each weather station needs further validation.

Machine learning models can identify important environmental and climatological variables and their relationships with snowfall in the LPM. However, they cannot thoroughly verify the physical mechanisms behind the statistical relationships. For example, the snowpack and air temperature may have complex feedback mechanisms (Scherrer et al. 2012) that may not be captured by machine learning–based statistical models. Process-based climate models are necessary to understand atmospheric and hydrological processes leading to snowfall variations at different temporal and spatial scales. Overall, our analysis suggests that machine learning statistical models can incorporate the nonlinear responses of snowfall to several variables and have the potential to improve monthly and seasonal snowfall estimations. Our analysis did not include preceding (lagged) variables because we focused on the simultaneous environmental and climatic conditions associated with total monthly snowfall and on the development of machine learning statistical models. In the future, machine learning models can be tested for other snow-prone regions and used to predict regional snowfall variability and changes before the snow season or for the future based on Coupled Model Intercomparison Project (CMIP) climate projections.

Acknowledgments.

L. M. and L. Z. are both supported by the Publication of Papers and Exhibition of Creative Works (PPP&E) from Western Michigan University. We acknowledge the valuable thoughts and suggestions provided by two anonymous reviewers and editor Dr. John Allen in the publication process. Both authors contributed equally to this research, including conceptualization and design of the research, data analysis, discussion, and manuscript writing and revisions. The authors declare no competing interests.

Data availability statement.

All machine learning software packages are available at The Comprehensive R Archive Network (https://cran.r-project.org/). The snowfall dataset and codes for machine learning model training and testing are available at Zenodo (https://zenodo.org/record/7018137#.YwW3OXbMJD8) with the DOI https://doi.org/10.5281/zenodo.7018137. Please feel free to contact the authors for additional information about the data and the codes.

REFERENCES

  • Agee, E. M., and M. L. Hart, 1990: Boundary layer and mesoscale structure over Lake Michigan during a wintertime cold air outbreak. J. Atmos. Sci., 47, 2293–2316, https://doi.org/10.1175/1520-0469(1990)047<2293:BLAMSO>2.0.CO;2.

  • Bai, X., J. Wang, C. Sellinger, A. Clites, and R. Assel, 2012: Interannual variability of Great Lakes ice cover and its relationship to NAO and ENSO. J. Geophys. Res., 117, C03002, https://doi.org/10.1029/2010JC006932.

  • Baijnath-Rodino, J. A., C. R. Duguay, and E. LeDrew, 2018: Climatological trends of snowfall over the Laurentian Great Lakes Basin. Int. J. Climatol., 38, 3942–3962, https://doi.org/10.1002/joc.5546.

  • Balshi, M. S., A. D. McGuire, P. Duffy, M. Flannigan, J. Walsh, and J. Melillo, 2009: Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach. Global Change Biol., 15, 578–600, https://doi.org/10.1111/j.1365-2486.2008.01679.x.

  • Bell, B., and Coauthors, 2021: The ERA5 global reanalysis: Preliminary extension to 1950. Quart. J. Roy. Meteor. Soc., 147, 4186–4227, https://doi.org/10.1002/qj.4174.

  • Bernard, D., S. Konate, and E. Savoia, 2019: Snow storms and vulnerable populations: Local public health activities in response to the 2014–2015 severe winter weather. Disaster Med. Public Health Prep., 13, 647–649, https://doi.org/10.1017/dmp.2018.81.

  • Bjorkman, A. D., S. C. Elmendorf, A. L. Beamish, M. Vellend, and G. H. R. Henry, 2015: Contrasting effects of warming and increased snowfall on Arctic tundra plant phenology over the past two decades. Global Change Biol., 21, 4651–4661, https://doi.org/10.1111/gcb.13051.

  • Boser, B. E., I. M. Guyon, and V. N. Vapnik, 1992: A training algorithm for optimal margin classifiers. COLT’92: Proc. Fifth Annual ACM Workshop on Computational Learning Theory, Pittsburgh, PA, Association for Computing Machinery, 144–152, https://doi.org/10.1145/130385.130401.

  • Braham, R. R., and M. J. Dungey, 1984: Quantitative estimates of the effect of Lake Michigan on snowfall. J. Climate Appl. Meteor., 23, 940–949, https://doi.org/10.1175/1520-0450(1984)023<0940:QEOTEO>2.0.CO;2.

  • Breiman, L., 2001: Random forests. Mach. Learn., 45, 5–32, https://doi.org/10.1023/A:1010933404324.

  • Budikova, D., 2012: Northern Hemisphere climate variability: Character, forcing mechanisms, and significance of the North Atlantic/Arctic Oscillation. Geogr. Compass, 6, 401–422, https://doi.org/10.1111/j.1749-8198.2012.00498.x.

  • Budikova, D., T. W. Ford, and J. D. Wright, 2022: Characterizing winter season severity in the Midwest United States, Part II: Interannual variability. Int. J. Climatol., 42, 3499–3516, https://doi.org/10.1002/joc.7429.

  • Chantry, M., H. Christensen, P. Dueben, and T. Palmer, 2021: Opportunities and challenges for machine learning in weather and climate modelling: Hard, medium and soft AI. Philos. Trans. Roy. Soc., A379, 20200083, https://doi.org/10.1098/rsta.2020.0083.

  • Chen, J., and P. Kumar, 2002: Role of terrestrial hydrologic memory in modulating ENSO impacts in North America. J. Climate, 15, 3569–3585, https://doi.org/10.1175/1520-0442(2003)015<3569:ROTHMI>2.0.CO;2.

  • Chen, S., and L. Song, 2018: Impact of the winter North Pacific Oscillation on the surface air temperature over Eurasia and North America: Sensitivity to the index definition. Adv. Atmos. Sci., 35, 702–712, https://doi.org/10.1007/s00376-017-7111-5.

  • Chipman, H. A., E. I. George, and R. E. McCulloch, 2010: BART: Bayesian additive regression trees. Ann. Appl. Stat., 4, 266–298, https://doi.org/10.1214/09-AOAS285.

  • Chiu, Y. M., F. Chebana, B. Abdous, D. Bélanger, and P. Gosselin, 2021: Cardiovascular health peaks and meteorological conditions: A quantile regression approach. Int. J. Environ. Res. Public Health, 18, 13277, https://doi.org/10.3390/ijerph182413277.

  • Choubin, B., M. Borji, A. Mosavi, F. Sajedi-Hosseini, V. P. Singh, and S. Shamshirband, 2019: Snow avalanche hazard prediction using machine learning methods. J. Hydrol., 577, 123929, https://doi.org/10.1016/j.jhydrol.2019.123929.

  • Clark, C. A., and Coauthors, 2016: Spatiotemporal snowfall variability in the Lake Michigan region: How is warming affecting wintertime snowfall? J. Appl. Meteor. Climatol., 55, 1813–1830, https://doi.org/10.1175/JAMC-D-15-0285.1.

  • Daly, C., M. Halbleib, J. I. Smith, W. P. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris, 2008: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol., 28, 2031–2064, https://doi.org/10.1002/joc.1688.

  • Ford, T. W., D. Budikova, and J. D. Wright, 2021: Characterizing winter season severity in the Midwest United States, Part I: Climatology and recent trends. Int. J. Climatol., 42, 3537–3552, https://doi.org/10.1002/joc.7431.

  • Francis, J. A., and S. J. Vavrus, 2012: Evidence linking Arctic amplification to extreme weather in mid-latitudes. Geophys. Res. Lett., 39, L06801, https://doi.org/10.1029/2012GL051000.

  • Friedman, J. H., 1991: Multivariate adaptive regression splines. Ann. Stat., 19 (1), 1–67, https://doi.org/10.1214/aos/1176347963.

  • Friedman, J. H., 2001: Greedy function approximation: A gradient boosting machine. Ann. Stat., 29, 1189–1232, https://doi.org/10.1214/aos/1013203451.

  • Ghatak, D., G. Gong, and A. Frei, 2010: North American temperature, snowfall, and snow-depth response to winter climate modes. J. Climate, 23, 2320–2332, https://doi.org/10.1175/2009JCLI3050.1.

  • Gianola, D., H. Okut, K. A. Weigel, and G. J. M. Rosa, 2011: Predicting complex quantitative traits with Bayesian neural networks: A case study with Jersey cows and wheat. BMC Genet., 12, 87, https://doi.org/10.1186/1471-2156-12-87.

  • Gibson, P. B., W. E. Chapman, A. Altinok, L. Delle Monache, M. J. DeFlorio, and D. E. Waliser, 2021: Training machine learning models on climate model output yields skillful interpretable seasonal precipitation forecasts. Commun. Earth Environ., 2, 159, https://doi.org/10.1038/s43247-021-00225-4.

  • Grömping, U., 2009: Variable importance assessment in regression: Linear regression versus random forest. Amer. Stat., 63, 308–319, https://doi.org/10.1198/tast.2009.08199.

  • Gutowski, W. J., and Coauthors, 2020: The ongoing need for high-resolution regional climate models: Process understanding and stakeholder information. Bull. Amer. Meteor. Soc., 101, E664–E683, https://doi.org/10.1175/BAMS-D-19-0113.1.

  • Ham, Y. G., J. H. Kim, and J. J. Luo, 2019: Deep learning for multi-year ENSO forecasts. Nature, 573, 568–572, https://doi.org/10.1038/s41586-019-1559-7.

  • Hartnett, J. J., J. M. Collins, M. A. Baxter, and D. P. Chambers, 2014: Spatiotemporal snowfall trends in central New York. J. Appl. Meteor. Climatol., 53, 2685–2697, https://doi.org/10.1175/JAMC-D-14-0084.1.

  • Hastie, T., and R. Tibshirani, 1986: Generalized additive models. Stat. Sci., 1, 297–310, https://doi.org/10.1214/ss/1177013604.

  • Hastie, T., and R. Tibshirani, 1990: Generalized Additive Models. CRC Press, 352 pp.

  • Hill, J. D., 1971: Snow squalls in the lee of Lakes Erie and Ontario: A review of the literature. NOAA Tech. Memo. NWS ER-43, 26 pp., https://repository.library.noaa.gov/view/noaa/6330.

  • Hong, S.-Y., J. Dudhia, and S.-H. Chen, 2004: A revised approach to ice microphysical processes for the bulk parameterization of clouds and precipitation. Mon. Wea. Rev., 132, 103–120, https://doi.org/10.1175/1520-0493(2004)132<0103:ARATIM>2.0.CO;2.

  • Kapelner, A., and J. Bleich, 2016: bartMachine: Machine learning with Bayesian additive regression trees. J. Stat. Software, 70 (4), 1–40, https://doi.org/10.18637/jss.v070.i04.

  • Kluver, D., and D. Leathers, 2015: Winter snowfall prediction in the United States using multiple discriminant analysis. Int. J. Climatol., 35, 2003–2018, https://doi.org/10.1002/joc.4103.

  • Kolka, R. K., C. P. Giardina, J. D. McClure, A. Mayer, and M. F. Jurgensen, 2010: Partitioning hydrologic contributions to an ‘old-growth’ riparian area in the Huron Mountains of Michigan, USA. Ecohydrology, 3, 315–324, https://doi.org/10.1002/eco.112.

  • Kuhn, M., 2008: Building predictive models in R using the caret package. J. Stat. Software, 28 (5), 1–26, https://doi.org/10.18637/jss.v028.i05.

  • Kunkel, K. E., N. E. Westcott, and D. A. R. Kristovich, 2002: Assessment of potential effects of climate change on heavy lake-effect snowstorms near Lake Erie. J. Great Lakes Res., 28, 521–536, https://doi.org/10.1016/S0380-1330(02)70603-5.

  • Kunkel, K. E., L. Ensor, M. Palecki, D. Easterling, D. Robinson, K. G. Hubbard, and K. Redmond, 2009: A new look at lake-effect snowfall trends in the Laurentian Great Lakes using a temporally homogeneous data set. J. Great Lakes Res., 35, 23–29, https://doi.org/10.1016/j.jglr.2008.11.003.

  • Leathers, D. J., B. Yarnal, and M. A. Palecki, 1991: The Pacific/North American teleconnection pattern and United States climate. Part I: Regional temperature and precipitation associations. J. Climate, 4, 517–528, https://doi.org/10.1175/1520-0442(1991)004<0517:TPATPA>2.0.CO;2.

  • Li, X., Z.-Z. Hu, P. Liang, and J. Zhu, 2019: Contrastive influence of ENSO and PNA on variability and predictability of North American winter precipitation. J. Climate, 32, 6271–6284, https://doi.org/10.1175/JCLI-D-19-0033.1.

  • Lin, Y.-C., A. Fujisaki-Manome, and J. Wang, 2022: Recently amplified interannual variability of the Great Lakes ice cover in response to changing teleconnections. J. Climate, 35, 6283–6300, https://doi.org/10.1175/JCLI-D-21-0448.1.

  • Maeda, N., 2021: Brief overview of ice nucleation. Molecules, 26, 392, https://doi.org/10.3390/molecules26020392.

  • Mantua, N. J., S. R. Hare, Y. Zhang, J. M. Wallace, and R. C. Francis, 1997: A Pacific interdecadal climate oscillation with impacts on salmon production. Bull. Amer. Meteor. Soc., 78, 1069–1079, https://doi.org/10.1175/1520-0477(1997)078<1069:APICOW>2.0.CO;2.

  • Marshall, J., and Coauthors, 2001: North Atlantic climate variability: Phenomena, impacts and mechanisms. Int. J. Climatol., 21, 1863–1898, https://doi.org/10.1002/joc.693.

  • Meng, L., and Y. Ma, 2021: On the relationship of lake-effect snowfall and teleconnections in the Lower Peninsula of Michigan, USA. J. Great Lakes Res., 47, 134–144, https://doi.org/10.1016/j.jglr.2020.11.013.

  • Meng, L., B. D. Ayon, N. Koirala, and K. M. Baker, 2021: Inter-annual variability of snowfall in the Lower Peninsula of Michigan. Front. Water, 3, 746354, https://doi.org/10.3389/frwa.2021.746354.

  • Molnar, C., G. Casalicchio, and B. Bischl, 2020: Interpretable machine learning—A brief history, state-of-the-art and challenges. ECML PKDD 2020 Workshops, I. Koprinska et al., Eds., Communications in Computer and Information Science, Vol. 1323, Springer, 417–431.

  • Nayak, M. A., and S. Ghosh, 2013: Prediction of extreme rainfall event using weather pattern recognition and support vector machine classifier. Theor. Appl. Climatol., 114, 583–603, https://doi.org/10.1007/s00704-013-0867-3.

  • Nelder, J. A., and R. W. M. Wedderburn, 1972: Generalized linear models. J. Roy. Stat. Soc., 135A, 370–384, https://doi.org/10.2307/2344614.

  • Newman, M., and Coauthors, 2016: The Pacific decadal oscillation, revisited. J. Climate, 29, 4399–4427, https://doi.org/10.1175/JCLI-D-15-0508.1.

  • Nguyen, D., and B. Widrow, 1990: Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. 1990 IJCNN Int. Joint Conf. on Neural Networks, San Diego, CA, Institute of Electrical and Electronics Engineers, 21–26, https://doi.org/10.1109/IJCNN.1990.137819.

  • Niziol, T. A., 1987: Operational forecasting of lake effect snowfall in western and central New York. Wea. Forecasting, 2, 310–321, https://doi.org/10.1175/1520-0434(1987)002<0310:OFOLES>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Niziol, T. A., W. R. Snyder, and J. S. Waldstreicher, 1995: Winter weather forecasting throughout the eastern United States. Part IV: Lake effect snow. Wea. Forecasting, 10, 6177, https://doi.org/10.1175/1520-0434(1995)010<0061:WWFTTE>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Notaro, M., A. Zarrin, S. Vavrus, and V. Bennington, 2013: Simulation of heavy lake-effect snowstorms across the Great Lakes Basin by RegCM4: Synoptic climatology and variability. Mon. Wea. Rev., 141, 19902014, https://doi.org/10.1175/MWR-D-11-00369.1.

    • Search Google Scholar
    • Export Citation
  • Notaro, M., V. Bennington, and S. Vavrus, 2015: Dynamically downscaled projections of lake-effect snow in the Great Lakes Basin. J. Climate, 28, 16611684, https://doi.org/10.1175/JCLI-D-14-00467.1.

  • O’Gorman, P. A., and J. G. Dwyer, 2018: Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change, and extreme events. J. Adv. Model. Earth Syst., 10, 2548–2563, https://doi.org/10.1029/2018MS001351.

  • Peace, R. L., Jr., and R. B. Sykes, 1966: Mesoscale study of a lake effect snow storm. Mon. Wea. Rev., 94, 495–507, https://doi.org/10.1175/1520-0493(1966)094<0495:MSOALE>2.3.CO;2.

  • Pettersen, C., M. S. Kulie, L. F. Bliven, A. J. Merrelli, W. A. Petersen, T. J. Wagner, D. B. Wolff, and N. B. Wood, 2020: A composite analysis of snowfall modes from four winter seasons in Marquette, Michigan. J. Appl. Meteor. Climatol., 59, 103–124, https://doi.org/10.1175/JAMC-D-19-0099.1.

  • Quante, L., S. N. Willner, R. Middelanis, and A. Levermann, 2021: Regions of intensification of extreme snowfall under future warming. Sci. Rep., 11, 16621, https://doi.org/10.1038/s41598-021-95979-4.

  • Reddy, B. S. N., S. K. Pramada, and T. Roshni, 2021: Monthly surface runoff prediction using artificial intelligence: A study from a tropical climate river basin. J. Earth Syst. Sci., 130, 35, https://doi.org/10.1007/s12040-020-01508-8.

  • Robertson, A. W., A. Kumar, M. Peña, and F. Vitart, 2015: Improving and promoting subseasonal to seasonal prediction. Bull. Amer. Meteor. Soc., 96, ES49–ES53, https://doi.org/10.1175/BAMS-D-14-00139.1.

  • Sauter, T., B. Weitzenkamp, and C. Schneider, 2010: Spatio-temporal prediction of snow cover in the Black Forest mountain range using remote sensing and a recurrent neural network. Int. J. Climatol., 30, 2330–2341, https://doi.org/10.1002/joc.2043.

  • Scherrer, S. C., P. Ceppi, M. Croci-Maspoli, and C. Appenzeller, 2012: Snow-albedo feedback and Swiss spring temperature trends. Theor. Appl. Climatol., 110, 509–516, https://doi.org/10.1007/s00704-012-0712-0.

  • Schneider, T., J. Teixeira, C. Bretherton, F. Brient, K. G. Pressel, C. Schär, and A. P. Siebesma, 2017: Climate goals and computing the future of clouds. Nat. Climate Change, 7, 3–5, https://doi.org/10.1038/nclimate3190.

  • Sheridan, L. M., and Coauthors, 2022: Validation of wind resource and energy production simulations for small wind turbines in the United States. Wind Energy Sci., 7, 659–676, https://doi.org/10.5194/wes-7-659-2022.

  • Shi, Q., and P. Xue, 2019: Impact of lake surface temperature variations on lake effect snow over the Great Lakes region. J. Geophys. Res. Atmos., 124, 12 553–12 567, https://doi.org/10.1029/2019JD031261.

  • Shi, S., and G. Liu, 2021: The latitudinal dependence in the trend of snow event to precipitation event ratio. Sci. Rep., 11, 18112, https://doi.org/10.1038/s41598-021-97451-9.

  • Sillmann, J., and Coauthors, 2017: Understanding, modeling and predicting weather and climate extremes: Challenges and opportunities. Wea. Climate Extremes, 18, 65–74, https://doi.org/10.1016/j.wace.2017.10.003.

  • Smith, S. R., and J. J. O’Brien, 2001: Regional snowfall distributions associated with ENSO: Implications for seasonal forecasting. Bull. Amer. Meteor. Soc., 82, 1179–1191, https://doi.org/10.1175/1520-0477(2001)082<1179:RSDAWE>2.3.CO;2.

  • Suriano, Z. J., and D. J. Leathers, 2017: Synoptic climatology of lake-effect snowfall conditions in the eastern Great Lakes region. Int. J. Climatol., 37, 4377–4389, https://doi.org/10.1002/joc.5093.

  • Trenberth, K. E., and J. W. Hurrell, 1994: Decadal atmosphere-ocean variations in the Pacific. Climate Dyn., 9, 303–319, https://doi.org/10.1007/BF00204745.

  • Trenberth, K. E., and C. J. Guillemot, 1996: Physical processes involved in the 1988 drought and 1993 floods in North America. J. Climate, 9, 1288–1298, https://doi.org/10.1175/1520-0442(1996)009<1288:PPIITD>2.0.CO;2.

  • Tripathi, S., V. V. Srinivas, and R. S. Nanjundiah, 2006: Downscaling of precipitation for climate change scenarios: A support vector machine approach. J. Hydrol., 330, 621–640, https://doi.org/10.1016/j.jhydrol.2006.04.030.

  • Wei, W., Z. Yan, X. Tong, Z. Han, M. Ma, S. Yu, and J. Xia, 2022: Seasonal prediction of summer extreme precipitation over the Yangtze River based on random forest. Wea. Climate Extremes, 37, 100477, https://doi.org/10.1016/j.wace.2022.100477.

  • Zeng, L., 2000: Weather derivatives and weather insurance: Concept, application, and analysis. Bull. Amer. Meteor. Soc., 81, 2075–2082, https://doi.org/10.1175/1520-0477(2000)081<2075:WDAWIC>2.3.CO;2.

  • Zhou, Q., D. Li, S. Xia, Z. Chen, B. Wang, and J. Wu, 2021: Plant–rodent interactions after a heavy snowfall decrease plant regeneration and soil carbon emission in an old-growth forest. For. Ecosyst., 8, 30, https://doi.org/10.1186/s40663-021-00310-2.

  • Zhu, L., and P. Aguilera, 2021: Evaluating variations in tropical cyclone precipitation in eastern Mexico using machine learning techniques. J. Geophys. Res. Atmos., 126, e2021JD034604, https://doi.org/10.1029/2021JD034604.

  • Zwiers, F. W., and Coauthors, 2013: Climate extremes: Challenges in estimating and understanding recent changes in the frequency and intensity of extreme climate and weather events. Climate Science for Serving Society, G. R. Asrar and J. W. Hurrell, Eds., Springer, 339–389.

  • Agee, E. M., and M. L. Hart, 1990: Boundary layer and mesoscale structure over Lake Michigan during a wintertime cold air outbreak. J. Atmos. Sci., 47, 22932316, https://doi.org/10.1175/1520-0469(1990)047<2293:BLAMSO>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Bai, X., J. Wang, C. Sellinger, A. Clites, and R. Assel, 2012: Interannual variability of Great Lakes ice cover and its relationship to NAO and ENSO. J. Geophys. Res., 117, C03002, https://doi.org/10.1029/2010JC006932.

    • Search Google Scholar
    • Export Citation
  • Baijnath-Rodino, J. A., C. R. Duguay, and E. LeDrew, 2018: Climatological trends of snowfall over the Laurentian Great Lakes Basin. Int. J. Climatol., 38, 39423962, https://doi.org/10.1002/joc.5546.

    • Search Google Scholar
    • Export Citation
  • Balshi, M. S., A. D. McGuire, P. Duffy, M. Flannigan, J. Walsh, and J. Melillo, 2009: Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach. Global Change Biol., 15, 578600, https://doi.org/10.1111/j.1365-2486.2008.01679.x.

    • Search Google Scholar
    • Export Citation
  • Bell, B., and Coauthors, 2021: The ERA5 global reanalysis: Preliminary extension to 1950. Quart. J. Roy. Meteor. Soc., 147, 41864227, https://doi.org/10.1002/qj.4174.

    • Search Google Scholar
    • Export Citation
  • Bernard, D., S. Konate, and E. Savoia, 2019: Snow storms and vulnerable populations: Local public health activities in response to the 2014–2015 severe winter weather. Disaster Med. Public Health Prep., 13, 647649, https://doi.org/10.1017/dmp.2018.81.

    • Search Google Scholar
    • Export Citation
  • Bjorkman, A. D., S. C. Elmendorf, A. L. Beamish, M. Vellend, and G. H. R. Henry, 2015: Contrasting effects of warming and increased snowfall on Arctic tundra plant phenology over the past two decades. Global Change Biol., 21, 46514661, https://doi.org/10.1111/gcb.13051.

    • Search Google Scholar
    • Export Citation
  • Boser, B. E., I. M. Guyon, and V. N. Vapnik, 1992: A training algorithm for optimal margin classifiers. COLT’92: Proc. Fifth Annual ACM Workshop on Computational Learning Theory, Pittsburgh, PA, Association for Computing Machinery, 144–152, https://doi.org/10.1145/130385.130401.

  • Braham, R. R., and M. J. Dungey, 1984: Quantitative estimates of the effect of Lake Michigan on snowfall. J. Climate Appl. Meteor., 23, 940949, https://doi.org/10.1175/1520-0450(1984)023<0940:QEOTEO>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Breiman, L., 2001: Random forests. Mach. Learn., 45, 532, https://doi.org/10.1023/A:1010933404324.

  • Budikova, D., 2012: Northern Hemisphere climate variability: Character, forcing mechanisms, and significance of the North Atlantic/Arctic Oscillation. Geogr. Compass, 6, 401422, https://doi.org/10.1111/j.1749-8198.2012.00498.x.

    • Search Google Scholar
    • Export Citation
  • Budikova, D., T. W. Ford, and J. D. Wright, 2022: Characterizing winter season severity in the Midwest United States, Part II: Interannual variability. Int. J. Climatol., 42, 34993516, https://doi.org/10.1002/joc.7429.

    • Search Google Scholar
    • Export Citation
  • Chantry, M., H. Christensen, P. Dueben, and T. Palmer, 2021: Opportunities and challenges for machine learning in weather and climate modelling: Hard, medium and soft AI. Philos. Trans. Roy. Soc., A379, 20200083, https://doi.org/10.1098/rsta.2020.0083.

    • Search Google Scholar
    • Export Citation
  • Chen, J., and P. Kumar, 2002: Role of terrestrial hydrologic memory in modulating ENSO impacts in North America. J. Climate, 15, 35693585, https://doi.org/10.1175/1520-0442(2003)015<3569:ROTHMI>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Chen, S., and L. Song, 2018: Impact of the winter North Pacific Oscillation on the surface air temperature over Eurasia and North America: Sensitivity to the index definition. Adv. Atmos. Sci., 35, 702712, https://doi.org/10.1007/s00376-017-7111-5.

    • Search Google Scholar
    • Export Citation
  • Chipman, H. A., E. I. George, and R. E. McCulloch, 2010: BART: Bayesian additive regression trees. Ann. Appl. Stat., 4, 266298, https://doi.org/10.1214/09-AOAS285.

    • Search Google Scholar
    • Export Citation
  • Chiu, Y. M., F. Chebana, B. Abdous, D. Bélanger, and P. Gosselin, 2021: Cardiovascular health peaks and meteorological conditions: A quantile regression approach. Int. J. Environ. Res. Public Health, 18, 13277, https://doi.org/10.3390/ijerph182413277.

    • Search Google Scholar
    • Export Citation
  • Choubin, B., M. Borji, A. Mosavi, F. Sajedi-Hosseini, V. P. Singh, and S. Shamshirband, 2019: Snow avalanche hazard prediction using machine learning methods. J. Hydrol., 577, 123929, https://doi.org/10.1016/j.jhydrol.2019.123929.

    • Search Google Scholar
    • Export Citation
  • Clark, C. A., and Coauthors, 2016: Spatiotemporal snowfall variability in the Lake Michigan region: How is warming affecting wintertime snowfall? J. Appl. Meteor. Climatol., 55, 18131830, https://doi.org/10.1175/JAMC-D-15-0285.1.

    • Search Google Scholar
    • Export Citation
  • Daly, C., M. Halbleib, J. I. Smith, W. P. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris, 2008: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol., 28, 20312064, https://doi.org/10.1002/joc.1688.

    • Search Google Scholar
    • Export Citation
  • Ford, T. W., D. Budikova, and J. D. Wright, 2021: Characterizing winter season severity in the Midwest United States, Part I: Climatology and recent trends. Int. J. Climatol., 42, 35373552, https://doi.org/10.1002/joc.7431.

    • Search Google Scholar
    • Export Citation
  • Francis, J. A., and S. J. Vavrus, 2012: Evidence linking Arctic amplification to extreme weather in mid-latitudes. Geophys. Res. Lett., 39, L06801, https://doi.org/10.1029/2012GL051000.

    • Search Google Scholar
    • Export Citation
  • Friedman, J. H., 1991: Multivariate adaptive regression splines. Ann. Stat., 19 (1), 167, https://doi.org/10.1214/aos/1176347963.

  • Friedman, J. H., 2001: Greedy function approximation: A gradient boosting machine. Ann. Stat., 29, 11891232, https://doi.org/10.1214/aos/1013203451.

    • Search Google Scholar
    • Export Citation
  • Ghatak, D., G. Gong, and A. Frei, 2010: North American temperature, snowfall, and snow-Depth response to winter climate modes. J. Climate, 23, 23202332, https://doi.org/10.1175/2009JCLI3050.1.

    • Search Google Scholar
    • Export Citation
  • Gianola, D., H. Okut, K. A. Weigel, and G. J. M. Rosa, 2011: Predicting complex quantitative traits with Bayesian neural networks: A case study with Jersey cows and wheat. BMC Genet., 12, 87, https://doi.org/10.1186/1471-2156-12-87.

    • Search Google Scholar
    • Export Citation
  • Gibson, P. B., W. E. Chapman, A. Altinok, L. Delle Monache, M. J. DeFlorio, and D. E. Waliser, 2021: Training machine learning models on climate model output yields skillful interpretable seasonal precipitation forecasts. Commun. Earth Environ., 2, 159, https://doi.org/10.1038/s43247-021-00225-4.

    • Search Google Scholar
    • Export Citation
  • Grömping, U., 2009: Variable importance assessment in regression: Linear regression versus random forest. Amer. Stat., 63, 308319, https://doi.org/10.1198/tast.2009.08199.

    • Search Google Scholar
    • Export Citation
  • Gutowski, W. J., and Coauthors, 2020: The ongoing need for high-resolution regional climate models: Process understanding and stakeholder information. Bull. Amer. Meteor. Soc., 101, E664E683, https://doi.org/10.1175/BAMS-D-19-0113.1.

    • Search Google Scholar
    • Export Citation
  • Ham, Y. G., J. H. Kim, and J. J. Luo, 2019: Deep learning for multi-year ENSO forecasts. Nature, 573, 568572, https://doi.org/10.1038/s41586-019-1559-7.

    • Search Google Scholar
    • Export Citation
  • Hartnett, J. J., J. M. Collins, M. A. Baxter, and D. P. Chambers, 2014: Spatiotemporal snowfall trends in central New York. J. Appl. Meteor. Climatol., 53, 26852697, https://doi.org/10.1175/JAMC-D-14-0084.1.

    • Search Google Scholar
    • Export Citation
  • Hastie, T., and R. Tibshirani, 1986: Generalized additive models. Stat. Sci., 1, 297310, https://doi.org/10.1214/ss/1177013604.

  • Hastie, T., and R. Tibshirani, 1990: Generalized Additive Models. CRC Press, 352 pp.

  • Hill, J. D., 1971: Snow squalls in the lee of Lakes Erie and Ontario: A review of the literature. NOAA Tech. Memo. NWS ER-43, 26 pp., https://repository.library.noaa.gov/view/noaa/6330.

  • Hong, S.-Y., J. Dudhia, and S.-H. Chen, 2004: A revised approach to ice microphysical processes for the bulk parameterization of clouds and precipitation. Mon. Wea. Rev., 132, 103120, https://doi.org/10.1175/1520-0493(2004)132<0103:ARATIM>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Kapelner, A., and J. Bleich, 2016: bartMachine: Machine learning with bayesian additive regression trees. J. Stat. Software, 70 (4), 140, https://doi.org/10.18637/jss.v070.i04.

    • Search Google Scholar
    • Export Citation
  • Kluver, D., and D. Leathers, 2015: Winter snowfall prediction in the United States using multiple discriminant analysis. Int. J. Climatol., 35, 20032018, https://doi.org/10.1002/joc.4103.

    • Search Google Scholar
    • Export Citation
  • Kolka, R. K., C. P. Giardina, J. D. McClure, A. Mayer, and M. F. Jurgensen, 2010: Partitioning hydrologic contributions to an ‘old-growth’ riparian area in the Huron Mountains of Michigan, USA. Ecohydrology, 3, 315324, https://doi.org/10.1002/eco.112.

    • Search Google Scholar
    • Export Citation
  • Kuhn, M., 2008: Building predictive models in R using the caret package. J. Stat. Software, 28 (5), 126, https://doi.org/10.18637/jss.v028.i05.

    • Search Google Scholar
    • Export Citation
  • Kunkel, K. E., N. E. Westcott, and D. A. R. Kristovich, 2002: Assessment of potential effects of climate change on heavy lake-effect snowstorms near Lake Erie. J. Great Lakes Res., 28, 521536, https://doi.org/10.1016/S0380-1330(02)70603-5.

    • Search Google Scholar
    • Export Citation
  • Kunkel, K. E., L. Ensor, M. Palecki, D. Easterling, D. Robinson, K. G. Hubbard, and K. Redmond, 2009: A new look at lake-effect snowfall trends in the Laurentian Great Lakes using a temporally homogeneous data set. J. Great Lakes Res., 35, 2329, https://doi.org/10.1016/j.jglr.2008.11.003.

    • Search Google Scholar
    • Export Citation
  • Leathers, D. J., B. Yarnal, and M. A. Palecki, 1991: The Pacific/North American teleconnection pattern and United States climate. Part I: Regional temperature and precipitation associations. J. Climate, 4, 517528, https://doi.org/10.1175/1520-0442(1991)004<0517:TPATPA>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Li, X., Z.-Z. Hu, P. Liang, and J. Zhu, 2019: Contrastive influence of ENSO and PNA on variability and predictability of North American winter precipitation. J. Climate, 32, 62716284, https://doi.org/10.1175/JCLI-D-19-0033.1.

    • Search Google Scholar
    • Export Citation
  • Lin, Y.-C., A. Fujisaki-Manome, and J. Wang, 2022: Recently amplified interannual variability of the Great Lakes ice cover in response to changing teleconnections. J. Climate, 35, 62836300, https://doi.org/10.1175/JCLI-D-21-0448.1.

    • Search Google Scholar
    • Export Citation
  • Maeda, N., 2021: Brief overview of ice nucleation. Molecules, 26, 392, https://doi.org/10.3390/molecules26020392.

  • Mantua, N. J., S. R. Hare, Y. Zhang, J. M. Wallace, and R. C. Francis, 1997: A Pacific interdecadal climate oscillation with impacts on salmon production. Bull. Amer. Meteor. Soc., 78, 10691079, https://doi.org/10.1175/1520-0477(1997)078<1069:APICOW>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Marshall, J., and Coauthors, 2001: North Atlantic climate variability: Phenomena, impacts and mechanisms. Int. J. Climatol., 21, 18631898, https://doi.org/10.1002/joc.693.

    • Search Google Scholar
    • Export Citation
  • Meng, L., and Y. Ma, 2021: On the relationship of lake-effect snowfall and teleconnections in the Lower Peninsula of Michigan, USA. J. Great Lakes Res., 47, 134144, https://doi.org/10.1016/j.jglr.2020.11.013.

    • Search Google Scholar
    • Export Citation
  • Meng, L., B. D. Ayon, N. Koirala, and K. M. Baker, 2021: Inter-annual variability of snowfall in the Lower Peninsula of Michigan. Front. Water, 3, 746354, https://doi.org/10.3389/frwa.2021.746354.

    • Search Google Scholar
    • Export Citation
  • Molnar, C., G. Casalicchio, and B. Bischl, 2020: Interpretable machine learning—A brief history, state-of-the-art and challenges. ECML PKDD 2020 Workshops, I. Koprinska et al., Eds., Communications in Computer and Information Science, Vol. 1323, Springer, 417–431.

  • Nayak, M. A., and S. Ghosh, 2013: Prediction of extreme rainfall event using weather pattern recognition and support vector machine classifier. Theor. Appl. Climatol., 114, 583603, https://doi.org/10.1007/s00704-013-0867-3.

    • Search Google Scholar
    • Export Citation
  • Nelder, J. A., and R. W. M. Wedderburn, 1972: Generalized linear models. J. Roy. Stat. Soc., 135A, 370384, https://doi.org/10.2307/2344614.

    • Search Google Scholar
    • Export Citation
  • Newman, M., and Coauthors, 2016: The Pacific decadal oscillation, revisited. J. Climate, 29, 43994427, https://doi.org/10.1175/JCLI-D-15-0508.1.

    • Search Google Scholar
    • Export Citation
  • Nguyen, D., and B. Widrow, 1990: Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights. 1990 IJCNN Int. Joint Conf. on Neural Networks, San Diego, CA, Institute of Electrical and Electronics Engineers, 21–26, https://doi.org/10.1109/IJCNN.1990.137819.

  • Niziol, T. A., 1987: Operational forecasting of lake effect snowfall in western and central New York. Wea. Forecasting, 2, 310321, https://doi.org/10.1175/1520-0434(1987)002<0310:OFOLES>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Niziol, T. A., W. R. Snyder, and J. S. Waldstreicher, 1995: Winter weather forecasting throughout the eastern United States. Part IV: Lake effect snow. Wea. Forecasting, 10, 6177, https://doi.org/10.1175/1520-0434(1995)010<0061:WWFTTE>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Notaro, M., A. Zarrin, S. Vavrus, and V. Bennington, 2013: Simulation of heavy lake-effect snowstorms across the Great Lakes Basin by RegCM4: Synoptic climatology and variability. Mon. Wea. Rev., 141, 19902014, https://doi.org/10.1175/MWR-D-11-00369.1.

    • Search Google Scholar
    • Export Citation
  • Notaro, M., V. Bennington, and S. Vavrus, 2015: Dynamically downscaled projections of lake-effect snow in the Great Lakes Basin. J. Climate, 28, 16611684, https://doi.org/10.1175/JCLI-D-14-00467.1.

    • Search Google Scholar
    • Export Citation
  • O’Gorman, P. A., and J. G. Dwyer, 2018: Using machine learning to parameterize moist convection: Potential for modeling of climate, climate change, and extreme events. J. Adv. Model. Earth Syst., 10, 25482563, https://doi.org/10.1029/2018MS001351.

    • Search Google Scholar
    • Export Citation
  • Peace, R. L., Jr., and R. B. Sykes, 1966: Mesoscale study of a lake effect snow storm. Mon. Wea. Rev., 94, 495507, https://doi.org/10.1175/1520-0493(1966)094<0495:MSOALE>2.3.CO;2.

    • Search Google Scholar
    • Export Citation
  • Pettersen, C., M. S. Kulie, L. F. Bliven, A. J. Merrelli, W. A. Petersen, T. J. Wagner, D. B. Wolff, and N. B. Wood, 2020: A composite analysis of snowfall modes from four winter seasons in Marquette, Michigan. J. Appl. Meteor. Climatol., 59, 103124, https://doi.org/10.1175/JAMC-D-19-0099.1.

    • Search Google Scholar
    • Export Citation
  • Quante, L., S. N. Willner, R. Middelanis, and A. Levermann, 2021: Regions of intensification of extreme snowfall under future warming. Sci. Rep., 11, 16621, https://doi.org/10.1038/s41598-021-95979-4.

    • Search Google Scholar
    • Export Citation
  • Reddy, B. S. N., S. K. Pramada, and T. Roshni, 2021: Monthly surface runoff prediction using artificial intelligence: A study from a tropical climate river basin. J. Earth Syst. Sci., 130, 35, https://doi.org/10.1007/s12040-020-01508-8.

    • Search Google Scholar
    • Export Citation
  • Robertson, A. W., A. Kumar, M. Peña, and F. Vitart, 2015: Improving and promoting subseasonal to seasonal prediction. Bull. Amer. Meteor. Soc., 96, ES49ES53, https://doi.org/10.1175/BAMS-D-14-00139.1.

    • Search Google Scholar
    • Export Citation
  • Sauter, T., B. Weitzenkamp, and C. Schneider, 2010: Spatio-temporal prediction of snow cover in the Black Forest mountain range using remote sensing and a recurrent neural network. Int. J. Climatol., 30, 23302341, https://doi.org/10.1002/joc.2043.

    • Search Google Scholar
    • Export Citation
  • Scherrer, S. C., P. Ceppi, M. Croci-Maspoli, and C. Appenzeller, 2012: Snow-albedo feedback and Swiss spring temperature trends. Theor. Appl. Climatol., 110, 509516, https://doi.org/10.1007/s00704-012-0712-0.

    • Search Google Scholar
    • Export Citation
  • Schneider, T., J. Teixeira, C. Bretherton, F. Brient, K. G. Pressel, C. Schär, and A. P. Siebesma, 2017: Climate goals and computing the future of clouds. Nat. Climate Change, 7, 35, https://doi.org/10.1038/nclimate3190.

    • Search Google Scholar
    • Export Citation
  • Sheridan, L. M., and Coauthors, 2022: Validation of wind resource and energy production simulations for small wind turbines in the United States. Wind Energy Sci., 7, 659676, https://doi.org/10.5194/wes-7-659-2022.

    • Search Google Scholar
    • Export Citation
  • Shi, Q., and P. Xue, 2019: Impact of lake surface temperature variations on lake effect snow over the Great Lakes region. J. Geophys. Res. Atmos., 124, 12 55312 567, https://doi.org/10.1029/2019JD031261.

    • Search Google Scholar
    • Export Citation
  • Shi, S., and G. Liu, 2021: The latitudinal dependence in the trend of snow event to precipitation event ratio. Sci. Rep., 11, 18112, https://doi.org/10.1038/s41598-021-97451-9.

    • Search Google Scholar
    • Export Citation
  • Sillmann, J., and Coauthors, 2017: Understanding, modeling and predicting weather and climate extremes: Challenges and opportunities. Wea. Climate Extremes, 18, 6574, https://doi.org/10.1016/j.wace.2017.10.003.

    • Search Google Scholar
    • Export Citation
  • Smith, S. R., and J. J. O’Brien, 2001: Regional snowfall distributions associated with ENSO: Implications for seasonal forecasting. Bull. Amer. Meteor. Soc., 82, 11791191, https://doi.org/10.1175/1520-0477(2001)082<1179:RSDAWE>2.3.CO;2.

    • Search Google Scholar
    • Export Citation
  • Suriano, Z. J., and D. J. Leathers, 2017: Synoptic climatology of lake-effect snowfall conditions in the eastern Great Lakes region. Int. J. Climatol., 37, 43774389, https://doi.org/10.1002/joc.5093.

    • Search Google Scholar
    • Export Citation
  • Trenberth, K. E., and J. W. Hurrell, 1994: Decadal atmosphere-ocean variations in the Pacific. Climate Dyn., 9, 303319, https://doi.org/10.1007/BF00204745.

    • Search Google Scholar
    • Export Citation
  • Trenberth, K. E., and C. J. Guillemot, 1996: Physical processes involved in the 1988 drought and 1993 floods in North America. J. Climate, 9, 12881298, https://doi.org/10.1175/1520-0442(1996)009<1288:PPIITD>2.0.CO;2.

    • Search Google Scholar
    • Export Citation
  • Tripathi, S., V. V. Srinivas, and R. S. Nanjundiah, 2006: Downscaling of precipitation for climate change scenarios: A support vector machine approach. J. Hydrol., 330, 621640, https://doi.org/10.1016/j.jhydrol.2006.04.030.

    • Search Google Scholar
    • Export Citation
  • Wei, W., Z. Yan, X. Tong, Z. Han, M. Ma, S. Yu, and J. Xia, 2022: Seasonal prediction of summer extreme precipitation over the Yangtze River based on random forest. Wea. Climate Extremes, 37, 100477, https://doi.org/10.1016/j.wace.2022.100477.

    • Search Google Scholar
    • Export Citation
  • Zeng, L., 2000: Weather derivatives and weather insurance: Concept, application, and analysis. Bull. Amer. Meteor. Soc., 81, 20752082, https://doi.org/10.1175/1520-0477(2000)081<2075:WDAWIC>2.3.CO;2.

    • Search Google Scholar
    • Export Citation
  • Zhou, Q., D. Li, S. Xia, Z. Chen, B. Wang, and J. Wu, 2021: Plant–rodent interactions after a heavy snowfall decrease plant regeneration and soil carbon emission in an old-growth forest. For. Ecosyst., 8, 30, https://doi.org/10.1186/s40663-021-00310-2.

    • Search Google Scholar
    • Export Citation
  • Zhu, L., and P. Aguilera, 2021: Evaluating variations in tropical cyclone precipitation in eastern Mexico using machine learning techniques. J. Geophys. Res. Atmos., 126, e2021JD034604, https://doi.org/10.1029/2021JD034604.

    • Search Google Scholar
    • Export Citation
  • Zwiers, F. W., and Coauthors, 2013: Climate extremes: Challenges in estimating and understanding recent changes in the frequency and intensity of extreme climate and weather events. Climate Science for Serving Society, G. R. Asrar and J. W. Hurrell, Eds., Springer, 339–389.

  • Fig. 1.

    Locations of all COOP stations with snowfall measurements in the LPM.

  • Fig. 2.

    The model estimation skills for different ML algorithms, calculated from the out-of-sample cross validation using the 20% testing data. Estimations and observations (same 20% testing data) are compared with the y = x line (red).

  • Fig. 3.

    Hold-1-yr-out cross validation results for RF and BART models. For each held-out year, RF and BART are trained using only monthly snowfall observations from the other 64 years. Each hold-out model is then used to estimate the monthly snowfall of all stations for the held-out year, and the estimations are aggregated annually for each station.

  • Fig. 4.

    The PDP for the BART model developed with the 80% training data.

  • Fig. 5.

    The PDP for the RF model developed with the 80% training data; the PDP shows how the dependent variable changes with each predictor used in the model.

  • Fig. 6.

    The PDP for the BART model developed with the 70th percentile data.
